Hi, everyone. My name is Dan Phillips. I am going to be talking today about ideas and experiments in building a WebAssembly-first OS, with "WebAssembly-first" being purposely vague, because the use case is building an operating system that is for WebAssembly and possibly with WebAssembly.

Just quickly about me. I am an engineer and the WebAssembly lead at a startup called Loophole Labs. We focus on working on primitives for developers. So, engineering primitives: we do networking stuff, and we also do plenty of WebAssembly stuff. Earlier this year we released the Scale Function Runtime, which is a zero-dependency runtime written in Go that leverages wazero, and we have a wazero hero back here. So you can check that out. We're going to be using it a little bit in this talk, but we'll get to that a bit later. On the Internet, I'm mostly at this handle, d_philla, on Twitter or whatever people are calling it these days. I'm also from Chicago, and I run the Wasm Chicago group, which is a meetup group. We're kind of a reading group, and we're trying to do a little bit more of both. We actually just had Ben, who is back here, come speak from Dylibso a couple of weeks ago, and we'll have some more talks later in the fall. We will also have some in-person events coming up, hopefully around KubeCon, so if you're going to be at KubeCon, stay tuned for details there.

Okay, so talking about operating systems. We'll get into a brief history, but not enough to bore the room. Today, we sort of think of an operating system as starting with a blank canvas of a CPU, but throughout history, this wasn't quite the same perspective. Typically, what was developed resembles what we now call an operating system, but in the past it was more operating-system-like, mostly due to physical constraints, hardware constraints, and the goal of the system itself.
It wasn't really until we got to microcomputers that we had something that even resembles what we think of today as an operating system.

It goes without saying that when we think about operating systems, we have to talk about UNIX, especially in this context. UNIX is about 50 years old, give or take, and POSIX is about 30 or 40 years old. We're going to talk about things from the perspective of UNIX, but also about what it left to be desired and what the same group of people tried to rectify with something called Plan 9. Who's heard of Plan 9? Awesome, okay, good. There are not many rooms where people would raise their hands like that. Plan 9 came from Bell Labs, and it was probably the result of a feeling that many of us are familiar with, where you see what you released, and then you look back and say, wow, I wish I could have done these things differently. We're not going to talk about all the specifics of Plan 9, and I was actually really happy to see that Chris mentioned it in the last talk. But Plan 9 has also influenced, possibly indirectly, some design choices around WebAssembly and specifically WASI, which is pretty cool, and we'll talk about how that fits in later in the talk.

And then of course, this is the embedded room, and I think that maybe there was no better place to put a talk like this. There are embedded operating systems, and we're going to talk about them, but the core of this talk is going to be a little bit more about general-purpose operating systems and what the general, not population, but the general developer thinks of when they think of an OS.

So when we zoom out, if we could just philosophize about what a minimally viable OS is, what are the tent posts that we could all mostly agree would constitute an operating system? I think that first and foremost, that would be process management, right?
So whatever you think a process is, however you define it, you need a way to manage them: to start them, schedule them, kill them, and then possibly communicate between them and so forth.

Memory management. An operating system has to be able to manage memory. It has to be able to allocate memory and deallocate memory. It needs to be able to protect memory and do many other things, but those are what we're going to start with.

File system management. This one is a little bit more contentious, so it's possible that an operating system doesn't need a file system, but if we think about Unix, there's a particular detail in Unix that has persisted, which is the belief that everything is a file. This room would be interesting. Do we all agree that everything is a file? Okay, good. Okay, wait, we got one, right? Part of the design re-implementation with Plan 9 was to look at this and be like, oh, you know what? Maybe everything isn't a file. In fact, most things aren't files. But when we think about Unix, everything is a file, and we're going to treat it as such, regardless of its file-ness, right? So I'm going to argue that many people would say that an OS kind of needs a file system.

I/O. This is a challenging one, especially in the context of this talk, and so I would like to summarily wave my hand at this part. But specifically, we can think about UI, networking, and drivers, and put that all in I/O for the purposes of this discussion. If you have any questions about this later, please don't ask me, because I'm not going to talk about it. No, I'm kidding. I am going to talk about it, but that could be a whole other talk, right?

And then security and access control. Who can do what with what, right? And how is that decided? This is also a very cool thing that Plan 9 went back and reimagined outside of the constraints of groups and users.
And this is also something that I think most people in this room probably understand that WebAssembly does quite a bit differently, which is pretty interesting.

Okay, so these five things. We're going to go with these five. Speak now or forever hold your peace. But I'm going to take these five. Great. So our goal here, like I mentioned, is a general-use, approachable OS. That's what we're going to go for. I'm going to talk through a few experiments and we'll see what we can do.

But first, why would we do this? Well, WebAssembly redefines OS primitives, right? Things like a process. If you went to Luke's talk earlier, you heard that we have sub-process sandboxing. So the execution of a WebAssembly instance could be considered a process, but it's not really a Unix process. In fact, if it's like a Unix process, it's just because something is being processed, right? But it comes with a lot less baggage, quite literally, in fact.

Inter-process communication. This is another thing that you could imagine exists if we think about a WebAssembly execution being able to communicate with another WebAssembly execution. Then we have IPC, but it's not nearly as heavyweight as what we have in Unix.

Security and access control. This is part and parcel of WebAssembly itself, but there are also many, many details here, and it redefines how this could be implemented, both in WebAssembly and in WASI. So the same assumptions may not be applicable, which means there are opportunities for increases in performance, for sure, and in resource utilization. And lastly, I mean, why not try this, right? Why not talk about this?

Okay, so I'm going to talk about some prior art in this space. There are a couple of projects. Some people in this room might recognize these three. They are different, interesting projects that all work on running a WebAssembly runtime in ring zero, in privileged space, okay?
So this first one, Nebulet, is a really cool one from Google Summer of Code. It was a project that doesn't quite work now, but it's pretty close. It uses Cranelift for the machine code. Really, really interesting. A similar one is Kwast. I'm not sure about the pronunciation, but it is also a very interesting one that takes the same approach with a couple more dependencies. Basically, you can run a WebAssembly runtime in privileged space and see what you can do with it, which in its current status is not a ton, but it is a proof of concept. And WASM kernel goes a step further and just makes it happen right from the firmware. This one definitely has some rough edges, but it's also fun to look at.

I bring these up because a lot of people think, okay, if we wanted to make a WebAssembly-based operating system, this is where we would start. What I'm going to do in this talk is actually go in the other direction, okay? But I highly recommend that you check out the other talks that are in this vein. We've got Ralph's talk about using this with systemd, I believe, and then also Dan's talk about WASI specifically, which is right after this and which I'm going to hurry over to so that I can check it out.

So we're going to approach this from the other direction, like I mentioned. That means we're going to start from the perspective of what we would consider a user space program. And then we're going to progress concentrically into a more encompassing system. The goal is, like I mentioned, to make a system that is optimized for WebAssembly but also built with WebAssembly, and I hope that becomes clearer throughout this talk.

As an aside, we're going to use the Scale runtime, which is similar to a plugin system, but is also just a function runtime that we built at Loophole. And if you want to check it out, here it is. Trust me, that's not malware or anything.
This just goes to our docs site, and you can see how we implemented the function runtime in Go, and then also how we deal with things like strongly typed modules with our Scale Signatures project. You can play around with it and use it right now.

Okay, the first experiment is a system layer. Some assembly required. Some WebAssembly required, yeah. And if that's the worst joke in this talk, hopefully we can just leave it at that.

So, gVisor. Who's heard of gVisor? Someone with a Google shirt on. Yes, okay. Yeah, some other people in the back. Great. gVisor is an application kernel that's written in Go, which is really nice. It's also a really beautiful piece of code in terms of how it's written. It's really clean and well documented. So anyone here who works at Google, tell your friends, I really enjoyed digging through it. What it does is provide an isolation boundary between applications and the host kernel using an OCI runtime called runsc, right? The runsc runtime integrates nicely with Docker and Kubernetes for sandboxed container execution.

And why would you need something like this? Well, containers by themselves, as probably most people here know, are not secure sandboxes. Why are they not secure sandboxes? Well, they typically, though not always, use a shared kernel, and this can lead to vulnerabilities and potential container escapes. Although people who know the history of containers a lot better than I do would say that this has gotten a lot better in the last 10 years or so, for sure. What gVisor does is offer another layer: it implements Linux within Linux, in user space, and then you don't need to do anything with fixed physical resources.

So what we decided to do is take it, fork it, remake it, and call it ScaleVisor. And what this does builds on the nice way that syscalls are implemented in the gVisor project.
We can intercept the syscalls, and then we can replace them with Scale functions, which are WebAssembly functions. What this essentially means is that the gVisor project provides us with a really nice set of interfaces, and then we take care of the type translation with Scale Signatures. So it looks something like this, right? You have your OCI-compliant image. You use the gVisor runtime. The program makes a syscall that is intercepted and replaced with a Scale function, right?

So let's just do this quickly. Here is a basic C program that opens, reads, writes, and closes a file, right? And what we have in terms of the container is a very simple Dockerfile, which takes the Ubuntu base image, makes some directories, changes to them, and then executes our executable.

What that means in terms of the actual syscall layer is this: in the guts of gVisor, which like I said is very nicely documented, we found where these syscalls were implemented, right? These specifically refer to POSIX-compliant function interfaces. So this is where, for example, the open call is implemented, and just like in the kernel, I believe, or at least in GNU libc, the open call actually makes an openat call, which says what should be opened and then returns the file descriptor. So we took that, and we swapped it out for our own runtime.

And the result is you get this user space code, okay? This is a forked version, and I built it locally and hijacked all these interfaces. You can take runsc and pass it in as the runtime with Docker, and then, as if you were on a native platform making native syscalls, you get WebAssembly intercepting the syscalls, with a file system, which I'll talk about next, and then returning the results back to the program. So nothing on the host OS is touched.
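As a rough illustration of the interception described above, the following Python sketch models a syscall table whose `open` entry is expressed in terms of `openat` and is served entirely from sandbox-local state. It is not gVisor's or Scale's actual API: the class name `SandboxSyscalls`, the `dispatch` method, and the `create` flag are all hypothetical names chosen for this example, and a real implementation would run the handlers as WebAssembly functions.

```python
import errno

AT_FDCWD = -100  # Linux constant: resolve the path relative to the working directory


class SandboxSyscalls:
    """Hypothetical stand-in for a replaced syscall layer (not gVisor's API)."""

    def __init__(self):
        self.files = {}    # path -> bytearray; exists only in sandbox memory
        self.fds = {}      # file descriptor -> path
        self.next_fd = 3   # 0, 1 and 2 are conventionally stdin/stdout/stderr
        # The syscall table: swapping an entry here is the "interception".
        self.table = {"open": self.sys_open, "openat": self.sys_openat}

    def dispatch(self, name, *args):
        """Route a syscall by name; unknown syscalls fail with ENOSYS."""
        handler = self.table.get(name)
        if handler is None:
            return -errno.ENOSYS
        return handler(*args)

    def sys_open(self, path, create=False):
        # As in the talk: open is implemented by making an openat call.
        return self.sys_openat(AT_FDCWD, path, create)

    def sys_openat(self, dirfd, path, create=False):
        # dirfd is accepted for shape only; this sketch treats all paths as absolute.
        if path not in self.files:
            if not create:
                return -errno.ENOENT
            self.files[path] = bytearray()
        fd = self.next_fd
        self.next_fd += 1
        self.fds[fd] = path
        return fd


if __name__ == "__main__":
    sandbox = SandboxSyscalls()
    print(sandbox.dispatch("open", "/data/hello.txt", True))
```

The handlers return negative errno values, mirroring the Linux syscall convention, and no call in the sketch ever reaches the host file system.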
This is all technically in user space, but that's a distinction I'd like to do away with, hopefully towards the end of this talk if I have time.

Okay. So that's the first experiment. What does this enable? Actually sandboxed system calls, right? Like I said, some assembly required: I didn't implement all system calls. That just didn't happen. You can do it, though, if you'd like. You can check out this project, or you can contribute to it. Right now I have four, and that was enabled by the project I'll talk about next. What this allows is observability. You can record I/O. You can do things without having to worry about eBPF. You don't have to deal with io_uring, all these things, right? And you can selectively decide which syscalls get passed to the host system. So for example, you could say, okay, all file system stuff goes to an in-memory ephemeral file system in WebAssembly, and all networking stuff goes to the actual host. How this can be done is through what we've done to the runtime, which I'll also talk about.

Okay, experiment number two: add a virtual file system. If you noticed, I did say Scale VFS open. So syscalls are great à la carte, but it might be better with a stateful file system for file system operations. And it turns out there's no reason we can't have near-full POSIX compliance with a file system. In fact, it's almost the same and maybe better, depending on how you look at things. If you see the large rectangle in the middle that says VFS, which wraps all these other things like a block-based FS and a network FS: almost all Linux distributions come with a virtual file system, right? So when you're interacting with a file system, it's actually, for the most part, an in-memory file system that you're executing against. And it's very good at making sure that the in-memory stuff gets to the actual block devices, et cetera.
But what you do have to do with Linux is context switch: you have to go to kernel space to do this stuff. So suppose we took this same idea and didn't worry about the persistence layer for now. We take this and execute it in another WebAssembly instance, possibly in a component in the near future, and also with a better story for module linking, things like this. Then we can get rid of that context switching. We don't have any of the overhead involved with that. So in that sense, you can make the argument that this type of file system operation, using the same type of in-memory file system, is faster than native code in some respects.

So you can check this out. We've been working on this for a while. We are implementing full POSIX compliance and we are, like, 80% done, minus some rough edges. You might be wondering, how does this work? Well, we just looked at POSIX, which is nicely documented. We created a very similar system. We did it in Rust. But there are some really key differences with things like sync, flush, and lock. I'm not going to get into those details, but I'd be happy to talk with anyone here about that. Right? Cool.

And so, like I mentioned, there are different architecture patterns. The one that we went with for this example was having your guest program run in a single Wasm instance and then having the runtime mediate between it and another stateful instance that is just the virtual file system. For the sake of time, I think I've covered that, so we're going to keep going.

The third experiment was a slight diversion. The diversion here was that I was thinking, well, okay, so we have this WebAssembly system layer, and most people wouldn't even know that it's WebAssembly. They don't need to know. You can open, read, write, and close a file. That's all great and everything.
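To make the open, read, write, and close path against a stateful in-memory file system more concrete, here is a minimal Python sketch. It is an illustration only, under stated assumptions: the project described in the talk is written in Rust and targets POSIX semantics, whereas this toy `MemoryVFS` class (a hypothetical name) has no directories, permissions, seek, sync, flush, or locking.

```python
import errno


class MemoryVFS:
    """Toy stateful file system: every byte lives in this object's memory."""

    def __init__(self):
        self._files = {}    # path -> bytearray holding the file contents
        self._open = {}     # file descriptor -> [path, current offset]
        self._next_fd = 3   # leave 0, 1 and 2 for the standard streams

    def open(self, path, create=False):
        if path not in self._files:
            if not create:
                raise OSError(errno.ENOENT, "no such file", path)
            self._files[path] = bytearray()
        fd = self._next_fd
        self._next_fd += 1
        self._open[fd] = [path, 0]
        return fd

    def _entry(self, fd):
        entry = self._open.get(fd)
        if entry is None:
            raise OSError(errno.EBADF, "bad file descriptor")
        return entry

    def write(self, fd, data):
        entry = self._entry(fd)
        path, offset = entry
        buf = self._files[path]
        # Overwrite in place, growing the file when writing past its end.
        buf[offset:offset + len(data)] = data
        entry[1] = offset + len(data)
        return len(data)

    def read(self, fd, count):
        entry = self._entry(fd)
        path, offset = entry
        chunk = bytes(self._files[path][offset:offset + count])
        entry[1] = offset + len(chunk)
        return chunk

    def close(self, fd):
        self._entry(fd)      # raises EBADF if fd is not open
        del self._open[fd]
        return 0


if __name__ == "__main__":
    vfs = MemoryVFS()
    fd = vfs.open("/data/hello.txt", create=True)
    vfs.write(fd, b"hello")
    vfs.close(fd)
    fd = vfs.open("/data/hello.txt")
    print(vfs.read(fd, 5))
    vfs.close(fd)
```

Because the state persists across calls inside one object, the contents written through one descriptor are visible to a later `open` of the same path, which is the property a separate stateful file system instance provides to the guest program.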
But if we want to use WebAssembly and take advantage of its own sandboxing properties for the user space program, why don't we do something like bring the user space program to Wasm? So we currently have the container runtime, syscall interception, and a VFS. And we came up with this small, kind of hacky diversion: take a Dockerfile and spit out a Wasm binary, right? That's actually not as hard if you have a Dockerfile, which is the key thing. You take it and spit out a Wasm binary, and then optionally, maybe in the future, bundle the runtime itself and make it a standalone binary that you could just run, right? I tried to do that for this talk and didn't quite get it done. It's kind of similar to the RLBox that Chris was talking about. Kind of cool, kind of fun, and hopefully easy, right?

So this is the Dockerfile, and you'll see we have these four steps. And with a command just like this, it's called WasmBoxer, you build it kind of like you would in Docker, and that's that. The actual code that I have right now is more of an explainer doc, but this is it, and it explains how this works. You can poke around with it. And also, because of the API spec for Dockerfiles and the declarative format, it wouldn't be that much more work to actually include full compliance with these commands. So, kind of a fun project. I'll try to post more as we get it done.

But how does this work? Like I mentioned with the virtual file system, we have libc interfaces. We have another project that provides the actual interfaces that you would have in GNU libc, and that is basically just looking at the POSIX spec. It has the interfaces, and then there's also wasi-libc, which provides the interfaces if you compile something against wasi-libc. And then we have a sandboxed file system.
So here, when we're taking the Ubuntu latest image as the base image, we're actually not getting anything but the libc interfaces. This is kind of the "pay no attention to the man behind the curtain" part, right? Then we have the make directory step, which is just the virtual file system. We have copying, which is essentially just an mmap into the file system. Then we have changing the working directory, again just in the file system that we built. And then running the command, which is actually the WebAssembly runtime executing the command. And that's it, yeah. So like I said, kind of a diversion, but now back towards the main point.

So, moving toward a minimally viable OS. As I mentioned with those previous projects, one of the ways that a lot of people try to do this is to start with a unikernel and a small runtime. We've all heard of unikernels, yeah, for the most part. They were very popular like five years ago, and then Wasm took over, and here we are. You know what I mean? So yeah, cool. Unikernels, I think, are super awesome. So if that's the way we're going to go with this, what we tried is this library-like unikernel idea. We have a small runtime, and for us that would probably be WAMR, the WebAssembly Micro Runtime, which is managed by the Bytecode Alliance. Really great little C project. We have these libc interfaces. We have this Dockerfile-generated Wasm user space program. We have a VFS. And then we have host exports for other system calls, which in the future, with our Scale Function Runtime, are going to be what we call extensions, and I hope that I'm not giving too much away. My boss is sitting right over here, so we'll just leave that there. But like I said, WAMR is what we would prefer to use.
But to actually build the unikernel, you need something that takes a look at your host program, your guest program, and your user space code, and says, okay, just give me a kernel that can execute these syscalls in a way that makes sense. Probably the best way to do this is with Unikraft, which is really simple, but there are also Solo5 and IncludeOS. These are some pretty cool little projects that can help you get started.

So is this a library OS? In this state, it absolutely is, right? It's a library operating system where you build a library for your user space program, and then that's what you ship. So it kind of is. And, oh yeah, I forgot about this, this is another diversion: you should check out Dan's talk on the philosophies about WebAssembly and Brace, which is really, really great.

But our goal would be to take this one step further, right? Instead of just having a library OS that bundles all this together with the actual runtime, what if we did something like no kernel, where the runtime is the kernel, okay? This is what I had ambitions to start with, and I decided to go the other direction to see how we could do it incrementally. So what would this mean? This would mean that we look at the runtime, say wazero, and strace it, which I might have done the other day, and see what kind of system calls it actually depends on. I did this with a few runtimes, and it turns out it's not that many. It's not a ton. It's things you would expect, right? Like allocation, deallocation, protecting memory regions, stuff like that. So theoretically, what you could do is take those, okay, and then you also need a way to boot the thing, right? So you need an init system. That could vary depending on what you're trying to do. But what this would allow you to do is take those syscalls, implement them in user space in the runtime, and then take that and boot it in privileged space, like ring zero, something like that.
Or, if it's not x86, some other privileged part of the CPU.

So to recap, what we've had so far is a user space program that needs an OS; libc interfaces; and WebAssembly, which can be generated any way you like, but this Docker thing just kind of shows you the steps in terms of what kind of stuff an OS might provide. We have a virtual file system, and we have host exports for other system calls.

So if we go back to the very beginning, when we thought about a minimally viable WebAssembly operating system, we could argue that process management is done in the runtime. That's pretty simple, right? That's a very simple analogy that people can point to. Memory management is done in the runtime, but it's really just part of the structure of WebAssembly itself and the actual implementation. File system management: this does need a file system, and it's virtual only in the sense that everything is virtualized. Security and access control: WebAssembly imports and exports. And then I/O, which I left for last, is also the responsibility of the runtime, but specifically of WASI. I believe Dan's talk is going to touch on this quite a bit, but it would be a whole other talk to cover doing this and having the virtualization layers in place to make it work, which is something that the component model promises to do quite well. And from what I've seen, especially with the most recent tooling, that could be very, very interesting going forward.

So, again, what would this be for? Well, it's smaller, safer, faster. People might question "faster", and that's why there's an asterisk here. Sure, the actual execution may not be faster than native, but by removing the context switching specifically, certain things become orders of magnitude faster, right? Which is really interesting, because a lot of people in systems spaces think a lot about how to minimize the cost of syscalls. Overall resource allocation can become more efficient, for sure.
I think everyone is looking towards that with using WebAssembly for lots of things, but in this way, it changes not just the execution cost, but also the cost of actual memory allocation, things like that. And then, finally, there's no kernel attack surface. If you don't have a kernel, there's nothing to attack, right? But you obviously have other problems, right? So this doesn't change things that are internal to this construct.

The future. You could run it in the kernel. This is also something that we played around with, and which some people have done. In fact, Wasmer has an implementation of this from a few years back. You could do it as a kernel module, which we also started to play around with a little bit. Ring zero, like I mentioned quite a bit. A very obvious one, but super fun. We would have liked to have an implementation like that done for this talk. But, you know, of course, time's not always on your side.

And then the component model: this will change how these things are abstracted, and hopefully in a way that makes it even more approachable. What that then does is kind of push the problem space. If we have these sort of fungible units of compute, this pushes the problem into the space of distributed computing, if you will. Instead of dealing with things at a local level in one operating system, you could have many operating systems, and you have to deal with how they are spun up, scheduled, moved around, et cetera.

And then lastly, a WebAssembly CPU. I kind of laugh because this seems really ridiculous. And, you know, even talking about this, bringing something like the Wasm ISA to the actual hardware, is so far beyond my pay grade. I'm not going to tell you if this is a good idea, a bad idea, or an interesting idea. It's just kind of an idea that's out there. I did see a very fun Twitter exchange with Brendan Eich talking to someone about this, and it kind of made me laugh.
So who knows, people are thinking about it, I guess. Cool, thank you very much. Yeah, appreciate it.

And let's see, I think I've got like four or five minutes for questions. I'd be happy to take any. I can't guarantee that I can answer them, though. So, yeah. Anybody? Yeah.

Have I personally? Yeah, you certainly would. But that would also become the problem of the runtime itself. So, have I personally dealt with them? Not for this talk, no. Yeah.

Absolutely. Yes, absolutely. I mean, that is definitely possible using WAMR. So there's quite a bit that we could do there. We got started with it only recently, and I have some limited experience with WAMR. We are working on having it as a configurable runtime for the function runtime that we're building. I've only recently gotten deep into it, but it's absolutely possible. Yeah, cool.

Okay, thanks very much. Thanks again.