Okay, so my name is Danilo Chiarlone — or Dan — and I'm here to present to you Sandboxing Your Sandbox: Leveraging Hypervisors for WebAssembly Security. Now, before I get started, can I just get a quick show of hands for who was here for my talk yesterday at Rust Global? Is there anyone in the room that was there? No? Okay, perfect. The idea is that this talk builds upon that one, so if anyone had been there, I was just gonna preface that you'd probably see some repeated slides. But if not, this will all be new and novel and exciting. Okay, so now I'm sure that this title in itself can spawn a million questions in everybody's minds, but I wanna start by answering a very simple question: who are you listening to? So, as you saw in the previous slide, my name is Dan, but other than that, I also work at Microsoft on a project called Hyperlight. Now, Hyperlight is a project where we leverage virtual machine managers, or hypervisors, to execute untrusted or third-party code safely. If you're into social media or GitHub, those are my handles right there. If you have any questions that maybe we can't get to right now, that would be a good place to reach me. So you may think that, working on Hyperlight, a very security-driven project, what I'm saying here is that, you know, I'm something of a security expert myself. But no, that's not the case. In fact, I made my very first contribution to Hyperlight — and consequently to security — only about one year ago. So what I'm trying to get across here is that even if you yourself are not something of a security expert, that's okay, because I've been there — I am there — and hopefully this will be a comprehensible talk and we can sort of learn together. Cool. So now, what is this talk about? Before I tell you directly, I wanna show you a little clip from Microsoft Build with Azure CTO Mark Russinovich, where he introduced the project that I work on, Hyperlight. It's one of the things that inspired this talk in the first place.
So I feel like he gives us a little bit of context: "Now, the way that we keep our infrastructure very efficient is by tightly packing customers together — either in virtual machines, and that technology has been around for a long time, or, more recently within the last decade, we've introduced hypervisor-based isolation for containers with Hyper-V. You can see that in the middle. And those address, you know, traditional OS-based applications and containerized applications, but we really see a place for user-defined functions — small functions that will execute either in the network data plane, like on a Front Door service, or as a user-defined function sitting inside of a storage service. But we need to strongly isolate them as well. And Wasm, while it has great sandboxing technology and it's really got a great ecosystem around it, lacks the kind of isolation that we require for running a public cloud. So we've been..." Okay, so I wanna focus on that very last statement from Mark Russinovich: that Wasm, while it has great sandboxing technology and has really got a great ecosystem around it, lacks the kind of isolation that we require for running a public cloud. That's what this talk is about. It's answering why. Why is Wasm's isolation sometimes not enough, particularly in our use case of running a public cloud? I don't know how many people here were listening to Nick Fitzgerald's talk — he mentioned how much fuzzing Wasmtime goes through and whatnot — so why is that still not enough for us? And perhaps more importantly than that, it's discussing how we solve it: how do we make it be enough for running a public cloud while still utilizing Wasm? Okay, so today's agenda: we'll start off with an overview of Wasm security. Then we'll split it into two — we'll discuss binary security and we'll discuss host security. And in the middle, we'll have an aside on what role language choice plays in generating secure Wasm.
This was an aside that I had in my Rust Global talk — as you can imagine, it discussed Rust — but I figured it would still be interesting to point out, so I left it in. All right. And lastly, we'll finish the talk with strategies for improving your WebAssembly security. Okay, so let's begin with the very first item on our agenda, the overview of Wasm security. So, what comes to mind usually when I say "Wasm security"? Whenever I ask people this question, two things come up the most. The first one is sandboxing, right? Wasm is completely isolated from its host environment — that's resource isolation — and so any functionality that it does have to use has to be explicitly stated. That's the number one thing that people mention. And the second one is memory isolation, with WebAssembly's linear memory preventing out-of-bounds accesses and accessing the host. Those are the two main things. And those are great — great things that we definitely do need — but is that all we need? To answer this question, I want to kind of flip it upside down. The snippet you're seeing here is from the WebAssembly specification itself. It discusses buffer overflows, and it says that mitigations that exist for preventing buffer overflows in native binaries — such as data execution prevention (DEP) and stack smashing protection (SSP) — are not needed by WebAssembly. So to answer that question, I figured, like I said, we could look at it in an inverse way — is this really something we don't need? — and play around with that statement. But before we dive too much deeper into that, I just want to make sure we're all on the same page about what stack smashing protection, also known as stack canaries, is. So here on, I believe, your left, we have a very simple C++ program — a function called vulnerability. It takes in a string, we have a buffer of length eight, and we do a string copy onto it.
As you can probably guess by the name of the function, vulnerability — and also by the code, because it's very simple — it's very easy to get a buffer overflow there, right? You pass in a string that's very, very long and we'll go over the buffer. Now, in native binaries, what we have is that the compiler will insert prologue code, run on each execution, to generate some bytes — i.e., the canary — that allow us to detect buffer overflows. So the stack will look somewhat like this: we have the buffer, we have the canary, and then we have things like the return address and so forth. And so if the buffer does get overflowed, it will corrupt the stack in an area whose value is something we knew at the beginning of our program, so we know it's been corrupted and we know we're suffering a stack smashing attack. Now, back to the question: is this something that we don't need in Wasm, essentially? Rather than telling you, I figured we could look at a demo. This demo will be displaying a vulnerability in the libpng library — a library written in C. The vulnerability, as you can see, is an out-of-bounds write in libpng, an 8.8-severity vulnerability. This is a demo I got from a paper by Daniel Lehmann: "Everything Old Is New Again: Binary Security of WebAssembly." And as you can probably guess by that name, this is something old from native binaries that has been made new again. I don't wanna spoil my own demo, so we'll take a look. I'll close this real quick and we'll open VS Code. All right. The example that I have here is a C++ program, and it's gonna use the libpng library — libpng has PNG utilities, right? In this case, we're utilizing a function, pnm2png, which is where that vulnerability exists. As you can probably guess, it'll convert our PNM file to a PNG file. And we're being, I'll say, pretty naive here, to display the vulnerability.
What we're doing is we're writing directly to the browser, so this will display it there. We have this image tag, right? We call convert and then we append that directly onto our document. Now, let's take a look at how that works. So, let me run python server.py — cool, so now we're serving this application. And I should say: this is a C++ program, compiled to Wasm using Emscripten, and then we have it running like that. And here we go. I'll show you a good example first. I have a folder of inputs, with this monarch file — it's a monarch butterfly — and if everything works, we get a beautiful monarch butterfly. So that was a PNM file successfully converted to PNG. That means our users are behaving and the program is behaving as expected. However, what if we're malicious, right? What if we provide something that's intended to break the program? We can provide this. Now, I don't know how many people here know PNM files, but this looks very different from a normal one, which in VS Code looks a lot like this. What we're doing here is performing a stack-to-heap overflow attack. So we write a bunch of garbage up until we reach the point in the heap — in the linear memory — that we're concerned with, and then we can mess with that point of memory. And as you'll see here, we have some fun planned. So now let's upload that file. I'll reload the page, choose file, and I will upload our exploit — and lo and behold, we get a cross-site scripting attack. Now again, we didn't have that anywhere in our C++ code, right? All this does is convert the image and put it there. But with that exploit file, what we did have is this script alert — and there's even more: if we click OK, we see that we got pwned. So yeah, this can be pretty catastrophic, and there's a reason why that's an 8.8 vulnerability, because, I mean, this is a pretty simple example, right?
But say we had a social network or whatever — someone could post on our behalf, right? So if someone's using Wasm there, then someone posts on their behalf on Twitter, or X, or whatever people call it nowadays. Cool. So that is a vulnerability in binary security. But how does that look with native binaries, right? I told you that the title of the paper was "Everything Old Is New Again," and now, like I asked, how does this look with native binaries? It looks somewhat like this. I don't have the example with libpng, but I brought back that example we had prior, with that vulnerability function receiving a string — and passing in a string that's too big, we see that we do get "stack smashing detected." We corrupt our stack canary, and so this exploit just doesn't work with normal C or C++ native binaries, while it does work with WebAssembly now. So why does the Wasm specification say that we don't need SSP, right? Like, are they lying to us? No — no, not really. And that's because in Wasm we have two types of memory: we have managed memory and we have unmanaged memory. Managed memory is our code space, while unmanaged memory is our linear memory — or heap, or whatever you want to call it. And those two are completely disjoint. So the normal things that people use buffer overflows for, like corrupting your own execution environment — that will not happen with WebAssembly. You can't have any gotos or jumps to arbitrary locations in memory, because that's in your managed memory, in your code space. So, to quote a paper from people you might have seen around the conference: "At worst, a buggy or exploited WebAssembly program can make a mess of the data in its own memory." That's from "Bringing the Web up to Speed with WebAssembly." Cool. But as we've seen, this can still sometimes be pretty bad — witness that 8.8 vulnerability that's been made new again in WebAssembly. But enough of that — how can we counteract this?
And I'll go deeper into this later in the talk — as you saw in our agenda, the last item is strategies for improving your Wasm security — but this is the point in the agenda where we bring back that aside on language choice. So, I wanna say: libpng, like I mentioned, is coded in C, right? And memory-unsafe languages will still generate memory-unsafe Wasm modules. They will not fix your code for you, so you still have to do your due diligence to ensure that your modules are not vulnerable to malicious input. People can still mess up your memory. I didn't show it here, but say you have one vulnerable function and you have another function: if their frames are laid out the same or somewhat similarly, you could even corrupt other stack frames and go mess up another function. So we can use Rust — that's one option, you know, it's a memory-safe language. And to keep it simple and maintain our theme of buffer overflows, I have a very, very, very simple example where on the left we have a C++ program: we have an array of length five and we write to its fifth index — out of bounds — and in C++ that does compile (obviously you can get warnings, right?), and it even runs, and you can have undefined behavior. While in Rust, it doesn't even compile. The point that I'm trying to get across with that very simple example is that with Rust's memory safety guarantees — like, you know, having to be explicit about unsafe code, and even the ownership model — it's way harder to fall into the common pitfalls that you would fall into with C or C++. But let's put that in the back of our minds for a second, because we'll get into strategies for improving Wasm security later on in the talk. Right now it's time to talk about host security. So, is that the whole story — that Wasm lacks stack canaries? Is that all there is to it? No. Previously we looked at a module that is vulnerable to malicious input, but what if the module in itself is malicious, right? What if we, the ones running the module, are running something with malicious intent?
That is the concern of host security, which is — as beautifully stated here — the fortification of the environment against Wasm modules with negative intent. So this is a whole new attack vector, with a couple of things we have to be concerned about, because now we also have to be concerned with where we run our code. Like, what if the runtime allows us to break out of the sandbox or the linear memory? Things, as you can probably guess, can get pretty bad, because we'll have access to the host. But again, rather than me telling you, I want to show you a demo. This time it's a demo from Wasmtime. This is another out-of-bounds write, on x86_64. This is a 9.9-severity vulnerability — so pretty much as bad as it gets. And I'll say thanks to Alex Crichton for helping me put this together. Cool, so let's take a look. Now, I am on a Mac M1, so I have to go to my VM, my dev box. Okay, so here we are. And we have two WAT files. In the very first one, all that we're doing is we have this load function that will load a somewhat arbitrary-looking piece of memory — actually, let me make that a little bit bigger; is that better? — a somewhat arbitrary piece of memory, but that's related to how the vulnerability works. And then we have B — and I just want to point out, all we do in A is have this function that loads; it's not modifying its memory in any other way, so technically A's memory should be all zeroed out. But we have B. B will import A — again, right, every functionality has to be explicitly imported. Like A, it sets up its memory, and we have a function called trigger, which will call A's load function with an input that's related to how the vulnerability works. But other than that, it's also setting something in its own memory: it's setting that value 42 there. So what we're gonna do now is run this. And — okay, thankfully I have the command here. We're running wasmtime run with the pooling allocator.
What this does is it will put the two modules side by side, so we're aware of how the memory layout is gonna look. We preload A, then we call B — we call a function in B, trigger_bug. So again, A doesn't have anything in its memory; with B calling A through trigger_bug we should get nothing, we should get a zero. However, we get 42. That's pretty bad, right? We have one module breaking out of its sandbox, accessing another module, and getting hold of its memory. Now again, this is very much crafted to display the vulnerability, right? We have the pooling allocator putting the two modules side by side. But if you think about, say, an embedded example, where we have limited, restricted memory — again, things are very tightly packed together — it's very easy to grab hold of your embedded host, and who knows, maybe we're running on a Cisco device. And that's the reason why we have that 9.9 severity there. So let's go back to our slides. Now we're reaching the final point of our agenda: strategies for improving Wasm security. We displayed a binary vulnerability, we displayed a host vulnerability — and binary vulnerabilities can sometimes even leak into host vulnerabilities — so we have an idea of the landscape and how things look. But what can we do to fix all this? The strategy that we're proposing in this talk is security in layers. What does that mean? This is where we grab Rust — which we put in the back of our minds earlier — and bring it back out, because Rust is the very first layer that we can add: a memory-safe language, which can go a pretty long way. On top of that, we're gonna layer Wasm. You know, it will provide us with resource and memory isolation, but as we've seen, it can still have problems, it can still have bugs. So we need a third item there.
Now, as you can probably guess by the title of the talk, this is where we leverage hypervisors for, among other things, security. And that's where we bring in hardware virtualization, or virtual machine managers, right? So, what do hypervisors bring to the table? But before that, I just wanna say: we're adding hypervisors on top, right, to address the fact that humans are fallible. We can still have issues, we can still have bugs — no solution is perfect. In the same way that we can have bugs in Wasmtime, or whatever runtime we wanna use, we can still have bugs here. The whole point of security in layers is to increase the hurdles that an attacker has to overcome to reach an issue — we're decreasing the attack surface. Okay, now let's address that question: what do hypervisors bring to the table? The principle behind VMMs is running multiple VMs under one physical host, right? So things will look somewhat like this: we have a physical host; we have a memory-safe Rust Wasm app; on top of that, we have a Wasm sandbox with its linear memory; and on top of that, we have our VM. And what that brings, first and foremost, is deep isolation. So if a Wasm binary vulnerability leaks into a host vulnerability and causes a problem — or even if the Wasm runtime has a bug — any damage that it causes will be restricted to that VM. It doesn't touch anything outside of it. That's one thing. The next one is resource control: the capability to halt execution if an app is taking too many resources. That's something modern hypervisors give you — so if, say, a program is hogging resources, you can stop it. And lastly — again, another feature of modern hypervisors — is snapshotting and rollback. So if we do have a security incident but we had a previous state that we're happy with, we can always roll back to that snapshot.
Okay, so now I wanna finish off by sort of coming full circle and showing the demo from Mark Russinovich again — the demo he showed at Microsoft Build — displaying how we're using Hyperlight, to answer that question with our case study here. "Let's go see just how awesome this is. This is one of my favorite demos here. So here I've got, running on a Linux machine — Linux dom0 is what we call Linux as the host partition for Hyper-V, and you can see Microsoft Hypervisor there. If I take a look at the amount of free RAM on this system, you can see the used RAM: it's got 1.1 gigabytes used. Now I'm gonna kick off a script that's gonna launch 2,000 micro VMs. And it did it — and you can see it did it in less than two seconds. I've got 2,000 micro sandboxes running on this VM. And if I go take a look at how much memory I've consumed, each one of those is just about 300 megabytes in size. So tiny, able to handle user-defined functions. Now I'm gonna make some function calls just to show you that this is the kind of performance that's close to just calling another function across a process — but I'm actually calling into a virtual machine. And you can see the latency was about 250 microseconds. And as far as programming models, we wanna make it really easy for people to create these. So here's a Blazor app, which is a website that says hello and prints a request count, which is a static variable. And if I launch this, there it is, running on a plain vanilla OS. And if I do refreshes, every time I refresh, I'm gonna see the request count go up, because that's a static variable that's increasing. So let's say that I wanna strongly isolate that code — I'm gonna use the Hyperlight isolator. And it's as easy as invoking that chunk of code and giving it to Hyperlight to put it in a sandbox. And now each time I call invoke, a new micro VM is gonna be created from scratch, which means none of these are gonna share any state with any other."
"There it is, running Wasm. And if I do refreshes now, that count stays the same, because each one of those is a fresh micro VM. So that's the kind of programming model that we're creating for this thing." I think I'm back on. So, what just happened is that we had our WebAssembly module — a module right now, though it could be a component — running on top of your host OS and your hardware, and that was about it. But what we did is thinly wrap it within a Hyperlight isolator micro VM. We're adding that layer on top to provide extra isolation. And that allowed us to have 2,000 micro VMs in under two seconds, where each micro VM is about 300 megabytes in size, and also capable of making 50,000 function calls with an average response time of around 250 microseconds. So that was the big challenge there, right? We don't just wanna add another layer on top of Wasm — say, a normal VM with a full OS — and then lose everything that Wasm has that's interesting. We wanna have a thin layer, just that micro VM, that maintains Wasm's benefits but also provides an extra layer of security on top — all while keeping a very simple and easy-to-use programming model, with that invoke that takes a lambda-like function and creates a brand new VM for each request, right? That one-to-one relationship between user-defined functions and VMs and Wasm, all hugging each other. Okay, so now, before I open up for questions, I just wanna do a quick recap of what we did today. We, one, started off with an overview of Wasm security, where we discussed the things people usually think about: sandboxing and linear memory. Next, we looked at binary security, and we looked at a vulnerability there. Then we looked at host security and a vulnerability there, and we saw how security in layers is our proposed solution to counteract this and remedy bugs that can exist in different layers of the stack.
And then we showed, at the end, our case study: Microsoft's approach to addressing this with Hyperlight, and we saw the challenges of maintaining Wasm's benefits while also adding more security on top. And that is where I end. Any questions? Please go ahead. So there's no network stack there — we're just speaking with the VM as if it were a normal function call. Yeah, please go ahead. Yeah, that's a good question. So we're using the term "micro VM" sort of loosely here — it's used around the literature to describe a myriad of different things. But in our case, we have this micro VM, or sandbox, and it does not have an OS. So yeah, Wasm was running even without an OS in that case. So you may think, right — how are you even handling WASI? How can you have a system interface if you have no system? And the idea there is that obviously we have some smoke and mirrors, right? Say we had a printf function: that printf function is actually routed to a host function — in that case host_print, I think we call it. And that prints within the hypervisor host, which you can implement. And we have different implementations: for Hyperlight, we leverage KVM, and we leverage Hyper-V — Hyper-V for Linux as well. So it depends on the hypervisor too, with the host functions there. Please — can you say that again? Yeah, there's a little bit of added penalty, because we are going inside of the VM with guest and host function calls. The guest will call onto the host — that's one barrier we have to cross — and then the host also has to reply to the guest. And we can get exceptions, right, everywhere in the stack. So say the host gets an exception, the guest gets an exception — handling that creates the overhead that you saw. Please. Yeah, so we fully intend to use this for production.
We're discussing it internally to utilize it for Azure Front Door, so, our function servicing. But we're also planning on open-sourcing this. We have a very hand-wavy timeline for open-sourcing it, and I can discuss that with you after, in the hall. But yeah — definitely for production and moving towards that, and hopefully, if things work out, open-sourcing very soon. Yeah, we had another question? Oh, with open source? Yeah, all right. Lots of good questions, please go ahead. Can you say that again? Yeah? Yeah, there's still quite a lot of benchmarking work that I think we have to do to see and compare performance with others. There are also similar projects, right, from AWS and Google — like Firecracker and gVisor — that sort of target similar things. So we definitely have work to do to compare there. But yeah, I feel like it's inevitable that we're gonna have increased size and also slightly decreased speed. But the goal really is to continue to try and decrease that, to get closer to the benefits that Wasm still provides, with the added security. Yeah, please. So we have 300 megs — 300 megs per micro VM. And the total there — I think we spun up 2,000 micro VMs — and, well, I forget the exact numbers shown in the video, but I think we went from 1.1 gigs to 4.1, something like that. Yeah, I mean, I wouldn't be surprised if I made the math wrong, but you can probably correct me there. Thank you. Thank you, I appreciate that. Even better? Yeah, so — well, obviously we didn't get to display it here, but with Hyperlight you can completely define your memory configuration. You can define how much memory you get on top, so you could over-provision and have more over the minimum that it does use. Any more questions? So, does the binary vulnerability that I displayed in the beginning escape the sandbox — that's your question? Yeah.
It doesn't — no, I don't believe you could escape in that sense of messing up your host. All that it's really doing is messing up its own memory, right? We have managed memory — the code space — and we have unmanaged memory — you know, your heap, or your linear memory — and that's what it's messing up, because we have structures that are set there. And like I mentioned in another example, you could even corrupt different stack frames, right? You have one buffer, you could corrupt another buffer — but that's all in your own linear memory; that's the only problem. Yeah — I mean, one thing it does add: if you write a bad program with easily exploitable buffer overflows, that's still gonna show up in your Wasm. And that's one of the reasons for perhaps using Rust. But yeah, that's correct. Yeah, I haven't gotten to play around much with vetting, and also fuzzing. Nick's talk was actually quite interesting, because I think there's a lot of work we can do with fuzzing — and also vetting — even for Hyperlight with Wasm modules there. Even wasm-smith, generating a bunch of Wasm modules for testing things, could also be pretty interesting. Yeah, another question? Yeah. Yeah, so the idea there, right, is to target multi-tenancy. So you have one function, one VM, but we have vCPUs, right, that are specific to your sandbox. And so those will be one-to-one relations. Any other questions? Okay, if not, I think that is the end. Thank you.