So my name's Ben, I've been in Wasm since the beginning, and I'm going to talk about my research engine called Wizard. Just to motivate where I place WebAssembly: I think of it as a portable executable format, and so my tagline, and I don't know if I really believe every word of it, is that WebAssembly will be like Unicode. It will not be perfect, but it'll be the one standard for compiled executable code. That means it's going to be useful in many contexts, many of which are represented and explored here today, but also things we haven't even thought of yet.

I really like working on language implementation; that's been my career. In the early days my background was primarily working on Java and Java VMs, so I think about things in terms of what a Java VM can do and what it might do with your code. I've worked at various places on that, primarily in research, and my history with WebAssembly is mostly these last three things: I worked on V8 for about seven years, where I worked on the optimizing JIT and started WebAssembly with Luke back in 2014, and now I'm an academic, so I focus on doing research. I think about how engines might be designed five or ten years from now, and what we want from engines in terms of capabilities, but also their architecture.

Why do we build research VMs? Well, it turns out that students don't know anything yet; we have to teach them something, so we need a vehicle for that. They will hopefully become virtual machine experts, but they need a way in, a shallow end of the pool. They need simpler artifacts that have the core ideas so they can get into it, and then you can look at and compare and contrast what a research VM is versus a production VM. Of course we also want to do actual research, and in particular I want to work on new runtime techniques, new optimizations, things that have never been thought of. WebAssembly is an interesting take on how bytecode is designed and what it's used for, and I think that demands new optimizations. Also, it turns out that programs have bugs and sometimes run slow, so we want to analyze them, and you need a tool to do that; production VMs aren't particularly focused on that, they're more focused on performance. And there are going to be new languages. Who knows what AI looks like, but in the future there will be new programming languages with new features, and they will need targets and compilation techniques, so you want a VM that can support that as an exploration vehicle. WebAssembly itself should also be subjected to some research: if we add things to the standard, we should have at least tried them out, along with the alternatives, and you need a research VM to try many possible alternatives before you settle on one. This also means we can steer language evolution. We can try out extensions to existing languages and easily experiment in our implementation, whether that requires an extension to the VM itself or a runtime-system feature that never surfaces as something added to Wasm.

The other thing is, I think it's good to take pressure off production VMs. I worked on a production VM that is 900,000 lines of code, and it's primarily oriented toward performance. If it also has to serve as a research vehicle, with all the extensibility mechanisms that implies, it's serving two masters.
So I think you can specialize, so the production VM doesn't have to do that. There's a good example I keep in the back of my mind. Around 2000, a system called Jikes RVM came out. This was a Java VM designed for research, written in Java, built for people to try all kinds of funny ideas and techniques that nobody had thought of. Their tagline was that it "provides a flexible open testbed to prototype virtual machine technologies and experiment with a large variety of design alternatives," and many papers were published about it; there are still papers being published. But it was a research VM, not something people used as a product. We're aiming at that same space here: Wizard is about flexibility and experimentation.

One of the first things I published about this design was how to do an interpreter: how do you interpret the WebAssembly bytes themselves, as opposed to rewriting WebAssembly or compiling WebAssembly? I'm not going to talk about that; there's a paper, it's been around, and maybe people in the community have seen it. What I will talk about in this talk is how we do instrumentation, which I think is really important for being able to analyze what a program does. We'll also talk a little bit about the compiler, so it's not just an interpreter. And there are things I'm not going to talk about: several in-flight WebAssembly proposals, like WebAssembly GC, exception handling, multi-memory, and memory64, are already implemented in Wizard. I'm not going to talk about extensions to support new languages, and I'm not going to talk about the language Wizard is implemented in, although that would be a lot of fun.

Okay, so what are the priorities when we build this engine? I came up with these four. Number one is observability. Basically every VM out there can run code, many of them very fast, but they don't show you what the code is doing, and if you have code that's broken, you want to debug it. That was the primary thing: if you're a researcher, you want to watch what programs do, and especially if you're producing code and your compiler is broken, you want to see what the code does. The second is that students are going to use this, so it should be approachable without a lot of VM background. You shouldn't have to be steeped in the lore of V8, how V8 came to be and how V8 does things; it should be something you can eyeball and get an idea of what's going on. Third, it needs competitive performance. It doesn't have to be at par, but it needs to be close enough that if you measure the speed of something, you get an idea of whether it would transfer to a production VM. It has to be in the ballpark. And the last thing is extensibility: can you extend it in ways that aren't just observability, say to add a new feature?

Contrast that with V8, a system I know well. There's really only one priority. There are subsidiary priorities, and I tried to work out what they would be, but they're way, way down the list. V8 is about running fast. Also maybe not crashing, and there's security and things like that, but performance is absolutely number one.
And all the subsidiary priorities, I think, come down to one fundamental point: that the engineers don't lose their minds, because it's a complicated system, they're doing many optimizations, and they're constantly evolving it. All of that is so the development team can still make sense of the system and make forward progress. Forward progress means the next language feature, plus offering whatever observability and debugging makes sense, and so on. But there's really a gap between the top priority and the rest of them.

So what does observability mean? In particular, compilers produce bad code. When you're developing a new compiler feature, or a compiler from scratch, it's broken, and the engine is going to choke on your bad code: it parses it and says you've got a type error, or the binary is encoded improperly. So having nice tracing modes in the engine that show you what it's doing in a readable format, as it parses the code, as it type-checks the code, is extremely valuable to any compiler writer. There are lots of tracing modes you can turn on. I also worked pretty hard to get decent error messages. It doesn't just tell you "bad binary, this byte is wrong"; it tells you "I was expecting this, this should be a type." Kind of in the Rust style, not quite as nice as Rust's error messages, but going in that direction, making it a little nicer to use.

Then there's execution and debugging, and we're talking about two dimensions here. We want to observe the execution of the engine, as it does things beyond executing bytecodes, like growing a memory or loading a module, but also of the user program. Maybe I'm debugging my program, which just happens to be compiled to Wasm, and I want to debug the Wasm code: show me what happens when I call function number 57, or whatever it happens to be called. And debugging works the same way. You debug the engine, because if you're a student working on a class project, it's not right the first time; if you're adding an extension to Wizard, you have to debug it, and making that more ergonomic really accelerates research. And of course you debug the user program too. WebAssembly still lacks a really good debugging story. It's getting better incrementally over the years, but I hope to point the way to where I think it could be better with Wizard.

Okay, so dynamic analysis. That basically just means watching programs as they execute. The insights I had from looking at many different types of dynamic program analysis are these. Primarily, you want to instrument some location in the code. There may be many kinds of places, but ultimately instrumentation is attached to places in the code, so it makes sense to have an API where you can do that. It's sometimes attached to data, but usually you can get to data indirectly by instrumenting code locations. And instrumentation is usually sparse: if you run a program, it executes a billion instructions; where's the needle in that haystack? You need very targeted instrumentation. You can have a tracing mode, but you don't want to parse a 100-megabyte log file to figure out what's wrong; you want to zoom in on what's happening in one specific place. Beyond that, there's really not a lot of commonality.
It's clear that if you think about profiling, or debugging, or studying any aspect of a program, you effectively need a programmable mechanism for instrumentation, not just static things. Examples I thought of: code coverage, which functions, which basic blocks, which instructions have executed. Branch profiling: what's my program doing? Does it ever actually go here? How often does it go there? Does it spend a lot of time in this loop? Maybe there are other things about the program's execution that matter, like how it's using memory, whether the cache is actually being utilized, what the call graph looks like, and so on. If you build a debugger, you want breakpoints, so you need that. You may want to look at memory, so you want to see when a particular memory location is updated. And then tracing. It gets more complicated than that, but ultimately it boils down to: this needs to be programmable.

So this is the first mechanism I came up with. Since Wizard has an interpreter, it's very easy to think about the execution of a Wasm program as one instruction at a time. That's a very naive way to execute code, but it's also very convenient, because one particular instruction, like that call right there, might be really interesting; maybe it makes a network call or something like that, and I'm interested in that. So the bytecode interpreter loop in Wizard allows you to insert callbacks. They're called probes, and whenever that instruction executes, you get a callback, you're in the language the engine itself is written in, and you can see all the state of the executing program: the value stack, the execution stack, the memory, all through a nice API. So it's effectively the simplest possible VM. Okay, that's great; everybody's written an interpreter before, and it's easy to see how that works. But we're going to make it fast.

The first thing is that probes can be attached to a specific location, but sometimes we want to instrument every bytecode. This is the global mode, and I found a trick to make it fast. The way it works is that there's a dispatch table which contains an entry for every interpreter bytecode you could execute. Each entry is an opcode handler: the machine code that interprets the logic of that bytecode. Normally, a dispatch-table register points to the table with the logic for each kind of opcode. If we want to turn on the global tracing mode, we just switch the register to point to a second table with an indirection that actually calls the probes. So with one flip of this register, we can put the interpreter into a mode where it calls back for everything. And the key thing is that we can also switch it off. That mechanism can be in the VM at all times; there's no debug build of the VM, and it only costs you the space of this table and this one extra indirection. So it can be in the engine at all times, you don't have to build the VM a special way, and it's production-ready. It could be in the production version.
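To make that dispatch-table trick concrete, here's a minimal sketch in Rust. This is illustrative only: Wizard is written in Virgil and its opcode handlers are machine code, so all the names and types below are my own stand-ins, not Wizard's actual API.

```rust
// Toy interpreter with two dispatch tables: NORMAL runs bytecodes directly;
// PROBING wraps every handler with a global callback. Turning tracing on or
// off is one store to `vm.dispatch`; the untraced path pays nothing.

type Handler = fn(&mut Vm);

const OP_PUSH1: u8 = 0; // push the constant 1
const OP_ADD: u8 = 1;   // pop two values, push their sum
const OP_HALT: u8 = 2;  // stop the loop

struct Vm {
    dispatch: &'static [Handler; 3], // the "dispatch table register"
    pc: usize,
    code: Vec<u8>,
    stack: Vec<i64>,
    running: bool,
}

fn do_push1(vm: &mut Vm) { vm.stack.push(1); vm.pc += 1; }
fn do_add(vm: &mut Vm) {
    let b = vm.stack.pop().unwrap();
    let a = vm.stack.pop().unwrap();
    vm.stack.push(a + b);
    vm.pc += 1;
}
fn do_halt(vm: &mut Vm) { vm.running = false; }

static NORMAL: [Handler; 3] = [do_push1, do_add, do_halt];

// Each PROBING entry fires the global probe, then runs the real handler.
fn probed(vm: &mut Vm, real: Handler) {
    println!("probe: pc={} op={} stack={:?}", vm.pc, vm.code[vm.pc], vm.stack);
    real(vm);
}
fn probed_push1(vm: &mut Vm) { probed(vm, do_push1); }
fn probed_add(vm: &mut Vm)   { probed(vm, do_add); }
fn probed_halt(vm: &mut Vm)  { probed(vm, do_halt); }
static PROBING: [Handler; 3] = [probed_push1, probed_add, probed_halt];

impl Vm {
    fn run(&mut self) {
        self.running = true;
        while self.running {
            let handler = self.dispatch[self.code[self.pc] as usize];
            handler(self);
        }
    }
}

fn main() {
    let mut vm = Vm {
        dispatch: &NORMAL, // default: no instrumentation, no overhead
        pc: 0,
        code: vec![OP_PUSH1, OP_PUSH1, OP_ADD, OP_HALT],
        stack: Vec::new(),
        running: false,
    };
    vm.dispatch = &PROBING; // flip one pointer: global tracing is on
    vm.run();
    vm.dispatch = &NORMAL;  // and it switches back off just as cheaply
    println!("result: {:?}", vm.stack); // prints: result: [2]
}
```

The point to notice is that the handlers never check a flag; whether you're traced is purely a question of which table the dispatch register points at.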
Okay, so we can use these local probes to build more interesting analyses. The way a local probe is implemented, to make it efficient, is that the interpreter just overwrites that instruction in its internal copy of the code with a special probe instruction, an otherwise-illegal opcode, which is purposefully this ugly purple color on the slide. When the interpreter runs through and hits the probe instruction, it knows it should call back into user code. Doing it this way means it keeps track of the offset where you are, because obviously the user instrumentation wants to know where you're coming from. You spread probes throughout the code, and you may want to insert more probes later; you don't want to rewrite the bytecode to insert stuff. So the user callback is called, it does whatever its logic is through that API, and then, since we have a copy of the code, the interpreter executes the original instruction. And if we decide later that we don't want that probe anymore, we just copy the original byte back. The key thing is that you can insert and remove probes super cheaply; the interpreter has no problem with it. And it's all production-worthy: there's no overhead on any of the other bytecodes.

So probes are the building blocks for more complicated analyses. They can be inserted and removed, and that may seem like a party trick, but it turns out to be quite useful. Let's take an example: code coverage. We want to see which of these instructions, which of these basic blocks, have been executed. The efficient way to do that is to insert probes at every basic-block boundary. You could do every instruction, or every function if you like, but we'll do every basic block. Then we keep a map on the side with one bit per block, initially all zeros. When you hit the first probe, when it's executed, it goes and sets its bit. But a probe can remove itself. We're doing coverage: we only care whether that instruction has ever executed once, so there's no need to execute that probe again. We just remove it, and the overhead goes away, because the next time we execute that block, we won't hit a probe. As instructions are hit, the probes fire, update the table, and remove themselves, right down the line. Eventually, the places where probes remain are exactly the code that has not been executed. Actually, I kind of lied: you don't even need that side table, as long as you can tell whether the probes are still there. And in the limit, your program runs, the hot parts shed all their instrumentation, and then the JIT compiler kicks in, which it will, I'll talk about that in a second; that code gets compiled without any probes in it and runs at basically the same speed. So it's asymptotic: you pay some overhead in the beginning while you're figuring out what's been executed, and then you eventually JIT-compile all the hot stuff that has been executed. Again, this mechanism can always be there. You can always use it, and there's no special mode in the VM.
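Here's a small sketch of that self-removing coverage probe, again in Rust and again hypothetical: `ProbeTable`, `fire`, and `After` are stand-in names for the engine machinery that patches and unpatches the bytecode, not Wizard's real interface.

```rust
use std::cell::RefCell;
use std::collections::{HashMap, HashSet};
use std::rc::Rc;

/// What a probe asks the engine to do with itself after it fires.
enum After { Keep, Remove }

type ProbeFn = Box<dyn FnMut(usize) -> After>;

/// Stands in for the engine's bookkeeping: which bytecode offsets are
/// patched with the special probe instruction, and what to call on a hit.
struct ProbeTable { probes: HashMap<usize, ProbeFn> }

impl ProbeTable {
    fn insert(&mut self, offset: usize, p: ProbeFn) { self.probes.insert(offset, p); }

    /// Called by the interpreter when it executes a probed instruction.
    fn fire(&mut self, offset: usize) {
        if let Some(mut p) = self.probes.remove(&offset) {
            if let After::Keep = p(offset) {
                self.probes.insert(offset, p); // persistent probes go back in
            } // otherwise the probe removed itself: zero cost from now on
        }
    }
}

fn main() {
    // Pretend these are the bytecode offsets of the basic-block starts.
    let blocks = [0usize, 7, 19, 30];
    let covered = Rc::new(RefCell::new(HashSet::new()));
    let mut table = ProbeTable { probes: HashMap::new() };

    for &b in &blocks {
        let covered = Rc::clone(&covered);
        table.insert(b, Box::new(move |off| {
            covered.borrow_mut().insert(off); // set this block's bit...
            After::Remove                     // ...and unhook the probe
        }));
    }

    // Simulate execution reaching block 0, then block 19 three times:
    // the probe at 19 only fires once, because it removed itself.
    for off in [0, 19, 19, 19] { table.fire(off); }

    println!("covered blocks: {:?}", covered.borrow());
    // The side table is optional: a probe still installed == never executed.
    println!("never executed: {:?}", table.probes.keys().collect::<Vec<_>>());
}
```

The last line is the "I kind of lied" point: the set of offsets still holding probes is itself the record of what never ran.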
Okay, so I've implemented, and actually students have implemented, many analyses that are useful. In particular the loop monitor; one of my students spent a lot of time making the nice terminal colors there. It just profiles every loop, and it's really simple to implement: you put a counter probe at every loop bytecode. WebAssembly is nice in that it tells you where the loops are, they're the loop bytecodes. You just count, and it prints the results at the end. It's literally maybe 25 lines for the instrumentation and 200 lines to make it look nice with the loops and the nesting. The branch monitor is a similar kind of thing with a nice printout: it profiles every single branch in your program and tells you how many times it went each way. That's done with a probe inserted at the actual branch that reads the top of the value stack, figures out which way the branch will go, tallies them up, and prints them at the end. Code coverage we showed before. And all the execution-tracing modes in the VM are implemented with probes too, so there's no special thing in the interpreter that says "oh, we're in tracing mode." It just supports probes, and that's it; there's only one thing it needs to do. There's also a call profiler, so you can see a flame graph; I didn't show it here, but it records time as well. And there's something I call the hotness monitor, which records a profile of your entire program and ranks everything by how much of the execution time it accounts for. You can see, for example, that a benchmark spends 90% of its time in one tiny loop. It turns out that PolyBench, the benchmark suite that everybody uses, is exactly like that, so the hotness monitor told me in about five seconds that PolyBench is not representative of programs. And we do have a debugger too; it's a read-eval-print loop. You load your program, give it commands, and set breakpoints, and watchpoints too. It's all implemented with probes: nothing in the VM interpreter knows anything about the debugger. It only speaks probes, so it's just the one mechanism that does it all.

It turns out that probing the bytecode maybe isn't the level at which you want to write every analysis, even though you can build every analysis from probes. They're really building blocks, and you start to build a hierarchy: you could literally build everything from the one global interpreter instrumentation, but it's really slow. Local probes, since they're sparse, express intent; they only care about certain parts of the program. So we started to build a higher-level API that expresses the intent of what monitors want to see about a program: utility methods that instrument memory, or the entry and exit of every function. And that turns out to be surprisingly tricky to implement. For example, here's a Wasm function that just so happens to start with a loop. If you thought, we'll instrument this function by adding a probe at the beginning, and that will be our entry event; well, every time you come back around the loop, the probe will hit again, right? And if you think the exit is at the end, well, there are other exits too; there's a return in the middle there. So you actually have to be careful to place the probes properly to get just the function entry and exit. Instead, you can have a library that implements call-stack emulation: it does everything in terms of inserting probes at the right places, but it exposes user-level callbacks for function entry and exit. The engine still doesn't do anything special for instrumenting entry and exit, it still only does local probes, but the higher-level user API is just: tell me about function entry and exit. You can keep building blocks on top of each other. So this is a nice library; it's the nice pink-colored one on the slide.
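A rough sketch of that call-stack emulation idea, with one assumed capability: that a probe callback can ask the engine for the current frame depth. Wizard's real library and API differ; everything here is illustrative.

```rust
/// A shadow call stack maintained purely from probe callbacks. A probe on a
/// function's first instruction also fires on loop back-edges, so we compare
/// the engine's frame depth against our shadow stack to tell a real call
/// apart from another loop iteration.
struct EntryExitMonitor {
    shadow: Vec<u32>, // function index of every frame we've announced
}

impl EntryExitMonitor {
    /// Fired by a probe on the first instruction of an instrumented function.
    fn on_first_instruction(&mut self, func: u32, engine_depth: usize) {
        if engine_depth > self.shadow.len() {
            self.shadow.push(func);
            println!("enter f{func}"); // a genuine call created a new frame
        }
        // else: same frame re-reached via a loop back-edge; not an entry
    }

    /// Fired by probes on every `return` and on the implicit end of the body.
    fn on_exit_point(&mut self, func: u32) {
        let popped = self.shadow.pop();
        debug_assert_eq!(popped, Some(func));
        println!("exit f{func}");
    }
}

fn main() {
    // Simulate: f0 calls f1, whose body *starts* with a loop that runs 3 times.
    let mut m = EntryExitMonitor { shadow: Vec::new() };
    m.on_first_instruction(0, 1); // enter f0
    m.on_first_instruction(1, 2); // enter f1 (loop header, first arrival)
    m.on_first_instruction(1, 2); // back-edge: depth unchanged, no event
    m.on_first_instruction(1, 2); // back-edge again
    m.on_exit_point(1);           // f1 returns
    m.on_exit_point(0);           // f0 falls off the end of its body
}
```

The depth comparison is what distinguishes "first instruction because we were just called" from "first instruction because a loop branched back to it."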
Okay, so observability is great, but what is this graph? I claim observability has low overhead. This is running in the interpreter, and the axis is execution time, so up is bad, slower; 1.0 means no instrumentation, the baseline speed. What happens if you add the branch monitor, the thing that records the direction of every branch? It turns out you pay about 5 to 10% with local probes. If you instead did it with bytecode rewriting, where the engine doesn't support anything and you just inject a bunch of code, it's a lot more, around 2x, because you've doubled the number of instructions. But the problem here is that this is an interpreter, and that is not competitive performance. "You only made the interpreter 5% slower, good job," right? Everybody wants to run JIT code. Wizard does have a JIT, and it runs a lot faster than the interpreter, something like 5 to 25 times faster. What does that mean? It means this overhead number doesn't matter at all. I don't actually have a graph for the JIT yet, and there are lots of optimizations we do to make instrumented JIT code faster, which I'm going to show now.

Okay: Wizard, if it were just interpreter-based, would obviously be a joke. Nobody would care what your performance is. "Yeah, we added 5% overhead to an interpreter." Nobody cares. So, baseline JIT time. This is what I spent most of the beginning of the year doing: building a state-of-the-art single-pass JIT compiler. It supports local probes. It doesn't support global probes, but local probes are good enough to implement all these analyses. You can just think of it as: every time the JIT compiles a function, the function has been marked with exactly where you want the instrumentation, and the JIT just compiles the calls in. One of the neat tricks here is to use the same stack frame layout as the interpreter. There's a lot going on in this slide, but briefly, the blue thing at the bottom is the execution stack, the x86 stack the VM is actually running on. Those are execution frames stacking up and growing downward, and they're the same size for the interpreter and the JIT. The orange-ish red color is the value stack, where the Wasm values live, and that layout is also the same in the JIT, which means you can flip between the two execution modes really easily.
So the interpreter can run along, and in the middle of a function the engine can decide it's really hot, baseline compile it, and swap the frame; that's a constant-time operation. And vice versa: since you can insert and remove instrumentation, if you've compiled code specialized to some instrumentation and then you change it, you need to go back to the interpreter. Wizard does that; it switches modes depending on what you've done with the instrumentation. So you run at pretty much full speed and it adapts to what you're doing. That gives you easy observability too. Let's say your program hits a trap, it crashes, and you want a stack trace: since the JIT stack looks exactly like the interpreter's, you get the stack trace for the same price, and you can see all the values. So it's really great for debugging too, and there's not a lot of work the compiler has to do to support that. Tiering up from the interpreter to the JIT usually happens for hot loops; the other direction is really only for changing instrumentation. The VM never needs to drop back to the interpreter under normal execution, because it doesn't do speculative optimizations that would have to be deoptimized. Also, through the API you can change the values on the stack; these actual values here, you could go in and change them, for example if you're in the debugger poking at things and you change a local variable. That means JIT code that may have been specialized to those values is out of date, so we just switch to the interpreter, and the interpreter does the right thing.

Okay, so the baseline compiler basically looks like this: run through all the instructions, top to bottom, and generate code for every single one. It's a whole lot smarter than that; it actually does register allocation as it goes, so it doesn't literally generate code for every single instruction, but that's the basic idea. The key idea for probing is that when it hits a probe, it compiles in a call to your instrumentation. And of course that's a call: you're leaving the JIT execution context, so there's going to be some overhead. Well, it turns out that certain kinds of probes are very common, like "just count how many times this was hit." You can write that as a probe that keeps track of its own state, or you can use a counter, which is a special kind of probe, and the JIT will just inject the code right there. It ends up being just two instructions, so the overhead is almost nothing.

Okay, this is a picture of the trade-off between startup time and execution time; it's actually a year or two old, from my interpreter work. The horizontal axis is how much work you have to do per byte of code as you process it, and the vertical axis is execution time, so up is bad and left is good. Wizard's interpreter lives over here on the left: it does almost no work per byte, but it's much slower than a compiler. Optimizing compilers take a lot of time to process every byte but produce good code, so they're over to the right. You can see the different kinds of tiers; baseline compilers fit somewhere in between, and Wizard's JIT is right there in that nice little region. Those aren't actual data points, but that's basically the region it lives in. The other baseline compilers in the picture are from all the web engines, so Wizard's fits right where you'd expect a baseline compiler to be.
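To illustrate why counters get that special treatment, here's a sketch of the distinction, modeled in Rust. The types are hypothetical: in the real engine the counter case would become an inlined memory increment in the emitted machine code, not a runtime match.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// Two kinds of probe, from the JIT's point of view. A generic callback
/// forces a full call out of JIT code (spill state, call, restore). A
/// counter has a known shape, so a JIT can inline it as a memory
/// increment, e.g. roughly `inc qword ptr [cell]` on x86-64.
enum Probe {
    Callback(Box<dyn FnMut(u32 /* func */, usize /* offset */)>),
    Counter(Rc<Cell<u64>>),
}

/// What the "emitted code" does at a probed instruction, modeled in Rust.
fn execute_probe(p: &mut Probe, func: u32, offset: usize) {
    match p {
        Probe::Callback(f) => f(func, offset),   // expensive: a real call
        Probe::Counter(c) => c.set(c.get() + 1), // cheap: load, add, store
    }
}

fn main() {
    // A loop monitor: one counter probe on a loop header's bytecode offset.
    let hits = Rc::new(Cell::new(0u64));
    let mut probe = Probe::Counter(Rc::clone(&hits));

    for _ in 0..1_000_000 {
        execute_probe(&mut probe, 3, 42); // "loop header at offset 42 in f3"
    }
    println!("loop at f3+42 ran {} times", hits.get());
}
```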
So it's fast; it's basically on par with V8's baseline compiler. There are situations where it's better and situations where it's worse. The key thing is that you get all the instrumentation capabilities, and you don't even know they're there: you can pretend it's just an interpreter, insert all these probes, do all these callbacks, and it just works. I can make it better; that's what some of these notes say. And it really doesn't de-optimize for any reason other than instrumentation.

Okay, so another aspect of making a research VM is making it approachable. I honestly don't know how to measure this; it's in the eye of the beholder. But you can at least measure the simplicity of the system. Wizard is a lot smaller than the other Wasm engines out there: it's 18,000 lines of code, including the interpreter, the baseline compiler, and a lot of assembly code I used to make the interpreter fast, and that implements all the parsing and all the type checking and all of that. Just for context, the V8 Wasm engine, and I wrote a bunch of it, is 57,000 lines of code with no execution tiers at all: that's parsing, building the IR, doing all the things you need for Wasm, and integrating a bit with V8. Once you add execution tiers, it's a lot more. For one architecture, the baseline compiler is 17,000 lines of code, and the optimizing compiler is 289,000 lines. Obviously I don't have an optimizing compiler yet; it's going to be big. But for students, there's just not enough time to pass your eyeballs over that amount of code; they're not going to do it. So I think this is more approachable just by being a bit simpler.

How robust is it? Good question. I wrote it in a memory-safe language, Virgil, which is a side project of mine, so it's not written in C++. Rust is also a memory-safe language; the point is that you get a lot of properties by writing it in a nice language, and students, I think, find Virgil approachable. There's actually another tier I didn't talk about. I mentioned the interpreter and the baseline compiler, but there are really two interpreters. There's one simple interpreter, which I wrote, that uses arrays, algebraic data types, and higher-order functions, and that one is really easy for a student to look at and just add a bytecode to. That one is tier zero. You don't run it in production, not that Wizard is a production engine, but it's the thing where you can see what it even means to run Wasm. How self-explanatory is it? I don't know. I've tried to comment all the major pieces and make things clear, but that's a bit subjective. It needs more documentation; I know I need to do that, but it's not too bad.

Okay, just to give an idea of where all the code goes in V8: this is the representation of values in Wasm, including the GC values from the GC proposal. We've got references and i31s and i32s and i64s and f64s. If you've programmed in a language that has ADTs, you know why you would write it like this. It's very nice, right? You get the value semantics and all. This slide is the Virgil syntax for writing it, but you get the same thing in Rust and OCaml and SML and all those languages. Fourteen lines of code; that's the whole thing, that's the values.
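The slide itself isn't reproduced in this transcript, so as a stand-in, here's roughly what such an ADT looks like, rendered in Rust. The case list follows the talk; the payload representations are my assumptions, not Wizard's actual definitions.

```rust
/// One algebraic data type for every Wasm value, including the GC
/// proposal's reference values. Deriving the traits is what gives you the
/// value semantics (comparison, copying, printing) "for free."
#[derive(Clone, Copy, PartialEq, Debug)]
enum Value {
    Ref(u64),  // a heap reference (modeled here as an opaque handle)
    I31(i32),  // the GC proposal's unboxed 31-bit integer
    I32(u32),
    I64(u64),
    F32(u32),  // float *bits*, so NaN payloads round-trip exactly
    F64(u64),
}

fn main() {
    let v = Value::I32(7);
    assert_eq!(v, Value::I32(7)); // structural equality, no boilerplate
    println!("{:?}", Value::F64(0x4000000000000000)); // rendering for free
}
```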
The same code in V8 is 241 lines, because it does it the C++ way. You have a base class; inside there's a union, plus a tag that represents the type of the thing; there's a rendering method; and then there are subclasses and utility methods. And all it really does is implement value semantics for an ADT. You wouldn't show that to a student and have them immediately reverse-engineer how a value is implemented, or be able to add one. But you can add one here too; obviously you can do this in nicer languages as well, so it's not completely unique to Virgil. And that's it, that's all I got. Questions?

You can't modify the code; you can modify the value stack. This is an internal representation of the code, and it's only modified to keep track of which instructions are probed. But that's an idea: dynamic update of the bytecode. We could support that. Other questions?

That's right, yeah, that was the very first version. That was a long time ago, around 2006; your memory is pretty good. So Virgil is a language I did in grad school. It was designed for compiling to memory-constrained devices, and it compiled to C, so I had static initialization of the heap and everything. This version of Virgil is general-purpose: it has a heap, it has a garbage collector, it has an interface to the operating system, and so on. It's many years advanced from that, 15-plus years of working on it in the background.

Oh yeah, that's a whole other story. I mentioned JVMs at the beginning, so where did that show up here? The Java prototype. I've been thinking about how we build the next layer of languages on top, and to implement Java you basically need a pretty complicated runtime system: reflection, a representation of classes, a bunch of stuff. So this is Java as WebAssembly. I haven't published a paper about it yet; I wrote one and it was rejected. But there is a prototype in Wizard. It's not included in those 18,000 lines of code, but it's a tiny little JVM, and the way it works is that it interprets not Java bytecodes but WebAssembly with a small set of extensions. Actually, the extensions are encoded in a way that isn't really a set of extensions; it's mostly just some foreign functions. But it basically gives you the Java behaviors that are missing from WebAssembly, so you can run very simple Java examples. It's not a full-blown system, and it's pretty slow; that was before I had a JIT. I want to come back to it at some point.

Yeah, take over the world. No, actually, I want people to experiment with Wasm proposals. Stack switching is one; I'd like to have a student working on that, though I don't currently. I have implemented the GC proposal, but it needs some help to go fast, and there are some issues behind that: basically I need to make Virgil's garbage collector able to deal with new object shapes that arrive dynamically from the WebAssembly code you're loading. Eventually, when I have an optimizing compiler, I want to explore all kinds of analysis tools, all built on top of probes and things like that. And we're using it, so you know about WALI; I won't say more other than that having WALI in Wizard means being able to run whole platforms on top of this.
This only implements a bit of WASI Preview 1, not all of it, and that needs to be finished up. I don't know what Preview 2 entails; it would probably be a significant amount of work, and it might be nice to have a Preview 2 implementation that was itself Wasm. That would be great, because then I could just run it on top. I don't know how to make that work, but there are a lot of different things we can explore going up a level.

Interesting question. For this other project, there were asynchronous signals you could deliver to WebAssembly, and the way that's implemented, we used WAMR for that project, is a poll in all the loop headers: a poll for "is there an event?" So you can asynchronously send a signal to a thread running in WebAssembly, with multiple threads, and they get an interrupt and can call some function. So if you had hardware interrupts and you wanted to map them onto this, maybe you could make that work too. There's a small challenge in that you might get a hardware interrupt while you're in the middle of interpreting a bytecode, and you kind of want to get to the end of that bytecode before you service it. But yeah, cool. Any other questions? All right, thank you.