All right, welcome everybody. My name is Kyle Brown. I work at SingleStore as a senior software engineer, and I'm a co-chair of the Bytecode Alliance's special interest group for guest language support. Today I'm going to be talking to you about how we can get Wasm components for every language. Now, there are two main points I want to make throughout this talk. The first is that we can make components for every language. That's a feasible, practical thing that we can do and already have the tools to do. And second, the components that we make in different languages can work together: we're able to have them compose and interoperate trivially. The structure of the talk largely follows those two points. First, we're going to explain what components are and how they're structured and represented, what's actually in the guts of a component. Then we're talking about componentizing, walking through a few different approaches to making components, starting from the very bottom and doing it by hand, and building up to some future possibilities. Then we're talking about composing: what composition is and how we compose components. And we'll wrap up with a demo where we stack a bunch of Wasm components together. So, components. Well, what are they? I believe some of you saw Luke Wagner's great talk earlier. The component model is a proposal that's layered on top of core WebAssembly. And where core WebAssembly defines modules, the component model defines components. Modules are lower level. They speak in terms of numbers and linear memories, integers and floating-point values. Components are higher level and are able to express more complicated semantics like records, lists, arrays, strings, resources, and other facilities. And they're instrumental in expressing things as complicated as the WebAssembly System Interface, WASI. Now, that layering idea, that the component model is layered on top of modules, is also directly how it works. It's not just a metaphor.
A component fundamentally contains a module if it's going to do any real computation. Your computation is still expressed in terms of normal instructions in a core module. What's different is that the component has its own level of imports and exports, at that level of abstraction we were just talking about, with strings and records and such. So a component contains the module, these imports and exports at its own level, and the lifting and lowering, represented by those two arrows, that adapt the semantics of the component model down to the lower level of the module and give them a concrete interpretation. If we look at the actual anatomy and guts of it, we'll find that the sections of the component match that diagram I just showed you. We have an import section with a function at the component level, the C import here. Then there's literally a section that describes the lowering behavior: how to take that imported function at the component model level and turn it into a core function. And those options in that sort of dot-dot-dot "opts" section there are things like what memory is being used to put the data in, what encoding the string is in, how to allocate data and such, and that helps us translate from the higher level to the lower level to give it a concrete interpretation. Then we have the actual module itself with your code in it. The full bytes of it are in your component; components contain modules, generally. And then there's an instantiation of it, which looks a lot like a sort of linking definition, because fundamentally the component model is also a marriage of the interface types and module linking proposals. It was sort of a discovery that it's really hard to treat those two problems separately, and they're most usefully handled by creating a new thing that wraps those two semantics together.
And then we have a lift on the other side that takes that core function exported by the module in the center and lifts it: it gives it semantics and tells the component model how to understand it. This has all the same options as the lower did, as well as potentially a post-return function that can be bound as one of those options. That tells you what to call to clean things up after you're done reading the values from an export. And finally, we export it at the component model level. We make it visible as a component model function, as an external export of the component. This is the actual AST that you would see in the explainer specification for the component model. This is how components work inside. I'll note that I only have one module in this example component, but you can have as many modules as you want. We'll see examples later where there are more modules inside a component, and even components inside components. So, componentizing: the act of making components, "component" as a verb. The most trivial version of doing this is by hand. We'll quickly write ourselves a concrete component that does something somewhat interesting, anyway. That component is going to produce a hello message, "Hello from WAT", which we see in memory here. What's inside this component to begin with is a module. The module has a memory containing 8, which is the offset where the string begins in memory; 15, the length of the string; and then the string itself at offset 8. And it has a function that returns 0, because 0 is the offset in memory where that slice is described. Then we have a sort of core instance: we tell it to instantiate that module. And we alias its memory, which is just a thing we need to do so we can refer to it later in the lift. Then we have a section with a lift in it that says, hey, use the canonical ABI to lift this string.
Use that memory that I just defined, take this exported function called greet, use the string encoding UTF-8, and huzzah, we now have a function that returns a string. That's its result type. This is part of an interface. And what you see on screen right here is actually a thing that you can pass into wasm-tools parse and get a real component binary out of, which you can call using Wasmtime and get a string out of. This actually works. Now, you're not really likely to do that by hand very often. I don't know how many of you have, like, solved a business problem by writing the business logic in assembly or WebAssembly directly. What you're likely to do, at least for languages that we can already ahead-of-time compile to modules, is use the wit-bindgen method. To explain the wit-bindgen method, first let's talk a little bit about WIT. It's this format. You all actually got a flyer in your materials today with a WIT guide, a little WIT cheat sheet that explains the semantics of WIT. This is what WIT looks like. We're defining a little interface, uncreatively named "interface" itself, in my little WasmCon 23 greet package. That interface just says: export a greet function that returns a string when you call it. The component we just looked at, which exported just this greet function, has the shape of this world, like Luke was talking about earlier, where it just exports this interface. That's all it does: it exports this interface. But we can also define other worlds, like this proxy greeter world, that say, hey, I'm going to import a greet interface and I'm going to export it. This means that when we call this component's greet function, it can call this other greet function and compose them together. It can concatenate things onto it, for example. That's what we'll do in the example later.
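To make the memory layout of the hand-written component concrete, here is a minimal Python sketch of what lifting that string amounts to. It is an illustration under assumptions, not the canonical ABI implementation: the helper names (`build_memory`, `greet_core`, `lift_string`) and the exact greeting text are hypothetical, and linear memory is simulated with a `bytearray`. It follows the layout described above: a (pointer, length) pair at offset 0 and the UTF-8 bytes at offset 8.

```python
import struct


def build_memory(text: str) -> bytearray:
    """Simulate the module's data segment: (ptr, len) at offset 0,
    then the UTF-8 bytes of the string at offset 8."""
    data = text.encode("utf-8")
    memory = bytearray(8 + len(data))
    # Two little-endian u32 values: the string's offset and its byte length.
    struct.pack_into("<II", memory, 0, 8, len(data))
    memory[8:8 + len(data)] = data
    return memory


def greet_core() -> int:
    """Stand-in for the core export: returns the offset of the (ptr, len) pair."""
    return 0


def lift_string(memory: bytearray, ret_ptr: int) -> str:
    """Read (ptr, len) at ret_ptr, bounds-check, and decode as UTF-8.
    Decoding raises UnicodeDecodeError on invalid bytes, which plays
    the role of validation here."""
    ptr, length = struct.unpack_from("<II", memory, ret_ptr)
    if ptr + length > len(memory):
        raise ValueError("string slice is out of bounds")
    return bytes(memory[ptr:ptr + length]).decode("utf-8")


memory = build_memory("Hello from WAT!")  # illustrative text, 15 bytes long
print(lift_string(memory, greet_core()))
```

The point of the sketch is only that the core function deals in integers and bytes, while the lift, given a memory and an encoding, is what turns those into a string-typed result.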
So, the wit-bindgen method: you pass the WIT for your component into the bindings generator to create source-language bindings for your component, and, you know, if your language is C or Rust, you'll get those sorts of bindings. It usually also embeds some information in there about the type of the component, because we'll need that later. Then when you combine that with your business logic, the code that the user wrote, and you run it through your compiler, you'll get a module at the other end that has a custom section with the type info. This is really useful because wasm-tools component new, which Luke also mentioned in his talk today, can take a module with this custom section that defines its interface and turn it into a component. It'll generate the lifts and lowers it needs to adapt your module into a component, and it will wrap it in the requisite way. Concretely, this is how you would do that for C. wit-bindgen has a C generator, and it is able to make a header file and a source file that are named after your world, and it contains the type info we were talking about earlier that we need to smuggle into the system somehow. We're doing it through an object file that becomes a custom section, which then gets interpreted by the component new tool. Then we combine that with this component.c file that has our actual logic for the proxy greeter, and we get the module we need. And from the module, once again, we run this component new tool, and it works just like before. So what's actually in component.c? Well, this is an implementation of component.c. This is the greeter that appends "and C" to whatever the imported greet function returned. So if your imported function returned the string "Hello from WAT", like we did before, and you compose that with this one, you get "Hello from WAT and C".
All we've done, really, is take this "and C" suffix string and allocate and concatenate it onto the string that we got from our import. This is also the same flow that happens for Rust, albeit inside of cargo component, which wraps this process so that you don't have to call these tools yourselves. You can just call cargo component build and this will happen for you. But under the covers, it's also using wit-bindgen. It's generating a module of code that is inserted into your sources and combined with your library code, and it's compiled with rustc to make a module with the custom section. And like before, a module with a custom section that has type info can be made into a component, and so we do so. Our source code to do that looks like this. You'll notice the Rust code is a little easier to read than the C code. All we have to do is implement this Guest trait for our component, where we call the imported greet function and add "and Rust" to the end of it. And we have a component that adapts an import to produce an export in Rust. Now, this approach I'm talking about, this bindgen approach, works great if you already have a compiler that can take your source code and ahead-of-time compile it to a module. But what if your language is interpreted, or it has a runtime and garbage collection or other facilities, so that it isn't just a matter of the whole code AOT-compiling to WebAssembly directly? Then it's not quite so easy. For that we have another approach layered on top of this, which I'm tentatively calling runtime wrapping. In this process we're going to do roughly the same thing. We'll still generally generate bindings in your language that adapt the types and create the stubs of the functions that you need, and you'll combine your logic with them to make a code module. But that module might actually only have binary data. If it's bytecode, like for the JVM, your module there could literally just be a data section that is the bytecode.
And that's fine. Or it might be ahead-of-time compiled to functions. But either way, what you're going to end up doing is composing it with a pre-built runtime, in a way that makes a new component that links the runtime with your source code or compiled source code to make this runnable component. And because these runtimes all tend to assume a certain environment is available to them, you tend to get a component that imports the WASI CLI world stuff. But if we want to, we can use WASI-Virt to hide that import of the WASI CLI stuff. We can create a virtualization of the file system, of the clock, of random, in a way that's sufficient to run these runtimes, and create a component with no additional dependencies, where we just have our import of greet and our export of greet. There are two main tools that currently take this approach. One is ComponentizeJS, which, if you're in this room, you're unfortunately missing the talk on by Guy Bedford, due to timing. This uses a pre-built SpiderMonkey runtime that has been designed to be instrumented and have arbitrary bindings added to it. It has some cool features, like the ability for users to configure the sort of JavaScript environment they work in: the globals and imports that are available in it, the flavor of JavaScript environment for your code. It also uses snapshotting to improve startup speed. Since it's WebAssembly, it's able to run it and pre-initialize it, snapshot the memory state, and then compose that into your result as well, so that you start up with the engine already in the state where it has parsed your code. It doesn't have to wait to parse your code for no reason. And that execution is all completely sandboxed, so there's no concern that you're non-deterministically pre-initializing it, or that the pre-initialization is going to damage the isolation of the process. The code for this is very simple, actually.
The binding generation for JavaScript is quite nice in that you literally get to import greet, because our interface is that we import greet. It's quite wonderful. Then we define this greet interface, which is the interface we're implementing for the world, with our greet function that very trivially implements greet as calling the imported greet with "and JavaScript" added at the end, and we export it. What's cool is that this is actually also the shape of the component. The other tool that we have is componentize-py. This uses a pre-built CPython runtime. It has high-level bindings that are generated for Python, but the low-level lifting and lowering stuff, the real nitty-gritty bindgen, is actually generated as a Wasm module, which is a little different from how it works for some of these other things. You don't generate, like, C code or something; you get actual Wasm. Part of this is to work with some of the other cool things that componentize-py is doing, one of which is extension linking. Because Python has so many popular libraries that use C extensions, like NumPy, pandas, scikit-learn, etc., a sort of dynamic linking approach was needed. componentize-py has this ability to link together dynamic libraries as modules inside of your component, figure out how the symbols need to be defined and imported, and link all that together. It's a remarkably clever thing, and you should go to Joel Dice's talk tomorrow to learn more about it. It also does the same sort of pre-initializing snapshotting that the JS tool does. And the code for that is also very simple. We import a class that we need to implement, this abstract class called interface, from exports, and we import interface from our imports. That means we're able to call the imported greet function inside our own greet function. Once again, these are very simple implementations, because all of the hard work of making a component has been externalized to bindings generators and these componentizing tools.
So, fundamentally, these two approaches can take code that can be AOT-compiled to modules and code that can't be AOT-compiled to modules, and make components out of both. So we should be able to componentize code from arbitrary languages. There shouldn't be a limit to what we can componentize. That said, in addition to these two approaches, there are also some future possibilities that go beyond what's been outlined in those two examples. For example, you could start to use the garbage collection proposal, Wasm GC. One note I want to start out with, though, is that you can already implement GC languages using linear memory. There may be advantages to using Wasm GC instead. If you are going to use it, there are some challenges that currently exist. Wasm GC doesn't build in all the features used by different languages' garbage collectors. So if you have something like interior pointers, you may have to use a higher-level abstraction to support Wasm GC. There may be some effort there. Your GC is often tightly coupled to your runtime, so some work would be needed to decouple it. Wasm GC support is still a work in progress outside of browsers, unfortunately. As our previous speaker mentioned, the work is still underway in Wasmtime. In addition, the component model doesn't yet support Wasm GC at the boundary. Now, you can use GC internally. The component model won't have any opinions on what you do inside of your module. But if you want to pass something out to other code, the lifting and lowering system that abstracts data to move it through the system can't read directly from GC. So you'll have to make an intermediate copy in your linear memory that the canonical ABI will then read from and write to, which is a current limitation that will hopefully go away in the future.
But in the future, as these things become more mature and more developed, Wasm GC can be really useful, especially if you're a new language that gets to start from the bottom and build on Wasm GC as your GC, or you're a language that's small enough that you can decouple and rip the GC out of your system and substitute these GC instructions instead of having all the GC code that's already linked into your runtime. Conversely, if you're a really big, really old project, it's harder. I expect Python might have some challenges finding a way to rip the GC out of their system and abstract it to use Wasm GC. And if you do that, it will require you to componentize differently. The other thing that I think would be cool to see in the long run is deeper toolchain integration, which is to say that compilers could interpret WIT and components directly and understand them. The compiler could do the mapping of its own types. So if you have a JavaScript string, or if you have a build tool for some language that has its own string type, and you have a function that returns a string, and it just knew how to generate the lifts and lowers required to do that on its own without needing binding generation, then that could be really interesting. One thing you could potentially get out of that is a better compiler user experience. Say you exported a function that returned a string, but you needed a function that did something different. Today you might find some unclear error in your binding generation that says, oh, the bindings type didn't match your type. If instead it just said your component type and your source code types don't match, that would be clearer to users. Also, as async features are added, languages might want to be more aware of them. The component model is going to have evolving async support, with Preview 3 hopefully coming sometime in the next year or the year after, to integrate more deeply.
And the more that languages understand the component model directly, the more is possible there. So, food for thought: GC and deeper toolchain integration. With that out of the way, let's talk briefly about composing. I'll say frankly that I don't actually have a ton of slides on composing, because it sort of "just works", in quotes. Which is to say that it works a lot like it would work for normal modules. If you have imports and exports whose types match, you can instantiate them together. You don't have to match all of them, of course. You're not required to produce a closure that fully wraps all the imports and exports. As long as some of their exports and imports match, you're able to create a graph of these components. One thing that's cool with components is static composition. We're able to take two components and actually make a new component that expresses their composition. You don't necessarily have to use the APIs in your linker or in the JavaScript API for WebAssembly to instantiate them together; you can create this compound expression of the two components that is then itself a component, which is a powerful tool that we don't have without the component model. Now, the one challenge that composition has for components, which is not a thing modules have to think about because they only have one type system, this low-level one of integers and memories, is how to actually lift and lower across two components. Take strings as an example of a type that has different concrete representations; they can be encoded differently. String just means a sequence of Unicode scalar values, but that's not a bit pattern. A Unicode string fundamentally does not have a bit pattern. UTF-8 has a bit pattern. UTF-16 has a bit pattern. Strings do not. So if we want to pass a string from one place to another, it's important to know what the underlying representation of those strings is.
For example, if on one side it's UTF-8 bytes that are lifted to being a string, and the string on the other side needs to be lowered to UTF-16, then the operation we do is fundamentally whatever is minimal. In this case it's a transcode. If we lift on one side and lower on the other side, we can create the minimal conversion adapter between the two concrete representations. But a string is never constructed. There's not a process whereby we say this is the format that all strings have, the single ABI of all strings, and we're going to make that string and then pass it across the boundary. String is an abstract type, and depending on what concrete type is on either side, we do the minimum conversion possible. For example, if the two things on either side were both UTF-8, we would get to just have a fused copy-and-validate operation that just asks, is this value UTF-8? I don't want to hand somebody garbage and potentially propagate issues between the systems. And they have disjoint linear memories, so we need to copy. But that's all you have to do. There's no transcode; we're not transcoding to UTF-16 and back. And this is true for any pair. UTF-16 to UTF-16 would do the same thing: it would just do a copy and validate. We don't force arbitrary round trips through concrete types. That's pretty cool, and that's fundamentally what composition looks like. The types might be more complex. It might be records and such on both sides, but the philosophy is the same. There's a lift on one side and a lower on the other side. You link them together and make the adapter. And that's what your component tooling is doing for you. Now we're going to go to a demo I'm calling Tower of Wasm. First, what I want to do is just show the code for each of those examples in this VS Code project that I have. All those examples we saw before are the actual code I'm going to run during the demo.
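Stepping back to the string-passing discussion, the "do the minimum conversion" idea can be sketched in a few lines of Python. This is a simplified illustration under assumptions, not how any engine implements its adapters: the function name `pass_string` and the returned labels are hypothetical, and Python's `decode` is used only as a stand-in for validation (a real fused adapter would not build an intermediate string object).

```python
from typing import Tuple


def pass_string(src: bytes, src_encoding: str, dst_encoding: str) -> Tuple[bytes, str]:
    """Move a string between two components with disjoint memories.

    Returns the bytes to place in the destination memory, plus a label
    naming which operation was needed.
    """
    # Validate the source bytes; raises UnicodeDecodeError if they are
    # not well-formed in the declared source encoding.
    text = src.decode(src_encoding)
    if src_encoding == dst_encoding:
        # Same concrete representation on both sides: copy and validate only.
        return bytes(src), "copy+validate"
    # Different representations: a single transcode, no round trip.
    return text.encode(dst_encoding), "transcode"


utf8_bytes = "héllo".encode("utf-8")
print(pass_string(utf8_bytes, "utf-8", "utf-8"))      # same encoding on both sides
print(pass_string(utf8_bytes, "utf-8", "utf-16-le"))  # UTF-8 lifted, UTF-16 lowered
```

In both branches the source bytes are checked before anything reaches the other side, which mirrors the concern about not handing garbage across the boundary.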
The point I want to impress here is that the code you saw on the slides is the real code, and it's been built into components using tools that exist in open source today. These components down here are the five components I showed during the talk today. And to boot, we can use this cool tool called WasmBuilder that lets us drag and drop components together to chain an arbitrary sequence of components, linking the exports of one to the imports of the next. So what we can do, for example, is take this composition and download it as, we'll just call it, component one. Then if I go to my shell, actually let's go to the other shell, we can use this runner that I've created using Rust, which embeds Wasmtime, to execute this component. And the cool thing here is the output; let's read through it. This says "Hello from WAT and C and Rust and Python and Rust and JavaScript and C and Python". If we go back to the example we had in the browser there, we'll see that it's C and Rust and Python and Rust and JavaScript and C and Python, which hopefully is exactly the same. What's interesting is that we just executed a composite component there that has Python on the top, and then C, and then JavaScript, and then Rust and Python and Rust and C and WAT, and we executed a call all the way down that import graph and all the way back up again, building this string, and it works. The different encodings between the different languages, the bindings requirements, the composition, the lifting and lowering: it all happens. And this is a component that includes JavaScript and Python, which are actually embedding WASI virtualization in them in order to virtualize their file systems, and all of this is something that is present and working today. The cool thing that I want to invite the audience to do, assuming we have time, is that we're going to build our own, as an audience, here today.
The one requirement is that WAT has to be on the bottom, because, as I showed earlier, it only has an export. But I want to ask the room to just shout out a language, and we'll pick what goes next on the end of this chain. So can I have a language? Rust. We'll do Rust next. And another? And another? Well, let's go out of order a little bit, guys. Something else? Python? We're feeling very creative today. I'll just start grabbing some random ones, then. I'll grab Rust, and we'll go back to JavaScript again for kicks, and then pick me something else. JavaScript again. JavaScript again. Why not? And finally, we'll do one more. I heard C. Right. Now I'm going to need a second to link all these together with my little trackpad. All I'm doing here is drawing an edge between these to say, yeah, I want to take this export of this one component and make it the import of another. You guys probably can't even see the lines. When we click the download button at the end, it will generate a composed component, and during that process it will validate that these imports and exports... Oh, it's checking them already, here we go. Well, then, we're already getting validation as we go. The last thing I'm going to do is check this box on C that says that we're going to export the exports of this as part of the overall component we build. We're going to download this as component two. And then... Oh, I think I clicked outside the box there. Component two. Oh, yes, for JavaScript and Python it does. And so what we're going to do here... Whoops. Before we run it, let's verify what we expect it to say. We expect it to say, "Hello from WAT and Rust and C and JavaScript and Python and Rust and JavaScript and C". And let's see if that's what we see. Right. So we have "Hello from WAT and Rust and C and JavaScript and Python and Rust and JavaScript and JavaScript and C". And would you look at that? That is the exact order of composition that we just did.
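The stacking in the demo can be modeled in plain Python to predict the output for any ordering. This is only a model of the idea under assumptions, not the demo's code: each "component" here is an ordinary function rather than a real Wasm component, and the names `base_greeter`, `proxy`, and `tower`, as well as the exact greeting and suffix text, are hypothetical.

```python
from functools import reduce
from typing import Callable, Iterable

Greet = Callable[[], str]


def base_greeter() -> str:
    """Bottom of the tower: only exports greet, imports nothing."""
    return "Hello from WAT"


def proxy(language: str) -> Callable[[Greet], Greet]:
    """Model of a proxy greeter: imports a greet, exports a greet
    that appends its own suffix to the imported result."""
    def link(imported: Greet) -> Greet:
        def exported() -> str:
            return imported() + " and " + language
        return exported
    return link


def tower(languages: Iterable[str]) -> Greet:
    """Link each proxy's import to the export of the one below it,
    in the order the languages are listed (bottom to top)."""
    return reduce(lambda below, lang: proxy(lang)(below), languages, base_greeter)


greet = tower(["Rust", "C", "JavaScript", "Python"])
print(greet())
```

Because every layer has the same import and export shape, any ordering links, which is the property the audience-driven ordering is meant to show.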
And what's cool is that we can do this in any order we want. The reason I wanted to do an audience interaction thing is to show that this isn't something that only works under specific combinations or in a specific order. There is, generally, the ability to compose languages compiled to components, in any way that's legal with the WIT type system. And that's mostly the talk. So... All right, I want to do some acknowledgments really quickly: SingleStore, my employer, for helping me work with the Bytecode Alliance on these projects, as well as lots of people who gave input on these slides or helped make the tools and the demos work. And with that, that's what I've got. Right, so we actually have some time for questions, and I'm happy to revisit slides to answer any questions people have. So, what's lifting and lowering, was the question? Let's go back to a previous slide as a tool here. Lifting and lowering is... Conceptually, I want to think about it in terms of abstraction. There's a low-level, concrete thing, like the UTF-8 bytes that represent a string, and a high-level thing, the component model string type. Lifting is the operation that tells you how to go from one to the other. If I return a string and what I actually return is some bytes, then my function returns a string, so I need a way of getting those bytes to somebody else. And to go through the component model, there's this sort of V pattern you go through. You lift up to the right and then you lower down to the left, because you're going up through the component model. That's a way to visualize it: it's a way of converting concrete types to these abstract WIT types and back again. So we lift exports and we lower imports. And we lift and lower arguments and returns based on which thing it is. For an export that returns something, we have to lift the return of the export.
We actually have to lower the argument of an export, because if you pass a string into something, that thing is going to have to tell us how to turn that abstract string into a concrete string in memory that it can process. Lifting and lowering is just the name we give to this process of mapping values between core WebAssembly and the component model. But a JavaScript string is a concrete choice of what a string looks like. So we're lifting that concrete choice of what a string looks like to this abstract concept of a string by telling it what encoding to interpret it as, you know, to interpret it as a sequence of Unicode scalar values, because JavaScript doesn't have concrete bytes. Then it would be on the right-hand side of this example. It would be like the UTF-16 bytes on the right. It's actually UTF-16 even, so it works there. So then, yeah, if you're passing a value into JavaScript, if the JavaScript is calling an export of something else, like an import that returns a string, then it would need to lower it into its internal JavaScript representation. And then we have a couple of other questions. Yeah, so with the exception of resources, which are handles that act as a sort of referential passing mechanism, strings and records and all these things are value types that are structurally passed. We have a queue of three questions, so I'm trying to get them in order. What do you mean by pass-through strings? Oh, that's an interesting question. There's not really a way to get, like, an opaque handle to a value without ever seeing it. Yeah, fundamentally, there's not that ability to take something from an import and pass it straight to an export without ever seeing it yourself. It has to get lowered into you, and then you would lift it for the other person. There's maybe an optimization that would be interesting if we could have some way of doing that.
I don't think that's fundamentally impossible, but it's not something that's a use case. Yeah, that's true. For strings and value types, you cannot do this. For resources, if somebody passes you a resource, you get a handle; you get an integer. To pass this handle to somebody else, you pass them an integer. There's no... That sort of represents an alias to that string. Maybe it has a get method that returns a string, and you just choose not to call that method because you don't care. Right, so for example, the HTTP request and response headers are going to be resources that you can get the string from, but if all you're doing is proxying and maybe modifying a couple of things, you don't need to see every byte that makes up that request. You're just passing the handles through. There's another question here. There is a pool of... Right, so we currently have what's called the canonical ABI, which is a single parameterized ABI. It's parameterized in terms of encoding, for example, and which memory you're using, and your allocation function, but it's a single parameterized ABI that's currently possible, and it has a finite number of parameter values, a finite set of encodings. So at the moment, yeah, passing strings between components is just a question of having an answer between the finite parameters that are possible for lifting and lowering a function import and export. In the future, the interface adapters, as described by the interface types proposal, will eventually, far into the future, be a thing that the component model will attempt to pursue. So for example, we'll likely have functions at the component level that are adapter functions that give you full control over the ABI that your data is being passed in. In fact, that's one of the ways we could eventually support GC really well: by letting you customize the adapter by which we lift and lower your data.
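Returning to the resource handles mentioned in the answer above, a handle table is one simple way to picture why a proxy can forward a resource without reading its contents. The sketch below is a toy model under assumptions, not the component model's actual resource machinery: `HandleTable`, `Headers`, and `forward_untouched` are hypothetical names, and real handles carry ownership rules that are not modeled here.

```python
from typing import Any, Dict


class HandleTable:
    """Maps small integers to resources owned on the other side of a boundary."""

    def __init__(self) -> None:
        self._slots: Dict[int, Any] = {}
        self._next = 1

    def add(self, resource: Any) -> int:
        handle = self._next
        self._next += 1
        self._slots[handle] = resource
        return handle

    def get(self, handle: int) -> Any:
        return self._slots[handle]  # KeyError if the handle is not live

    def drop(self, handle: int) -> None:
        del self._slots[handle]


class Headers:
    """Toy resource with a getter; a proxy may simply never call it."""

    def __init__(self, fields: Dict[str, str]) -> None:
        self._fields = fields

    def get(self, name: str) -> str:
        return self._fields[name]


def forward_untouched(handle: int) -> int:
    """A proxy that passes the resource along: only the integer moves."""
    return handle


table = HandleTable()
h = table.add(Headers({"host": "example.com"}))
print(forward_untouched(h), table.get(h).get("host"))
```

Contrast this with a string, where the bytes themselves would have to be lowered into the proxy and lifted back out again.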
Then what it would need to do is marshal to a valid form of the canonical ABI and then do that. So for example, if your language internally uses GC today, then you're going to copy the data into linear memory in one of the supported encodings and shapes to do this. We had another question here. Wonderful. A question in the back. The canonical ABI is only so parametric, only so flexible. If your language was really far from the canonical ABI, then that could involve some cost between the two. But as much as possible, and specifically in this example for strings, we're able to do the minimal amount of work. And as we eventually get adapter functions in the distant future, they will really be the full realization of reducing the work between two things to as close to zero as possible. But what we have right now is a distinct amount of work that you have to do to convert between these representations. Another question. Yeah. Since they're passed by value here, if you modify a value you've received, it doesn't do anything to the person who gave it to you. There are some interesting things that are possible here. It's sort of this question of passing data through components without requiring multiple conversions. There are some interesting things that I would be curious to see. You know, if I call an export and it gives me a value, and I pass that value as the argument of another export, wouldn't it be cool if I could reduce the data movement that's there? You ideally don't want to do a double re-encode. At the moment you would, because of current limitations of the canonical ABI. Right. There are possible ways of caching and preserving multiple versions of these values, but that's nothing that's currently possible. We only have a couple of minutes, so I'll take one more question, then we'll wrap. In the back there, we have one last one. That's a great question. There is work on debugging, like adding debug symbols to WebAssembly modules and components.
There's a special interest group of the Bytecode Alliance focused on debugging. Currently, I think if we wanted to debug what we just ran, I imagine it would be quite a pain, to be fully blunt. There is a representation of debugging symbol information as custom sections in WebAssembly. The challenge here is that since there are so many of those, I don't know that the current tools are... Peter, I don't know if you have thoughts on that. Okay. You looked like you maybe had something to say. Well, then I'll revise my answer to: I haven't tested it, so I can't promise. And with that, I think we'll wrap. Thanks, everybody.