All right. It is my pleasure to introduce to you today our final speaker, Luke Wagner. Luke is a distinguished engineer at Fastly, where he works in the Office of the CTO on WebAssembly standards and evolution. Luke is also currently a co-chair of the W3C WebAssembly working group and champion of the component model proposal. Before Fastly, he worked on Firefox for 11 years, where he helped create WebAssembly itself, and today he's going to explain to us what a component model is and why. Please help me welcome Luke. Hi, everybody. Thank you for that introduction. Very excited to be here. And yeah, my name is Luke Wagner. I work at Fastly, and I'm gonna be trying to answer the question: what is a WebAssembly component, and why? But first, to motivate that, I'd like to ask the more basic question: what is WebAssembly, and why? And I appreciate that we're here at WasmCon, and people probably already have a pretty good idea of this, but just bear with me for a second. So to quote the standard definition, WebAssembly is a binary instruction format for a stack-based virtual machine. Wasm is designed as a portable compilation target for programming languages. And I think the operative phrase here is compilation target, which means we can take our variety of source languages, and in addition to being able to compile them to all the different native instruction sets, we can compile them to WebAssembly, and then we can take that Wasm and send it to a browser or a Wasm engine of some sort, which will internally compile it to the actual instruction set it's running on. And what this buys us is portability, determinism if we need it, control flow integrity, and sub-process sandboxing. So what are we doing with all these cool powers? Well, today in browsers, one popular use case is porting a large code base, especially C++. Examples include Unity, AutoCAD, Photoshop, Google Earth, and Figma. Another is offloading compute-intensive subtasks from JavaScript to Wasm.
Examples include codecs, compression, inference, encryption, and filters. Outside the browser, it's popular to embed Wasm in your existing system to bring your guest code closer to your system. Examples include CDNs, like we're doing at Fastly, databases, client apps, SaaS apps, Kubernetes, and streams. And lastly, people are using Wasm to explore alternative models of distributed computing, like serverless, distributed actors, record and replay, and edge computing, also like we're doing at Fastly. And so that's awesome, that's a lot of great uses of Wasm leaning into its special powers, but here at WasmCon it's fun to ask: what if we wanted to see a hundred times more Wasm usage? What big new use cases could we unlock that would make that happen? And so I'd like to consider four. Number one: today on many compute platforms, the client controls the application code, of course, and the platform controls the implementation and the interface, but the platform also controls the language runtime, and therefore your choice of language, and the SDK used to access the platform. This is both a lot of work for the platform to build and maintain, and it also limits the choices of the client. So instead, we can lean into Wasm's language neutrality and linkable modules, and say: yes, the platform controls the implementation and interface, and the client controls their application and compiles that to a Wasm module, but the client also gets to pick the language toolchain. And the language toolchain brings a language runtime, compiled to Wasm, but we don't want to duplicate that in every client application, so let's share that in a registry so that we're actually sharing the machine code at runtime. And the language toolchain also brings a bindings generator, which can generate the SDK that the client uses to access the platform interface. And then it's these three Wasm modules, linked together, that is the actual client application.
And yeah, this is a pretty normal thing to do with OpenAPI and gRPC, but we're doing this without any network stack involved, so that we can achieve a very lightweight embedding. And so the wins from this approach are: the client gets to choose the language, the host does a whole lot less work, and this also helps emerging languages, because now you don't have to ask permission to run your new language on someone's platform; you just build the toolchain. So I call this use case SDKs for free. Number two: today if I'm gonna reuse code, I have a ton of language choices and packages and registries, but they're all relatively siloed from each other, so it's difficult to reuse code across language boundaries. And furthermore, when I build an application, all those packages that I use transitively go into the same memory space, where they can collide in shared memory. And they also share the same capabilities for things like secrets and file systems and network services. So if a package upstream that I'm using is exploited, and I pull that in as a transitive dependency, it has access to all the capabilities of my application, even though that particular package didn't really need that particular capability. So we can lean into Wasm's language neutrality and sandboxing support, and say: let's have a single registry of different packages from different languages, compiled into Wasm modules. Ideally one that stores its contents in an OCI registry, so it can reuse all the existing cloud infrastructure that exists today. And then when I build a single application, I wanna put each one of those packages in its own separate memory and give it just the capabilities it needs. By doing that, I can reuse code from any language while mitigating supply chain attacks.
And it also helps emerging languages, because when your language is new, you don't have to bootstrap a whole package registry; you have access to all this code already built in all these other languages. So I call this use case secure polyglot packages. Number three: today if I want to avoid a big-ball-of-mud architecture, one option is to adopt a microservice architecture, where I put my separate modules in separate services and have them talk to each other through HTTP. That achieves a very strong degree of isolation, but it does come at the cost of significant complexity and runtime overhead. And maybe that's what I need, if I wanna independently scale the services as part of the overall scalability of my application. But what if I don't need that independent scalability? And this is a popular question: every month, it seems, there's a top-rated Hacker News post emphatically saying that just because you need modularity doesn't mean you need microservices. And instead they'll recommend a modular monolith, where you just use modules in your source language and you call between them with fast function calls. And this addresses that complexity and overhead, but now I'm generally limited to a single language family on the VM I'm running on. And I have relatively less isolation, because there's shared global state in most languages that these modules share. So instead we can lean into Wasm's language neutrality, sandboxing support and linkable modules, and say: we'll compile the separate modules to separate Wasm modules and give them separate memories, but we'll still be able to do fast function calls between them, because that's a thing Wasm can do today. By doing this we combine the benefits of the microservice architecture (memory isolation and language choice) with the benefits of the modular monolith (efficient cross-module calls).
And so I call this use case modularity without microservices. And lastly, number four: today if I wanna factor out code between teams, let's say I have four teams with four services, and I wanna factor out some authentication or RPC or observability logic, maybe have that owned by my platform engineering team, there are two popular options. One, I can embed that shared code in each service independently as an embedded shared library. And this works, but it's harder to update: when I update the shared code, I have to roll that out to all the different services. There's a certain amount of per-language effort to make that library if I'm using multiple languages, and libraries can be a leakier abstraction. So what's popular in the service mesh world is to use a sidecar proxy pattern, where you say: I'm gonna have my service talk through a network stack to the shared code, which then talks to the real network. And that can address those previous three problems, but it does come at the cost of significant runtime overhead. So instead we can lean into Wasm's sandboxing support and linkable modules, and say: my separate teams separately develop their code and compile it to separate Wasm modules. And between those teams there's usually some sort of interface; sometimes it's called an internal developer platform. And when I deploy my business logic module to the internal developer platform, I just link it with the other module to make a compound module. Now I can deploy that to the underlying infrastructure platform, which just has a Wasm engine that doesn't have to know anything about my internal developer platform; it's just running two Wasm modules linked together. So the benefit of this approach is that it combines low-overhead calls between modules with strong encapsulation between layers. And this really starts to matter once you have multiple layers.
For example, let's say my business logic is actually heavily using POSIX and doing lots of file system stuff, and I don't actually have NFS in my internal developer platform; I have a key-value store. So I wanna virtualize the file system, using emerging Wasm ecosystem tooling that I'll actually be talking about next. To do that, at build time I can link those two modules together, get the compound module, and link it a third time. Now I've got three modules that I'm running on my Wasm engine, which still doesn't know what's going on above it; it's just running three modules linked together. Let's do it one more time, because let's say I'm developing my internal developer platform not directly on the infrastructure, but on some function-as-a-service or platform-as-a-service, which can provide its own higher-level features and infrastructure abstractions as a Wasm module that I link in. Now I have four modules linked together, and you can kind of see the pattern: as the number of layers grows, I hope this illustrates that you really want that low-cost, strong isolation. So I call this use case virtual platform layering. So okay, four use cases. How do we support these use cases? I think fundamentally what this requires is that we expand the scope of what we're building: from a new compilation target for existing ecosystems, where we're comparable to things like x86 and ARM, to a new ecosystem built around a new compilation target, where instead we're comparable to things like containers, npm, Nix, Maven, Helm, Debian. So what makes an ecosystem? Well, I don't have a fully general answer, but at least for the comparable things I just mentioned, the pattern seems to be: you start with a standard distributable format, you have tools to build that format from different sources, you have tools to deploy and run that format, and then you have tools to share and compose that format.
So for example, in the container world, OCI of course defines the distributable format, what a standard container is. docker build will build you a container from sources, docker run and Kubernetes can run and deploy it, and to share and compose I can use docker push and docker compose. So the natural question is: well, can we just use Wasm? It's obviously a standard distributable format, and of course it has existing Wasm compilers that we wanna use. But the challenge is that Wasm mostly supports shared-memory linking. So if I have two Wasm modules and I wanna link them together, practically the only way to do that is to have them share a memory, so they can pass compound values through that shared memory. And that's kind of like an operating system DLL or shared object. But what we need for those previous use cases is that when I compose modules, they still have their own separate memories that aren't shared, and then I need some way of passing complex values between them, and of passing ownership of resources and other things that can't be copied between them. More like an operating system executable. So we can almost use Wasm, but it's just a little too low-level. So the next natural question is: okay, let's just wrap that with some POSIX and get a Wasm executable. But let's speed-run this. People don't just distribute a single POSIX executable, because you don't just want one; you usually have a collection of executables that need to work together, some configuration files, some static assets, and a little directory structure. So you don't just distribute that; you bundle them up into, what? A container. So if we speed-run this, it's just containers, but with Wasm as the instruction set in the middle. Which is fine, that could be useful, but it doesn't unlock these kinds of exciting new use cases.
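Stepping back to that linking point for a second, the gap between shared-memory linking and the separate-memories model those use cases need can be sketched in a toy Python model. This is my own illustration, not real Wasm or the actual component model ABI; the "adapter" function and all names here are made up:

```python
# Toy model of the two linking styles. Illustration only, not real Wasm.

# Shared-memory linking (DLL-style): both modules see one linear memory.
shared_memory = bytearray(16)

def module_a_write(data: bytes) -> int:
    shared_memory[0:len(data)] = data        # module A writes at address 0
    return 0                                 # and passes a raw pointer to B

def module_b_read(ptr: int, length: int) -> bytes:
    # Module B can read (or corrupt) anything in A's memory.
    return bytes(shared_memory[ptr:ptr + length])

ptr = module_a_write(b"secret")
assert module_b_read(ptr, 6) == b"secret"    # no isolation at all

# Component-style linking: each module keeps its own memory, and an
# adapter copies values across the boundary instead of sharing pointers.
memory_a = bytearray(16)
memory_b = bytearray(16)

def adapter_call(src_ptr: int, length: int) -> None:
    value = bytes(memory_a[src_ptr:src_ptr + length])  # lift out of A
    memory_b[0:length] = value                         # lower into B

memory_a[0:5] = b"hello"
adapter_call(0, 5)
assert bytes(memory_b[0:5]) == b"hello"      # B got a copy, not a pointer
```

In the real component model, that copying is specified by the canonical ABI; the sketch only shows the shape of the idea.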
So I think if we want to do that, we need something new, something that wraps Wasm so we can use existing compilers, and that thing we're proposing is called a component. So, finally getting to the title of the talk: what is a component? In one sentence, a component is an emerging-standard, portable, lightweight, finely sandboxed, cross-language, compositional module, which is a super loaded sentence, so let me break that down word by word, going backwards. A component is a module in that it has imports, internal definitions and exports. Imports are things like imported functions, say a log function, and in general imports capture the I/O performed by the component and its implementation dependencies, instead of a fixed set of syscalls or a fixed runtime global namespace. The internal definitions, which are the meat of the component, are the actual code that runs; this is just embedded Wasm modules, one or more, and this is 99% of the bytes, and it can call the imports. But components can also nest other components, so they're recursive by nature. And lastly, exports take internal definitions and imports and make them public to the clients of this component, with names and types. So if I had a graphics plugin, it might export a tick and a render function, and in general exports capture host events and triggers, and how client code generally calls into the component, instead of having a single fixed main function. And what's important is that all interaction with the outside world goes through these imports and exports; there's not some sort of side door where the Wasm just reaches out and touches other Wasm. It all goes through the imports and exports, which means that if we wanna say what the type of this component is, what it looks like from the outside, all we need to describe it is just the imports and exports, their names and types. And this concept is so important that in WIT, which is the IDL we're building as part of the component model, there's a first-class concept
for this called a world, and a world's just a synonym for component type, but it's a little less technical-sounding and more fun to say. So a world defines a contract between guests and hosts, where guest components target a world, meaning they can call the imports and they implement the exports, and a host supports a world by implementing the imports and calling the exports. And we can import and export not just functions but whole interfaces; for example, this file system interface is a collection of things, including a read and a write function, or I could export an HTTP handler interface. And once we can import and export interfaces, then we can give them standardized names, so that when we mean the same thing we can say the same thing and interoperate. And in this context you can frame WASI's job as standardizing common interfaces for use by many worlds, for example in my plugin world here, but a thousand other ones too, not defining a single fixed world that all hosts have to implement and that all code has to target. And if you wanna hear a lot more cool stuff about WASI, check out Dan Gohman's talk later today. So that's what it means for a component to be a module. What does it mean to be a compositional module? This is a nuanced term that I won't attempt to fully define in the abstract; rather, I just wanna show by way of example. Here's a thing you can do with components that I think's rather compositional. I wanna start with a C program written 20 years ago that generates a thumbnail, let's say, and I wanna expose this as a web service through HTTP, and then locally test that.
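Before walking through that example, the guest-targets / host-supports contract around a world can be modeled in a small Python sketch. This is my own toy model, not a real WIT tool; the plugin world here, with a log import and tick/render exports, is purely illustrative:

```python
# Toy model of a WIT world: named imports and exports forming a contract
# between a guest and a host. All names here are illustrative only.
world = {
    "imports": {"log"},
    "exports": {"tick", "render"},
}

# A host supports the world: it implements the imports and calls the exports.
log_lines = []
host_imports = {"log": lambda msg: log_lines.append(msg)}

# A guest targets the world: it may call the imports and must implement
# the exports.
def instantiate_guest(imports):
    def tick(ms):
        imports["log"](f"tick {ms}")
    def render():
        imports["log"]("render")
    return {"tick": tick, "render": render}

# Linking checks both sides against the world's names before wiring them up.
def link(host_imports, guest_factory, world):
    assert set(host_imports) == world["imports"], "host doesn't support world"
    guest = guest_factory(host_imports)
    assert set(guest) == world["exports"], "guest doesn't target world"
    return guest

instance = link(host_imports, instantiate_guest, world)
instance["tick"](16)
instance["render"]()
assert log_lines == ["tick 16", "render"]
```

The real component model adds types to every name and checks them during validation; the toy model only checks names, but the guest/host symmetry is the same.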
So I'm gonna start by using the WASI SDK to compile the C code, and that will generate a component that expects POSIX things, so it imports a file system and exports a main function. But what I want is a component that has no imports and is kind of just a pure computation: it takes an input stream of bytes and gives me back an output stream of bytes, because that's a very reusable component that I can use in a lot of places. So to do this I use a new tool in the Bytecode Alliance called WASI-Virt; Guy Bedford's gonna talk about this later today. What WASI-Virt will do is adapt the exports to take the incoming stream and store it into a virtual file system in linear memory, and adapt the imports to implement the file system operations in terms of streams. And once we've done this, we now have a very reusable component, so we wanna publish it to a registry so other people in other languages can reuse it. To do that we use another new Bytecode Alliance tool called Warg, and Danny McAvay's gonna be talking about that, I believe tomorrow. And so once I publish this component, I can use it from JavaScript, let's say, to implement my HTTP interface. I wanna compile that to a component; to do that I'll use another new Bytecode Alliance tool called componentize-js, which Guy Bedford will also be talking about later today, and which will generate a component that exports an HTTP handler, which is how it receives HTTP requests. To implement this API, it'll call thumbify and then cache the results in a WASI cache. Now I have two components, but to deploy them I need one deployable thing, so to do that I use wasm-compose, which is also a Bytecode Alliance tool that exists today. And that composed parent component now targets a very simple world, this caching-server world that just imports a cache and exports a handler, so I can run it on a whole variety of different cloud, edge and network hardware, because it's very easy to implement and call those interfaces. So that's the
component I can run in production. But now let's say I wanna test this locally on my machine, and for the sake of argument my local machine only supports the command world, which only knows about file systems and sockets, but not HTTP and caches. So to run this deployment component on my local machine, I wanna compose it one more time using wasm-compose, reusing two components in the registry. I'm gonna link in an FS-cache component here to implement the cache in terms of, say, a directory in the file system, like my local testing directory, and then I'm gonna link in an HTTP server here that's gonna speak raw sockets and call my WASI HTTP handler. And this will allow me to test my production component locally, as part of my local development workflow. I think what's really interesting about this use case is that we have two implementations of WASI file system in this one composite: there's the outer one, which is probably talking to the native system, and then we have the inner one that was fully virtualized by WASI-Virt, and it needs to be totally different from the outer one, right, because they're doing totally different things. And this, I think, is the key to virtual platform layering. It also suggests that maybe instead of WASI standing for WebAssembly System Interface, it should stand for WebAssembly Standard Interfaces. Credit for this point to Oscar Spencer, author of the Grain language, because I think it's a great point: we're not necessarily talking to the system through these interfaces; virtualization is a big part of what we're doing here. And yeah, we can retcon that without changing the acronym, which is good. So that's what it means to be a compositional component: you can do stuff like this. So that's how it kind of feels to use components. What does it look like to actually implement one of these things in a normal programming language? Which gets to the cross-language support. So it starts when you want to
target a world, and let's say I want to implement this in JavaScript. For JavaScript I can just import the imports with import, so I can import a log function, and export the exports, so I export a run function. And then the JS bindings will make sure that list of string turns into a JavaScript array, so I can just call join on it, which produces a string, and I can pass that out to the log function, and that'll just work. So I can componentize that with componentize-js, producing a component that targets this world. In Python I can use a Python import to import the log function. Python doesn't have exports, but we can kind of emulate them by saying you define a class whose name matches the world, and then the methods of the class are the exports of the world, so I can define a run function that way. I can componentize that with another new Bytecode Alliance tool that Joel Dice will be talking about tomorrow, called componentize-py. And lastly, I can implement this component in Rust by implementing a generated trait that has all the exports of the world, and then I can call the imports through an external crate that was generated by the bindings generator, and I can compile that to a component with cargo-component, which is another Bytecode Alliance project. And so, to not have all these language toolchains doing the same work, we have a bunch of shared projects, like wit-bindgen and wasm-tools, that try to factor out as much of the shared WIT logic, the parsing logic and the type validation, the type rules, into the shared projects, so that each language toolchain is just focusing on the interesting bits of that language, and it's easier to add new languages.
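As a concrete sketch of the Python pattern just described: componentize-py generates a bindings module for the world, which I'm stubbing out below so the snippet runs on plain Python. The world name, class name, and bindings shape here are assumptions for illustration; the real generated module will differ.

```python
# Sketch of the componentize-py pattern: a class whose name matches the
# world, whose methods are the world's exports, calling an imported log
# function. The `bindings` object is a hand-written stand-in for what
# componentize-py would actually generate.
import types

logged = []
bindings = types.SimpleNamespace(log=logged.append)  # stubbed `log` import

class Run:  # class name matches the (hypothetical) world
    def run(self, args: list[str]) -> None:  # method matches the `run` export
        bindings.log(" ".join(args))          # call through the world's import

# What the host would do after instantiating the component:
Run().run(["hello", "from", "python"])
assert logged == ["hello from python"]
```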
In addition to the string and list types I've shown, we have a whole bunch of other value types: records, tuples, flags, variants, enums, options, results. These are values, so they're passed by copy between the memories, as we'll see next. For things I don't want to copy, either because they're too big or because they're non-copyable, like a socket or a database connection, we have resource types. So for example, my world can import a buffer resource type with a constructor and an append method, and then I can use that type in other function signatures. And to use this from, say, JavaScript, I just import a constructor function, buffer, and then in my JavaScript code I can just call new buffer, and that calls the constructor, and I can call a method, because the bindings generator put it on the prototype chain, and then I can return ownership of the buffer by just returning the object handle. So that's a taste of what it looks like to target this from various languages. But how does it work at the lower level, in the bits and bytes? Which gets into the fine-grained sandboxing of components. Fundamentally, each component has its own separate linear memory and tables. Just to show how this works with an example, let's say my first component exports a secret-store interface, where a secret-store interface defines a secret resource and a get function for looking up secrets, and my second component wants to import that secret store and export a run function. I don't have time to show the full implementation, but the gist of it is: the Rust code will implement a trait for the get function and another trait for the expose method, and then the JavaScript code will get to import the get function and call that, just passing a string, getting back a handle to the secret, and then it can call the expose method. So let's talk through how this kind of call actually works under the hood. A call comes in, from either the host or some client component, to my run function. The JS
engine puts that string in linear memory, and we get a pointer to it, address 4. Now I pass that pointer to the key to the get function, but that's a pointer into JavaScript's linear memory, and Rust does not have access to that linear memory. So the component model specifies an adapter that sits in between the two components, which will copy the key string into the Rust memory, producing a pointer into the Rust memory, which is what's actually passed into the Rust code. Now in the Rust code we can call the underlying API that gets the secret from the database, which we get into linear memory; let's say the secret is xyz, at address 1. Now, instead of just returning this secret, we wanna put it into a resource, so we're gonna take that address and stick it into this resource, which is opaque to clients, and now we have a handle to this resource in our table. So I'm gonna return this handle to the secret resource that I've just created by saying: return the handle at index 0 in my table. And the adapter will then move that handle into the JavaScript table. So now JavaScript owns the handle to the secret; it can't get inside and see the secret, but it knows it can call the expose method, by passing index 0 into the table to the expose function, which will then call the expose method on this secret. So hopefully this example gives the impression that this is a lot like inter-process communication, where we're copying values between memories and we're passing handles to things, kind of like file descriptors, but we're using fast, local, synchronous function calls, so it can be a lot lighter weight. And speaking of lightweight, components inherit the same lightweight execution model as Wasm today. Today, if you have a Wasm module and you wanna run it on an execution platform, there are generally two phases that you can do stuff in: the ahead-of-time phase and the runtime phase. From a server context, you would say the ahead-of-
time phase is what happens in the control plane, when I deploy a component, and the runtime phase is what happens in the data plane, when a request comes in. So ahead of time, when I deploy a Wasm module, I compile that to machine code, and you can blast that out across the fleet, and then when a request comes in, the Wasm engine can very cheaply spawn up a new instance with a fresh memory, reusing that shared code. And this is how we achieve the sub-process sandboxing and microsecond instantiation we have today. When I deploy a component, there are multiple Wasm modules inside that thing, so I do one up-front step of fusing them all together into one Wasm module, using the multi-memory feature of Wasm. Once I've done that, though, the whole rest of the pipeline is effectively the same: we're just compiling the Wasm to machine code and distributing it over the fleet. The only difference is that, because we're using multi-memory, we can have multiple memories in each one of those instances. But importantly, there's no big additional component runtime with a whole bunch of component services and other stuff you might have seen from component models of the past; it's mostly just Wasm running as usual. But we do have a new option: if a component is reusing shared modules in a registry, the execution platform can maintain its own kind of shadow registry of compiled versions of each of those shared modules, so that when I deploy a component that shares modules that are already in the cache, I can have a sort of linker script that points into the cache, reusing those shared modules, and says how they link together. At execution time this is effectively giving me DLL-like code sharing between components, where I'm sharing the code, which is immutable, but not the state. So that's how components are lightweight. Lastly, components are portable, by layering on top of the WebAssembly core. The WebAssembly core has a formal specification, a reference interpreter and a test suite, and with the
component model, we're working on the same things. WebAssembly core is of course in browsers today, and has been since 2017, and is exposed to the rest of the browser through the JavaScript API. We're proposing to extend that, so components will be exposed through roughly the same API in browsers. But of course this isn't implemented in browsers today, and it'll probably take quite some time to do so. So today what we have is a tool, also one Guy Bedford will be talking about, called jco transpile, that can turn a component into a core module plus JavaScript glue code that does the same thing as the component would do, and people are actually using this today to run components in browsers. WASI is then layered on top of the component model with its different proposals and interfaces, and where possible these can also be polyfilled in browsers today in terms of web APIs and JavaScript polyfills. So in a nutshell, that's what a component is. And I know we've been talking about this for a little while, but it's finally becoming real: we're planning a developer preview release this year, called Preview 2. It covers both the component model and a subset of WASI interfaces. The top-line goals are stability and backwards compatibility; in particular, we have an automatic conversion tool to convert Preview 1 core modules into Preview 2 components, and we're committing to having, in the future, a similar tool to convert Preview 2 components into whatever comes next. The features of Preview 2 include: a first wave of languages, in particular Rust, JS, Python, Go and C; a first wave of WASI proposals, namely file systems, sockets, CLI, HTTP and possibly others; a browser and Node polyfill in the form of jco transpile; preliminary support for WASI virtualization in the form of WASI-Virt; preliminary support for component composition in the form of wasm-compose; and experimental component registry tooling in the form of Warg. So if this is interesting and you wanna find out
more, there's a lot of great detail in the Bytecode Alliance roadmap blog post, written by Bailey Hayes on the Bytecode Alliance blog. So that's what we'll be doing this year. Next year, it's all about improving the concurrency story, because Preview 2 does the best it can, but concurrency is admittedly warty: async interfaces are gonna be too complex for direct use and need manual glue code, and our general goal is to be able to use the automatic bindings directly, without manual glue code. Streaming performance isn't as good as it could be, and concurrency is not currently composable, which is to say two components doing concurrent stuff will end up blocking each other in some cases, and if you virtualize one async interface, you end up having to virtualize them all. So Preview 3 aims to fix this by adding native future and stream types to WIT and components, which allows us to build ergonomic, integrated, automatic bindings for many languages, an efficient, io_uring-friendly ABI, and composable concurrency. For example, in Preview 2 we need two interfaces for HTTP, one for outgoing requests and one for incoming ones, and they have different types and different signatures. In the transition to Preview 3, we'll be able to merge these and just have one handler interface, which is kind of the obvious one. I've simplified it a little here by taking out the result and other sorts of types, but it's basically that. And what that will allow us to do is have a single component that both imports and exports the same interface: I import a handler so I can make outgoing requests, and I export a handler to receive incoming requests. And because they're the same interface, I can take two services that I wanna chain together and just link them directly together using component linking, and now executing the whole compound request is just an async function call, which can support our modularity-without-microservices use case. So yeah, that should be pretty cool, so please stay
tuned for Preview 3. So in conclusion: a component is an emerging-standard, portable, lightweight, finely sandboxed, cross-language, compositional module. We want to use components to grow a new ecosystem, providing SDKs for free, secure polyglot packages, modularity without microservices, and virtual platform layering. There's a stable Preview 2 developer release coming soon, and if this is exciting to you, please get involved. There's a Bytecode Alliance Zulip chat; there's a meetings repo in the Bytecode Alliance org that lists all the meeting minutes and future meetings that you can join, even if you're not a Bytecode Alliance member; everyone's welcome. And there's the Componentize the World event this Friday, which you should definitely go to, and talk to the people who are building all these tools, and start building stuff yourself. So thank you very much.