I'm extremely honored to be speaking at Cloud Native WebAssembly Day today. I'm here to talk about the textual and binary formats of WebAssembly. I know you'll think of this, as I definitely did, as a theoretical and conceptual topic rather than something to be presented at a conference, but I hope I can change that perception for you all.
Before I go ahead, I ought to introduce myself, because this is my very first session at my very first in-person conference in nearly two and a half years. I'm Divya Mohan, as mentioned, and I work as a technical writer with Rancher Labs, which has now been acquired by SUSE. Over and above the documentation work that I do at my day job, I'm also one of the documentation maintainers for the Kubernetes and LitmusChaos projects. I also do a bunch of community work as part of my roles as a CNCF ambassador and an AWS Community Builder. And I'm extremely excited about helping future generations of technologists take their very first steps in open source as well as the cloud native ecosystem. That's a bit about me, and I won't go any deeper into it.
Coming to today's topic: who is it really for? Like I mentioned before, this is an extremely theoretical, conceptual topic, pretty boring if I can put it that way. So why would anyone speak about this at a conference? There are two parts to this. First, who is this for exactly? If you're a developer who's just trying to load WebAssembly modules into your code, it's probably overkill, and I'm not trying to be exclusionary here. But if you're interested in the behind-the-scenes work that goes on when you actually compile your program to WebAssembly, this could be of interest to you. Or if you're an enthusiast or a hobbyist like me, who really likes learning all there is, this might be of interest to you.
And of course, people who are veterans in the space are already aware of this, but if you are looking to write WebAssembly compilers of your own, or to optimize compiler performance, this session might be of interest to you.
Second, why am I talking about this right now? That's another question that might get thrown my way, because this has been around for five years. Even if you go onto the WebAssembly website right now, version two of the draft is there. So why now? WebAssembly as an ecosystem is just getting started, and a lot of newcomers are stumbling upon the specification. Unfortunately, when I did, back in November 2021, a lot of it wasn't clear because of my lack of programming knowledge. The WebAssembly specification was really difficult for me to navigate as a person who came from a non-coding background. I knew only the bare bones of programming and did not really understand a lot of it. So I want to make this accessible. This is purely a selfish motivation, I understand that. But in the spirit of making this ecosystem a more welcoming and accessible space, I want more content out there that's equally accessible to people at any career level or any level of knowledge.
That being said, how do I plan to go about this session? Like with everything in this universe, whether you're learning a programming language, cooking, or a musical instrument, you first start off with the building blocks and then you add structure. If you take a language, for example, you typically start off with characters or letters, and then you build up to words, then sentences, then essays. But there's a slight difference here. The difference is that when we speak about a WebAssembly program, we know the end result.
We know that it is going to be a program, albeit in a different format. It's not going to be something that is readily readable to us, because not all of us are assembly-level programming enthusiasts, but we definitely do know that it resembles a program in one format. So what I aim to do is take a WebAssembly program, deconstruct its structure into its various building blocks, and walk you through it as though we were talking about a regular program in a language that we normally write programs in.
This is a program that I did not write. Let's be really clear about that. I took this as an example from wasm2wordgithub.io. I took it because it has all the elements, and what I aim to do is deconstruct it into its parts, which are labeled on the slide that you see behind me. I know that a lot of y'all already know the various components of a WebAssembly program, namely the module, what a module looks like, and what its subcomponents are. So we're not going to delve deep into the actual components of a WebAssembly module. What we're going to do is look at the program from the perspective of how we could write it and what its grammatical syntax would be.
Now, I said that I would not look at the module, but I do need to go over it a little so that we have our concepts clear. A module is the fundamental unit of a WebAssembly program. All of us know that, so I'm not going to repeat it. And in the binary context, if you look at the far right side of the screen, you will see exactly what it translates to. Now, the module on this slide is empty, so it does not do anything. It's not going to take any inputs, and it's not going to produce any outputs. It's basically just there.
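To make the empty module concrete, here is a minimal Python sketch of its two forms. It assumes only what the WebAssembly binary format defines for the start of every module: a 4-byte magic value (`\0asm`) followed by a 4-byte little-endian version number, currently 1. The names `EMPTY_MODULE_WAT`, `EMPTY_MODULE_WASM`, and `hexdump` are made up for this illustration and are not from the slides.

```python
# Text format: the smallest module is just an empty S-expression.
EMPTY_MODULE_WAT = "(module)"

# Binary format: magic bytes "\0asm" followed by the version as a
# 4-byte little-endian integer. For an empty module, nothing else follows.
MAGIC = b"\x00asm"
VERSION = (1).to_bytes(4, "little")
EMPTY_MODULE_WASM = MAGIC + VERSION


def hexdump(data: bytes) -> str:
    """Render bytes as space-separated two-digit hex values."""
    return " ".join(f"{b:02x}" for b in data)


print(EMPTY_MODULE_WAT)
print(hexdump(EMPTY_MODULE_WASM))  # 00 61 73 6d 01 00 00 00
```

Those eight bytes are the whole binary: an empty module contributes nothing beyond the magic value and the version.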
But this is a perfectly valid program. If you could liken it to a normal program, it would be the hello world, but I don't want to say that out loud, because then it would sound far too simple for anyone to write. So this is an empty module, and that's a perfectly valid first WebAssembly program that you could write. When we speak about modules in real life, that's not what they're going to look like, because we do have to have some sort of input and some sort of output. Otherwise there's no point in writing software in the first place.
So what does a WebAssembly module essentially look like? Here is a very poorly drawn WebAssembly module structure that I managed. I'm no Picasso, so sorry about that. The WebAssembly module is composed of these various sections that are in green and red. We'll come to that in a bit. And most importantly, it's composed of that preamble. Now you may ask, what is a preamble? For that, I'll have to go back to the previous slide for a bit. In this slide, if you remember, this first set of numbers and letters that you can see is essentially the preamble for the WebAssembly module. It tells the machine that you are a WebAssembly program and that you are at version one.
Coming back to this slide, where we have the module anatomy: why is there a difference between the various sections, and what exactly differentiates them? Apart from the coloring, the fundamental differentiator is that the known sections, which are in green, are supposed to be arranged in that exact order. And honestly speaking, we should all be thankful that we don't have to write this by hand. We have compilers to do it, because remembering yet another sequence of sections would probably have been hard. This sequence has to be maintained across all WebAssembly modules.
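The ordering rule for known sections can be sketched as a small check. This is an illustrative Python sketch, not how any particular runtime validates modules. It assumes the section IDs and ordering from the WebAssembly binary format (custom sections have ID 0 and may appear anywhere; the data count section, ID 12, sits between the element and code sections). The helper names `read_u32_leb128`, `section_ids`, and `known_sections_in_order` are hypothetical.

```python
PREAMBLE = b"\x00asm" + (1).to_bytes(4, "little")

# Required order of known (non-custom) section IDs in the binary format:
# type, import, function, table, memory, global, export, start,
# element, data count, code, data.
SECTION_ORDER = [1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 10, 11]


def read_u32_leb128(data: bytes, pos: int):
    """Decode an unsigned LEB128 integer; return (value, next position)."""
    result, shift = 0, 0
    while True:
        byte = data[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:
            return result, pos
        shift += 7


def section_ids(module: bytes):
    """List section IDs in the order they appear after the preamble."""
    if module[:8] != PREAMBLE:
        raise ValueError("not a version 1 WebAssembly module")
    pos, ids = 8, []
    while pos < len(module):
        ids.append(module[pos])
        size, pos = read_u32_leb128(module, pos + 1)
        pos += size  # skip the section body
    return ids


def known_sections_in_order(ids) -> bool:
    """True if non-custom sections appear at most once and in spec order."""
    ranks = [SECTION_ORDER.index(i) for i in ids if i != 0]
    return all(a < b for a, b in zip(ranks, ranks[1:]))


# Hand-built sections, each holding an empty vector (count = 0).
type_sec = b"\x01\x01\x00"
memory_sec = b"\x05\x01\x00"
custom_sec = b"\x00\x02\x01x"  # custom section named "x" with no payload

good = PREAMBLE + type_sec + custom_sec + memory_sec
bad = PREAMBLE + memory_sec + type_sec

print(known_sections_in_order(section_ids(good)))  # True
print(known_sections_in_order(section_ids(bad)))   # False
```

The custom section in the middle of `good` does not affect the result, which mirrors the idea that only the known sections are bound to the fixed sequence.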
And if it is not maintained, the validation fails and your program is not allowed to go any further. As for the red boxes, why are they known as custom sections? Because they are customized in the sense that they don't follow the same set of rules that the known sections in green do, and they are lazily loaded, which means they are not subject to the same validation rules that apply to the known sections.
But I digress. I was here to speak about a WebAssembly program and deconstruct it for y'all. So far we've looked at what a module looks like in general, without delving too deep into its different sections. Now, in a typical program, what can y'all think of as things that you would incorporate? One is the input, or maybe some variables or constants that you'd like to pass. Then you obviously have the output. Then you have the instructions that are required to act upon those inputs and outputs. You might need libraries to run the whole program. And obviously the program needs some memory, because nothing runs without memory these days, so I think that's a no-brainer. And last but not least, you will need data, maybe not for initialization or any other particular purpose, but just general data to test your program. That's exactly how we are going to look at this WebAssembly program as well.
The first section that we're going to look at is the operand section. Now, this is not the language the specification uses; I've tried to simplify it as far as possible. The operands are basically the supported types within WebAssembly. What I want people to walk away with after this section is: what kinds of types are supported in WebAssembly?
How do we pass them? And where do they need to be defined? Coming to what's actually written in the specification, there are a lot of types defined there, and they're all listed in this table here. But I believe, and I think it's fundamental across languages, that there is a set of types supported at the base level, upon which other types are built. In the case of WebAssembly, there are only four such types. Those are the number types: integers of 32 and 64 bits, and floating point of 32 and 64 bits. The vector type and all the subsequent types build upon that. And fun fact: for each type, there is a corresponding instruction set, which is waiting in the subsequent slides.
So where exactly do we define these types? As we saw in the program, which I'll just bring up here, and in the anatomy slide, it isn't obvious where this would be placed. The type section in the anatomy of the module is where we would place the function signatures. The reason is that we typically pass values into a function and return them from a function. We wouldn't define them elsewhere. So that's how we define the grammar and syntax that's outlined here, and this corresponds to the type section that we saw in the module as well.
Now, let's speak about prerequisites. When I started off programming, albeit unwillingly, during my graduation, I was doing electronics engineering. I was forced to write the same #include <stdio.h> a million times, and now it's like a bad dream, because that's the only line I remember from every C++ program that I've written. So what is the Wasm equivalent of that?
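Going back to the type section for a moment, here is a rough Python sketch of how one function signature could be encoded in binary. It assumes the byte values the binary format assigns to the four number types (`i32` = 0x7F, `i64` = 0x7E, `f32` = 0x7D, `f64` = 0x7C), the 0x60 marker for a function type, and section ID 1 for the type section. The helper names and the example signature `(func (param i32 i32) (result i32))` are chosen for illustration and are not taken from the slides.

```python
# Binary encodings of the four number types.
VALUE_TYPES = {"i32": 0x7F, "i64": 0x7E, "f32": 0x7D, "f64": 0x7C}


def encode_u32(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_vec(items) -> bytes:
    """A vector is its element count followed by the encoded elements."""
    return encode_u32(len(items)) + b"".join(items)


def encode_func_type(params, results) -> bytes:
    """Function type: 0x60, then parameter types, then result types."""
    p = [bytes([VALUE_TYPES[t]]) for t in params]
    r = [bytes([VALUE_TYPES[t]]) for t in results]
    return b"\x60" + encode_vec(p) + encode_vec(r)


def encode_type_section(func_types) -> bytes:
    """Type section: ID 1, body size in bytes, then the vector of types."""
    body = encode_vec(func_types)
    return b"\x01" + encode_u32(len(body)) + body


# Text form of the signature: (func (param i32 i32) (result i32))
add_sig = encode_func_type(["i32", "i32"], ["i32"])
section = encode_type_section([add_sig])
print(section.hex(" "))  # 01 07 01 60 02 7f 7f 01 7f
```

Reading the output left to right: section ID 1, a 7-byte body, one function type, the 0x60 marker, two `i32` parameters, and one `i32` result.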
I don't mean to say that we have to write exactly the same thing. But what would we expect to happen if we were to write export and import statements in WebAssembly? For that, there is the next section, the import section. The syntax is up on the screen as well. And the best part is that you don't have to remember #include <stdio.h>, which gave me nightmares for longer than I can remember.
Coming to instructions: they are required so that the machine knows what to do with the values you pass to it. There are as many kinds of instructions as there are types, or probably more. You have numeric, vector, and the others that you can read here, so I'm not going to go over them one by one. But in this particular program, we have a drop instruction. You must be wondering what category of instruction that falls into: it's a parametric instruction. That means it is not tied to one type, so we can use it on vector values as well as on numeric values.
That being said, all of this needs to work somewhere in memory, because you need to store values and you need to retrieve values. We all know there is something called linear memory in WebAssembly. So can we actually change the memory being allocated to us, given that WebAssembly follows a binary instruction format? And what are the different kinds of memory? As we all know, there is the linear memory, which I won't go deep into because it's allocated with every module. In fact, if I switch back to the anatomy slide, you see that it is allocated within the module itself. There is a section dedicated to memory. And within the actual program itself, if you look, memory is addressed via offsets.
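The idea of addressing linear memory by offsets can be modeled in a few lines of Python. This is a simplified sketch, not how a real runtime implements memory. It assumes only two facts from the specification: linear memory is a contiguous byte array sized in 64 KiB pages, and values are stored in little-endian byte order. The function names `new_memory`, `i32_store`, and `i32_load` are invented here to echo the corresponding Wasm instructions.

```python
import struct

PAGE_SIZE = 64 * 1024  # one WebAssembly page is 65,536 bytes


def new_memory(pages: int) -> bytearray:
    """Model a linear memory of the given number of zero-filled pages."""
    return bytearray(pages * PAGE_SIZE)


def i32_store(mem: bytearray, offset: int, value: int) -> None:
    """Write a 32-bit signed integer at a byte offset, little-endian."""
    if offset < 0 or offset + 4 > len(mem):
        raise IndexError("out of bounds memory access")
    struct.pack_into("<i", mem, offset, value)


def i32_load(mem: bytearray, offset: int) -> int:
    """Read a 32-bit signed integer from a byte offset, little-endian."""
    if offset < 0 or offset + 4 > len(mem):
        raise IndexError("out of bounds memory access")
    return struct.unpack_from("<i", mem, offset)[0]


memory = new_memory(1)         # roughly what (memory 1) declares
i32_store(memory, 8, 42)       # store the value 42 at offset 8
print(i32_load(memory, 8))     # 42
print(memory[8:12].hex(" "))   # 2a 00 00 00
```

The bounds check stands in for the trap a runtime raises when an access falls outside the allocated pages.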
Over and above this, we also have the intermediate stack, onto which values can be pushed and from which they can be popped. So that's a bit about the memory aspect of it.
And last, but not least, we have data. Now, data doesn't necessarily need to relate to initialization or to a requirement of any particular function. You might just need to pass data for the processing to actually happen. So how does that get declared? This particular section defines that. The grammatical syntax here is also pretty straightforward, so I don't think I need to go very deep into it.
I think that was all from my side. Summing it all up, we are at a pretty nascent stage here. As the specification evolves, we will have more flourishes coming into the textual and binary formats. And it's upon us to keep it as relevant and as fresh for newcomers as possible, because a lot of the content out there is jargon-heavy, if I might add, and newcomers might not be able to understand it if you approach it from the standpoint of just listing things down. So I hope this was helpful. And I'd like to wrap up the presentation for today. Thank you.