My name is Jacek. I work at Status, where I do research. The dominant project in the research team right now, one of the bigger ones we're working on, is Ethereum 2, and I have a couple of other ones going on: privacy, messaging, things like that. There's one thing you want to remember about me: it's that I changed Ethereum 2 to little-endian. Made many people happy, actually. I'll go into the reasons a little bit later. As I mentioned earlier, if you want to follow along, the code we'll be looking at is online, and the notes are live as well, it's literally on the site.

There are a couple of people I wanted to mention, to explain where this project is coming from. We were sitting around, thinking about whether we should start this new fun project in Nim. As every programmer knows, the very first thing you should do when starting a project is to sit down and optimize the hell out of it. Those of you that have a manager call this a feasibility study. Anyway, that's what we did over a weekend, basically, and that's the basis of what we'll be talking about today. Yuri, a good friend of mine on the Nim team, started this idea of doing contracts in this fairly new language called Nim, which we were experimenting with at Status. It's a nice little language. Nim has very good templates and metaprogramming tricks; it offers a very nice environment for the programmer, and at compile time it translates a lot of things into very efficient constructs for the computer to execute. The other person I wanted to mention is Paul D on the Ewasm team. We posted this thing on Twitter and started communicating back and forth, sparring a bit: hey, I can compile it down to this; oh, I can compile it down to that. Together we arrived at where we'll be going today with this thing. Finally, we have Jacques, who took up the project. It's a nice little interesting project: a smart contract environment we're experimenting with, to see what it would look like to program smart contracts this way. It's called nimplay for now; something you should totally check out online. There's also a presentation going on at the workshop with the Embark team, also from Status; they're presenting a little bit about how you can deploy your contracts on Ewasm. So that's the cool stuff.

So we're talking about Ewasm here. Ewasm was developed as an alternative for Ethereum 1 execution, and maybe Ethereum 2 execution; it goes a little bit back and forth whether it will be used or not. For the purpose of this presentation we'll be using a pretty old version of Ewasm, the one targeting Ethereum 1. It defines how the VM should work, what contracts should be able to do, how to access the Ethereum 1 state, a couple of system contracts, metering. It was thought of as a gateway to Ethereum 1, so a lot of the stuff you'll see here, if you're familiar with Ethereum 1, you'll probably recognize: the names of the functions, how it looks at Ethereum state, and so on. It doesn't map directly onto Ethereum 2. Ethereum 2 is right now at the stage of exploring stateless models, and maybe something like this will be available on Ethereum 2 in the form of an execution environment, or maybe not; that's still being decided. But it doesn't really matter. One of the reasons Ewasm is interesting is that it's this generic VM.
It's being explored for a lot of different blockchains, so whichever blockchain you happen to be using that targets Wasm, the stuff that you're optimizing will be more or less the same, right? And why would you even go here? Well, on Ethereum, on any blockchain, storage is at a premium. If you deploy a large contract, that's bad for everybody: it's bad for your own bottom line in terms of cost, it's bad for your users because they'll be paying lots of gas for execution, and it's obviously bad for the ecosystem as well, because you're bloating the state.

How does Wasm work? First of all, you have an environment in which your code is executing; it runs in a sandbox. You have your contract code on one side, tightly sandboxed, and the environment provides functionality from the outside world, whatever that might be. If you're running Wasm in the browser, for example, there's going to be access to the JavaScript world and whatever the JavaScript code instantiating your Wasm module has given you to play with; there's going to be the DOM, there might be a UI canvas, stuff like this. If you're doing slightly more modern stuff, there's this standard interface called WASI, which gives you access to files; it's basically the kind of interface you would otherwise get from the kernel or the C library. Ewasm in particular defines these functions, which for the most part come straight from the EVM: storage is sstore, sload, and so on. It has a function for everything in the EVM that isn't present in Wasm natively, so to speak, and you get access to them from inside your sandbox; they're your conduit to the outside world.

The contract we'll be optimizing today is this really small and simple WRC20. It was a challenge that the Ewasm team posted a year ago: a little token contract, a transfer contract. You can see that you have a balance function and you have a transfer function. This is basically an example of the nimplay contract environment that we're talking about, the pre-alpha, pre-pre-pre version; we're experimenting with it right now. If you look at Vyper, it's a little bit similar. So you have functions: you take an address and return a uint, like the balance. You can access tables, you can do comparisons and checks. At face value it's a simple contract; there's not much there.
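To make the semantics concrete, here is a minimal, self-contained sketch of what the balance and transfer functions do, in plain Nim with an in-memory table standing in for contract storage. The names (balanceOf, transfer) and the string-keyed table are just for illustration; the real contract reads and writes Ethereum state through the Ewasm interface instead.

    import std/tables

    var balances = initTable[string, uint64]()   # account -> token balance

    proc balanceOf(who: string): uint64 =
      ## Return the balance of `who`, zero if unknown.
      balances.getOrDefault(who)

    proc transfer(sender, to: string, value: uint64) =
      ## Move `value` tokens from `sender` to `to`; fail on insufficient funds.
      doAssert balances.getOrDefault(sender) >= value, "insufficient balance"
      balances[sender] = balances.getOrDefault(sender) - value
      balances[to] = balances.getOrDefault(to) + value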
But when we give it to the compiler, it goes through this pipeline of multiple stages. We start with your code, in this case the WRC20. The first thing that happens is that it goes to the Nim compiler, and the Nim compiler expands it into a desugared version, you could say. I was talking about macros earlier: basically, at compile time there's a bunch of code executed that expands it into slightly more primitive constructs in the language, and you access this library that we're talking about, calls like callDataCopy and so on. Then it goes through the compiler's AST; that's just a standard compile pipeline, right? The language does a little bit of processing, tries to find cases for optimization, and passes it to LLVM. LLVM, in turn, has its own intermediate representation and does its own optimizations, and then it compiles it to Wasm.

Now, Wasm is this instruction set that is executed by the Wasm environment later on, and there's one more translation into the native instruction set of the computer you're running it on; that might be a phone, that might be a browser on your laptop, whatever. At every single stage there's an opportunity to save either time or space, whatever you're optimizing for. Today we'll be talking mostly about space, just to limit the scope of this presentation.

Can you use Ewasm and Wasm more or less interchangeably here? I mean, this WRC20 is going to be a Wasm contract, right? Yeah, so technically you could execute this code in a browser, or you could write it by hand, starting from here. In the end, Ewasm basically defines a little bit of semantics on top of Wasm.

So, just to give you an idea of what this looks like: the first stage of the pipeline takes your code and converts it to these low-level calls. Why would you use the higher-level thing? Obviously it's easier to code in, but the experience of us as the maintainers of this contract environment is that we take all the optimizations we figure out and try to encode them into this translation, so that you don't have to. That's the premise of using a high-level language instead of writing straight Wasm. You can see here the transfer function: previously it was just one line, and now it's expanded into: all right, we're going to check if we have enough call data, we're going to get the caller, we're going to get the recipient, and then we're going to copy the result back to the person calling the code.
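To give a feel for the shape that expansion takes, here is a rough sketch in plain Nim of a transfer entry point written against those low-level calls. The Ewasm interface does define functions named getCallDataSize, callDataCopy, getCaller, revert and finish, but they are stubbed out here so the sketch compiles on its own, and the actual expansion nimplay generates will differ in its details.

    var callData: seq[byte]            # stand-in for the transaction's call data

    proc getCallDataSize(): int = callData.len
    proc callDataCopy(dst: var openArray[byte], dataOffset, length: int) =
      for i in 0 ..< length:
        dst[i] = callData[dataOffset + i]
    proc getCaller(dst: var array[20, byte]) = discard   # environment fills this in
    proc revert() = raise newException(CatchableError, "revert")
    proc finish() = discard

    proc transferExpanded() =
      # 1. Did the caller send enough call data (4-byte selector + address + amount)?
      if getCallDataSize() < 4 + 20 + 8:
        revert()
      # 2. Who is calling? A 20-byte address comes from the environment.
      var sender: array[20, byte]
      getCaller(sender)
      # 3. Copy the recipient address and the 8-byte amount out of the call data.
      var recipient: array[20, byte]
      var amountBytes: array[8, byte]
      callDataCopy(recipient, 4, 20)
      callDataCopy(amountBytes, 24, 8)
      # ... balance updates against storage would go here ...
      # 4. Report success back to the caller.
      finish()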
The Nim compiler then compiles this to LLVM IR. LLVM IR is this kind of generic intermediate representation for lots of languages; the reason LLVM does it this way is that it can then optimize generically at this medium level of abstraction. And it's looking even worse here; it's starting to get really hairy, right? You're allocating a bit of memory, you're offsetting pointers into structures, you're doing tail calls, stuff like this; you might recognize it from your computer science classes. Otherwise, at this point you need a fairly specific mental model to understand what's going on.

Finally, we have the stage where LLVM takes this and lowers it to Wasm. LLVM is register-based, Wasm is stack-based; they're two ways of looking at the same problem. This mismatch is one of the reasons why some people think it's better not to go through LLVM at all and to build dedicated Wasm compilers instead, because obviously, in every translation step you lose a little bit of efficiency. That's true when you're going from high-level constructs to medium-level ones too; every time you take a step like this, you can lose a tiny bit. But we'll be exploiting some of the things that we know are losses and try to win them back by being smart about what we're doing. Our starting point is roughly four and a half kilobytes.

If you take that smart contract code, you end up at around four kilobytes, and there's this tool called twiggy which shows you where the size is spent. Immensely useful: you can see that just the fact that we're keeping these long function names, like do_transfer, takes up a lot of space in your contract. So the first thing you want to do is make sure that you're calling your tools correctly, that you use the tools that strip away all this stuff and do some basic optimizations. Just by using the compiler and calling it with optimizations on, we can bring it down to roughly a third. Obviously, when you're looking at the code, you can't really tell what's going on anymore, but the compiler, or really the execution environment, doesn't care about your long function names; it just needs to know where the code is.

The second thing is that when we started coding this, we thought: all right, let's do 128-bit integers. And this is an instance of really not understanding your problem, because the challenge was for 64-bit integers. We wrote it with 128-bit integers because that's what we often use in Ethereum for storing wei values. But Wasm in particular has issues with this: there's no add-with-carry operation, for example. So when you want to add two 128-bit numbers, you have to do it piece by piece, then you have to do a little branch to see if you overflowed, and then you need to add that carry to your number, and it becomes a fairly big chunk of raw code. So we thought, all right, fine, let's skip the 128-bit stuff, let's move to 64-bit values instead. There we go: roughly 100 bytes of savings. I thought it would be more. I really thought it would be more, because add-with-carries and so on tend to take up a lot of space when you do them with limbs. But it turns out we had run into the optimizer.

What does the optimizer do? Well, the optimizer looks at your code and tries to make it better in various ways. One of the functions that we have is a big-endian conversion, which, if you don't have access to an instruction that does it for you, you implement by fiddling with the bits using the basic operations: lots of ands and shifts and so on. But LLVM is so smart that it recognizes this pattern and turns it into a bswap intrinsic, which swaps the bytes. That's one more thing you don't have available in Wasm. And when we take the step from LLVM IR to Wasm, LLVM doesn't really account for that, so it thinks this bswap call is very cheap and inlines it. On x86 that makes perfect sense: it's a single instruction, and you really don't want to make a function call to get access to one instruction. On Wasm, not so. All right, let's turn off inlining for this particular function, and suddenly, look, we save 150 bytes or so, just by giving the optimizer this hint; we know something the optimizer doesn't, so we turn off inlining for this one particular function.
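For reference, this is roughly what that big-endian conversion looks like when you write it out with shifts and masks; it's exactly the pattern LLVM pattern-matches into a single bswap intrinsic. A minimal sketch, not the library's actual implementation:

    proc toBigEndian64(x: uint64): uint64 =
      ## Reverse the byte order of a 64-bit value using only ands, ors and shifts.
      ((x and 0x00000000000000FF'u64) shl 56) or
      ((x and 0x000000000000FF00'u64) shl 40) or
      ((x and 0x0000000000FF0000'u64) shl 24) or
      ((x and 0x00000000FF000000'u64) shl 8) or
      ((x and 0x000000FF00000000'u64) shr 8) or
      ((x and 0x0000FF0000000000'u64) shr 24) or
      ((x and 0x00FF000000000000'u64) shr 40) or
      ((x and 0xFF00000000000000'u64) shr 56)

    when isMainModule:
      doAssert toBigEndian64(0x0102030405060708'u64) == 0x0807060504030201'u64

On x86 this collapses to one instruction, so inlining it everywhere is free; on Wasm every inlined copy is this whole sequence again, which is why suppressing inlining for it paid off.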
Moving on, we have the language itself. Nim is kind of a safe language: when you declare things, it tends to initialize them to zero, all values. The thing is, we hadn't really told Nim that this callDataCopy function would overwrite the data anyway. So what Nim does in this instance is first write 32 zeros to the array we're using, and then we overwrite them with the data we're copying in. To find this kind of thing, you go back and look at the assembly, the Wasm we end up with, and see what's going on, and you notice: look, we're writing lots of zeros and then we're overwriting those zeros with data. So what do we do? We turn off the zero initialization, write the 20 bytes of the address, and then zero out the rest ourselves. Simple optimization, right? 200 bytes.

You have to know your library. When you're using a library, you have to really understand what the different functions do. In this case, naively, we thought we'd be safe and wrote some extra checking code: we checked the call data and asked, do we really have 24 bytes of call data or not? Did they really pass in enough data for us to run this function the way they wanted us to? All right, check and revert. It turns out that callDataCopy, if you just call it and say "I want these 24 bytes", will revert on its own if there aren't 24 bytes available. It does that work for you, which means you can simply remove the check and the program will function exactly the same.

The optimizer works by looking at your code and trying to figure out the best way to schedule it, but it doesn't really understand all of your functions. But what do we know? Well, we know that if we're doing the transfer, we're removing the value from the sender and giving it to the receiver. In the original code, we first load all the data, do the transfer, and then save all the data. Instead, we can recognize that we don't need all this data available at the same time: we can load the sender's value, subtract, and save it, and then load the receiver's value, add, and save it. This means we don't have to use memory for both these operations at once.

So, what's left here? Well... oh, no, I skipped one. What's left here? When we've done all these optimizations, we have your contract code, we have this big-endian conversion, and then we have a bit of cruft that we can't really get rid of; it's the way we tell our contract about the environment. If you listened to Paul's presentation, you would know that you can also compress your contract, using zlib or something, to save a little bit more. But the quick summary here is: know every step of your pipeline. You can use that knowledge across the board. Yeah, alright. Alright, here we go.
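As a closing illustration of that load/store reordering in the transfer, here is the idea in plain Nim, with loadBalance and saveBalance as made-up stand-ins for the storage load and store calls; the point is only the ordering, since the real code talks to Ethereum storage, not a table.

    import std/tables

    var storage = initTable[string, uint64]()

    proc loadBalance(who: string): uint64 = storage.getOrDefault(who)
    proc saveBalance(who: string, v: uint64) = storage[who] = v

    proc transferBefore(sender, to: string, value: uint64) =
      # Original shape: both balances are live in memory at the same time.
      var senderBal = loadBalance(sender)
      var toBal = loadBalance(to)
      senderBal -= value
      toBal += value
      saveBalance(sender, senderBal)
      saveBalance(to, toBal)

    proc transferAfter(sender, to: string, value: uint64) =
      # Reordered: finish with the sender's balance before touching the
      # receiver's, so only one value needs to be held at any time.
      saveBalance(sender, loadBalance(sender) - value)
      saveBalance(to, loadBalance(to) + value)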