Up next we have Paul and Guillaume, who are going to present on WASM engines. Take it away.

I'm going to do this one alone — I'm going to do the WASM engines talk alone. Okay, hello. My name is Paul, and I'm going to talk about WebAssembly engines for Ewasm. Here they are: one, two, three, four, five, six. The first three or four we have working in some capacity. It seems there's a big push to use the Firefox and Chromium WebAssembly engines. They have various tiers of engines: their baseline engines, which do a single linear pass to compile to machine code, and then the optimizing versions.

So I was asked to talk about WebAssembly engines, but first I want to talk about what WebAssembly is in general. There's a 150-page specification, and it specifies the syntax of the language, a grammar — how to construct programs. On top of this definition of the syntax, it defines validity, which guarantees, for example, that a function is passed arguments and returns values of the expected types. And finally it defines execution, and execution is defined as rewrite rules. What does that mean?

Okay. Somewhere in some WebAssembly program we have these two opcodes: i32.const 0 and call 0. What does it mean? What does it do? So first of all, WebAssembly is given to you as a module — a WebAssembly module, a .wasm file. It's a set of functions; a memory; an abstraction that lets you do function pointers, called a table; some things to initialize these tables and memories; and things to import and export. So you have this module, which is a set of functions, and it maps well to an Ethereum account, whose account code is also a set of functions. And you have imports and exports, so different modules can call between each other.

But somewhere in one of the modules, in one of the functions, we see this. What is it? What does it mean? So i32.const 0 is a 32-bit constant, and call 0 means call a function — the function in this module that has index zero. So imagine this is the function with index zero. That's WebAssembly code; you don't have to know what it means. It's some chunk of code that gets executed when you see this call somewhere else in that module.

So this is how a WebAssembly program executes. We see these two opcodes, and we execute these opcodes by rewriting: wherever we see this call, we replace it with that chunk. So what is this chunk? We take everything in this function — the loop, the get_local — we copy it, we paste it here, inside something called a frame. This is just what it says in the spec; I'm just mechanically doing what the spec says to do. It says in these brackets we have the locals: there's one parameter, so we take that const 0 and copy and paste it here. Then we wrap all of this code in something called a block. That's just what it says to do, so that's what I'm doing: I'm rewriting it — I'm deleting the call and I'm replacing it with that.
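To make the rewriting concrete, here is a minimal sketch of spec-style execution by term rewriting — not the spec's actual notation, just two illustrative rules applied to a flat instruction list. The tuple encoding, the `local` marker, and the inlining rule are assumptions for illustration (the spec wraps the inlined body in a frame holding the locals; a plain marker stands in for that here).

```python
# Execution as rewriting: repeatedly find a reducible span of instructions
# and replace it with its contraction, until no rule applies.

def rewrite_step(code, functions):
    """Apply one rewrite rule in place; return False when fully reduced."""
    for i, instr in enumerate(code):
        # Rule 1: (i32.const a) (i32.const b) i32.add  ~>  i32.const (a+b)
        if (instr == ("i32.add",) and i >= 2
                and code[i - 2][0] == "i32.const"
                and code[i - 1][0] == "i32.const"):
            a, b = code[i - 2][1], code[i - 1][1]
            code[i - 2:i + 1] = [("i32.const", (a + b) & 0xFFFFFFFF)]
            return True
        # Rule 2: (i32.const a) (call f)  ~>  body of f, with a bound as
        # local 0, wrapped in a block ("local" stands in for the frame).
        if instr[0] == "call" and i >= 1 and code[i - 1][0] == "i32.const":
            arg, body = code[i - 1][1], functions[instr[1]]
            code[i - 1:i + 1] = [("block",), ("local", 0, arg), *body, ("end",)]
            return True
    return False

functions = {0: [("i32.const", 1), ("i32.const", 2), ("i32.add",)]}
code = [("i32.const", 0), ("call", 0)]   # the two opcodes from the slide
while rewrite_step(code, functions):
    print(code)                          # watch the program rewrite itself
```

No real engine executes this way, as the talk goes on to say, but the rewriting view is what makes the spec's semantics unambiguous.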
Then what happens? Then we rewrite the thing that says block with this. I just read the specification — I'm just doing what it told me to do. Then what? I rewrite where it says loop: I put in label 0, and then I copy and paste all of this text, including the loop part, into here. I wrote it in small text so it could fit. What happens next? This get_local. So we're going along: we saw a block, we saw a loop, and then we see a get_local opcode. What does that mean? What does it do? It says to take that const 0 — that's the first local; there could be more — copy and paste it, delete where it says get_local, and paste in its place what was there: i32.const 0. Then what? So we're going along, and then we see these three opcodes. What do we do with these three opcodes? It's a rewrite rule again: we replace these three opcodes — the add with the sum of the two — so we replace them with a const 1. As you see, it's just rewrite rules. The whole semantics, the whole execution semantics, is just rewriting chunks; we're copying and pasting text. What's next? We have a tee_local, which means we take whatever is above it, put it into the local, and also paste it back where it was, deleting where it says tee_local. I'm just rewriting stuff — I'm not doing anything controversial, it's just what the spec told me to do. Then I'm doing a less-than-unsigned operation and replacing it with a one, because one is in fact less than three. So anyway, this just keeps going and going, and finally I get to this. This is my last slide for the WASM engines talk.

So what's the point? Why did I just take up your time with this rewrite-rule stuff? I went from this, somewhere in my code, and I just rewrote it a bunch of times, all the way down to this. That's how we executed WebAssembly: these rewrite rules. So what's the point? That's how the spec defines it — it's just rewrite rules — but nobody implements their engine like this. People implement their engines with stacks. There could be one stack, there could be two stacks, there could be three stacks. When I implemented my engine, I did it with three stacks, for various reasons: I have a stack for the function call frames, a stack for the labels, and a stack for the operands. Other people keep their function call frames and their control-flow labels in the same stack. There's an advantage to doing it my way, because if you keep these in the same stack, then every time you return from a function you pay linear time when you're nested inside a bunch of blocks, so that can be a slowdown.

The point is, everyone implements their engine however they want to implement it. People do it differently, but their whole goal is to run web apps — they want alien-fighting games in their browsers and things like that. If that crashes, if that fails, who cares? But if everyone's implementing things their own way, then maybe someone's going to have a bug; someone's going to introduce an invariant that they ignore later on, and something's going to go wrong. If it's a web app, who cares — but if it's a dapp, with a lot of money on the line, then we might have problems. I think we have to focus, and spend a lot of time making sure we implement it right, and audit these engines. We talk about Firefox and Chromium: they have micro-optimizations and benchmarks that they're interested in, but do we want these micro-optimizations? I'm just raising the question of whether we trust this stuff. The spec says to rewrite; no one is rewriting; everyone's just doing it their own way. What's the right way to do it? I think we have to audit and keep a close eye on everything. That's the WebAssembly engines talk.
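Here is a hedged sketch of the three-stack engine design Paul describes — one stack for call frames, one for control-flow labels, one for operands — so that popping a frame on return is O(1) no matter how deeply blocks are nested. The opcode set, the Func record, and the unused label stack are illustrative assumptions, not any particular engine's internals.

```python
from dataclasses import dataclass, field

@dataclass
class Func:
    n_locals: int   # locals declared beyond the parameters
    body: list      # list of (opcode, *immediates) tuples

@dataclass
class Engine:
    funcs: list
    frames: list = field(default_factory=list)    # call frames (locals)
    labels: list = field(default_factory=list)    # block/loop targets for br;
                                                  # unused in this tiny sketch
    operands: list = field(default_factory=list)  # the value stack

    def call(self, idx, args):
        func = self.funcs[idx]
        self.frames.append(list(args) + [0] * func.n_locals)
        for op in func.body:
            self.step(op)
        self.frames.pop()   # O(1) return: no labels to scan past in here
        return self.operands.pop() if self.operands else None

    def step(self, op):
        if op[0] == "i32.const":
            self.operands.append(op[1])
        elif op[0] == "get_local":
            self.operands.append(self.frames[-1][op[1]])
        elif op[0] == "set_local":
            self.frames[-1][op[1]] = self.operands.pop()
        elif op[0] == "i32.add":
            b, a = self.operands.pop(), self.operands.pop()
            self.operands.append((a + b) & 0xFFFFFFFF)
        elif op[0] == "i32.lt_u":
            b, a = self.operands.pop(), self.operands.pop()
            self.operands.append(1 if a < b else 0)

# The same computation as the rewrite example, now through an operand stack:
eng = Engine(funcs=[Func(0, [("get_local", 0), ("i32.const", 2), ("i32.add",)])])
print(eng.call(0, [1]))  # 3
```

An engine that mixes frames and labels in one stack would instead pop label entries one by one on every return, which is the linear-time cost mentioned above.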
Okay, switching over — I'm giving the next talk as well. It's called Turbo eWASM, which is about increasing transaction throughput; Turbo eWASM is our code word within our team for increasing transaction throughput. This is joint work with Guillaume, and of course other people on the team have had good conversations with us.

So the big idea: what are we doing now? Ethereum execution is serial execution, as you all know; we're disk-IO bound, they say; and we have at most 20 transactions per second. What does serial execution mean? We're executing transaction one, then we're executing transaction two, then transaction three, and so on, until the last transaction. What does disk-IO bound mean? This isn't to scale, by the way, but if you took an operating systems course, you've seen an image like this: you're doing some work, then you're waiting for IO, and you're waiting — and if you're disk-IO bound, these IO ranges dominate the work ranges. So we're waiting, and waiting, and waiting for disk IO to come back; it comes back; we do work, and we do work, and we do work; and now we're waiting, and waiting again. So we're doing 20 transactions per second, and when engineers see this kind of thing, it bothers them. Why can't we fix this, improve this? Because this is awkward, and we're engineers, and we know how to do these kinds of things. Guillaume and I had some conversations about this.

Now, with Ethereum shards we might have the same thing: shard one might also be limited to 20 transactions per second, likewise shard two, likewise all the shards. So some use cases — plasma on-ramps and off-ramps, state channels, payment channels, whatever — might not be usable if we only have 20 transactions per second per shard. So sharding does help us scale in some sense, but is it meaningful? If a million people want to get out of a payment channel, it's still unusable. So let's talk about increasing throughput on each individual shard, and maybe across shards.

When people explain why sharding works, they start talking about independent universes, independent galaxies — things are independent. What does independence mean? I guess we'll define it: two transactions are independent if they do not touch the same state location. Here's an illustration with transaction one and transaction two: transaction one touches the upper-left location, transaction two touches the two at the bottom, and you can consider these transactions independent of each other. So across different shards we're guaranteed, they say, that a transaction in shard one and a transaction in shard two are independent of each other. That's the whole idea: they're independent galaxies, so we can execute them concurrently. But how about transactions within the same shard? The focus of this talk is within-shard transactions and cross-shard transactions — a cross-shard transaction being, for example, a transaction from shard one to shard two.

Okay, so what does someone who wants to scale do? An engineer will usually go to, perhaps, Wikipedia, look up models of concurrency, and read about actor models and process-calculus models, and ask whether we can use these things. And people are looking at this. I spoke with the actors-with-capabilities people — the gentleman who invented Primea, which DFINITY is using, and Agoric is using these ideas too — and on the process-calculi side, RChain is doing great work, fantastic work. They're talking about parallelizing and increasing transaction throughput, and they're also using this idea of independence.
With Primea — or with actors and capabilities in general — they're called vats, and a shard is equivalent to a vat in some sense; in the process calculi they call them namespaces. So everyone's using independence. So a good scientist or engineer says: what is this independence thing? Let's define what this means, and let's take it as far as we can take it — because we're engineers, we're working at the margins, let's see what we can do.

Okay, so we're going to define what an independent subset of a block is. An independent subset of a block of transactions is a subset such that each transaction in the subset is independent of all transactions outside of that subset. That's the toughest definition of this talk. For example, an independent subset is transactions two and three: they're independent of all transactions outside of the subset — transaction two is independent of one, transaction three is independent of one — so these two form an independent subset. Because transaction one does not write to any location that transaction two and transaction three write to? Correct. So we have a sort of visual proof: you can see what transaction one writes to, and it's independent based on the definition of independence from the previous slide, which is a relationship between two transactions.

And an independence partition of a block of transactions is a partition of it into independent subsets. We know the word from disk partitions: a partition of a set is breaking it into subsets such that their union is the whole set and their pairwise intersections are empty. So the independence partition of this block would be: one subset with just this transaction, a subset with this transaction, and another subset with these two transactions. And we notice that we might be able to run these concurrently, because they're independent of each other — just like shards are independent of each other.

Okay, so how do we build this independence partition? For three transactions we could just look and point and say, look, these are independent — but if we have a whole block of things, it might be difficult. So we have some algorithms to partition a pool — a transaction pool or a block of transactions — into independent subsets. What is an access list, first of all? There's EIP 648, a proposal where with each transaction you say: I'm going to touch these state locations and only these state locations. If I violate this, then my transaction is invalid. So there's a push for access lists, and in shards too. So there are two cases: either we can do it with access lists, or without access lists. Let's talk first about with access lists.
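Here's a small sketch of the partitioning idea using per-transaction access lists in the EIP 648 style. Two transactions conflict when their access lists intersect; union-find merges every conflicting pair, so each resulting group is independent of everything outside it. The data shapes (sets of opaque location labels) are assumptions for illustration.

```python
def independence_partition(access_lists):
    """access_lists: one set of state locations per transaction.
    Returns the independence partition as lists of transaction indices."""
    n = len(access_lists)
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    # Compare every pair, like the conflict matrix in the visualization:
    # an intersection is an "orange square", and we merge those groups.
    for i in range(n):
        for j in range(i + 1, n):
            if access_lists[i] & access_lists[j]:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return list(groups.values())

# Tx0 touches {A}; Tx1 and Tx2 share location B, so they depend on each
# other: the partition is {Tx0} and {Tx1, Tx2}, as in the slide's example.
print(independence_partition([{"A"}, {"B", "C"}, {"B", "D"}]))  # [[0], [1, 2]]
```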
So I'm going to do a visualization, and it might be tough. We have our block, and we order our transactions — just forget about anything below — from transaction zero, transaction one, all the way to transaction 63. And then we write the same transactions along this column. We notice that transaction zero and transaction 63 interact — they are dependent on each other — so we put an orange square there. This is based on access lists: we take these access lists and compare them to see if they intersect, and for transaction zero and transaction 63 they do, so we put an orange square there. And we can partition this block of transactions into independent subsets. So this subset — you know, this chunk of transactions — is independent of all other transactions. Why? Because there are no orange squares colored in here. So this is maybe the kitty-cat app, this is a decentralized exchange, this is some mixer, this is some ICO, these are one-off transactions, and we can execute each of these chunks independently of each other, because of this independence property — just like shards are independent. (Yeah, I'm just asking whether you want me to do that so you don't have to walk over every time — I've got it, okay.) Likewise, if you have many more transactions, you can break them into independent subsets. So this is just an idea to parallelize a given block — we're not doing it yet — but you see these independent subsets, there are many of them, so we can do it, we can scale this.

Okay, without access lists it's awkward. We currently don't have access lists in Ethereum, so we might execute a transaction and see if it hinders our concurrency: if it touches a bunch of these independent subsets, then we won't include it. Of course there's a DoS attack — in the worst case we go back to serial execution — but this is one possibility to scale the main chain, and it's all based on independence. It's a simple idea — independent universes, galaxies — and we're just taking it as far as we possibly can, like scientists or engineers do. So then we execute each independent subset on its own CPU, perhaps. We've already known this since EIP 648, this is nothing new, but this looks like the sharding picture — shard zero, shard one, to shard n — except each one is sort of not a shard; we're dynamically creating shards, in a sense. And we're solving two important problems: within a single shard and across shards, we can use these ideas.

So Guillaume and I are trained in computer science, and we talk about operating systems. When you take an operating systems course, you learn about threads and preemption — and we have this disk-IO bottleneck, so our idea is to use threads. Each thread corresponds to an independent subset. So we start executing thread one — the top-left independent subset, or whatever — and it asks for disk IO. There's a scheduler in operating systems, as you know; it will preempt this thread while it's waiting for IO and schedule this other thread, so it's doing work. Then this thread starts waiting — this independent subset is waiting for IO — so it's preempted, and this one's doing work. So essentially we're trying to use 100% of the CPU; we're trying to eliminate this IO bottleneck, whether it's disk IO, whether it's network IO, whatever it may be, and we're just using things from operating systems. We can just use a regular operating system with pthreads or whatever — this stuff is available now — or do it with many CPUs. So hopefully we can use 100% of the hardware. That's the dream of computer scientists, I think: use 100% of the hardware.
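A rough sketch of that scheduling idea, under obvious simplifications: each independent subset runs on its own thread, a `time.sleep` stands in for a disk or network wait, and the OS overlaps the waits exactly as described. None of this is the team's actual implementation.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def execute_subset(subset_id, txs):
    for tx in txs:
        time.sleep(0.01)   # stand-in for a disk/network IO wait; while this
                           # thread blocks, the scheduler runs another subset
        # ... apply tx to this subset's slice of the state ...
    return subset_id

# Four independent subsets of ten transactions each.
subsets = [[f"tx{i}-{j}" for j in range(10)] for i in range(4)]

with ThreadPoolExecutor(max_workers=4) as pool:
    finished = pool.map(execute_subset, range(len(subsets)), subsets)
    print(list(finished))   # all four finish in roughly the time of one,
                            # because the IO waits overlap
```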
So yeah, let me clean up. As a quick reminder, the question — thank you — was about the bottleneck. So, IO: Paul has very well explained that IO is a bottleneck. In Ethereum 1.0 it's disk IO — apparently it's debated whether it's the biggest bottleneck, but it's definitely up there. In Ethereum 2.0, if you start having stateless clients, if you need to get your state from the network, it's not so much disk IO — well, ultimately it will be disk IO, but it's not so much disk IO as network latency and things like that. But that means you will spend a lot of time waiting for your data to arrive or to be written, and as a result it really makes sense to try to get the computer, the miner, to do something while this is happening.

So what we want to do with this work — that was the preliminary research work — what we want to do is extend the definition of independence between transactions. For example, imagine you have, let's say, an ICO or a CryptoKitties kind of profile, where a lot of transactions correspond to the same contract. Can you also generalize independence to calls within the same contract? Can you, for example, use something like SIMD, or some MapReduce kind of programming paradigm, so that you can make sure that every single thread, every single transaction, touches a different area of the state, and run them concurrently knowing that they won't interfere? So yeah, that's what we're looking into.

We also want to look into other ideas, for example gas calculation. You heard before about the metering, the sentinel contract — something that will scan your contract and inject calls to useGas, to make sure the gas gets properly counted. What we would like to explore is the possibility of pre-metering the contract, to reduce the need for this — because first you have to scan the contract, but then every time you have to make a call, and that's kind of inefficient. So we want to see if we can just have simple gas rules — see if we can have an upper bound, or a formula: we just look at your binary, we say, okay, this is roughly what it's going to cost, and we charge that.

One of the interesting things, where wasm really brings a benefit, is to replace contract calls with function calls. Right now, when you call a contract, you have to re-instantiate a lot of things — and I'm not just talking about gas: you create pretty much a new VM, a lot of data structures. But ultimately, what you want to do is a function call. You're not actually calling a contract on a different computer; you're calling something on the same computer, so why not make it a function call? So you would have something like — you know, pseudocode, pseudo-Python — import that function from a contract address, and just get everything loaded. And then you can start doing lots of things that get inspired by operating systems, like managed modules — the way operating systems manage libraries: if you see that a module gets loaded very often, you just leave it in memory, so that when you load it, it's already there. You could also require modules to have the full list of contracts they use — basically, just as you had access lists for state locations, you could enforce the same thing for modules.
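A hedged sketch of the "manage modules like an OS manages shared libraries" idea: contract calls become function calls into an instantiated module, and hot modules stay resident instead of being re-instantiated per call. `load_module`, the LRU policy, and the pseudo-import at the end are assumptions, not an existing Ewasm API.

```python
from collections import OrderedDict

class ModuleCache:
    """Keep recently used contract modules instantiated, LRU-evicted."""

    def __init__(self, capacity=64):
        self.capacity = capacity
        self.cache = OrderedDict()           # contract address -> module

    def get(self, address, load_module):
        if address in self.cache:
            self.cache.move_to_end(address)  # frequently loaded: stays resident
            return self.cache[address]
        module = load_module(address)        # instantiate once, not per call
        self.cache[address] = module
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)   # evict the least recently used
        return module

# Hypothetical usage — "import that function from a contract address":
cache = ModuleCache(capacity=2)
mod = cache.get("0xabc", load_module=lambda addr: {"exports": {"f": lambda x: x + 1}})
print(mod["exports"]["f"](41))  # 42 — a plain in-process function call
```

Requiring modules to declare which contracts they import, as suggested just above, is what would let a scheduler preload such a cache ahead of execution.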
We also got inspired by TurboGeth — unfortunately, the TurboGeth presentation was at a concurrent time in a different room. But yes: try to replace all the state storage, or at least try to optimize the way it's stored, the way it's accessed. There was also a suggestion from — the name escapes me... Alexey? Thank you. Yeah, very smart guy — to maybe have all those contracts, all that state, in a linear, contiguous space. Why not — with protection, of course. Yes. And we also want to use some of the features of wasm that are on the roadmap — they haven't been released yet, or at least not officially agreed on yet — but you have mutexes, you have SIMD. For those who don't know, SIMD is single instruction, multiple data: the processor has one instruction that does the same thing several times, so it uses the parallelism that is inside the processor. And yeah, that's what we're looking into, and I think that was it. It doesn't work — yep, that's the last one. Okay, that's the last one. Thank you.

Thank you very much. Thank you, Guillaume and Paul — that was really intense. Big applause again for them. One addition to all of this: we are considering all these different improvements, and we're in the really lucky position that we can consider adding them to the wasm testnet, or spinning up a new wasm testnet and adding them to it. Hopefully, if we manage to add any of these, it's going to be a really good test bed for having them fully specified and finally implemented on mainnet. Now, next on stage is Casey Detrio. Casey has been around forever; he's really into research, and he's into keeping Ethereum alive and improving Ethereum. His talk will be about a new idea for doing scaling on a much bigger scale, and we may also consider having this on the wasm testnet at some point. Big applause for Casey.

Thank you. It's not exactly a new idea — it's just Ethereum 2.0, the Shasper and Serenity research, which I heard in the news recently is finally stabilizing. Which is good to hear, because maybe now we can answer one of the most basic questions: just how are we going to deal with a large and growing account state? This is one of the main problems in scaling — even if we wanted to scale 1.0, we would have to deal with the account state. Sorry. And this actually doesn't really have anything to do with Ewasm, because Ewasm is just an execution engine, and the execution reads and writes the state. But if you want to implement an execution engine, then you've got to decide how you're going to handle the state — so I guess it kind of falls on us to decide that, and we wanted to say a little bit about it. This talk is not going to be highly technical; it's supposed to be an overview of the developer experience and the user experience, and the challenges and changes that will have to be made in order to scale Ethereum. So the goal is having a lot of users and a lot of transactions, where the gas price remains cheap — if the gas supply is limited and you have a lot of users, then the gas price goes sky-high. So how do we keep gas prices cheap? Well, if you don't mind sacrificing decentralization, you could just run all transactions on 21 validator nodes on supercomputers. But if you don't want to trade off decentralization, then ideally you'd be able to run a validator node on any consumer laptop — that way you can just keep adding more laptops to the network and scale. So, to solve this dilemma, our hero: sharding — Shasper McShardface. What's the plan? Phase one: we collect a lot of data blobs. We have a pretty comprehensive spec for doing this — you know, there are like eight different teams building it out; it's the beacon chain — and they say some of these prototypes should be functional soonish. And then there's phase two. Well, for phase two we think we're going to use Ewasm, but I mean, we're not really sure how we're going to do phase two; we still have to iron out some details.
But there's good news, because phase one is actually the hard part. Phase one is the consensus protocol, and the consensus protocol comes to consensus on the order of the data blobs. Coming to consensus is actually very complicated, because the outcome is not deterministic: every validator has a different view of the network, data blobs arrive in different orders at different validators, and we have all kinds of hairy game-theory issues with malicious actors and network partitions. Yeah, it's a challenge. Phase two, relatively speaking, is much more straightforward. It's a deterministic system; the order of the data blobs is already known. You have one block, you process it; the next block comes, you process it; and so forth. So if phase one is behind us, and you only have to worry about the execution engine in phase two, then you're actually, you know, not that far from the finish line.

But we learned yesterday that Shasper is actually kind of lame — he can be a bit of a buzzkill. My two main complaints about Shasper are, for one, that it won't scale the contracts that exist on the main chain. Shasper — Serenity — will be a new parallel universe of 1,000 empty shards, and users will have to migrate to them: redeploy contracts, transfer state. But when it launches, it will be a ghost town. Get it? Secondly, the gas prices are going to be independent on each different shard. In theory this is not really a problem, because if you end up on a really popular shard and the gas price is expensive, then you can just move to a cheaper shard where there's not as much activity. I think this is kind of like telling people who live in San Francisco, and are complaining that the city is not scaling its supply of housing, that it's no problem because you can just move to North Dakota. No problem.

The scaling problem in general — 1.0 or 2.0 — is that there's too much state. Users pay a one-time gas fee to create an account in the state tree, and that account is there forever; all the full nodes, all the miners, just have to keep all this junk around indefinitely. So what techniques, what approaches, do we have to reduce the state? We have two of them: stateless clients and storage rent. With stateless clients there's a single 32-byte state root. Miners — validators — have only the 32-byte state root, and when you as a user send a transaction to the network, you have to supply the Merkle proof of your account, so you can prove that your account has such-and-such a balance. And these Merkle proofs have to be up to date — I have a visualization of that next. In contrast, with storage rent, users basically pay validators to do this job: to keep the account in their state and to keep the Merkle proofs up to date. And when you have storage rent with eviction, or sleep-and-wake, then if accounts don't pay rent, they get evicted from the tree; but there are ways they can revive their account and come back into the tree, and those techniques are very similar to how stateless clients work. So it's kind of like, with storage rent, there are two classes of users: the first class, the people who can afford the rent, for whom validators handle all the Merkle proofs; and the second class, all the poor people who have been evicted and will have to supply their Merkle proofs in order to reawaken their accounts.
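To make "supplying a Merkle proof against a 32-byte root" concrete, here's a minimal sketch: a plain binary hash tree stands in for Ethereum's actual trie, and the proof format — a list of sibling hashes from leaf to root — is an assumption for illustration.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify(root: bytes, leaf: bytes, index: int, branch: list) -> bool:
    """Check a leaf against the 32-byte root using its Merkle branch."""
    node = h(leaf)
    for sibling in branch:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

# Four accounts; the validator keeps only `root` (32 bytes).
leaves = [b"acct0:10", b"acct1:5", b"acct2:42", b"acct3:0"]
hashes = [h(x) for x in leaves]
l01, l23 = h(hashes[0] + hashes[1]), h(hashes[2] + hashes[3])
root = h(l01 + l23)

# A transaction from account 2 carries its own proof of balance:
print(verify(root, b"acct2:42", 2, [hashes[3], l01]))  # True
```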
With stateless clients there's just one class — it's everybody; you know, everybody flies coach. Everybody has to supply their Merkle proofs. It's not a new idea, either: this is from the Ethereum analysis report from Least Authority back in 2015, and it was even discussed in Bitcoin before then. The problem with keeping your Merkle proofs up to date is this. At the top of the tree is the state root; at the bottom are the leaves, and these are where the accounts are. So your account is one leaf, and next to you is another leaf — somebody else's account. The branch — the proof data that you want to supply — is the Merkle branch from your leaf account up to the state root, and in that branch are these intermediate tree nodes. Now, these intermediate tree nodes change when the guy next to you makes a transaction. So if you go offline and then come back, it doesn't matter that you didn't make any transactions; it matters that other people were making transactions, and now your proof is out of date. So you have to go back in history and sort of process these transactions to get your proofs up to date.
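Continuing the toy tree from the previous sketch (with its helpers repeated so this runs on its own): your saved proof goes stale the moment a neighboring leaf changes, even though your own leaf didn't, and replaying the missed change is what rebuilds the branch.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify(root, leaf, index, branch):
    node = h(leaf)
    for sibling in branch:
        node = h(node + sibling) if index % 2 == 0 else h(sibling + node)
        index //= 2
    return node == root

# You are account 2; account 3 is the leaf next to yours.
hashes = [h(x) for x in [b"acct0:10", b"acct1:5", b"acct2:42", b"acct3:0"]]
l01 = h(hashes[0] + hashes[1])
saved_branch = [hashes[3], l01]   # the proof you stored before going offline

# While you're away, your neighbor transacts; the intermediate nodes
# above you change even though your account didn't.
hashes[3] = h(b"acct3:99")
new_root = h(l01 + h(hashes[2] + hashes[3]))

print(verify(new_root, b"acct2:42", 2, saved_branch))      # False: proof is stale
print(verify(new_root, b"acct2:42", 2, [hashes[3], l01]))  # True: branch rebuilt by
                                                           # replaying the missed block
```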
People really don't like either statelessness or storage rent. The reason they don't like statelessness is, again, the problem of keeping your Merkle proofs up to date. There's also the guy who flirts and smokes weed a lot: when I wake up from cryosleep, I just want my phone to work. Imagine waking up 50 years later when your proof went stale within the first year after you went to sleep. If those blocks are still available somewhere, then you're okay; but the question is, what happens if those blocks are no longer available? Well, you're screwed. And sure, there could be some service that says, I promise I will store all the old blocks — even if you come back in 50 years, trust me, they'll be here — but people say: I don't want to trust these rickety layer-2 services; I want some guarantee that validators will be storing my account. And obviously people dislike rent, because you get evicted if you don't pay. Also, capacity is limited, so the fees can spike high, and it breaks the user experience we've come to expect — that a contract will still be there. There are workarounds for this, like, again, this sleep-and-wake mechanism, but that just adds complexity.

Now, under the hood, from the implementer's point of view: if you're starting from scratch, then in my opinion stateless is obviously the way to go. It's clean, simple, ideal — the validator only has to store a 32-byte state root. Rent is messy and complex; there are lots of parameters you have to decide — do you set it at 16 gigs or 32 gigs, how do you do the fee market, and so forth. I think for some people stateless is almost too simple; engineers sometimes like complexity, and they just can't resist adding a rent feature on top of statelessness. If you're not starting from scratch — if you're in 1.0 today — then it's probably easier: you already have a full tree that the miners are forced to store, so it's probably easier to add on a rent feature and start trimming the accounts than to swap out the whole thing and switch to statelessness.

So lastly, the point is that change is coming. Currently, miners have this full burden, and while it seems kind of easy — because somebody else is doing the work: if you're using Infura or MetaMask or a light client, it's relatively easy to sync — if you're running a full node, it's still difficult to sync. What is going to happen is that it's not sustainable for miners to keep storing all the state, so the disk space for validators is going to be minimized. At the core of the network you'll be able to sync very quickly, so life for a validator will become great: the process of syncing and processing blocks will become very streamlined. But on the other hand, if you are a user or a wallet client, life is going to get somewhat painful: syncing is going to be harder; you'll have to scrape the data, manage your own data, and come up with your own proofs — unless you can pay the rent. So that's it: wallet developers are in for a tough time. Thanks.

Thank you, Casey — big applause for Casey again. So, thank you all for being here today. We're going to pull up the URLs again: please join the Gitter room, and try the testnet — the URL for that is ewasm.ethereum.org, and all the demos that Jarrett showed should be visible there. Join the Gitter and get the discussion started. We're going to have the EVM panel now, focused on the EVM, and we'll probably be free after that if you guys have any more questions. That's all the details you need to know. Thank you.