Okay, good morning everybody. I'm Merylian, security and protocol dev at Optimism, and this morning I'm happy to introduce Franklin and Olivier from Consensys, who will be speaking to you about proving EVM bytecode execution in the ZK EVM. Take it away, guys.

Hello, hello everybody. Thank you for coming. So yeah, we are both from Consensys R&D. We've been working on a ZK EVM for about a year, and in this talk we want to tell you about its arithmetization and how we implemented it, and we'll have an announcement at the end.

All right, so first of all: why ZK rollups? In terms of scaling Ethereum, one of the bottlenecks addressed by a ZK rollup is that of state validation. To validate a state transition you need to execute the transactions within a block, and since the state is a big object, this validation is a resource-intensive operation. The promise of a ZK rollup is to alleviate that workload from most of the nodes in the network. When you have a ZK rollup, you have a powerful node, the operator, which provides proofs of state transitions. These proofs typically verify the following: the validity of the transactions in the batch; the fact that the internal logic of the rollup is respected (rollups are usually application-specific); and the fact that the transactions induce the proposed state transition. Here and throughout, when we talk about proofs we mean proofs of computational integrity, which in practice are implemented as zero-knowledge proofs.

So a ZK EVM is a particular kind of ZK rollup. The thing that makes it a ZK EVM is the logic part: the logic that is executed and proven is the execution of the Ethereum virtual machine. It also means that the incoming transactions obey pretty much the same format as standard L1 transactions.
Okay, so how does that work in concrete, or semi-concrete, terms? You have your L2, which has a state, and a bunch of transactions roll in; executing them induces a state transition. The ZK EVM plugs itself in at this point: it extracts the required data from the previous state, takes into account the transactions and the diff of the new state, does its magic, and produces traces, which it passes on to a prover. The prover then produces a proof which ends up on mainnet. In this way you bundle all of these L2 transactions into a single L1 transaction.

So why would one want to build a ZK EVM? There are two parts to this: the ZK part and the EVM part. The advantage of the EVM part is that it allows you to reuse all the existing tooling which has been developed for L1. You can write and deploy smart contracts on L2 just as though you were doing it on L1, and furthermore you can redeploy already-deployed bytecode onto L2. The ZK part gives you the scalability boost, and also faster finality, just because of the fact that there is a proof associated with it. So that makes it interesting as well.

Okay, when we set out on this project we had a few goals. We wanted our ZK EVM to prove the execution of unadulterated native bytecode and to respect the logic specified in the Ethereum yellow paper, with full coverage of all the opcodes. Where we allowed ourselves to deviate is in the representation of the state: for instance, we will not be using Keccak.
We are building a type 2 ZK EVM in the sense of the classification put forward by Vitalik. This project, I'm sure you're aware, presents a lot of challenges. There's a lot of complexity that comes from the EVM itself, which is composed of many parts that are tightly coupled and have complex interactions. There are a lot of intricacies that are really specificities of the EVM: you have families of opcodes with slight variations in their execution, slight non-uniformities. There's a completely different kind of challenge, which is that of auditability; Franklin will touch upon that in his portion of the talk. And the main challenge, which everybody faces today, is that of performance and efficient proving schemes. That is something we will communicate on at a later point; today is really about the arithmetization.

Okay, so here's the basic setup of how it's going to work. You have a modified Ethereum node, an execution client, which receives transactions sent by users or by dApps. We plug ourselves into this execution client and extract some data which we use to fill some traces; I'll be talking more about traces later. These preliminary traces are fed into a tool which we call Corset, which does many things: among them, it is responsible for producing the constraint system, and it also expands and computes the remaining parts of the trace. All of these constraints and expanded traces are then fed into our inner proof system. The inner proof system we use is not compatible with Ethereum per se, so we have to feed it into a verifier, which is a circuit over BN254; this is where we plug in with gnark, and gnark then produces the outer proof, which is posted on mainnet.

Okay, let's talk a bit about the arithmetization. First of all, on Monday we published an updated and expanded version of the spec, which is now a pretty hefty document, and its contents are basically the arithmetization of the EVM. When we talk about arithmetization, in this talk at least, we mean writing down a system of polynomial equations whose simultaneous satisfaction perfectly captures a particular computation performed on a particular set of inputs. For us, the computations of interest are valid executions of the Ethereum virtual machine given a set of transactions and an initial state.

Okay, so since the EVM is a beast of complexity, it pays off to decouple as many of its components as we can, to work in a modular fashion, and to concentrate the complexity in different places. So this is the general architecture. We have the central piece, which we call the hub, which is basically our stack and our call stack, and then we have plenty of smaller modules that are tasked with specific kinds of operations, such as arithmetic, binary operations, or storage. There are also the memory parts, which are the MMU and MMIO modules.

Okay, and when you run the ZK EVM, what do you get?
You get these traces that I was telling you about, which are large matrices containing data represented as field elements. There is one such trace per module, and each trace obeys its own internal constraint system. On the previous slide you had these arrows pointing from one module to another; those are basically Plookup connections, which allow us to transport data from one place to another. The other kind of constraints are the internal constraints: for instance, when you update the program counter, you expect something particular to happen. But you also have a third kind, global consistency constraints, which range over the entirety of the block rather than two or three consecutive rows. These may express properties such as: when you re-enter an execution context and you load something from RAM, you get what you last put there.

So let's zoom in a bit on this central piece, the hub, which is, as I said, our stack and our call stack. It gets its instructions from the ROM, and once it has an instruction, it dispatches it wherever it makes sense. Once it has loaded an instruction from ROM, it first does some preliminary decomposition of that instruction: it extracts some parameters which are hard-coded, and it decides on certain things, such as how to interact with the stack, how much data to extract from the stack and where to put it in the trace (basically the layout of the data); it also raises some module flags, et cetera. The next step is then to dispatch the instruction, but before we can go there we actually have to deal with potential exceptions. The hub also deals with the exceptions: some of them it can detect itself, while others it imports from other modules. If an opcode makes it past this hurdle, then the instruction dispatching per se kicks in: you have these flags, these activation bits, that light up, and you have some stamps that are updated to keep track of temporality. At this point the activation flags tell you what will be active. For instance, when you do a CREATE you will be touching RAM, so you'll be touching those two modules, MMU and MMIO; you will also be touching ROM, because you'll be deploying initialization code, and you will touch gas and memory expansion. It's the same thing for CREATE2, but since there's a larger hash involved for the initialization code, you also tap into some hash modules.

Okay, so let's talk a bit about the next big piece, which is RAM. RAM is probably the most complex piece in our arithmetization; by the way, all the figures that I put here are available in the document. One reason why it is so complex is that it has all these data stores with which it can interact, and the different instructions which interact with RAM actually have different sources and targets.
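The dispatching step just described can be pictured as a simple table lookup from opcode to the set of module flags it raises. Here is a toy sketch in Python; the module names and flag sets are simplified illustrations based on the examples given in the talk, not the actual spec:

```python
# Illustrative only: which module activation flags an opcode raises.
# CREATE touches RAM (MMU + MMIO), ROM (init code), gas and memory expansion;
# CREATE2 additionally taps a hash module for the larger init-code hash.
MODULE_FLAGS = {
    "CREATE":  {"MMU", "MMIO", "ROM", "GAS", "MXP"},
    "CREATE2": {"MMU", "MMIO", "ROM", "GAS", "MXP", "HASH"},
    "ADD":     {"ALU"},
}

def dispatch(opcode: str) -> set:
    """Return the set of module flags raised for this opcode."""
    return MODULE_FLAGS.get(opcode, set())
```

In the arithmetization these flags are binary columns that light up per row, which is what routes the instruction's data to the right modules.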
So there's already some first complexity there. The next source of complexity comes from the fact that operations which are atomic from the point of view of the EVM, such as returns or creates, have to be broken down into smaller elementary operations in the ZK EVM. So the first task is to do a lot of offset computation, deciding when some padding has to be done, that sort of stuff. Once all of this has been decided, you can start writing instructions, and this is still just writing them down without executing anything: you just have this sort of workflow that tells you, in this case, I need to do this and that with some exo-RAM, slide this chunk, and so on. Once you're at this point, you are at the phase where you can actually start doing something. This is the work of the MMIO, which is the actual RAM in a sense: the component that touches data and actually does the byte decompositions, recombinations, slicing, dicing, surgeries, et cetera. And then you have those consistency checks that I told you about earlier, which basically finish the memory part: what you've written last is what you should retrieve next time.

So I'll stop here for the arithmetization and hand it off to Franklin.

Thank you for this very extensive description of the constraint system and the arithmetization. Thank you.
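The memory-consistency idea just described can be sketched in a few lines. This is a simplified model, not the actual constraint system: sorting the accesses by (address, timestamp) turns the global property "a read returns the value last written to that cell" into a purely local check between consecutive rows, which is the standard trick for expressing such constraints row by row.

```python
def memory_consistent(accesses):
    """accesses: iterable of (addr, stamp, is_write, value) tuples.

    After sorting by (addr, stamp), each row only needs to agree with
    the row just above it. Fresh (never-written) cells are assumed to
    read as zero, as in EVM memory.
    """
    rows = sorted(accesses, key=lambda r: (r[0], r[1]))
    prev_addr, current = None, 0
    for addr, stamp, is_write, value in rows:
        if addr != prev_addr:      # entering a new cell: it starts at zero
            current = 0
        if is_write:
            current = value        # a write updates the cell
        elif value != current:     # a read must return the last write
            return False
        prev_addr = addr
    return True
```

A real arithmetization would express the same invariant as polynomial constraints over the sorted columns rather than as imperative code.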
I'm going to talk to you about how to go from this, let's say, conceptual data to how we actually implement the whole thing. There are a lot of challenges when you want to go from the specification to the implementation, and the biggest one is that there are three moving parts. On one hand we have the actual specification, 250 pages; then we have the implementation of the specification; and finally we have the proof system that verifies the traces. All of this is developed by different people working on different teams, yet we still need to maintain one hundred percent conformity between all of these pieces. Of course, if the prover is proving something other than what the spec describes, which is itself something other than what is actually implemented, nothing will work. And finally, it's hard to audit three diverging code bases, or three diverging sources of truth.

So what kind of solution did we find? We developed a formalized single source of truth, which is then exported to multiple targets. We have, in a format that I will show you in a few seconds, a description of all of these constraint systems, and from this single source of data we are able to produce, first, a Go library defining the constraints on the data, which is then used by the prover to actually prove things. Then we have another Go library that is used by the ZK EVM implementation to ensure conformity with the specification. And finally, we can generate LaTeX for integration within the final specification document, the 250 pages of PDF.

So here is how it looks. You can see that this is a very clearly Lisp-inspired language, with a lot of parentheses and that kind of thing, and this is a very simplified example of what you could find, for example, in the MMU. So what do we have in this?
First we have the column definitions. These are the columns of the matrices that Olivier showed you a few minutes ago, and you can see that they can either be normal, like the one at the top, or typed: we have a very rough typing system that is used for some optimizations. Finally, purely for ease of implementation and ease of use, we have some very simple arrays, that kind of thing.

Afterwards you have helper functions, which are functions that can be defined like any other Lisp function to act on this data. For instance, here you have two: the first one checks that two arrays of length 8 are element-wise equal to each other, and the other one checks that an array of 8 elements is actually a byte decomposition of a given value. These do not really exist per se; they are just small pieces of syntactic sugar for ease of implementation. And finally we have the meat of the data, which is the constraints themselves.
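The two helper functions just described might look like the following in ordinary code. This is an illustrative Python rendering of their meaning, not the actual Lisp DSL syntax:

```python
def elementwise_equal(a, b):
    """Check that two length-8 arrays are element-wise equal."""
    return len(a) == 8 and len(b) == 8 and all(x == y for x, y in zip(a, b))

def is_byte_decomposition(value, limbs):
    """Check that `limbs` is the big-endian byte decomposition of `value`.

    Eight limbs, each constrained to the byte range [0, 256); they must
    recombine to the original value.
    """
    if len(limbs) != 8 or any(not (0 <= b < 256) for b in limbs):
        return False
    acc = 0
    for b in limbs:
        acc = acc * 256 + b
    return acc == value
```

In the DSL these expand into polynomial constraints over the relevant columns; here they are just plain boolean checks for intuition.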
So here we have an example of two or three constraints. The first one is doing a byte decomposition of some data, the other one is checking that memory is aligned, and so on and so on. From this, we run everything through Corset, which Olivier evoked a few minutes ago. Corset does quite a few things; among others, it generates a lot of Go code for the prover, as you can see here, and you can imagine how writing this by hand would be a living nightmare. We can also generate LaTeX: here you have, for instance, a piece of LaTeX code and the PDF rendering that is ready to be incorporated into the spec. So this whole thing that we call Corset is really a cornerstone of our workflow and our implementation of the ZK EVM.

Now I will talk to you about some results we have reached so far, going from the specification to the implementation to actual results. The first thing we want to benchmark our implementation against is the EVM test suite. Of course, the EVM test suite is the gold standard for Ethereum clients, and it is not restricted to the EVM itself. We have tested our current implementation of the ZK EVM, which is well advanced but not yet finished; here is the list of modules that are ready: the hub, the MMU, the ROM, the ALU, the binary module, and some comparator functions. We ran our EVM on the over 17,000 tests, and 16,000 are a success, meaning that they run and the traces are validated. Zero are failing, so we do not have any problem handling these tests. And finally we have 1,303 tests which hit features that are not yet implemented in the ZK EVM, namely the precompiled contracts and the SELFDESTRUCT operation. So for now we have a 92.6% success rate on the EVM test suite, which we believe is a good start.

Another way to test and benchmark our implementation is, of course, to work on real data, right?
So we have quite a few real-world examples, but the most striking ones, I believe, are the successful execution and validation of a proof of existence using the Uniswap contract, and the successful execution and validation of a random mainnet block. What we do is run our Ethereum client on mainnet and generate traces for some blocks here and there, and then we validate all of the constraints.

So I will show you a little bit of this work. We are going to check the verification of the constraints on the traces we generated for this block, 0x35b7e90 and so on. This block was created yesterday afternoon. So let us start, and while it is working I will show you a little bit of what is in this block. Here is the data on Etherscan for this block. We have quite a lot of different transactions in it: a basic transfer transaction, some wrap operations, some multi-calls, a failing transaction, successful transactions, Uniswap, Tether. So it's quite a nice example. It's not a very big block: it has only 52 transactions, if I remember correctly, and it uses only 2.5 million gas, but it is still quite inclusive in what it provides.

If we take a quick look at the traces we generate for this block, here you have a decomposition of all the data. In the very big red block at the top of the screen you can see that in the end we generate roughly 13.5 million cells, which is the actual content of the traces. The biggest contributors are the ROM, with 5 million cells, then the hub with 3 million, then the binary module with 2.5 million. All of that culminates in these 13 million or so cells that have to be proved and cryptographically checked.

So here we can see that Corset is actually doing a very naive check of all of this, because it basically just runs all of the numeric constraints we defined on all of the rows of the matrix, one by one, which is obviously very suboptimal and absolutely not cryptographic. It is only a debugging tool that I'm showing you to prove that our stuff works, and as you can see the validation is successful: block 0x35b7... is actually validated against all our constraints. Our EVM will be able to give all of this data to the prover, and the prover will be able to generate the proof and put it on the main chain.

Now, going back to the presentation. Sorry, I can't play the video because there is no Wi-Fi, so you get the development version. So, in conclusion: the complexity of the EVM implementation has been, well, completely solved with modules, as have the intricacies of the EVM.
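As an aside, the naive, non-cryptographic check that Corset runs over the traces amounts to evaluating every constraint on every row and requiring each residue to vanish. A toy version, with a made-up column name and constraint for illustration:

```python
def check_trace(trace, constraints):
    """Naive trace check in the spirit of Corset's debugging mode.

    trace: dict mapping column name -> list of values (one per row)
    constraints: functions (trace, row) -> residue, zero when satisfied
    """
    n_rows = len(next(iter(trace.values())))
    return all(c(trace, i) == 0 for i in range(n_rows) for c in constraints)

# A made-up local constraint: the STAMP column increases by 0 or 1 per row.
def stamp_increment(trace, i):
    if i == 0:
        return 0
    delta = trace["STAMP"][i] - trace["STAMP"][i - 1]
    return delta * (delta - 1)   # vanishes iff delta is 0 or 1
```

The real prover replaces this row-by-row loop with a cryptographic argument, which is exactly why this mode is only for debugging.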
Maybe Olivier, you want to say something about this?

Yeah, so in terms of finalizing the arithmetization, we have two big chunks that are still left to be done, but we are quite confident they will be done in time for the testnet. Regarding auditability, we do not yet have an audit or a formal verification of what we have done, but through our single-source-of-truth mechanism we have laid the groundwork to be able to work on that and to prove, in a single strike, the correctness of all three components with a single audit. And finally, regarding performance: we are now connected to the prover system, and there will be more information about that soon, in an upcoming paper.

So, thank you for your attention. We will be launching a testnet soon, so if you want to join, please scan the QR code or just go to the URL. And if you have any questions, we will be happy to take them.

Can you give a high-level overview of the differences between your implementation and the other current ZK EVMs, such as those from Scroll and Polygon?
I don't know exactly the inner workings of Hermez and Scroll. I know that we share some design principles with both; I know that our prover will be different, and our arithmetization is also going to be very different. I think this actually represents a strength overall, not for us in particular but for the ZK EVM ecosystem as a whole, keeping in line with this multi-prover ZK EVM future that Vitalik has been talking about. But in terms of concrete details, I'm not quite aware.

Regarding concrete details, the big difference with Scroll is, let's say, the arithmetization method: the two of us are using different methods, and we will see which one stands the test of time. Regarding the difference with Polygon Hermez, the main one is that Polygon, contrary to us, doesn't directly work on the EVM bytecode: it first translates the EVM bytecode into another bytecode that runs not on the EVM but on an ad hoc register machine, which is then proven. So there is a supplementary step that neither Scroll nor we have.

What are the opcodes that have been the most difficult to implement in the ZK EVM?

You may be surprised, but CALLDATALOAD has been horrible. You would expect that if you can do an MLOAD you can do a CALLDATALOAD, but in the way that we have arithmetized things it's actually quite difficult, because of the provenance of the data and the fact that there's some padding involved. But the real complexity is in anything that involves writing a whole lot of data, potentially with padding: anything such as CODECOPY, EXTCODECOPY, and so on. Basically these kinds of opcodes have been the most complex, and if you look in the arithmetization at the MMIO, there are literally pages upon pages where we're defining nibble and bit columns that have to interact in complex ways. Memory has been the worst.
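For intuition on why CALLDATALOAD involves padding at all: the opcode reads a 32-byte word starting at an arbitrary offset, and any bytes past the end of calldata read as zero. A plain Python model of that semantics (the implicit zero-padding is the part that is awkward to arithmetize, not the code itself):

```python
def calldataload(calldata: bytes, offset: int) -> int:
    """EVM CALLDATALOAD semantics: read a 32-byte big-endian word at
    `offset`, treating any byte past the end of calldata as zero."""
    word = bytes(
        calldata[offset + i] if offset + i < len(calldata) else 0
        for i in range(32)
    )
    return int.from_bytes(word, "big")
```

In a circuit, each of those 32 conditional byte reads has to be expressed with range checks and selector columns, which is where the difficulty comes from.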