 Oh, hi, Andres. Oh, thank you. Yeah. So hi, everyone. Thanks for joining me. I'm Ying Tong. I'm a research scientist at the Ethereum Foundation. And I also work part-time as a quantum computing research assistant. So today, I'm going to tell you about succinct proofs for scaling Ethereum. And if this title makes no sense to you right now, then you're in the right place, because it's going to make perfect sense at the end of the talk. So before I get properly started, I'm just going to go high level and ask a show of hands who has heard of a blockchain. Yeah, pretty much everyone. Who thinks they know how to describe a blockchain? Yes, Ilya does. OK, cool. Well, so Ilya, correct me if I'm wrong. How I describe a blockchain is basically a replicated state machine on a network of peers. So what a state machine is, it's basically a program that maintains some kind of state. So for example, a list of accounts and a list of balances in each account. And it's able to update this state when it receives transactions from the outside world. So it's able to make a state transition from one state to the next. And what Ethereum does is it takes this state machine and it distributes it across a peer-to-peer network so that every peer in the network has to maintain the same copy of the state machine and has to update it with the same transactions. And when you have a bunch of transactions and you put them together, we call that a block. And when each block refers cryptographically to the one that came before it, you call it a chain, so that's a blockchain. Is that right Ilya? Cool. So the thing about peer-to-peer replicated state machines naively is that they don't scale very well. And this is because basically there's a lot of requirements for a full node. A full node needs to maintain a full state, number one. It needs to maintain the full set of transactions. So like every block that comes in, the full node needs to respond to it, take note of it. 
And lastly, and this is the real kicker, the full node needs to make sure that all the transactions are valid. And the only way to do this, really, is by re-executing all the transactions as they come in. So this is a lot of work, and it puts a limit on the transaction throughput. Basically, it's limited by the computational power of the weakest node. So if we make the block sizes too big, if we try to process too many transactions per block, we're gonna get a lot of weaker machines just dropping out of the network because they're unable to handle that kind of throughput. So yeah, this is the scaling bottleneck, and this is the problem we're gonna try to solve today. So the first, the most fundamental approach to solving scaling is called layer one scaling. This is not the topic of my talk, so I'm gonna just skim through it really quickly. Actually, you guys can talk to Ilya about layer one scaling. Yeah, he works at Zilliqa, which is a sharded blockchain. So basically it splits up the network into different shards, so that nodes on each shard only need to validate transactions on their own shard. And you run into all kinds of interesting problems, for example, how shards talk to each other. Yeah, and we can discuss this later, after my talk. So this is trying to increase throughput on layer one itself. Oh, and this is Ethereum 2.0, basically our vision of a sharded Ethereum, coming soon, anytime between now and 2022. So that's layer one scaling. Another strategy for scaling is called layer two scaling. So instead of trying to increase throughput on the chain, what we do instead is move as much as possible off the chain. So we wanna move all our difficult computations, all the heavy workload, somewhere off chain. And the most naive way to do this is through a side chain. So if this is your main Ethereum blockchain, you would first want to somehow transfer your assets to somewhere off chain.
So you own some ether, you deposit it, you lock it in a smart contract. And from then on, it's understood that you're entering this off-chain world. And in this off-chain world, you don't need all the peers to agree on everything you're doing. You don't need to pay people to maintain a replicated state machine for you. All you need is one operator to perform your transactions for you. So this could be Visa. This could be my computer. And the thing is that this is gonna be cheap, because it's just one party doing it. And at some point, once you're done cheaply performing all the transactions you want, you want to commit the results back on chain. So you go back into the real world and still get all the expected results of your transactions. But what I wanna ask now is, can anyone see any problems with this naive approach? Do I see a hand? Yes. And then if it's like, then it disrupts the whole chain. And also you come in? Yeah, yeah. Basically, the idea is correct that what happens off chain is really the wild, wild West, because you don't have the security of all your peers in a network validating every single transaction that the operator is claiming to make. And so at the end of the day, the operator can commit some state on chain. But how do we know that this result was arrived at through a series of valid and legal transactions? There's no way for us to verify that. So that's the problem we run into in layer two scaling. How can we be sure that the state transitions made were all valid? One way to do it is to simply require that the operator literally publishes every single action that they made on chain. And then once that's on chain, we can verify it on chain. But you see, that's totally missing the point. The point was that we did not want to do all these transactions on chain, where it's expensive.
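To make that cost concrete, here is a minimal sketch of the naive check: the operator publishes every transaction and the verifier re-executes all of them. The `Tx` type, the `(balance, nonce)` account layout and the function names are assumptions made for illustration, and signature checks are left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tx:
    sender: str
    receiver: str
    amount: int
    nonce: int


def apply_tx(state, tx):
    """Return the state after one transfer, or raise ValueError if it is invalid.

    `state` maps an account name to a (balance, nonce) pair.
    """
    balance, nonce = state.get(tx.sender, (0, 0))
    if tx.amount <= 0 or tx.amount > balance:
        raise ValueError("invalid amount or insufficient balance")
    if tx.nonce != nonce:
        raise ValueError("unexpected nonce")
    new_state = dict(state)
    new_state[tx.sender] = (balance - tx.amount, nonce + 1)
    r_balance, r_nonce = new_state.get(tx.receiver, (0, 0))
    new_state[tx.receiver] = (r_balance + tx.amount, r_nonce)
    return new_state


def verify_by_replay(old_state, txs, claimed_state):
    """Naive verification: re-execute every transaction and compare the result."""
    state = old_state
    try:
        for tx in txs:
            state = apply_tx(state, tx)
    except ValueError:
        return False
    return state == claimed_state


genesis = {"A": (5, 0), "B": (3, 0)}
batch = [Tx("B", "A", 1, 0)]
honest_claim = {"A": (6, 0), "B": (2, 1)}
print(verify_by_replay(genesis, batch, honest_claim))
```

The verifier's work here grows linearly with the number of transactions, which is exactly the cost that moving off chain was supposed to avoid.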
So what we would like, so yeah, what we would like is some way to verify that they've performed valid transactions. And there are two classes of solutions broadly that try to provide some certainty for layer two computation. The first class is called fraud proofs, or informally it's known as plasma in the Ethereum world. And basically, plasma involves an exit game that allows anyone to challenge what they consider an invalid state transition. So for example, if I've made some transactions off chain, but I don't see them being reflected on chain, then I can send a challenge to the smart contract and dispute the operator's actions where it matters. The second, and this happens, but note that this happens after the state transition has already been made off chain. The second class of solutions is called validity proofs and more informally it's called rollups. So validity proofs take on a different strategy in that they require the operator to prove that their state transition was valid at the time of making it. So it's not the case that the operator can do something illegal and then be challenged or not challenged about it. Validity proofs make it such that the operator cannot do anything illegal. So validity proofs are gonna be the strategy that I'm focusing on today. Right. So like I said, it's informally known as rollup and what it does is requires each state transition to prove its validity before it's accepted on chain. So how this looks like in a picture, we have some state one on chain and we have a bunch of transactions that happened off chain and we would like to go to some state two. But in between, we want to be able, we want some kind of proof of these transactions here, some kind of proof that they were processed in a valid way. And more precisely, we want to be able to verify this proof on chain. So if you recall once again, the point is that we do not want to do a lot of work. If the naive way to verify it is simply replay every single transaction. 
We don't want to do that. So if possible, we want to verify this proof as efficiently as possible. We want this proof to be as small as possible. So we're looking for a succinct proof, a short proof that can be efficiently verified in a very short time. And what if I told you that for any program, any arbitrary program, there is a way to produce a constant size proof? Like, how much shorter can you get than that? A constant size proof that takes constant time to verify. Well, you can do that. And oh, by the way, if anyone has any questions at any point, please stop me. Okay. So yeah, how do we get a constant size proof that can be verified in constant time? We use ZK-SNARKs. ZK-SNARK stands for zero-knowledge succinct non-interactive argument of knowledge. And we're gonna break this down right now. So I'm gonna talk specifically about Groth16. It's the most efficient known ZK-SNARK construction, meaning it produces the shortest proofs that we know of. What do you mean by constant size proof? A constant size proof means that no matter how large your program is, you only need a constant amount of information to prove that it was performed correctly. So we'll get more concrete later on, but in this case, for Groth16, a proof consists of three group elements. And that's never gonna change, no matter the size of your program. Yeah. So, right. So what a ZK-SNARK does is it takes a circuit. So this is any arbitrary program you want. It takes some public information about this circuit, so some public inputs and a public output. And basically, from then on, it's a game between the prover and the verifier. The prover must somehow convince the verifier that they have a solution that satisfies this circuit. So a solution meaning they have a set of intermediate gate values, and possibly some private inputs that the verifier doesn't know about, such that when these are put through the circuit, they map the public inputs to the expected public output.
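As a rough illustration of the statement being proved (not of a SNARK itself), the sketch below defines a tiny circuit of addition and multiplication gates over a prime field and checks whether a full assignment of wire values satisfies it. The circuit `out = x^3 + x + a`, the modulus and all names are assumptions chosen for the example; here `a` and `out` are public and `x` is the prover's private input.

```python
P = 2**31 - 1  # small prime modulus, chosen only for illustration

# Each gate is (operation, left wire, right wire, output wire).
GATES = [
    ("mul", "x", "x", "t1"),    # t1 = x * x
    ("mul", "t1", "x", "t2"),   # t2 = x^3
    ("add", "t2", "x", "t3"),   # t3 = x^3 + x
    ("add", "t3", "a", "out"),  # out = x^3 + x + a
]


def gate_value(op, left, right):
    return (left + right) % P if op == "add" else (left * right) % P


def solve(gates, inputs):
    """Prover side: fill in every intermediate wire from the inputs."""
    wires = {name: value % P for name, value in inputs.items()}
    for op, left, right, out in gates:
        wires[out] = gate_value(op, wires[left], wires[right])
    return wires


def satisfies(gates, wires, public):
    """Check that the wire values agree with the public values and every gate."""
    for name, value in public.items():
        if wires.get(name) != value % P:
            return False
    return all(
        wires[out] == gate_value(op, wires[left], wires[right])
        for op, left, right, out in gates
    )


witness = solve(GATES, {"x": 3, "a": 5})  # x stays with the prover
print(satisfies(GATES, witness, {"a": 5, "out": 35}))
```

In this sketch `satisfies` sees the whole witness. The point of a ZK-SNARK is that the verifier is convinced such a witness exists without ever seeing `x` or the intermediate wires.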
And it is key to note here that the verifier cannot know the intermediate gates and the private inputs. By the way, gates, I will explain them later. Gates are basically how we do computations in a circuit. But yeah, it's zero knowledge in that all the verifier knows is the public inputs, the public outputs and the circuit. And the verifier has to be convinced that the prover has a correct solution. And as we mentioned, this proof must be succinct. It has to be verified efficiently compared to how long you took to compute the proof. You must be able to verify it way faster than that. And we also want soundness and completeness. So completeness, first of all, just means that for every correct solution, you must be able to convince the verifier. So you must be able to generate a valid proof that is accepted. And soundness just means almost the inverse: for an incorrect solution, you should never be able to produce a proof that convinces the verifier. Well, it's important to note here that ZK-SNARKs only give you probabilistic soundness, meaning that there is a negligible chance that someone with a wrong set of inputs could still produce a proof that convinces the verifier. And we'll see how that happens later on. As I mentioned, this is a constant size proof with constant time verification. Any questions? Yes. I have a question. So when you talk about circuits as a computational model, how expressive is it? Can we compile a program in a Turing-complete language into a circuit? Yes. So there's actually a theorem that any program can be reduced to a circuit, basically addition and multiplication gates over a finite field. But the caveat is that this field has to be large enough to contain your computation. Yeah. Did I skip a slide? I did. Oh, right. And so this is an example of what we might want to do in a circuit. It's a little preview of what rollup does and how it works.
But I just put the preview here so that you can see roughly the structure, how there's a public input and public output and some private inputs that must not be known to the verifier. But we'll go through the logic of this later on. How am I doing on time? I'm doing real. Okay. I'm doing half a period. Okay, cool, cool. So yeah, I think by this point, everyone knows what a circuit is. It's a series of addition and multiplication gates. So for example, a fan-in-two circuit has gates that each take in two values, either add or multiply them, and give an output. Oh, what I'm doing right now, by the way, is taking you very quickly through the construction of Groth16. So ZK-SNARKs are also known as moon math, because a lot of people think it's some kind of magic. It's really not. And so you'll see exactly how unmagical it is. So right, we all know a circuit, boring. So right now what we're gonna do is generalize an arithmetic circuit to a system of arithmetic constraints. And you can convince yourself, by looking at this system of linear combinations, that they are general enough to express anything that you might wanna say with addition and multiplication gates. Once we have that, from a system of arithmetic constraints, we can generate a quadratic arithmetic program. So I wouldn't worry about this too much. It's basically another way to represent the same information. Did I wanna say anything else? No, I guess that's it, yeah. So from a quadratic arithmetic program, we can then generate a non-interactive linear proof. So actually this step is one of the more important steps, because this step is where you introduce the probabilistic part of the construction. So basically, with the non-interactive linear proof, you lose your 100% certainty and you're only left with a very high probability that the proof that you're accepting is for a valid witness.
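One common way to write such a system of arithmetic constraints is a rank-1 constraint system (R1CS), where each constraint has the form (A_i · w) * (B_i · w) = C_i · w for a witness vector w. The talk does not name the exact format on the slide, so the sketch below is an assumption: it encodes the toy statement `out = x^3 + x + 5` with the variable order `[1, x, out, t1, t2, t3]`.

```python
P = 2**31 - 1  # small prime modulus, chosen only for illustration

# Variable order in the witness vector: [1, x, out, t1, t2, t3]
A = [
    [0, 1, 0, 0, 0, 0],  # x
    [0, 0, 0, 1, 0, 0],  # t1
    [0, 1, 0, 0, 1, 0],  # x + t2
    [5, 0, 0, 0, 0, 1],  # 5 + t3
]
B = [
    [0, 1, 0, 0, 0, 0],  # x
    [0, 1, 0, 0, 0, 0],  # x
    [1, 0, 0, 0, 0, 0],  # 1 (an addition is a multiplication by one)
    [1, 0, 0, 0, 0, 0],  # 1
]
C = [
    [0, 0, 0, 1, 0, 0],  # t1  = x * x
    [0, 0, 0, 0, 1, 0],  # t2  = t1 * x
    [0, 0, 0, 0, 0, 1],  # t3  = (x + t2) * 1
    [0, 0, 1, 0, 0, 0],  # out = (5 + t3) * 1
]


def dot(row, w):
    """Linear combination of the witness entries, reduced mod P."""
    return sum(coeff * value for coeff, value in zip(row, w)) % P


def r1cs_satisfied(A, B, C, w):
    """Every constraint must satisfy (A_i . w) * (B_i . w) == C_i . w mod P."""
    return all(
        dot(a, w) * dot(b, w) % P == dot(c, w)
        for a, b, c in zip(A, B, C)
    )


witness = [1, 3, 35, 9, 27, 30]  # x = 3, so out = 27 + 3 + 5 = 35
print(r1cs_satisfied(A, B, C, witness))
```

Each row is a linear combination of the wires, which is why addition gates come for free: they fold into the rows, and only multiplications need their own constraint.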
And basically this makes use of something called the Schwartz-Zippel lemma. Oh, was I not supposed to move? It's all right. All right. The Schwartz-Zippel lemma, which is a statement about the nature of polynomials and at how many points two distinct polynomials can intersect. So basically it places an upper bound on the probability of someone finding a valid proof, even with a wrong set of inputs. So there's only so much probability that you're gonna be convinced by an incorrect input. And from the non-interactive linear proof, we then do this final compilation. So this was an information-theoretic sort of compilation. And from here to here, it's a purely cryptographic compilation. So we use this trick called bilinear pairings to, intuitively, move these computations up into the exponent of our group elements. And the reason why we want to do this is basically to shrink the size of our proof. And what we can see in this construction is that we can shrink the proof down to just three group elements. Actually, I do have slides for that as well. But I think I'm just gonna skim through this really fast. So like I said, the proof size is three group elements. And what the verification entails is just checking that these group elements, when put together, fulfill a certain equality. So it's a pairing product equation in the bilinear group. Yeah. Very nice question. So could you please give some information on why conducting the verification with this given circuit is actually a cheaper process than just propagating this computation and taking the output of this? Yeah. I think intuitively you can think of it like this. So this class of constructions, they're called pre-processing SNARKs, because basically a lot of the work is being done beforehand, in the setup phase, and by the prover in the proving phase. Actually, most of the work is done here, in multi-exponentiation.
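The polynomial fact behind that bound can be checked by hand in the one-variable case: two distinct polynomials of degree at most d over a field with p elements agree on at most d points, so a random evaluation point fails to tell them apart with probability at most d/p. The sketch below uses a deliberately tiny field (p = 101) and two made-up polynomials so that the agreement points can be enumerated; it illustrates the lemma only, not the Groth16 construction.

```python
P = 101  # tiny prime field so we can enumerate every point


def evaluate(coeffs, x):
    """Evaluate a polynomial (coefficients lowest degree first) at x mod P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc


def agreement_points(f, g):
    """All field elements where f and g take the same value."""
    return [x for x in range(P) if evaluate(f, x) == evaluate(g, x)]


# f(x) = x^3 + x + 5
f = [5, 1, 0, 1]
# g(x) = f(x) - (x - 1)(x - 2)(x - 3) = 6x^2 - 10x + 11, reduced mod 101
g = [11, 91, 6]

points = agreement_points(f, g)
degree = 3
print(points)              # the only points where a random check would be fooled
print(len(points) / P)     # chance that one random point fails to catch the difference
assert len(points) <= degree
```

With a field of around 2^254 elements, which is the order of magnitude for the pairing-friendly curve available on Ethereum, the same ratio d/p becomes negligible even for circuits with millions of constraints.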
So that means that we are minimizing, as much as we can, the work that the verifier has to do by shifting it to the prover. Okay, then another question. So the prover is an adversarial entity, right? So how can we make sure that the prover actually gave us a correct setup? Oh yeah, no, that's a great point. So yeah, actually that is a problem with ZK-SNARKs: with the setup, a malicious party could actually generate bad proofs. So that's why in ZK-SNARKs we call this a trusted setup. And yeah, it is completely possible for a prover to break the soundness of the scheme if they choose to abuse the trusted setup. Yeah. But is there a cheap way to validate the former setup? Just not pre-do it in the same thing, but. I don't know of any way around this trusted setup. I don't know, with proof-carrying code, for example, the prover needs to provide a certificate that the construction they have done is actually following the rules. For sure. I might not know what I'm talking about, I'm just asking how that can be prevented. I do have a slide at the end talking about exactly this. Yeah, about how we deal with the trusted setup problem. Yeah, but it is a problem. So I think that's all the math I have, actually. Yes, that's all the math I have. So now that we have hopefully some intuition about ZK-SNARKs and Groth16, basically taking a large computation and compressing it down into a small proof, we can start thinking about how to apply this such that it's useful on Ethereum. So basically, if you recall our problem statement, we wanted the ZK-SNARK here, right? We wanted some succinct proof that represents a larger computation, such that the work that we do on chain is just constant time verification. So this is where the ZK-SNARK will sit in our scheme. Yeah, so this is an overview of how we're doing it. So like I mentioned before, it's kind of like a mini blockchain. So we do maintain a state of accounts and balances and information about these accounts.
And then we accept transactions that change the state of these accounts. And we want to prove, with each batch of transactions, that we followed certain rules, that we followed the protocol. And we want to prove this in a succinct way. So I will take you through an example, because that's the easiest way to explain it. So consider accounts A and B, and consider a transaction from B to A of $1, right? So I'm gonna take you through basically what our circuit should enforce. So first of all, our circuit should check that the transaction was signed by the sender. And we can do this using signing and verification algorithms. Secondly, our circuit should check that the sender account actually exists in this state. So what we can do is actually put all the accounts into this construction called a Merkle tree. It's a kind of accumulator that basically commits to a state using a very short root. So what we do is, at each level of the tree, we compute a cryptographic hash of the two children, the two child nodes. And we go up this way through the tree until, at the top, we're only left with one root that commits to all these children below. So if we want to convince ourselves that some account actually exists in the tree, all we need to do is provide a path that goes from this account up to the root. And in this case, we need the sibling node here. And we get this inner node. And then we need its sibling node to get this inner node. And then finally, that sibling node. So we're convinced that B has signed a transaction and that this account actually exists. And we can move on and go ahead and update the state of B. So we debit $1 from B's balance and we increase the nonce of this account, so it's made one transaction. And then we use the exact same path, this time to alter the state root, to reflect that the state of B has changed. And we get a new state root. Afterwards, we basically do the exact same thing for A.
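The Merkle path check described above can be sketched as follows. This is an illustrative assumption rather than the circuit from the talk: it uses SHA-256 from the standard library (circuits in practice often use a hash that is cheaper to express as constraints), a made-up leaf encoding, and a tree of eight accounts so that a path has three siblings, as in the walkthrough.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    """Hash the concatenation of the given byte strings."""
    return hashlib.sha256(b"".join(parts)).digest()


def build_tree(leaves):
    """Return all levels of the tree; levels[0] holds the hashed leaves and
    levels[-1] holds the single root. len(leaves) must be a power of two."""
    level = [h(leaf) for leaf in leaves]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels


def merkle_path(levels, index):
    """Collect the sibling hash at each level on the way from a leaf to the root."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])  # index ^ 1 is the sibling's position
        index //= 2
    return path


def root_from_path(leaf, index, path):
    """Recompute the root from one leaf and its sibling path."""
    node = h(leaf)
    for sibling in path:
        node = h(node, sibling) if index % 2 == 0 else h(sibling, node)
        index //= 2
    return node


# Hypothetical leaf encoding: "name:balance:nonce"
leaves = [f"acct{i}:10:0".encode() for i in range(8)]
levels = build_tree(leaves)
root = levels[-1][0]

path = merkle_path(levels, 1)                         # three sibling hashes
print(root_from_path(leaves[1], 1, path) == root)     # membership check

# The same path yields the new root after the leaf changes.
new_root = root_from_path(b"acct1:9:1", 1, path)
```

The same three siblings serve both purposes in the walkthrough: first to show that the account is in the current state, then to compute the state root after the debit.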
So in this new state root, we check that A exists by again giving the Merkle path. We update the state of A. And then we hash A all the way up again to get our final state root. Yeah. So that's what the circuit is doing. So, yeah. This is exactly what we just went through. Seven, no, eight steps. And we can generalize this scheme to take in multiple transactions at a time, not just one transaction. All we need to do is provide all the intermediate state roots between each update. Any questions? Okay, cool. So yeah. And now we're back here in this diagram, and I hope it makes a lot more sense now. So we are sending in some transaction from a sender to a receiver for some amount. And this has to happen with reference to a certain state root. And at the end of it, we hope to get out our new state root, our final state root. Yeah. And what we can do as well is, remember all the proofs and Merkle paths and signatures that we needed to convince ourselves it was a valid transaction? We can push all of that into the private inputs. And you might wanna do this for a number of reasons. First of all, you just might not want somebody to know this information. But secondly, private inputs actually reduce the cost of computing a proof. So the more stuff you can push in here, the better. So why not just push everything in here? It's because we would like to have some information be publicly committed to the Ethereum blockchain. So we do want all the deltas, all the changes caused by the transactions, to be made publicly available. And this is so that anyone looking at this record can essentially backtrack and recreate the latest state. So we're not held hostage by any one controller of this data. And if at any point the operator decides to leave or cheat, someone else can just look at all these public inputs and reconstruct this thing and pick up where he left off. Yeah. Wrap up. Okay, I think I'm almost done.
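The steps walked through above, for a single transfer, might look like the sketch below when written as an ordinary function. It is an assumption-laden illustration, not the circuit from the talk: the signature check is omitted, SHA-256 and the `balance:nonce` leaf encoding are stand-ins, and the names are hypothetical. The function plays the role of the circuit logic; the Merkle paths are what the operator would supply as private inputs.

```python
import hashlib


def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()


def leaf(balance: int, nonce: int) -> bytes:
    """Hash of one account's state (hypothetical encoding)."""
    return h(f"{balance}:{nonce}".encode())


def root_of(leaves):
    """Merkle root of a power-of-two list of leaf hashes."""
    level = list(leaves)
    while len(level) > 1:
        level = [h(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def path_for(leaves, index):
    """Operator side: sibling hashes from leaf `index` up to the root."""
    path, level = [], list(leaves)
    while len(level) > 1:
        path.append(level[index ^ 1])
        level = [h(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return path


def climb(node, index, path):
    """Recompute a root from a leaf hash and its sibling path."""
    for sibling in path:
        node = h(node, sibling) if index % 2 == 0 else h(sibling, node)
        index //= 2
    return node


def transfer(root, amount, sender, receiver):
    """sender and receiver are (index, balance, nonce, path) tuples.
    The receiver's path must be valid against the intermediate root."""
    s_idx, s_bal, s_nonce, s_path = sender
    r_idx, r_bal, r_nonce, r_path = receiver
    # A real circuit would first verify the sender's signature here.
    if climb(leaf(s_bal, s_nonce), s_idx, s_path) != root:
        raise ValueError("sender not in state")
    if amount <= 0 or amount > s_bal:
        raise ValueError("invalid amount")
    mid_root = climb(leaf(s_bal - amount, s_nonce + 1), s_idx, s_path)
    if climb(leaf(r_bal, r_nonce), r_idx, r_path) != mid_root:
        raise ValueError("receiver not in intermediate state")
    final_root = climb(leaf(r_bal + amount, r_nonce), r_idx, r_path)
    return mid_root, final_root


# Accounts as (balance, nonce); A is index 0, B is index 1. B sends 1 to A.
accounts = [(5, 0), (3, 0), (7, 0), (1, 0)]
leaves = [leaf(b, n) for b, n in accounts]
root = root_of(leaves)
sender = (1, 3, 0, path_for(leaves, 1))
mid_leaves = [leaves[0], leaf(2, 1), leaves[2], leaves[3]]
receiver = (0, 5, 0, path_for(mid_leaves, 0))
mid_root, final_root = transfer(root, 1, sender, receiver)
```

For a batch, the operator would chain calls like this one, feeding each final root in as the next starting root, which is where the intermediate state roots mentioned in the talk come from.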
We have another construction that pushes more things into private inputs and that comes with its own trade-offs as well. Yeah, I think I had a few slides that go into, oh yeah, that go into like this particular design that I was part of implementing. But instead of taking you through that, I'll just play a little quick demo. Oh, or will I? No, I won't. Oh man, forget it. Come to me after this talk for a demo. And yeah, I think actually that's all I had. I had some considerations and I did mention trusted setups and some research that's going into working around that. Some future directions to basically generalize this construction. And yeah, that's it, that's it for me, yeah. Did that make any sense? Okay, good, good. I see like careful nodding, hesitant nodding. Yeah, it's pretty technical. I think Purnima told me to keep it high level and I was like, okay, I will. I think I started off trying to define the problem and if there's one takeaway from this talk is that there's a bunch of different strategies that people are trying to scale blockchains and to basically increase the number of transactions we can process on blockchains. And these each have their own trade-offs and their own intricacies. I just personally find this one one of the more fascinating ones because of the math behind it, yeah. Yeah. Will this solution ever be sort of hybridizing multiple approaches? Because you mentioned there is that layer one, layer two, is that ever? Yeah, a hundred percent. For example, between shards, to communicate between shards, you could use some variant of roll-up. That's one example. For example, you could do recursive roll-ups as well to basically compress the history of your chain. Yeah, combinations of layer one and layer two solutions are like, it's one of the most exciting places to be, I think. Yeah, yeah. Where are you hoping to take your research into? Yeah, so we want to get a demo out ASAP and then hopefully look into like at scale productions. 
And for me, actually, the part I'm most excited about is generalizing it. So instead of just doing transactions in your circuit, what you can do is verify a SNARK proof in your SNARK circuit. So it's one layer of recursion. So you're proving inside your SNARK that you have a SNARK proof that is valid. So what this would entail is for each account to have a verification key, and for each transaction to provide a proof, for that verification key, that is verified in your SNARK. So that's the one layer of recursion that intrigues me the most right now. Yeah, is Andres? Okay, yeah. In practical terms, when do you think you will release something like this? Right. So actually, there's this demo already by a company called Matter Labs. And there's also another company called StarkDEX who are doing similar things. But I don't think either of them is open to the end user yet, and they're not on mainnet yet. And a lot of this is because we're figuring out kinks in proving times. We're figuring out what we think about trusted setups, things like that. So yeah, it's hard to say. Yeah. That's all? Yeah, that's my time. Thank you. Thank you. Thank you.