Okay, so we're on to our second talk of the morning, and we're going to hear about Gemini: elastic proofs for diverse environments. Take it away.

Hello. Hi. So, yeah, this talk is mostly concerned with proving very, very large circuits. This is joint work with Jonathan Bootle, Alessandro Chiesa, and Yuncong Hu.

We're going to focus on succinct arguments. You already heard the description in the previous talk, but essentially we have a dialogue between the prover and the verifier, and at the end of this dialogue the prover should be able to convince the verifier that there exists a witness satisfying some relation. "Argument" means that the security of the protocol relies on computational assumptions. "Succinct" means that the verifier's work is less than the size of the witness. We will actually focus on preprocessing succinct arguments, which means that the verification time is even lower than the size of the statement.

A lot of effort has gone into this kind of proof system. SNARKs belong to this class, and we know how to build them. We spent a lot of time on proving time and verification time, and we are now almost optimal in both, but not so much on space. In fact, if right now we were to use practical proof systems like Marlin or Spartan on millions of gates, they would simply crash: we are not able to prove very large instances. And as we try to deploy these proof systems in the real world, the need to prove larger and larger circuits grows. For instance, the compression function of SHA-256 alone is about 30,000 constraints. If you put it in a Merkle tree, or you try to verify several transactions, you're already far beyond the benchmarks we put in papers. And if you multiply that by a thousand, because you're doing ZK rollups, then you really do not have a proof system that works in the real world for these sorts of circuits.

There has already been some literature on how to build space-efficient succinct arguments, mostly TCC papers. Initially, the community worked a lot on recursive proofs as a means of preserving time and space, as Yusef just mentioned in the previous talk, but more recently the focus has moved to streaming provers: provers that can produce a proof after receiving their input as a stream. What do I mean by this? In practice, the prover gets its input as a stream of data, runs in logarithmic space, and produces a proof. In a real-world scenario, imagine my phone downloading an instance and then producing a proof. In theory, I mean a Turing machine that does not have random access to its input, but instead has an oracle that hands it the next element, the next element, the next element, and that can also be asked to seek back to the beginning of the file. This Turing machine must run in roughly logarithmic space.

We build on top of this model and introduce the idea of elasticity. What does elasticity mean? It means that you have a proving algorithm that admits two different implementations: one that is time-efficient and one that is space-efficient. The verifier couldn't care less about which of the two was used to create the proof. In particular, we show that these objects exist, and we propose a proof system for circuit satisfiability that admits two different instantiations, one running in linear time and one running in logarithmic space.
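To make the streaming-oracle model concrete, here is a minimal Rust sketch of what such an interface could look like. The trait and its names are illustrative assumptions for this writeup, not the actual API of the implementation discussed later in the talk.

```rust
/// A restartable stream: the prover has no random access to its input,
/// only an oracle it can rewind ("seek to the beginning of the file")
/// and then read element by element.
trait Streamer {
    type Item;
    type Iter: Iterator<Item = Self::Item>;

    /// Restart from the first element; callable many times,
    /// once per pass the prover needs to make over the data.
    fn stream(&self) -> Self::Iter;

    /// Number of elements in the stream (known in advance).
    fn len(&self) -> usize;
}

/// Example: a slice viewed as a restartable stream.
struct SliceStream<'a, T>(&'a [T]);

impl<'a, T: Copy> Streamer for SliceStream<'a, T> {
    type Item = T;
    type Iter = std::iter::Copied<std::slice::Iter<'a, T>>;
    fn stream(&self) -> Self::Iter { self.0.iter().copied() }
    fn len(&self) -> usize { self.0.len() }
}
```

The key constraint the model imposes is visible in the trait: a prover may re-read the stream from the start as many times as it likes, but it can never index into it.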
Ideally, we would like to have both at once, but until then we have these two different configurations, and in particular the proof is independent of the configuration we choose, and verification runs in logarithmic time. In addition, our proof system has the peculiarity that it is possible to move from the space-efficient mode to the time-efficient mode in the middle of the prover's execution, and we will see what this means in a second. Also, the time complexity of the space-efficient prover was already known from previous works, but we improve on it: if the circuit has a particular structure, like an inner-product relation or a finite state machine, it can go down to N log N.

We build this proof system for an NP-complete relation, rank-1 constraint systems (R1CS). Why this relation? Because it's what people use in practice, and it is easy to reduce a circuit to it. What do I mean by a circuit? Addition and multiplication gates. If I have a circuit with addition and multiplication gates, then I can build matrices A, B, and C, whose size depends on the size of the circuit, such that if the circuit is satisfied, there exists a vector Z, not identically zero, for which AZ ∘ BZ = CZ, where ∘ denotes the Hadamard product. How do I build them? Very easy. In A, I put the left input wires of each multiplication gate, so one, one, and then all zeros. In B, I put the right input wires, and in C, the output wires. What does that mean? It means that AZ gives me Z0 + Z1, BZ gives me Z2, and their product is Z3. I do this for all the multiplication constraints, and there you go: I have enforced that the circuit is satisfied.

Now, if I give this relation as input to the prover, the prover can build the proof. But what does it mean to have streaming access to this relation? We define it, and for us it means that I give you streaming access to the vector Z, element by element, first to last. I also give you access to the matrices, but what does streaming access to a matrix mean? A matrix has two dimensions, so I give you access both in row-major and in column-major order, and the prover decides which one to take. In addition, I give you what we call the computation trace, which is essentially the wire values of the circuit, namely AZ, BZ, and CZ. And if the circuit has a particular structure, again, it is possible to produce these streams in a space-efficient manner on the fly. In fact, in the paper we have a notion of composability of streams, where you can take many streams as input and build a new stream that can be fed into another procedure.

All of these relations are linear relations, so what I'm going to talk about is what I think is the cornerstone protocol here: the inner product. Once we have inner products, it's very easy to do matrix-vector multiplication, and very easy to do the Hadamard product. For the more technical people: what I'm going to present is a polynomial interactive proof for the inner product. This protocol will later be composed with a polynomial commitment scheme, and in our paper we have a theorem that says: if you have an elastic idealized protocol and an elastic commitment scheme, then guess what, you can combine them, and the result is still elastic.
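Spelling out the single-gate example above as a worked instance, using the talk's own indices:

```latex
z = \begin{pmatrix} z_0 \\ z_1 \\ z_2 \\ z_3 \end{pmatrix}, \quad
A = \begin{pmatrix} 1 & 1 & 0 & 0 \end{pmatrix}, \quad
B = \begin{pmatrix} 0 & 0 & 1 & 0 \end{pmatrix}, \quad
C = \begin{pmatrix} 0 & 0 & 0 & 1 \end{pmatrix},
\]
\[
Az \circ Bz = (z_0 + z_1) \cdot z_2 = z_3 = Cz,
```

which enforces the multiplication gate (z0 + z1) · z2 = z3; a circuit with m multiplication gates contributes one such row per gate.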
So there is something that is already elastic and has been around in the literature for a while: protocols inspired by sumcheck, or folding arguments. All of these protocols have this nice structure where I go from a claim of size n to a claim of half that size, and then I invoke the protocol recursively until I end up with a claim that is trivial, of size one, which the verifier can check immediately. In particular, at every round the verifier sends me one random coin and uses this randomness to, as we call it, fold the instance. For instance, I can do even/odd folding, which is more FFT-inspired, or left/right folding; it's essentially the same thing. As I go through the protocol, I take linear combinations of the randomness with pieces of the vector, until at the end the verifier has to check these smaller instances, which are actually multivariate polynomial evaluations, or at least that's how I want you to see them for now.

Now, we know how to run these protocols in linear time. Why? Because here I'm doing n/2 multiplications, then n/4, then n/8. This is linear, no? But how do I do it in a space-efficient manner? (And in fact we also show that you can construct the foldings in linear time.) Space-efficiently, it works like this: you get the stream of f as input. What is the stream of f? All of its coefficients, one by one. And I keep a stack. I start feeding elements from the stream into this stack, and as soon as I have two elements of the same level, I fold them with the respective randomness. Then I keep ingesting elements; as soon as I have two elements of the same level, I fold them, and again, and again, until I end up with the coefficients I want, which I can then return. How much space does this take? Log n, because I keep at most two elements per level. How much time does it take? Still linear, no? Because I'm doing n/2 multiplications, plus n/4, plus n/8, and so on. And since I have to run this procedure log n times, I end up with a protocol that uses logarithmic space and, overall, n log n time.

But this protocol has another peculiarity, which is sort of the key trick of our proof system: you can exploit this recursive structure in such a way that you can fix a memory budget. You can say: my prover is going to use one gigabyte of memory. You fix a threshold, and then you run the space-efficient prover until you reach a level where the instance fits in memory, and from then on you use the time-efficient algorithm. So you have a proof system that runs optimally for certain sizes, and as soon as you hit very large instances, you switch to streaming algorithms.

So this is the basic idea. Once you reach the end, what you have is this multivariate evaluation that needs to be taken care of. In the ideal world, when we work with polynomial IOPs, we have these verifier queries for a multivariate evaluation. In reality, what does that mean? It means that the prover commits to a polynomial and then provides an evaluation proof.
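Here is a minimal Rust sketch of the stack-based folding just described, assuming a toy prime field as a stand-in for a real cryptographic field, and left/right folding (parent = left + r · right) as the combine rule; all names are illustrative, not the actual Gemini/arkworks API.

```rust
// Toy prime field: arithmetic mod a small prime stands in for a real field type.
const P: u128 = 2_147_483_647; // 2^31 - 1, prime

fn fadd(a: u128, b: u128) -> u128 { (a + b) % P }
fn fmul(a: u128, b: u128) -> u128 { a % P * (b % P) % P }

/// Folds a stream of 2^k coefficients down to a single element using one
/// verifier challenge per level. The stack holds at most one pending
/// value per level: O(log n) space, O(n) field operations.
fn stream_fold(stream: impl Iterator<Item = u128>, challenges: &[u128]) -> u128 {
    let mut stack: Vec<(usize, u128)> = Vec::new(); // (level, value)
    for x in stream {
        let mut node = (0, x % P);
        // Two values at the same level? Fold them and bubble up a level.
        while let Some(&(lvl, left)) = stack.last() {
            if lvl != node.0 { break; }
            stack.pop();
            node = (lvl + 1, fadd(left, fmul(challenges[lvl], node.1)));
        }
        stack.push(node);
    }
    stack.pop().expect("empty stream").1
}

fn main() {
    // 8 coefficients, 3 rounds of folding with challenges r0, r1, r2;
    // the result is the multilinear evaluation at (r0, r1, r2).
    let f = [3u128, 1, 4, 1, 5, 9, 2, 6];
    let rs = [7u128, 11, 13];
    println!("{}", stream_fold(f.into_iter(), &rs));
}
```

Each call consumes the stream exactly once; running this once per round, restreaming each time, is what gives the log-space, n log n-time prover mentioned above.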
Actually, in the paper we're able to reduce this multivariate query to univariate queries, generically. Why is this useful? Because then we can get rid of the multivariate part and use just a univariate polynomial commitment scheme. Why does that matter? Because if I end up using something like KZG, the verification is much cheaper: we're talking about log n pairings versus two pairings. It's relevant.

So what I want to do now is spend a little bit of time on how you build an elastic commitment scheme. We work with KZG, and KZG is actually pretty streaming-friendly. Just as a reminder, we already talked about this a bit in the previous talk, but the commitment key consists of consecutive powers of tau, where tau is selected during setup and never shared afterwards, not with the prover, not with the verifier. The commitment algorithm consists of a multi-scalar multiplication of the commitment key with the polynomial f. What does it mean to do this in a space-efficient manner? It means that I receive the coefficients of f in a streaming fashion, one by one, and the same for the commitment key, and I build the commitment on top of that by accumulating the products of these elements, one scalar multiplication at a time.

Now how do I do evaluation? Evaluation in KZG is a Euclidean division by the polynomial x − α: I get the quotient and the remainder. The remainder is the evaluation of f at α, and the quotient is the thing I'm going to commit to. So what do we do? We essentially do pen-and-paper division by x − α. We take the stream of f from the top coefficient to the last, and we produce the stream of q using Ruffini's rule: the first coefficient of f is the first coefficient of the quotient, and then I apply Ruffini's rule recursively, which requires me to store only the previous coefficient of the quotient. So this can be done in constant memory.

Okay, so now we have this polynomial commitment scheme, we have this polynomial IOP, we put them together, and we get a proof system that is elastic. Then we go down to the implementation. We actually implemented it, and implementing it requires quite some engineering effort. Why? Because you end up having to build the whole polynomial commitment stack: we have an implementation of a multivariate polynomial commitment scheme and one of KZG. You have to build inner products, and from there you can build a SNARG directly. But what we go for is a preprocessing SNARG, so the prover needs to do even more work, and there is a whole series of subprotocols that need to be implemented, some of which are not trivial to implement in a streaming fashion. On top of that, because of the elastic notion, you essentially have to implement everything twice: the time-efficient implementation, the space-efficient implementation, and the logic to move from one to the other.

Given that we have this modular framework, we started trying to push some pieces into arkworks so they can be reused in other projects. For instance, we now have three different implementations of KZG in arkworks, and for novices or PhD students it would be a nice project to see whether these can all be merged into a single implementation that is both space-efficient and time-efficient depending on the setup you want.
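Coming back to the two streaming algorithms above: here is a minimal sketch of the streaming commitment, modeling the group as a toy multiplicative group mod a prime purely for illustration (the real scheme works over a pairing-friendly elliptic curve, e.g. via arkworks; all names here are assumptions of this sketch).

```rust
// Toy group: Z_Q^* with Q = 2^61 - 1 (a Mersenne prime) stands in for G1;
// exponentiation plays the role of scalar multiplication.
const Q: u128 = 2_305_843_009_213_693_951; // group modulus
const R: u128 = Q - 1;                     // exponents live mod the group order

// Square-and-multiply exponentiation in the toy group.
fn gpow(mut base: u128, mut e: u128) -> u128 {
    let mut acc = 1;
    base %= Q;
    while e > 0 {
        if e & 1 == 1 { acc = acc * base % Q; }
        base = base * base % Q;
        e >>= 1;
    }
    acc
}

/// Streaming commit: consume the commitment key (g^{tau^i}) and the
/// coefficients of f in lockstep, keeping only the running accumulator.
/// Constant memory, one "scalar multiplication" per element.
fn commit(
    key: impl Iterator<Item = u128>,
    coeffs: impl Iterator<Item = u128>,
) -> u128 {
    key.zip(coeffs).fold(1, |acc, (k, f)| acc * gpow(k, f) % Q)
}

fn main() {
    let (g, tau) = (3u128, 123_456_789u128);
    // ck_i = g^{tau^i}; in practice this arrives as a stream from disk.
    let mut t = 1u128;
    let key = std::iter::repeat_with(move || {
        let k = gpow(g, t);
        t = t * tau % R;
        k
    });
    let f = [5u128, 0, 2, 7]; // coefficients of f, low to high
    println!("commitment: {}", commit(key.take(f.len()), f.into_iter()));
}
```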
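And a sketch of the constant-memory evaluation step via Ruffini's rule, over the same kind of toy field; `divide_by_linear` and its interface are illustrative assumptions, not the paper's API.

```rust
const P: u128 = 2_147_483_647; // toy prime field, 2^31 - 1

/// Ruffini's rule (synthetic division): divide f by (x - alpha), with the
/// coefficients of f arriving highest degree first. Each quotient
/// coefficient depends only on the previous one, so the state is a single
/// field element. Quotient coefficients go to `emit` (e.g. fed straight
/// into a streaming commit); the return value is the remainder f(alpha).
fn divide_by_linear(
    coeffs: impl Iterator<Item = u128>, // f_n, f_{n-1}, ..., f_0
    alpha: u128,
    mut emit: impl FnMut(u128),
) -> u128 {
    let mut prev: Option<u128> = None;
    for c in coeffs {
        let b = match prev {
            None => c % P, // top coefficient: b_n = f_n
            Some(q) => {
                emit(q); // the previous b is a quotient coefficient
                (c + alpha % P * q) % P // b_i = f_i + alpha * b_{i+1}
            }
        };
        prev = Some(b);
    }
    prev.expect("empty stream") // b_0 = f(alpha)
}

fn main() {
    // f(x) = x^2 - 1, alpha = 2: quotient x + 2, remainder f(2) = 3.
    let f = [1u128, 0, P - 1]; // highest degree first; -1 is P - 1 mod P
    let mut quotient = Vec::new();
    let rem = divide_by_linear(f.into_iter(), 2, |q| quotient.push(q));
    assert_eq!((quotient, rem), (vec![1, 2], 3));
}
```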
A merged implementation could also include other things, like multipoint evaluation proofs. On the more technical side, how do we implement all this? It's all implemented in Rust, and there are many ways these things can be done in Rust. One of them is to use Rust's stream libraries, which provide generic interfaces, but we actually go for iterators. So if you are familiar with Python, we are really just using iterators to build streams of our elements and then composing them. But because Rust has a very strong type system, we end up having to embed the whole computation tree within the type: if you have a matrix, and then you're doing matrix-vector operations, and then more algebraic operations on top, you get a type that grows more and more complicated, and you have to work with that. This is the price of a very efficient implementation.

So we benchmarked all of this, and what is the takeaway? While in papers we generally stop at millions of gates, around 2^20, we go up to 2^37, which is on the order of a statistical security parameter. And even if we try to use our own time-efficient prover on a large machine with 72 gigabytes of memory, we stop at around 2^27. So all the space between those two points is what we can recover through streaming techniques. We also have some numbers for preprocessing; those stop a bit earlier. But I want to stop and think about one thing: in the theorem we have this N log² N, and yet the measured line is a straight line. The reason is that the log factor doesn't really kick in, and, I mean, this is a general thing for proof systems: the biggest cost is still the cryptographic operations. The multi-scalar multiplication is really what takes the biggest share in a proof system, and that's why we don't see the log² factor kicking in yet. The same goes for space. If I set my space budget large enough, in our case around one gigabyte, then even though the space is supposed to grow logarithmically, we get this nice flat line: you can let the prover run for as long as you want, and it will occupy just one gigabyte until the end of the execution. So the bottom line is: if you care about logarithmic factors, you should also care about constants.

And yeah, these are some resources, and, wrapping up: we have this proof system, and this idea of elasticity is the essential part. You have a proof system that can be implemented in a time-efficient way or in a space-efficient way, and we found a way of merging the two together. Thank you.

Thank you for the talk. That was, yeah, really cool. Okay, I'll start with my question. Have you got a killer application for this? Is there an application where you really want to prove something so huge that you can't fit it in memory?

ZK rollups, I think, are one of them. You have a lot of transactions that you need to prove at once and provide a single proof for, and these transactions are really on the order of thousands, with many more to come.

I think, actually, you've got Merkle trees: you've got to prove many hashes at once. Or hash chains. A thousand times the size of a hash function is really quite big.

Yeah, yeah.
And we also chatted with some of the Filecoin people, but I think for them it's also important that the proofs are small. In our case, the proofs are logarithmic in the size of the circuit.

Okay. Yep. I just had a quick question. Did you look at other polynomial commitments that are streaming-friendly, or is it just KZG?

So in KZG, the particularity, I guess, is that we are doing division by a low-degree polynomial, and that's really what buys us streaming so easily. If I move to the Lagrange basis, where I potentially have to divide by a polynomial of high degree, then I don't know how to do streaming. Other polynomial commitments, like PST, which also has a trusted setup, we can stream as well. As for Bulletproofs-style or FRI-based ones, those are more like inner products, and there, yeah, I guess the idea is similar to the inner-product case, I think. Yeah.

So on a general level, how do you see the future of scaling provers? I mean, I understand this is a very serious problem, but do you think it's distributing the prover, or streaming, or...?

I don't know. I think there are multiple directions. There is the streaming direction; there is the recursion direction, so incrementally verifiable computation and recursive proofs; and then there is also the MPC direction. And I'm not really sure which one will take off. Maybe they even suit different applications: if you do recursion, you have to embed the verifier circuit into the prover, and that's a big overhead, whereas if you do streaming, maybe it's easier. And if you do MPC as well, I don't know. I really have no idea.

Thanks for the talk. Obviously what would be better is to have both space and time efficiency at once, so you wouldn't have to switch. But it wasn't so clear to me from your presentation, when you presented the streaming version of the inner product argument, around slide 15 or so: why wouldn't you also be able to get optimal time?

So when you're folding, each time you get an instance of size n/2, which you then fold into n/4. When you're streaming, you cannot store the partial instance, the partial folding, in memory, so every time you have to go back through the previous folding to build the next one, because you literally cannot store it. At a higher level, I think this is really the space-time trade-off that is very common in, I don't know, dynamic programming or divide and conquer: you can either use a lot of space and get a fast algorithm, or use a space-efficient one and pay with higher time complexity. I think it really boils down to that, and I'm not sure it's even possible. Yeah, but good question. Thanks.

Any other questions? Going once, going twice? No? Okay. Thank you very much. That was really cool. Cheers.