Hi, everyone. I'm Yuncong. I'm here to present Marlin, which is our work on constructing preprocessing zk-SNARKs with universal setup. This is joint work with Ale, Mary, Pratyush, Noah, and Nick. Okay, let's dive in. Before we talk about preprocessing SNARKs, let's first take a look at succinct non-interactive arguments. We have the prover and the verifier. The prover has a specific function that is applied to a public input and a private witness. The prover wants to convince the verifier that it knows a secret witness such that the function is satisfied. Naturally, the prover is going to run in time proportional to the computation, but the important thing here is that the proof the prover sends to the verifier is exponentially smaller than the computation itself. Verification naively runs in time linear in the computation, but if you want succinct verification, it should also run exponentially faster than the computation. One of the goals of our work is to achieve succinct verification, but the question is where this succinct verification comes from. Depending on the type of computation, there are two options. If you are doing machine computation, you can leverage the uniformity of the computation. Then it is possible for the verifier to run in time proportional to the description of the computation, not the size of the computation. But if you are working in the circuit computation model, in general the description of the circuit is as large as the computation itself. So when you design a SNARK for circuits, you might wonder how it could be possible for the verifier to run in time sublinear in the circuit size, since at the very least the verifier should know the circuit, and there is no smaller description than the circuit itself. The point here is that you can preprocess the circuit to get this succinctness.
There is an offline procedure that produces a short cryptographic summary of the circuit, and then you can use it many times to verify many proofs relative to the same circuit. In this talk, we're going to focus on the second one, preprocessing SNARKs for circuit computations. There are a couple of ways to preprocess the circuit. The first is to use a circuit-specific setup. You have a setup algorithm which takes randomness and the circuit as input and outputs a proving key. We have seen many constructions of this form, and the most efficient SNARKs are also of this form. But the problem with this solution is that to do this circuit-specific setup, you either require a trusted party or an expensive MPC protocol to generate the proving key. And this trusted setup needs to be done once per circuit, so it is not scalable, especially when you want to change circuits for different applications. Another option is to use a universal setup. How it works is that you have a single trusted setup for all circuits up to a given maximum size N that you want to support. It outputs a reference string, and you feed the reference string and the actual circuit into a deterministic specialization process to get the circuit-specific proving key. Note that this specialization process involves no randomness and is entirely deterministic, so you only do one setup and then reuse the reference string for all circuits. There are a few ways we know how to do a universal setup. First of all, we can use a universal circuit with a circuit-specific setup, but the best constructions we know for universal circuits are at least quasilinear in the circuit size. In particular, they have poor asymptotics for the prover. To overcome this, people started designing new SNARKs specialized to the setting of universal setup. The first work in this line is GKMMM in 2018, but it's mostly a feasibility result because the setup and the preprocessing are quadratic.
Next, we have Sonic, which is the first SNARK that achieves the optimal asymptotics in the universal setup setting, but it has large constants. In particular, it is 100 times worse than the best circuit-specific SNARKs. In summary, the basic problem here is that there are a number of SNARKs with universal setup, but they are all inefficient, and there is no clean methodology to construct better ones. Our goal is to create a methodology for universal setup SNARKs and use this methodology to construct a concretely efficient SNARK. So the first contribution of our work is that we provide a methodology for constructing efficient universal SNARKs. In the end, we have our first theorem, which is a compiler that constructs zk-SNARKs with universal setup. It takes two components, an algebraic holographic proof and an extractable polynomial commitment, and combines them to produce preprocessing zk-SNARKs with universal setup. Note that the algebraic holographic proof is a new notion, which we will explain later, but the key idea here is that we connect the holography of PCPs with preprocessing for SNARKs. The nice thing about this compiler is that the modularization is not only very beneficial for understanding the construction, but also for the SNARK designer. You can go ahead and focus on improving each of these components individually. And our theorem guarantees that as long as your construction satisfies the requirements of our compiler, you can just plug in your components and get a secure SNARK with universal setup. Actually, if you look at the exciting works of the past years, such as Sonic, Plonk, Fractal, Marlin, and Supersonic, their core contributions are to optimize one of these two components. The second part of our contribution is that we provide efficient ingredients for instantiating our compiler: an efficient AHP for R1CS and a variant of extractable polynomial commitments.
We plug these two components into our compiler to obtain Marlin, an efficient zk-SNARK for R1CS with universal setup. Our SNARK achieves the same asymptotics as the best circuit-specific SNARKs, just like Sonic, but the concrete parameters are much better than Sonic's. Our third contribution is an implementation in Rust that is available online. It turns out that our compiler also provides a sort of modularity that enables a clean and efficient implementation. We compared our Marlin implementation with the state-of-the-art circuit-specific preprocessing SNARK, which is Groth16. And what we found is that for the relevant performance metrics, such as prover time, verifier time, and proof size, we are less than an order of magnitude away from the state-of-the-art scheme. In particular, the prover time is about 6.6x off, the verifier time is about 4x off, and the proof size is about 4.5x off. As people continue optimizing the two components, the AHP and the polynomial commitment, we will see even better results. In this talk, we will focus on the methodology and the compiler, because it shows a clean and straightforward way to construct preprocessing SNARKs. Here is the paradigm of our methodology. In the next few slides, we are going to present each of these components and finally see how to combine them to get the SNARKs we want. Our compiler really highlights that the key to achieving preprocessing, and hence succinct verification, is what we call holography. Okay, let's first see what an algebraic holographic proof means in our context. I'm going to introduce algebraic proofs first. At a high level, an algebraic proof is just an interactive proof where the prover's messages have a kind of algebraic structure. In particular, in our formalization, we restrict the prover's messages to be low-degree polynomials. For example, they can be univariate polynomials with degree much smaller than the size of the field.
When the verifier receives a message from the prover, the verifier does not need to read the entire low-degree polynomial. Instead, it can access the polynomial as an oracle. For example, the verifier can query the value of the prover's polynomial at arbitrary points without paying the cost to actually read and evaluate the polynomial. And then the verifier replies with some challenges; because we are working in the public-coin setting, the verifier only sends random messages to the prover. This continues until the prover has sent all of the polynomials p1 to pn, and the verifier gives the final challenge. After that, the verifier provides a set of points at which to query each of these polynomials. Once all of these polynomials give their evaluation results, the verifier plugs those evaluations into the decision procedure and decides whether or not to accept. Okay, this is the paradigm of an algebraic proof. The properties of algebraic proofs are very similar to the standard properties of interactive proofs. We have completeness: whenever the prover has a satisfying witness and follows the protocol, the verifier will accept. We also have proof of knowledge: whenever the verifier accepts, that means the prover actually knows the corresponding witness in its head. And finally, we have a notion of bounded-query zero knowledge. Basically, what it means is that as long as the verifier doesn't make too many queries to the polynomials, it learns nothing about the witness. Although there are other, more advanced notions of zero knowledge, we don't discuss them here because our compiler only needs this query-bounded version. Okay, the algebraic proof so far looks nice, but it still has a problem. Remember, we are working towards succinct verification. So what's the complexity of the verifier in an algebraic proof?
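The oracle-access pattern described above can be sketched in a few lines. This is a toy illustration, not the paper's protocol: the field, the `PolyOracle` class, and the query bound are all stand-ins chosen for the example.

```python
# Toy sketch of oracle access in an algebraic proof: the prover's message is
# a low-degree polynomial over a prime field, and the verifier only queries
# it at a few points instead of reading it. Illustrative only; a small prime
# stands in for a cryptographically large field.
import random

P = 2**31 - 1  # stand-in prime field modulus

def poly_eval(coeffs, x):
    """Evaluate a polynomial (coefficients low degree first) at x mod P."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

class PolyOracle:
    """The verifier sees evaluations of the prover's polynomial, never the
    coefficient list itself, and is limited to a query bound (this is the
    bounded-query zero-knowledge condition mentioned above)."""
    def __init__(self, coeffs, query_bound):
        self._coeffs = coeffs
        self._queries_left = query_bound

    def query(self, x):
        if self._queries_left <= 0:
            raise RuntimeError("query bound exceeded")
        self._queries_left -= 1
        return poly_eval(self._coeffs, x)

# Verifier side: sample a random public-coin challenge and query, paying
# one evaluation instead of reading the whole polynomial.
oracle = PolyOracle([3, 1, 4, 1, 5], query_bound=2)  # a degree-4 polynomial
z = random.randrange(P)
v = oracle.query(z)
assert v == poly_eval([3, 1, 4, 1, 5], z)
```

The query bound matters: a verifier allowed degree-plus-one queries could interpolate the polynomial outright, which is exactly what the zero-knowledge property must rule out.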
The problem right now is that the verifier's complexity is at least linear in the size of the circuit, because it has to read the circuit as input. And if the verifier needs to do more complex computation based on the circuit, the complexity could be even more than linear in the circuit size. That's not what we want, since our goal is a succinct verifier that runs in time at most logarithmic in the circuit size. So the solution is to take the circuit out of the verifier, but the question is how to do that. After all, the verifier still needs some sort of information about the circuit in order to check the computation. What we want is that instead of having the verifier read the entire circuit, the verifier can access an algebraic version of the circuit. And this is where holography comes in. We have a holographic preprocessing algorithm which encodes the circuit and produces an algebraic encoding of it. In our formalization, it is a kind of polynomial representation. And now the verifier can access these polynomials as oracles, just like the prover's messages. So the prover and verifier interact with each other as before, sending messages and replying with challenges. And now the verifier provides query points not only for the prover's messages, but also for the circuit polynomials. In this way, we get rid of the linear dependence on the circuit. The verifier plugs in the evaluations, both from the prover's polynomials and from the circuit polynomials, to decide whether or not to accept. So as long as the queries and the decision procedure are at most logarithmic in the circuit size, the overall verification procedure is succinct. Now we have the algebraic holographic proof. Let's take a look at the second component.
So at a high level, a polynomial commitment is a cryptographic primitive that allows one to commit to a polynomial and later prove the evaluations of the committed polynomial at some chosen challenge points. And the verification should be more efficient than evaluating the polynomial directly. So we have a setup algorithm for the polynomial commitment scheme which takes in the maximum degree that we want to support and outputs universal committer and verifier keys that work for any polynomial with degree up to that maximum. Later, the committer can use the committer key to commit to a polynomial and send the commitment to the verifier. This commitment is supposed to be smaller than the size of the polynomial. Okay, the verifier now wants to know the value of the committed polynomial at some point z, so it sends the challenge point z to the committer. The committer evaluates the polynomial at that point and produces a proof of evaluation of the commitment at the challenge point. The committer sends both the evaluation and the proof back to the verifier. The verifier checks the proof of evaluation to make sure that the committed polynomial actually does evaluate to the value v at the challenge point z. The properties that we want from the polynomial commitment are also similar to those of the AHP. We want completeness: if the polynomial evaluates to the value v at z, then the verifier will accept. We also want extractability: whenever the verifier does accept, the commitment contains a polynomial of degree at most d. And finally, we want hiding, which says that as long as the verifier doesn't make too many queries, it learns nothing about the polynomial except information such as the evaluations. This query bound is necessary because, for example, if the verifier makes up to d plus one queries to a degree-d polynomial, it can interpolate the polynomial and learn all the coefficients.
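To pin down the commit / open / check interface just described, here is a deliberately naive sketch. Everything about it is illustrative: the "proof" is the whole polynomial, so it has neither succinctness nor hiding; a real scheme such as KZG10 replaces this with a constant-size commitment and evaluation proof. Only the API shape matches what the compiler needs.

```python
# Naive polynomial commitment, for interface illustration only. Binding comes
# from hashing the coefficients; the evaluation "proof" is the full
# coefficient list, which a real scheme would compress to constant size.
import hashlib

P = 2**61 - 1  # stand-in prime field modulus

def poly_eval(coeffs, x):
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % P
    return acc

def commit(coeffs):
    # Hash of the coefficient list: binding, but neither hiding nor succinctly openable.
    return hashlib.sha256(",".join(map(str, coeffs)).encode()).hexdigest()

def open_at(coeffs, z):
    # Return the claimed evaluation plus a (trivial) proof.
    return poly_eval(coeffs, z), list(coeffs)

def check(commitment, z, value, proof, degree_bound):
    # Verifier: the proof must respect the degree bound, hash back to the
    # commitment, and actually evaluate to the claimed value at z.
    if len(proof) - 1 > degree_bound:
        return False
    if commit(proof) != commitment:
        return False
    return poly_eval(proof, z) == value

f = [5, 0, 2, 7]               # f(X) = 5 + 2X^2 + 7X^3
c = commit(f)
v, pi = open_at(f, 11)
assert check(c, 11, v, pi, degree_bound=3)
assert not check(c, 11, (v + 1) % P, pi, degree_bound=3)  # wrong value rejected
```

Note how the degree bound is checked explicitly: this is the property that, in the real scheme, extractability must enforce cryptographically rather than by inspection.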
This is the syntax of the basic polynomial commitment scheme, which was first proposed in the Kate et al. paper in 2010. But to plug into our compiler, we actually need more properties. For example, in each round of the AHP, the prover may send multiple polynomials to the verifier. We also want to be able to commit to polynomials across multiple rounds, with extractability holding for all those polynomials. We may also want to be able to open polynomials not at a single point but at a set of query points. And finally, we want to be able to enforce different degree bounds for different polynomials. To do so, we introduce a trim process, which takes in the public parameters output by the setup process and also the degree bound for each polynomial. That is because a polynomial's degree bound may be much smaller than the maximum degree bound D, and we don't want the verifier to read the whole public parameters and become non-succinct. Note that this trim process outputs succinct committer and verifier keys specialized to the input degree bounds. But unlike the setup, no secrets are generated during the trim process, which means it is totally public and deterministic. Basically, what Trim does is figure out which parts of the public parameters are useful for the given degree bounds. Okay, the basic polynomial commitment doesn't provide those properties in a straightforward way, so we need to do a little more work to make sure all the other properties still hold when we introduce these additional features. You can take a look at our paper for more technical details. All right, we have both of our components, the AHP and the polynomial commitment. Let's take a look at how our compiler actually works. The key idea here is to leverage the holography in the AHP to achieve preprocessing. Let's first recap what a preprocessing SNARK with universal setup is.
It has four algorithms: the universal setup, which outputs the universal parameters; the circuit-specific preprocessing; the prover; and the verifier. We also have the standard properties, which we won't go over since this is a standard notion. Let's see how our compiler constructs each of these algorithms step by step. We start with the universal setup. In the universal setup, you take in the maximum size of the circuit that you want to support. You ask the AHP: given this maximum circuit size, what is the maximum possible degree bound D? Then you run the polynomial commitment setup with D to obtain the public parameters for the polynomial commitment scheme. These public parameters will support any preprocessing output, so you just output them as the universal public parameters. Okay, let's look at how to do the circuit-specific preprocessing. Now you take in the universal public parameters as well as the circuit. First, you run the AHP preprocessing, which outputs the algebraically encoded form of the circuit, or as we say, the circuit polynomials. Then you run the trim algorithm to obtain the circuit-specific committer key and verifier key. Finally, you commit to these circuit polynomials using the polynomial commitment and output the circuit-specific prover key and verifier key. Okay, now let's look at the main protocol. We have the argument prover and the argument verifier, where the prover has the witness and the verifier doesn't. The argument prover simply runs the AHP prover, and whenever the AHP prover produces polynomials, say p1 through pm1, instead of sending the whole polynomials directly, the prover sends the commitments c1 through cm1 to the argument verifier. Then the argument verifier asks the AHP verifier for its randomness and sends it to the argument prover. This continues until the prover runs out of polynomials. Now we start the query and decision phases.
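The offline half of the compiler (universal setup plus circuit-specific preprocessing) can be summarized as scaffolding code. All the function bodies below are stubs standing in for the real AHP indexer and polynomial commitment scheme; only the data flow between them follows the description above.

```python
# Structural sketch of the compiler's offline phase. Stub implementations
# stand in for the AHP and the polynomial commitment; only the wiring
# between the pieces reflects the construction described in the talk.

def ahp_max_degree(max_circuit_size):
    # Ask the AHP for the largest polynomial degree arising when proving
    # circuits up to this size (stub: identity).
    return max_circuit_size

def pc_setup(max_degree):
    return {"max_degree": max_degree}      # universal public parameters (stub)

def ahp_index(circuit):
    # Holographic preprocessing: encode the circuit as polynomials (stub).
    return [circuit["poly_a"], circuit["poly_b"]]

def pc_trim(pp, degree_bounds):
    return ("ck", "vk")                    # committer / verifier keys (stub)

def pc_commit(ck, polys):
    return [hash(tuple(p)) for p in polys]  # stand-in commitments

def universal_setup(max_circuit_size):
    # One trusted setup for all circuits up to the maximum size.
    return pc_setup(ahp_max_degree(max_circuit_size))

def preprocess(pp, circuit):
    # Deterministic, per-circuit, and trust-free.
    circuit_polys = ahp_index(circuit)
    ck, vk = pc_trim(pp, [len(p) - 1 for p in circuit_polys])
    commitments = pc_commit(ck, circuit_polys)
    prover_key = (ck, circuit_polys, commitments)
    verifier_key = (vk, commitments)       # succinct: commitments, not the circuit
    return prover_key, verifier_key

pp = universal_setup(2**10)
pk, vk = preprocess(pp, {"poly_a": [1, 2, 3], "poly_b": [4, 5]})
assert len(vk[1]) == 2  # the verifier key holds one commitment per circuit polynomial
```

The key point the wiring makes visible: the verifier key contains commitments to the circuit polynomials rather than the circuit itself, which is what lets the online verifier stay succinct.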
The verifier runs the AHP query algorithm to get a set of query points, and the argument prover evaluates its polynomials at the query set and sends back the evaluations. Now the verifier can run the decision procedure to decide whether or not to accept. But this is still not sufficient, because the verifier needs to make sure the prover gave the correct evaluations. So the prover uses the polynomial commitment to produce a proof of evaluation, showing that the prover's polynomials, evaluated at the query set, really do give these values. Then the argument verifier also checks that this evaluation proof is correct. So far we have an interactive protocol, and we can make it non-interactive using Fiat-Shamir, because all the verifier's messages are random. This construction is very clean, and the idea is to use holography to preprocess the circuit, and then, instead of sending the polynomials, send polynomial commitments and prove that the evaluations are correct. The nice thing about our compiler is that it does not only provide a clean construction: all of the properties, like completeness, proof of knowledge, and zero knowledge, follow very cleanly from the corresponding properties of the two components, the AHP and the polynomial commitment. We also get succinctness if both the AHP verifier and the PC verifier are succinct. Okay, so that's it. In conclusion, in this talk we have seen how to construct universal preprocessing zk-SNARKs from algebraic holographic proofs and extractable polynomial commitments. In the paper, you can find more details about these components, such as our efficient AHP for R1CS, which evaluates low-degree extensions of arbitrary circuits, and how we extend the KZG10 polynomial commitment to achieve the additional properties required by our compiler. All right, thanks for listening. These are the links to our paper and the implementation, and feel free to take a look for more details. Thanks.