This talk is on Halo Infinite, or proof-carrying data from additive polynomial commitments. I am Ben Fisch from Stanford University, and my co-authors on this work are Dan Boneh, Justin Drake, and Ariel Gabizon.

Proof-carrying data is an extension of non-interactive proof systems, so let's begin by briefly reviewing non-interactive proof systems. At a very high level, a prover sends a proof to a verifier that helps the verifier check some claim. The reason this is useful is that there is typically some asymmetry between the prover and the verifier, either in terms of computational resources or knowledge of secret information. One flavor of non-interactive proof is a non-interactive zero-knowledge proof. Here, the prover wants to convince a verifier that a statement is true without revealing some secret evidence behind the statement's truth. In other words, the prover knows a private input w to some program such that the program evaluates to 1 on this private input, and the prover sends a proof to the verifier. The verifier runs a verification algorithm to accept or reject. Soundness says that it is infeasible to produce a valid proof if the prover doesn't know such a w, and zero-knowledge says that the proof reveals nothing about w. When proof verification is asymptotically faster than program evaluation, we call this verifiable computation, or SNARKs, which stands for succinct non-interactive arguments of knowledge. Here, the proof should be small and fast to verify even as the program grows very complex or the witness grows very long. Zero-knowledge, in this case, is an optional extra feature.

IVC, incrementally verifiable computation, enables incrementally updating a proof of a computation's correctness as the computation evolves. Each proof attests to the correctness of the computation up until that point. Importantly, for IVC to be non-trivial, the cost to produce the next proof at the i-th iteration, given the local state, should be cheaper than producing the proof from scratch, in other words sublinear in the depth, and should not require the previous local inputs, which may have been discarded. IVC has been important for recent real-world applications, including to blockchains and certificate transparency logs. Although not quite the standard way of presenting IVC, I like to describe IVC as also optionally passing along a separate prover state in addition to the proof and the local computation output. This state is not needed to verify the proof, but it helps the prover produce the next proof, which is why it makes sense to separate these components. Classical constructions of IVC via proof recursion did not use any extra prover state, but newer constructions, such as the one in this work, do.

Proof-carrying data, or PCD, generalizes IVC to any distributed computation where the nodes of the computation are organized as a directed acyclic graph. So IVC is the special case of proof-carrying data corresponding to a path-shaped distributed computation. Similarly to IVC, we can separate the proof components passed along the edges of this computation into a proof and a prover state. The first is the actual proof that can be verified, and the second component helps the next node of the computation produce a proof. We can also have separate efficiency requirements for the two distinct proof components, since they are used for different purposes and their requirements change based on the application.
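To make the shape of this concrete, here is a minimal sketch, in Python, of the IVC step interface just described; all names here are hypothetical stand-ins, not the paper's formal syntax. The step consumes the previous proof, the previous output, and a fresh local input, and may also thread through a separate prover state that is never needed for verification.

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class IVCStepResult:
    z_next: Any                   # new local output z_{i+1}
    proof: bytes                  # proof that the whole computation so far is correct
    prover_state: Optional[Any]   # optional state that only helps produce the next proof

def ivc_step(z_prev: Any, proof_prev: bytes, local_input: Any,
             prover_state: Optional[Any] = None) -> IVCStepResult:
    """One incremental step. For IVC to be non-trivial, this must be
    sublinear in the depth i and must not need the earlier local inputs,
    which may already have been discarded."""
    raise NotImplementedError  # supplied by a concrete IVC scheme
```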
Classical constructions of PCD or IVC from SNARKs used an idea called proof recursion. The high-level concept is that the incremental proof π_{i+1} is a proof for a witness consisting of the previous proof π_i, the previous output z_i, and the local input y_{i+1}. The program checking the witness runs the SNARK verification of the previous proof π_i and also checks the local computation. So the proving time is proportional to the size of the local computation plus the size of the SNARK verifier. Now, I'm skipping some technical details of how the program works, which enable it to reference its own verifier.

In a bit more detail, here is how it would work for a preprocessing SNARK, which is a SNARK that has a preprocessing algorithm producing compact proving and verification keys for a given program. Programs are represented as having a variable public input x, which is an input to both prover and verifier, and a variable private input w capturing the witness. So the recursive program on (x, w) parses its public input as a variable verification key, the index i+1, and the local output z_{i+1}. It parses w as the local input y_{i+1}, the previous output z_i, and the previous proof π_i, and it runs the SNARK verification algorithm on the verification key, the previous index, previous output, and previous proof, and also checks the local predicate f. The actual verification key that we pass to this program will be the preprocessed verification key of the program itself, but it is referenced as a variable inside the program since we don't have it yet.

Halo is the first PCD based on an underlying proof system that has a linear-time verifier. The classical method of constructing PCD solely from proof recursion would not work with a linear-time verifier, but Halo manages to use proof recursion in a way that does not include the full verifier in the incremental prover's program. This work generalizes Halo to a broad class of proof systems beyond the specific one used in the Halo paper. There are two important recent papers related to ours that I'd like to mention here. The first is BCMS 20, which formalized and generalized the Halo method, identifying an underlying necessary property of argument or proof systems called an accumulation scheme, and the second is BCLMS 20, which further generalized the Halo method to work even with non-succinct argument systems that have what they call a split accumulation scheme. Now, while these works focus on defining the abstraction of accumulation schemes for proof systems and how to use them to build PCD, our work focuses more on properties of polynomial commitment schemes that lead to proof systems with accumulation and split accumulation, and also gives generic constructions of accumulation.

Let me briefly review what polynomial commitment schemes are. A polynomial commitment scheme provides algorithms that enable producing a small value committing to a polynomial of bounded degree over a finite field, and then later providing proofs of evaluations of this committed polynomial at chosen points. Although not a strict requirement, ideally these proofs should be succinct, that is, sublinear in the degree d, and efficient to verify, also sublinear in d. More formally, it combines an ordinary commitment scheme with a non-interactive proof.
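As a reference point for the discussion that follows, here is a hedged sketch of the polynomial commitment interface just outlined; the class and method names are illustrative, not the paper's formal syntax.

```python
from typing import Any, List, Tuple

class PolynomialCommitmentScheme:
    """Illustrative PCS interface: commit to a polynomial of degree <= d over
    a finite field, then prove evaluations at chosen points. Ideally both the
    eval proof size and its verification time are sublinear in d."""

    def commit(self, coeffs: List[int]) -> Tuple[Any, Any]:
        """Returns (commitment C, opening hint) for the polynomial f
        given by its coefficient vector."""
        raise NotImplementedError

    def eval_prove(self, C: Any, hint: Any, z: int) -> Tuple[int, Any]:
        """Returns (y, proof) where y = f(z): an argument of knowledge of
        an opening of C to a polynomial f with f(z) = y."""
        raise NotImplementedError

    def eval_verify(self, C: Any, z: int, y: int, proof: Any) -> bool:
        raise NotImplementedError
```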
The ordinary commitment scheme should satisfy the usual definition of binding with respect to its input domain, which is polynomials over F_p of degree at most d, and the non-interactive proof should be an argument of knowledge of an opening of a commitment to a polynomial such that the polynomial evaluates to y at the point z.

A polynomial interactive oracle proof is a public-coin interactive proof between a prover and a verifier, where the prover sends polynomials that the verifier accesses through oracles, querying those oracles for polynomial evaluations. The verifier checks some equation over the responses that depends on the program, and the polynomials the prover sends depend on the prover's witness. The protocol overall is a proof of knowledge of this witness. To compile this to a concrete interactive proof between the prover and the verifier, and also to compress the size of the communication, the prover instead sends commitments to the polynomials and replaces the oracle polynomial evaluations with evaluation proofs using the polynomial commitment scheme. Finally, this can be transformed into a non-interactive proof using the standard Fiat-Shamir transform.

An additive polynomial commitment scheme comes with a defined addition operation that enables taking linear combinations of commitments. The linear combination should itself be a valid commitment to the linear combination of the underlying committed polynomials. So there should additionally be an algorithm that derives an opening hint for this linear combination from the opening hints of the input commitments, which enables producing eval proofs for the linear combination. Now, it is always possible to trivially define this addition operation: for example, it could simply be the formal addition of the commitments. It is non-trivial when the addition operation is actually compressing, that is, when there is a set of commitments to polynomials of bounded degree that is closed under addition and has polynomially bounded size. It is in this case that we call the polynomial commitment scheme additive.

Linear combination schemes for polynomial commitments generalize additivity. Perhaps there is no compressing addition operation as previously defined, but there is a protocol between a prover and a verifier which outputs a succinct commitment to the linear combination along with a proof of correctness that the verifier checks. The security property of this linear combination protocol is that running an evaluation proof on the output C* is an argument of knowledge of openings of both C* and the formal linear combination of the input commitments to the same polynomial f. We call a linear combination scheme efficient if the LinCombine protocol has a verifier that is sublinear in the maximum degree of the input polynomials. Note also that in our example the LinCombine protocol takes just two commitments, but it could also take a linear combination of multiple commitments at once. So to summarize: a non-additive polynomial commitment scheme, which may have no compressing Add operation, could still have an efficient linear combination protocol with a succinct output.
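To illustrate the additivity property with running code, here is a toy discrete-log-style commitment C(f) = g^{f(τ)} mod p with τ public. This is emphatically not a secure scheme (with τ public it is neither hiding nor binding); it only shows the algebra of how Add on commitments tracks addition of the committed polynomials, and how the opening hint for the sum is derived from the input hints.

```python
# Toy parameters -- illustrative only, NOT a secure commitment scheme.
P = 2**61 - 1      # a Mersenne prime, used as the group modulus
G = 3              # generator (toy choice)
TAU = 123456789    # evaluation point; public here, so the scheme is insecure

def commit(coeffs):
    # C = g^{f(tau)} mod p; exponents reduced mod p-1 by Fermat's little theorem
    e = sum(c * pow(TAU, i, P - 1) for i, c in enumerate(coeffs)) % (P - 1)
    return pow(G, e, P)

def add_commitments(c1, c2):
    # Homomorphic addition: a valid commitment to f1 + f2
    return (c1 * c2) % P

def add_hints(f1, f2):
    # Opening-hint derivation: coefficientwise sum of the committed polynomials
    n = max(len(f1), len(f2))
    return [(f1[i] if i < len(f1) else 0) + (f2[i] if i < len(f2) else 0)
            for i in range(n)]

# Sanity check of the additivity property: Add(C(f1), C(f2)) == C(f1 + f2)
f1, f2 = [1, 2, 3], [4, 0, 5]
assert add_commitments(commit(f1), commit(f2)) == commit(add_hints(f1, f2))
```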
The result of our work, stated informally, is that we can achieve proof-carrying data from any polynomial commitment scheme with an efficient linear combination scheme, where the proving time inherits the efficiency of the polynomial commitment scheme, roughly on polynomials of degree the size of the local predicate plus the size of the linear combination scheme verifier. If, additionally, the polynomial commitment scheme is additive, and doesn't just have an efficient linear combination scheme, then the PCS proof size is also inherited and there is no need to transfer prover states in the proof-carrying data.

Towards showing this result, we introduce two types of aggregation schemes for polynomial commitments: one that we call public aggregation, which is related to the BCMS idea of accumulation, and the other that we call private aggregation, which is related to the BCLMS notion of split accumulation. Private aggregation is a public-coin interactive protocol between a prover and a verifier which aggregates commitment evaluation tuples. An evaluation tuple is a commitment together with a claimed point on the committed polynomial, that is, a tuple (C, x, y). For simplicity, let's define the protocol for aggregating just two evaluation tuples, but this generalizes to aggregating more than two. The verifier's inputs in this case are two commitments and two claimed points on the committed polynomials, while the prover additionally has the opening hints of the input commitments as private inputs to the aggregation protocol. The output of the aggregation protocol has a public component received by both parties, which is a new evaluation tuple, and a private output just to the prover, which consists of an opening hint for the commitment C* and the polynomial f* that C* commits to. The security property of this aggregation protocol is that the composition of running the aggregation and subsequently running the Eval protocol on the output tuple (C*, x*, y*) is equivalent to running Eval on both input tuples; in other words, it is a proof of knowledge for both input tuples.

Public aggregation is very similar, except for the key difference that the aggregation prover does not need to know the opening hints for the input commitments. Instead, it has eval proofs for each input evaluation tuple. Here the prover and verifier therefore have the same inputs; only, the verifier does less work than the prover. The output of the aggregation protocol is a new evaluation tuple as before and an opening hint for this tuple. For efficiency, the verifier does not need to read the full output; in particular, it does not need to read the opening hint, but it should be possible to derive this from the protocol transcript. The security property is the same.

I will sketch how we can improve the efficiency of PCD with proof systems built from polynomial commitment schemes that admit aggregation. So we are given a proof system for a program on (x, w) with the structure that proofs are a tuple (T, ρ, A), where T is an evaluation tuple (C, α, β), ρ is a PCS evaluation proof for T, and A is an additional component of the proof. And the verifier is split into two parts: one is a succinct part, which is efficient, that is, sublinear in the size of the program, and the other is the evaluation proof verifier, which checks ρ for T. Now, the first step is to show how we can build an aggregation scheme for these proofs using the aggregation scheme of the PCS.
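Here is a hedged sketch of the two aggregation interfaces and the proof shape just described; the types, field names, and function signatures are stand-ins chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple

@dataclass
class EvalTuple:
    C: Any   # polynomial commitment
    x: Any   # evaluation point
    y: Any   # claimed value of the committed polynomial at x

@dataclass
class Proof:
    T: EvalTuple   # evaluation tuple component
    rho: Any       # PCS evaluation proof for T (the expensive part to check)
    A: Any         # remaining component, checked by the succinct verifier part

def aggregate_private(tuples: List[EvalTuple], opening_hints: List[Any]
                      ) -> Tuple[EvalTuple, Tuple[Any, Any]]:
    """Private aggregation: the prover needs the opening hints of the inputs.
    Returns a public output (a new EvalTuple, seen by both parties) plus a
    prover-only private output (opening hint for C*, and the polynomial f*
    that C* commits to)."""
    raise NotImplementedError

def aggregate_public(tuples: List[EvalTuple], eval_proofs: List[Any]
                     ) -> Tuple[EvalTuple, Any]:
    """Public aggregation: the prover holds eval proofs instead of hints, so
    prover and verifier share the same inputs; the output opening hint must
    be derivable from the protocol transcript alone."""
    raise NotImplementedError
```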
The aggregation prover will aggregate proofs by aggregating the evaluation tuple components into T* with an aggregation proof π_agg. And the verifier will check the SNARK verification component of each proof and then run the PCS aggregation scheme verifier to check T*'s correctness. So the high-level idea of how we'll improve PCD is that the prover will separately produce a proof π_f for the local predicate and aggregate it together with the previous proof into the new proof. This new proof π_2 is a proof for a program that calls the aggregation verifier, with the previous proof π_1, the local proof π_f, and an aggregation proof as a witness.

In more detail, the proofs passed along edges of the PCD have two components: one is a primary proof component, the other an aggregation output. We preprocess the local program f so that the prover can independently produce proofs of the local computation's correctness. The primary proof passed along the edges is a proof for the program recursively defined as follows: it parses its input x as a variable verification key, an index, a local output, and an aggregation output. It parses its witness as a previous output, a proof of the local computation's correctness, the previous primary proof π_i, the previous aggregation value T*_i, and an aggregation proof. It then runs the verifier of the aggregation protocol on the aggregation proof, the proofs π_i and π_f, and the previous aggregation value T*_i in order to check that the new aggregation value T*_{i+1} is correct. So we can think of each T* value as a value that accumulates the unchecked polynomial commitment eval components of all prior proofs in the chain, which is why this technique has also been called proof accumulation. The complexity of the incremental prover's work is proportional to the size of the efficient part of the proof system verifier, V_snark, plus the size of the verifier for the polynomial commitment aggregation scheme. If we use private aggregation instead of public, then the next incremental prover also needs to receive the opening for the evaluation tuple T*_i and the evaluation tuple component of π_i, passed along as the prover's private state. Importantly, this additional witness does not grow in size; it's the same at each depth of the PCD.

So the main theorems in this work concern the efficiency of aggregation schemes for polynomial commitments. A public or private aggregation scheme is said to be efficient if the aggregation protocol verifier is sublinear in the maximum degree of the committed polynomials. Our first theorem shows that every polynomial commitment scheme that has an efficient linear combination scheme also has an efficient aggregation scheme. And our second theorem is that every additive polynomial commitment scheme has an efficient public aggregation scheme. Both theorems are constructive: we show how to build an aggregation scheme from a linear combination scheme, which inherits the efficiency of the linear combination scheme, and we also show how to construct public aggregation for any additive PCS.

I'll first show how to get private aggregation from a linear combination scheme. To simplify the presentation, let's assume the input evaluation tuples are claims about roots; it's easy to reduce the general case to claims about roots. And it's also easy to generalize the protocol that I'll give to aggregation of more than two commitment eval tuples.
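Before we get to that protocol, here is a hedged pseudocode recap of the aggregation-based recursive program described a moment ago. The names v_snark, agg_verify, the verification keys, and the proof-as-dict shape are all illustrative stand-ins, and the base case at depth 0 is omitted.

```python
def recursive_program(x, w, v_snark, agg_verify, vk_f):
    """Run in place of the full SNARK verifier: only the succinct verifier
    part plus the PCS aggregation verifier execute inside the recursion, so
    the expensive eval checks are deferred into the accumulator T*."""
    vk, i_next, z_next, T_next = x                 # public input
    z_prev, pi_f, pi_prev, T_prev, pi_agg = w      # witness
    # Proofs have the shape pi = {"T": eval tuple, "rho": eval proof, "A": rest};
    # rho is deliberately NOT checked here.
    ok = v_snark(vk, (vk, i_next - 1, z_prev, T_prev), pi_prev["T"], pi_prev["A"])
    ok = ok and v_snark(vk_f, (z_prev, z_next), pi_f["T"], pi_f["A"])
    # Check that T*_{i+1} correctly aggregates the unchecked eval tuples of
    # pi_prev and pi_f together with the previous accumulator T*_i.
    return ok and agg_verify([pi_prev["T"], pi_f["T"], T_prev], T_next, pi_agg)
```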
So the prover begins by defining the vanishing polynomials z_i(X) = X − x_i and z(X) = z_1(X) · z_2(X); the verifier can do the same. The verifier sends a random challenge ρ. The prover constructs the quotient polynomial q(X) = (z_2(X) f_1(X) + ρ z_1(X) f_2(X)) / z(X); note that both z_2 f_1 and z_1 f_2 have roots at both x_1 and x_2, so this is indeed a polynomial. It commits to q and sends the commitment to the verifier. The verifier responds with another random challenge r, and the prover constructs the polynomial g(X) = z_2(r) f_1(X) + ρ z_1(r) f_2(X) − z(r) q(X). This is defined such that, if done correctly, g will have a root at r if the claim is true. If the scheme is additive, the verifier can homomorphically derive a succinct commitment C_g to this polynomial g; more generally, we can use the linear combination scheme to produce a succinct commitment to g. The output of the prover is the evaluation tuple (C*_g, r, 0) together with the opening of g, and the verifier's output is just the tuple (C*_g, r, 0). If the prover and verifier subsequently run the evaluation protocol on the output tuple (C*_g, r, 0), then this turns into a batch evaluation protocol for the original claims.

So this protocol builds on the classical one-round protocol for batch proving that multiple committed polynomials have a common root, by opening a random linear combination of the commitments at this root; the knowledge extractor in that case is based on the invertibility of a Vandermonde matrix. But to batch prove multiple committed polynomials at different points, the protocol here involves one more round, as shown, and the knowledge extractor ends up being based on the invertibility of the Hadamard product of a random Vandermonde matrix with a square matrix of non-zero field elements.

Next, we'll show how to build public aggregation for additive polynomial commitment schemes. An important building block for this is a succinct proof-of-knowledge protocol for pre-images of a homomorphism from integer vectors to an arbitrary abelian group. The protocol we describe for this is a generalization of the Bulletproofs protocol depicted here. Now, I won't walk through the details of the protocol, but the verifier's final check is that the integer x' times the homomorphism applied to an integer vector u equals the group element y', where the integer vector u is the coefficient vector of a polynomial defined from the transcript challenges in the protocol. So it's something the verifier can check on its own, because u is publicly derivable from the protocol transcript. A critical observation that comes out of our analysis of the security of this protocol is that the verifier doesn't need to explicitly derive the vector u and check the evaluation of the homomorphism on u, which is the linear-time part of the verification, if instead the prover provides any proof of knowledge of some pre-image of y'. Now, you might wonder why this is helpful, because the goal is to construct a proof of knowledge of a homomorphism pre-image in the first place. Repeating the protocol on u, instead of having the verifier check u directly, obviously wouldn't reduce the verifier complexity of the protocol here, but it gives us a tool for public aggregation.
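To make the "publicly derivable u" point concrete, here is a small sketch assuming a Bulletproofs-style halving fold in each round, so u is the coefficient vector of a product of binomials determined by the challenges; the exact exponent convention depends on the fold order, so treat this as a shape, not the paper's exact protocol. Recomputing u and evaluating the homomorphism on it is precisely the linear-time check that the observation above lets the verifier avoid.

```python
def derive_u(challenges):
    """Coefficient vector of the product of binomials (1 + xi * X^(2^j))
    determined by the transcript challenges, one factor per halving round."""
    u = [1]
    for xi in challenges:
        u = u + [xi * c for c in u]   # multiply by (1 + xi * X^len(u))
    return u

def final_check(phi, challenges, x_prime, y_prime):
    """Verifier's last step: x' * phi(u) == y', i.e. phi(x' * u) == y',
    where phi is the homomorphism from integer vectors to the group."""
    u = derive_u(challenges)
    return phi([x_prime * c for c in u]) == y_prime
```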
The key idea is that we can think of this protocol as a reduction from proving a claim about a hidden pre-image x to a claim about a publicly known, or derivable, pre-image u, which can be extracted later by another prover just from reading the proof transcript. So we'll apply this sub-protocol as follows. First, we need a slight compilation of an additive polynomial commitment scheme into one that can be publicly aggregated. The new Commit* must implement a homomorphism from integer vectors to the group G. This is often already the case, but if not, we can define a new commitment algorithm that takes linear combinations of commitments to the monomials. The new Eval* protocol on an evaluation tuple (C, x, y) works as follows. First we define a new homomorphism H_x(f), where bold f denotes the coefficient vector of the polynomial, which outputs two components: the commitment to f and the evaluation f(x). It then runs the homomorphism pre-image proof for the statement that H_x(f) = (C, y), to get the integer vector u defined on the previous slide and two values C' and y' such that H_x(u) = (C', y'), and then it runs Eval on the evaluation tuple (C', x, y'). So notice that u is public, since it can be derived from the HPI proof transcript, and so this reduces an Eval instance with a private witness to an instance with a public witness.

So finally, if we're given Eval* proofs for input tuples (C_i, x_i, y_i), each of these contains an Eval proof for a new tuple (C_i', x_i, y_i') with a public witness u_i derived from the proof transcript. This enables the aggregation prover to run private aggregation, deriving these witnesses from the proof transcripts, to reduce to a tuple (C*, x*, y*) with a public witness u* and an opening hint open*, both derivable from the transcript.

A major difference from the original Halo is, of course, that the original Halo was for a specific polynomial commitment scheme, Bulletproofs, and this is much more general. However, there are also mechanical differences beyond direct generalization. A minor technical difference is that it does not involve the verifier testing u(X) at a random point, but a more important difference is that the commitment function H does not need to be a collision-resistant homomorphism, whereas the original Halo relied on the collision resistance of the Pedersen hash. In general, the commitment function of course needs to be a binding commitment to polynomials, but that is a weaker requirement than being a collision-resistant homomorphism over the integers.
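Finally, a hedged sketch of the Eval* compilation just described; commit, hpi_prove, and eval_prove are assumed primitives with the stated behavior, and the return shapes are illustrative rather than the paper's exact syntax.

```python
def eval_star(C, x, y, f_coeffs, commit, hpi_prove, eval_prove):
    """Reduce an eval claim with a private witness f to one with a public
    witness u via a homomorphism pre-image (HPI) proof."""
    # H_x maps a coefficient vector f to (Commit(f), f(x)).
    def H_x(coeffs):
        return commit(coeffs), sum(c * x**i for i, c in enumerate(coeffs))
    # Prove knowledge of a pre-image of (C, y) under H_x. The transcript fixes
    # a public vector u and a pair (C', y') with H_x(u) = (C', y').
    transcript, u, (C_prime, y_prime) = hpi_prove(H_x, (C, y), f_coeffs)
    # Finish with a plain eval proof on the reduced tuple (C', x, y'), whose
    # witness u is public, enabling anyone to aggregate it later.
    return transcript, eval_prove(C_prime, x, y_prime, u)
```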