Hello, my name is Cody Freitag. Today I'll be talking about SPARKs, which are Succinct Parallelizable ARguments of Knowledge.

To start off, I want to talk about the notion of succinct arguments for NP, which have been studied for quite a while now. These are protocols where a prover interacts with a verifier about some statement x and whether or not it is in a language L. Because this is an NP statement, we additionally give the prover a witness to help it prove this fact. What we require from succinct arguments is, first, completeness: if the statement x given as input to the protocol is in the language, then the verifier accepts. Second is the standard soundness notion: if x is not in the language, then no cheating prover P* can convince V, except with negligible probability. In this work, we'll actually consider a stronger notion, that of arguments of knowledge, which says that if the prover can convince V, then it must actually know a witness. You can formalize this using an extractor, but that won't be the focus of the talk today, so I won't get into it; just know that this is the notion we're using and requiring in this work.

The most important and interesting aspect of succinct arguments that I want to talk about, though, is efficiency. In particular, the verifier is supposed to run in time polylogarithmic in a time bound T, where you can think of T as the time to verify the original NP statement directly: if the prover just sent over the witness, T is the time it would take the verifier to check it. In the succinct setting, we require that the verifier runs in time polylogarithmic in T, and additionally that the communication is polylog in T. Together, these guarantees are very good: polylog in T is essentially independent of T, and usually much cheaper than running the NP verification procedure itself. On the other hand, we usually only require that the prover runs in time polynomial in T, where this could be an arbitrary polynomial. And really, this arbitrary polynomial is a huge bottleneck in making succinct arguments actually useful and practical in everyday scenarios.

As a running example, I want to consider the task of delegation. Here a client needs to perform some expensive computational task, and instead of computing it itself, it asks a server to compute it: it says, AWS, please compute this function M on input x. AWS comes back with an output. However, being the cryptographers we are, we don't trust AWS, and nothing in this interaction forces it to be honest; instead, we trust in mathematical proofs. This leads to the idea that AWS first computes the function and then proves the output is correct. For this proof, it uses a succinct argument for the statement that this machine M, however it's represented, on input x outputs the value y. This is why delegation is usually phrased in this "first compute the output y, then prove this clean statement" style. But the problem is that if the prover's overhead is some arbitrary polynomial, even just T squared, this is essentially useless in practice. As a concrete example, if a computation takes, say, 2^40 steps, it takes about an hour on most computers, whereas squaring this, doing 2^80 steps, takes on the order of a hundred million years. It's just not worth trusting the math at that point if it comes with such a huge overhead.
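To make the scale of that gap concrete, here is a tiny back-of-the-envelope check; the assumed rate of 3 * 10^8 RAM steps per second is my own illustrative constant, not a figure from the talk, and the exact numbers vary by machine, but the blow-up from T to T^2 does not.

```python
# Back-of-the-envelope check of the T vs. T^2 delegation example above.
STEPS_PER_SECOND = 3e8  # assumption: roughly one modern CPU core

T = 2 ** 40
native_hours = T / STEPS_PER_SECOND / 3600
squared_years = T ** 2 / STEPS_PER_SECOND / (3600 * 24 * 365)

print(f"T   = 2^40 steps: about {native_hours:.1f} hour(s)")  # ~1 hour
print(f"T^2 = 2^80 steps: about {squared_years:.1e} years")   # ~1e8 years
```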
So what do we know as far as proof efficiency goes? Again, I want to emphasize that I'm considering the general NP regime, where an arbitrary prover with some witness interacts, possibly over many rounds, with a verifier. In this setting, proof efficiency seems to have been solved, at least from a theory point of view. Why is this? Well, we have solutions where the prover's total time is T times some polylog T factor, and you can think of this as the overhead of computing a proof in the delegation setting. At a high level, these follow from efficient constructions of PCPs: long strings, of size T times polylog T even in the efficient case, which the verifier can check by reading only a few positions. These PCPs can then be compiled into succinct arguments using collision-resistant hash functions and fairly general cryptographic techniques; this compilation is known as Kilian's argument, and that's how I'll refer to it.

The problem is that even though this is a really nice, beautiful line of work on the theory side, in practice it is not that efficient. These polylog T factors may amount to a factor of a thousand or more in practice. If you're taking an hour-long computation and it takes a thousand hours to prove it, maybe it's not worth it. And this approach seems inherently stuck: we've been studying PCPs for over 30 years now, and it would solve a huge open problem to get PCPs whose overhead is only linear rather than T times polylog T. But even if we could solve that problem, I claim it still wouldn't be enough for a lot of practical scenarios, because even a delegation scheme whose overhead is only a factor of two may be too much: proving a week-long computation with an extra week of work maybe just isn't worth it.

One way around this problem, at least in theory land, is to parallelize these PCPs. If the complaint is that they take too long to compute, let's compute as much of them as we can in parallel. The work of BCGT in 2013 gives efficient-size PCPs, so the best size we know, T times polylog T, where after finishing the computation you can compute the whole PCP in only polylog T depth. When you translate this into an argument, the parallel time, the depth, is only T plus polylog T, which you can think of as essentially T whenever T is large. However, it requires roughly T processors. If T is something like 2^40, as in the hour-long computation, that number of processors is just not reasonable in any practical setting.
So, to recap the landscape of proof efficiency that I've discussed: on the x-axis, put the additive overhead, the extra time it takes to compute a proof, and on the y-axis, the number of processors required to do so. First, we have the original polynomial-size PCPs from the work of BFLS in '91. These can be compiled into succinct arguments, either via Kilian's protocol or, analogously, Micali's protocol, which gives a non-interactive succinct argument, or CS proof. Then, as the landscape of PCPs improved, so did the efficiency of succinct arguments: the 2005 work of Ben-Sasson and Sudan gives PCPs with quasi-linear overhead. For both of these works, you can think of the prover as using a single processor; in these regimes, having slightly more processors gives no benefit. Finally, the work of BCGT in 2013 shows how to trade off the time to compute the PCP against a linear number of processors. This gives succinct arguments with essentially no additive overhead, so a T-time computation can be proven in essentially time T, but it requires roughly T processors.

What I want to strive for in this work is the best of both worlds: can we get essentially T-time proofs, so no overhead, using very few processors? This is what I want to capture with the notion of SPARKs, which, recall, are succinct parallelizable arguments of knowledge. In this new paradigm, we forget the old method of first computing the output and then proving a clean statement about it; instead, we focus on computing the proof in parallel with the computation itself. What do I mean by this? We have the same interactive argument setup as before, but the input is only a statement: a function M, an input x, and a time bound T. The point is that the prover should compute the output and the proof together, while maintaining the other properties of a succinct argument of knowledge, with prover efficiency T plus polylog T parallel time. For T-time computations this is again essentially time T, but we allow the prover to use a polylogarithmic number of processors. This is what we require from a succinct parallelizable argument of knowledge, or SPARK.

What we show in this work is, first, that assuming only collision-resistant hash functions, there exist four-round SPARKs: very round-efficient SPARKs that also preserve the time complexity of the underlying computation. Second, assuming a little more, namely succinct non-interactive arguments of knowledge (SNARKs) with quasi-linear overhead, we construct non-interactive SPARKs. Underlying both results is a somewhat generic transformation, assuming only collision resistance, from any succinct argument of knowledge with quasi-linear overhead, meaning prover overhead T times polylog T, into a SPARK. The way I want you to think of this: we start with a succinct argument with T times polylog T overhead, which maybe means an hour-long computation takes 1,000 hours to prove, and we transform it so that instead, using around 1,000 processors, we prove that hour-long computation in essentially an hour, while also computing the output. The theorems above then follow: for the first, we plug Kilian's argument with an efficient PCP into our construction, which gives a four-round SPARK, preserving the round complexity of Kilian's argument; for the second, we start with any quasi-linear SNARK, and the result is a non-interactive SPARK. The focus of this talk will be on this transformation from quasi-linear succinct arguments to SPARKs.
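To summarize the plot just described, here is my own reconstruction of it as a table, with the additive proving overhead and the processor count as the two axes; the last row is the goal defined above.

    Scheme                            Extra proving time    Processors
    BFLS '91 + Kilian / Micali        poly(T)               1
    Ben-Sasson-Sudan '05 + Kilian     T * polylog(T)        1
    BCGT '13                          polylog(T)            ~T
    SPARKs (this work)                polylog(T)            polylog(T)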
But before I get into that, I want to briefly talk about some applications of SPARKs. The first is what I alluded to before: very efficient delegation protocols, which we call time-tight. Here the client gives a task to a server, and the server can simply use, say, a non-interactive SPARK for the proof. The high-level idea is that SPARKs give a way to take any function and make it verifiable, and this gets really interesting once you consider specific functions. For example, take a sequential function, a function that takes a long time to compute and can't be computed much faster with lots of parallelism. If you take a sequential function and append a non-interactive SPARK, this immediately gives a verifiable delay function, a VDF. VDFs have been hugely influential in blockchain applications over the last two years; among other things, they can be used for distributed, decentralized randomness generation. I want to emphasize that previously, we only knew such generic constructions of VDFs assuming iterated sequential functions; think of repeated hashing, or repeated squaring in a group of unknown order. What we show in this work is a transformation that works for any starting function, so essentially any sequential function gives a VDF.

Why is this interesting? If the sequential function satisfies some other hardness notion, for example if it's memory-hard, then our transformation preserves that hardness: start with a memory-hard sequential function, apply a non-interactive SPARK, and we get a memory-hard VDF. I don't want to define memory-hard functions formally; in fact, there are many different ways to define the notion, so I'll keep it at a high level: it says that the bottleneck in computing the function is the memory accesses it needs to make. Such functions are known to exist, under various definitions, at least in the random oracle model. They're important because the best architecture we know for computing them is just standard computer architecture. They give a kind of ASIC-resistance property, where there's no benefit in buying specialized hardware to compute these functions, which is of a lot of interest in blockchain applications.
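To illustrate the "sequential function plus non-interactive SPARK gives a VDF" recipe, here is a minimal sketch. Everything in it is illustrative: iterated SHA-256 merely stands in for a sequential function, and spark_prove / spark_verify are hypothetical placeholders for a non-interactive SPARK, where all of the actual cryptographic content would live.

```python
import hashlib

def spark_prove(machine, x, t, y):
    raise NotImplementedError("hypothetical non-interactive SPARK prover")

def spark_verify(machine, x, t, y, proof):
    raise NotImplementedError("hypothetical non-interactive SPARK verifier")

def sequential_function(x: bytes, t: int) -> bytes:
    # Iterated hashing: believed hard to speed up much with parallelism.
    for _ in range(t):
        x = hashlib.sha256(x).digest()
    return x

def vdf_eval(x: bytes, t: int):
    # With a SPARK, the proof is computed in parallel with the t hashing
    # steps, so evaluation takes essentially sequential time t, not 2t;
    # the two calls are only written one after the other for clarity.
    y = sequential_function(x, t)
    proof = spark_prove(sequential_function, x, t, y)
    return y, proof

def vdf_verify(x: bytes, t: int, y: bytes, proof) -> bool:
    # Verification time is polylogarithmic in t.
    return spark_verify(sequential_function, x, t, y, proof)
```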
For the rest of the talk, I want to focus on this construction from quasi-linear-overhead succinct arguments to SPARKs, and I'm going to make a few simplifying assumptions. First, I'll only consider the non-interactive regime; second, I'll only consider sequential RAM computation. In the full version of the paper, we show how to deal with both.

As a warmup, I want to consider SPARKs for computations that use small space. This is actually based on the original VDF works that get VDFs with almost no overhead, or tight VDFs. Picture time on the horizontal axis, plotted from 0 to T: we know that this machine, the function M on input x with witness w, takes T steps to output some value y. What we want is to compute, in parallel with this computation, a proof that finishes in essentially time T. The observation made in these works is: what if we had a SNARK where proving any T-time computation takes only T extra time, so 2T time in total? In this setting, we can start by taking the first half of the computation, the first T/2 steps, and prove them, finishing the proof by time T. In some sense this is the most we can do with the tools we have; with this 2T SNARK and a single processor, it's the best possible. However, at time T/2 we can recursively compute and prove more using the same idea: at time T/2, we compute T/4 more steps and can prove them by time T; at time 3T/4, we do T/8 more steps and prove them by time T; and we keep continuing until we reach a base case where T/2^i is a constant. At that point we know the output y, and putting all these proofs together proves the overall computation. This results in a logarithmic-in-T number of proofs, and as a full argument it gives a log T factor increase in the complexity of the prover, the verifier, and the communication, roughly because the prover has to generate these log T proofs and the verifier has to check all of them, as well as make sure they fit together properly. If we instead started with a SNARK with some general overhead, say alpha times T instead of 2T, then the number of sub-protocols is roughly alpha times log T. What's nice is that if we start with quasi-linear overhead, this alpha is just a polylog T factor, so the resulting argument is still succinct.
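As a quick sketch of this recursive splitting, here is a small helper computing the segment lengths; the function name and parameters are mine, for illustration, with alpha standing for the SNARK's multiplicative proving overhead (alpha = 1 is the 2T SNARK of the example, and quasi-linear overhead means alpha is polylog T).

```python
def split_into_segments(T: int, alpha: float = 1.0, base: int = 1):
    # At each point, run the largest chunk k whose proof still finishes by
    # time T: k (compute) + alpha * k (prove) fills the remaining budget,
    # so k = remaining / (1 + alpha). Then recurse on what remains.
    segments, remaining = [], T
    while remaining > base:
        k = max(int(remaining / (1 + alpha)), 1)
        segments.append(k)
        remaining -= k
    if remaining > 0:
        segments.append(remaining)  # constant-size base case
    return segments

segs = split_into_segments(2 ** 20, alpha=1.0)
print(segs[:4])   # [524288, 262144, 131072, 65536]: T/2, T/4, T/8, ...
print(len(segs))  # 21 here; roughly alpha * log(T) segments in general
```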
So why doesn't this approach work in general? As I said before, it only works for small-space computations, because the intermediate statements we need to prove are of the form: starting from this configuration of the machine, running the machine for k steps results in that final configuration. The problem is that these statements grow with the space of the computation. For a T-time computation, the space might be as large as T, and the verifier would need to read these intermediate statements in order to verify the computation. So there's no hope of the verifier having polylog T complexity when it has to read statements of much larger size.

A first attempt to fix this is to just hash down the space. The intermediate statements, instead of including the space, include a hash, or digest, of the space, and the prover proves that there exist some initial space and final space consistent with these digests, such that running the machine from the initial space results in the final space. Here you should think of the initial and final space as the witness the prover uses in this sub-protocol. The problem is that we've just taken the issue from before and pushed it into the witness, which blows up the prover's complexity. So while this fixes the communication and verifier complexity of the resulting argument, the prover is still inefficient, and this is what we need to solve now.

How do we make the prover efficient? The main idea is to use updatable hash functions. With updatable hash functions, we keep the exact same statements, but the prover proves a different relation to convince the verifier of them. Specifically, it proves that there exists a sequence of updates such that the machine specifies these k updates for these k steps, and starting from the initial digest, applying the updates results in the final digest. This is what I mean by an updatable hash function. The key observation is that the witness the prover needs to compute for each sub-protocol now scales only with the number of steps, not with the space of the computation. So there's hope of computing these witnesses efficiently, and the question that remains is: can we compute them in parallel with the computation? For this, the tool we use is a new idea for Merkle trees: pipelined Merkle trees. Again, we have the same statements and the same proofs, but we use Merkle trees to compute the digests, since Merkle trees provide the local-update property we need for updatable hash functions. If we pipeline the updates as they come in, performing the accesses needed at one level of the Merkle tree at a time, then this satisfies the efficiency property we need. In other words, we can compute these updates completely in parallel with the computation.

So, putting it all together, what does this look like? We start with some initial digest, which you can think of as the hash of the empty space the prover might use. We run the computation for k_1 steps, where I'll say what k_1 is in a moment, and in parallel with these steps, we compute the updates, throwing them at a Merkle tree and pipelining them as efficiently as possible. This gives some overhead, but the key is that it's only an additive overhead; naively performing all of these updates sequentially would instead give a multiplicative polylogarithmic overhead over the original computation. By pipelining them, we're really leveraging the fact that this results in only a small overhead. Then, given the witness we've computed, we can prove the statement that M on input x, running for k_1 steps, goes from the initial digest to the next digest. And how do we choose the number of steps k_1? We ask: what is the largest number of steps we can compute and then prove, finishing by time T? As before, we recursively apply this idea: we pick the next number of steps we can compute and prove by time T, and keep recursing. And as I said before, if we start with a SNARK with quasi-linear overhead, the number of sub-protocols we need is only polylogarithmic in T, and we've guaranteed that everything finishes in parallel time that is essentially T, with only an additive polylog T overhead. The complexity of the resulting SPARK, in essentially all parameters of interest, is about m times that of the underlying SNARK, where m is the number of sub-protocols; and because m is polylogarithmic, this preserves the succinctness and efficiency properties we need.
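Here is a minimal sketch of the updatable-hash-function idea: a Merkle tree over the prover's memory where one write touches only the log n hashes on its leaf-to-root path. The code is my own illustration; it shows the local-update property, but not the level-by-level pipelining of concurrent updates described above, which is where the real work in the construction lies.

```python
import hashlib

def H(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

class MerkleTree:
    def __init__(self, leaves):
        # Standard heap layout: root at index 1, leaves at n .. 2n - 1.
        self.n = len(leaves)  # assumed to be a power of two
        self.nodes = [b""] * self.n + [H(leaf) for leaf in leaves]
        for i in range(self.n - 1, 0, -1):
            self.nodes[i] = H(self.nodes[2 * i], self.nodes[2 * i + 1])

    def digest(self) -> bytes:
        return self.nodes[1]

    def update(self, index: int, value: bytes):
        # Write `value` into one leaf and rehash only its root path, so a
        # single RAM write costs O(log n) hashes, independent of the space n.
        i = self.n + index
        self.nodes[i] = H(value)
        i //= 2
        while i >= 1:
            self.nodes[i] = H(self.nodes[2 * i], self.nodes[2 * i + 1])
            i //= 2

tree = MerkleTree([b"\x00"] * 8)   # digest of the (empty) initial space
d0 = tree.digest()
tree.update(3, b"new cell value")  # one step of the computation
print(d0 != tree.digest())         # True: digests chain update to update
```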
And briefly, for the argument-of-knowledge property: it's a little non-trivial, in that we have to compose the underlying extractors when proving our construction is an argument of knowledge, but we give the full details in the paper.

To conclude, let me recap what we've shown. SPARKs, succinct parallelizable arguments of knowledge, allow computing and proving RAM computations where the parallel time is essentially T and we only incur a polylogarithmic overhead in the number of processors we use. We get them via a generic transformation from succinct arguments with quasi-linear overhead to SPARKs, assuming only collision resistance. One reason these are so nice is that they give a way to take any function, apply a non-interactive SPARK, and get a verifiable version of that function, which in particular can be used to construct the first candidate memory-hard verifiable delay functions. And I want to leave you with a final thought: we view this work as a theoretical feasibility result for now. There are a lot of interesting open problems about how to implement it in a way that preserves all of these properties; working in the RAM model doesn't exactly translate to how computers actually work. I think this leaves a bunch of exciting open questions about how to take these nice theoretical results and make them practical, and hopefully get to the kind of ideal delegation protocol I mentioned earlier in the talk. So yeah, thank you for your time.