Hi, my name is David Heath. In this talk, I'll be presenting stacked garbling for disjunctive zero-knowledge proofs. This is research done by myself and my advisor, Vlad Kolesnikov, and the two of us work at the Georgia Institute of Technology in Atlanta, Georgia. Our work is in a paradigm that uses garbled circuits to construct zero-knowledge proofs. This paradigm was established by Jawurek, Kerschbaum, and Orlandi (JKO) in 2013, and they showed that there's an elegant protocol that uses garbled circuits to build zero-knowledge proofs. The interesting property of this protocol is that the computation and communication costs scale linearly in the size of the proof statement. This means that compared to many modern proof systems, this garbled circuit technique is extremely efficient even when faced with very large proof statements. More modern systems tend to focus on small proof size and fast verification time, and these properties often come at the cost of superlinear prover scaling. So for this reason, garbled circuits for zero knowledge remain a very interesting technique even today. In our work, we extend this paradigm by reducing the communication cost, which is the primary cost of the JKO paradigm. We reduce the communication cost if the proof statement itself contains disjunctive statements. In other words, if the proof, encoded as a program, contains conditional logic, then we reduce communication, and in particular, we reduce communication by up to the branching factor of the program. So if the program contains 10 branches, our communication improvement can be up to 10 times. Next, I'd like to motivate this technique a bit. The question perhaps is, why do we care about improving the performance of large proof statements that have conditional logic? What I'd like to show is one interesting use case for this technique that we discuss at length in our paper, which is about proving the existence of a bug in a code base.
So to motivate this, let's assume that we have two players, Alice and Bob, and Alice has a collection of code. Perhaps Alice is some large corporation, and Bob is an outside observer who takes a look at this public code base and notices that there is a bug in it. So perhaps Bob would like Alice to know that there is a bug in her system. Perhaps this means that he can exploit her system somehow. He'd like to tell her that this bug exists, perhaps to gain leverage over Alice, but he doesn't want to disclose where the bug is in her code base, because that would allow her to immediately fix the bug, which would cause Bob to lose his leverage over Alice. So the idea here is that we can run Alice's program as part of a zero-knowledge protocol where the proof statement is basically an execution of the program that says some undefined behavior was exercised. Perhaps an array was accessed outside of its bounds, or some other problem was exercised. The point here is that Alice's code base is arbitrarily large. So if we want to build a system which can scale to very large code bases, we need a proof system which scales elegantly in the size of the proof statement. This is why we want to use this garbled circuit paradigm. In addition, normal code tends to have lots of conditional logic. So this is why we're interested in this notion of disjunctive proofs. In reality, the bug that Bob is going to demonstrate will only exist on one execution path. We would like the cost of constructing a proof to scale with the size of an individual execution path, and not with the size of the entire program. This is the improvement that we achieve in our work. What I'll be showing for the remainder of this talk is how this is done. How do we decrease this communication cost in the case that the proof statement has conditional logic in it?
Now to begin with, I'd like to start with the JKO protocol and therefore also with a bit of background on garbled circuits, just at a very high level, because we're going to extend the protocol of JKO directly. So just to remind ourselves how garbled circuits work and to explain how JKO uses garbled circuits to achieve zero knowledge, let's start with this simple circuit on the slide. In the JKO paradigm, the garbled circuit generator is the proof verifier. The garbled circuit generator's core purpose is to construct an encryption of this circuit that will be sent to the prover. Then the prover will run this circuit under encryption. The key idea here is that only if the prover has a valid input, a satisfying input to this proof statement, will the prover be able to construct an output label encoding a logical one, and hence successfully getting the garbled circuit to output a valid label for one constitutes a zero-knowledge proof. So let's look at that idea in a bit more detail. In garbled circuits, first, the verifier chooses keys, or labels, for all of the wires: two per wire, where one label encodes a logical zero and the other encodes a logical one. Here, I've depicted encryptions of zero as red keys and encryptions of one as green keys. Alice's job, as the verifier, is to take these keys and then connect them according to the semantics of the circuit. In particular, for each gate, she considers the function of that gate and constructs an encrypted truth table corresponding to its logic. The idea here is that the prover, later on, when he gets only one key per wire, will only be able to unlock one of these rows and so can only get an output key for a particular gate if it corresponds to his input keys. Therefore, by induction, he can only get a green output key, an output encoding a logical one, if he indeed has something that satisfies the proof statement. So this is how Alice encrypts the circuit, and she does this for every gate.
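As a rough sketch of the encrypted truth table idea just described, here is a toy garbling of a single AND gate. Everything in it is an illustrative assumption rather than the construction from JKO or from our paper: SHA-256 stands in for the row encryption, an all-zero tag is how the evaluator recognizes the one row he can unlock, and the names `garble_and_gate` and `evaluate_gate` are hypothetical.

```python
import hashlib
import secrets

LABEL_LEN = 16          # bytes per wire label (assumed size)
TAG = bytes(16)         # all-zero tag marking a correctly decrypted row


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def row_pad(key_a: bytes, key_b: bytes) -> bytes:
    # Toy "encryption": a 32-byte pad derived from the two input labels.
    return hashlib.sha256(key_a + key_b).digest()


def garble_and_gate():
    """Verifier side: pick two labels per wire and encrypt the AND truth table."""
    labels = {
        wire: (secrets.token_bytes(LABEL_LEN), secrets.token_bytes(LABEL_LEN))
        for wire in ("a", "b", "out")
    }
    table = []
    for bit_a in (0, 1):
        for bit_b in (0, 1):
            out_label = labels["out"][bit_a & bit_b]
            pad = row_pad(labels["a"][bit_a], labels["b"][bit_b])
            table.append(xor_bytes(pad, out_label + TAG))
    # Shuffle so that row position does not reveal the input bits.
    secrets.SystemRandom().shuffle(table)
    return labels, table


def evaluate_gate(table, key_a: bytes, key_b: bytes) -> bytes:
    """Prover side: with one label per input wire, exactly one row unlocks."""
    pad = row_pad(key_a, key_b)
    for row in table:
        plain = xor_bytes(pad, row)
        if plain[LABEL_LEN:] == TAG:
            return plain[:LABEL_LEN]
    raise ValueError("no row decrypted with these labels")


labels, table = garble_and_gate()
green_out = evaluate_gate(table, labels["a"][1], labels["b"][1])
```

In this toy, the prover obtains the output label for one only when both of his input labels encode one, which is the single-gate version of the induction argument above.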
This is really just garbled circuits that I'm showing you at the moment. The key point is that in the garbled circuit protocol, Alice constructs these collections of encrypted truth tables, which we refer to as the material. She takes this material and she has to send it over a network to Bob. This sending of material is the most expensive part of garbled circuits, and therefore the most expensive part of this JKO paradigm of using garbled circuits to perform zero knowledge. So decreasing this communication, this sending of material, is what our work is really about. Again, the idea is that we will need to send less information over the wire if the program contains conditional branching. So now let's have a look at what Bob, the prover, does once he gets this circuit encryption. He gets some input keys to the circuit. This is achieved by using oblivious transfer. So now he has his input encoded in this garbled circuit construction, and he has all of these tables. The idea is that he can just step through this circuit gate by gate, decrypting and eventually producing a green output key. Now, in order to turn this garbled circuit into a zero-knowledge proof, all we need to do is have Bob send this green key to Alice. Again, the key idea here is that the properties of the garbled circuit ensure that the only way that Bob can get one of these green keys is to follow the protocol and have a satisfying input to the circuit. Now, what I have shown you so far is truly only honest-verifier zero knowledge, so there is nothing preventing Alice from constructing an invalid proof challenge. JKO go on to show how to enhance this protocol to also achieve malicious-verifier zero knowledge. However, that's outside of the scope of this talk. Our paper also works in this malicious-verifier setting, but I want to skim over those details for now for the sake of time.
So now I've shown you how JKO established this paradigm of using garbled circuits for zero knowledge. Next I'd like to explain why this JKO solution from previous work is not quite satisfying in this use case of zero-knowledge proofs of bugs. First, I'd like to say a bit more about exactly what we mean by proving that there exists a bug in the program. The statement that we encode in our garbled circuit will be of the form "your program exercises some unintended behavior on some input." Some examples, as I mentioned earlier, are things like accessing arrays out of bounds, division by zero, triggering some debug assertion in the code base, or any other problem that you can imagine that can be directly encoded as a circuit. So in other words, what we're doing is compiling the verifier's program into a circuit instrumented to output a logical one if some unintended behavior occurs. Now that we've defined this problem more formally, let's talk about what the issue is with trying to just throw JKO at it and where things start to become very inefficient. Again, returning to our problem, Alice the verifier has this collection of code and Bob knows a bug in it. So one simple way we can think about representing this problem, just to start getting a handle on what's going on, is to say, well, let's let Alice break her code base up into a number of snippets, each of which is separately representable as a different circuit. Then Alice and Bob will perform a proof that says indeed one of these circuits, it is undisclosed which one, exercises a bug in the code base. Now the problem here is that as Alice's code base grows, that is, as the number of code paths through her program grows, the cost of the JKO paradigm grows as well. Again, the reason for this is that the amount of material that Alice has to send to Bob is proportional to the size of the overall circuit.
This is where our work comes in. Again, it's about reducing communication for programs with conditional branching. Now traditionally, it has been assumed that the garbled circuit generator has to send material for every branch separately. After all, we have to make sure that the verifier can't tell which program branch is actually being executed. So the natural way to make sure that this is the case is to have the verifier send enough material for all branches, and therefore she cannot distinguish which one is actually being proved. But our work breaks this assumption. We can still keep the same property. It is still the case that Alice will not be able to tell which branch is actually executed under Bob's input, but at the same time we only have to send enough material for the single longest branch out of any number of disjuncts. Now, I'd like to change our problem slightly. In our original problem where we looked at JKO, we were considering only one circuit. Now, because we're talking about conditional branching and disjunctive proofs, we're going to consider more than one. For the purposes of these slides, we'll just look at two circuits. The idea here is that because Bob is the prover, he has all of the inputs to this program and therefore he knows which of these branches is actually executed in a run of the program. On the other hand, Alice the verifier does not know and should not learn which of these branches is actually taken. I'll start by discussing two of the key ideas which are the backbone of our approach. The first is about viewing these collections of encrypted truth tables, this material, as just a string. In particular, we can manipulate these as strings and, as we'll see, we can take two collections of material and add them together using bitwise XOR. So this is the first high-level idea of our approach. The second is to view a circuit encryption in garbled circuits as the expansion of a pseudorandom seed.
So in particular, we will allow Alice the verifier to start from a small seed and use this as the source of all of the randomness that she uses to encrypt her circuit. The idea is that by doing so, a seed becomes a compact representation of the circuit encryption which can be sent directly from Alice to Bob. Now in general, it is not secure for Alice to send a seed to Bob, because given the seed, Bob is able to derive all of the keys and therefore can forge a proof. However, what I will show you shortly, and what we prove in our paper, is that in certain cases, and in particular if a seed is only used to generate a branch which is not taken, then it is secure to send one of these seeds directly from Alice to Bob. Let's return to our two circuits. Again, Alice knows that there are two circuits but she does not know which of them is supposed to be executed; only Bob has that information. So what we will do is have Alice generate both of these circuits starting from pseudorandom seeds. She'll use some seed one to generate this first circuit and some seed two to generate the second circuit. What I'm indicating by this blue arrow on this slide is that the first circuit is the one that Bob knows is going to be executed. So now Alice has all of this material, and what we'll do at this point is use bitwise XOR to, as we say, stack this material together. We have these two strings of material M1 and M2 and we construct this aggregate M1 plus M2. The idea here is that now Alice need only send this sum. The key point is that sending this sum is cheaper than sending M1 and M2 separately. So Alice sends this sum M1 plus M2, and then what we need is for Bob to obtain one of these seeds from Alice, in particular to obtain the seed for the branch which is not taken, and we'll see why shortly.
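The stacking step just described can be sketched roughly as follows. This is a toy under stated assumptions, not our implementation: SHAKE-256 stands in for the whole seed-to-material garbling procedure, the shorter string is zero-padded to the length of the longer one, the byte sizes are made up, and the names `expand_seed` and `stack` are hypothetical.

```python
import hashlib
import secrets


def expand_seed(seed: bytes, n_bytes: int) -> bytes:
    # Stand-in for garbling a branch: deterministically expand a short seed
    # into n_bytes of "material". Real garbling derives wire labels from the
    # seed and then builds the encrypted truth tables from those labels.
    return hashlib.shake_256(seed).digest(n_bytes)


def stack(m1: bytes, m2: bytes) -> bytes:
    # Bitwise XOR of two material strings, zero-padding the shorter one
    # so that the result is as long as the longer branch.
    width = max(len(m1), len(m2))
    padded1 = m1.ljust(width, b"\x00")
    padded2 = m2.ljust(width, b"\x00")
    return bytes(x ^ y for x, y in zip(padded1, padded2))


seed1 = secrets.token_bytes(16)
seed2 = secrets.token_bytes(16)
m1 = expand_seed(seed1, 4000)   # material for circuit one (assumed size)
m2 = expand_seed(seed2, 2500)   # material for circuit two (assumed size)

stacked = stack(m1, m2)         # the only material string Alice sends
```

Here the stacked string has the length of the longer branch, so it is shorter than M1 and M2 sent one after the other, and re-running `expand_seed` on the same seed reproduces the same material.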
To do so, the two players will use oblivious transfer, where Alice will be the OT sender and she'll put both of her seeds into the oblivious transfer, and the prover will put in a selection which says that in this particular case, he needs seed two. In general, if there are many, many circuits, the players will engage in multiple oblivious transfers and the prover will select seeds corresponding to every branch which is not taken in the course of the proof. As a result of this oblivious transfer, the prover receives seed two, and the idea is that from here, Bob can copy the actions of Alice in constructing an encryption of circuit two. He can do this, of course, because he has the seed, the compact representation of the encryption. So he uses the seed to expand an encryption and therefore obtains M2, which he can now unstack from the sum to obtain M1. The key point here is that although Bob was able to obtain this material M1, he never saw the seed corresponding to circuit one, and therefore he is unable to forge a proof in this first circuit. So now Bob simply takes this material M1, applies it to the first circuit, and steps through it as normal in order to obtain a green output key, completing the proof. So this was stacked garbling for zero knowledge. The key ideas are about viewing material as random-looking strings that can be stacked together with bitwise XOR and, in addition, viewing material as the expansion of a pseudorandom seed. By using these two ideas together, the generator, the verifier, can efficiently send circuit material by stacking together the material from multiple branches and then obliviously transferring seeds. I claimed at the beginning of this talk that this gives us a communication improvement proportional to the number of branches. Of course, the reason is that we're sending only enough material for one branch rather than B branches. So I'd like to show what kind of improvement that yields. Here are some results from our paper.
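To make the unstacking step and the communication claim concrete, here is a toy sketch of the prover's side with several branches. It rests on loud assumptions: SHAKE-256 stands in for garbling from a seed, the oblivious transfers are modeled by simply handing the prover the seeds of the untaken branches, the branch sizes are invented, and the names `expand_seed`, `xor_pad`, and `unstack` are hypothetical.

```python
import hashlib
import secrets
from functools import reduce


def expand_seed(seed: bytes, n_bytes: int) -> bytes:
    # Stand-in for garbling a branch from a seed (assumption, not real garbling).
    return hashlib.shake_256(seed).digest(n_bytes)


def xor_pad(a: bytes, b: bytes) -> bytes:
    # XOR two strings after zero-padding both to the longer length.
    width = max(len(a), len(b))
    return bytes(x ^ y for x, y in zip(a.ljust(width, b"\x00"),
                                       b.ljust(width, b"\x00")))


def unstack(stacked: bytes, other_branches, taken_size: int) -> bytes:
    # Prover side: regenerate the material of every untaken branch from its
    # seed, XOR it out of the stack, and keep the taken branch's prefix.
    for seed, size in other_branches:
        stacked = xor_pad(stacked, expand_seed(seed, size))
    return stacked[:taken_size]


# Verifier side: one seed per branch, then stack all of the material.
branch_sizes = [3000, 5000, 4200, 1000]            # bytes per branch (made up)
seeds = [secrets.token_bytes(16) for _ in branch_sizes]
materials = [expand_seed(s, n) for s, n in zip(seeds, branch_sizes)]
stacked = reduce(xor_pad, materials)

# Prover side: branch 2 is taken, so he holds every seed except seeds[2].
taken = 2
others = [(seeds[i], branch_sizes[i])
          for i in range(len(seeds)) if i != taken]
recovered = unstack(stacked, others, branch_sizes[taken])

naive_bytes = sum(branch_sizes)    # material sent branch by branch
stacked_bytes = len(stacked)       # material sent when stacked
```

In this toy, the prover recovers exactly the material of the taken branch, and the material sent drops from the sum of the branch sizes to the size of the longest branch. The seeds transferred by OT add a small cost per branch that is not counted here.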
In this chart we are comparing the prior work of JKO with our work. Again, for this setting of extremely large proofs, JKO was the best prior way to construct proofs very efficiently. If the cost you are interested in optimizing is wall clock time, the amount of time from the start of the proof to the verifier being convinced that there is a proof, then JKO was the prior state of the art in this setting of large proofs. As you can see here, our work very clearly improves over JKO in the case of conditional branching. What I have plotted here along the X axis is the branching factor, where there are up to 15 branches. The contents of these branches, for the purposes of these experiments, are randomly generated circuits. On the Y axis you can see the wall clock time it takes for the prover and verifier to complete a proof from beginning to end. What you can very clearly see is that our communication improvement improves the total wall clock time by a proportion related to the number of branches. So to relate this back to the zero-knowledge proofs of bugs example, again we had Alice, who had this corpus of code split up into all these snippets. Now the idea is that we don't, in some sense, care how many snippets there are, because our approach will efficiently stack together all of these snippets, and thus our communication cost is proportional to only one snippet. So this was stacked garbling for zero knowledge. Again, our key contribution is a communication improvement in the case of disjunctive zero-knowledge proofs, which arise from conditional branching in the underlying program. I also showed you one exciting use case for this improvement, which is to prove the existence of a bug in a code base.
To give you one more piece of evidence that our approach is efficient, we performed an experiment where we simulated 1,000 snippets of code, each with between 30 and 50 lines, to simulate a medium-sized code base with tens of thousands of lines of code. This proof was completed in around seven seconds on a wide area network. So this was stacked garbling for zero knowledge. Thank you.