ProRAM is a new construction that efficiently handles random access arrays inside zero-knowledge proofs. ProRAM is specifically for interactive zero-knowledge protocols and achieves excellent asymptotic and concrete performance. In particular, for an oblivious transfer-based zero-knowledge protocol that I'll describe in this talk, ProRAM incurs only O(log n) oblivious transfers per access. In this work, we formally specify ProRAM and incorporate it as part of an efficient arithmetic zero-knowledge protocol. The resulting system supports arbitrary proof statements encoded as arithmetic circuits that have access to efficient random access arrays. We also implemented the resulting system and examined its concrete performance. Let's get started. Recall that a zero-knowledge proof is a type of proof system run between two parties, a prover and a verifier. Both parties agree on some public statement, which the prover claims to be true. As evidence of this fact, the prover holds a witness. However, the prover may wish to hide her witness from the verifier because it may contain sensitive information. An interactive zero-knowledge protocol allows the prover to convince the verifier that the statement must be true while keeping the witness hidden. That is, while the verifier becomes convinced that the statement is true, he learns no other information from the protocol interaction. Today, zero-knowledge proofs are widely studied, largely due to their relevance to blockchains and to post-quantum secure cryptography. The types of proofs needed in these scenarios are often short and use non-interactive zero-knowledge techniques. However, proofs of large statements are also interesting. Large proofs can be used to encode statements like "a given program has a bug" or "a private certified database stored in the cloud satisfies some property". When working with larger statements of this kind, it is interesting to consider completing the proof interaction as quickly as possible.
In this end-to-end wall clock proof time metric, interactive proof protocols remain an excellent choice. In this work, we consider these kinds of interactive proofs of large statements. Typically, zero-knowledge protocols allow for arbitrary statements where the statement is encoded as a simple Boolean or arithmetic circuit, and the witness is encoded as an assignment to the circuit's input wires. However, as we scale to larger and larger proofs, it becomes harder and harder to efficiently describe the statement as a circuit. Therefore, a recent line of work has pushed for zero-knowledge machines that, in essence, run a powerful CPU inside a zero-knowledge protocol. The advantage of this approach is that we can describe the proof statement as a program written in a high-level programming language like C. This allows users to easily describe extremely complex proof statements and has the added benefit that the engineer who designed such a system requires very little cryptographic training. Designing end-to-end proof systems of this kind is a large, multifaceted challenge. In this work, we consider an important problem motivated by such systems – efficient RAM access. More precisely, as the CPU runs the compiled program, it needs to read and write data from a large main memory. The CPU accesses main memory frequently, and so a slow main memory will greatly harm performance. In this work, we build a new and efficient construction that can implement a main memory. Informally, a suitable RAM must provide two properties. First, it must provide obliviousness in the sense that the verifier should learn nothing about the RAM access order from the sequences of accesses in the actual proof. Second, a suitable RAM must provide authenticity. Here, it is useful to consider a malicious prover who wishes to cheat in the proof interaction. Suppose that this malicious prover wishes to substitute some value looked up from the RAM by some other value. 
For instance, on the slide, I have indicated that the RAM slot holding the value 42 is to be read, but the prover might try to substitute in some different value, say 50. Note that the prover might try to substitute in some arbitrary value, as I have depicted, or she might try to substitute in some value from a different slot of RAM, say 21. A suitable RAM should defend against these types of attacks. We refer to a construction that satisfies both of these properties as a ZK ORAM because of this problem's connections to the field of oblivious RAM. In this work, we introduce ProRAM, a new zero-knowledge ORAM. ProRAM relies on only very simple primitives, namely oblivious transfers, and is efficient. In particular, each RAM access uses only 2 log n oblivious transfers. In the remainder of this talk, I'll dig into the details of ProRAM. At the end, we'll consider the concrete performance of the overall system. Let's start with some notation. Our proof protocol will support arbitrary arithmetic circuits with multiplication gates and addition gates. For concreteness, suppose that each circuit wire holds a value in a relatively large field, such as a field with around 2^40 elements or more. For each wire in the circuit, the two parties will hold a kind of shared encoding of the actual value on the wire. For instance, if the wire holds the arithmetic value little a, then the verifier holds a random value big a, while the prover holds the specific value big a plus little a times delta. Here, the value big a is local to this particular wire. Each wire in the circuit will have a different value of the kind big a. Delta, on the other hand, is global to the entire circuit. Each wire sharing is written in terms of delta. Delta is a random non-zero value. When the individual shares held by the two players are not relevant, I'll package both shares together into a single value written with this double bracket notation.
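The wire encoding just described can be sketched in a few lines of Python. This is only a local simulation under stated assumptions: the prime modulus `P`, the helper names, and the idea of one function handing out both shares are illustrative choices, not part of the talk. In the real protocol no single party ever sees both shares or delta.

```python
import secrets

# Illustrative prime modulus (assumption); the talk only asks for a large field.
P = 2**61 - 1

def rand_nonzero():
    """Sample a random non-zero field element, used here for the global delta."""
    return 1 + secrets.randbelow(P - 1)

def auth_share(a, delta):
    """Return (verifier_share, prover_share) for the wire value a."""
    big_a = secrets.randbelow(P)             # per-wire mask, held by the verifier
    return big_a, (big_a + a * delta) % P    # prover holds big_a + a * delta

def add_shares(x, y):
    """Addition gate: each party adds its own two shares locally."""
    return (x[0] + y[0]) % P, (x[1] + y[1]) % P

def opens_to(sharing, a, delta):
    """Test-only check that a sharing encodes a (it uses both shares and delta)."""
    verifier_share, prover_share = sharing
    return prover_share == (verifier_share + a * delta) % P

delta = rand_nonzero()          # global to the whole circuit
x = auth_share(5, delta)
y = auth_share(7, delta)
z = add_shares(x, y)            # a valid sharing of 5 + 7 under the same delta
```

Because delta is non-zero, the sharing `z` opens to 12 and to no other value, which is the property the rest of the talk relies on.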
We refer to these joint sharings as authenticated sharings. In the literature, such sharings are sometimes called information-theoretic MACs. I'll note a few important properties of authenticated sharings. First, we will always assume that the prover P knows in cleartext the value on each wire. Specifically, P knows the value little a. However, we will enforce that P will not know the value big a and also will not know the value delta. This is how we achieve authentication. It is infeasible for P, who knows a share of little a, to forge a share of some other value, say little a plus one. Forging such a value would require P to guess delta, and delta is a random value in a large field. Thus, P can guess delta only with negligible probability. Finally, notice that big a acts as a simple one-time pad on the authenticated value little a times delta. These one-time pads are central to ProRAM, and we will discuss them for the remainder of this talk. Our ZK proof system implements arithmetic circuits that propagate authenticated sharings, as shown on the slide. Recall that our goal is to implement a RAM that works with arithmetic circuits. Therefore, the RAM should produce as output and accept as input authenticated sharings. Before moving on, let me introduce one final piece of notation. For a sharing with mask big a, I will sometimes write big a in the subscript of the authenticated sharing. The RAM should output an authenticated sharing masked by some arbitrary value m. We're now ready to look at the RAM problem a bit more formally. Suppose that we simply store the RAM as a sequence of authenticated sharings. That is, each RAM slot is simply a distinct sharing authenticated by some mask. Suppose that at runtime, the arithmetic circuit requests access to a particular index of the RAM, say, index 3. First, note that the prover knows in cleartext that the circuit requests index 3. Indeed, the prover P has the entire RAM access history in her head.
Since this is the case, P can simply retrieve her share of index 3 directly. However, the problem is that in doing so, she retrieves a share with the particular mask m_3. We need the output of the RAM to be a valid authenticated sharing, but to form a valid sharing, both parties must agree on the specific mask m_3. But V cannot know the specific mask m_3. Otherwise, he would learn the access index and the RAM would not be oblivious. However, V does know all such masks m_i. We're now ready for the high-level intuition behind ProRAM. Note that this will just be high-level intuition, and I will leave several crucial problems on the table that must be solved. First, let's suppose that it's possible to describe the RAM access order as a permutation. With this restriction made, the idea is that the prover P will somehow request from the verifier each of the RAM masks in some permuted order. The idea is that the prover will request these masks in the order of RAM access, in essence aligning the order of the masks with the order of RAM access. Note that if the prover holds these masks, then she can read elements from the array one by one by stripping off the masks. As I said, this intuition leaves on the table several important problems. First, my description implied that P will learn each of these RAM masks. This is clearly insecure, since my entire argument for the authenticity of authenticated shares was based on the secret one-time-pad nature of the masks. Second, I assumed that the RAM access order can be described as a permutation, but this is clearly insufficient to describe arbitrary RAM access orders. In real RAMs, we assume that we can access the same slot repeatedly. A permutation cannot support this capability. The rest of this talk will be about resolving these two problems. Let's start with this first problem. Our high-level intuition will still hold. P will request from V a permuted collection of masks.
However, rather than obtaining these masks in cleartext, P will instead obtain encodings of the masks that hide their cleartext values. To encode masks, we will introduce to our system a second type of sharing. In addition to authenticated sharings, our system will also work with simple additive sharings. Here, as with the authenticated sharings, the parties hold a shared encoding of a field element, little a. However, these simple additive sharings do not involve the global secret delta. Additionally, for additive sharings, we will enforce that the prover P does not know the cleartext value, little a. This is in stark contrast to authenticated sharings, where we assume that P knows all encoded values in cleartext. One of the crucial insights behind ProRAM is that we can use these additive sharings to encode the masks for authenticated sharings. Very shortly, we'll use the ability to operate on additive sharings to express permutations. Since these new sharings are additive, they trivially support homomorphic addition and subtraction operations. To this list, we'll add one further operation. This operation allows P to scale a shared field element b by some bit a of her choice. I won't describe the procedure for this operation in detail here, but it can be easily implemented by a single oblivious transfer. Have a look at our paper if you want further details. These three operations together induce a vector space over additive sharings, where private bits chosen by the prover act as scalars. A vector space is not sufficient to encode arbitrary computations on the field elements, but at least informally, it is sufficient to arbitrarily move field elements around inside the circuit. More specifically, the vector space operations are powerful enough to build permutations, as we will see next. Let's return to our high-level intuition. We will substitute each plaintext mask by an additive sharing that instead encodes that mask.
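To make the three operations on additive sharings concrete, here is a minimal Python sketch. Several things in it are assumptions rather than details from the talk: the modulus `P`, the convention that the encoded value is the prover's share minus the verifier's share, the helper names, and the particular way the scaling step uses one oblivious transfer. The `ot` function is a plain local stand-in that only models what the receiver learns; it is not a secure oblivious transfer.

```python
import secrets

P = 2**61 - 1  # illustrative prime modulus (assumption)

def share(value):
    """Additive sharing as (verifier_share, prover_share); value = prover - verifier."""
    v = secrets.randbelow(P)
    return v, (v + value) % P

def reveal(x):
    """Test-only: recombine both shares."""
    return (x[1] - x[0]) % P

def add(x, y):
    """Homomorphic addition: each party adds its own shares."""
    return (x[0] + y[0]) % P, (x[1] + y[1]) % P

def sub(x, y):
    """Homomorphic subtraction: each party subtracts its own shares."""
    return (x[0] - y[0]) % P, (x[1] - y[1]) % P

def ot(m0, m1, choice):
    """Stand-in for 1-out-of-2 oblivious transfer: receiver gets m0 or m1 only."""
    return m1 if choice else m0

def scale_by_bit(bit, x):
    """Sharing of bit * x using a single OT; the prover chooses bit."""
    v_in, p_in = x
    v_out = secrets.randbelow(P)                    # verifier's fresh output share
    received = ot(v_out, (v_out - v_in) % P, bit)   # verifier is the OT sender
    p_out = (received + bit * p_in) % P             # prover finishes locally
    return v_out, p_out

x, y = share(9), share(4)
```

With `bit = 0` the prover's output share equals the verifier's, so the result encodes zero; with `bit = 1` the difference of the output shares equals the difference of the input shares, so the result encodes the same value as `x`.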
From here, we will permute the masks according to a permutation pi chosen by the prover. Let's briefly discuss how we can implement such a permutation. The idea is to use a classic construction called a permutation network. A permutation network is a recursive construction, whereby we build a permutation on n elements using two permutations on n over two elements. In addition to the two recursive calls, the network uses a linear number of so-called swap gates. A swap gate either swaps its two inputs or leaves them as is. It's easy to implement a swap gate that is controlled by the prover using our vector space operations. Specifically, we'll calculate the scaled difference between the two inputs. Then we will add and subtract appropriately to construct the gate outputs. Let's observe that this operation is correct. If P's choice bit s happens to be 0, then the scaled difference is also 0, and therefore the inputs are propagated directly to the outputs. If instead P's choice bit s is 1, then the scaled difference goes through and swaps the two inputs. Hence, the prover can control a single swap gate by a single scaling operation, and therefore a swap gate uses a single oblivious transfer. We can use this capability to implement an arbitrarily large permutation network. Our permutation networks will take as inputs shared masks m_i, and then output them in a permuted order chosen by the prover. The entire permutation will consume only n log n oblivious transfers. If we return to our top-level view, we can now understand more formally our high-level intuition. The verifier V starts by constructing additive sharings of each of his masks. The parties then apply a permutation network to these encoded masks, generating a permuted set of encoded masks. Crucially, throughout this entire process, the prover does not learn any mask m_i. Let's now look at how this permutation supports RAM accesses. Consider the first RAM access in this example, which will be to the index 3.
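As a brief aside before continuing with the example, the swap gate described above can be sketched in Python. The sharing convention (value = prover share minus verifier share), the modulus, the helper names, and the OT stand-in are assumptions made for illustration; only the "scaled difference, then add and subtract" structure comes from the talk.

```python
import secrets

P = 2**61 - 1   # illustrative prime modulus (assumption)
ot_calls = 0    # counts simulated oblivious transfers

def share(value):
    """Additive sharing as (verifier_share, prover_share)."""
    v = secrets.randbelow(P)
    return v, (v + value) % P

def reveal(x):
    """Test-only: recombine both shares."""
    return (x[1] - x[0]) % P

def add(x, y):
    return (x[0] + y[0]) % P, (x[1] + y[1]) % P

def sub(x, y):
    return (x[0] - y[0]) % P, (x[1] - y[1]) % P

def scale_by_bit(bit, x):
    """Sharing of bit * x; the if/else models the one OT message the prover gets."""
    global ot_calls
    ot_calls += 1
    v_out = secrets.randbelow(P)
    received = (v_out - x[0]) % P if bit else v_out
    return v_out, (received + bit * x[1]) % P

def swap_gate(s, x, y):
    """Swap x and y when the prover's bit s is 1, pass them through when s is 0."""
    d = scale_by_bit(s, sub(x, y))   # s * (x - y), the only step that needs an OT
    return sub(x, d), add(y, d)      # (x - d, y + d)

a, b = share(11), share(22)
kept = swap_gate(0, a, b)      # encodes (11, 22)
swapped = swap_gate(1, a, b)   # encodes (22, 11)
```

Each call to `swap_gate` performs exactly one simulated transfer, which is why the cost of a permutation network is counted in swap gates.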
The parties hold these two relevant sharings. If we unpack the definition of these two sharings, we can see that the parties have enough information to locally construct an authenticated sharing of the appropriate RAM value. Specifically, P subtracts her share of the encoded mask from her share of the encoded RAM value. Meanwhile, the verifier chooses as his input negative r. Some simple arithmetic demonstrates that this is a valid sharing of the appropriate RAM value x3. Notice that this has solved our original problem. The parties have managed to agree on a specific mask negative r without leaking a mask to the prover. Thus, we have solved this first problem by incorporating two different types of sharings in the same circuit. However, one high-level problem from our intuition remains. As described, our RAM still only supports access orders that can be expressed as permutations. To support full RAM, we need to upgrade the RAM to support arbitrary access orders that cannot be described as permutations. To implement general-purpose RAM, we will perform a reduction to a simpler object that we call a sword RAM. A sword RAM allows random access reads, but only allows each RAM slot to be read at most one time. Meanwhile, the sword RAM only allows sequential writes, that is, the parties can only append values to the end of a sword RAM. Since we write sequentially, on each write, the parties can trivially agree on masks for stored authenticated sharings. And since we read each element at most once, a permutation suffices to describe the read order. Thus, we have all the tools we need already to implement a sword RAM. I'll now show the reduction from RAM to sword RAM. For sake of example, suppose we wish to instantiate a size 4 RAM. For our reduction, we will need a sword RAM that is twice as large as the desired RAM, so I've drawn a size 8 sword RAM on the slide. Recall that the sword RAM supports sequential writes. 
I'll indicate the next location to be written with this arrow. The parties start by sequentially writing the initial content of the RAM into the sword RAM. The numbers on the slide indicate the RAM index that is stored in the sword RAM slot, so sword RAM slot 0 holds RAM slot 0. Recall that P holds in her head the entire RAM access history. P uses this information to program a permutation network, as we've already described. And this permutation network will allow her to read from the sword RAM in some permuted order. For sake of example, suppose that the first RAM access indexes into RAM slot 2. P uses the output of the permutation network to read the appropriate element from RAM, and the parties can thus construct an authenticated sharing of the appropriate RAM element x2. Note that the accessed sword RAM slot is now essentially burned because it can never be read again. This is problematic because it might be the case that RAM slot 2 will be read again in the future. To account for this fact, the parties write back a new version of index 2 to the sword RAM. This is also where the parties would account for writes to the RAM. If the next access is indeed also to index 2, we will be able to support that access as well. The parties continue in this manner, on each access reading a slot from sword RAM and writing it back to a fresh slot until n accesses are completed. Notice that after n accesses, the sword RAM is completely full and we cannot write any more values. However, also notice that there remain n available reads and that those reads contain the most recent copy of each RAM index. To continue, we use our remaining reads to extract the RAM content from our now exhausted sword RAM and then rewrite these values into a fresh sword RAM one at a time. Once we have copied the RAM content in this way, we are ready to handle n more accesses. By repeating this process, our RAM can support arbitrary numbers of accesses. So, by using sword RAM, we can construct an efficient RAM.
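The bookkeeping of this reduction can be sketched in plain Python, with no cryptography at all. The class and method names are hypothetical, and the `where` list stands in for the prover's knowledge of which sword RAM slot holds the latest copy of each index. The sketch only checks the counting argument: a sword RAM of size 2n supports n accesses, and the n unread slots then hold the current RAM content.

```python
class SwordRAM:
    """Append-only storage in which each slot may be read at most once."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = []
        self.read = set()

    def append(self, value):
        assert len(self.slots) < self.capacity, "sword RAM is full"
        self.slots.append(value)
        return len(self.slots) - 1       # position of the slot just written

    def read_once(self, slot):
        assert slot not in self.read, "slot was already read"
        self.read.add(slot)
        return self.slots[slot]


class RAM:
    """Size-n RAM built from size-2n sword RAMs, refreshed every n accesses."""

    def __init__(self, initial):
        self.n = len(initial)
        self._load(initial)

    def _load(self, values):
        self.sword = SwordRAM(2 * self.n)
        self.where = [self.sword.append(v) for v in values]  # latest slot per index
        self.accesses = 0

    def _refresh(self):
        # Spend the n remaining reads on the latest copy of every index.
        latest = [self.sword.read_once(slot) for slot in self.where]
        self._load(latest)

    def access(self, index, new_value=None):
        if self.accesses == self.n:
            self._refresh()
        old = self.sword.read_once(self.where[index])
        written = old if new_value is None else new_value
        self.where[index] = self.sword.append(written)       # write back a fresh copy
        self.accesses += 1
        return old


ram = RAM([10, 20, 30, 40])
first_reads = [ram.access(2) for _ in range(3)]   # the same index, read repeatedly
before_write = ram.access(2, new_value=99)        # fourth access fills the sword RAM
after_write = ram.access(2)                       # triggers a refresh, then reads
```

Every sword RAM sees exactly 2n reads in total: n during normal accesses and n during the refresh, which matches a read order that is a permutation on 2n slots.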
There is still one issue I have until now swept under the rug. Suppose that the arithmetic circuit generates a query into the array. Specifically, it searches for some specific RAM index, say index 2. As is, there is no way to force the prover to read the corresponding index from RAM. In some sense, the prover has total freedom to select whichever RAM slot she would like. This is, of course, problematic. The solution to this problem is actually quite simple. Rather than just storing each RAM element x_i alone, we store each RAM element as a pair containing both x_i and an explicit index i. We are careful to ensure that all operations permute the RAM element and its index as a unit. Now, when the circuit requests index 2, we can force P to read the appropriate RAM slot. We do this by forcing P to generate a so-called proof of zero. Namely, she will subtract the accessed RAM index from the RAM index that was stored in the RAM itself, generating an encoding of zero. The prover can then send her share of this encoding to the verifier as evidence that she accessed the RAM slot. The verifier simply checks that the share sent by the prover is equal to his own share. Notice that if the prover attempts to cheat and substitute in some different RAM value, she will be unable to construct this proof of zero. The proof of zero is an extremely cheap operation. We can even compress many proofs of zero together using a cryptographic hash function. Thus, ProRAM allows us to force P to provide a consistent RAM access order in a very straightforward manner. The prover simply generates a number of these proofs of zero. To recap, ProRAM builds a size n RAM from a size 2n sword RAM. Each time we initialize a size 2n sword RAM, the prover programs a permutation on 2n masks. This means that on each access, we consume amortized 2 log n oblivious transfers. Notice that the number of oblivious transfers is independent of the size of elements stored in the RAM.
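The proof of zero mentioned above can be sketched as follows. This is a local simulation under stated assumptions: the modulus, the helper names, the 8-byte encoding, and the choice of SHA-256 for batching are illustrative and not taken from the talk. An authenticated sharing of zero has identical shares on both sides, so the verifier only needs to compare what the prover sends against his own share.

```python
import hashlib
import secrets

P = 2**61 - 1  # illustrative prime modulus (assumption)

def auth_share(a, delta):
    """Authenticated sharing as (verifier_share, prover_share)."""
    mask = secrets.randbelow(P)
    return mask, (mask + a * delta) % P

def sub(x, y):
    """Each party subtracts its own shares locally."""
    return (x[0] - y[0]) % P, (x[1] - y[1]) % P

def proof_of_zero(sharing):
    """Prover sends her share; verifier accepts if and only if it equals his own."""
    verifier_share, prover_share = sharing
    return prover_share == verifier_share

def digest(shares):
    """Hash a list of field elements, each encoded as 8 big-endian bytes."""
    h = hashlib.sha256()
    for s in shares:
        h.update(s.to_bytes(8, "big"))
    return h.digest()

def batched_proof_of_zero(sharings):
    """Compress many proofs of zero: the prover sends one hash of all her shares."""
    prover_message = digest([p for _, p in sharings])
    return prover_message == digest([v for v, _ in sharings])

delta = 1 + secrets.randbelow(P - 1)          # non-zero global key
requested = auth_share(2, delta)              # index the circuit asks for
honest = sub(auth_share(2, delta), requested)     # stored index 2: encodes zero
cheating = sub(auth_share(3, delta), requested)   # stored index 3: encodes one
```

In the cheating case the prover's share differs from the verifier's by delta, so passing the check would require her to guess delta.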
Additionally, ProRAM features extremely straightforward consistency checks. This is in contrast to other ZK ORAMs that require a much more complex consistency check involving comparison operations on both RAM indices and on timestamp values. To conclude, I want to briefly look at ProRAM's performance. We implemented ProRAM in C++ and ran it on commodity hardware. In terms of communication, our implementation confirms ProRAM's low logarithmic scaling. And in terms of wall clock time, ProRAM gives excellent performance, requiring only a few microseconds per access, even for moderate RAMs that store megabytes of data. So this was ProRAM. To recap, ProRAM is a simple and fast zero-knowledge oblivious RAM with excellent asymptotic and concrete performance.