Hi, I'm Ariel Hamlin from Northeastern University. Today, I will be talking about two-server distributed ORAM with sublinear computation and constant rounds. This is joint work with Mayank Varia from Boston University. The distributed ORAM model, or DORAM, was first introduced by Ostrovsky and Shoup in 1997 and formally defined by Lu and Ostrovsky in 2013. It was originally introduced as a variant of traditional ORAM that sought to get around the lower-bound results in the single-server model. In this setting, multiple non-colluding servers act as the ORAM server. The client interacts with each of them in turn and is able to recover records while maintaining obliviousness. The servers may hold the same data or individual databases, depending on the scheme. Many papers consider the DORAM setting, but in this talk we will specifically discuss the secure computation variant, in which the servers work together to emulate the client while maintaining non-collusion and obliviousness. As first noted by Lu and Ostrovsky, multi-server DORAM schemes are especially adaptable to secure computation and highly useful when one wishes to run a program in the RAM model under secure computation. In this setting, network latency makes every round of communication costly, which motivates the need for constant rounds. We also wish to limit the work the servers perform to be sublinear in the data size, as schemes otherwise scale poorly to large datasets. There are many schemes in this setting; see the paper for a full comparison, but in this limited table we examine previous work that achieved either constant rounds or sublinear local computation. Gordon, Katz, and Wang, for example, obtain constant rounds but linear server computation. Jarecki and Wei achieve the best overall parameters, but do not obtain constant rounds and sublinear work simultaneously.
We are the first work to obtain sublinear local computation and constant rounds simultaneously. We present two constructions. The first is an adaptation of square-root ORAM to the secure computation DORAM setting, obtaining roughly square root of n times log n local computation. The second extends the first in such a way that reads are no longer indistinguishable from writes, obtaining n^ε local computation for any ε > 0; we do so by adapting techniques from doubly efficient private information retrieval. I will next discuss our sublinear DORAM construction, first by reviewing the traditional square-root ORAM construction and then by showing how we adapt it to our secure computation setting. After that, I will explain how we extend it to the unlimited-reads setting. Square-root ORAM was first introduced by Goldreich and Ostrovsky in 1996 and is in the standard single-client, single-server model. In their construction, the server stores two separate data structures. The first is a copy of the database called the store: a permuted, read-only data structure where the client holds the key to the permutation. The second is an updatable stash. When the client wishes to make a read, the server first scans the stash to see if the record has previously been read or written. The next step is to read a single location in the store; the exact element read depends on whether the record was present in the stash in the previous step. If it was, a dummy record is read; if not, the record itself is read from the store. The invariant maintained is that each location in the store is read only once. Finally, the element is written back to the stash at the next open location, and this happens after every access, whether it is a read or a write. For a write, the first two steps are similar.
The difference is that a dummy element is always read from the store, and the element written to the stash in the final step is the input element. When the stash becomes full, at the end of an epoch, the stash is reshuffled back into the store, removing any duplicated elements. Each access in square-root ORAM takes time proportional to the square root of the database size and is constant rounds. This is a good starting point for our own scheme, but two main problems prevent a generic adaptation using a standard secure computation compiler. The first is supporting the reshuffling at the end of an epoch in constant rounds: this is usually done with oblivious sorting, and the known secure computation variants of oblivious sort are not constant round. The second is representing the permutation of the store in a compact way when the client is emulated by the two servers. These are the two main issues we focus on addressing in our construction. I will now introduce our first construction, which obtains sublinear server work and constant rounds. As with traditional square-root ORAM, we store the data in a permuted store, except that the data is additively secret-shared between the two servers. Each server stores the data under the same permutation and holds a secret-shared copy of the permutation key. We also store the stash as a secret sharing. In practice, the stash is instantiated with a linear ORAM with constant rounds, specifically FLORAM by Doerner and shelat. Because the stash is sublinear in the overall database size, the linearity of the stash ORAM does not affect the overall sublinearity of our DORAM construction. When a read is performed, the operation and the index to be read are additively secret-shared between the two servers. Operations then proceed as with traditional square-root ORAM, first by checking whether the element is in the stash. The next step is to read an element from the store.
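The classic square-root ORAM access described above can be sketched as a toy, client-side logical model. This is only an illustrative reconstruction, not code from the paper: all names are invented, crypto is omitted, and the stash is modeled as a dictionary, so duplicate entries collapse.

```python
import math
import random

class SqrtORAM:
    """Toy model of classic square-root ORAM (single client, single server):
    a permuted store whose slots are each touched at most once per epoch,
    plus a stash that is scanned in full on every access."""

    def __init__(self, data):
        self.data = list(data)            # ground-truth values (toy shortcut)
        self.n = len(data)
        self.epoch_len = max(1, math.isqrt(self.n))
        self._reshuffle()

    def _reshuffle(self):
        # n real slots plus sqrt(n) dummy slots, under a fresh permutation
        slots = list(range(self.n + self.epoch_len))
        random.shuffle(slots)
        self.pos = slots                  # logical index -> store slot
        self.count = 0                    # accesses made this epoch
        self.stash = {}                   # logical index -> freshest value

    def access(self, op, i, value=None):
        in_stash = i in self.stash        # step 1: linear scan of the stash
        if op == "read" and not in_stash:
            touched = self.pos[i]         # step 2: read the real slot...
            fetched = self.data[i]
        else:
            touched = self.pos[self.n + self.count]  # ...or a fresh dummy slot
            fetched = None
        # `touched` is the one store location the server observes this access.
        # step 3: write the result (or the new value) back to the stash
        result = value if op == "write" else self.stash.get(i, fetched)
        self.stash[i] = result
        if op == "write":
            self.data[i] = value          # toy bookkeeping of the latest value
        self.count += 1
        if self.count == self.epoch_len:  # epoch over: reshuffle store + stash
            self._reshuffle()
        return result
```

Either branch of step 2 touches exactly one fresh store slot, which is what keeps the server's view identical across reads, writes, and repeated accesses.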
This is where we deviate from the original square-root ORAM and solve our first problem: how to represent the permutation of the store. We do this by leveraging a cryptographic primitive called an oblivious pseudorandom function, or OPRF. With an OPRF, each server holds a share of the PRF key and of the input; the servers jointly evaluate the PRF while learning nothing about the other server's input, and at the end of the protocol each server learns the output of the PRF. It is this output that determines the location in the store at which each server looks for the element being searched for, whether it is a dummy or index i. In practice, we sort the PRF outputs for the elements lexicographically and then do a binary search over those tags to find elements. Finally, once an element is found in the store, we write it back to the stash. For a write, the process is identical until the third step, and this is where we solve the problem of reshuffling in constant rounds. When we go to write an element that is already in the stash, instead of appending to the next location in the stash as in traditional square-root ORAM, we update the element already in the stash at that specific location. This means each element appears in the stash only once, which is key for our reshuffling phase. We consider the following invariant to be key to our reshuffling process: if an element has been read or written, it is in the stash, and each element occurs in the stash only once. This eliminates the issue in the initial square-root ORAM construction, which used oblivious sort to remove duplicates and thus was not constant rounds. We also rely on the observation that each server knows, without loss of privacy, the set of elements in the store that have not been read during the current epoch. Reshuffling is then a simple process that takes the elements in the stash and the unread elements in the store and privately repermutes them under a new OPRF key.
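The tag-and-binary-search idea can be sketched in the clear as follows. This is a minimal illustration, not the protocol itself: HMAC stands in for the jointly evaluated OPRF, the key is held in one piece rather than secret-shared, and the label scheme is invented.

```python
import hmac
import hashlib
from bisect import bisect_left

def prf_tag(key, label):
    # Stand-in for the OPRF output: in the real protocol the key and the
    # queried label are both secret-shared, and the servers jointly evaluate
    # the PRF without learning each other's input.
    return hmac.new(key, label.encode(), hashlib.sha256).digest()

def build_store(key, records, num_dummies):
    """Tag every real record and every per-epoch dummy block, then sort the
    (tag, value) pairs lexicographically by tag: the sorted order is this
    epoch's pseudorandom permutation of the store."""
    tagged = sorted(
        [(prf_tag(key, f"real:{i}"), v) for i, v in enumerate(records)]
        + [(prf_tag(key, f"dummy:{j}"), None) for j in range(num_dummies)]
    )
    return [t for t, _ in tagged], [v for _, v in tagged]

def lookup(tags, values, key, label):
    """Binary search over the sorted tags, as in the store reads."""
    t = prf_tag(key, label)
    pos = bisect_left(tags, t)
    assert pos < len(tags) and tags[pos] == t, "label not in this epoch's store"
    return values[pos]

key = b"epoch-0-prf-key"  # illustrative; really a secret-shared key
tags, values = build_store(key, [7, 13, 42, 99], num_dummies=2)
assert lookup(tags, values, key, "real:2") == 42     # stash miss: real read
assert lookup(tags, values, key, "dummy:0") is None  # stash hit: dummy read
```

Reshuffling then amounts to rebuilding the store with a fresh key over the stash contents plus the unread records, which is why no oblivious sort is needed.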
The stash is then refilled with new dummy blocks to be overwritten during the new epoch. Because the permutations of the two epochs cannot be linked, obliviousness is preserved, and the repermutation can be done in constant rounds, though in linear time. In summary, we achieve the first secure computation DORAM with sublinear local computation and constant rounds. Our local computation is square root of n times log n, where n is the database size: the square root comes from the linear operations performed on the stash, and the log n comes from the binary search performed to find elements in the store. I now want to discuss how we extend this construction to support unlimited reads without having to update the stash. Overall, this yields far better performance when reads greatly outnumber writes. We now move on to our sublinear DORAM with unlimited reads. Our motivating question is: if we are willing to leak the difference between reads and writes, can we support reads without updating the internal state of the servers, and thus put off the costly reshuffling process at the end of an epoch? To do this, we turn to another technique for hiding data accesses: private information retrieval, or PIR. PIR is a read-only protocol in which the servers remain stateless during a read access, so it provides obliviousness without any updates to server state. Using PIR, the simplest solution is to add a PIR store alongside the store and the stash from our previous scheme. Due to the modularity of our first scheme, writes remain the same as in the first protocol, but reads now first scan the stash, where the latest copy of the element would be if there was a previous write, and then read the element from the PIR store. The read then takes the element from the stash if it exists, or the one from the PIR store otherwise.
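The unlimited-reads read path can be sketched as below. This is a hedged sketch with invented names: `pir_read` stands in for a two-server DEPIR query, and `stash` for the secret-shared, linearly scanned stash.

```python
def unlimited_read(index, stash, pir_read):
    """Read path in the unlimited-reads variant: the stash holds the freshest
    copy of anything written this epoch; the PIR store holds the rest. Both
    steps always run, so the access pattern does not reveal whether the
    element was previously written."""
    hit = index in stash               # step 1: scan the stash
    from_stash = stash.get(index)
    from_store = pir_read(index)       # step 2: query the PIR store regardless
    # A read touches neither structure's state, so an epoch ends (and a
    # reshuffle is paid) only after enough *writes* accumulate.
    return from_stash if hit else from_store

store = [7, 13, 42, 99]
stash = {1: 100}                       # index 1 was overwritten this epoch
assert unlimited_read(1, stash, store.__getitem__) == 100
assert unlimited_read(3, stash, store.__getitem__) == 99
```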
Because PIR supports an essentially unlimited number of reads, we can support any number of reads between epochs, and refreshing depends only on the number of writes. However, we must be careful in selecting a PIR scheme to instantiate this version, as we wish to maintain sublinear server computation and constant rounds in the new construction. As previously mentioned, the choice of PIR scheme depends on two things: the scheme must have sublinear server computation, and we must be able to emulate the client in constant rounds. This is an issue, as traditional PIR schemes require linear server work; because the PIR store contains our entire database, unlike the stash, which is only square root in size, linear work would break sublinearity. Thus, we turn to doubly efficient PIR, or DEPIR, where the double efficiency refers to the fact that both the client and the server have sublinear computation. DEPIR is also known as PIR with preprocessing, because the database is processed during the setup phase. In addition, the constructions of DEPIR rely on coding techniques that are highly amenable to secure computation, leading to an emulated client that takes only constant rounds. I will note that there are other two-server PIR constructions, namely those of Corrigan-Gibbs and Kogan, that do meet the sublinear server computation requirement, but those constructions rely on primitives that are not conducive to secure computation emulation, particularly puncturable pseudorandom sets, which we do not know how to convert to a constant-round protocol when emulating the client. The constructions of DEPIR, introduced concurrently by Canetti et al. and Boyle et al., are based on new assumptions on permuted and noisy Reed-Muller codes.
I want to use the rest of the talk to introduce some of the challenges we faced when converting the DEPIR client to a secure computation, and how we solved them by introducing a secure computation variant of fast Fourier transform protocols, which we believe is a contribution of independent interest. Let me go through, at a very high level, how DEPIR works using Reed-Muller codes. Reed-Muller codes are a family of locally decodable codes that represent the database as a polynomial: a degree-t multivariate polynomial over a finite field, where the encoding is a low-degree extension of the database. In general terms, the setup process takes in a database and, viewing each element of that database as a point, interpolates a degree-t polynomial we will call psi. This is what the server stores. When a client wishes to read the value at index i, the client first represents that index as a random polynomial v, also of degree t, that evaluates at zero to the index being read. The client then evaluates this polynomial at several preset points, obtaining outputs here represented as a, b, and c. These are the points at which the client asks the server to return the values of the database polynomial. The server computes the points y_a, y_b, and y_c and sends them back to the client. The client is then able to interpolate across these returned points to recover the database value at index i. When emulating this process with two servers, the only non-local, non-linear steps are the two interpolations and the evaluation of the random polynomial, as they depend on the secret-shared index i. The naive way to compute these steps is an n-squared matrix multiplication, which would not work, as our reshuffling process must cost less than n squared. However, these operations are also computable by a fast Fourier transform, or FFT, algorithm.
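The encode-query-interpolate pattern just described can be shown with a toy, univariate version of the idea. This is only an illustrative sketch: the real scheme uses a multivariate polynomial with preprocessing to get double efficiency, while here the server evaluates psi naively, the field is tiny, and v is a random line, so only the read logic is faithful.

```python
import random

P = 101  # small prime; the real scheme works over a large finite field

def interp_at(xs, ys, x0, p=P):
    """Lagrange-interpolate the unique low-degree polynomial through the
    points (xs, ys) over GF(p) and evaluate it at x0."""
    total = 0
    for xj, yj in zip(xs, ys):
        num = den = 1
        for xm in xs:
            if xm != xj:
                num = num * ((x0 - xm) % p) % p
                den = den * ((xj - xm) % p) % p
        total = (total + yj * num * pow(den, p - 2, p)) % p
    return total

# Setup: interpolate the database into a polynomial psi (the server's encoding).
db = [7, 13, 42, 99]
db_xs = list(range(len(db)))            # entry i lives at evaluation point i

def psi(x):
    return interp_at(db_xs, db, x)

# Read index i: sample a random degree-1 curve v with v(0) = i, ask the server
# for psi at v(z) for preset nonzero z's, then interpolate psi∘v back at z = 0.
i = 2
slope = random.randrange(1, P)

def v(z):
    return (i + slope * z) % P          # hides i: each query point looks random

zs = list(range(1, len(db) + 1))        # deg(psi∘v) + 1 points suffice here
answers = [psi(v(z)) for z in zs]       # the server's replies
assert interp_at(zs, answers, 0) == db[i]
```

The two calls to `interp_at` and the evaluation of `v` are exactly the steps that depend on the secret index, and hence the steps the FFT protocol is designed to perform locally in quasi-linear time.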
We introduce an FFT secure computation protocol that can evaluate or interpolate the polynomials in quasi-linear time, n log-squared n, with only local computation, meaning that computing the FFT during these steps does not incur any additional rounds. We believe this protocol may also be of independent interest to the crypto community. To wrap up, we obtain two constructions. The first is the first secure computation DORAM with sublinear local computation and constant rounds. The second is an unlimited-reads DORAM that leverages doubly efficient PIR, when we leak the difference between reads and writes, to obtain better asymptotic performance; for that construction, we obtain n^ε local computation and bandwidth for any ε > 0. We also present an FFT protocol that performs multivariate interpolation and evaluation in quasi-linear time with only local computation, which we believe may be a primitive of independent interest. Thank you for listening to my talk. Please feel free to reach out if you have any questions; our emails can be found on our paper. And thank you for the additional soundtrack from my cat, whom you may have heard in the background. Karina, I love you.