Hello everyone. I'm going to talk about OptORAMa: Optimal Oblivious RAM. I'm Gilad Asharov, and this is joint work with Ilan Komargodski, Wei-Kai Lin, Kartik Nayak, Enoch Peserico, and Elaine Shi. Our agenda is as follows. I'm going to start with the introduction, which includes the problem definition. I will then give a very short tutorial on the progress of Oblivious RAM, from square-root ORAM to OptORAMa. And I will conclude by telling you a little bit about our techniques, as much as time will allow.

So what is the problem that ORAM comes to address? Suppose we have some secure processor and an untrusted memory, or, in a more modern view, suppose that we have some client that uploads its information to an untrusted server. Even if the uploaded information is encrypted, the access pattern, the memory locations that the client accesses, reveals sensitive information. For instance, suppose that a medical doctor uploads all the genomic information of their patients to the cloud. Suppose also that the doctor always accesses regions that relate to the kidney. Even though the data is encrypted, the server can infer that the patient has some kidney problem and not, say, a heart problem. This is leakage that we want to prevent.

Oblivious RAM solves this exact problem. It can be viewed as a compiler that takes the program that the client wishes to execute and converts each memory access into some sequence of operations on the physical memory. It shuffles elements around the physical memory and moves blocks around to hide the logical access pattern. In particular, it also introduces some overhead: every access in the logical program is translated into many accesses by the Oblivious RAM compiler. Oblivious RAMs were introduced by Goldreich and Ostrovsky. The informal definition says that the access pattern, the memory locations that the client accesses, can be simulated and is data independent. As we mentioned, Oblivious RAM introduces some overhead, and we have a known lower bound that this overhead is at least log n for a logical memory of size n. That is, if the logical memory is of size n, every access in the original program is translated into at least log n operations, amortized. This lower bound holds even if we assume cryptography.

Some words on our model. We assume that the server is passive: it behaves just like a memory and is not allowed to perform any computation. We assume that the word size of the RAM machine is log n. This is natural, as otherwise the client cannot describe an address in the memory using a constant number of words. Moreover, we assume that the client memory size is constant. In these settings, the lower bound is Omega(log n), and we have a 30-year-long line of research trying to match it. The starting point is the work of Goldreich showing a square-root ORAM, and the last work in this line is the beautiful work of Patel et al. from two years ago, showing a construction called PanORAMa. PanORAMa came very, very close to matching the lower bound: they showed a construction with O(log n · log log n) overhead, just a log log n gap between the upper bound and the lower bound. In our work, we present the first ORAM construction that achieves an overhead of O(log n), matching the lower bound. Our main result is an ORAM construction with O(log n) overhead; this is asymptotically optimal. The construction achieves computational security and matches the lower bound of Larsen and Nielsen.
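To make the compiler view and the overhead measure concrete, here is a purely illustrative sketch (all names are ours, not from the paper): the server is modeled as a passive memory that only answers read and write requests, and the overhead of an ORAM is the amortized number of physical accesses it issues per logical access.

```python
# Illustrative sketch only: a passive server and an overhead counter.
# An ORAM compiler (not shown) would translate each logical access into
# several calls to read/write below; the lower bound mentioned above says
# this ratio is at least Omega(log n) amortized in this model.

class PassiveServer:
    """Untrusted memory: performs no computation, only serves reads and writes."""

    def __init__(self, size):
        self.cells = [None] * size
        self.trace = []  # the physical access pattern visible to the server

    def read(self, addr):
        self.trace.append(("read", addr))
        return self.cells[addr]

    def write(self, addr, value):
        self.trace.append(("write", addr))
        self.cells[addr] = value


def overhead(server, num_logical_accesses):
    """Amortized number of physical accesses per logical access."""
    return len(server.trace) / max(1, num_logical_accesses)
```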
And when we replace the one-way function with a random oracle, we also achieve statistical security, matching the lower bound of Goldreich and Ostrovsky. As I mentioned, the model is one where the word size is log n, the client memory size is constant, and the server is passive. Our construction is in the balls-and-bins model, which means that every data block is treated as an opaque ball. Our construction is not practical: even though it is asymptotically optimal, its concrete efficiency is relatively bad.

Besides our main result, we also have another result that might be of independent interest. The key building block in our construction is an oblivious tight compaction. In that problem, we have an array of elements in the memory, some of which are marked. The goal in oblivious tight compaction is to move all marked elements to the beginning of the array. We don't require stability: we don't ask that if one marked element appears in the input array before another marked element, then it also appears before that element in the output array. We just ask that all marked elements be moved to the beginning of the array. Solving this problem non-obliviously is very easy: we can just scan the array once and write down all marked elements, and then scan the array a second time and write down all elements that are not marked. But how can we do it obliviously? Apparently, this is not an easy problem. We know deterministic constructions that achieve it in O(n log n); this is essentially just performing an oblivious sort on all elements. This has been an open problem since '95. There is a solution that reveals the number of marked elements and works in time O(n log log n), and it took almost 20 years of work to get rid of this leakage. It is also worth mentioning that there is a lower bound of Omega(n log n) if we ask for stability. Our result shows an optimal solution for this problem: we show a deterministic oblivious tight compaction in O(n). It's worth mentioning that there are two follow-up works. In a follow-up work that will appear at ITC (Information-Theoretic Cryptography), we showed how to achieve linear-time tight compaction that also supports good parallelism, and Dittmer and Ostrovsky showed an oblivious tight compaction in linear time with a smaller constant.

In the next few slides, I want to show the progress in Oblivious RAM, starting from the square-root ORAM of Goldreich, to the hierarchical solution, to the beautiful work of PanORAMa, and finally to our work, OptORAMa. We start with square-root ORAM. In order to hide what we want to access, the first thing that comes to mind is to shuffle the entire memory. So let's say that the client shuffles the entire memory, and it can also store this permutation pi, this mapping. Whenever it wants to access an element i, all it has to do is compute pi(i) and access that memory location to retrieve i. The server cannot learn what was accessed, because it doesn't know which element actually resides in that location. When the client wants to access some element j, it computes pi(j) and then accesses the relevant element. However, if the client now wants to access i again, it cannot go to the same memory location, or the server will see that the client accesses the same element. The server cannot see which element was accessed, but it does see that the client accesses the same element more than once.
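Here is a minimal sketch of the naive shuffle-once idea just described (the names are ours, and encryption is omitted): the client keeps the permutation pi and accesses pi[i] instead of i, which hides which element was fetched but exposes repeated accesses to the same element.

```python
# Minimal sketch of the "shuffle once" approach, for illustration only.
import random

def shuffle_memory(data):
    n = len(data)
    positions = list(range(n))
    random.shuffle(positions)          # the permutation pi, kept secret by the client
    physical = [None] * n
    for i in range(n):
        physical[positions[i]] = data[i]   # element i now lives at pi[i]
    return physical, positions

def access(physical, pi, i, trace):
    trace.append(pi[i])                # the address the server observes
    return physical[pi[i]]

physical, pi = shuffle_memory(["a", "b", "c", "d"])
trace = []
access(physical, pi, 2, trace)
access(physical, pi, 2, trace)         # asking for the same element again...
assert trace[0] == trace[1]            # ...touches the same physical address
```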
Square-root ORAM solves this problem by introducing another layer, which is called the shelter. So we have a shelter of size square root of n, and we also introduce square root of n dummy elements into the main array. The main array is shuffled, and now when the client wants to access some element i, it first scans for the element i in the first layer, the shelter. If the element is found in the shelter, it accesses some dummy element in the main array; if it doesn't find the element i in the shelter, it computes pi(i) and accesses i. After i is retrieved, the client writes it into the shelter. Now when we want to access some element j, we do exactly the same thing: we scan for j in the shelter, we don't find it, so we compute pi(j), access pi(j), find the element j, and write it back to the shelter. Now if we want to access the element i again, we scan the shelter, find the element i there, and therefore access some dummy element in the main array; we can make sure that we never access the same dummy element twice by introducing a label for each dummy element. After square root of n accesses, the shelter is going to be full, and therefore we have to rebuild the entire structure. The rebuild uses oblivious sort, which costs O(n log n). So we pay for an expensive operation every square root of n accesses, and this is essentially why the amortized cost per access is O(sqrt(n) log n).

Hierarchical ORAM allows us to pay for the expensive operation, the oblivious sort, over a larger number of accesses. Instead of having just one shelter, we can think of the hierarchical construction as having many shelters, where each shelter has its own shelter. We have log n levels of doubling sizes. With each access, we perform a lookup in each level, and every 2^i accesses we rebuild table T_i from the contents of all previous levels. As a result, after n accesses we rebuild the last level from all previous levels. We therefore pay the expensive O(n log n) operation only after n accesses, and not after square root of n accesses as in the square-root ORAM solution. More explicitly, the amortized cost can be described as performing log n lookups, plus the amortized cost of rebuilding the levels. In previous works, the expensive rebuild used oblivious sort. When we put the cost of oblivious sort into this expression, we get O(log^2 n) amortized cost per access.

The beautiful work of PanORAMa has the following observation: if we assume that the input to each level is randomly shuffled, then we can actually rebuild the level without paying for the expensive oblivious sort. In particular, they showed how to implement a rebuild of a level, which is in fact some sort of a hash table, in O(n log log n), assuming that the input for that rebuild is randomly shuffled. Moreover, they also observed that each level by itself is randomly shuffled: all elements that were not accessed in that level remain randomly shuffled from the previous rebuild. However, recall that every 2^i accesses, we rebuild level i from the contents of all previous levels. While each level by itself is randomly shuffled, the concatenation is not necessarily randomly shuffled. Therefore, they had to introduce another operation, which takes a few arrays, each one of which is randomly shuffled on its own, and intersperses these arrays together into one randomly shuffled array.
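Before moving on to the intersperse operation, here is a minimal sketch of the square-root ORAM access procedure described above (illustrative only, with our own names; encryption, oblivious shuffling, and dummy re-labeling are glossed over).

```python
import math
import random

class SqrtORAM:
    """Toy sketch of square-root ORAM: a shuffled main array plus a shelter."""

    def __init__(self, data):
        self.n = len(data)
        self.s = max(1, math.isqrt(self.n))    # shelter size ~ sqrt(n)
        self.contents = dict(enumerate(data))  # logical address -> value
        self._rebuild()

    def _rebuild(self):
        # Shuffle the n real elements together with sqrt(n) dummies.
        # In the real construction this rebuild is done with an oblivious sort,
        # costing O(n log n) once every sqrt(n) accesses.
        keys = list(range(self.n)) + [f"dummy{j}" for j in range(self.s)]
        random.shuffle(keys)
        self.pi = {key: pos for pos, key in enumerate(keys)}  # secret mapping
        self.main = [self.contents.get(key) for key in keys]
        self.shelter = []      # recently accessed elements, scanned in full
        self.dummy_ctr = 0

    def access(self, i):
        # 1. Scan the entire shelter (the scan itself is data independent).
        hit = next((v for (addr, v) in self.shelter if addr == i), None)
        if hit is not None:
            # 2a. Found in the shelter: touch a fresh dummy so the server
            #     cannot tell this was a repeated access.
            _ = self.main[self.pi[f"dummy{self.dummy_ctr}"]]
            self.dummy_ctr += 1
            value = hit
        else:
            # 2b. Not in the shelter: fetch the element from its shuffled slot.
            value = self.main[self.pi[i]]
        # 3. Write the element into the shelter; rebuild once the shelter fills.
        self.shelter.append((i, value))
        if len(self.shelter) >= self.s:
            self._rebuild()
        return value
```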
This intersperse operation, they show how to implement at a cost of O(n log log n) as well, where n is the sum of the sizes of all the arrays. To conclude, in PanORAMa we have to update the expression that we just saw and also charge for the cost of interspersing arrays. In PanORAMa, the total cost of rebuilding a level is O(n log log n), the lookup in each level is O(log log n), and the intersperse is O(n log log n). Putting those into the expression gives us a total amortized cost of O(log n · log log n).

So we have to improve each one of those operations in order to get an optimal construction. In OptORAMa, this is exactly what we did. We showed how to fix each one of those parameters: how to improve the rebuild to linear time, how to improve the lookup to constant time, and how to improve the intersperse to linear time as well. Only if we do all of those can we actually get an optimal construction.

In the remainder of the talk, I'll show some of our techniques. I'm going to describe a little bit about tight compaction and also a technique that is called packing. So, tight compaction: where is it used? There are many different places where tight compaction is used in our construction. The first example is when we want to rebuild a level: we sometimes have to get rid of the dummy elements that were introduced. In square-root ORAM, we could have done this with oblivious sort. Here, we cannot use oblivious sort, because we want to do the rebuild in linear time. For the second example, we observed that oblivious tight compaction can be used for interspersing a few arrays together. I'm going to elaborate on that in the next few slides.

Suppose that we have two input arrays that are randomly shuffled, I_0 of size n_0 and I_1 of size n_1, and we want to intersperse them, assuming that each one is randomly shuffled on its own. In order to do that, we generate an auxiliary array which has exactly n_0 zeros and n_1 ones. We generate that array uniformly at random from all possible arrays that have exactly n_0 zeros and n_1 ones. Then, if we obliviously route the two arrays according to that auxiliary array, namely, wherever the auxiliary array has a zero we place an element from the first array, and wherever it has a one we place an element from the second array, we end up with an array that is a random permutation of all the elements in the two arrays. To see that we get all possible permutations: we have n choose n_0 ways to choose the auxiliary array, and we also have n_0 factorial for shuffling I_0 and n_1 factorial for shuffling I_1, and in total this gives us n factorial possibilities, where n is the sum of the two sizes. The challenge is how to move these elements obliviously, how to actually place, wherever the auxiliary array says zero, an element from I_0, and wherever it says one, an element from I_1. PanORAMa showed how to implement this in O(n log log n); we show how to implement it in O(n) using oblivious tight compaction.

So how does it work? Well, the idea is to start by generating this auxiliary array uniformly at random, and then we run tight compaction on that auxiliary array. The result is that all the zeros go to the left and all the ones go to the end of the output array. Now we put our input array I_0 at the beginning and I_1 at the end, and run exactly the same tight compaction, but in reverse order.
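The next paragraph spells out this reverse-routing step; here is a minimal sketch of the whole intersperse procedure under the assumption that we have a tight-compaction routine that records the swaps it performs (the tight_compact below is a non-oblivious stand-in, just to show how the recorded swaps are replayed in reverse).

```python
# Sketch of intersperse via tight compaction (illustrative, not oblivious).
import random

def tight_compact(bits):
    """Stand-in: move all 0s to the front, returning the swaps performed."""
    swaps = []
    left = 0
    for right in range(len(bits)):
        if bits[right] == 0:
            bits[left], bits[right] = bits[right], bits[left]
            swaps.append((left, right))
            left += 1
    return swaps

def intersperse(i0, i1):
    n0, n1 = len(i0), len(i1)
    # Auxiliary array: a uniformly random arrangement of n0 zeros and n1 ones.
    aux = [0] * n0 + [1] * n1
    random.shuffle(aux)
    # Compact the auxiliary array and remember every swap it made.
    swaps = tight_compact(list(aux))
    # Place I_0 at the front and I_1 at the back (matching the compacted aux),
    # then replay the swaps in reverse to scatter the elements according to aux.
    out = list(i0) + list(i1)
    for (a, b) in reversed(swaps):
        out[a], out[b] = out[b], out[a]
    return out   # position p holds an element of I_0 exactly where aux[p] == 0
```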
Namely, wherever the tight compaction moved some ball, we record that move operation and perform it in the reverse direction. As we can see, we end up with an interspersed array that exactly matches the auxiliary array, our target, that we generated at the beginning, and this is a mix of the two input arrays. We get an array that is a random shuffle of the concatenation of the two arrays.

I'm going to talk a little bit about our oblivious tight compaction, to give a little taste of how the construction looks. We have an input array of size n where some elements are marked 0 and some elements are marked 1, and our goal is to put all the 0 elements before the 1 elements. First, let's count the number of balls that are marked 0 and the number of balls that are marked 1. In the output array, up to the mark given by the number of balls that are marked 0, we want to have only 0s, and after that mark, all the 1 balls. So we are going to mark all the elements that are misplaced: we mark in red all the 1s on the left side and in blue all the 0s on the right side. The number of reds is always equal to the number of blues. Moreover, all we have to do now is just swap every red ball on the left side with some blue ball on the right side.

The algorithm uses a bipartite expander graph. The bipartite graph has constant degree; there are n vertices on the left side and n vertices on the right side. We place our input array on the left side. The algorithm is as follows: each vertex looks at its neighbors at distance 2. Those neighbors are always on the left side. If two vertices that are marked with opposite colors share the same neighbor on the right side, they are swapped, and we remove their colors. This requires in total n·d^2 work. Recall that the degree d is constant, which means that overall this is a linear amount of work.

The claim is that at the end of this procedure, there are not going to be too many remaining swaps. Why is that? This is because of the expansion properties of the graph. Let's consider the set of survivors, those elements that finished the algorithm without being swapped and are still marked. We cannot have too many survivors. If we have a set of size n/200 of blues that survived and n/200 of reds that survived, their sets of neighbors must be disjoint; if they had someone in the intersection, then one of them would not be a surviving element. However, using the expansion property, we can show that for every set of size greater than n/200, the number of neighbors must be greater than half of the right side. That means that overall, after these swaps, there will not be too many remaining swaps.

At this point, after we perform the loose swap, there are not so many elements that are still marked. What the algorithm does now is run what we call a loose compactor. Loose compaction is a weaker variant of tight compaction: if in tight compaction we have some marked elements and want to move all the marked elements to the beginning, in loose compaction we have some marked elements in the input array, and we want an output array in which all the marked elements appear, but we don't care in what order; we just need the output array to be smaller than the input array. We might lose some of the unmarked elements.
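Here is a rough sketch of the loose-swap step just described, assuming a hypothetical neighbors() function that returns the d right-neighbors of a left vertex in a constant-degree bipartite expander (the stand-in below just picks pseudo-random neighbors, which of course carries no expansion guarantee).

```python
import random

def neighbors(v, d, n, seed=12345):
    """Stand-in for the expander's neighbor function (no real expansion)."""
    rng = random.Random(seed * 1_000_003 + v)
    return [rng.randrange(n) for _ in range(d)]

def loose_swap(array, marks, d=4):
    """Swap misplaced 0/1 balls that share a right neighbor; return survivors."""
    n = len(array)
    cutoff = marks.count(0)          # this many 0-balls belong on the left
    # red = a 1-ball sitting in the 0-region, blue = a 0-ball in the 1-region
    color = [None] * n
    for v in range(n):
        if v < cutoff and marks[v] == 1:
            color[v] = "red"
        elif v >= cutoff and marks[v] == 0:
            color[v] = "blue"
    # Group left vertices by their right neighbors (n*d edges in total).
    right_to_left = [[] for _ in range(n)]
    for v in range(n):
        for u in neighbors(v, d, n):
            right_to_left[u].append(v)
    # If a red and a blue share a right neighbor, swap them and uncolor them.
    for u in range(n):
        reds = [v for v in right_to_left[u] if color[v] == "red"]
        blues = [v for v in right_to_left[u] if color[v] == "blue"]
        if reds and blues:
            a, b = reds[0], blues[0]
            array[a], array[b] = array[b], array[a]
            marks[a], marks[b] = marks[b], marks[a]
            color[a] = color[b] = None
    # With a real expander, only a small fraction of colored vertices survive;
    # these survivors are handed to loose compaction and handled recursively.
    return [v for v in range(n) if color[v] is not None]
```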
So after we perform loose compaction, we are left with only the marked elements, those elements that still need to be swapped. We continue recursively: we recurse, and then we'll have another loose swap and another loose compaction, another loose swap, and so on. At the end, when we come back from the recursion, all those red elements and blue elements have been swapped. Assuming that the elements in the recursion were swapped, we want to show how to also swap the elements that remained after the loose swap. All we have to do is go back to the loose compaction and do reverse routing: namely, all the move operations that we did in the loose compaction, we now perform in reverse order. So when coming back from the recursion, all the reds have been exchanged with the blues, and after the reverse routing, all the zeros now appear before all the ones. I'm stopping at this point with tight compaction; I did not describe how to do the loose compaction itself, but we do it in O(n) as well.

I'm going to talk about another technique that we have in the paper, and this is packing. So what is the idea of packing? We are working in the RAM model where the word size is w. Say we have n balls, each of size d bits. How much does it cost to sort these balls using classical oblivious sort? We have to pay O(ceil(d/w) · n log n), because we are going to put the d bits in ceil(d/w) memory words. But what if d is much smaller than w? What if one memory word can contain many elements? This is the idea of packing: we pack a lot of balls into one memory word. As a result, we can now sort in time O((d/w) · n log^2 n), and observe that here we do not pay for the ceiling of d/w. What does this mean? When n and d are small, say n is about w^4, which means poly-logarithmic size, and d is logarithmic in the word size, we can actually sort in linear time.

But where is this being used? Remember that in order to enjoy this linear-time sorting, the number of elements must be very small, something like poly log n. So where is it used? Recall that our ORAM construction is a hierarchical construction, and in fact, if we look closer, each level in our ORAM is arranged as a sequence of bins: each element resides in some random bin, and the size of each bin is poly-logarithmic. Previously, to build the structure of the hash table inside a bin, we needed to use some oblivious sort, which gives a log log n overhead. Now, with the packing trick, we can get rid of this log log n overhead.

To conclude, we show that there exists an ORAM with O(log n) blowup, which is asymptotically optimal. We also show a hash table that is built in linear time on a randomly permuted input and supports lookups in constant time, and we show an oblivious tight compaction in linear time. Thank you so much for joining me.