There once was a table of hashes
that stored extra items in stashes.
It all seemed like bliss,
but things went amiss
when the stashes were stored in the caches.

This is Alibi: a flaw in cuckoo-hashing-based hierarchical ORAM, and a solution. I'm Daniel Noble, presenting, and this is joint work with my advisor Brett Hemenway Falk and Rafail Ostrovsky.

I shall now explain oblivious RAM, or ORAM, and why it is so useful. Oblivious RAM is a technique in which a client with a small amount of trusted memory can make use of a larger entity with a large amount of untrusted memory.

Let's imagine the following scenario. Alice has photos, lots of photos, and she is no longer able to store these all on her local device. She would therefore like to make use of a cloud service provider. However, the cloud service provider is not entirely transparent about what it does with Alice's data. Furthermore, if Alice is a business that has access to sensitive data such as medical records, putting these on a remote host might lead to disastrous consequences.

There are a few cryptographic primitives that Alice could use to protect her data. The most basic of these is to encrypt the data. This will hide the content from the service provider, but it will not hide Alice's access pattern: her reads and her writes. She can instead use a primitive called private information retrieval, or PIR, which is a protocol that she can engage in with the service provider in which she is able to read information from the database without the service provider learning what information she accessed. Lastly, if she wants to protect both her reads and her writes, she is able to use a primitive called oblivious RAM, or ORAM. PIR and ORAM by themselves do not protect the contents of the data.
They need to be combined with a secure encryption scheme in order to do so.

The most common application of oblivious RAM is the one just described, in which a client outsources memory to an untrusted service provider. However, this is not the only application. The first oblivious RAM protocol had in mind a different application: that of a secure execution environment that needed to depend on an untrusted RAM. This is actually relevant in practice now that we have secure execution environments like Intel SGX. A trusted execution environment can use untrusted RAM by means of an oblivious RAM protocol.

A third application that is extremely important is secure computation. Secure computation allows a number of parties who have data split up between them to still be able to compute on that data. If a piece of data is split up and each party knows which piece of data needs to be accessed, then they can easily compute on that data. However, if the index of the item that is to be accessed is also a secret, then it is difficult to find an efficient and secure method for each party to access the relevant secret share. Oblivious RAM protocols can be tailored to the secure computation setting, and indeed in the secure computation setting you can sometimes take advantage of the work that the different parties can do in order to make the protocol more efficient.

I will now explain hierarchical ORAM schemes. Hierarchical ORAM is based on oblivious hash tables. An oblivious hash table is a data structure with the property that accesses to the data structure touch random locations. As such, if each item that is queried to the oblivious hash table is only queried once, it leaks no information about the items that were queried, nor does it leak any information about the relationship between the items that were queried and the actual items in the database. However, because each item in an oblivious hash table can only be queried once, the tables need to be combined in a new way in order to allow items to be accessed multiple
times from the oblivious RAM. This is done by having several oblivious hash tables of geometrically increasing sizes. In short, each oblivious hash table will have twice the capacity of the oblivious hash table above it. At any given point in time, not every oblivious hash table in the hierarchy will exist. In the diagram shown on this slide, the first and third levels have no oblivious hash tables, and the second and fourth levels have the oblivious hash tables shown.

In order to query an item, each level that has an oblivious hash table is queried, starting with the smallest. If the item is found, another random value, a nonce which is only used once, will be queried at subsequent levels. This is important for security. The item that was found in the smaller level may have previously been queried at one of these larger levels. As such, querying a random value instead protects the security of the scheme, as it guarantees that at each level an item will only be queried once. The fact that these random values are different every time is also essential for security, because it means that, again, these random values will only be queried to each table once.

Once an item has been queried at every level, it is inserted into the top level, called the cache. In the cache, every item is queried. Whenever the cache becomes too big, the data structure needs to be rebuilt. All levels from the smallest up to the largest one such that every level in the sequence contains an oblivious hash table are combined and built into a new oblivious hash table. The details of how the oblivious hash table is built in such a way that it does not leak any information about the contents of the data are not going to be discussed in this talk, as they vary from scheme to scheme. However, I am going to
discuss in some detail how the tables are accessed.

In particular, many hierarchical ORAM schemes make use of cuckoo hashing. Cuckoo hashing is an efficient form of hashing. In it there are two subtables, each of size epsilon n, where epsilon is some constant greater than one and n is the number of items that the scheme wishes to store in the table. Each item can be stored in one of two locations. These locations are determined by hash functions; as such, each location can be treated as a random value uniformly chosen from the epsilon n possible locations. Each of these locations has capacity one.

The fact that each item can be stored in one of two possible locations allows much greater flexibility in the assignment of items to locations. For instance, if there is a collision in one of the tables, the items can still be stored. In this diagram, we show that the robin can be stored in the hash table on the left and the flamingo in the hash table on the right, even though both of them were assigned the same location in the hash table on the right. We can continue to see that even though this table now contains several collisions, it is still possible to find a correct assignment of items to hash table locations such that each item is assigned a unique location.

However, it is of course possible that the hash functions map items to locations in such a way that it is impossible to store everything in the table. This depends on the randomness of the hash functions that are chosen. The probability that this occurs when n items are inserted is order n to the negative one. This was shown by Pagh and Rodler in their paper introducing cuckoo hashing in 2004.

Cuckoo hashing was then proposed as a scheme for oblivious hash tables by Pinkas and Reinman in 2010. They proposed that each level could have a cuckoo hash table. In the event that there was an assignment of items to locations in the table such that the items could not all be stored in the table, they
suggested that the table could be rebuilt. This is a standard way of dealing with this one-over-n probability of an unsatisfiable assignment.

However, this actually presented a subtle problem. Imagine that the table is rebuilt in the case that there is an unsatisfiable assignment. Now, if all of the items in the table are queried, the access pattern of the queries will not correspond to an unsatisfiable assignment. However, if items that are not in the table are queried, it is possible that the physical accesses to the table correspond to an unsatisfiable assignment.

For instance, imagine that the table contains the parrot, the owl and the robin. We know from this that the query locations of the parrot, owl and robin cannot all be the same two locations. However, imagine that instead we query the flamingo, the hen and the penguin. It is possible, though unlikely, that all three of these queries will result in the same two locations being queried. If this event were to occur, we would know that it is impossible that all three of the items that were queried existed in the table. This therefore presents a subtle security leak, which means that the oblivious RAM is, in general, not secure.

It was then proposed to use cuckoo hashing with a stash, by Goodrich and Mitzenmacher in 2011. A stash is an additional area of storage in which items that could not be placed in the main table can be placed instead. It can be shown that for any constant s the failure probability then becomes order n to the negative s. Furthermore, for non-constant s it can also be shown that the failure probability becomes order s times (order s over n) to the power s plus 1. This means that the failure probability can be made negligible. Failure in this case means that more items are placed in the stash than the memory of the stash allows.

Observe that we also cannot reveal how many items were placed in the stash, because this again would suffer the same problem that existed with the protocol of Pinkas and Reinman. During
the build phase, the fact that the stash size was leaked would by itself reveal no information. However, if the access pattern corresponded to a set of elements that does not have the same stash size, we would know that the items that were queried could not all have existed in that table. This means that the stash sizes must not be revealed, and furthermore the entire stash needs to be accessed at each level.

Goodrich and Mitzenmacher then presented a protocol with these properties. In their oblivious RAM protocol, every level contained a cuckoo hash table and also contained a super-constant-size stash, say of size log n. This protocol was secure and was able to take advantage of the efficiency of cuckoo hash tables. However, it meant that the number of accesses at each level was no longer constant, because the entirety of the stash needed to be queried. As such, the scheme was secure, but it was still somewhat inefficient, because it still needed order log squared n physical accesses for each virtual access.

It was then proposed by Goodrich et al. in their paper in 2012 that these stashes could in fact be combined. They showed that even though each stash needed to be of size order log n, if the combined stash was of size order log n, this was sufficient to ensure that the stash didn't overflow. As such, this combined stash at the cache level would be checked first, and then the remainder of the tables would be checked as per normal.

Herein was introduced the subtle security flaw that is described in the Alibi paper. Since the combined stash was checked first, the hierarchical ORAM protocol would, if it found the item in the stash, search for other elements instead in the lower levels. This meant that instead of searching the locations of the item that had been stashed, other locations were accessed. Now, any single query did not leak any information, since only two locations were accessed and those locations were still random. However, when multiple queries were
combined, a similar problem appeared, as had happened with the original cuckoo hashing protocol by Pinkas and Reinman. While it was possible that the item that had been placed in the stash would result in the same query locations, this is extremely unlikely. As such, if all of the items that are in a table are queried, it is extremely unlikely that the stash size implied by the queries will be as large as the original stash size, if the stash size is greater than or equal to one.

In this example, again imagine that the robin, owl and parrot were placed in an oblivious hash table. However, further imagine that they could not all be assigned to distinct locations, and therefore the parrot had to be placed in the stash. In this event, the owl and the robin will, when queried, still access their original locations. However, the parrot will access new locations. As such, the queries for any items that are in the stash are resampled. If instead we look at three random items which are not in the table, the probability that they cause a complete collision, i.e.
that all three of them are mapped to two locations, has not been affected. Therefore, there is a statistically significant difference between the probability that three items which are not in the table will be mapped to two locations and the probability that three items which are in the table will be mapped to two locations.

The solution to this is actually quite simple. If any item is placed in the stash, it also needs to remember which table it originally came from. Then, whenever it is queried, the table from which it originally came needs to be queried with the original query. So if we find the parrot, and the parrot originally came from level three, the protocol should still query the original hash locations of the parrot in level three. In subsequent levels it should, as before, query a random value, since the parrot may indeed have been queried at this level before. This successfully allows the sequence of access patterns at each level to be independent random samples from the table space. In other words, the query patterns are not affected at all by an item having been placed in the stash.

The additional information therefore serves as an alibi of sorts. In the same way that a criminal needs an alibi to claim that he or she was going about their daily life, this additional information makes it seem as if the item existed at its original level in the table, when in fact it was placed in the stash. Through doing this, the queries at each level appear completely random and the oblivious RAM scheme can be made secure. This additional information only requires a number of bits equal to the number of levels, and therefore does not increase the asymptotic cost of the protocol.

This flaw first appeared in the literature in 2012, so has existed for almost a decade. It was then inherited by a sequence of other works, and in total this flaw appeared in six different papers, including three oblivious RAM papers that were published in the last three years. This motivates the simplification
of oblivious RAM protocols and the development of modular, secure components. It also makes us realize that combining components in ways that undermine their basic security properties can result in unintentional security flaws. Lastly, it motivates the suggestion that reviewers should be compensated more for their efforts, so that they can devote more time towards reviewing such protocols and are therefore more likely to catch flaws such as this one.
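
As a rough illustration of the alibi fix described in the talk, the following Python sketch shows which value is hashed at each level during one virtual access. It is a minimal sketch under stated assumptions, not the construction from the paper: the function names, the use of SHA-256 as a stand-in for the per-level hash functions, the 16-byte nonces, and the numbering of levels from 0 (smallest) upwards are all hypothetical. It also assumes that levels before an item's home level are queried with the real key, exactly as they would be for an item that was never stashed.

```python
import hashlib
import secrets


def cuckoo_locations(key: bytes, level_seed: bytes, table_size: int) -> tuple[int, int]:
    """Two candidate slots for `key`, one in each subtable of a level.

    SHA-256 keyed with a per-level seed is an illustrative stand-in for the
    scheme's hash functions.
    """
    digest = hashlib.sha256(level_seed + key).digest()
    left = int.from_bytes(digest[:8], "big") % table_size
    right = int.from_bytes(digest[8:16], "big") % table_size
    return left, right


def keys_to_probe(key: bytes, home_level: int, num_levels: int,
                  found_in_stash: bool, use_alibi: bool) -> list[bytes]:
    """Return the value that is hashed at each level for one access to `key`.

    `home_level` is the level the item belongs to: the level whose table holds
    it, or, for a stashed item, the level it was stashed from (its alibi).
    """
    probes = []
    for level in range(num_levels):
        if found_in_stash and not use_alibi:
            # Flawed variant: the combined stash is checked first, so every
            # level, including the home level, is queried with a fresh nonce.
            use_real_key = False
        else:
            # Unstashed item, or stashed item with an alibi: the real key is
            # queried up to and including the home level, nonces afterwards.
            use_real_key = level <= home_level
        probes.append(key if use_real_key else secrets.token_bytes(16))
    return probes


if __name__ == "__main__":
    seeds = [f"seed-{i}".encode() for i in range(4)]  # hypothetical per-level seeds
    for label, stashed, alibi in [("flawed", True, False), ("alibi", True, True)]:
        probes = keys_to_probe(b"parrot", 2, 4, stashed, alibi)
        slots = [cuckoo_locations(p, s, 1024) for p, s in zip(probes, seeds)]
        print(label, slots)
```

With the alibi, the slots probed at the parrot's home level are its original cuckoo locations, so the per-level access pattern matches that of an item that was never stashed; in the flawed variant the home level sees locations derived from a nonce instead.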