Hello and welcome. I'm Gayathri Garimella from Oregon State University, and I'm really happy to be presenting our work, Oblivious Key-Value Stores and Amplification for Private Set Intersection. This is joint work with my wonderful co-authors Benny Pinkas, my advisor Mike Rosulek, Ni Trieu, and Avishay Yanai. Let's get started. In private set intersection (PSI), Alice and Bob each have a private set of items as input. They can use a PSI protocol to learn the items they have in common. Here, Bob, who learns the output, learns the intersection items, namely the letters T, A, and L, and learns nothing else about Alice's set. In a large number of PSI protocols, you encounter the following scenario. Alice has a function F. For most keys, she doesn't care what this function evaluates to. For a small number of keys, she wishes to convey a specially crafted random value to Bob. This small set of keys depends on her input, and she wants to keep it private. So the goal is for Alice to compactly convey this function to Bob while ensuring that Bob can't distinguish whether he's evaluating F on one of these special keys or on a key outside of them. In the PSI-related applications that we consider, the values associated with the keys are uniformly random, so these protocols end up using polynomials for the task of conveying key-value pairs while hiding the keys. Polynomials have the property that when you interpolate through random values, the resulting polynomial F is uniformly random and hides the special keys from someone evaluating it. We see polynomials used to obliviously store key-value pairs in a host of PSI papers. In this paper, we formalize and study the general notion of an oblivious key-value store (OKVS) as a data structure to convey key-value pairs obliviously. We catalog various existing constructions, including polynomials, that qualify as an OKVS. In particular, in the PSI context, we looked at the PaXoS data structure, which was used to build the most efficient maliciously secure PSI in PRTY20.
PaXoS is a binary OKVS that uses the analysis of cuckoo hashing with two hash functions. PSI protocols need cuckoo hashing to work except with negligible failure probability, something of the order of 2^-40. The PaXoS paper gives an asymptotic analysis of the cuckoo hashing parameters, but it remains non-trivial to translate it into the concrete parameters needed to instantiate PSI in practice; they rely on heuristics to instantiate the PSI protocol. Our goal is to build a more efficient OKVS from cuckoo hashing with three hash functions, which is known to be much more efficient. Again, we are not equipped to determine tight concrete bounds for negligible failure probability. Our first construction extrapolates the parameters of PaXoS to design a binary OKVS using three hash functions. However, we would still like empirical confidence in our chosen parameters, and empirically verifying negligible failure probabilities like 2^-40 is infeasible. Our next contribution is a set of amplification techniques. Note that we can easily measure failure probabilities of the order of 2^-15 even on our personal laptops. We design methods that take an OKVS with a fairly large failure probability and convert it into an OKVS with negligible error. So for some constant c, we show how to compose OKVSs with error p into an OKVS with error p^c with minimal overhead. Thus, we provably amplify the failure probability of an empirically verified OKVS. This allows us to choose concrete parameters for our OKVS based on cuckoo hashing with three hash functions. Finally, we show that our improved OKVS gives us the fastest malicious two-party PSI protocol on slow and medium networks, by plugging our better OKVS into the construction of PRTY20. We also show many applications where an OKVS can be a plug-in replacement for polynomials to improve their efficiency. Given a set of n key-value pairs, we can encode them into an object s.
Later, we can use this data structure s to retrieve our key-value pairs. If we probe, or call the decode function on, any of the special keys, it returns the correct associated value. If we probe s on keys outside the set, we learn some value that we don't care about. When the associated values are random, you cannot distinguish whether you probed s on an encoded key or on some other key outside the set; this is what makes the key-value store oblivious. Let's look at some properties of an OKVS. The first thing we're interested in is compactness: we want to encode the n key-value pairs compactly, so if the OKVS has size m, we want m to be as close to n as possible. We can represent the encoding and decoding of an OKVS in terms of a matrix multiplication. During encoding, we determine each of the rows of the matrix K, where each row is associated with one key, and then solve for the vector s; the multiplication K times s is the decoding that gives us the set of values we're looking for. It's useful to have such linearity: if the values of the OKVS belong to some field and the matrix K consists of elements of the same field, we end up with a vector s, our OKVS, which consists of m field elements. A special case of a linear OKVS is a binary OKVS, where we restrict the elements of the matrix K to the binary values 0 and 1. Now let's look at some existing examples of an OKVS. Our first example is a polynomial. We can encode a set of n key-value pairs into a polynomial s that satisfies the constraints of these key-value pairs; that is, if you evaluate this polynomial on key k1, the result is value v1. To be explicit, the coefficients of this polynomial form the OKVS, and we can represent the encoding and decoding in terms of the same matrix multiplication. To decode a specific key, we look at the row of the matrix K associated with that key.
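To make the polynomial example concrete, here is a minimal sketch in Python over the toy field GF(97) (a real instantiation would use a large field; all names here are illustrative, not from the paper's code). Encoding interpolates the unique polynomial through the key-value pairs via Lagrange interpolation, and decoding evaluates it, which is exactly the dot product of the key's matrix row (1, k, k^2, ..., k^(n-1)) with the coefficient vector s.

```python
# Toy polynomial OKVS over GF(97). Illustrative sketch only.
P = 97

def encode(pairs):
    """Return coefficients (low to high) of the unique polynomial f of
    degree < n with f(key) = value for every pair, via Lagrange."""
    n = len(pairs)
    coeffs = [0] * n
    for i, (xi, yi) in enumerate(pairs):
        basis = [1]            # running product of (x - xj), as coefficients
        denom = 1
        for j, (xj, _) in enumerate(pairs):
            if i == j:
                continue
            nxt = [0] * (len(basis) + 1)
            for k, c in enumerate(basis):      # multiply basis by (x - xj)
                nxt[k] = (nxt[k] - xj * c) % P
                nxt[k + 1] = (nxt[k + 1] + c) % P
            basis = nxt
            denom = denom * (xi - xj) % P
        scale = yi * pow(denom, P - 2, P) % P  # yi / denom in GF(P)
        for k, c in enumerate(basis):
            coeffs[k] = (coeffs[k] + scale * c) % P
    return coeffs

def decode(coeffs, key):
    """Evaluate f(key) by Horner's rule: the dot product of the key's
    row (1, key, key^2, ...) with the coefficient vector."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * key + c) % P
    return acc

s = encode([(3, 10), (7, 20), (15, 30)])
assert decode(s, 3) == 10 and decode(s, 7) == 20 and decode(s, 15) == 30
```

Note that the OKVS has exactly n = 3 field elements here, the optimal size mentioned next; probing decode on any key outside the set returns some field element we don't care about.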
We can obtain the right value for that key by computing the dot product of the associated row and the OKVS s. What are the efficiency measures of a polynomial? Polynomials have optimal size: if we encode n key-value pairs, we obtain n coefficients. However, polynomials have the drawback that encoding and decoding n values requires O(n log^2 n) field operations. The next example is the binary OKVS PaXoS, which is also our starting point for building a more efficient OKVS. PaXoS has hash functions h1 and h2 as its parameters. I will present a simplified version, and start by showing how to decode, or probe, a value from s. For a key a, we look at the positions given by both hash functions, namely h1(a), which is 5 in this case, and h2(a), which is 1. We decode by computing the XOR of the values stored in these positions, so Decode(a) = s1 XOR s5. PaXoS needs to encode a set of keys, so how do we encode many such keys while ensuring that each of them decodes correctly? I will show a simple example. Suppose you have to encode four keys a, b, c, d. We use a method called peel-then-fill. First, we compute the hashes h1 and h2 of all the keys a, b, c, and d. Next, we recursively identify slots that are constrained by just one key. Here we can see that position 2 is constrained only by key c, and position 3 is also constrained by just one key, namely key b. When we have more than one option, we can break ties arbitrarily, so I will pick position 2, which is constrained by key c. Then we can delete the constraints associated with key c. Next, we identify that position 4 is now also constrained by a single key, key d, and delete the constraints associated with d. Now we see that position 3 is constrained only by key b, and we have only one option, so we delete the constraints associated with key b.
Finally, key a has two free positions, so we can simply delete this key as well. Once we have successfully peeled the keys, we fill the PaXoS in reverse order. For key a, we have two positions, slots 1 and 5, so we can pick any values for them as long as s1 XOR s5 matches the value of a. Next, we look at key b. Here, s5 has already been assigned a value, so we need to set s3 = s5 XOR (the value of b). We follow a similar process for key d: we notice that s4 is unfilled and s1 has already been filled, so we assign s4 = s1 XOR (the value of d). Finally, we plug in the value for slot 2, which is s2 = s4 XOR (the value of c). So this worked out rather nicely, but a natural question is: does encoding always work? Are there any bad events that prevent the successful encoding of keys? Consider a situation where no slot is constrained by a single key. How do we resolve the cycle? In PaXoS, the set of all keys that cannot be peeled is called the 2-core. Here, the 2-core consists of keys a, b, and c. In PRTY20, they show that if there is a known a priori bound on the size of the 2-core, then there is a method to resolve it; that is, they will be able to successfully encode the n keys. I will not delve into those details, but the theorem is as follows: the size of the 2-core exceeds log n with probability less than epsilon, so they increase the size of the PaXoS from n to n + log n + lambda, and this allows them to encode n keys successfully, except with negligible error. To reiterate, the only bad event that can happen while encoding is that the PaXoS 2-core overflows, that is, the size of the 2-core is greater than the expected size of log n, and this happens with negligible probability. One of our main motivations for this paper is that it is not trivial to translate this asymptotic analysis into concrete parameters when instantiating PSI.
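The peel-then-fill procedure above can be sketched as a small Python program for a 2-hash XOR scheme. The slot positions for a, b, c, d are hard-coded to match the talk's example (in real PaXoS they come from h1 and h2), positions are 1-indexed as in the talk so slot 0 is unused, and values are 16-bit integers; this is an illustration, not the paper's implementation.

```python
# Minimal peel-then-fill for a 2-hash XOR scheme (illustrative sketch).
import random

def encode(pos, values, m):
    """pos: key -> (slot, slot); values: key -> int.
    Returns the table s, or None if peeling gets stuck (non-empty 2-core)."""
    slots = {i: set() for i in range(m)}       # which keys constrain each slot
    for k, (i, j) in pos.items():
        slots[i].add(k)
        slots[j].add(k)
    live, order = set(pos), []
    progress = True
    while live and progress:                   # peel: remove any key that is
        progress = False                       # the sole constraint on a slot
        for i in range(m):
            cand = slots[i] & live
            if len(cand) == 1:
                (k,) = cand
                order.append((k, i))           # slot i will be filled for k
                live.discard(k)
                progress = True
    if live:
        return None                            # stuck keys form the 2-core
    s, assigned = [0] * m, [False] * m
    for k, i in reversed(order):               # fill in reverse peel order
        i1, i2 = pos[k]
        other = i2 if i == i1 else i1
        if not assigned[other]:                # both slots free: one is random
            s[other] = random.getrandbits(16)
            assigned[other] = True
        s[i] = s[other] ^ values[k]            # enforce s[i] ^ s[other] = value
        assigned[i] = True
    return s

def decode(s, pos, key):
    i, j = pos[key]
    return s[i] ^ s[j]

# The talk's example: a -> slots (5, 1), b -> (3, 5), c -> (2, 4), d -> (4, 1).
pos = {'a': (5, 1), 'b': (3, 5), 'c': (2, 4), 'd': (4, 1)}
values = {'a': 7, 'b': 11, 'c': 13, 'd': 17}
s = encode(pos, values, 6)
assert all(decode(s, pos, k) == v for k, v in values.items())
# A pure cycle has no singly-constrained slot, so peeling fails (the 2-core):
assert encode({'x': (0, 1), 'y': (1, 2), 'z': (2, 0)},
              {'x': 1, 'y': 2, 'z': 3}, 3) is None
```

The final assertion exhibits exactly the bad event discussed above: three keys whose slots form a cycle cannot be peeled.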
In PaXoS, to encode n items with negligible error, they pick a binary OKVS of size 2.4n. Using the analysis of cuckoo hashing with three hash functions, we can heuristically determine that to encode n items we need around 1.23n bins. However, to have empirical confidence, we would need to verify statements like: except with probability 2^-40, can we encode, using cuckoo hashing with three hash functions, 1 million keys into, say, 1.3 million slots with fewer than 10 keys appearing in the 2-core, with confidence 0.999? Running experiments to verify something like this is very resource intensive; even after investing millions of core-hours, it is not feasible to verify. Further, what if you want to use your OKVS in an application that needs failure probabilities of, say, 2^-80? Certainly you can't experimentally verify that. So this is our approach. We can verify statements like: except with probability 2^-15, can we encode, using cuckoo hashing, 1,000 keys into 1.3k slots with fewer than 10 keys appearing in the 2-core, with high statistical confidence, say 0.999? And the answer is yes: we can easily run experiments to verify something like this even on our personal laptops. So our main idea is to compose empirically verified smaller OKVSs into a larger OKVS by provably amplifying the correctness guarantee from 2^-15 to something like 2^-40. I will now show our first amplification method, called the replicated 3-hash garbled cuckoo table. To encode n keys, we start with two instances of the OKVS, s and s', each of which can encode n key-value pairs; s is parameterized by three hash functions h1, h2, and h3, and s' by three different hash functions h1', h2', and h3'. Let's see how we would decode in such an architecture.
To decode a single key a, we start by computing the hash values h1(a), h2(a), h3(a) and look at the positions they indicate, namely 2, 3, and 5 in this case; so decoding a in the first OKVS gives s2 XOR s3 XOR s5. In the second OKVS, we apply the three different hash functions, which point to positions 1, 3, and 4, and compute their XOR, s'1 XOR s'3 XOR s'4, to obtain the decoding of a within s'. We define the decoding of our replicated garbled cuckoo table as the XOR of the decodings within the first and second OKVS. Now let's see how we successfully encode values into this OKVS. Remember that for each individual OKVS, successful encoding depends only on the keys and the hash functions; in no way does it depend on the values associated with the keys. Now suppose you want to encode n key-value pairs, and let's focus on one of those pairs, namely a and its value. We start by testing whether s can successfully encode all the keys or whether a bad event happens. Without loss of generality, suppose s is unable to encode all n keys and fails. In this case, we fill s with random field elements r1, r2, and so on, and suppose that s' is able to successfully encode the same set of n keys. Then, to encode a specific key and its value, we start by probing that key in the random OKVS s, obtaining the value r2 XOR r3 XOR r5; this is the correction that we need to apply within the second OKVS. So, in the second OKVS, we encode the key a with its value XORed with r2 XOR r3 XOR r5. Now, when we decode the key a in this replicated garbled cuckoo table, we XOR both these values and get the expected value of a. Now, assume that each of these OKVSs, s and s', fails with probability p.
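The replicated encoding logic can be sketched with a deliberately simple stand-in for the fallible base OKVS: a table that stores the value at slot h(key) and fails whenever two keys collide, with a toy linear hash. Everything here (make_instance, the seeds, the hash) is illustrative and is not the paper's 3-hash construction; the point is only the control flow: test whether instance 1 can encode the keys, fill the failing instance with random values, and push corrections into the other one.

```python
# Replicated-OKVS control flow, with a toy collision-failing base OKVS.
import random

W, M = 16, 97                     # value width in bits, table size (toy)

def make_instance(seed):
    h = lambda key: (seed * key + seed * seed) % M   # toy hash, not crypto
    def enc(pairs):
        s = [random.getrandbits(W) for _ in range(M)]  # unused slots random
        used = set()
        for k, v in pairs.items():
            if h(k) in used:
                return None       # collision: this instance fails to encode
            used.add(h(k))
            s[h(k)] = v
        return s
    probe = lambda s, key: s[h(key)]
    return enc, probe

def replicated_encode(pairs, seed1=1, seed2=2):
    enc1, probe1 = make_instance(seed1)
    enc2, probe2 = make_instance(seed2)
    # Encodability depends only on the keys and hash functions, so we can
    # test instance 1 with dummy values.
    if enc1({k: 0 for k in pairs}) is not None:
        s2 = [random.getrandbits(W) for _ in range(M)]   # s' is random
        s1 = enc1({k: v ^ probe2(s2, k) for k, v in pairs.items()})
        return (s1, probe1), (s2, probe2)
    s1 = [random.getrandbits(W) for _ in range(M)]       # s is random
    s2 = enc2({k: v ^ probe1(s1, k) for k, v in pairs.items()})
    if s2 is None:
        raise RuntimeError("both instances failed (probability about p^2)")
    return (s1, probe1), (s2, probe2)

def replicated_decode(inst1, inst2, key):
    (s1, probe1), (s2, probe2) = inst1, inst2
    return probe1(s1, key) ^ probe2(s2, key)             # XOR of both decodes

pairs = {3: 5, 7: 9, 15: 12}
i1, i2 = replicated_encode(pairs)
assert all(replicated_decode(i1, i2, k) == v for k, v in pairs.items())
```

Whichever instance ends up random contributes a correction that cancels under XOR, so decoding always returns the intended value; encoding fails only when both instances fail, which is the p^2 event discussed next.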
The only time this replicated architecture is unable to encode the set of n keys is when both s and s' fail to encode it, and these are independent events, so the probability that encoding fails is p^2. The drawback of this construction is that we double the size of the OKVS, and we are looking to be more efficient than that. So our goal is to amplify the failure probability with minimal increase in the size of the OKVS. I will show a simplified version of our construction and then generalize it when presenting the concrete parameters later. Suppose you want to encode n key-value pairs. You choose a hash function h1 and hash each key into, say, one of three bins. You associate one OKVS with each of the bins, plus one central OKVS, and each of these OKVSs has parameters to encode n/3 items, not n items. Let's see how we would decode a key a in this construction. You first apply the hash function h1 to see which bin a falls into. Suppose it falls into bin 1; then you decode key a by decoding it within the central OKVS n0 and within the OKVS n1, and XORing both values. If a fell into bin 3 instead, you would decode it within the central OKVS and within the third OKVS, and XOR both values. Now let's see how we encode values into such an OKVS so that all the decodings work correctly. We start by testing each of the leaf OKVSs: can n1 encode all the keys assigned to its bin, can n2 encode all the keys assigned to its bin, and so on. Suppose one of the leaves is unable to encode its keys, say n1. In this case, we need to fix the central node so that it allows us to successfully encode all the keys that lie in bin 1.
So, we update the values of n0 so that n1 XOR n0 gives the correct decoded values for all the keys in bin 1. Since n2 and n3 can successfully encode their keys, they simply adjust their values according to this OKVS n0. Similarly, if only n3 failed, you first fix the central node n0 to decode correctly all the keys that lie in bin 3, and once the central OKVS is fixed, n1 and n2 can successfully encode all the keys in their bins. If n1, n2, and n3 can all successfully encode, you just pick a random OKVS for n0 and adjust the values of n1, n2, and n3 so that decoding gives the expected value for each key. If only one leaf fails, say n1, then you adjust n0 to match n1, and then adjust n2 and n3 to match n0. So the bad event is when two or more nodes fail. Let's calculate the probability of this bad event occurring. If each individual node fails with probability p, then we compute the probability as a summation over two or more nodes failing, and this roughly squares the failure probability. Now I will show the concrete parameters we obtained for an OKVS that can encode a million key-value pairs. We had Q bins, where Q is 160, and by a standard balls-and-bins analysis, each of the bins has maximum load 7163 keys except with negligible probability. So each of the leaf OKVSs and the central OKVS needs to encode 7163 keys. We plug this into our formula for the OKVS size of a 3-hash garbled cuckoo table and found that we need an OKVS of size 8622. We ran empirical experiments and saw that only one bad event occurred in 2^23 runs of the small OKVS, each of which holds 7163 keys and is parameterized by different random hash functions. From this, the probability that an encode fails is 2^-29.35 with confidence level 0.999. We obtain that the probability that encoding a million items fails is 2^-45.05, which is negligible.
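As a sanity check on these numbers (my own back-of-the-envelope computation, not code from the paper): with Q + 1 = 161 small OKVS instances, each failing independently with probability p = 2^-29.35, encoding fails only when two or more instances fail, and the dominant term of that binomial tail, C(161, 2) * p^2, reproduces the quoted 2^-45.05.

```python
# Dominant term of P(at least 2 of the 161 instances fail).
import math

Q = 160                       # number of bins; plus one central OKVS
p = 2.0 ** -29.35             # empirically measured per-instance failure rate
dominant = math.comb(Q + 1, 2) * p ** 2
print(round(math.log2(dominant), 2))   # -45.05, matching the talk
```

The higher-order terms (three or more instances failing) are smaller by further factors of roughly 160 * p, so the two-failure term dominates.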
Our encoding time was around 2.915 seconds, and the decoding time for all the million items was 1.625 seconds. The size of the OKVS was (Q + 1) times the size of each of the smaller OKVSs, which is around 1.38 million. In our paper, we describe many applications, or use cases, for our OKVS construction. Our 3-hash garbled cuckoo table construction with the star architecture can replace any random encoding task. We can replace the use of polynomials in the following use cases. The first one is sparse OT extension, which was used to design a communication-efficient semi-honest PSI protocol in PRTY19. One of their constructions relies on the interpolation of a large polynomial, and our OKVS can replace this. Oblivious programmable PRFs (OPPRFs) are an important building block in many PSI constructions. Many OPPRFs again rely on a polynomial to encode key-value pairs, and we can replace their polynomials with our OKVS. OPPRFs are used to build circuit-PSI protocols; they were also used to design the private set union protocol in KRTW19, and they form a major building block for the multi-party private set intersection protocols described in KMPRT17. While exploring the multi-party PSI protocols in KMPRT17, we realized that one of their constructions, which is augmented semi-honest secure, is actually secure against malicious adversaries with a small modification. So we obtain the most efficient maliciously secure multi-party PSI protocol and describe it in our paper, with a qualitative analysis of why this is the most efficient protocol to date. Our OKVS can also replace the role of PaXoS in the following papers. As a flagship example, we consider the OT-based PaXoS PSI protocol presented in PRTY20. By replacing the PaXoS with our OKVS, we obtain the fastest malicious two-party PSI protocol to date, with the added advantage that it is empirically verified.
We also present a new PSI protocol by generalizing the OT-based PaXoS PSI protocol to admit any linear OKVS, obtaining a new vector-OLE-based OKVS PSI protocol. Concurrently, RS21 showed a vector-OLE PaXoS PSI protocol, and we suggest that the PaXoS can be replaced by an OKVS in their paper as well. Finally, I'd like to provide some takeaways from our experimental results. When computing the intersection on a million items, our constructions, the 3-hash garbled cuckoo table based on heuristic parameters and the 3-hash garbled cuckoo table based on the star amplification, had about 1.61 times and 1.43 times less communication than PaXoS PSI, respectively. This means that on slow and medium networks, we have the fastest maliciously secure PSI protocol, and on slow networks, we have the fastest semi-honest PSI protocol. With that, I would like to thank you for your attention.