Hello, I am Gayatri Garimella and I am happy to be presenting our work, Private Set Operations from Oblivious Switching. This is joint work with my co-authors, Payman Mohassel, Mike Rosulek, Saeed Sadeghian, and Jaspal Singh. Private Set Intersection, or PSI, allows two mutually distrusting parties to jointly compute the items they have in common. For example, if Alice's set has the letters of the word COVID-19 and Bob has the letters of the word VIRTUAL, then we expect Alice to learn the letters V and I and nothing else about Bob's set. In recent years, there has been much interest and progress in making PSI practically very fast and efficient in both the semi-honest and malicious settings. In PSI, the output reveals the entire contents of the intersection. But what if Alice only wants to learn some partial information about the intersection? These protocols don't extend immediately. A motivating example was described in a paper by Google in 2017: measuring the revenue from online advertisement viewers who later perform a related offline transaction. The functionality they need can be abstracted as follows: Alice has a set of items, and this time each item has a payload. Bob has his set of items. The goal is for Bob to learn the sum of the payloads of all the items in the intersection. He needs to learn the sum of the payloads of V and I, which is 15 in this example. More generally, we want robust protocols that let Alice and Bob compute any function F over the intersection and hide all other information. We call this problem Private Computation on Set Intersection, and it has been studied in the following works. The state-of-the-art construction was proposed by Pinkas, Schneider, Tkachenko, and Yanai in 2019. I will refer to their construction as the PSTY protocol for the rest of the talk. We start by identifying the performance gap. Concretely, to compute the plain intersection of a million items in the semi-honest setting, the best runtime is well under a minute.
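The abstracted functionality can be sketched as a toy ideal functionality in Python. The payload values below are made up for illustration; only the total of 15 for V and I matches the talk's example:

```python
def sum_over_intersection(alice_items, bob_items):
    """Ideal functionality: Bob learns only the sum of Alice's
    payloads for items held by both parties, nothing else."""
    return sum(payload for item, payload in alice_items.items()
               if item in bob_items)

# Alice holds the letters of COVID (with illustrative payloads);
# Bob holds the letters of VIRTUAL. The common items are V and I.
alice = {"C": 1, "O": 2, "V": 5, "I": 10, "D": 3}
bob = set("VIRTUAL")
print(sum_over_intersection(alice, bob))  # 15 (payloads of V and I)
```

In the real protocol, of course, neither party's set is ever sent in the clear; this sketch only pins down what Bob is allowed to learn.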
In contrast, computing even the cardinality of the intersection with the fastest semi-honest PCSI protocol takes around 9 minutes. PCSI is 20 times slower and needs 30 times more communication. We want to bridge this gap. Our starting point is the PSTY protocol. It starts with a preprocessing phase where Alice has input X and Bob has input Y. For the rest of this talk, let's assume the size of the sets is N. At the end of the preprocessing phase, Alice learns a fixed ordering of her items. Alice and Bob then learn vectors s and t such that s_i and t_i are an additive share of 0 if x_i is in the intersection. If not, s_i and t_i are a sharing of a pseudo-random value. To compute a function over the intersection, the s_i's and t_i's now need to be compared, but in such a way that the outcome of no individual comparison is leaked to Alice or Bob. So all the comparisons are made inside the secure computation. The output of the comparisons is then fed into another circuit that computes F. Now let's take a closer look at the communication cost of comparing strings. Comparing two L-bit strings is a Boolean circuit with O(L) non-free gates and needs O(L * kappa) communication, where kappa is the security parameter. Concretely, the circuit for the O(N) comparisons accounts for 96% of the communication cost in the PSTY protocol. In contrast, the state-of-the-art PSI protocols use a special-purpose comparison protocol. To compare two L-bit strings, it requires just 4.5 * kappa bits of communication. Notice that the communication is independent of the length of the strings. The caveat is that these equality tests reveal the output of the comparison to one of the parties. Since the special-purpose equality tests are very efficient, we are interested in making them compatible with the preprocessing step. Our main idea is to use an oblivious switching network to permute the order of the equality tests.
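A toy, non-cryptographic model of the shares this preprocessing produces, assuming XOR shares over 64-bit strings (in the real protocol the shares come out of an OPRF-based computation, not a trusted sampler):

```python
import secrets

def preprocess(alice_items, bob_set, bits=64):
    """Toy model of the preprocessing output: for each of Alice's items,
    the parties hold XOR shares s_i, t_i that are shares of zero exactly
    when the item is in the intersection, and shares of a pseudo-random
    value otherwise."""
    s, t = [], []
    for x in alice_items:
        ti = secrets.randbits(bits)
        if x in bob_set:
            s.append(ti)                      # s_i XOR t_i == 0
        else:
            s.append(secrets.randbits(bits))  # shares of a random value
        t.append(ti)
    return s, t

alice = list("COVID")
bob = set("VIRTUAL")
s, t = preprocess(alice, bob)
# What a naive equality test on aligned positions would reveal:
matches = [si == ti for si, ti in zip(s, t)]
print(matches)  # True exactly at the positions of V and I
```

This also makes the leakage problem visible: whoever sees `matches` directly learns which of Alice's positions are in the intersection, which is exactly what the oblivious shuffle will later prevent.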
Such a primitive needs O(N log N) OTs over a switching network, making the communication cost O(N log N * kappa). So asymptotically, we replace the comparisons in the secure computation with O(N log N * kappa) communication for shuffling and O(N * kappa) for the comparisons. In almost all cases, log N is much smaller than the length of the strings, giving us our improvement. Our contributions can be summarized as follows. We propose a new protocol to compute any arbitrary function over the intersection, provided that it is always safe to leak the cardinality of the intersection. This gives us 2.5 to 3 times lower communication than PSTY and faster runtimes on slower networks. Our main construction is what we call the protocol core, or PC. As I will show later in the talk, this immediately gives us the cardinality. Next, we show how to use the protocol core with O(N) OTs to learn the intersection or the union of the sets. The protocol core with O(N) OTs can also give us cardinality-sum and, more generally, any F over the intersection. In all these cases, the cardinality leak comes from the PC protocol. The protocol core with an additional OSN gives us our most general case, where Alice and Bob learn secret shares of the intersection. Finally, we have the first private-ID protocol that is dominated by symmetric-key operations. We get this using O(N) instances of an OPRF and the union protocol. The cost of an OPRF is comparable to OT, making this protocol very efficient. For the rest of the talk, I'll discuss the constructions of all the highlighted protocols. Let's start with the protocol core, or PC, which realizes a functionality we call the permuted characteristic. Here Alice has her set X. Her input to the functionality is a permutation pi that she chooses. Bob sends his input Y to the functionality and learns a bit vector e such that e_i is one if Alice's permuted item in position i belongs to his set. Otherwise e_i is zero.
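The permuted-characteristic functionality can be written down as a short ideal-functionality sketch (the representation of pi as an index list is my own choice for the toy):

```python
import random

def permuted_characteristic(alice_set, pi, bob_set):
    """Ideal permuted-characteristic functionality: Bob learns a bit
    vector e with e_i = 1 iff Alice's item at permuted position i is
    in Bob's set. pi is a list of indices: position i holds alice_set[pi[i]]."""
    ordered = [alice_set[j] for j in pi]  # Alice's items under her permutation
    return [1 if item in bob_set else 0 for item in ordered]

alice = list("COVID")
pi = list(range(len(alice)))
random.shuffle(pi)                        # Alice's secret random permutation
e = permuted_characteristic(alice, pi, set("VIRTUAL"))
print(sum(e))  # 2 -- the cardinality |{V, I}|, the only thing Bob can infer
```

Since Bob never sees pi, the positions of the ones carry no information to him; only their count, the cardinality, leaks.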
Bob doesn't know Alice's input set X or pi, so the contents of the intersection are hidden from him. However, the number of ones in vector e reveals the number of matches with items in his set, which is essentially the cardinality function. One of the main primitives we use for PC is batch private equality tests. The state of the art was proposed in KKRT16 and is very fast: computing the equality of a million strings on a fast network takes less than 10 seconds. So here, if Alice and Bob have n strings each, they want to test for equality, and one of them can learn a bit vector which reveals which of their n strings match. It's useful for our purposes to see that if a pair of strings match, then they are an additive share of zero. We can alternatively say private equality tests help Bob learn which positions in his vector are an additive share of zero. Our protocol core consists of three steps. First is the preprocessing phase from PSTY; I'll present a simplified version. Alice has her set X, and she uses Cuckoo hashing with three hash functions to place her items into buckets. Each bucket has at most one item, and each item is placed in only one of the three positions given by the hash functions. Bob will use simple hashing with the same three hash functions. Each of his items is placed in three buckets determined by h1, h2, and h3. Here each bucket can have more than one item. After this, he samples strings uniformly at random for each bucket, and he calls them t1, t2, and so on. Now Alice and Bob run an OPRF protocol once for each bucket. I will leave that as a black box for now. This step, along with a hint that Bob sends to Alice, helps Alice learn a string s_i for each of her buckets. Let's focus on Alice's cuckoo table to understand the properties of these s_i and t_i strings. In the first bucket, we have Y, which is not in the intersection. So s_1 and t_1 are an additive share of a pseudo-random value. Similarly, in position 5, Alice has a dummy item.
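A toy sketch of Cuckoo hashing with three hash functions; the SHA-256-derived hashes, the eviction policy, and the table size here are illustrative choices, not the paper's exact parameters:

```python
import hashlib
import random

def h(item, i, m):
    """Three hash functions derived from SHA-256 (an illustrative choice)."""
    d = hashlib.sha256(f"{i}|{item}".encode()).digest()
    return int.from_bytes(d[:8], "big") % m

def cuckoo_insert(table, item, m, rng, depth=200):
    """Place item in one of its three candidate buckets, evicting on collision."""
    for _ in range(depth):
        for i in range(3):
            b = h(item, i, m)
            if table[b] is None:
                table[b] = item   # each item lands in exactly one bucket
                return
        # all three candidate buckets are full: evict a random occupant
        b = h(item, rng.randrange(3), m)
        table[b], item = item, table[b]
    raise RuntimeError("insertion failed; grow the table")

rng = random.Random(0)  # seeded so the toy run is repeatable
m = 16                  # bucket count; in practice roughly 1.27x the set size
table = [None] * m
for x in "COVID19":
    cuckoo_insert(table, x, m, rng)
print(sum(v is not None for v in table))  # number of occupied buckets
```

Bob's simple hashing, by contrast, would place each item in all three of its candidate buckets, which is why every common item shares at least one bucket position with Alice's copy.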
So the strings are an additive share of a pseudo-random value. In the last bucket, Alice has S, which is in the intersection. Here the strings are an additive share of zero. Before I describe our complete construction, I will show what goes wrong if we use private equality tests immediately after the preprocessing step. Recall that batch equality tests can be used to learn positions in your vector that are an additive share of zero. If Alice learns the output as shown here, she can infer whether her item in that position of the cuckoo table belongs to the intersection. For example, s_3 is not equal to t_3 and is not an additive sharing of zero. Therefore, she can conclude that T is not in the intersection, and so on. I will not go into the details, but even when Bob learns the output of the equality tests, something goes wrong. He is able to learn which of his buckets correspond to additive shares of zero, and this leaks partial information about the contents of the intersection. So it's not safe to leak the output of the equality tests to Alice or Bob, and we need to fix this. To fix the problem I highlighted, we use a primitive called an Oblivious Switching Network, or OSN for short. What this building block gives us is this: Alice and Bob can send their additive shares y1 and y2 of a vector y to the functionality, and additionally Alice can choose a permutation pi. As the output, Alice and Bob learn new additive shares of the permuted vector pi(y). Note that Alice can correlate her input with her output share. However, Bob is oblivious to the permutation pi, and he learns a random additive share of pi(y), so he can't correlate his output with his input. We use this building block after the preprocessing step. Alice and Bob feed their additive shares s and t as inputs to the OSN. Then Alice chooses a random permutation pi, and they both get additive shares of pi(s XOR t).
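The OSN's ideal functionality can be modeled as follows; the reconstruction step happens only inside the ideal box, and in the real protocol it is realized with a switching network of OTs, so neither party ever sees the full vector y:

```python
import secrets

def oblivious_switching(pi, y1, y2, bits=64):
    """Ideal OSN functionality: given XOR shares y1 (Alice's, together
    with her permutation pi) and y2 (Bob's), output fresh XOR shares of
    the permuted vector. Bob's new share is uniformly random, so he
    cannot link his output to his input."""
    y = [a ^ b for a, b in zip(y1, y2)]             # only inside the ideal box
    permuted = [y[j] for j in pi]
    z2 = [secrets.randbits(bits) for _ in permuted]  # Bob's fresh share
    z1 = [p ^ r for p, r in zip(permuted, z2)]       # Alice's share
    return z1, z2

y1 = [1, 2, 3]   # Alice's shares
y2 = [10, 20, 30]  # Bob's shares; underlying vector is [11, 22, 29]
pi = [2, 0, 1]
z1, z2 = oblivious_switching(pi, y1, y2)
print([a ^ b for a, b in zip(z1, z2)])  # [29, 11, 22], the permuted vector
```

Because z2 is freshly random, Bob's view is independent of pi, which is exactly the property that hides the order of the subsequent equality tests from him.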
By the property of the OSN, Bob can't relate his input vector t to his output vector t'. Now Alice and Bob use the output vectors of the OSN as input to the equality tests, and we make Bob learn the output. Since Alice chose a random permutation pi that is hidden from Bob, he only learns the number of items in his vector that are an additive share of zero from the equality tests. This reveals the cardinality and nothing more. So we can see how the PSTY preprocessing with the OSN, followed by the equality tests, gives us the permuted characteristic functionality. Alice has a permutation pi. Bob only learns which of Alice's items belong to his set according to some ordering that is unknown to him. Next, we are ready to see how we can compute the private set union using our PC protocol. The goal of private set union is that if Alice has a set X and Bob has a set Y, then Bob can learn X union Y, as shown in the example. We can restate this goal as: Bob needs to learn all of Alice's items that are outside his set. So far what we have is the permuted characteristic functionality. This gives Bob an indicator vector e as output. This vector has value zero in all positions where Alice has an item that is outside of Bob's set. So Bob must somehow obtain those items. So permuted characteristic can be represented this way: Alice has her items permuted according to pi, and Bob has his vector e. We need to ensure that when his bit b is zero, Bob learns the item, and when the bit is one, he learns nothing. To do this, we use an oblivious transfer protocol as a building block. In OT, there's a sender who has two inputs m0 and m1, and a receiver who has a choice bit b. At the end of the protocol, the receiver learns the message indexed by his choice. Alice doesn't learn the choice bit, and Bob doesn't learn the other message. So we can have one OT for every bit in e. Alice arranges her messages so that her m0 message is her item and m1 is always a dummy value.
This gives Bob the union of the sets, as we can see here. Whenever his choice bit is one, he learns nothing. And when his choice bit is zero, he learns the item. Note that learning the intersection is very similar to this. Alice can flip her OT input messages so that Bob only learns her items when his choice bit is one, that is, when her item belongs to his set. Next, we look at the private-ID functionality. This was introduced by Buddhavarapu et al. in 2020. Here, Alice and Bob have input sets, and the goal is for both parties to learn a set of universal pseudo-random identifiers for every item in the union of their sets, with the property that Alice can identify all the identifiers that are associated with items in her set. So she can recognize the identifiers of the letters P, S, T, Y, and exclamation mark. Similarly, Bob learns a set of identifiers and can recognize those associated with items in his set. This functionality can be used by the parties to sort their private data relative to the global set of all identifiers. They can proceed item by item and do any private computation on their data, being assured that identical items are aligned. This is because all their common items have the same identifier. The original construction mainly uses public-key operations, and ours is the first construction based on OT extension, which is dominated by symmetric-key operations. For our private-ID protocol, we start with the oblivious pseudo-random function primitive. The sender, in this case Bob, learns the key of a pseudo-random function. The receiver, Alice, learns the PRF evaluation on her set of inputs. Since Bob has the key, he can compute the PRF evaluation on any input of his choice. For private ID, we need two instances of the OPRF. In the first one, Alice acts as the receiver and learns the OPRF evaluation on her input set. Bob learns the key k1 and locally computes the PRF evaluation on his set Y.
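The OT-based union step described a moment ago can be modeled in the clear like this, with each loop iteration standing in for one OT execution (None plays the role of the dummy message ⊥):

```python
def union_from_pc(alice_permuted, e):
    """Toy model of union from the permuted-characteristic output:
    one OT per position, where Alice's message m0 is her item and m1
    is a dummy. Bob's choice bit is e_i, so he receives the item only
    when e_i == 0, i.e. when it lies outside his set."""
    BOT = None  # stands in for the dummy message
    learned = []
    for item, bit in zip(alice_permuted, e):
        m0, m1 = item, BOT
        received = m1 if bit else m0   # what the OT delivers to Bob
        if received is not BOT:
            learned.append(received)
    return learned

bob = set("VIRTUAL")
alice_permuted = list("DICOV")   # Alice's items in her permuted order
e = [1 if x in bob else 0 for x in alice_permuted]
new_items = union_from_pc(alice_permuted, e)
print(sorted(bob | set(new_items)))  # the union of the two sets
```

Swapping m0 and m1 in the sketch gives the intersection variant: Bob then receives an item exactly when his bit is one.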
In the second instance, Alice learns a different PRF key k2, and Bob learns the PRF evaluations on Y. Alice can locally compute PRF evaluations under this key. We define the pseudo-random identifier as the XOR of both OPRF outputs. So at this point, Alice and Bob can each compute, and hence recognize, the identifiers of their own sets. But Alice doesn't know the identifiers of Bob's items that are outside her set, and vice versa. For this, Alice and Bob use our private set union protocol to learn the set of all the identifiers. We further optimize this approach, and you can find more details in our paper. Finally, I'd like to comment on our performance. We implemented all of our protocols, and here are some takeaways. For a million items, to compute the cardinality, we reduce the communication, and this gives us an improved runtime from nine minutes down to five minutes in the WAN setting. For computing the union, we improve the runtime from 14 minutes down to five minutes in the WAN setting. Our private-ID approach has more communication than the previous public-key-based approach, but we achieve a runtime improvement from six and a half minutes down to two minutes on faster networks. Finally, you can find our code and paper at these links. Thank you so much for listening.
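The identifier construction can be sketched by modeling the two OPRF instances in the clear, with a SHA-256-based toy PRF as an illustrative stand-in for the protocol's actual OPRF (the input sets below are also just for illustration):

```python
import hashlib
import secrets

def prf(key, x):
    """Toy PRF instantiated with SHA-256; the real protocol uses an
    OPRF so neither party sees the other's inputs."""
    return hashlib.sha256(key + x.encode()).digest()

def xor(a, b):
    return bytes(u ^ v for u, v in zip(a, b))

# After the two OPRF instances: Bob holds k1, Alice holds k2, and each
# party holds both PRF evaluations on its own items.
k1, k2 = secrets.token_bytes(16), secrets.token_bytes(16)

def identifier(x):
    # The pseudo-random identifier is the XOR of the two PRF evaluations.
    return xor(prf(k1, x), prf(k2, x))

alice, bob = set("PSTY!"), set("STOP")
alice_ids = {identifier(x) for x in alice}  # Alice recognizes these
bob_ids = {identifier(y) for y in bob}      # Bob recognizes these
# Running the union protocol over the identifiers then gives both
# parties the full set of identifiers.
all_ids = alice_ids | bob_ids
print(len(all_ids))  # identical items map to the same identifier
```

Because common items produce identical identifiers, the two parties can sort by identifier and be sure their shared items line up, which is the alignment property the talk describes.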