Let me just say a few words about our security model. When considering secure computation, we basically have three main adversarial models. The first adversary that we consider is the semi-honest adversary. In this case, we assume that the adversary follows the protocol specification, but may try to learn some additional information from the messages that he sees. The covert adversary, which is a rather new model, was introduced recently by Aumann and Lindell, and in this case it is assumed that the adversary may cheat, but takes a big risk of getting caught. So this is a slightly more realistic scenario. The malicious adversary, which we are going to focus on in this talk, is basically an adversary about whose strategy or behavior we make no assumption. He can do whatever he wants. The only assumption is that he is computationally bounded. So this is our setting.
Our proof works in the ideal/real paradigm. Basically, we have two settings, the real and the ideal. In the ideal setting, there is a trusted party, which is incorruptible, and the parties just send it their inputs and receive their outputs. In the real setting, the parties just communicate using some protocol. In order to claim that the protocol is secure, what we need to prove, essentially, is that for every adversary in the real setting there exists an adversary in the ideal setting such that the adversary's view, the messages that he sees, is essentially the same. Since no attack can be carried out in the ideal setting, except perhaps for changing the adversary's input, we can claim that the protocol is secure. So this is the ideal/real paradigm.
Now, the prior work regarding set intersection. There has been extensive work on computing this function securely, and there are two main techniques. The first technique uses pseudorandom functions, which I'm going to elaborate on a bit more towards the end of the talk.
I just want to say that all the previous work using this technique either worked in the limited covert setting, or used a smart card, or assumed something about the domain, the universe from which the elements are taken. Much more work has been done using the polynomial evaluation technique. I'm going to focus on the malicious two-party setting, and for this setting we have the FNP work by Freedman, Nissim, and Pinkas, which is the starting point for this work. Their solution for the malicious setting required a random oracle, which I'm going to talk about soon, and all the other papers using this technique introduced quadratic complexity, as I'm going to show in a minute. I just want to say that in this paper we combine these two techniques in order to achieve an efficient solution for the malicious setting under standard assumptions, meaning no random oracle.
So let's just overview this technique. In polynomial evaluation, you can think of one of the parties' inputs, x, as the set of roots of some polynomial, and to compute this polynomial you basically multiply all these polynomials of degree 1. It's like saying that the polynomial is zero exactly on this set: q of z equals 0 if and only if z is in the set.
A useful tool that we are going to use, and that was also used in previous solutions, is homomorphic encryption. We consider encryption that is homomorphic with respect to addition and scalar multiplication. It's very easy to see that using homomorphic encryption we can evaluate a polynomial obliviously, without knowing the coefficients of the polynomial. The security requirement for this type of primitive is that it should be infeasible, or even impossible for an unbounded adversary, to tell whether a ciphertext is an encryption of m1 plus m2 that was computed using the homomorphic property, or a fresh encryption of m1 plus m2 that was chosen randomly, regardless of the encryptions that I had before.
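As a rough illustration of oblivious polynomial evaluation, here is a minimal sketch using a toy additively homomorphic scheme in the style of Paillier. The parameters, the function names (`encrypt`, `decrypt`, `poly_from_roots`, `oblivious_eval`) and the example set are all assumptions made for illustration; the primes are far too small to be secure, and this is not the implementation from the talk.

```python
import math
import secrets

# Toy Paillier-style parameters (hypothetical, much too small for real use).
P, Q = 1009, 1013
N = P * Q
N2 = N * N
PHI = (P - 1) * (Q - 1)
MU = pow(PHI, -1, N)  # inverse of phi(N) modulo N (Python 3.8+)


def encrypt(m: int) -> int:
    """Encrypt m as (1 + N)^m * rho^N mod N^2 with fresh randomness rho."""
    while True:
        rho = secrets.randbelow(N - 1) + 1
        if math.gcd(rho, N) == 1:
            break
    return pow(1 + N, m, N2) * pow(rho, N, N2) % N2


def decrypt(c: int) -> int:
    """Recover the plaintext modulo N using the secret value PHI."""
    u = pow(c, PHI, N2)
    return (u - 1) // N * MU % N


def poly_from_roots(roots, modulus):
    """Coefficients (lowest degree first) of the product of (z - x) over roots."""
    coeffs = [1]
    for x in roots:
        nxt = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            nxt[i] = (nxt[i] - x * c) % modulus   # contribution of -x * c * z^i
            nxt[i + 1] = (nxt[i + 1] + c) % modulus  # contribution of c * z^(i+1)
        coeffs = nxt
    return coeffs


def oblivious_eval(enc_coeffs, y):
    """Compute an encryption of q(y) from encrypted coefficients only.

    Multiplying ciphertexts adds plaintexts; raising a ciphertext to a
    power multiplies its plaintext by that scalar.
    """
    acc = 1
    for i, c in enumerate(enc_coeffs):
        acc = acc * pow(c, pow(y, i, N), N2) % N2
    return acc
```

The key holder would encrypt `poly_from_roots(X, N)` coefficient by coefficient, and the other party would call `oblivious_eval` on those ciphertexts; decrypting the result gives 0 exactly when `y` is a root, under the assumptions above.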
The most popular candidate for such a scheme is the Paillier encryption scheme, which is additively homomorphic. So how does the solution of FNP work? Basically, Alice computes the polynomial q based on the set x. She encrypts the coefficients of this polynomial using homomorphic encryption and sends these encryptions to Bob. Bob evaluates the polynomial in some specific way. He computes q of y using the homomorphic encryption, then multiplies by some randomness, this is scalar multiplication, and adds y. Why do we add y? Exactly so that Alice can decrypt and then extract the values that are in the intersection, right? Because if y is in the intersection, then she would learn y. Otherwise, she would learn something random. Just note that this introduces quadratic complexity, because the degree of the polynomial is the size of the set, right? So for every polynomial evaluation, Bob has to do work proportional to the size of the set, and the total amount of work is the size of x times the size of y, which is really huge if we consider large databases.
So in order to reduce the complexity, FNP came up with the idea of using a balanced allocation hash function. What is a balanced allocation hash function? It's essentially a hash function, not in the cryptographic sense, just a shrinking function, which is basically a pair of functions, so that each element is inserted, or mapped, into the less occupied of its two bins. This was shown by Azar, Broder, Karlin, and Upfal in '99. What they showed is that with very high probability over the choice of the hash functions, the number of elements mapped into each bin is on the order of log log n, where n is the size of the set whose elements are being mapped. So by fixing the parameters properly, we get that the communication complexity is linear in the size of the set. As for the computation, each polynomial now has degree log log n, which I'm going to show now.
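The two-choice allocation described above can be sketched roughly as follows. The two hash functions are modeled here with tagged SHA-256, which is only an assumption for illustration (the scheme needs just a pair of ordinary, non-cryptographic hash functions), and the names `bin_index`, `candidate_bins` and `allocate` are hypothetical.

```python
import hashlib


def bin_index(tag: bytes, element: int, num_bins: int) -> int:
    """One of the two hash functions, selected by tag (b"h0" or b"h1")."""
    digest = hashlib.sha256(tag + element.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") % num_bins


def candidate_bins(element: int, num_bins: int):
    """The two bins an element may be placed in."""
    return bin_index(b"h0", element, num_bins), bin_index(b"h1", element, num_bins)


def allocate(elements, num_bins: int):
    """Insert each element into the less occupied of its two candidate bins."""
    bins = [[] for _ in range(num_bins)]
    for e in elements:
        b0, b1 = candidate_bins(e, num_bins)
        target = b0 if len(bins[b0]) <= len(bins[b1]) else b1
        bins[target].append(e)
    return bins
```

The party holding the set would build one small polynomial per bin, and the other party, who cannot know which of the two bins was chosen for a given element, would use `candidate_bins` to find the two polynomials to evaluate.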
So the amount of work for each evaluation would be log log of the size of the set, which is much smaller than x times y. So how do we use this balanced allocation hash function here? FNP basically proposed this solution. Given h0 and h1, which can be chosen by both parties, Alice maps her set into the appropriate bins. So now she has b polynomials instead of one, but the degree of each polynomial is log log n. She encrypts these polynomials and sends them to Bob. Now for every element in Bob's set, he has to evaluate both candidate polynomials, because he does not know which polynomial was used by Alice. So essentially he evaluates these two polynomials and sends the results back. This solution works perfectly in the semi-honest setting, very nice and very efficient. But when trying to move to the malicious setting, it's much more problematic.
So let's see which problems we need to solve in order to deal with malicious behavior. The first problem is: how can we force Bob to use the same y throughout an evaluation? For instance, Bob can use two different values, y1 and y2, in a single evaluation, which means that Alice would output y2 if and only if y1 is in the intersection. This means not only that we cannot simulate it; we have an actual attack, because independence of inputs does not hold in this protocol. In order to overcome this problem, FNP came up with a very nice idea in the random oracle model, which works as follows. Instead of adding y, Bob adds some random value s, and all the randomness required for the computation comes from h of s, where h is the random oracle. So he first chooses s, then computes h of s, and then uses the output of the random oracle to randomize the entire computation, including the r that he multiplies by q of y.
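A minimal sketch of this random-oracle check, under several assumptions: SHA-256 stands in for the random oracle, the encryption is a toy Paillier-style scheme with explicit randomness over two Mersenne primes (not secure), and Alice simply tries every element of her set when recomputing. All function names are hypothetical, and the real FNP protocol differs in its details.

```python
import hashlib
import math
import secrets

# Toy Paillier-style modulus built from two Mersenne primes (illustrative only).
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
N2 = N * N
PHI = (P - 1) * (Q - 1)
MU = pow(PHI, -1, N)


def enc(m, rho):
    """Encryption of m with explicit randomness rho, so it can be recomputed."""
    return pow(1 + N, m, N2) * pow(rho, N, N2) % N2


def dec(c):
    return (pow(c, PHI, N2) - 1) // N * MU % N


def fresh_unit():
    while True:
        rho = secrets.randbelow(N - 1) + 1
        if math.gcd(rho, N) == 1:
            return rho


def poly_from_roots(roots):
    coeffs = [1]
    for x in roots:
        nxt = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            nxt[i] = (nxt[i] - x * c) % N
            nxt[i + 1] = (nxt[i + 1] + c) % N
        coeffs = nxt
    return coeffs


def eval_encrypted(enc_coeffs, y):
    """Deterministic homomorphic evaluation: an encryption of q(y)."""
    acc = 1
    for i, c in enumerate(enc_coeffs):
        acc = acc * pow(c, pow(y, i, N), N2) % N2
    return acc


def oracle(s):
    """Stand-in for the random oracle h: maps s to the pair (r, rho)."""
    def h(tag):
        d = hashlib.sha256(tag + s.to_bytes(32, "big")).digest()
        return int.from_bytes(d, "big") % (N - 1) + 1
    return h(b"r"), h(b"rho")


def bob_reply(enc_coeffs, y):
    """Bob picks s and derives all his randomness from h(s)."""
    s = secrets.randbelow(N)
    r, rho = oracle(s)
    return pow(eval_encrypted(enc_coeffs, y), r, N2) * enc(s, rho) % N2


def alice_check(enc_coeffs, my_set, c):
    """Alice decrypts, recomputes Bob's steps from h(v), and compares."""
    v = dec(c)  # equals s whenever Bob's y is a root
    r, rho = oracle(v)
    for x in my_set:
        if pow(eval_encrypted(enc_coeffs, x), r, N2) * enc(v, rho) % N2 == c:
            return x
    return None
```

In this sketch a reply built with randomness that does not come from `oracle(s)` fails the comparison, which is the point of the construction: Bob's whole computation is pinned down by the single value s.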
Now when Alice decrypts this ciphertext, in case y is in the intersection she learns s. She has access to the random oracle, so she computes h of s, and she can recompute the entire evaluation as Bob did, so she can verify that he computed it honestly.
So the first question is, can we do it without the random oracle? And the answer is yes. We replace the random oracle with a pseudorandom function. A pseudorandom function is a keyed function whose output looks like the output of a random function to any computationally bounded observer. So basically the idea is that Bob chooses a key k for the PRF, the pseudorandom function, and then uses the output of the PRF to randomize the computation, instead of the output of the random oracle. He computes exactly the same thing, but the randomness comes from the output of the PRF instead of the random oracle. Of course the problem now is that Alice doesn't know k. She doesn't know the key, and we need to show how she can learn f of s under k.
In order to show you how to do it, I want to introduce the pseudorandom function evaluation functionality. Basically, we have two parties, where one party holds the key for the PRF and the other party holds the input for the PRF. The output is that the party holding the input, in this case Bob, learns the evaluation of the PRF on this input. The security requirement for this functionality is that Alice should not learn anything, while Bob should not learn anything but f of x.
So how can we use this functionality for our solution? Alice decrypts both polynomial evaluations. Remember that Bob has to evaluate two polynomials because he doesn't know which one was used by Alice. If the value y is indeed in the intersection, one of these decryptions gives s, right? So she needs to use one of these as her input to the PRF evaluation functionality.
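The PRF evaluation functionality can be pictured as a trusted party, as in the sketch below. HMAC-SHA256 is used here purely as a stand-in PRF, which is an assumption for illustration and not the function used in the paper; the names `prf`, `ideal_prf_evaluation` and `to_scalar` are hypothetical, and a real protocol would replace the trusted party with an interactive two-party computation.

```python
import hashlib
import hmac


def prf(key: bytes, x: bytes) -> bytes:
    """Stand-in pseudorandom function F_k(x), here HMAC-SHA256."""
    return hmac.new(key, x, hashlib.sha256).digest()


def ideal_prf_evaluation(key: bytes, x: bytes):
    """Trusted party: takes the key from one party and the input from the other.

    Returns (output for the key holder, output for the input holder).
    The key holder learns nothing; the input holder learns only F_k(x).
    """
    return None, prf(key, x)


def to_scalar(prf_output: bytes, modulus: int) -> int:
    """Map a PRF output to a nonzero scalar, e.g. the multiplier r."""
    return int.from_bytes(prf_output, "big") % (modulus - 1) + 1
```

In the setting of the talk, the key holder would be the party that chose k and the input holder would be the party that recovered a candidate s; both can then derive the same randomness with `to_scalar` without the key ever being revealed.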
However, we created a new problem, because now Alice has two potential inputs and she doesn't know which one to use for the pseudorandom function evaluation. So how do we solve this problem? This actually solves another, different problem as well, but basically we have Bob commit to y before Alice even sends the polynomials. This is kind of tricky, because there is no fresh randomness in the commitments, in the sense that he uses the same s that he uses for the polynomial evaluation. So it was non-trivial to prove that the commitment is still hiding. But basically he commits to y using the randomness s that he's going to use next for the polynomial evaluation. And this is what lets Alice verify against these commitments which s is the correct s that she should use for the PRF evaluation. So that's all for the first problem.
Okay, not much left to go. As we said, balanced allocation requires two polynomial evaluations. And now we have to force Bob to use the same value in both polynomials, because evaluating at two different values may help him learn some additional information, since he can observe Alice's output or Alice's behavior according to what he sends. So how do we force him? One solution would be to say, okay, let's evaluate each polynomial separately, then multiply the results, and then continue as before, just multiply by some random r that we get from the PRF of s. Of course, this does not work, because we would need fully homomorphic encryption for this: the polynomial evaluation requires an additive homomorphism, and now we also need to multiply the two evaluations. If you want an efficient solution along these lines, we don't know how to do it. So the solution is to use ElGamal, but in a somewhat tweaked way. How do we do it?
So recall that in ElGamal, apart from the group description and the generator g, the secret key is some alpha, and g to the alpha is the public key. In order to encrypt a message m, we basically choose some random beta and output g to the beta and g to the alpha beta times m. This is the ciphertext. ElGamal gives us an additive homomorphism, but in the exponent, and also a multiplicative homomorphism, which is the standard usage of ElGamal. So basically what we are going to do is evaluate the polynomial in the exponent, including the multiplication by some randomness r, but multiply s into this evaluation as in the standard usage of ElGamal. Note that if the value is in the intersection, then the whole exponent cancels to 0 and Alice can learn s. This introduces some subtleties, which I'm not going to get into, but this is basically what we do.
The third problem that we had to deal with was introduced in FNP, and I sort of hinted at it when I said that they used Paillier. In Paillier, the plaintext space is Z_n, where n is the product of two primes. So it's not a prime-order group. The problem here, and we don't know how to deal with this attack, is essentially that a malicious Alice chooses the secret key, so she knows the factorization of n. So what she can do is come up with a polynomial q such that for some values t, the evaluation q of t would not be co-prime with n. Then if Bob multiplies q of t by some random r, the result will not be random in the entire Z_n but only in some much smaller subgroup, which may give her some information. We don't know how to prove that she constructed the polynomial correctly, but using ElGamal already solves this problem, because ElGamal works over a prime-order group, so it gives us the solution to this problem for free.
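The ElGamal variant described above, with the polynomial evaluated in the exponent and s multiplied in as an ordinary group element, might look roughly like this for a single polynomial. The tiny group (p = 2039, q = 1019, g = 4) and all function names are assumptions chosen for illustration; commitments, the PRF-derived randomness and the second polynomial are left out.

```python
import secrets

# Toy prime-order group: P_MOD = 2 * Q_ORD + 1, and G generates the subgroup
# of order Q_ORD. These sizes are illustrative only and not secure.
P_MOD = 2039
Q_ORD = 1019
G = 4


def keygen():
    alpha = secrets.randbelow(Q_ORD - 1) + 1
    return alpha, pow(G, alpha, P_MOD)


def encrypt_exponent(pk, m):
    """ElGamal encryption of g^m: (g^beta, pk^beta * g^m)."""
    beta = secrets.randbelow(Q_ORD - 1) + 1
    return pow(G, beta, P_MOD), pow(pk, beta, P_MOD) * pow(G, m, P_MOD) % P_MOD


def decrypt(alpha, ct):
    """Return the encrypted group element c2 / c1^alpha."""
    c1, c2 = ct
    return c2 * pow(c1, Q_ORD - alpha, P_MOD) % P_MOD


def poly_from_roots(roots):
    """Coefficients modulo the group order, lowest degree first."""
    coeffs = [1]
    for x in roots:
        nxt = [0] * (len(coeffs) + 1)
        for i, c in enumerate(coeffs):
            nxt[i] = (nxt[i] - x * c) % Q_ORD
            nxt[i + 1] = (nxt[i + 1] + c) % Q_ORD
        coeffs = nxt
    return coeffs


def bob_evaluate(enc_coeffs, y, r, s):
    """Produce an encryption of g^(r * q(y)) * s from encrypted coefficients.

    Raising a ciphertext to a power scales the exponent (additive
    homomorphism in the exponent); multiplying the second component by s
    uses the standard multiplicative homomorphism.
    """
    c1, c2 = 1, 1
    for i, (a, b) in enumerate(enc_coeffs):
        e = r * pow(y, i, Q_ORD) % Q_ORD
        c1 = c1 * pow(a, e, P_MOD) % P_MOD
        c2 = c2 * pow(b, e, P_MOD) % P_MOD
    return c1, c2 * s % P_MOD
```

When y is a root, the exponent r times q of y is 0 modulo the group order, so decryption returns exactly s; otherwise, because the order is prime and r is nonzero, the result is s multiplied by a non-identity group element.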
So just a quick overview of the protocol, this is really high level. Bob sends the commitments, and Alice sends the b encrypted polynomials. Then Bob, for each of his elements, evaluates the two polynomials that are chosen according to the balanced allocation hash function. Now the parties have to run the pseudorandom function evaluation m times, where m is the number of elements in Bob's set. Finally, Alice is able to decrypt, check for consistency, and output all the values that are consistent with Bob's computations.
And just regarding efficiency: we inherit the efficiency of FNP from the balanced allocation, apart from the cost of the oblivious PRF. In the paper that we wrote, we actually used the Naor-Reingold pseudorandom function, for which the amount of work is proportional to the number of bits in the input, not per element but per bit, so this introduces an overhead of the length of each element. Today I think there are other oblivious PRF constructions that work with a constant amount of work in total, not per bit, so it could be much more efficient. And just a final remark: note that if the size of the intersection can be leaked, or is allowed to be learned by Bob, then the entire computation is proportional to the size of the intersection, because you have to run the oblivious PRF only for the values that are in the intersection.
And that's it. Just to summarize: we have a protocol, I think the first protocol I know of for set intersection, that works under standard assumptions, achieves almost linear efficiency in the set sizes, and is fully simulatable for malicious adversaries. It can even be made UC secure, which is easy to show. We also know how to use this technique to solve the set union problem. And that's it, thank you.