Hi everyone, and welcome to this talk on large-scale, actively secure computation from LPN and free-XOR garbled circuits. It is joint work with Aner Ben-Efraim, Kelong Cong, Eran Omri, Emmanuele Orsini and Nigel Smart. Multi-party computation, or MPC for short, protocols are interactive algorithms which allow parties to securely compute on their private data. Here in our example they want to know who is the richest among themselves. The main challenge in MPC is that there is an adversary which controls and coordinates several of the participants, rather than the adversary being an external entity. By securely computing we mean a series of properties. The two most important ones are correctness, which guarantees that parties obtain the correct result of the computation: in our example here, that Alice is the richest among the four parties. And the second one is privacy, which means that parties learn no more information than what they can infer from their respective inputs and outputs. There are two main approaches to MPC protocols, which are secret sharing and garbled circuits. Each of these has its own advantages and disadvantages. Secret-sharing protocols are very good for arithmetic circuits and low-latency networks. The latter is due to the fact that their round complexity is linear in the depth of the circuit. On the plus side, they have a small communication complexity and they are also very cheap in terms of computation in general. Protocols based on garbled circuits are the best option for Boolean circuits and high-latency networks, since they have a constant number of rounds. On the other hand, communication costs in the garbling phase are bigger, since the size of a garbled gate depends linearly on the number of parties. Computation-wise this is even worse, since the dependency on the number of parties is quadratic. In our work we set out to get rid of this dependency on the number of parties.
This was previously done by a subset of the authors of this work together with Yehuda Lindell, but their protocols were only secure against passive adversaries and they could not support the free-XOR optimization. Given our improvements, the complexity of garbled circuits is now much more similar to that of secret-sharing-based protocols, and even asymptotically better for circuit evaluation. Let me clarify the setting and the goals of this work. We want a constant-round multi-party computation protocol based on garbled circuits in which the size of the gates is independent of the number of parties and which is concretely efficient, so we also want to support the free-XOR optimization. We will deal with static, active adversaries and we will be in the dishonest-majority setting, with different thresholds within that range. We will use Boolean circuits to model our computation and we will work in the offline-online paradigm. Let us start with a recap on garbled circuits, both in the two-party setting, which is Yao's protocol, and in the multi-party setting, which was first considered by Beaver, Micali and Rogaway. The core steps behind the passive version of Yao's protocol can be summarized as follows. First of all, Alice produces an encrypted, or garbled, version of the target circuit, which is C tilde here on the slides. She sends this to Bob, and then both parties engage in an input-encoding phase from which Bob obtains a garbled version of his inputs. Once he has this, he can evaluate the garbled circuit on the garbled inputs and learn the actual output of the circuit, and only that. Alice can get the output simply by having Bob send it to her. One could be led to think that it is easy to achieve the goals of our paper just by taking Yao's protocol and then using any MPC protocol to emulate all of its internals. But this would not really meet all of our goals.
First of all, yes, this would be a constant-round protocol, since the garbling step has constant depth. The size of the garbled gates would be independent of the number of parties, because that is how they are in Yao's protocol. And if our grey box here, which denotes our MPC protocol, were actively secure, so would be the overall result, bar some minor details. But the problem is that this would be extremely inefficient, since we would have to perform all of these garbling operations obliviously, and they involve evaluating PRFs, which do not have a very MPC-friendly circuit and are not really efficient even for the other alternatives that have been proposed. The BMR protocol by Beaver, Micali and Rogaway from 1990 does not differ too much from the previous idea, and the same goes for its successors. In this protocol, the parties emulate the garbling and input-encoding steps using some generic MPC protocol. But the core detail here is that the PRF values used for encryption are not obliviously evaluated within the generic MPC; rather, they are provided as inputs by the parties. Otherwise the process is very similar to Yao's. Parties produce the garbled circuit using MPC and the inputs they provide, they encode their inputs, and once they have all of these values, they can go to the online stage, where they run the evaluation procedure on the encoded circuit and encoded inputs to obtain the right output of the protocol. Those were the differences between Yao and BMR at the protocol level, but we need to look in more detail at the garbling step of these protocols. In Yao's protocol, a gate is garbled as follows. First, for each wire of the circuit, you sample a pair of keys corresponding to the two possible values on that wire. Next, you encrypt the truth table using these keys.
That is, under the two input keys corresponding to the two input values of the gate, you encrypt the key that corresponds to the output value you would obtain if you were to evaluate this gate in the clear. Finally, you permute the rows so that the evaluator cannot figure out the correspondence between the keys and the actual values from the row he is evaluating. If we add the free-XOR optimization, things look slightly different. Rather than sampling two keys per wire, parties now sample a single key per wire and a global correlation delta, and they define the remaining key as the sum of that key plus delta. This allows you to avoid garbling XOR gates: you can simply obtain the output key of any XOR gate by XORing the input keys. In the BMR protocol, more changes are needed to deal with multiple parties. Now each of the parties samples a key and a free-XOR correlation independently. If we put all of these together, we get a vector of keys and a vector of free-XOR correlations, which allows us to define the vector of keys that we are missing. Once we have this, we do the garbling similarly to how it was done in Yao's protocol: we encrypt the table and we permute the rows, except that this time the permutation has to be done obliviously by all the parties. The problem is that these double encryptions using vectors of keys require encrypting each of the individual keys separately. So now the size of the garbled gate is linear in the number of parties: we are encrypting each of the individual keys under the two vectors of keys. Our goal in this work, I remind you, is to get rid of this linear dependence on the number of parties. We can now present our scalable garbling scheme, which is based on the learning parity with noise problem. First of all, let us recap the decisional variant of the learning parity with noise (LPN) problem.
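As a toy illustration of the free-XOR key structure recalled above, here is a minimal sketch. The 128-bit key size, the integer representation of keys, and forcing the low bit of delta are my own illustrative choices, not details from the talk.

```python
import secrets

KEY_BITS = 128

def rand_key():
    # A random 128-bit wire key, represented as a Python int.
    return secrets.randbits(KEY_BITS)

# Free-XOR: a single global correlation delta; the "1" key on every
# wire is the "0" key XORed with delta.
delta = rand_key() | 1  # illustrative: force the lsb so delta != 0

def sample_wire():
    k0 = rand_key()
    return (k0, k0 ^ delta)

(u0, u1), (v0, v1) = sample_wire(), sample_wire()

# XOR gates are free: the output "0" key is just the XOR of the input
# "0" keys, and the delta-correlation carries through automatically.
w0 = u0 ^ v0
assert (u1 ^ v1) == w0            # (u0^delta) ^ (v0^delta) == w0
assert (u0 ^ v1) == (w0 ^ delta)  # mixed inputs give the "1" key
```

The evaluator holding one key per input wire of an XOR gate just XORs them; no garbled rows are needed for that gate.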
During the following slides, I will use teal to denote public values and purple to denote secret ones. The problem goes as follows. You sample a public matrix C with entries in F2, a secret vector s, and furthermore some error e according to a Bernoulli distribution. What the decisional LPN problem asks is whether you are able to distinguish the public values, the matrix C and the vector c = C·s + e, from uniformly random ones. Assuming that LPN is hard, it is very easy to build an encryption scheme. If you have a message, which is a vector of elements in F2, you just encode it according to some error-correcting code: multiply it by the generator matrix of the code and add the result to the LPN sample c. The idea here is that you can remove the term C·s if you know s, and the error will go away when you do the decoding in your error-correcting code. Given the encryption scheme from the previous slide, it is easy to build a garbling scheme if you do not have the free-XOR property. What changes from the previous construction is that, since you have two keys to encrypt each entry of a garbled gate, you go back to having a single key by summing them together, and that sum constitutes the secret key of the previous encryption scheme. The message that you encrypt is the wire key corresponding to the truth table for that entry of the garbled gate. Now, if we have the free-XOR property, the previous construction does not work. The reason is that, since the key for value zero and the key for value one are correlated by this value delta, the value s11 would be the same as the value s00: you are adding delta twice, and that cancels out because we are in a field of characteristic two. The same thing goes for s01 and s10, so we have lost all security in our garbling scheme.
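The LPN-based encryption just described can be sketched in a few lines. This is a toy version: the parameters are far too small to be secure, and I use a simple repetition code with majority decoding in place of the concatenated codes the actual implementation uses.

```python
import random

random.seed(1)
n, k, rep = 128, 16, 15   # toy sizes: secret length, message length, repetition factor
tau = 1 / 32              # Bernoulli noise rate
m = k * rep               # codeword / ciphertext length

def matvec(M, v):
    # Matrix-vector product over F2.
    return [sum(a & b for a, b in zip(row, v)) % 2 for row in M]

C = [[random.randrange(2) for _ in range(n)] for _ in range(m)]  # public matrix
s = [random.randrange(2) for _ in range(n)]                      # secret key

def encrypt(msg):
    e = [1 if random.random() < tau else 0 for _ in range(m)]    # Bernoulli error
    code = [bit for bit in msg for _ in range(rep)]              # G * msg (repetition code)
    Cs = matvec(C, s)
    return [a ^ b ^ c for a, b, c in zip(Cs, e, code)]           # C*s + e + G*msg

def decrypt(ct):
    noisy = [a ^ b for a, b in zip(ct, matvec(C, s))]            # remove C*s; error remains
    # Majority vote inside each repetition block removes the sparse error.
    return [1 if sum(noisy[i * rep:(i + 1) * rep]) > rep // 2 else 0
            for i in range(k)]

msg = [random.randrange(2) for _ in range(k)]
assert decrypt(encrypt(msg)) == msg
```

The key point is that encryption is purely F2-linear in the secret values s and e, which is what will later make it cheap inside MPC.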
You learn two entries of the garbled gate every time you get the corresponding input keys, rather than a single one. In our work, we solve the previous problem by using a permutation of the bits of the key space. Rather than defining the secret key s_ab as the sum of the keys, it is defined as one of the keys, say the one for the left wire, plus a permutation sigma of the bits of the second key. This way, the free-XOR correlation does not cancel out. We could have gone for a different solution: we could, for example, have applied the standard LPN encryption we described earlier twice. But this would have a significantly higher computational cost, almost double, one could say, since you need to sample one more matrix and do one more matrix-vector multiplication. In more detail, the use of a permutation leaves us with four possible keys, which are, for any a, b: the key s_ab, that plus delta, that plus sigma(delta), or that plus delta plus sigma(delta). So once a party learns any pair of keys k_{u,a} and k_{v,b}, they learn the value s_ab, and all the security of the garbling relies on the remaining hidden secrets, which are delta, sigma(delta), and delta plus sigma(delta). Hence, all of these encryptions rely on related keys, whose security we have to study. Furthermore, the values that are being encrypted also depend on delta, since they are also wire keys with the free-XOR correlation, so there is also some notion of key-dependent messages. In our work, we prove that our construction has related-key, key-dependent-message security, which is actually something stronger than the two notions separately. What we show is that the keys we use now, this key s_ab, only have to be as many bits longer as the number of cycles in our permutation sigma, and I invite you to check our paper for the details. Equipped with our LPN-based garbling, we can now present our two protocols for garbled circuits based on it.
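The permutation trick above can be checked concretely. In this sketch I pick a one-bit rotation as a stand-in for sigma (any F2-linear bit permutation works for the algebra shown here; the talk's cycle-count condition on sigma is not modelled):

```python
import secrets

BITS = 128
MASK = (1 << BITS) - 1

def sigma(x):
    # A fixed public bit permutation: rotate the 128-bit key left by one.
    # Bit permutations are F2-linear: sigma(a ^ b) == sigma(a) ^ sigma(b).
    return ((x << 1) | (x >> (BITS - 1))) & MASK

delta = secrets.randbits(BITS)
ku0, kv0 = secrets.randbits(BITS), secrets.randbits(BITS)
ku1, kv1 = ku0 ^ delta, kv0 ^ delta   # free-XOR correlated keys

def s(ka, kb):
    # s_ab = k_{u,a} + sigma(k_{v,b}) instead of the plain sum.
    return ka ^ sigma(kb)

# Without the permutation, delta cancels over F2 and s00 == s11.
assert (ku0 ^ kv0) == (ku1 ^ kv1)
# With it, the four gate keys differ by delta, sigma(delta), and
# delta ^ sigma(delta), which remain secret.
assert s(ku1, kv1) == s(ku0, kv0) ^ delta ^ sigma(delta)
if sigma(delta) != delta:  # holds except with negligible probability
    assert s(ku0, kv0) != s(ku1, kv1)
```

So learning one pair of input keys reveals exactly one s_ab, and the other three gate keys stay hidden behind the delta-related offsets.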
The first one, which we denote as authenticated garbled circuits, relies on the fact that LPN is an MPC-friendly way of encrypting data. Since the matrices are public, and the errors and keys are secret vectors of elements in F2, encryption is just an F2-linear combination that comes from computing the matrix-vector product. The problem with this solution, which is otherwise very easy to understand, is that sampling the secret errors in MPC is not cheap, since they follow this Bernoulli distribution. So the preprocessing will be slow in this case, and our next protocol will be better in that respect. But it has the advantage that the error-correcting codes that we can use for this construction seem to be more efficient. Our second protocol is the unauthenticated garbled circuits one. In this case, we assume that some constant fraction of the parties is honest; say, for example, ten percent, which would be something very reasonable, since we aim for a large-scale scenario with many, many parties. In this protocol, each party samples sharings of the LPN keys and of the noise locally, rather than together and obliviously as in the previous protocol, and computes a local, weak LPN encryption, which has the following property: the sum of all the weak LPN ciphertexts that belong to the honest parties, so that is n divided by c of them, is a secure LPN ciphertext. In order for parties to get the sum of all the honest LPN ciphertexts together with the ones from the malicious parties, because we do not know which ones those are, parties just add a random sharing of zero to their weak LPN ciphertexts. When you add all of these together, the sharing of zero goes away and you get the secure LPN ciphertext together with the ciphertexts of the malicious parties, which can only contribute a tolerable amount of noise.
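The summation of weak LPN ciphertexts masked with sharings of zero can be sketched as follows. This toy version only checks the algebra (zero shares cancel, noise and key shares aggregate); parameter sizes, the absence of a message term, and the honest-vs-malicious split are simplifications of mine.

```python
import random

random.seed(2)
n_parties = 8
m, n = 64, 32       # toy LPN dimensions
tau_i = 1 / 64      # per-party ("weak") noise rate

def vec(bits, p=0.5):
    return [1 if random.random() < p else 0 for _ in range(bits)]

def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

def matvec(M, v):
    # Matrix-vector product over F2 (F2-linear in v).
    return [sum(a & b for a, b in zip(row, v)) % 2 for row in M]

C = [[random.randrange(2) for _ in range(n)] for _ in range(m)]  # public matrix

# Each party locally samples a share of the key, its own weak noise,
# and a share of zero (the shares of zero XOR to the all-zero vector).
s_shares = [vec(n) for _ in range(n_parties)]
errors = [vec(m, tau_i) for _ in range(n_parties)]
zero_shares = [vec(m) for _ in range(n_parties - 1)]
last = [0] * m
for z in zero_shares:
    last = xor(last, z)
zero_shares.append(last)

# Each party publishes its masked weak ciphertext; s_i and e_i stay private.
cts = [xor(xor(matvec(C, s_shares[i]), errors[i]), zero_shares[i])
       for i in range(n_parties)]

total = [0] * m
for ct in cts:
    total = xor(total, ct)

# The zero shares cancel, so the sum is C*(sum of key shares) + (sum of
# weak errors): a standard LPN sample with the aggregate noise rate.
s = [0] * n
for sh in s_shares:
    s = xor(s, sh)
e = [0] * m
for er in errors:
    e = xor(e, er)
assert total == xor(matvec(C, s), e)
```

Because each published ciphertext is masked by a fresh share of zero, no individual party's weak (low-noise, hence insecure on its own) ciphertext is ever exposed; only the secure aggregate is.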
The good thing about this is that now we have the same number of oblivious operations as in the state-of-the-art protocols in the BMR setting with active security and dishonest majority. The problem, though, is that since we now have a single pair of keys for every wire, rather than a vector of keys where each key belongs to a specific party, we have lost the error-detection capability that was present in HSS17. There would be different ways to solve this problem, but the solution we take in our work is to introduce some output gates that bring back these error-detection capabilities, which basically have, again, these kinds of vectors of keys. Once again, the details are in the paper. I would like to conclude this talk by giving some details about our implementation and experimental validation of our results. I will only present the main takeaways of our work for the circuit evaluation phase, and for this we will compare with the previous work of BLO17. It is first interesting to see how much slower our work, which has active security and free-XOR, is compared with BLO17, which was only passively secure and could not support free-XOR. For circuits consisting only of AND gates, we are around six times slower, whereas if we take more realistic circuits that also include other kinds of gates, such as XOR gates, it turns out that in our complete example we are only around 15% slower than this passively secure protocol. Another interesting metric to look at is when our work becomes more efficient, for the circuit evaluation phase, than the best works that preceded us in this setting and have gates of size linear in the number of parties. In the passive-security case, for BLO17 this crossover was at around 100 parties compared with BLO16. In our work, it turns out that this crossover point is also around the same number of parties. Finally, it is important to note that our implementation is a proof of concept.
We get good performance thanks to our use of the two-key LPN encryption that I presented, with the permutation trick, and to sampling the LPN matrices on the fly using the AES key derivation function: this part of the implementation only takes between 15 and 20% of the time. On the other hand, the major challenge comes from the codes that are used. In our examples we use concatenated codes, which were easier to implement and whose failure probability was easier to analyze. This has some problems, though: we spend around 80 to 85% of the time decoding these codes, so there is room for improvement. This could come from different sources. We could use the recent GFNI instructions from Intel to improve the efficiency of the outer code, which is a Reed-Solomon code over the field with 2^8 elements. Or we could use a better decoding algorithm, such as generalized minimum distance decoding. Or, in turn, we could move to codes that would be more difficult to implement and analyze but would provide better performance, such as LDPC codes. Thank you very much for your attention, and see you at the live session.