Thank you. So this is joint work with Kasper Green Larsen and Jesper Buus Nielsen, also from Aarhus University. So welcome to the other half of the world of MPC: information-theoretically secure MPC. And in some sense, this talk is going to be about how different that half of the world is from what Abid just talked about, and the spoiler, I guess, is that it is indeed very different, in some ways at least. So as we all know, information-theoretically secure MPC protocols are so great: low computational overhead, no FHE here, best possible security guarantees; what's not to like? But as we also know, they're not so great, because you have to have many rounds and large communication complexity, at least as far as we know from the protocols that we know about. And a very important, long-standing open problem is whether those costs are inherent. So in this talk, we have some of the answers. The spoiler here is that the communication cost is indeed inherent, at least for some functions, but we don't have anything to say about the rounds. So just to be explicit about what I mean by communication overhead: one trivial observation is that if you don't care about security, of course you can compute any function with communication complexity the size of the input. You just send all the inputs to one guy, and he can evaluate the function. And then if everybody gets output, of course, you should add the total size of the output, because he needs to send the outputs back to the other guys. But we'll only talk about functions with very short outputs here, also to avoid cases where the communication complexity is large for trivial reasons, just because someone has to receive a large output. We don't really want to look at that. So the question is: if you take an information-theoretically secure protocol, must it communicate more than the input size? That's the most basic question. There's also a probably much harder question: what if the circuit size is much bigger than the input size?
Must you then communicate more than the circuit size of the function? That's not something we can say much about. We can say some things, though, but it really is a different question. So, our results. The model first: we consider statistically secure protocols, a synchronous network, and passive security, semi-honest security. We assume secure point-to-point channels, and we use the standard model here, where the length of the message that you send always leaks to the adversary. This is very natural, because if you're ever going to implement a secure channel, then whatever you're using, you shouldn't expect to be able to hide the length of the message from the adversary. There are two models we consider: honest majority, where the number of players n is 2t + 1 for corruption threshold t, and also dishonest majority with preprocessing; there you can also get information-theoretic security, and here n can be t + 1. What we show in both models is that for any number of players n and for infinitely many input sizes s, there exists a function f with input size s bits such that any protocol that computes f securely has to communicate at least some constant times ns bits. So that says that there has to be this factor-n overhead compared to the input size. Okay. And recall, just to make sure: this is not because every player has to contribute a long input, or because it has to receive a long output, because the outputs are in fact very small. Moreover, it happens to be the case that our functions have very small circuits, in fact linear-sized circuits in the input size. So the results therefore also say that some functions require communication n times the circuit size of the function.
So what this means, intuitively at least, is that if you have a general protocol construction that can compute any circuit securely, and if it takes the same approach to every circuit, whatever that means, then it must actually always have that factor-n overhead, because it has to have it, by these results, when it computes the function that we construct. So from that intuition, at least, it seems that we are always stuck with this overhead, at least for the protocols that we know about. For honest majority, we have a matching upper bound, n times the circuit size. That's motivated by the fact that previous results were off by a factor log n for circuits over small fields, say Boolean circuits. For the preprocessing model, there was already an upper bound shown in a paper from 2013 by Ishai et al. The upper bound is n times s bits of communication. This requires exponential-size preprocessed data, but if you can live with that, then the preprocessing case is essentially settled: the answer is n times the input size. Good. So we also extend this to sub-optimal thresholds. For honest majority, what about n = 2t + s, where s can be greater than 1? So the corruption threshold is smaller now. Then the bound is what it was before, but divided by s. And this is nice because it exactly matches what we can get for upper bounds using packed secret sharing. Packed secret sharing is this technique where you can share a vector of secrets, but each share is still only one field element. And in this way, you can do a bunch of arithmetic operations in parallel for the communication cost of one, essentially. So this gives smaller communication, but the price is that the corruption threshold must be smaller, and this exactly matches the lower bound that we get here. Okay. So before diving into how we do this, let me just mention some related work. First of all, this work by Ishai et al. from '13.
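As a quick aside before the related work: the packed secret sharing just mentioned can be sketched as a toy over a small prime field. This is a hypothetical illustration with parameters and names of my own choosing, not the scheme from the paper; the point is only that k secrets ride on one polynomial, so one share (one field element) carries a whole vector, and share-wise addition operates on all coordinates at once.

```python
import random

P = 97  # toy prime field; needs room for n share points plus k + t anchors

def interpolate(points, x):
    """Evaluate at x the unique polynomial through `points` over GF(P)."""
    total = 0
    for xi, yi in points:
        num, den = 1, 1
        for xj, _ in points:
            if xj != xi:
                num = num * (x - xj) % P
                den = den * (xi - xj) % P
        total = (total + yi * num * pow(den, P - 2, P)) % P
    return total

def packed_share(secrets, n, t):
    """Hide k secrets in ONE polynomial of degree t + k - 1: the secrets
    sit at points -1..-k, and t random anchors give privacy against t."""
    k = len(secrets)
    pts = [((-(j + 1)) % P, s) for j, s in enumerate(secrets)]
    pts += [(n + 1 + j, random.randrange(P)) for j in range(t)]
    return [interpolate(pts, i) for i in range(1, n + 1)]  # share of P_i

def packed_reconstruct(shares, k):
    """Recover all k packed secrets from the full share vector."""
    pts = list(enumerate(shares, start=1))
    return [interpolate(pts, (-(j + 1)) % P) for j in range(k)]

shares_a = packed_share([3, 1, 4], n=7, t=2)   # k = 3 secrets, one element each
shares_b = packed_share([2, 2, 2], n=7, t=2)
summed = [(a + b) % P for a, b in zip(shares_a, shares_b)]
# summed reconstructs to the pointwise sums [5, 3, 6]: three additions
# for the communication cost of one.
```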
They prove lower bounds for two parties in the preprocessing model, and the upper bound for the multi-party case that I mentioned. Then there's a work by Data, Prabhakaran and Prabhakaran. They do a lower bound for three parties and perfect security. That was actually the first result showing that communication sometimes has to be larger than the inputs, but only in that particular case with three parties and so on. There's a work from Eurocrypt '16 by myself, Nielsen, Ostrovsky and Rosen, where we show a lower bound on the number of messages: some functions require you to send n-squared messages. Everybody has to talk to everybody else, so to speak. This of course means you have to send n-squared bits in particular, but that's much smaller than our lower bound when the inputs are large. Finally, there is another work from '16 where we showed some lower bounds for gate-by-gate protocols. Those are protocols that work the way we're used to: you compute the circuit, you do every gate by itself, and there's a sub-protocol that you run for every multiplication gate. For that class of protocols, you can get very strong lower bounds, but of course only for that class of protocols. We want to do something for arbitrary protocols. So the starting point for the results is to look at two-party private information retrieval, PIR. Just to remind you what that is: there's a server, and there's a client. The server has a bit string x. The client has an index i that points to some position in the string x. They talk, a transcript t is formed, and at the end of the day, the client can compute the i-th bit of x, while the server is not supposed to learn anything. The only, very well-known and straightforward, fact that I need about this situation is that if this protocol is perfectly secure, then from the transcript, the client can always compute all of x, no matter what the protocol does. This is very intuitive if you think about it, because of the privacy requirement.
If the transcript missed information about some part of x, then the server could conclude that that's not the part that the client wants. All of x has to be there somehow. That's the only thing I need you to remember from this slide: in perfectly secure two-party PIR, the client can always compute the server's input from the transcript. So now let's go to honest majority. Oh, by the way, I forgot this: in the following, I only talk about perfect security, but all our results hold for statistical security as well; they're essentially the same. You take the perfectly secure result, you take a few small epsilons and subtract here and there, and then you get the results. I will distribute small epsilons afterwards so you can do it yourself. Anyway, honest majority, and three parties as the first step. So the function we consider here is the following. There are two parties on top, Snoopy and Lucy, and they each have a bit string as input, x and y. And Charlie Brown has no input, but gets the inner product of those two bit strings as output. And so we assume that there is a perfectly secure protocol for this inner product function, which is secure against one passive corruption. Good. So you run the protocol, and there are some messages sent between party one and party two. Call that transcript T12, which you can think of as a random variable. And similarly, we have T13 and T23 for what's sent between the other pairs of parties. So the first, pretty obvious, observation is that since P2 has no output and should learn nothing new from the protocol, then in particular T12 has to be independent of x, the input of the other guy, because you're not supposed to learn anything whatsoever; this is perfect security. That's pretty clear. So then, with that observation, we now consider what's going to happen if we run this protocol with a particular choice of the input y.
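The particular choice is going to be a unit vector, and a tiny sanity check (plain Python, names mine) shows why that turns the inner-product protocol into PIR: the inner product with the i-th unit vector, mod 2, is exactly the i-th bit of x.

```python
# Sanity check: <x, e_i> mod 2 == x[i], so the output party's result
# is precisely a PIR answer for index i.
def inner_product(x, y):
    return sum(a & b for a, b in zip(x, y)) % 2

def unit_vector(length, i):
    return [1 if j == i else 0 for j in range(length)]

x = [1, 0, 1, 1, 0, 1]
# inner_product(x, unit_vector(len(x), i)) equals x[i] for every i
```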
So let's set y to be the all-zero vector, except there's a one in the i-th position. If you do this, then of course the inner product of x and y is going to be the i-th bit of x, obviously. So that means we can now construct ourselves a two-party PIR protocol, because we're simply going to glue those two parties together there and consider them as being the client. And then Snoopy, the server, has input x, and these two guys together will learn the i-th bit of x with that choice of y. But what did we say about two-party PIR? We said that from the transcript of the protocol, which in this case is T12 and T13, the client can compute the server's input x. So it means that you can compute x from T12 and T13. However, we just also said that T12 by itself is independent of x. So from T12, you have no idea what x is; you bring in T13, and now all of a sudden you know what x is. So that means that T13 must have contained enough information to determine x. Therefore, intuitively, it has to be at least as large as x. I should perhaps mention, for those who know about the technicalities here, that what we actually show are bounds on the entropy of, say, T13, and this then implies that the average communication complexity must be at least the entropy, but that's a detail. Okay, so that's fine. And then from this we can lift ourselves one step more and do the... yeah, okay, this is the takeaway message from this slide: the guy who gets the output must communicate a lot. This is the only thing you need to remember. So now, the general case. Here we have 2t + 1 parties. First we have t incarnations of Snoopy. They look an awful lot like each other, but there's a good reason for this; you'll see. They're called P11 up to P1t. We also have t Lucys, P21 up to P2t, and we have one Charlie Brown, who is P3. So 2t + 1 parties all together. Okay, and the inputs that we have here: each party has...
Each of the top-row parties has a bit vector as input: x1 up to xt, and also y1 up to yt are bit vectors. And then in addition, every party has one bit as input: b11 up to b1t, and so on, and b3 for Charlie Brown. And the way we define the function is as follows. We define a value z, which is: you concatenate all the x-vectors, you concatenate all the y-vectors, and you take the inner product. That's called z. And then the outputs are defined as follows: each party gets as output that inner product times his input bit. So the input bit selects whether you learn something or whether you learn nothing. Okay, and again, that's the function that we will prove a lower bound for. We assume a secure protocol for this function that's secure against t corruptions now. Okay, so again, we're going to hard-code the inputs in a particular way so that things will behave nicely. One thing we can do is set all the b_ij's, all the input bits for these top-row guys up there, to zero, and only Charlie Brown's input bit will be one. So he's the only one who learns something in that case. Then we can get a three-party protocol from this, right? Because we're going to glue all the Snoopies together and all the Lucys together, and then we get a three-party protocol where, because the original protocol tolerates t corruptions and each glued group contains at most t of the original parties, any one of these three parties can now be corrupt. So that exactly mirrors the situation from the previous slide; it's exactly the same thing. So remember, the guy who gets the output has to talk a lot, right? So in this case, when we hard-code the inputs in this way, this guy must communicate at least order s bits, where s is the total input size, the combined length of all these vectors. But we can of course glue parties together in all kinds of ways. So we can also do something else, like we can say: let's now set b11 to 1.
So the first incarnation of Snoopy gets the output, and nobody else gets anything. And then we also set y1 to be the all-zero vector; I'll tell you why in a moment. So now we glue parties together like this: we glue the t − 1 last incarnations of Snoopy together with Charlie Brown, and all the Lucys are glued together. Now, what is going on here? The protocol, of course, computes the same function as before, namely it concatenates all the x's and all the y's and takes the inner product, but now, because y1 is set to be zero, that effectively wipes out x1, right? So what it does is, in fact, compute the inner product of x2 up to xt concatenated and y2 up to yt concatenated. It's a little bit shorter, but essentially the same size as before. And that's the inner product that the first incarnation of Snoopy will get. And now it's exactly the same situation as on the previous slide. Therefore, P11 now must communicate order s bits. And as you can easily imagine, we can do exactly the same thing for all the players. So, to summarize what I just said: for each party, it holds that I can hard-code the inputs in a particular way such that this party gets the output, and that party must then communicate some constant times s bits. But the point now is that the communication pattern, including the lengths of the messages, cannot depend on the inputs, right? Because in this model, even an adversary who corrupts no one sees the entire communication pattern and can do traffic analysis, and that should not reveal the inputs. So if some guy sometimes has to talk a lot, he has to do it all the time; otherwise, the fact that he talks a lot would depend on the inputs, and here it does. Therefore, the total communication in fact has to be Omega(n times s) bits. Okay, and note also that this function, of course, is just an inner product.
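To make the engineered function and the two hard-codings concrete, here is a minimal sketch (plain Python, all names mine): z is one inner product of the concatenated vectors, each party's output is z masked by its own selection bit, and setting y1 to all-zero really does wipe x1 out of the computation.

```python
# Sketch of the engineered function (names mine).
def f(xs, ys, bits):
    """xs, ys: lists of bit vectors (x1..xt, y1..yt); bits: one selection
    bit per party. z is a single inner product -- a linear-size circuit."""
    x = [b for xv in xs for b in xv]            # concatenate x1 .. xt
    y = [b for yv in ys for b in yv]            # concatenate y1 .. yt
    z = sum(a & b for a, b in zip(x, y)) % 2    # the inner product z
    return [z * b for b in bits]                # each output masked by a bit

# First hard-coding: only Charlie Brown's bit (the last one) is 1.
outs = f([[1, 0], [1, 1]], [[1, 0], [1, 1]], [0, 0, 0, 0, 1])
# -> [0, 0, 0, 0, 1]: only he learns z.

# Second hard-coding: P11's bit is 1 and y1 is all-zero, so z is now the
# inner product of x2..xt with y2..yt -- x1 no longer matters.
outs2 = f([[1, 0], [1, 1]], [[0, 0], [0, 1]], [1, 0, 0, 0, 0])
# -> [1, 0, 0, 0, 0]
```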
So you can certainly compute that using order s elementary operations. So the bound is also n times the circuit size of the function, as I promised before. Okay. Good, so that basically sums up what we do for the honest majority case with full threshold. I don't have time to talk in detail about the case of suboptimal threshold or the case of dishonest majority with preprocessing. It's basically very similar ideas, but with slightly different technical details. I guess I can say that if you have suboptimal threshold, what happens is that you can start gluing parties together in small groups. So we do the transition from the multi-party case to the three-party case in two steps: first you glue some parties together, then you get something which is essentially full threshold for this conglomerate of parties, and then you go the last step. That was probably not very clear, but hopefully it is clear from the paper. Okay, let's talk about the upper bound. For honest majority, there is in fact this result from Crypto '07, where I showed with Jesper Buus Nielsen that any arithmetic circuit can be computed with passive security and honest majority with communication order n times circuit size field elements. So you might say: but that's exactly the lower bound, isn't it? So there's already a matching upper bound. But there's a catch, namely that this only works if the field size is larger than n. If I want to compute a Boolean circuit, say, I can still do that with that protocol; I just need to run it over an extension field. I need to make it big enough that I can have n evaluation points in the field, and this means that I would have to put a log n factor into the communication complexity for Boolean circuits. So we can get rid of this by using a tool from last year, something called reverse multiplication-friendly embeddings.
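As a toy illustration of that idea, here is an interpolation-based sketch with made-up parameters (not the construction from the paper): vectors are encoded as degree < K interpolation polynomials, and since the product of two encodings has degree < 2K − 1, one multiplication in an extension field of degree at least 2K − 1 never wraps around the modulus, so it is literally polynomial multiplication, and decoding by evaluation returns all K pointwise products.

```python
Q, K = 5, 3          # toy parameters: base field GF(5), pack K = 3 values
POINTS = [0, 1, 2]   # K distinct evaluation points in GF(5)

def poly_mul(a, b):
    """Multiply polynomials (coefficient lists, low degree first) over GF(Q)."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] = (out[i + j] + ai * bj) % Q
    return out

def poly_eval(p, x):
    return sum(c * pow(x, i, Q) for i, c in enumerate(p)) % Q

def phi(vec):
    """Encode: the degree < K Lagrange polynomial with value vec[i] at
    POINTS[i], viewed as a big-field element in a polynomial basis."""
    acc = [0] * K
    for i, xi in enumerate(POINTS):
        basis, denom = [1], 1
        for xj in POINTS:
            if xj != xi:
                basis = poly_mul(basis, [(-xj) % Q, 1])   # times (X - xj)
                denom = denom * (xi - xj) % Q
        scale = vec[i] * pow(denom, Q - 2, Q) % Q
        acc = [(a + scale * b) % Q for a, b in zip(acc, basis)]
    return acc

def psi(coeffs):
    """Decode: evaluate back at the K points."""
    return [poly_eval(coeffs, xi) for xi in POINTS]

# ONE "big field" multiplication: the product has degree < 2K - 1, so no
# reduction by the extension's modulus ever happens here.
prod = poly_mul(phi([1, 2, 3]), phi([4, 0, 2]))
# psi(prod) gives the pointwise products [4, 0, 1]
```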
That's something that appears in a paper by Cramer et al. in Crypto last year. The idea is that it's a way to implement many parallel multiplications in a small field by doing just one multiplication in a bigger field. So basically what you do is take the two vectors that you want to pointwise multiply, and you encode them, using a special encoding function, into two field elements in the bigger field. Then you multiply in the bigger field once, and you get something which essentially encodes all the parallel multiplication results that you actually wanted. So then what we do is combine these two things: you basically run the old protocol from '07 over the bigger field, and this then does what you want, essentially. You have to do some unpacking at the end, but it turns out it's only for one big field element that you have to do something nontrivial. So that basically gives you the corresponding upper bound. Okay, so let me go to open problems and future work. There's ongoing work where we try, and I would be very happy to receive any inspiration for this, to get lower bounds also for, let's say, naturally occurring functions. That is, functions that we haven't specially engineered to be able to prove the lower bound. And there might be ways in which functions naturally behave in ways similar to what we engineered here. So that would be interesting. A really tough, but also immensely interesting, question is: what if the circuit size is much bigger than the inputs? Is there a lower bound that grows with the circuit size? This is a question of a completely different nature than the one we actually solved, which has to do with the input size. I mean, for one thing, for most functions we don't even know what the circuit size is, right? Without some kind of assumption, we will certainly get nowhere here.
But even then, I think you would need totally different techniques to do this. For the preprocessing model, there's this open problem left. The existing upper bound says n times input size. That's for optimal threshold, n = t + 1. But what if we have suboptimal threshold? Then our bound degrades a little bit: it divides the ns by the little s that appears there, so the lower bound gets smaller when the threshold gets smaller. And we don't know whether we have an upper bound that can match that. Okay? And finally, something which used to be an open problem, but I think is now actually closed: what about a lower bound for perfect malicious security, where n = 3t + 1? You might think that malicious security should be easy, because it implies passive security, so just apply our bound. But this doesn't work here, because the threshold there is n over 3, and we do something for n over 2. As the threshold gets smaller, the bound degrades, so if you go all the way down to n over 3, our bound says essentially nothing. It turns out that there's a different argument that explicitly exploits the fact that we have malicious security, and it gives the same result. There are some details we have to check, but it seems to work out. And then you get the same result, n times circuit size, as we have for passive security. And by a coincidence which is almost too good to be true, the next talk in this session, by Goyal et al., will tell you that that's exactly the upper bound. So that's quite amazing. But yeah, that's what I had, so with that I'll say thank you for your attention. Go ahead. So you use the fact that the communication channel leaks the size of the messages. Do you think the bounds still hold if you have some sort of idealized channel that doesn't leak? I think so. So, do the bounds still hold if the message lengths do not leak? I think we were maybe a bit lazy.
I think we could still prove something even without that assumption; then you'd have to have an adversary that corrupts some of the players and watches what is being sent. I'm pretty sure that could be worked out. So my question is: the theorem statements you gave are about the average communication complexity of a protocol, right? Could it be that that average is basically due to low-probability events with extreme communication? Can you make a statement about the variance, and argue that there can't be protocols that are actually pretty good, but whose averages are large for obscure reasons? That's a good question. I don't have a good answer to that. That would have to be, I think, a different study. I don't see anything we have in the paper that would tell you directly what the answer to that would be. Great. Thank you, Ivan.