Hi. So I'm Ben Kreuter, and I'm going to talk about how we've been using MPC at Google. Before I begin, I'd like to say: every time I tell people in this community, hey, we're doing MPC at Google, everyone gets really excited and starts asking, oh, are you using garbled circuits? Are you using GMW? And then I have to disappoint everyone and say, nope. And we're going to talk a bit about why. I think probably the most interesting part here is what separates the academic work from the work that companies like Google are doing in this field. So we'll start with that, and then I'll talk about two applications that illustrate two different scenarios in which this is being used: a business-to-business application, and an application on consumer devices, on smartphones.

So let's start with this separation. Why is research going one way while everything else seems to not be going that way? At a high level, the academic view of MPC has mostly looked at generic protocols: protocols where you can take any polynomial-time functionality and compile it down into something you can then run between many parties. That's very nice if you're a developer, or one would hope, because you don't have to get into the crypto. You just have to specify what your function is, who should participate, and what you want them to learn. There's also been a lot of interest in the strongest security model possible: the malicious model, or these concurrent models that go even beyond that. And there have been some very good results. We've defined security models on the theory side and developed a very good and very useful theory around them. We've shown that any function you want a protocol for has a protocol. We've shown that you can even do it if only one of the parties is honest, so the whole world can be against you and you can stay safe. And, as Yehuda was just saying, we have implementations of these systems, and for many years people have been developing them. So this is what things look like on the academic side.

Now, the theoretical results have a lot of value in practice, or so I think. The security definitions especially: when we design these things, we need to know how to think about security, how to think about what we're actually doing. The feasibility results are good. My team at Google does work for teams that come to us and ask for things, and it's very nice to be able to say, yes, on paper, we can do that. Impossibility results are also good, because then we can say, actually, on paper, we can't do it, and then try to solve the problem another way. So we know where not to waste our time, and we know when we're doing something that's at least possible. On more than one occasion, I have actually found a security problem in one of our designs by trying to write the simulator for it. For those not familiar, the simulator is how we define security for MPC: you want to show that you can generate a transcript without actually running the whole protocol, so that you learn nothing from the transcript. So this is all very useful.

And about the security models in practice: the semi-honest model is actually not as bad as you might think. In some settings it has real value. It's akin to forward secrecy in a sense: if I haven't cheated yet, you know that everything thus far is secure.
And if you can audit it, if you can get just below the covert model but still be able to catch cheating sometimes, that's actually good enough in a lot of cases, especially in business-to-business settings where contracts can be used and you can sue people. It's also worth pointing out that in a lot of cases, we care more about privacy than correctness. The malicious model guarantees both, so you obviously can't satisfy it if you only guarantee one. It's a little hard to define that; it's a little hard to say we have privacy but you can change the output, because maybe you change it to just repeat my input, right? But in terms of what we really care about, privacy is often more important. And kudos to the researchers who developed the covert model, because if I remember correctly, they specifically said it was for realistic applications, and that's 100% true. That is probably the most useful thing to achieve for these business-to-business cases. In the consumer setting, when you have people you can't just bring into court, you probably want the malicious model. And these concurrent composition questions do come up; sometimes non-malleability is an important thing to have. Of course, as you go down the list of security models here, it gets harder and harder to achieve them practically. So that is the tension.

Now, this I think is probably the biggest gap between academic work and what we do. Academic work tends to focus on the end-to-end running time or the throughput of some system or protocol, but doesn't really break down the cost. For example, let's say we're doing a two-party computation between Google and someone else. We have a data center with tons and tons of machines in it, and they all have to share the uplink, or uplinks if there's more than one. And the other party has some resources. That shared network is going to have a ton of contention; you're not going to get to monopolize it with your protocol. And even if you could, you're not going to run your protocol once. You're going to run it over and over and over, and you're just going to have contention against yourself. So the network is usually the costliest resource for you. And this is equally true on consumer devices: there's only so much wireless spectrum to go around, and crypto is not the only thing people are doing. So for a business to use this, there's obviously some reason they want to run it. There's a benefit. If the benefit outweighs the cost, it could be used. Not guaranteed, but it could be.

So, a few practical challenges. Consumer devices usually don't get to communicate directly with each other. My phone is behind a NAT that my service provider runs, so I have to go through some cloud service. There isn't really any reliable PKI, sorry. And for some protocols, like the one I'll talk about later, we're going to select random sets of phones; if we're malicious, we'll just select phones that we control, right? So how we select the parties is an interesting question. The cost metrics on consumer devices are also different. You can measure cost in terms of how much energy you've used, which is a very different thing from a data center or a desktop; a laptop does have a battery. Also, consumer devices fail all the time, at random points in a protocol. This can create an interesting security problem where a device sends half a message and dies, and winds up ruining its own privacy in the process.
Some protocols fail miserably if you don't finish. With ORAMs, for example, you either lose security or you lose all your data if you don't complete the protocol, which is usually a problem. Also, of course, there are Sybil attacks, where you just simulate things and call them cell phones. These are a concern. And there's this theoretical result that you either get security against everyone, in which case Sybil attacks are irrelevant, or you get to ensure that the computation finishes even if parties drop out. You can't have both. Usually it's more important to finish, and so you have to carefully ensure that these sorts of malicious coalitions can't really be formed at a large scale. Marley had mentioned there's no such thing as a non-colluding service provider. I'm going to say a slightly weaker version: it's not impossible, it is just quite difficult. I say it's not impossible because it's already been done in a real protocol. But if you're trying to deploy a system, it's not really a good thing if some third party can stop playing and then your system stops working. It adds a lot of risk when you want to do something like that, and there's a bunch of questions that will come up if you try to propose it to a non-cryptographer.

So, our applications. This is a sort of famous quote from advertising, right? People spend a lot of money advertising their products, and they want to know that that money is well spent. I think this guy sold tractors and wasn't really sure who saw his ads and then bought them. Set intersection functionality is actually very useful for this; we use it for a lot of things. It's very good when you want to compute some aggregate information over a joint data set. In particular, in these business-to-business settings, that data is often something you can't share with other businesses. Google has user data, and this tractor salesman maybe has some data about his customers, like who came in and bought tractors. Those two data sets are very private and very sensitive, so you can't just send them out to other people to work with.

So this is how we do our ads attribution now. If we compute the intersection of who saw the ads and who bought the tractors, the size of the intersection gives us some information. If we augment this with some extra data, so we intersect on the key and the value is how much they spent, we can figure out quantitatively the effectiveness of the ad campaign he was running. It's very important that we not reveal who's actually in the intersection, since that amounts to revealing at least partial data about either who bought tractors or who saw the ads. So we'll only reveal the size.

And here's our basic protocol. It's a Diffie-Hellman-based protocol, and we augment it with some additively homomorphic encryption, like Paillier, because we only need a sum. It's very quick; obviously, this is for an honest-but-curious setting. It is simple, and this is a non-technical advantage, not just because it's easy to implement, but because we have to convince other people that this is something they should do. We have to convince non-technical managers, we have to convince lawyers, and then we have to convince the people who need to implement it. There are different opinions on who is the hardest among these to convince.
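To make that flow concrete, here's a toy sketch in Python. This is not our production protocol: the small safe-prime group, the tiny Paillier key, and the party data are stand-ins for illustration, and the parameter sizes are wildly insecure. A real deployment would use an elliptic-curve group and full-size keys.

```python
# Toy sketch of a DDH-style intersection-sum protocol (demo only, insecure
# parameters). Party A has the set of people who saw the ads; Party B has
# (customer, amount-spent) pairs. A learns the intersection size, B learns
# the sum of spending over the intersection.
import hashlib
import random
from math import gcd

P = 2245319  # safe prime (P = 2*1122659 + 1); a real system uses a curve
Q = 1122659  # order of the subgroup of squares mod P

def hash_to_group(item: str) -> int:
    """Hash an identifier into the order-Q subgroup (toy hash-to-group)."""
    h = int.from_bytes(hashlib.sha256(item.encode()).digest(), "big")
    return pow(h % P, 2, P)  # squaring lands in the subgroup of squares

# --- minimal Paillier, additively homomorphic (toy key size) ---
p_, q_ = 104729, 104723
N, N2 = p_ * q_, (p_ * q_) ** 2
LAM = (p_ - 1) * (q_ - 1) // gcd(p_ - 1, q_ - 1)
MU = pow(LAM, -1, N)

def enc(m: int) -> int:
    r = random.randrange(1, N)
    return (1 + m * N) * pow(r, N, N2) % N2  # (1+N)^m * r^N mod N^2

def dec(c: int) -> int:
    return (pow(c, LAM, N2) - 1) // N * MU % N

def hom_add(c1: int, c2: int) -> int:
    return c1 * c2 % N2  # multiplying ciphertexts adds plaintexts

# Private inputs and each party's DDH secret exponent.
a_ids = ["alice", "bob", "carol", "dave"]
b_data = [("bob", 120), ("carol", 75), ("zeke", 999)]
alpha, beta = random.randrange(2, Q), random.randrange(2, Q)

# Round 1: A sends {H(a)^alpha}, shuffled.
a_masked = [pow(hash_to_group(x), alpha, P) for x in a_ids]
random.shuffle(a_masked)

# Round 2: B re-masks A's items to H(a)^(alpha*beta), and sends its own
# items as (H(b)^beta, Enc(value)) pairs, everything shuffled.
a_double = {pow(t, beta, P) for t in a_masked}
b_pairs = [(pow(hash_to_group(x), beta, P), enc(v)) for x, v in b_data]
random.shuffle(b_pairs)

# Round 3: A raises B's group elements to alpha, matches them against the
# double-masked set, and homomorphically sums the matching ciphertexts.
total, matches = enc(0), 0
for t, c in b_pairs:
    if pow(t, alpha, P) in a_double:
        total, matches = hom_add(total, c), matches + 1

print("intersection size:", matches)     # A learns this -> 2
print("intersection sum :", dec(total))  # B decrypts this -> 120 + 75 = 195
```

Note that each party only ever sees the other's identifiers masked under both secret exponents, and the values stay encrypted under B's Paillier key the whole way through.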
The lawyers are interesting, because you can show them a proof and, you know, two plus two equals four is a negotiable statement here, so it's interesting talking to them. I find that managers usually trust your expertise, although I'm told that before I joined my team that wasn't so true; not because I joined, I think that changed long before I came along. Software engineers, I think, are the worst, because they already assume this is impossible. So you first have to go and undo everything they've learned about crypto and then rewrite all of it.

So how about a quick comparison? Why don't we just use garbled circuits? A few years ago there was some work from a fellow grad student of mine at UVA on garbled circuits for set intersection: they can be just as fast as custom protocols, or even better. So why not? That work assumes we have a bit-vector representation, so we only need two AES blocks per bit, and we need some adders, which is one AND gate, so one AES block, per bit that we add. Let's say we're doing 16 bits. And with our protocol, it looks like it's all about the same, so at best we're equal.

Unfortunately, all of those assumptions are wrong. We don't actually have bit-vector representations, and 16 bits is a bit small; tractors are actually quite expensive. Also, there's a subtle difference hiding in here. Our protocol only has to send an element of the curve for everything in the set. But in the garbled-circuit approach, if you're using this bit-vector representation, you have to send a bit for everything in the universe. And as it turns out, our sets are usually quite sparse in the universe of possible values, so garbled circuits are going to perform terribly. And if you try to do this without bit vectors, then you have to do the set intersection online in the circuit, and then your costs really start to look bad. So garbled circuits are actually not competitive, at least in this setting.

A few final notes. There's a very interesting problem here of finding common identifiers for our user data and some third party's data. We do have an audit mechanism, a way to check post facto that there was no cheating; in this world, that's pretty useful. There are also arbitrary, lawyer-imposed requirements: things that have no relevance to computing the result correctly and no relevance to real security, but are required for reasons we don't fully comprehend. For future work: malicious-model security that is not substantially more expensive would be a great thing for us to have. Non-malleable security may be useful for certain applications that may come later. And more functionality than just addition, so that cumulative function on the intersection may wind up finding some use for garbled circuits after all. It's unclear at the moment whether that would be necessary, but it could still happen, so nothing's off the table.

So how about our consumer applications? We have phones. You may have heard that machine learning is this big new thing that's going to revolutionize the world. So let's take as a particular example the keyboard model on your phone. Your phone does predictive input; you can train a local model, and you may have noticed your phone tries to do things differently as time goes on. We could actually do better if we were to take all of our local models and merge them. This is what we've been told by the machine learning guys, and in this case, they've told us that merging is a linear combination, so that's good.
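To pin down what that merge step means, here's a minimal sketch, assuming the simple weighted-average form of linear combination; the model dimensions and weights are made up for illustration.

```python
# Toy sketch of merging local keyboard models as a linear combination:
# a weighted average of each phone's weight vector, weighted here by how
# much local training data each phone had (illustrative numbers only).
def merge_models(local_models, weights):
    """Weighted average of equal-length model weight vectors."""
    total = sum(weights)
    dim = len(local_models[0])
    return [sum(w * m[i] for m, w in zip(local_models, weights)) / total
            for i in range(dim)]

# Three phones' local models (toy 4-dimensional weight vectors).
models = [[0.1, 0.9, 0.0, 0.4],
          [0.2, 0.8, 0.1, 0.3],
          [0.0, 1.0, 0.2, 0.5]]
print(merge_models(models, weights=[10, 30, 60]))
```

Because the merge is linear in the inputs, the server only ever needs the sum of (weighted) vectors, never any individual vector, which is exactly what makes the aggregation protocol below possible.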
So we could, without caring about privacy, just do that: upload all of these vectors to Google, add them up or do whatever the linear combination is, and then we're done. Of course, the downside is that your local model reveals what you've typed, so you'd be sending us a transcript of everything you've ever typed, at least over some period, and we don't want that data from you and you don't want to give it to us.

So, differential privacy, yes. Although it's not terribly easy to apply here. If you add enough noise to the local models to achieve the differential privacy properties, they will be a lot less useful, since there's only one element in each local data set and you've added enough noise to protect that one element's privacy. So we want the outputs to be differentially private after we've aggregated.

So, it's an MPC application. We have a linear combination; in this case, it's really a weighted average. We have a very large number of parties, but since it's linear, why not a linear secret-sharing scheme? Something like that: we will mask all of our vectors in such a way that if you add them, all of the masks cancel. If I'm one party, I agree with every other party on a common mask; one of us will add it, one of us will subtract it, so the sum of our vectors cancels it out. This is the basic idea of our protocol, and it has actually been studied in the past; this part is not novel.

Now, sharing that whole vector would be an awful lot of communication; it's going to grow essentially quadratically. We want something less costly, and as cryptographers we all know we can use a PRG. So instead of a one-time pad with these random masks, we'll just use a stream cipher. Now we only need a key exchange to agree on common keys with each other, and that is significantly less expensive.

We also have to handle device failures; phones do just stop participating in protocols all the time. So we need to make sure we can still take your mask back off. Easy enough, right? We send out shares of our secret keys. Unfortunately, now if a "failing" party was actually not failing but just very slow, it will eventually send its message to the server, and now the server can see its private input. So, by being slow, you break your own security. So we'll just add another stream cipher on top of that, and have two keys that we send shares for. The idea is that at the end, when the server says it stopped receiving messages from some parties, we reveal one key for those parties, and for all the parties that completed, we reveal the other key. So you either see a late party's encrypted input or you get to add it to the output, but never both.

Differential privacy: there are two ways we could do this. One would be to just have the server add the noise, but that allows the server to see the non-differentially-private output, which is not great; although if it happens to be the same company that ships the software for your phone and is doing a ton of stuff to keep your data secure, maybe it's okay. Another thing we could do is add noise locally, and we've been told by experts that we can actually add a whole lot less noise locally, because everything gets added up, so we'll get the right amount of noise at the end, and that's good. An open question for anyone who is interested in trying to solve a problem: can we do the distributed noise generation efficiently enough to have, say, thousands of parties do it?
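Putting the masking pieces together, here's a minimal sketch of the pairwise-mask idea with a PRG. It leaves out the key exchange, the double mask, and the secret sharing for dropouts; the seeds below stand in for keys that would really come from pairwise key agreement, and SHA-256 in counter mode stands in for a proper stream cipher.

```python
# Toy sketch of pairwise-mask secure aggregation: each pair of parties
# shares a PRG seed; the lower-numbered party adds the expanded mask and
# the higher-numbered one subtracts it, so all masks cancel in the sum.
import hashlib

VECTOR_LEN = 8
MODULUS = 2 ** 16  # model updates assumed pre-quantized to integers

def prg(seed: bytes, n: int) -> list:
    """Expand a seed into n mask values (SHA-256 in counter mode)."""
    out, counter = [], 0
    while len(out) < n:
        block = hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        out.extend(block[i] | (block[i + 1] << 8) for i in range(0, 32, 2))
        counter += 1
    return out[:n]

def masked_upload(my_id, my_vector, pairwise_seeds):
    """Mask my vector with one PRG stream per peer (seeds from key exchange)."""
    y = list(my_vector)
    for other_id, seed in pairwise_seeds.items():
        sign = 1 if my_id < other_id else -1
        y = [(a + sign * m) % MODULUS
             for a, m in zip(y, prg(seed, VECTOR_LEN))]
    return y

# Three parties with toy model-update vectors and shared pairwise seeds.
vectors = {1: [1] * 8, 2: [2] * 8, 3: [3] * 8}
seeds = {(1, 2): b"seed12", (1, 3): b"seed13", (2, 3): b"seed23"}

uploads = [
    masked_upload(i, vectors[i],
                  {j: seeds[tuple(sorted((i, j)))] for j in vectors if j != i})
    for i in vectors
]
# The server just sums the masked uploads; every pairwise mask cancels.
print([sum(col) % MODULUS for col in zip(*uploads)])  # -> [6, 6, ..., 6]
```

Each individual upload looks uniformly random to the server; only the sum is meaningful, which is why the dropout-recovery machinery above is needed in the real protocol.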
And in this case, you must be secure against malicious parties, because consumer devices are just not as accountable. And that is that, so I will open the floor to questions.

Hi, so I'm glad to see that Google is considering privacy and using MPC with consumer devices; that's great. I guess my question is, what would you like to see from the academic community to help out with this? In addition to these generic protocols that people are doing, there are frameworks coming out that are supposed to make things easier on software engineers, so what would you like to see from the academic community?

Do you mean what would I, as someone who's working on this at Google, want? Or do you mean what would I like to see for the world at large? Both. What I would like is a generic protocol that is significantly less bandwidth-intensive. So think FHE, except without the enormous constant factor: either make that constant factor way smaller, or find a garbled-circuits protocol that has sublinear communication but not a huge constant. For the world at large, I think the frameworks are great. I think that once we have generic protocols that are practical enough for more problems, that will lead to a sort of MPC renaissance, because I think there are a lot of problems out there that people need to solve, and I think people are aware of that. It's just that the costs so far seem to outweigh the benefits for a lot of people.

You want to alternate? I actually have two questions. The first one is, how long have these systems been in deployment, both of them? And also I was wondering if you could comment on your experience debugging these things when they're running, because you're dealing with private data; you can't just open everything up. So I was curious about that.

To answer the second question: debugging is hard. As for deployment, the phone one will soon be deployed, and I don't know how much I can say about how long the other one has been in production, but I can say it has been deployed. That's great, thank you.

In your example of merging predictive input models, what are you doing to prevent consequences similar to the AOL search corpus disaster of 2006, where you have this corpus and, even if nothing in there is immediately attributable, there are things that are obviously passwords, and somebody could just go brute-force to find out what username that password belongs to?

Right, so in this case, the differential privacy property is the answer to that. Imagine, without worrying about how you got it, that you had the merged model already. If you add the right amount of noise, sampled according to the right distribution, you'll have the property that no small subset of the inputs matters. That is, you could have excluded a small subset of the inputs and the output distribution would remain the same, or at least indistinguishable. That's the high-level summary of the differential privacy property. So it's a way to say you can learn aggregates but not specifics.

Okay. With the phone example where you're merging these models, it's true that you only care about sort of high-probability contexts. So if this is some sort of Markov chain model, you don't really care about rare events across the whole sample; you're interested in computing aggregates. Is differential privacy the right notion, or do you think there should be research into other notions, like, okay, I want to get the maxima of the sums of the sets;
I don't want to see anything below a certain threshold.

So I think the differential privacy property is the best one we have right now. I would love to see research on new ways to think about that. As I said, theoretical results have a lot of value for that very reason. So if you have ideas, tell us, publish them, tell the world, please.

Henry? Thanks for the talk, that was really interesting. I was curious about the case where you're merging these machine learning models: what do you do if a client just sends you garbage? Because it's secret-shared garbage, you don't know that it's garbage, but when you add up the garbage with everything else, you're going to get a garbage output. How do you prevent that, or how do you handle it?

Two answers. One, it's still kind of an open problem, and two, we can usually tell when the output is garbage. A good model is distinguishable from a random model. So if the best we can do is force cheating clients like this to give us random outputs, that's good enough. Thanks.

I'm going to ask one question. You mentioned there may be more interesting functions on the intersection. Do you have examples of functions that would be interesting? No, but with Paillier you can do affine functions, right? So if we ever had a case where we needed something that is not affine, and there are a lot of functions that are not affine, then that would be an example. No specific example that I can give here now.

Is there a third speaker? That's what I was asking. We can keep going with the questions. Is it about the question? Yeah. So for this, you're taking this average or sum over a subset of the players, and if you do this multiple times and you take different subsets, that can leak more information.

Yes, it can. But I think again, this becomes more of a differential privacy question. You can view that in a differentially private world where we say we'll just add more noise. By taking a large enough sample, you can get away with adding more noise than would be needed for just one round of the application. And I think I'm being told that we are done. We do have a third speaker. Sorry, you can come chat with me in the back if you have more questions.