The authors are Elette Boyle, Kai-Min Chung, and Rafael Pass, and Elette will begin the talk. All right, thanks. So yes, this is joint work together with Kai-Min and Rafael on large-scale secure computation. As you may have guessed, the setting we'll be talking about is secure multi-party computation. So we have some collection of parties, each with secret inputs, who want to evaluate some global function on these inputs without revealing any further information about them. For this talk, what we're going to be talking about is the setting of large-scale MPC: lots of parties and lots of data. To give a motivating example, suppose we could take all the hospitals across the country or the world and compute some global, say, statistical information about patients' genomic data or treatment histories. This sort of thing seems to have a large potential for benefit, and it's something you can't do without some layer of security; for example, you can't just publish this kind of information. And just to hit the point home: in this setting we have lots of hospitals. I actually looked up statistics online, and apparently there are about 5,500 hospitals just in the US. And the inputs themselves will be lots of information, say, entire genomic sequences. So what sorts of challenges do we face that are different at large scale? Going back to these two points: first, lots of data. Once we have lots of data, the types of programs we want to evaluate become quite different: big-data sorts of computations, things that are very lightweight, that maybe look at only small portions of the data. And with lots of parties, we need protocols that scale reasonably as the number of parties grows, which I'll always denote by N. Let me go into a little more detail.
Starting way back in the beginning ages of multi-party computation, protocols have typically followed the following framework. You have some function F you want to evaluate on the inputs. The first step is to convert this to some sort of circuit form. Then from the circuit representation you can compile generically into some kind of MPC protocol for evaluating it, and of course optimizing this second step can yield a protocol that's potentially much more efficient. So what happens when we start with some lightweight program, say a binary search, and convert it to a circuit? Generically, in this transformation, the size of the circuit can blow up to as big as the size of the whole data. So it seems that if we're going to hope for things like these giant medical MPCs, we're going to have to take a step away from the circuit representation. And in fact there's already been a move in the cryptographic community to start modeling programs as RAM machines, or RAM programs. Here, think of a CPU with small local storage that can make random accesses to a potentially large memory, where the cost of each memory access is just one; you don't have to do something like scan the entire database. Pushing a little further, in this work we're also going to look at parallel RAM machines. (I'm being a bit playful here with the Windows 98 imagery.) In a parallel RAM machine it's the same deal, except you have potentially quite a few of these CPUs, all computing locally and accessing the shared memory in parallel. One thing I want to point out is that this parallel RAM model is quite expressive; for example, it captures models like MapReduce, which are used a lot today in order to exploit parallelism and random data access.
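To make the PRAM model concrete, here is a minimal, purely illustrative sketch (simulated sequentially; the function name and parameters are my own, not from the talk): p CPUs each summarize their own slice of a large shared array, then combine partial results in a logarithmic number of parallel rounds.

```python
def pram_sum(shared, num_cpus):
    """Toy PRAM-style tree reduction: each round, half the active CPUs
    add a neighbor's partial sum into their own cell, so the parallel
    time is O(log p) even though this simulation runs sequentially."""
    n = len(shared)
    chunk = (n + num_cpus - 1) // num_cpus
    # Round 0: each CPU sums its local slice (these run "in parallel").
    partial = [sum(shared[i * chunk:(i + 1) * chunk]) for i in range(num_cpus)]
    stride, rounds = 1, 0
    while stride < num_cpus:
        for cpu in range(0, num_cpus, 2 * stride):   # one parallel round
            if cpu + stride < num_cpus:
                partial[cpu] += partial[cpu + stride]
        stride *= 2
        rounds += 1
    return partial[0], rounds

total, rounds = pram_sum(list(range(1024)), 16)
# 16 CPUs need only log2(16) = 4 combining rounds for 1024 entries.
```

The same shape, local work followed by a short combining phase, is exactly what MapReduce-style frameworks exploit.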
So, for example, in this step in the middle you can have quite a few of these mappers and reducers acting in parallel. Now, what do we have in terms of multi-party computation? By and large, almost all constructions are in the circuit model, all the way from the original protocols to scalable MPC, which I'll talk a little more about later. All of these optimizations still require you to first convert the program to a circuit. There's been, comparatively, a sprinkling of works that have looked at the RAM model of computation, almost entirely in the two-party setting. And it seems that these techniques, when extended to multi-party computation, don't scale appropriately with N. What do I mean by that? As just one example, every single party in these protocols has to maintain memory comparable to the size of everybody's input. And of course, we're actually also the first to look at the parallel model. Okay, so let me go a little further into detail about the scaling with N. As a baseline, let's think about the original MPC protocols. Their complexity, as the number of parties grows, scales roughly as N squared times the size of the program. For example, in things like GMW, for every single gate of the circuit everybody has to communicate with everybody else, so we get this N squared term. Then there's a really nice line of work, beginning with Damgård and Ishai, which was actually mentioned in the previous talk as well: that of scalable MPC. They look at the setting where you have lots of parties, though of course in the circuit model.
And what they achieve is bringing this down to something like a relatively small polynomial in N, plus the size of the program times only a polylogarithmic overhead in the number of parties. So whenever I say something like scalable MPC, this is what I'm referring to. There are two additional properties that are potentially nice. First, load balancing. When the number of parties grows, the difference between the total amount of work and each party's fair share of the work can become quite drastic. So it makes sense to design protocols that have the option of load balancing, not just in computation and communication but also in the memory requirements of each party. In addition, if you think about all these people running some sort of protocol, the locality of communication might become important. What do I mean by this? If you look at classic protocols, it's sort of standard that everybody ends up speaking to everybody else. But now, again, as the number of parties grows, it makes sense to consider not just how much information you're sending, but also how many people you're sending it to. For example, this small graph here can become something a little nasty with lots of parties running a protocol over the internet. So for communication locality we ask: is it possible to design a protocol where the locality, which corresponds to each party's degree in the communication network at the end of the protocol, stays small? In other words, how many people do I have to talk to during the course of the protocol? Trivially, locality N means you talk to everybody else, and we could hope to achieve something like polylog N. So here we are: in this work, we provide a protocol for scalable multi-party computation for RAM programs.
But you saw the fancy title with all sorts of business in it. What we actually do is even more: we achieve load-balanced scalable multi-party computation for parallel RAM programs with polylogarithmic communication locality. To be a bit more explicit, we consider what I personally think is maybe one of the cleanest points in the giant multi-dimensional multi-party computation space: unconditional security with an honest majority. Our particular protocol requires slightly more than two-thirds honest parties, with static corruptions, and of course assuming secure channels. So let's get started. I want to start by giving you a roadmap of our construction and its intermediate steps. Somebody comes to you and gives you this, say, parallel RAM program Pi. The first thing we do is compile it to what's known as an oblivious program. Here, this means that if you look at the sequence of memory access patterns that take place as you actually execute the program on the data, this distribution doesn't reveal anything about the inputs. An example of something that's not oblivious would be binary search, because whether you move left or right tells you something about the information in the database. The notion of oblivious RAM was first conceived by Goldreich and Ostrovsky back in, say, '96 (that's the journal version), and it's become a tremendously interesting area of cryptography in its own right. I'm not going to talk about it today, but there's been a lot of work on coming up with nicer, cleaner, better solutions. I'm also citing here a sister work to the present one, with the same authors, where we extend this to the parallel RAM setting. All you need to know for this talk is that such compilers exist.
They only require polylogarithmic overhead, and they don't require any computational assumptions; everything is statistical here. Good. So from now on, forget about your original program, or just assume without loss of generality that it computes the same functionality but has this nice property that the memory access patterns don't reveal anything about the secrets. Our next step is a partial step, in two different ways. We're going to take this program and convert it to a protocol, but there are two reasons why this is not the final protocol. First, it will be scalable in the complexity sense I mentioned before, and it will be balanced in terms of the memory required by each party, but it won't be load balanced in computation or communication, and it won't be communication local. Even more importantly, this is going to be a weak step, this gray region, where we only achieve a very weak form of security. We're used to things like starting with some functionality, getting maybe a semi-honest secure protocol, and then compiling to malicious. Well, we're going to consider a step that's much weaker even than semi-honest. Here we think of the parties, which I'll sometimes refer to as agents, as honest. This is good for us: it makes it much easier to design protocols. By honest, I mean that everybody follows the protocol, of course, but also that it's done obliviously: it's not that I'm trying to gather information; rather, the adversary itself is an external observer. The only things the adversary can see are things like communication patterns or, say, which parts of memory are accessed. We refer to these as external observations, and we call this notion oblivious security, partly because it's going to be quite related to oblivious programs.
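To make the earlier binary-search example concrete, here's a small sketch (my own illustration, not from the talk) showing why plain binary search is not oblivious: the sequence of probed addresses depends on the secret target, so an observer who sees only the access pattern learns where the target lies.

```python
def access_pattern(memory, target):
    """Return the list of addresses probed by binary search for target.
    The pattern itself, not just the answer, depends on the secret."""
    lo, hi = 0, len(memory) - 1
    pattern = []
    while lo <= hi:
        mid = (lo + hi) // 2
        pattern.append(mid)          # this probe is visible to an observer
        if memory[mid] == target:
            break
        elif memory[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return pattern

data = list(range(100))
# Different secret targets produce visibly different probe sequences:
assert access_pattern(data, 3) != access_pattern(data, 90)
```

An ORAM/OPRAM compiler fixes exactly this: after compilation, the access-pattern distribution is (statistically) independent of the inputs.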
From here, we take this weak, oblivious-secure protocol and show how to compile it to malicious security in the sense I promised you: again, one-third-minus-epsilon static corruptions. For this step it's going to be important that the compilation preserves the properties we've gotten, the scalability and the memory balancing. All right, so how do we do this? This step here can be thought of as an abstraction and formalization of techniques that have appeared implicitly in prior works. Say we start with a protocol that's secure if everybody's honest and the only things leaked are these communication patterns and such. We're going to emulate the honest parties by electing small committees, one committee for each party in the original protocol. As long as we can elect these committees in a way that each of them has an honest majority, we can run small-scale MPCs among the committees in order to emulate the honest parties, so that, for example, whatever secrets were held by an honest party are now secret-shared among its committee. I'm not going to focus so much on that one. This one here is a new transformation, and we'll see more about it later; to be continued. Just to mention, there are works on multi-party, and in particular two-party, computation in the RAM model that go straight from here down to here, in slightly different ways, without the scalability and memory balancing. Good, so we're kind of there, but I haven't yet given you everything I promised. The next thing we want to do is go all the way: I want load balancing not just in memory but also in communication and computation, and in addition I want communication locality.
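To illustrate the committee idea, here's a minimal sketch of how a virtual party's secret state can be secret-shared among a small committee, using standard Shamir sharing as one natural instantiation (the field choice, names, and parameters here are illustrative assumptions, not the paper's actual construction): any honest majority of the committee can reconstruct, so an honest-majority committee can safely emulate the virtual party.

```python
import random

P = 2**31 - 1  # a prime modulus, chosen only for illustration

def share(secret, committee_size, threshold):
    """Shamir-share `secret` among a committee: evaluations of a random
    degree-(threshold-1) polynomial whose constant term is the secret."""
    coeffs = [secret] + [random.randrange(P) for _ in range(threshold - 1)]
    def f(x):
        return sum(c * pow(x, i, P) for i, c in enumerate(coeffs)) % P
    return [(x, f(x)) for x in range(1, committee_size + 1)]

def reconstruct(shares):
    """Lagrange interpolation at x = 0 recovers the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % P
                den = den * (xi - xj) % P
        secret = (secret + yi * num * pow(den, P - 2, P)) % P
    return secret

shares = share(42, committee_size=5, threshold=3)
assert reconstruct(shares[:3]) == 42   # any 3 of the 5 members suffice
```

Fewer than `threshold` shares reveal nothing about the secret, which is what lets a committee with an honest majority stand in for a single honest party.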
We'll show how to do this in the layer of oblivious security, where it becomes much cleaner, and then it will turn out that the same transformation can take us all the way to malicious security; that will be the final part. So, to recap: the first step we handle with ORAM or OPRAM compilation. Down here, these two arrows are the abstraction with committee election and then emulation of honest parties via small-scale MPCs. The two things I want to talk about today are the two remaining arrows, so let's get started with the first one. Keep in mind, we're starting with an oblivious parallel RAM program, and we want to get from that a protocol that's secure in this oblivious sense and that's scalable and memory balanced. The main idea for this step is to assign each job, where by job I mean either one of these CPUs or some block of memory, to an honest agent. So we have these honest parties, or honest agents, and I say: you get this CPU, you get that CPU, and so on. Then, to emulate the execution of the original program, we run a protocol in which each person emulating a CPU does the corresponding local computation, and whenever they're supposed to make a memory access, they communicate with the person playing the job of that memory location. As computation goes on, we end up communicating in different patterns. So what does this give us? The communication patterns of this protocol correspond exactly to the memory access patterns of the original program, because the only time anybody talks to anybody else is when a CPU wants to access some piece of memory. But we started with an oblivious program, which guarantees exactly that the access patterns don't leak anything about the inputs. So we immediately get oblivious security.
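The job-assignment step above can be sketched as follows; this is a toy simulation under my own assumptions (class names, block size, and message counting are all illustrative), showing how a memory access becomes a single message to whichever agent holds that block, so the protocol's communication trace equals the program's access trace.

```python
BLOCK = 128  # memory words per job; polylog-sized in the real protocol

class Agent:
    """An honest agent emulating one or more jobs (a CPU or memory block)."""
    def __init__(self, name):
        self.name = name
        self.state = {}      # the small state of the jobs this agent holds
        self.messages = 0    # how many protocol messages it has handled

def emulate_access(cpu_agent, assignment, addr):
    """The CPU's agent asks whoever holds block addr // BLOCK for the word;
    one memory access of the program costs exactly one message exchange."""
    holder = assignment[addr // BLOCK]
    cpu_agent.messages += 1
    holder.messages += 1
    return holder.state.get(addr, 0)

agents = [Agent(f"agent{i}") for i in range(8)]
assignment = {b: agents[b % 8] for b in range(64)}   # 64 memory jobs
cpu = Agent("cpu0")
emulate_access(cpu, assignment, addr=5000)
# Who talked to whom mirrors which block was accessed, so obliviousness
# of the program's access pattern carries over to the protocol.
```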
How about the complexity of this protocol? Well, there's the polylog overhead from the ORAM compilation. And note that this protocol is memory balanced. Why? Because each of these jobs holds only a very small amount of information, polylogarithmic in N. If every agent holds just one job, then of course we're balanced; if you have to double up, you assign jobs evenly, so nobody is too far off from anyone else. So for memory, we're good. But note that we're not load balanced in terms of computation and communication. Indeed, the agents playing the CPU roles are doing all the computation, while the people on the memory side only occasionally get contacted. Further, it's not communication-local, because as time goes on the CPUs will end up talking to large portions of the other parties emulating the memory roles. This leads me to the second arrow within the oblivious-secure region: how can we take this and turn it into something that is load balanced and communication local? So here's my token slide where I cram in all the load balancing and locality. Welcome to the final session of the conference; okay, I've got three minutes. We'll do this and then a recap. Okay, load balancing. What we're going to do is: once somebody has performed too much work for a job, they just pass it to somebody else. Think of these parties or agents as workers, and jobs being passed among workers. It's important for this particular step that each job has a small state, so that when I send it to you, sending the state is not too much overhead. But the challenge is: how do I maintain the contact information? You're all working, I can tell you're working hard, and you're passing jobs amongst each other. Now say I'm playing a CPU and I need to contact a memory location somewhere, but I have no idea who's playing that role anymore.
So what we do, instead of having everybody potentially need to contact any other job, is to think of the jobs as being situated on some low-degree graph, in particular a Boolean hypercube for simplicity. Now say I'm the CPU and I'm trying to get to whoever's in charge of this piece of memory. I don't know who that is right now, but I do know who my neighbors are in the graph, and whenever I want to send a message, I route it through the graph. And this is exactly the property we want: since I never need to talk to anybody except my neighbors, whenever one of my neighbors swaps out they have to tell me, but if somebody swaps out on the other side of the graph, I don't care; I never need to be contacted. For communication locality (you can see I'm shrinking in space here, so I'm getting more and more technical), we use a similar flavor of technique. Actually, I should emphasize: why is the above not local? Because it's local in terms of jobs, but I want it to be local in terms of parties, and the parties are actually swapping out, so over time I'll be on this node, then playing this job over here, then that job over there, and so on. Over time I'll end up talking to a lot of different people. So to get locality, we put another layer on top of this and route again, now between the workers instead of the jobs. This ends up being significantly more complex to analyze. For example, you need to make sure we're not breaking load balancing: whereas before, the amount of work I did corresponded to when I held a job, now I can incur work even when I have no jobs assigned to me, because I need to route messages for other people. But with some trickier distributional analysis, it works out; QED.
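Here's a small sketch of hypercube routing in the spirit of Valiant and Brebner (the function names are mine; the real protocol's routing is more involved): to avoid hot spots, each packet first travels to a random intermediate node and then to its destination, fixing one differing address bit per hop, so every hop is between hypercube neighbors and each node only ever talks to its log N neighbors.

```python
import random

def bit_fixing_path(src, dst, dim):
    """Walk from src to dst on the dim-dimensional Boolean hypercube,
    correcting differing bits from lowest to highest; each hop flips
    exactly one bit, i.e. moves to a hypercube neighbor."""
    path, cur = [src], src
    for i in range(dim):
        if (cur ^ dst) & (1 << i):
            cur ^= 1 << i
            path.append(cur)
    return path

def valiant_route(src, dst, dim):
    """Valiant-style two-phase routing: random intermediate, then dst.
    The random detour spreads load, giving low congestion w.h.p."""
    mid = random.randrange(1 << dim)
    return bit_fixing_path(src, mid, dim) + bit_fixing_path(mid, dst, dim)[1:]

dim = 10                             # a 1024-node hypercube
path = valiant_route(3, 900, dim)
# Consecutive nodes on the path always differ in exactly one bit,
# and the whole route takes at most 2 * dim hops.
```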
And the core takeaway message of the previous slide, if you weren't following all that, which is perfectly acceptable, is that there's some really cool load-balanced, low-congestion routing on the hypercube. We use this a lot in the protocol; it's an algorithm that goes back to Valiant and Brebner in 1981, and we use it in two different ways, for load balancing and for locality, but with a similar flavor. Okay, so to sum up: I think this is a really interesting area, and there are a lot of questions still open. Again, we have one point in this huge multi-dimensional space of multi-party computation. You can consider what happens if you take it to the computational setting: can we get better communication or computation by leveraging computational assumptions? What about specific targeted protocols? What about adaptive security? For example, we heavily use these committee structures, which are not amenable to adaptive security. And on the other side, I think it's very interesting to try to understand the limitations: what is the best that we can...