All right. Welcome. Our first speaker today is Matt Chang. He's talking about hacking on multi-party computation. It's okay to take pictures of the speaker for this talk. Give it up for Matt Chang.

All right, thank you. So I want to talk a little bit about my experience with multi-party computation. It's something I first encountered about six years ago, and I thought it was a really cool topic I wanted to share with you. So, what is multi-party computation? The idea is that you might have multiple parties who each hold a private input to a function, and they want to jointly compute that function without revealing their private inputs to each other. When I first heard this, it sounded like magic. But of course it's just math, which might be just as good as magic. A little bit of history: one of the first problems posed in the area of multi-party computation is called the millionaires' problem. Nowadays it would probably be the billionaires' problem, because being a millionaire is no big deal now, right? The idea is that you have two millionaires who want to compare their net worth, but they don't want to reveal what that net worth is. So how can they determine who is richer without disclosing the actual numbers? The answer that was first developed was to use something called garbled circuits. That's a pretty fancy name, but basically you take a Boolean circuit and turn it into a kind of encrypted Boolean circuit, and we'll see a little bit about how that's done. But first, one of the key concepts needed to make this work is called oblivious transfer.
And the idea here, and this is one of the first things where I thought, well, this is magic: say I have two pieces of information, and you want to request one of them without telling me which one you requested. I want to send it to you, but without revealing the one you didn't request. That's what oblivious transfer does: it lets you request a piece of information and get back only what you requested, without telling me which one you asked for. The slide gives the more formal definition. Okay, so what is the whole garbled circuit protocol? First you have to take your function and describe it as a Boolean circuit. In the case of the millionaires' problem, you want to express the greater-than comparison as a Boolean circuit. And of course this is cryptography, so our two players in this protocol are Alice and Bob. Alice garbles, or encrypts, the circuit, and she sends this garbled circuit to Bob along with her encrypted inputs. Bob then gets his own encrypted inputs through oblivious transfer, evaluates the circuit, and communicates back with Alice to determine the output of the function. This can be done either by having only Alice learn the answer, or by sharing the answer between them. This allows evaluation of arbitrary Boolean circuits, but it requires a lot of communication. A different mechanism for doing this kind of multi-party computation is what's known as homomorphic encryption, and this is something that was observed early on with one of the first public key cryptosystems, RSA.
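To make that first step concrete, here's a plaintext sketch of my own (not from the slides) of the millionaires' greater-than test written with nothing but Boolean gates, which is the form a garbling scheme would then encrypt:

```python
# Toy sketch: the millionaires' comparison expressed using only
# Boolean gates. Purely illustrative plaintext logic; no actual
# garbling or oblivious transfer happens here.

def AND(a, b): return a & b
def XOR(a, b): return a ^ b
def NOT(a):    return a ^ 1

def greater_than(a_bits, b_bits):
    """Return 1 iff a > b; bits are given most-significant first."""
    gt = 0  # running "a > b decided so far"
    eq = 1  # running "all higher bits were equal"
    for a, b in zip(a_bits, b_bits):
        # a wins at this position only if all higher bits were equal
        gt = XOR(gt, AND(eq, AND(a, NOT(b))))
        eq = AND(eq, NOT(XOR(a, b)))
    return gt

def to_bits(x, n):
    return [(x >> i) & 1 for i in reversed(range(n))]

print(greater_than(to_bits(9, 4), to_bits(5, 4)))  # 1: 9 > 5
print(greater_than(to_bits(3, 4), to_bits(7, 4)))  # 0
```

In the actual protocol, Alice would replace each gate's truth table with encrypted entries and each wire value with a random label, and Bob would evaluate the encrypted tables without learning the wire values.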
It was observed that RSA exhibits a multiplicatively homomorphic property. Another encryption scheme, which isn't quite as widely used but was defined around the same time, is ElGamal. They both have this multiplicative property. To give a brief overview: an RSA public key is a pair of numbers N and e, and without getting into the details about e, N is the product of two large primes p and q. To encrypt a message, we view the message as a number between zero and N and raise m to the e-th power mod N. And it was observed that if you take two ciphertexts and multiply them together, then just because of the way exponents distribute over a product, that's the same thing as an encryption of the product. So you could give me two values encrypted with RSA, I can multiply them together and hand the result back to you, and that's an encryption of the product of the original numbers. Now, this is unpadded, textbook RSA, but it's an interesting property to observe. ElGamal is a different scheme that isn't as widely used, and I'm glossing over a bunch of details here, but the main point is that the public key specifies a prime number p, a generator g, and a public value y. Each time somebody encrypts a message, they generate a random number r, raise g to the r-th power mod p, and multiply the message m by y to the r-th power. Just like with RSA, this has a multiplicative property: if you multiply two ciphertexts component-wise, the exponents in the first component add, so it's just g raised to another random number.
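A quick sketch of that multiplicative property with textbook RSA. These are toy, insecure parameters of my own choosing, and real RSA is padded and never used this way:

```python
# Textbook (unpadded) RSA with toy parameters -- completely insecure,
# shown only to demonstrate the multiplicative homomorphism.
p, q = 61, 53
N = p * q            # public modulus
e = 17               # public exponent (key generation details omitted)

def enc(m):
    return pow(m, e, N)

m1, m2 = 7, 12
# Enc(m1) * Enc(m2) = (m1^e)(m2^e) = (m1*m2)^e = Enc(m1*m2), all mod N
print(enc(m1) * enc(m2) % N == enc(m1 * m2 % N))  # True
```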
And in the second component you have m1 and m2 multiplied together, times y raised to that same combined random exponent that g is raised to in the first component. So it's again multiplicatively homomorphic. But something that's perhaps a little more useful, and at least something I had actually used before, is additively homomorphic encryption. The first one is additive ElGamal: we can modify the ElGamal scheme we just saw to produce an additively homomorphic scheme. There's also Paillier, which actually works out a little better, but we'll start with additive ElGamal. This just requires a slight tweak to the original scheme: instead of multiplying by the message itself, we raise the generator g to the m-th power. Then, when we multiply ciphertexts component-wise, the addition happens in the exponent. There is a bit of a problem with this scheme, though. To decrypt, you do the same process as normal: raise the first component to the x-th power, where x is the private key. That gives you y to the r1 plus r2, you compute its inverse, do the multiplication, and you're left with g to the m1 plus m2. And recovering m1 plus m2 from that is not easy, because in fact the entire security of this scheme rests on it being hard to determine the exponent given just the group element. That's what's referred to as the discrete log problem. But there are still useful cases for this. Paillier, on the other hand, has a much longer key generation procedure. It uses an RSA-style modulus, and there are a bunch of technical details on the slide for your sake, but I don't think I need to walk through all of them.
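Here's a minimal sketch of additive (exponential) ElGamal, with toy parameters I picked for illustration. Note the brute-force loop in decryption: that loop is exactly the small discrete log the talk is describing, so this only works when messages are small:

```python
import random

# Toy additive ElGamal over a small prime field. Parameters are
# tiny and insecure; the point is the additive property and why
# decryption requires solving a (small) discrete log.
p = 467                          # toy prime modulus
g = 2
x = random.randrange(1, p - 1)   # private key
y = pow(g, x, p)                 # public key

def enc(m):
    r = random.randrange(1, p - 1)
    # message goes in the exponent: (g^r, g^m * y^r)
    return (pow(g, r, p), pow(g, m, p) * pow(y, r, p) % p)

def dec(c):
    a, b = c
    gm = b * pow(a, p - 1 - x, p) % p   # strip off y^r, leaving g^m
    # Recovering m from g^m is a discrete log; feasible here only
    # because m is small enough to search exhaustively.
    for m in range(p):
        if pow(g, m, p) == gm:
            return m

c1, c2 = enc(3), enc(4)
c_sum = (c1[0] * c2[0] % p, c1[1] * c2[1] % p)  # component-wise multiply
print(dec(c_sum))  # 7
```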
Okay, so the upshot of all this: this is the encryption algorithm, and for decryption it has a nice property. Going back a slide, you can see that just like with additive ElGamal, the message goes in the exponent of this g value, so when you multiply ciphertexts together you get the additive property. But the structure of Paillier essentially lets you compute those discrete logs easily if you have the private key information. So far I've only talked about a multiplicatively homomorphic scheme and an additively homomorphic scheme, but what would be really nice is to be able to do both, that is, a fully homomorphic scheme that can compute both addition and multiplication on ciphertexts. It turns out this was first accomplished in 2009 by Craig Gentry. What he did was use a lattice-based encryption scheme that was somewhat homomorphic, which means you can do addition and multiplication up to a point; in particular, you can evaluate polynomials up to a certain degree. And it turns out that with his scheme, you can write the decryption circuit as a polynomial of small enough degree that you can homomorphically decrypt and then still do one more operation. You can imagine this as having two locked boxes, one inside the other, and being able to unlock the inner box without having to unlock the outer box. So that's how that works. Currently it's still pretty slow. I know there have been some improvements, but it's generally a pretty slow procedure.
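As a sketch of why Paillier works out better, here's a toy version (insecure parameters and simplified key generation, my own illustrative numbers). The key holder decrypts directly, with no discrete log search, yet multiplying ciphertexts still adds the plaintexts:

```python
import math, random

# Toy Paillier. Ciphertexts live mod n^2; completely insecure sizes.
p, q = 61, 53
n = p * q
n2 = n * n
g = n + 1                       # standard simple choice of generator
lam = math.lcm(p - 1, q - 1)    # private key

def enc(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:  # r must be a unit mod n
        r = random.randrange(1, n)
    return pow(g, m, n2) * pow(r, n, n2) % n2

def dec(c):
    # With g = n+1: c^lam mod n^2 = 1 + (m*lam mod n)*n, so the
    # "discrete log" falls out with one subtraction and division.
    u = pow(c, lam, n2)
    return (u - 1) // n * pow(lam, -1, n) % n

c1, c2 = enc(20), enc(22)
print(dec(c1 * c2 % n2))  # 42
```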
Now, there was actually a tool I've used in the past, although it was pretty tricky to get working: the Tool for Automating Secure Two-party Computations, or TASTY for short. Its input language is Python-like. You specify what the client and server operations are, and it compiles two separate binaries for you, a client and a server. It actually supported both additively homomorphic encryption and garbled circuits, so you could combine those two techniques in a single protocol. One example of how this might be useful comes from machine learning. Often, when you're doing something like classification, after you've trained your algorithm you have some vector that represents what your model has learned. That may be proprietary information you don't want to share. Correspondingly, somebody who wants to run their data through your classifier may not want to share that data either. So the client could encrypt the features that go into the algorithm and submit them, while the server keeps its model vector in plaintext. With an additively homomorphic scheme, you can compute the inner product of two vectors where one is encrypted and the other is plaintext, because multiplying an encrypted value by a plaintext number works exactly the way you'd think of multiplication from grade school: multiplication defined as repeated addition. You can do that same repeated addition with an additively homomorphic scheme.
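Here's roughly how that encrypted inner product looks, sketched with a toy Paillier-style additive scheme (insecure parameters of my own choosing; the feature and weight vectors are made up). Raising a ciphertext to a plaintext power is repeated homomorphic addition, i.e. scalar multiplication:

```python
import math, random

# Sketch: inner product <x, w> where x is the client's encrypted
# feature vector and w is the server's plaintext model weights.
p, q = 61, 53
n, n2 = p * q, (p * q) ** 2
lam = math.lcm(p - 1, q - 1)

def enc(m):
    r = random.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = random.randrange(1, n)
    return pow(n + 1, m, n2) * pow(r, n, n2) % n2

def dec(c):
    return (pow(c, lam, n2) - 1) // n * pow(lam, -1, n) % n

x = [1, 0, 3]                 # client's private features
w = [2, 5, 4]                 # server's plaintext weights
cx = [enc(v) for v in x]      # what the client actually sends

acc = enc(0)
for c, wi in zip(cx, w):
    acc = acc * pow(c, wi, n2) % n2   # add wi copies homomorphically

print(dec(acc))  # 14 = 1*2 + 0*5 + 3*4
```

The server never sees x, and in a full protocol the server would return the (still encrypted) accumulator rather than decrypting it.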
So you could first do that to come up with a value, and then, as you often have to, compare whether it's greater or less than a particular threshold. You could then use a garbled circuit to compare the resulting value against what the algorithm wants. TASTY was difficult to use; I think I was only able to get it working on Gentoo, after a lot of painful work. But part of the point of this talk is to show that there are actually some tools out there where you can start playing with these kinds of algorithms and techniques. Now we're going to take a little detour from the hardcore computational aspects and talk a little bit about zero-knowledge proofs. I'm going to speed-run through a blog post by Matthew Green; I'm really cribbing this section from his post. He wrote an awesome blog, which I have listed on the references slide, that explains what zero knowledge is all about. The main idea behind it is that I want to convince somebody that I know some information, but without revealing what that information is, and we're going to go through the particular example from his post. There are three properties of zero knowledge. Completeness means that if I actually know the information I'm trying to prove, then I can convince you of it. Soundness means that I can only convince you that I know the answer if I really do know the answer. And zero knowledge is the part where you're not able to extract information about the solution from my proof. The example Matthew Green gave was a graph three-coloring problem. He picked Google: say some customer comes to Google and says, I want you to use all of your computational power to find a three-coloring of a graph.
This could be useful for cell phone providers, to make sure they're not overlapping frequencies at adjacent cell towers. But Google doesn't want to do all this work and then just give the answer away, and likewise the client doesn't want to pay unless they're sure Google actually solved the problem, right? So you have some graph, and you have to find a coloring using only three colors such that no two nodes connected by an edge share a color. Here's an example coloring of that graph. Again, this is a small graph, one that would be fairly easy to do by hand, but imagine you have a much larger one. Now, Google wants to hide all the information, so it's going to put hats on top of its graph. Imagine they have a big room with the graph laid out and colored underneath, but they cover every node with a hat. To prove they've solved the problem, they say to the customer: pick any two hats joined by an edge, and I'll reveal what colors they are. If the customer does this enough times, they will ultimately be convinced that Google really did find a three-coloring. So they pick up those two and see, okay, node seven is purple and node eight is blue. But the client thinks, well, you might have just gotten lucky that one time; I want you to prove it by letting me check more times. The problem is, if the coloring stayed the same, the client could learn most of the coloring and maybe even figure out the rest on their own. So what Google does is, each time the customer picks up two hats and puts them back down, Google randomly recolors the graph, still keeping it a valid three-coloring. That way, no information about which colors appeared on the first try helps somebody learn more about the three-coloring on the second try.
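The hat protocol can be sketched in code, with hash commitments standing in for the physical hats. This is my own toy rendering, with a made-up graph, not a production zero-knowledge proof:

```python
import hashlib, os, random

# Toy run of the hat protocol: the prover commits to a freshly
# permuted copy of its 3-coloring, the verifier opens one randomly
# chosen edge, and the two revealed colors must differ.
edges = [(0, 1), (1, 2), (2, 0), (1, 3)]
coloring = {0: 0, 1: 1, 2: 2, 3: 0}     # a valid 3-coloring

def commit(color):
    nonce = os.urandom(16)
    return hashlib.sha256(nonce + bytes([color])).hexdigest(), nonce

def round_of_protocol():
    # Prover: randomly permute the three colors, then commit to all nodes.
    perm = random.sample(range(3), 3)
    colors = {v: perm[c] for v, c in coloring.items()}
    comms = {v: commit(c) for v, c in colors.items()}
    # Verifier: pick an edge and ask for those two hats to be lifted.
    u, v = random.choice(edges)
    for node in (u, v):
        digest, nonce = comms[node]
        # The opened value must match the earlier commitment.
        assert hashlib.sha256(nonce + bytes([colors[node]])).hexdigest() == digest
    assert colors[u] != colors[v]        # edge endpoints differ

for _ in range(100):   # each honest pass raises the verifier's confidence
    round_of_protocol()
print("verifier convinced")
```

Because the colors are re-permuted every round, the two colors revealed in one round tell the verifier nothing about the colors in any other round.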
At least, that's what we hope for. I hope the completeness and soundness parts are fairly clear here. The interesting question is: how do we know this really isn't leaking any information about the three-coloring? That's where the blog post was really informative for me. It illustrates that the way to think about it is: how could you cheat this protocol? And Matthew Green's solution was time machines. Now, what does he mean by that? Well, suppose Google was cheating and just came up with a random coloring, and any time they got caught, they hop into a DeLorean, go back in time, and change the coloring. Since they now know which two hats are about to be picked, they just make sure those two colors are different. It turns out that this procedure would be indistinguishable, from the customer's point of view, from Google actually having done it correctly. The point being that if the customer could extract information about the three-coloring from the real protocol, they could have extracted it from the cheating protocol with the time machine too; but that coloring was garbage, so there was nothing to extract. That's the general idea of the proof: the customer isn't able to extract the coloring even when the protocol is run honestly. Now, here's another tool, developed by the folks at JHU, and Matthew Green in particular, which I haven't had a chance to play with yet, but I've been meaning to. It's called Charm, and it supports a number of crypto primitives, including, as it says here, some basic math objects like rings and fields, elliptic curves, zero-knowledge proofs, all that kind of good stuff. So, something to take a look at. So why did I go off on this zero-knowledge stuff? Well, there are two security models for these protocols.
There's honest-but-curious, and then the malicious model. Honest-but-curious is what you might think: I'm going to run through the protocol correctly, and then I'll try to learn what I can about what the other person put into it, but I'm not going to try to cheat. In the malicious model, I'm going to cheat as much as I can, and you have to try to catch me. That's where zero-knowledge proofs come in: they're used to detect cheating and allow a party to abort if cheating is detected. So I'm going to very briefly talk about a protocol that I helped do some implementation work on six years ago. It was a secure text pattern matching protocol called 5PM. I'll briefly illustrate the insecure version of it. The idea behind this protocol is that you set up what are called character delay vectors. For each letter, the vector records: if this letter is found at some position in the pattern, how many more steps would we have to go before reaching the end of the pattern? We put a one in that position. Those are the vectors over on the left. Then we just go through the text on the server side, and for each character we add its delay vector and shift down. We can replace this insecure version with a version that operates on encrypted data using an additively homomorphic encryption scheme: you can use Paillier, additive ElGamal, or elliptic-curve ElGamal. That's how the honest-but-curious version works. In the malicious model, we use a couple of extra things. One is called threshold encryption, which allows the two players to jointly come up with an encryption key that they will then use to encrypt both of their inputs.
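Here's a plaintext sketch of that add-and-shift pass as I understand it. This is my own reconstruction of the insecure version, not the actual 5PM code; in the real protocol these additions happen on ciphertexts under the additively homomorphic scheme:

```python
# Insecure (plaintext) sketch of the 5PM-style matching pass:
# per-character delay vectors are added into an activation vector
# that shifts one slot per text character; a full match shows up
# as a count of m in the last slot.

def find_pattern(text, pattern):
    m = len(pattern)
    # cdv[c][i] = 1 if pattern position i holds character c
    cdv = {c: [1 if p == c else 0 for p in pattern]
           for c in set(text) | set(pattern)}
    activation = [0] * m
    matches = []
    for j, c in enumerate(text):
        activation = [0] + activation[:-1]                    # shift down
        activation = [a + d for a, d in zip(activation, cdv[c])]  # add
        if activation[-1] == m:          # all m characters lined up
            matches.append(j - m + 1)    # record the match's start index
    return matches

print(find_pattern("abracadabra", "abra"))  # [0, 7]
```

On encrypted data the server can't see the counts, so instead of checking for m it returns the (encrypted, blinded) activation values for the client to test.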
They actually swap roles and both run through that same text search protocol, using some linear algebra, and then they use zero-knowledge proofs to detect whether the other party is cheating, and abort in that case. I'll quickly conclude by saying that cryptography, generally, is kind of about this: any time you wish you had some trusted third party who could handle information for you, we're going to use cryptography to do that instead. And there are some tools out there that you can try out on your own information. I actually just recently redid the honest-but-curious version of that protocol in Python, really quickly, during a hackathon one night. But of course, don't rely on anything that hasn't been verified. And with that, there are some references; I'll be releasing the slides after the talk, and I'll tweet about that at Crypto Village. I think I'm just out of time. So thank you.