So today is going to be something of a change of pace. I'm going to do a quick introduction to public key cryptography, covering a bunch of different aspects of it. For those of you who went to Christel's talk the other day, there'll be a little bit of overlap. And it'll probably overlap with some of the stuff she's going to be doing later also. But it never hurts to see things twice from different perspectives. Anyway, some of you may have seen some of this. Hopefully you won't have seen all of it.
OK. So what problem does cryptography try to solve? It tries to solve the problem of two people communicating with each other securely, so that other people don't know what they are communicating. So we suppose that Bob wants to send Alice a secret message. And there's the adversary Eve, the eavesdropper, who can read the message that's transmitted, say because it's sent by radio or over the internet. But still Eve shouldn't be able to figure out the message.
And this says cryptography in the dark ages, which means pre-1970s, so not that long ago. So here's how stuff was done. First, Bob and Alice would share a secret key. It might be a number or, certainly back in the olden days, it might have been a key phrase or something like that. Bob uses the secret key to encrypt his message, and he sends that encrypted message to Alice. Now Eve gets to see the encrypted message, of course. Alice uses the secret key to reverse the encryption process, decrypt it, and read the message. But Eve, who can see the encrypted message, can't decipher it because she doesn't know the secret key. All well and good.
There's one drawback to this scenario, although I'm not sure people even thought of it as a drawback. They sort of thought of it as intrinsic to the whole process. The problem is Bob and Alice have to exchange the secret key. This could be done in person or by trusted courier, I don't know.
But anyway, so there's the problem: there's the secret key that they need to share before they can do anything. So here's sort of the problem. Suppose Alice is Amazon, and Bob wants to buy something and needs to send his credit card number. Well, Bob is not going to go to Amazon headquarters and get a secret key. Amazon could send a secret key to Bob over the internet, but then Eve could read the secret key also. And needless to say, Amazon didn't exist in the mid-1970s, but nonetheless.
So at that time, Diffie and Hellman proposed creating a cryptosystem that didn't use a single secret key, but used two keys. One of them was literally secret, but the other was public: not only do Bob and Alice both know it, but Eve knows the public key also. Everyone knows it. And the idea was that in order to encrypt a message, you only need the public key. So Alice has this private key that she keeps secret and a public key that she publishes for everyone to see. And Bob only needs that public key to encrypt the message. But to decrypt it, you actually need to know the private key. So only Alice can decrypt the message. OK, so that's the meta idea. It doesn't tell you how to do it, but it's an idea. It was really a breakthrough idea that this might even be possible, even if they didn't quite know how to do it.
And I'll just mention, because these are the terms used in cryptography, that the message being sent that can be read is usually called the plaintext, and the encrypted message is called the ciphertext. All right, all well and good. But Diffie and Hellman, in their groundbreaking paper, actually did not propose how to create such a system. They proposed something similar, what's called a key exchange, but they did not come up with an encryption scheme of this kind. These are now called public key cryptosystems or, in fancier terminology, asymmetric cryptosystems.
I usually use "public key," which is a little bit of a misnomer because these are two-key systems. There's a public key and a private key. But OK.
So as mathematicians, let's reformulate this mathematically. You can think of setting up cryptosystems and sending messages just as functions. And a public key cryptosystem would look like this. OK. So we have a set of possible public keys, some big set, with associated private keys, a set of private keys. There's the set of plaintexts, the messages that Bob might want to send. And then there's an encryption function. Its input is an ordered pair, a public key and a message. And its output is an encrypted message, a ciphertext. And then there's a decryption function. And the decryption function also takes two arguments, a private key and an encrypted message. And it gives you a plaintext as output. OK.
It's normally assumed that Eve knows what the encryption and decryption functions look like. This was a big deal in cryptography from roughly 50 BC up until the 1800s. So for a long time, people thought maybe it was better to base the security on keeping the encryption and decryption methods secret. It turns out that was a very bad idea. So the only real secret for encryption and decryption is Alice's choice of her private key.
All right. So that's what the functions look like. This doesn't really tell you what they should do, but that's easy enough. The idea is that if you take a pair consisting of a public key and a private key, and they are an associated pair, so a valid public-private key pair, then for any message, any plaintext, if you take the message and encrypt it using the public key and then decrypt it using the private key, you'll get the message back. And actually, I don't think I wrote this, but if you try to do this with any other private key, not the one associated to this public key, then you should get garbage back out. You don't get the message. OK.
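Written out, the setup just described might look like the following. The symbols are assumptions introduced here for illustration, not notation from the lecture: $\mathcal{K}_{\mathrm{pub}}$ and $\mathcal{K}_{\mathrm{priv}}$ for the key sets, $\mathcal{M}$ for plaintexts, $\mathcal{C}$ for ciphertexts.

```latex
\[
  \mathrm{Enc}\colon \mathcal{K}_{\mathrm{pub}} \times \mathcal{M} \longrightarrow \mathcal{C},
  \qquad
  \mathrm{Dec}\colon \mathcal{K}_{\mathrm{priv}} \times \mathcal{C} \longrightarrow \mathcal{M},
\]
\[
  \mathrm{Dec}\bigl(k_{\mathrm{priv}},\, \mathrm{Enc}(k_{\mathrm{pub}},\, m)\bigr) = m
  \quad \text{for every valid key pair } (k_{\mathrm{pub}}, k_{\mathrm{priv}})
  \text{ and every } m \in \mathcal{M}.
\]
```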
And since Eve knows the public key, that's great. But that doesn't let her decrypt messages, and hopefully it doesn't let her figure out which private key goes with Alice's public key. If she can do that, then she's just like Alice. She can decrypt everything. So for example, the set of private keys needs to be big enough that you can't just do a brute force search through the whole thing, or even a cleverer search.
All right. So these encryption and decryption functions, I mean, they're sort of inverses of each other. You encrypt, and you invert it by decryption. They're basically what are called trapdoor functions, which are functions that are very easy to compute. You don't want this encryption function to take 30 years to compute. You want it to take 30 milliseconds, or maybe even less. But you want inverting that function to be really hard. If I give you the ciphertext and even the public key, I don't want Eve to be able to invert the encryption function and get the plaintext. But this is the trapdoor: there's some extra piece of information, in this case the private key, that makes f inverse easy to compute. So the extra information is often called a trapdoor. Interestingly, this is one of the big problems. We actually don't know that trapdoor functions exist. But we build things we think are trapdoor functions.
OK, so how do you build these things? Well, they're generally based on hard math problems. Now, I like to say it's very easy to write down a hard math problem. I'm sure you know lots of hard math problems. In fact, it's very easy to write down math problems we don't even know how to solve. The trick, though, is we want them to have this trapdoor property, that with some extra information you can solve it. And that's trickier. So you want a hard math problem such that if I give you this math problem, you can't solve it. But if I give it to the person next to you with this extra information, it's really easy for them to solve.
And what I'm going to do is go through, I think, four hard problems that are used currently. There are others, but these are four of the main ones used to build cryptosystems. And I'm just going to describe the hard problems first. Then I'll tell you how to build cryptosystems out of each of them.
The first one is the integer factorization problem. If I give you two big numbers, two big primes, say, multiplying them is really easy. Depending on how big they are, a fourth grader can do it. If I give you the product, factoring is much, much harder, especially if the numbers are big. So that's the integer factorization problem. But that's not actually the problem I listed here, is it? OK. So what I listed here is just exponentiation modulo PQ. So for the information, well, I'm going to give you the product P times Q. I'm going to give you this exponent E. I'm going to pick my favorite secret X, and I'm going to tell you X to the E, reduced mod PQ. And the hard problem you have to solve is to figure out what X was. It turns out this is very, very easy to do if you know P and Q. And you've probably seen this, but even if you haven't, I'll show you how in just a minute. OK. So it's easy to compute, but hard to invert unless you know P and Q. Question mark. We don't actually know it's hard. We don't have a proof it's hard, but we suspect, we think, it's hard. And this is used to build the RSA cryptosystem, invented by Rivest, Shamir, and Adleman. Although there's an interesting story with the British Secret Service that I don't really have time to talk about now.
But anyway, the second one, which Christel talked about quite a bunch, is the discrete log problem. That RSA problem on the previous slide was raising things to a known exponent. So an unknown thing raised to a known exponent. The discrete log problem reverses it. You take a known thing and raise it to a secret exponent. And then the goal is to recover the exponent. OK. So F_P is the field with P elements, and F_P star is its set of nonzero elements.
So this is the multiplicative group there. And you just map Z mod (P minus 1)Z, which is an additive group, into it by taking a K and raising this known quantity to the K-th power, mod P. Easy to compute. I mean, if K is a 100-digit number, how easy is it to compute something to the googol power? Well, it turns out it's actually really easy. There's a very fast algorithm for exponentiation. And so it's easy to do this kind of thing. But we believe it's hard to invert this process, to find the exponent if I give you the power. And this is used to build something called the ElGamal public key cryptosystem. It was also used by Diffie and Hellman in their original paper to build what's called a key exchange, which isn't quite a cryptosystem, but it can be used to exchange information. And for the ElGamal public key cryptosystem, well, you fix some big prime P. I give you this G, which usually is a generator for this cyclic group, but it doesn't actually have to be. And I tell you what G to the K is, but I don't tell you what K is. OK.
That was the discrete log problem on the multiplicative group of a finite field. You can talk about the discrete log problem for any group. It's just that I give you an element of the group, and I give you that element multiplied by itself K times using the group law, and you have to recover K. So another group that gets used a lot is the group of points on an elliptic curve over a finite field. Probably most of you have seen elliptic curves. If you haven't, don't worry about it. It's just another kind of group. It's not going to come up again in my talks. It will come up again next week in some of the other talks.
Why should we bother using elliptic curves? So on the one hand, I have the multiplicative group of a field. The group law is multiply two numbers, take the remainder mod P. On the other hand, I have this elliptic curve, which has a group law that's given by this very messy polynomial formula.
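As a side note on the fast exponentiation mentioned above, here is a minimal Python sketch of one standard way to do it, square-and-multiply, together with a brute-force discrete log for contrast. The function names `power_mod` and `brute_force_log` and the tiny parameters `P = 101`, `G = 2` are assumptions chosen for illustration. Real systems use primes with hundreds of digits, where the brute-force loop is hopeless.

```python
def power_mod(g, k, p):
    """Compute g**k mod p by square-and-multiply.

    The loop runs once per bit of k, so even a 100-digit exponent
    needs only a few hundred modular multiplications.
    """
    result = 1
    base = g % p
    while k > 0:
        if k & 1:                      # current low bit of k is 1
            result = (result * base) % p
        base = (base * base) % p       # square for the next bit
        k >>= 1
    return result


def brute_force_log(g, h, p):
    """Find the smallest k >= 0 with g**k == h mod p by trying every k.

    This takes on the order of p steps, which is only feasible for toy p.
    """
    value = 1
    for k in range(p):
        if value == h % p:
            return k
        value = (value * g) % p
    return None  # h is not a power of g


# Hypothetical toy parameters, for illustration only.
P, G = 101, 2
secret_k = 37
public_h = power_mod(G, secret_k, P)          # easy direction
recovered_k = brute_force_log(G, public_h, P)  # hard direction, at scale
print(public_h, recovered_k)

# A googol-sized exponent is still fast with square-and-multiply.
print(power_mod(G, 10**100, P))
```

Python's built-in three-argument `pow(g, k, p)` does the same computation. The hand-written loop is there only to show why the cost grows with the number of bits of the exponent rather than with the exponent itself.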
Why should I bother with this more complicated group operation instead of the less complicated group operation? OK. Well, the reason is that, as far as we know based on current algorithms, solving the elliptic curve discrete log problem is actually quantitatively harder than solving the discrete log problem in the multiplicative group of a finite field. And I'll give you some formulas later, which illustrate just what I mean by quantitatively different. But what that means is, since the elliptic curve discrete log is harder, we think, when you use it for cryptography the key sizes are smaller. The message sizes are smaller. The ciphertexts are smaller.
Why is that good? Well, honestly, if you and your friend are sending messages to each other using your laptops, it is completely irrelevant. Who cares if your messages are 2,000 bits instead of 200 bits, right? I mean, those are bits. Think how many gigabytes your computer has, and the transmission rates. But if you're a company like Amazon, it makes a difference. You're doing an awful lot of data transmission. It also makes a difference if you have limited bandwidth. So for example, how many people, when they flew here, took a look at that 2D barcode on their plane ticket? Yeah? I mean, those are all over. At least for some of the airlines, they used to, and I think they probably still do, embed a digital signature in that barcode. And it's an elliptic curve digital signature. And why is it an elliptic curve digital signature? There aren't enough bits in one of those things for an RSA signature. So actually, one other example that I love: what kind of digital signatures are used by blockchains like Bitcoin? Again, elliptic curves. If you're invested in Bitcoin, your investment is protected by the discrete log problem on elliptic curves. OK. So again, the reason we do this is that we want smaller key sizes, smaller messages. You always want to be maximally efficient.
And people push the limits of the efficiency until the security gets broken. You've got to be very careful. But you're always trying to be maximally efficient. When I say be maximally secure, that's not really what I mean. What I really mean is you have to be secure or the thing's useless. You want to be maximally efficient for the given security.
The last problem is one that I already spent two lectures talking about, namely the closest vector problem. And here's how one would sort of formulate this. What we'll do is we'll pick a lattice, and I'll publish a bad basis for it. Remember, the bad bases are the ones where the vectors are not very orthogonal to one another, with very small angles and things. And then suppose Bob wants to encrypt his message. Say it's n bits, so each of these epsilons is 0 or 1. So that could be a message. Then he simply forms the vector in the lattice given by this linear combination, the epsilons times the published basis vectors. Now, if he simply sent Alice this linear combination, it would be totally trivial for Eve to decrypt it. If I give you some linearly independent vectors and I give you a linear combination of them, it's just inverting a matrix to find the coefficients. But Bob's cleverer than that. He throws in a small random vector. So now the vector he sends to Alice isn't in the lattice anymore. It's just close to the lattice. And the way Alice decrypts is, because she has a good basis, she can actually find the closest lattice vector, which will be this linear combination. And then she can just use linear algebra to get the epsilons. So that's the basic idea. And I'm going to discuss cryptosystems built from lattices in lecture four, so tomorrow.
So for today, I'll just quickly run through how one builds public key cryptosystems with the other hard problems that I mentioned. And again, as for the details, if you've seen them before, this will be review, great. If you haven't, don't worry about the details. The idea is just to sort of see how one uses mathematical problems to do things.
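To make the lattice idea above concrete, here is a minimal two-dimensional Python sketch. Everything specific in it is an assumption chosen for illustration: the bases `GOOD` and `BAD`, the function names, and the use of simple coordinate rounding to find the closest vector. In dimension 2 a bad basis can be improved quickly by lattice reduction, so this toy is in no way secure. It only shows the mechanics of "add a small error, then round with the good basis."

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy parameters, for illustration only.
GOOD = ((7, 1), (-1, 6))    # Alice's private basis, nearly orthogonal
BAD = ((19, 15), (25, 22))  # public basis: 3*g1 + 2*g2 and 4*g1 + 3*g2


def coords(basis, t):
    """Exact rational (x, y) with x*basis[0] + y*basis[1] == t (Cramer's rule)."""
    (a, b), (c, d) = basis
    det = a * d - b * c
    x = Fraction(t[0] * d - t[1] * c, det)
    y = Fraction(a * t[1] - b * t[0], det)
    return x, y


def combine(basis, x, y):
    """Return the vector x*basis[0] + y*basis[1]."""
    return (x * basis[0][0] + y * basis[1][0],
            x * basis[0][1] + y * basis[1][1])


def encrypt(bits, noise):
    """Bob: lattice point from the public basis, plus a small error vector."""
    px, py = combine(BAD, bits[0], bits[1])
    return (px + noise[0], py + noise[1])


def decrypt(ciphertext):
    """Alice: round in good-basis coordinates, then read off the bits."""
    x, y = coords(GOOD, ciphertext)
    closest = combine(GOOD, round(x), round(y))
    e1, e2 = coords(BAD, closest)   # integers, since closest is in the lattice
    return (int(e1), int(e2))


def naive_attack(ciphertext):
    """Eve: the same rounding trick, but using only the bad public basis."""
    x, y = coords(BAD, ciphertext)
    return (round(x), round(y))


message = (1, 1)
ciphertext = encrypt(message, noise=(1, -1))
print(decrypt(ciphertext))       # Alice's result
print(naive_attack(ciphertext))  # Eve's result with the bad basis
```

With these particular numbers, rounding in the good basis recovers the message for every error vector with entries in {-1, 0, 1}, while the same rounding in the bad basis already fails on the example shown.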
And in the problem session, you can start playing with some of these, especially the ones you haven't seen. They're really fun, for example, to prove why they work.
So for RSA, the private key is these two big primes. Those are my secret, or Alice's secret. And the public key is the product of those two primes and an encryption exponent. You need to be a little careful with your encryption exponent, but, I mean, a random choice is good. People like to take E as small as possible for efficiency, though. Some people suggest using E equals 3, and it's a little risky. OK, the plaintext would just be a number mod PQ. And for the ciphertext, Bob takes his secret message, raises it to the E-th power, where E is known, and reduces mod PQ, so he takes the remainder. That's his ciphertext C.
The way Alice decrypts is she raises C to a certain power D to undo the E. So what she wants is D times E to be congruent to 1, not mod PQ, because you're in the multiplicative group, but modulo the order of the multiplicative group mod PQ, which, if you remember from your number theory, is (Z mod P) star cross (Z mod Q) star. So it has (P minus 1) times (Q minus 1) elements, which is this number. So basically, Alice has to solve a congruence modulo this modulus that she knows, because she knows P and Q. So she can find this D and she can decrypt, and all is good.
Eve would also like to find this D, and one way to do it would be to know this modulus here. But since she knows P times Q, knowing this modulus is the same as knowing P and Q. So Eve can break the system if she can find P and Q, but that's OK, because we're assuming factoring P times Q is hard. Now I've cheated a little bit here, because the underlying problem is not factoring. The underlying problem is: if I give you M to the E, take the E-th root mod PQ and find M. OK? So RSA is actually based on the problem of finding roots, E-th roots, mod PQ.
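The RSA steps just described can be sketched in a few lines of Python. This is a toy under stated assumptions: the primes 61 and 53, the exponent 17, and the function names are chosen here for illustration. Real RSA uses primes hundreds of digits long, plus message padding that is not shown.

```python
def make_keys(p, q, e):
    """Alice: build the public key (n, e) and the decryption exponent d.

    d is the inverse of e modulo (p - 1)(q - 1), the order of the
    multiplicative group mod pq. Computing it requires knowing p and q.
    """
    n = p * q
    group_order = (p - 1) * (q - 1)
    # Modular inverse via three-argument pow (Python 3.8+).
    # Raises ValueError if e shares a factor with group_order.
    d = pow(e, -1, group_order)
    return (n, e), d


def encrypt(public_key, m):
    """Bob: raise the plaintext m to the e-th power mod n."""
    n, e = public_key
    return pow(m, e, n)


def decrypt(public_key, d, c):
    """Alice: raise the ciphertext c to the d-th power mod n."""
    n, _ = public_key
    return pow(c, d, n)


# Hypothetical toy parameters, for illustration only.
public_key, d = make_keys(p=61, q=53, e=17)
message = 65
ciphertext = encrypt(public_key, message)
print(ciphertext, decrypt(public_key, d, ciphertext))
```

Note that `make_keys` is the only place where the primes are used. Eve sees `public_key` and `ciphertext`, and recovering `message` from those is exactly the E-th root problem mod PQ.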