This talk is part of an online undergraduate course on the theory of numbers, and will be about RSA cryptography. So cryptography: crypt means hidden, and graphy means writing, so cryptography is just hidden writing. In other words, you want to send someone a coded message that other people can't read. Early in the 20th century some number theorists like Hardy used to almost boast about how useless number theory was. RSA cryptography turns out to be a rather major practical application of number theory; a lot of internet commerce depends on this and similar things. So the basic problem is as follows. You have two people who are traditionally called Alice and Bob, and they wish to communicate with each other over the internet or a phone or whatever. They want to send each other messages that they don't want anyone else to read. And in between them is someone traditionally called Eve, for eavesdropper I guess, who reads all the messages between Alice and Bob. Alice and Bob do not wish Eve to know what they're saying. For example, Alice might be you, and Bob might be the owner of some internet store, and Alice is sending her credit card number and doesn't want Eve to know what it is. Well, there are several methods of doing this. First, we could use a code book. So Alice and Bob could share a code, so AAAA means something and so on. This was a method used by navies in the 20th century. In the British and German navies, all the ships and submarines would carry these huge code books telling them what various sequences of letters really meant. Well, code books are a little bit cumbersome. The Germans had something called an Enigma machine, which was a very ingenious electromechanical device for turning messages into apparent gibberish. A third method is a one-time pad. A one-time pad consists of a sequence of random letters or numbers that both Alice and Bob have.
They have the same one-time pad, and Alice simply adds the garbage in the one-time pad to her message and Bob subtracts it again. If the one-time pad is random, this is completely secure. Well, these methods all have various problems. Code books and Enigma could actually be broken by code breaking. In fact the British broke the German codes, which has been very well publicised. The British keep rather more quiet about the fact that the Germans also broke the British naval codes, but whatever. One-time pads are very secure but they're really cumbersome. Actually making one-time pads used to be a really tedious process. They were used for very high-security communications. For example, the Russians tended to use one-time pads for their spies in the United States. Obviously this has a bit of a disadvantage: if you're caught and have a one-time pad on you, it's a little bit difficult to explain away. But all these methods have the following problem. We need to transfer something from Alice to Bob. I mean you need to transfer a code book or an Enigma machine or a one-time pad or whatever. And this is really inconvenient. If you're buying something off the internet, you don't want to have to take a code book along with you all the time. And if the internet seller had to send all customers a code book, this would be really inconvenient. So the problem is: how can Alice and Bob communicate without actually sending each other a physical object? They're only allowed to communicate over a telephone line or whatever, where Eve can listen to everything either of them says. Well, a method of doing this was found by Diffie and Hellman, at least assuming the existence of something called a trapdoor function. So what is a trapdoor function? Well, a trapdoor is sort of one way. You can go through it easily one way but can't go through it the other way. So a trapdoor function is a function f from some set S to the same set S.
First of all, f is easy to compute. Second, it has the property that the inverse is hard to compute unless you have a secret key. Well, if you can compute f, why can't you compute the inverse just by trying all elements of S and applying f to them to see what answer you get? Well, in order for the inverse to be hard to compute, S needs to be very large. It might typically have, say, 10 to the 100 or maybe 10 to the 1000 elements. So it's completely impractical to find the inverse of f by checking every element of S. S might be something like the integers from 1 to 10 to the 100. So you can encode messages by selecting an element of this set S. And then Alice and Bob can communicate as follows. Alice chooses some trapdoor function fA, I'll call it Alice's trapdoor function, and publishes it. So both Eve and Bob know Alice's trapdoor function. Bob uses fA to send messages to Alice. What Bob does is he takes a message, applies fA to it and sends that to Alice. Eve can read fA of the message but can't decode it, because this function fA is very hard to invert. It needs massive amounts of computing power. On the other hand, Alice can invert it and find the message, because she's got the secret key that allows her to invert the function fA. So that allows Bob to send secret messages to Alice. Alice sends secret messages to Bob in the same way: Bob chooses Bob's trapdoor function fB and publishes it. And Eve is then stuck: unless she has a computer that allows her to sort through 10 to the 1000 cases or something, she can't read the messages. Well, there's a problem here. We need to actually have a trapdoor function. I'd better say first of all that there's a difference between a trapdoor function and a secure hash function. Trapdoor functions and secure hashes are very similar.
They're both functions from some set S to itself that are easy to compute and whose inverse is hard to compute. The difference between a trapdoor function and a secure hash is that the trapdoor function has a secret key that allows you to invert it, but a secure hash doesn't. So there are actually several common applications of a secure hash that I'll just mention. There's one that's been in the news a lot recently, which is blockchain. A blockchain is a series of blocks, each of which contains a secure hash of the previous block. This means that if you know a block you can verify all the previous blocks by checking the secure hashes, because it's very hard to fake a secure hash. Another one is in Bitcoin. Mining a Bitcoin is sort of related to inverting a secure hash function. We want to find s so that f of s is in some small set. This is very difficult to do and takes a lot of computation. You just have to keep on trying values of s until you find one with f of s in that small set. So that's roughly what Bitcoin mining is. It's essentially trying to invert a secure hash function. There's another application, which is password files. In the very early days computers would just store passwords in plain text. This was a real problem: if a hacker actually found the password file then they would know everybody's password. So what you do is you hash all the passwords with a secure hash function. Then in order to check someone's password you just hash the password they've entered and see if it's the same as the hash you've stored. And since the hash function is hard to invert, if you don't know the password but have read the password file, it's still not much use to you.
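The password check just described can be sketched in a few lines of Python. This is only an illustration under some assumptions: SHA-256 from the standard `hashlib` module stands in for the secure hash, and the names `hash_password`, `check_password` and `password_file` are made up for this example. Real systems usually also add a per-user salt and a deliberately slow hash, which this sketch leaves out.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return the SHA-256 digest of the password as a hex string.

    Unsalted and fast, so suitable for illustration only.
    """
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical password file: it stores the hash, never the password itself.
password_file = {"alice": hash_password("correct horse")}


def check_password(user: str, attempt: str) -> bool:
    """Hash the attempt and compare it with the stored hash."""
    stored = password_file.get(user)
    if stored is None:
        return False
    # compare_digest compares the two hex strings in constant time.
    return hmac.compare_digest(stored, hash_password(attempt))
```

Someone who reads `password_file` sees only the hashes, so to log in they would still have to find an input that hashes to the stored value.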
Incidentally, there's been a certain amount of speculation about whether certain public secure hash functions are really secretly trapdoor functions, because of course if you had a trapdoor function and published it as a secure hash function then you'd be able to read everything that everybody has hashed with it. But as far as I know the secure hash functions in common use don't have secret trapdoors, and no one's even been able to think of a way in which they could have secret trapdoors. So anyway we now come to the following problem: find a trapdoor function. The first example of a trapdoor function was found by Clifford Cocks in about 1973. For various technical reasons Clifford Cocks never published this, and it wasn't really known that he'd found it until about 20 or 30 years later. His trapdoor function was rediscovered by Rivest, Shamir and Adleman a few years later and is usually known by their initials, RSA. And it works as follows. We pick P and Q to be two large primes. Well, what does large mean? Large means pretty large. It means hundreds or maybe thousands of digits, depending on how secure you want to be and how paranoid you are. Now what you do is you publish the product M equals P times Q, and you publish a large integer K. However, you're careful not to publish P and Q themselves, so P and Q are kept secret. You only publish the product. Now your trapdoor function is given by F of X equals X to the K modulo M, where M is this number PQ. So we've published M. You notice this is easy to calculate, because we showed earlier that calculating powers modulo an integer can be done fairly rapidly by writing K in binary and remembering to reduce mod M at each step. So how do you invert it? Well, we can invert it using the secret key. The secret key is knowing the individual primes P and Q such that PQ equals M. And you do this using Euler's theorem.
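The remark above about writing K in binary and reducing mod M at each step can be made concrete with a short Python sketch. The function name `power_mod` and the toy values of P, Q and K are assumptions chosen for illustration; real keys use primes with hundreds of digits. Python's built-in three-argument `pow` does the same job.

```python
def power_mod(x: int, k: int, m: int) -> int:
    """Compute x**k mod m by scanning the binary digits of k.

    Every intermediate product is reduced mod m, so the numbers never
    grow much beyond m squared.
    """
    result = 1 % m
    base = x % m
    while k > 0:
        if k & 1:                      # current binary digit of k is 1
            result = (result * base) % m
        base = (base * base) % m       # square for the next binary digit
        k >>= 1
    return result


# Toy public key (far too small to be secure).
P, Q = 61, 53
M = P * Q      # published
K = 17         # published


def trapdoor(x: int) -> int:
    """The RSA trapdoor function F(X) = X^K mod M for the toy key above."""
    return power_mod(x, K, M)
```

The loop runs once per binary digit of K, so even an exponent with a thousand digits needs only a few thousand multiplications.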
So Euler's theorem says that X to the phi of M is congruent to 1 modulo M whenever X is coprime to M. So X is congruent to X to the phi of M plus 1 modulo M. So if JK is congruent to 1 modulo phi of M, then X to the J, all to the K, will be congruent to X modulo M. So X goes to X to the J is the inverse of X goes to X to the K, which was your trapdoor function. Well, how do you find J? We need to solve the congruence JK congruent to 1 modulo phi of M, which needs K to be coprime to phi of M. So J is the inverse of K modulo phi of M. We can find it using Euclid's algorithm, because this is just finding the inverse of a number modulo phi of M, provided we know phi of M. Well, phi of M is easy because M is PQ, a product of two primes. So phi of M is equal to PQ minus P minus Q plus 1, which is just P minus 1 times Q minus 1. So if we know the factorization of M then we can invert our trapdoor function. So there's your trapdoor function. If you know the factorization of M you can invert it, and if you don't then it seems to be very difficult. Well, let's discuss a few problems with this. Problem one: how do we find large primes P and Q? Remember these have hundreds of digits. You can't really test whether they're prime by trying all possible factors. That would just take too long. So what can we do? Well, as we saw earlier, we can use various primality tests. For instance we can check to see if 2 to the x is congruent to 2 modulo x. If this is true then x is probably prime. So you can just choose a random hundred-digit number and test this, and if it passes then you can use x as your prime. Well, of course that's not certain, because you might have run into one of these funny pseudoprimes, such as a Carmichael number, and it might not be prime. This is actually a very crude primality test. There are much better tests that people use, and there are even fast tests that make it certain that P is prime. But the point is that finding large primes is not actually very difficult. By the way, the difficult part in this is that it's hard to produce random numbers on a computer.
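Here is a small Python sketch of the two calculations discussed above: finding the inverse exponent J with Euclid's algorithm when P and Q are known, and the crude test of whether 2 to the x is congruent to 2 modulo x. The function names and the toy primes 61 and 53 with K equal to 17 are assumptions for illustration only.

```python
def extended_gcd(a: int, b: int):
    """Euclid's algorithm, returning (g, s, t) with g = gcd(a, b) = s*a + t*b."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_s, s = s, old_s - quotient * s
        old_t, t = t, old_t - quotient * t
    return old_r, old_s, old_t


def inverse_exponent(k: int, p: int, q: int) -> int:
    """Return J with J*K congruent to 1 modulo phi(M), where M = p*q."""
    phi = (p - 1) * (q - 1)
    g, s, _ = extended_gcd(k, phi)
    if g != 1:
        raise ValueError("K must be coprime to phi(M)")
    return s % phi


def passes_base2_test(x: int) -> bool:
    """Crude test: is 2^x congruent to 2 modulo x? Some composites pass."""
    return pow(2, x, x) == 2 % x


# Toy key: the secret primes, the public modulus and exponent, the secret J.
P, Q, K = 61, 53, 17
M = P * Q
J = inverse_exponent(K, P, Q)

# Round trip for one message: apply the trapdoor function, then invert it.
message = 1234
recovered = pow(pow(message, K, M), J, M)
```

With these toy values `recovered` equals `message`. The composite number 341, which is 11 times 31, passes `passes_base2_test`, which is why this test on its own only says "probably prime".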
OK, well, the programming language that you use probably has a random number generator, except it's not actually generating random numbers. It's generating numbers according to some simple numerical procedure, and what it's actually producing is pseudorandom numbers. The trouble is that if anybody knows what random number generator you're using, they can easily simulate this and find out what your supposedly random numbers are. So you should never use an ordinary random number generator if you want to produce random numbers in cryptography. This is a kind of standard mistake. What you can do instead, and a common method, is to time keystrokes. Your computer can measure the time between different keystrokes in milliseconds, and it can work out whether this number of milliseconds is odd or even, and that gives you one random bit of information. So if you type lots and lots of keys you can actually generate reasonably random sequences that are hard for an attacker to find out. So finding large primes, large random primes, whatever random means, is something you can do if you're not stupid. Problem two: can we break it? Well, one way to break it would be: can we factor PQ fast? At the moment, if P times Q has say a thousand digits, there seems to be no fast way to factor it. All known algorithms for factorising numbers on ordinary computers take far more than polynomial time in general. It's unclear if there's a method of inverting the function without actually factorising PQ. No one's managed to think of one, and as far as I know no one's managed to rule out this possibility. So there's a bit of uncertainty there. We can factor PQ very fast on a quantum computer. For quantum computers there's something called Shor's algorithm, which allows you to factorise large numbers fast if you've got a big working quantum computer. Well, this leads into a bit of a problem, because we don't actually have a working quantum computer big enough to factorise these large numbers.
It's very difficult to work out exactly how good quantum computers are at the moment, because there's an incredible amount of hype put out by some people involved in quantum computing about the performance of their computers. I was going to have a sort of rant about this, but I've decided that it's not really connected with number theory, so I'm putting the rant into a separate video that you can link to if you have nothing better to do with your time. Problem three is that there may be plenty of other attacks. One problem in cryptography, in trying to keep your messages secret, is stupidity. The mathematician Ian Cassels used to work for the British in code breaking during the Second World War, and he said once that they'd never managed to break a code without somebody on the German side doing something really stupid, and there are all sorts of stupid things you can do. For example, suppose you want to have several different trapdoor functions using different pairs of primes. You might have P1, Q1 and P2, Q2 and so on. Maybe you think finding these large thousand-digit primes is too difficult, so you come up with a great idea. You're going to pick one prime P and then you're going to choose various other primes Q1, Q2 and so on, and form your trapdoor functions using the products P times Q1, P times Q2 and so on. This seems perfectly safe, because you can't factor P times Q1, that's just as hard as before, and you can't factor P times Q2, and so on. But if you do that you've done something really dumb, because an attacker can take two of these public keys and work out their greatest common divisor, and that will be P. So if you reuse a large prime in order to save effort, which at first sight seems harmless, your codes will be broken by any competent attacker. Another example of something that can go wrong is the so-called man-in-the-middle attack. This is just a typical example of the many things that can go wrong with an apparently secure cryptosystem.
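Going back to the reused-prime mistake for a moment, here is a minimal Python sketch of the greatest common divisor attack. The function name `recover_shared_prime` is made up for this example, and the primes are the Mersenne primes 2^127 - 1, 2^89 - 1 and 2^107 - 1, chosen only because they are well-known primes of moderate size; real RSA primes would be larger and random.

```python
from math import gcd
from typing import Optional


def recover_shared_prime(m1: int, m2: int) -> Optional[int]:
    """Return a nontrivial common factor of two public moduli, if there is one."""
    g = gcd(m1, m2)
    if 1 < g < min(m1, m2):
        return g
    return None


# The key owner reuses the prime P in two different public keys.
P = 2**127 - 1
Q1 = 2**89 - 1
Q2 = 2**107 - 1
M1 = P * Q1   # first public modulus
M2 = P * Q2   # second public modulus

# The attacker only sees M1 and M2.
shared = recover_shared_prime(M1, M2)
other_factor_1 = M1 // shared
other_factor_2 = M2 // shared
```

Euclid's algorithm takes a number of steps roughly proportional to the number of digits, so this works just as quickly on thousand-digit moduli, and once P is known both keys are fully factored.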
So the man-in-the-middle attack is as follows. You have Alice here communicating with Bob here, and we have Eve in the middle, but instead of just watching what goes between Alice and Bob, Eve has actually taken control of some computer on the internet through which Alice and Bob's messages pass. Now what Eve does is as follows. Alice sends her trapdoor function fA down the internet, and Eve intercepts it and sends Bob Eve's own trapdoor function, with a little message saying it's from Alice. Bob then uses Eve's trapdoor function to send the message back. Eve unravels it, reads the message and then uses Alice's trapdoor function to send the message on to Alice. So Bob has sent a message to Alice using what he thinks is Alice's trapdoor function, but Eve has managed to read it by intercepting it and changing it. So even if you could prove mathematically that you can't factorise large numbers fast, this wouldn't actually mean your cryptographic system was secure, because there are all sorts of other ways it can be attacked. Okay, I think I'll leave it at that. The next video will not really be about number theory; as I said, it's a sort of rant about quantum computers, and after that I'll get back to number theory.