In this lesson, we're going to look at a specific variation of the Vigenère cipher called the autokey cipher. Then we'll look at one more variation, the one-time pad, which is the first cryptosystem in this course that is actually perfectly secure for encrypting a message. Let's start with what would happen if you tried to brute force the Vigenère cipher, since that provides some motivation for why the autokey and the one-time pad are needed. We've got a ciphertext here and a candidate keyword, the two-letter word UN. We set up our running key, which is the name for what you get when you repeat the keyword over and over until it runs as long as the message, and we use it to decipher the ciphertext into a candidate. We look at that candidate and realize that's not it. That's not a plaintext message. So we try a different two-letter keyword, maybe HI. We create the running key, decipher the ciphertext, and that's not a plausible candidate either. We do this a bunch. In fact, we try all of our two-letter keywords, and none of them look right. So we jump up to three-letter keywords. We try something like ABC. That's not it. Maybe we try DOG, and then all the rest of the three-letter keywords, but none of them work either. Let's pause before we go on to four-letter keywords and figure out how many keys we just tried, to see if this is really a path we want to keep going down. Start with the two-letter keywords. There are 26 choices for the first letter and 26 choices for the second letter, which gives 26 times 26. Then we subtract off the 26 two-letter keywords that are just a repeated letter, like AA or BB or CC and so on, since those aren't really Vigenère keys. They're more of a Caesar cipher.
If we use the same key letter on every single letter, that's just Caesar, so we'll assume that whoever encrypted the message didn't just use a Caesar cipher. That leaves 650 two-letter keywords. Using the same process for three-letter keywords, we compute 26 times 26 times 26 and subtract off the 26 keywords AAA, BBB, CCC, and so on. So there are 17,550 three-letter keywords, which means that if we brute force this by starting with the two-letter keywords and then moving on to the three-letter keywords, we will have tried 18,200 keys in total by the time we finish the three-letter keywords. Remember, these keywords can be of any length. Suppose we keep going and try n-letter keywords, where n is three, four, five, and so on. Once we've done the four-letter keywords, we've tried our 18,200 plus another 26 to the fourth power minus 26, which brings us close to half a million. At n equals five, we've got over 12 million keywords to try. Jump up to eight, and we're at about 217 billion. At n equals 10, we're up to about 146 trillion. And when we get to 20, I'm not even going to try to read off that number. You can see we get to a really large number of keywords that we would need to try in order to brute force this. So it looks like a longer keyword makes the message more computationally secure, meaning it's harder for us to use a computer to brute force the right key and recover the plaintext. So, a question: how long would the most computationally secure key be for a message? Think about that for a moment. The answer we're looking for is that the key should be the same length as the message. If longer keywords are better, then why not make the keyword as long as the message itself? That would be the longest key that we could conceivably use.
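The counting above is easy to check with a few lines of Python. This is only a sketch that follows the lecture's counting rule (every n-letter string except the 26 that repeat a single letter); the function names are made up for illustration.

```python
def keywords_of_length(n: int) -> int:
    """Count n-letter keywords, leaving out the 26 single-letter repeats (AA, BBB, ...)."""
    return 26 ** n - 26


def keys_tried_through(n: int) -> int:
    """Total keys tried after exhausting every keyword length from 2 up to n."""
    return sum(keywords_of_length(k) for k in range(2, n + 1))


# Print the running total for the keyword lengths mentioned in the lecture.
for n in (2, 3, 4, 5, 8, 10, 20):
    print(f"n = {n:2d}: {keys_tried_through(n):,} keys tried so far")
```

The totals grow by roughly a factor of 26 with each extra letter, which is why longer keywords are so much harder to brute force.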
The problem is, how do you create a key that's that long but still easy to use? If I just rattled off a really long sentence to you, it might be hard to remember it and get the spelling right. So we need some way to make it a little easier to use, and that's where the autokey cipher comes in. To do the autokey cipher, which by the way is the cipher that Blaise de Vigenère actually invented, we start with what's known as a priming key, which is a keyword like the ones we've used before. But instead of taking that priming key and repeating it over the plaintext, we're going to do something slightly different. By the way, in modern vocabulary, this priming key is loosely similar to what might be called an initialization vector. It's part of a larger class of ciphers called stream ciphers that we'll look at a little later in this course, and the autokey cipher is a kind of classical version of a stream cipher. So anyway, we start with our priming key. In this case, we're going to use the word UNICORN over our plaintext message, "I saw Michelangelo at work," the message we've seen in previous lessons. Now, instead of repeating UNICORN over and over again, we take the plaintext itself and put that in as the rest of our running key. So the running key is UNICORN followed by ISAWMICHELANGEL, stopping there at the L because that's as long as the message. And you might be thinking, why would you use the plaintext as part of the key? Well, remember, you have both the plaintext and the priming key when you're creating the message. So you've got everything you need as the message creator. You might be wondering, though, how you're going to have the plaintext to use as your key when you are the message receiver.
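Here is a minimal sketch of the running key and the encryption step just described. It assumes the usual Vigenère convention from earlier lessons (uppercase A to Z only, A = 0, add key and plaintext letters modulo 26); the function names are hypothetical.

```python
A = ord("A")


def autokey_running_key(priming_key: str, plaintext: str) -> str:
    """Priming key followed by the plaintext itself, cut to the message length."""
    return (priming_key + plaintext)[: len(plaintext)]


def autokey_encrypt(priming_key: str, plaintext: str) -> str:
    """Add each plaintext letter to its running-key letter, modulo 26."""
    key = autokey_running_key(priming_key, plaintext)
    return "".join(
        chr((ord(p) - A + ord(k) - A) % 26 + A) for p, k in zip(plaintext, key)
    )


message = "ISAWMICHELANGELOATWORK"
print(autokey_running_key("UNICORN", message))  # UNICORNISAWMICHELANGEL
print(autokey_encrypt("UNICORN", message))
```

Notice that the sender never has to remember anything beyond the priming key. The rest of the running key comes for free from the message being sent.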
Well, let's finish encrypting this message to get our ciphertext, and then we'll switch roles and see what this looks like for somebody who received the message, and how you can still decrypt it even though you don't have the plaintext from the start. But first, let's pause and recognize that we've got something that meets the criteria we were hoping for from the beginning: a key that is as long as the message and is still pretty easy to use. I only needed one keyword, and now I have this nice running key. It doesn't have as much structure as the Vigenère running key, meaning we don't have the same keyword over and over again, which we'll see later on introduces patterns into the ciphertext that we'd want to avoid to keep it secure. So here we've got a new ciphertext message with a new priming key, the word QUEENLY. If we have the ciphertext and the priming key, we can at least start decoding the message even though we don't have the rest of the running key yet. As soon as we decipher the first character and get the plaintext letter A, we have the next character of the running key, so we can move that A up into the running key, right after the priming key. Then we decipher the next character of the message, a T, which gives us the next letter of the running key, and so on. We go one letter at a time, and every time we get a new plaintext letter, we add it to the running key, so the running key always stays ahead of the position we're deciphering until the whole message is decoded. So even though we didn't have the plaintext when we received the message, the priming key lets us generate just enough running key to keep moving down the line until we've deciphered the entire message. So how do we defeat the autokey?
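The letter-by-letter decryption described above can be sketched as follows. This is an illustration under the same assumptions as standard Vigenère (uppercase A to Z, A = 0, arithmetic modulo 26). The sample plaintext is hypothetical, not the message from the slide, and the encrypt function is included only so the round trip can be checked.

```python
A = ord("A")


def autokey_encrypt(priming_key: str, plaintext: str) -> str:
    """Encrypt with the running key: priming key followed by the plaintext."""
    key = (priming_key + plaintext)[: len(plaintext)]
    return "".join(
        chr((ord(p) - A + ord(k) - A) % 26 + A) for p, k in zip(plaintext, key)
    )


def autokey_decrypt(priming_key: str, ciphertext: str) -> str:
    """Decrypt one letter at a time, growing the running key as we go."""
    key = list(priming_key)  # the part of the running key known so far
    plaintext = []
    for i, c in enumerate(ciphertext):
        p = chr((ord(c) - ord(key[i])) % 26 + A)  # subtract the key letter mod 26
        plaintext.append(p)
        key.append(p)  # each recovered letter extends the running key
    return "".join(plaintext)


secret = autokey_encrypt("QUEENLY", "ATTACKATDAWN")  # hypothetical message
print(autokey_decrypt("QUEENLY", secret))  # ATTACKATDAWN
```

Because the running key starts with the whole priming key and gains one letter per step, the key letter needed at each position is always already available.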
There's a reason why this is not still in use today: it does have some weaknesses. We're not going to go into all of the details, since that's a little beyond the scope of this course, but the attack is a very manual process, and it goes something like this. We start by guessing some common words that we think might appear in the plaintext, since that means they'll also appear somewhere in the running key. This is a bad part of the autokey: because we're using English text as part of our key, an attacker can probably guess some of the words in that running key. If we just had one keyword repeated over and over, the chance of guessing that one word correctly is probably a lot smaller than the chance of guessing some English word that happens to be in the message. So while the autokey added length to our key, and we'll see it definitely added some security against frequency analysis, it is actually more vulnerable to these manual guess-and-check attacks. Suppose we think the word THAT is somewhere in the plaintext, and therefore in the running key. We can place it at the beginning and decipher those ciphertext letters as if THAT were the key at that position. Look at the resulting plaintext. I don't think that's English. So we slide it over one spot, and that doesn't look like English either. We keep sliding it to the right until we get it over the LLAM in our ciphertext, and I see this English-looking string pop out underneath in the candidate row: S-E-A-T. Maybe that's SEAT, or maybe it's the end of one word followed by the start of the word ATTACK. I don't know. But it definitely has the right structure to conceivably be English. If that's right, I know two other pieces of information. First, if S-E-A-T is in fact the correct plaintext, then S-E-A-T is also going to be the next letters in the running key.
So I can move that up and to the right. Second, it means that T-H-A-T, the word I guessed as the key, was an earlier part of the plaintext. So we get a lot of information that we can continue to use. If I know the plaintext and the ciphertext, I can use them to reverse engineer part of the running key. And if I know the ciphertext and the key, I can use them to get more of the plaintext. So now we can work this message in both directions. It's pretty manual, and there's not a great way that we'd know how to automate this using Python quite yet. But once you get that first piece guessed correctly in your running key, the rest unravels pretty quickly. You have to have a good command of the English language, because you're not always going to see whole words pop out, but maybe just groups of characters or partial pieces of words. That's why, historically, a lot of the best cryptanalysts were linguists: they could see the structure and patterns in the language and in partial pieces of plaintext, and were able to make the best guesses about what was right and what was wrong. Let's look at how the autokey does against frequency analysis. This is the entire text of Pride and Prejudice encrypted with the priming key UNICORN. Everything here looks pretty good. Every letter is between about two and a half and four and a half percent, and nothing is really above 5%. So it does a better job of disguising character frequencies than any standard Vigenère cipher could. But it's not quite at the gold standard. If we could do this absolutely perfectly, all of those bars would be at about 3.8%, which is one divided by 26. We're not there yet, although we're getting close. But there actually is a way to get even closer, and that's by using the one-time pad.
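To see how a frequency chart like that might be produced, here is a rough sketch. It assumes uppercase A to Z with the usual modulo 26 addition, and it uses only the opening sentence of Pride and Prejudice as a stand-in sample, so the percentages will be much noisier than the chart for the whole novel. The function names are hypothetical.

```python
from collections import Counter

A = ord("A")


def autokey_encrypt(priming_key: str, plaintext: str) -> str:
    """Autokey encryption: running key is the priming key followed by the plaintext."""
    key = (priming_key + plaintext)[: len(plaintext)]
    return "".join(
        chr((ord(p) - A + ord(k) - A) % 26 + A) for p, k in zip(plaintext, key)
    )


def letter_percentages(text: str) -> dict[str, float]:
    """Percentage of the text taken up by each letter A to Z."""
    counts = Counter(text)
    return {chr(A + i): 100 * counts[chr(A + i)] / len(text) for i in range(26)}


sample = (
    "It is a truth universally acknowledged, that a single man in possession "
    "of a good fortune, must be in want of a wife."
)
# Keep letters only, in uppercase, before encrypting.
plaintext = "".join(ch for ch in sample.upper() if "A" <= ch <= "Z")
ciphertext = autokey_encrypt("UNICORN", plaintext)

percentages = letter_percentages(ciphertext)
for letter, pct in percentages.items():
    print(f"{letter}: {pct:4.1f}%")
print(f"perfectly flat would be {100 / 26:.1f}% per letter")
```

Running the same function on the plaintext and on the ciphertext of a long text gives the side-by-side comparison that the chart is showing.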
A one-time pad cipher is an encryption technique where we still create a running key, but it is made from a purely random key stream. We're not using a priming word, and we're not using the patterns of the English language. It is completely random. The idea was originally thought up in the late 1800s, but it became well known in the early 1900s through Gilbert Vernam, back when teletype systems with paper tape were used for transmitting messages. They would actually use this one-time pad, because if you did it correctly, it would give you a truly unbreakable cipher. It gets its name from the pads of paper, one of which we'll see in just a moment, on which the key stream was usually printed before it was distributed. For a one-time pad to work, you have to follow this set of rules. Otherwise, it all falls apart. First, the one-time pad should consist of truly random characters. That's really hard to do, because, as we'll see, generating randomness with a computer or some other tool is really hard. So people would often use something like radio static, and I've seen examples that use lava lamps, because they have this really hard-to-predict randomness. Second, the key should have the same length as the plaintext, which we've seen already. Third, there should only be two copies of the one-time pad, one for you and one for the receiver. If anybody else has a copy, you're in trouble. Fourth, you should use the key one time and only one time. That's really where things get complicated, because a lot of work goes into creating these pads and then you only get to use each one once. After you use it that one time, you've got to destroy it. Otherwise, somebody else might find it and use it to decipher the message. All of these restrictions make this really, really hard to pull off in real life.
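The first two rules can be sketched in Python. This is an illustration only: it uses the standard library's `secrets` module as a stand-in for a true random source (it draws from the operating system's secure random generator, which is not the same thing as radio static or lava lamps), and the message and function names are hypothetical.

```python
import secrets
import string

A = ord("A")


def generate_pad(length: int) -> str:
    """Pick one random uppercase letter per plaintext letter."""
    return "".join(secrets.choice(string.ascii_uppercase) for _ in range(length))


def otp_encrypt(plaintext: str, pad: str) -> str:
    """Vigenère-style addition modulo 26, with the pad as the running key."""
    if len(pad) != len(plaintext):
        raise ValueError("the pad must be exactly as long as the message")
    return "".join(
        chr((ord(p) - A + ord(k) - A) % 26 + A) for p, k in zip(plaintext, pad)
    )


def otp_decrypt(ciphertext: str, pad: str) -> str:
    """Subtract the pad letter by letter, modulo 26."""
    if len(pad) != len(ciphertext):
        raise ValueError("the pad must be exactly as long as the message")
    return "".join(
        chr((ord(c) - ord(k)) % 26 + A) for c, k in zip(ciphertext, pad)
    )


message = "MEETATNOON"  # hypothetical message
pad = generate_pad(len(message))
secret = otp_encrypt(message, pad)
print(pad, secret, otp_decrypt(secret, pad))
```

The other two rules, keeping exactly two copies and destroying the pad after one use, are about handling the key and can't be enforced by code like this.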
So this is a high-value type of security that requires a lot of planning and a lot of meeting up ahead of time, and you would only use it sparingly, especially back in the early 1900s. It's hard to hand these keys to people in person, and in fact they used to hide them for spies and things like that. On the left-hand side of the screen is a kind of code book. The sheets are numbered. You would use one sheet, tear it off and burn it, and then use the next sheet for your next message. To distribute them, the pads would be printed on really small pieces of paper. The one on the right uses numbers, so it's maybe not exactly the Vigenère cipher that we know, but it's the same idea. They would actually hollow out walnuts and put the pads in there, so you could hide them in plain sight. So again, it's really difficult to implement this type of cipher. And it opens up a bigger question in cryptography: how do you get the keys to people? If the key is a secret, but you have to get that secret to somebody else so they can decode your message, then how are you going to get it to them? And if you could get them that secret information, why wouldn't you just give them the message itself? This key distribution problem is huge in cryptography, and it's not until more modern times, the 1970s, that we really start to get a good mathematical way to solve it. All right, let's look at a one-time pad cipher. Here's our message: "We will infiltrate the tree house at dawn." We create this one-time pad key, a purely random string of characters that starts Y, Y, I, V, F and so on. We encrypt, and we get this ciphertext down here at the bottom. It works just like a Vigenère cipher. It just needs that randomness in the one-time pad key. Here's another ciphertext, and there's our random key.
This is the same ciphertext that we just generated. Let's say we try to guess the random key, so we're going to try to brute force this. First of all, that's a lot of choices. There are 36 characters in our ciphertext, so there are 26 to the power of 36 choices for the random key. And just by chance, this random key happens to give us a candidate that looks like it could have been the plaintext: "The people in the tree house are our friends." But this message is very different from the actual plaintext that created the ciphertext. That's because when you are free to choose any character for the key at any position, you can decode the ciphertext into any candidate plaintext you like, including ones that are real English messages. In fact, if you were to brute force this, you would in theory generate all possible messages that are 36 characters long. They're all equally likely to appear when you just start guessing at the key stream, and there's no way for somebody who's trying to brute force this to tell which one is the actual plaintext without knowing the true one-time pad key. We saw something similar when we were talking about substitution ciphers, where if you just randomly mapped ciphertext characters to plaintext characters, you could get a lot of different plausible plaintexts to come out. In that situation, you couldn't have literally any plaintext come out, because there was a one-to-one mapping between plaintext and ciphertext characters. It was monoalphabetic. But with this Vigenère-style cipher and a random key, truly any message of 36 characters could pop out, and they're all equally likely to occur. That's why this system is so secure: even if an eavesdropper could brute force it, there's no way for them to know which message was the correct one.
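One way to convince yourself of this is to work backwards: pick any message you like of the right length, and compute the key that would decrypt the ciphertext to it. The sketch below does that under the usual assumptions (uppercase A to Z, A = 0, arithmetic modulo 26). The messages, the "true" key, and the function names are all hypothetical, chosen only so the lengths match.

```python
A = ord("A")


def encrypt(plaintext: str, key: str) -> str:
    """Add key letters to plaintext letters, modulo 26."""
    return "".join(
        chr((ord(p) - A + ord(k) - A) % 26 + A) for p, k in zip(plaintext, key)
    )


def decrypt(ciphertext: str, key: str) -> str:
    """Subtract key letters from ciphertext letters, modulo 26."""
    return "".join(
        chr((ord(c) - ord(k)) % 26 + A) for c, k in zip(ciphertext, key)
    )


def key_for(ciphertext: str, candidate: str) -> str:
    """The key that makes this ciphertext decrypt to the chosen candidate."""
    return "".join(
        chr((ord(c) - ord(p)) % 26 + A) for c, p in zip(ciphertext, candidate)
    )


real_message = "ATTACKATDAWN"   # hypothetical true plaintext
true_key = "XMCKLQPTZRWB"       # hypothetical pad, same length as the message
ciphertext = encrypt(real_message, true_key)

decoy = "RETREATATSIX"          # a completely different 12-letter message
decoy_key = key_for(ciphertext, decoy)

print(decrypt(ciphertext, true_key))   # ATTACKATDAWN
print(decrypt(ciphertext, decoy_key))  # RETREATATSIX
```

Both keys are equally valid strings of 12 letters, so someone guessing keys has no basis for preferring one decryption over the other.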
So if these keys work so well and they're so hard to generate and share, why can't we just reuse them? It seems like we've got a great system in place, so why do we have to limit it to one time only? Let's take a look at why that is. There's a good mathematical reason for it. Suppose we generate a ciphertext this way: we've got our message and our random one-time pad key, and we use them to create the ciphertext we see on the right, U, C, E, D, Q and so on. Then we create a different ciphertext by taking a new plaintext, "begin" and so on, and using the same one-time pad key. Suppose somebody is eavesdropping when we send these two messages, so they have both ciphertexts, and they suspect that we used the same one-time pad key to generate them. Let's look at what they could do. The first ciphertext is plaintext one plus the one-time pad key, and the second ciphertext is plaintext two plus the one-time pad key. If they subtract these messages, meaning they go character by character, convert the letters to numbers, and take the difference, then mathematically what they're creating is the difference between the two plaintexts plus the difference between the two keys. But because we reused the key, the one-time pad characters cancel out: Y minus Y is zero, Y minus Y is zero, I minus I is zero, V minus V is zero, and so on. We end up with just the difference between the two plaintext messages. If you actually compute that, it's not readable, but what we've done in this process is remove all of the randomness from the one-time pad that made it so secure. We've essentially reduced the problem to another ciphertext without the security of the one-time pad. It has the same security as a running key cipher whose key is ordinary text rather than random characters, so it's kind of like the autokey. And as we saw earlier, those are possible to crack.
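The cancellation can be checked directly. In this sketch (uppercase A to Z, A = 0, arithmetic modulo 26) the two plaintexts are hypothetical 16-letter stand-ins, the pad is generated with the standard library's `secrets` module, and the function names are made up for illustration.

```python
import secrets
import string

A = ord("A")


def add(text: str, key: str) -> str:
    """Letter-by-letter addition modulo 26 (encryption)."""
    return "".join(
        chr((ord(t) - A + ord(k) - A) % 26 + A) for t, k in zip(text, key)
    )


def subtract(x: str, y: str) -> str:
    """Letter-by-letter difference modulo 26."""
    return "".join(chr((ord(a) - ord(b)) % 26 + A) for a, b in zip(x, y))


p1 = "WEWILLINFILTRATE"  # hypothetical plaintext one
p2 = "BEGINSATMIDNIGHT"  # hypothetical plaintext two, same length
pad = "".join(secrets.choice(string.ascii_uppercase) for _ in range(len(p1)))

c1 = add(p1, pad)  # both messages encrypted with the SAME pad
c2 = add(p2, pad)

# (p1 + pad) - (p2 + pad) = p1 - p2: the pad drops out entirely.
print(subtract(c1, c2))
print(subtract(p1, p2))
```

The two printed lines match no matter which pad was generated, which is exactly the problem: the eavesdropper gets a quantity that depends only on the two plaintexts.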
Now, they're not easy to crack, and there's still going to be a lot of guesswork, but it is possible. We've taken this from a mathematically impossible task to a task that is merely difficult. So from the attacker's point of view, we're headed in the right direction. And that is why, in principle, we should never reuse any keys. There's nothing special about the one-time pad cipher in that respect. It's just that reusing a one-time pad is an especially big issue because you think it's so secure. And in fact, it is that secure, but only if you use it the one time. Here's another, more visual way to think about this. You might be thinking, "I still think that's pretty secure at the end of the day. There must not be many patterns in that difference message, because I can't read it." So instead of text, let's look at it with images. I think images show patterns a lot better than text does. Say we've got this image on the left, which is just a collection of black and white pixels. We can think of this one-time pad picture here, outlined in purple, as just a bunch of random pixels. Some are white, some are black, and they've been randomly assigned over the exact same dimensions as the image. When we overlay them and keep only the pixels that are different, we get this image here on the right. Mathematically that's called an XOR, an exclusive OR: we compare each pair of corresponding pixels, and if they're the same, the result is one color, and if they're different, it's the other color. That's our ciphertext here on the right. Let's do the same thing with this picture of the smiley face. Again we have a grid of pixels, just black and white. That's our plaintext, essentially. Then we have the exact same one-time pad from the first image, with the pixels in exactly the same places. When we combine those, we get a new ciphertext. And on their own, both of those ciphertexts are very secure.
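The pixel overlay just described can be sketched with tiny black-and-white grids, using 0 and 1 for the two colors. The grid below is a made-up 4 by 4 example rather than either image from the slides, the pad is generated with the standard library's `secrets` module, and `xor_images` is a hypothetical helper name.

```python
import secrets


def xor_images(a: list[list[int]], b: list[list[int]]) -> list[list[int]]:
    """Compare corresponding pixels: 0 where they match, 1 where they differ."""
    return [
        [pa ^ pb for pa, pb in zip(row_a, row_b)] for row_a, row_b in zip(a, b)
    ]


# A hypothetical 4x4 black-and-white picture (1 = black, 0 = white).
image = [
    [0, 1, 1, 0],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [0, 1, 1, 0],
]

# The one-time pad: a random pixel for every position, same dimensions.
pad = [[secrets.randbelow(2) for _ in row] for row in image]

cipher_image = xor_images(image, pad)

# Overlaying the same pad a second time undoes the encryption.
recovered = xor_images(cipher_image, pad)
print(recovered == image)  # True
```

XOR plays the same role for bits that adding and subtracting modulo 26 plays for letters, and here encryption and decryption are the same operation.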
However, if we take those two ciphertext images and combine them by taking the difference again, all of that randomness drops out of the picture, and we're left with a difference image between the two original plaintexts, in this case the two original images. Visually, we can see that the patterns from the originals are now very clearly present in the difference. The same kind of patterns would emerge from two text plaintexts once we took the difference between them. It's just a little easier to see with pictures. The fact remains that if you reuse a one-time pad to create two different ciphertexts, the difference between them has a lot of hidden structure, a kind of fingerprint of the original messages, that you can use to recover the original plaintexts. That's it for the autokey and one-time pad ciphers. We're not going to dwell too much on them, but they are historically and mathematically important variations on the Vigenère cipher. In fact, the ideas behind autokeying, the one-time pad, and these running key ciphers still hold true today. Much of our modern data encryption, like that used in cell phones or even DVD copy protection, is built on these running key ciphers, or stream ciphers as they're called now. We'll come back to those later in the course, but for now we're going to focus our attention on the standard Vigenère cipher and figure out how we can actually crack those messages.