So that's the Vigenère cipher. It's simple: we're still using the generalized Caesar cipher, but we change the key for each letter. Note that the keyword in this example is Sirindhorn. Because the plaintext is normally longer than the keyword, you repeat the keyword as often as necessary. So if, in our example on the board, the keyword was S-I-I-T, you just repeat it: S-I-I-T, S-I-I-T. And you don't have to finish at T; you repeat the letters as often as necessary so that the key is as long as the plaintext. Why do we choose a short keyword? Why would I choose a keyword like Sirindhorn or SIIT? Well, keywords, remember, are a secret, and both parties need to know that secret. When I encrypt something and I want someone else to be able to decrypt it, they need to know the corresponding secret. So in these classical ciphers, at least, we use something that's easy to remember, easy to keep secret and easy to exchange with someone else. Normally that means a word. Having a random keyword is like having a random password. Who remembers random passwords? Not many people. As with passwords, you normally pick words which are easy to remember and not too long, but we'll find that that's not very secure. So the idea with the Vigenère cipher, and how it was used, is this: the plaintexts in our examples are short, several words, but in real life a message you want to send someone will be many words, maybe a large document. The plaintext is normally quite long, but for convenience we keep the keyword short so that we can remember it and exchange it with someone else. So we must repeat the keyword as often as necessary so that the key has as many letters as the plaintext.
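As a rough sketch of the keyword repetition just described, here is a small Python version of the Vigenère cipher. The function names, the plaintext HELLOWORLD and the restriction to the 26 uppercase letters are assumptions made for illustration; only the keyword SIIT comes from the board example.

```python
from itertools import cycle
import string

ALPHABET = string.ascii_uppercase  # A=0 ... Z=25


def repeat_key(keyword: str, length: int) -> str:
    """Repeat the keyword until it is exactly `length` letters long.

    The repetition may stop partway through the keyword.
    """
    return "".join(k for k, _ in zip(cycle(keyword.upper()), range(length)))


def vigenere(text: str, keyword: str, decrypt: bool = False) -> str:
    """Caesar-shift each letter by the key letter in the same position.

    Assumes `text` contains letters only (no spaces or punctuation).
    """
    sign = -1 if decrypt else 1
    key_stream = repeat_key(keyword, len(text))
    out = []
    for p, k in zip(text.upper(), key_stream):
        shift = sign * ALPHABET.index(k)
        out.append(ALPHABET[(ALPHABET.index(p) + shift) % 26])
    return "".join(out)


print(repeat_key("SIIT", 10))                      # SIITSIITSI
ciphertext = vigenere("HELLOWORLD", "SIIT")
print(ciphertext)                                  # ZMTEGEWKDL
print(vigenere(ciphertext, "SIIT", decrypt=True))  # HELLOWORLD
```

Note that the two L's in HELLO come out as T and E, because they line up with different key letters.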
And the benefit, compared to the monoalphabetic cipher, is that the output ciphertext letter changes for the same input plaintext letter. We see that in some cases here. Our plaintext is quite short, so imagine you have a plaintext which is a thousand words long. If you look at all the E's, you'd expect about 12% of the letters in the plaintext to be E, because we know that in English the most frequent letter is approximately, not always, E, at about 12%. It depends upon the source material, but it's around that, at least for a large plaintext. So in the input plaintext we expect about 12% E's, but in the output ciphertext the corresponding letter changes. E here maps to M, to L, to R, to V, and if we keep looking at the E's we're likely to get different ciphertext letters. So there's no longer a direct relationship between the frequency of letters in the ciphertext and the frequency of letters in the plaintext. That's what makes it more secure than the monoalphabetic cipher. With a monoalphabetic cipher, we can quite easily break it by looking at the frequencies of letters in the ciphertext. Not so easy in this case. But there is a problem, and it leads to a way to break this cipher. The problem comes from the keyword. If we have a large plaintext, a large document that we want to encrypt, and our keyword is Sirindhorn, we have to repeat it many times. And because we're repeating the keyword, there will still be cases where the same plaintext letter is encrypted with the same letter of the keyword. The longer the plaintext gets, the more likely it is that we'll have an E that is encrypted with R, giving V, and then some time later another E encrypted with R, giving V again.
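To make the point about E concrete, the following sketch lists every ciphertext letter that a single plaintext letter can turn into under a given keyword. The helper name `images_of` is made up for this example, and the spelling SIRINDHORN for the lecture's keyword is an assumption (it is consistent with E mapping to M, L, R and V).

```python
import string

ALPHABET = string.ascii_uppercase  # A=0 ... Z=25


def images_of(letter: str, keyword: str) -> set[str]:
    """All ciphertext letters one plaintext letter can become under a keyword."""
    p = ALPHABET.index(letter.upper())
    return {ALPHABET[(p + ALPHABET.index(k)) % 26] for k in keyword.upper()}


keyword = "SIRINDHORN"  # assumed spelling of the keyword used in the lecture
e_images = images_of("E", keyword)
print(sorted(e_images))  # ['H', 'L', 'M', 'R', 'S', 'V', 'W']
print(len(e_images), "possible ciphertext letters for E")
```

The set is never larger than the keyword length, which is why a short repeating keyword still leaves patterns in a long ciphertext.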
So we may still get some patterns in the ciphertext, but they are not as obvious as with a monoalphabetic cipher. The limitation of this cipher is that we use a short, fixed-length keyword, and that keyword has some structure: it's a word that we recognise. And in fact there are ways to break it, I think on the next slide. It's much harder than monoalphabetic, but it turns out that if you look at the structure in the output ciphertext, you can eventually learn the length of the keyword. Once you know the length of the keyword, you know where it repeats, and then you can apply frequency analysis of letters in a similar way, and it's possible to break the Vigenère cipher. It becomes the same effort as breaking several monoalphabetic ciphers. Harder than monoalphabetic, but possible. The weakness is the keyword. Two things: it repeats, and it's structured, a word that we recognise. It's not random. So the solution is to choose a keyword which is random and which is long, as long as the plaintext. That's how you make the Vigenère cipher strong, and unconditionally secure: choose a long keyword that's random. Okay, if you've got questions, that's okay. We have time. So we repeat the keyword in Vigenère, but to make it secure, do not repeat the keyword. And that's what the one-time pad does. There are different variants of it, but with our alphabet, we use a random key which is as long as the plaintext. If your plaintext is a thousand characters, your key is one thousand random characters. Not a word, not words combined together, but random characters. If you do that, you get an unconditionally secure cipher. There is no way to break it if you do that correctly, and that's the best cipher that we can get. Let's see an example of why. First, I know there's an error in this one.
Let's see if I can find what the error is. This is the one-time pad. It's the same as the Vigenère cipher. The only difference is that, to make it a little easier to read, we introduce a 27th character, a space. Before, we dealt with just the 26 letters; now a space is simply the 27th character. What this example shows is a ciphertext that an attacker has intercepted, and the attacker has then done a brute-force attack. They've tried all possible keys. Many of them produce random plaintexts, but here are two examples. They've tried this key, key one, and this other key, key two, and they get two different plaintext messages. Which is the correct plaintext? Now, there's an error in here. It was picked up last year by the students. Just make a note of it, and then we'll explain what happens. The error on the slide is in the first three letters of the key. MFU should be, let me check, PFT. You can change that on your copy: where this key says MFU, the first three characters should be PFT. That's an error, and in fact it's an error in the textbook. I copied from the textbook, and the textbook was wrong, and the students picked that up last year. So what's happened here is that the attacker has intercepted some ciphertext and tried a brute-force attack. There are many possible keys. We try them all. Let's say we have a very fast computer. We look at all the possible plaintexts. Many are random, that is, they make no sense in English, and we know the message is in English. But we start to see some plaintexts that do make sense. The first one we come across in English is this one: Mr Mustard with the candlestick in the hall. Okay, that makes sense. All right, I don't know if it makes sense, but it's English. It makes sense if you know the game, Cluedo. But then we keep trying keys, and we find another plaintext which also makes sense.
Now the attacker has this question: which plaintext is the true plaintext? The person who encrypted chose some plaintext and some key and obtained this ciphertext, but the attacker doesn't know which plaintext is the right one. In fact, if you try all the keys, you'll find more plaintexts that make sense. So the security lies in the fact that, even though we can try a brute-force attack, there's no way to know which is the true plaintext of this message and which is the true key. Of these two, for example, you don't know which one is right. If you've got two plaintexts which both make sense, you can only guess which one is correct. And if you guess, you could be completely wrong, so it's of no benefit to you, especially when there are more candidate plaintexts. That's where the unconditional security comes from. Even with a brute-force attack, there is no way to know what the correct plaintext is. How do we come up with these values? The same way as with Vigenère, except that we introduce a space just to make it easier to read. You don't need that space. So we took a plaintext and chose a random key. Note that the key is as long as the plaintext and random. There's no structure in this key. It's not a combination of words. The space is just another character. Remember that we map letters to numbers: 0 through to 25 is A through to Z, and then space is just the number 26. We can add as many characters as we like. You can add punctuation marks if you like; it just increases the size of the alphabet. What do we mean by a random key? Look at key one and key two. Do they make sense to you? These keys are random characters. For the one-time pad, they must be random. That's by definition of the one-time pad. So there's no relationship between the letters in there. It's not a word or a phrase. It's a random set of characters.
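The argument above can be sketched in Python. This is an illustration under assumptions, not the slide's example: the two messages are invented stand-ins of equal length, the function names are made up, and the 27-character alphabet follows the lecture's convention (a=0 to z=25, space=26). The sketch shows that for any sensible-looking candidate plaintext there is a key that turns the intercepted ciphertext into it.

```python
import secrets

ALPHABET = "abcdefghijklmnopqrstuvwxyz "  # a=0 ... z=25, space=26


def random_key(length: int) -> str:
    """A key of independent random characters, as long as the message."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def combine(text: str, key: str, sign: int) -> str:
    """Add (sign=+1) or subtract (sign=-1) key characters modulo 27."""
    if len(text) != len(key):
        raise ValueError("a one-time pad key must be as long as the text")
    n = len(ALPHABET)
    return "".join(
        ALPHABET[(ALPHABET.index(t) + sign * ALPHABET.index(k)) % n]
        for t, k in zip(text, key)
    )


def encrypt(plaintext: str, key: str) -> str:
    return combine(plaintext, key, +1)


def decrypt(ciphertext: str, key: str) -> str:
    return combine(ciphertext, key, -1)


real = "attack at dawn"    # hypothetical true message
decoy = "retreat at six"   # hypothetical alternative of the same length
key1 = random_key(len(real))
ciphertext = encrypt(real, key1)

# Subtracting the decoy from the ciphertext gives the key that "explains" it.
key2 = decrypt(ciphertext, decoy)

print(decrypt(ciphertext, key1))  # attack at dawn
print(decrypt(ciphertext, key2))  # retreat at six
```

Both keys look equally random, so an attacker who finds both candidate plaintexts has no basis for preferring one over the other.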
Effectively, if you take the Vigenère cipher and use a long random key, you get the one-time pad and you get unconditional security, which is the best security you can have. We wouldn't need to study anything else in the course, because we've got the best cipher. But the problem is that the key is inconvenient. Again, yes, the key length. There are two conditions on the key: the key is the same length as the plaintext, and the key is random. So say I've got a message to send: Miss Scarlet with the knife in the library. What I do is choose a key which is just random letters, the same number of letters as my plaintext message. I encrypt using the same approach as we did with Vigenère. We take the letter M and map it to a number, then the key letter. In fact, the key shouldn't be M, F, U; it should be P, F, T. So we take key letter P, map it to a number, and use the cipher C = (P + K) mod 27, in our case with a space. And as output I'd get the ciphertext letter A. So it's the same way to encrypt; it's just that the key is random and as long as the plaintext. And that's the problem with the one-time pad. It's secure but very impractical. Say I've got a one-megabyte Word document or text document I want to encrypt and send to someone. Then I need to choose a random sequence one megabyte in length, because it needs to be the same length as the plaintext. So my key is one megabyte and it's random. I cannot remember it. I have to store it somewhere, and I have to get it to the other person somehow, either written on a piece of paper or maybe on a USB disk. So effectively, to transfer that one-megabyte message securely, I also need to transfer another one megabyte for the key. It's very wasteful in terms of network communications: you double the amount of traffic you need to send. Think about it: say I need to upload, securely, one gigabyte of data per day.
Then, because the key must be random, we cannot repeat it. I cannot use the same key tomorrow, because that's not random. So today I generate a one-gigabyte key, I encrypt my data with a one-time pad, and it's secure. Tomorrow I need to generate another one-gigabyte key and somehow get that key to the recipient so they can decrypt. That becomes very inconvenient, especially for network-based communications. You need to transfer a lot of information to get the key to the other person, and you need to generate large random numbers. In fact, generating many large random numbers can be difficult for a computer. So it's very inconvenient. I thought I had a summary slide. Okay, here: two practical limitations. It's hard to produce a large amount of random key material, and it's hard to distribute large keys. That's why the one-time pad is only used in very specific, very special cases. It's seldom used in general data encryption. We have now covered substitution schemes. In all of those cases, we take one letter and replace it with another. With Playfair, we took a pair of letters and replaced it with another pair. And we do the same here: we take a letter and replace it with another letter. We substitute one character with a different character. Another operation we use in security is transposition: we rearrange our characters. Let's go through two quite simple examples of rearranging, and then we'll summarize this topic. Two examples of classical transposition ciphers. First, the rail fence. We take our plaintext and we have our key, and the key here is called the depth. We write our plaintext at some depth, where the depth is the key, the secret. We write in diagonals over n rows, or in our example over three rows, and the ciphertext is obtained by reading row by row. It's not very secure, but we'll see later that such rearrangements become useful.
Let's encrypt our plaintext on the board. We write the plaintext in three rows, with a depth of three. I, N, T: internet technologies and applications. So I'm just writing the plaintext in diagonals across three rows. Can I get there? I hope I didn't spell it wrong. In this case the key is the depth, the number of rows you write across, and to obtain the ciphertext you just read row by row. The first row is the first piece of the ciphertext, then the second row, then the third one: I, E, E, E, N, O, E, N, P, I, T, N, then N, R, T, C, O, G, S, D, P, C, I, S, and so on to the end. Just read it row by row. So we haven't replaced letters with other letters. We've just rearranged the letters. That's what we call the rail fence cipher. If we change the depth to four, we'd write I, N, T, E, then R, N, E, T, then T and so on, in four rows. To decrypt, you take the ciphertext, count the total number of letters, and divide by three in this case, because the depth is three. So how many letters do we have? Four, eight, twelve, ... we have thirty-five letters here. With a depth of three we need twelve letters on the first row, so we write the first twelve letters on one row, the next twelve letters on the next row, and the last eleven letters on the final row. Then simply read along the diagonals and you'll get the plaintext back. Very easy to decrypt, and very easy to break, because frequency analysis can be applied. The letters in the ciphertext are the same letters as in the plaintext, and therefore occur with the same frequency. In our plaintext there are one, two, three, four E's, and in the ciphertext there are four E's, and similarly with all the other letters.
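A small Python sketch of the rail fence as it is worked on the board: letter i goes to row i mod depth, and the ciphertext is the rows read one after another. This is the straight-diagonal variant described here, not the zigzag variant sometimes given under the same name; the function names are made up for the example.

```python
def rail_fence_encrypt(plaintext: str, depth: int) -> str:
    """Write letters down successive diagonals over `depth` rows, read row by row."""
    letters = [c.upper() for c in plaintext if c.isalpha()]
    rows = ["".join(letters[r::depth]) for r in range(depth)]
    return "".join(rows)


def rail_fence_decrypt(ciphertext: str, depth: int) -> str:
    """Split the ciphertext back into rows, then read along the diagonals."""
    n = len(ciphertext)
    # The first n % depth rows hold one more letter than the rest.
    lengths = [n // depth + (1 if r < n % depth else 0) for r in range(depth)]
    rows, start = [], 0
    for length in lengths:
        rows.append(ciphertext[start:start + length])
        start += length
    # Letter i of the plaintext sits in row i % depth at column i // depth.
    return "".join(rows[i % depth][i // depth] for i in range(n))


ciphertext = rail_fence_encrypt("internet technologies and applications", 3)
print(ciphertext)  # IEEENOENPITNNRTCOGSDPCISTNTHLIAALAO
print(rail_fence_decrypt(ciphertext, 3))
```

For the 35-letter example the row lengths come out as 12, 12 and 11, matching the count done on the board.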
So in a large plaintext we will see the same frequency of letters in the ciphertext as in the plaintext, which tells us the letters have only been rearranged, and using that information we can then work out what the depth may be. The depth is the key; it's secret. Once we find the depth, it's easy. You can write it as you wish, yes, but writing in diagonals shows that you're reading it in that order. Yeah, you can write it straight down if you like, but you're reading left to right; that's the idea. So it's just a rearrangement of letters. Next one. Another rearrangement is called rows-columns transposition. We take our plaintext and write it in rows, and we have a key. In this case the key is six integers, one, two, three, four, five, six in some order, and the key determines the order in which the columns are read. Let's see how that works. We have a six-digit key, which means we're going to have six columns to write our plaintext in. So let's write it, security, across in a matrix with six columns, because our key has six digits in it, and we keep going: security and cryptography. We need to fill out the last row to the same number of columns, so let's just add an X at the end, some special character to fill it out. So we simply write the plaintext in rows, where the number of columns is determined by the number of digits in the key, and then to get the ciphertext we read column by column, following the ordering of the key. Let's write the key at the top: three, one, five, six, two, four. The column labelled one is the first column we read, so the ciphertext starts E-Y-Y-A. Then we read the column the key labels two: R-D-O-Y. So we don't read the columns left to right; we rearrange them according to the key. The second column to read is R-D-O-Y, and then S-T-R-R, column three, and this continues along. I've just run out of space: I-C-G-X, C-A-P-P, U-N-T-H. It's just one long ciphertext; the wrapping is not important there.
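The board example can be reproduced with a short Python sketch. The function names and the representation of the key as a list of integers (one label per column, left to right) are choices made for this illustration; the key 3, 1, 5, 6, 2, 4, the plaintext and the padding letter X are the ones used above.

```python
def columnar_encrypt(plaintext: str, key: list[int], pad: str = "X") -> str:
    """Write the text in rows of len(key) columns, read columns in key order."""
    letters = [c.upper() for c in plaintext if c.isalpha()]
    cols = len(key)
    letters += [pad] * (-len(letters) % cols)  # fill out the last row
    out = []
    for label in range(1, cols + 1):
        col = key.index(label)           # which column carries this label
        out.extend(letters[col::cols])   # that column, top to bottom
    return "".join(out)


def columnar_decrypt(ciphertext: str, key: list[int]) -> str:
    """Refill the columns in key order, then read the grid row by row."""
    cols = len(key)
    rows = len(ciphertext) // cols
    grid = [[""] * cols for _ in range(rows)]
    pos = 0
    for label in range(1, cols + 1):
        col = key.index(label)
        for r in range(rows):
            grid[r][col] = ciphertext[pos]
            pos += 1
    return "".join("".join(row) for row in grid)


key = [3, 1, 5, 6, 2, 4]
ciphertext = columnar_encrypt("security and cryptography", key)
print(ciphertext)                          # EYYARDOYSTRRICGXCAPPUNTH
print(columnar_decrypt(ciphertext, key))   # SECURITYANDCRYPTOGRAPHYX
```

The decryption only works because every row is full, which is why the trailing X is kept in the recovered text.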
So now we have our ciphertext from our rows-columns transposition. Write your plaintext row by row, where the number of columns is determined by the key, and then read the columns in the order determined by the key. Again, the ciphertext has the same set of characters as the plaintext, exactly the same letters, just rearranged. So again, breaking it is easy, or relatively easy. You'll see in past exams I've given questions like: here's the ciphertext, there's no key, find the plaintext. With some knowledge of what words you expect to see in the plaintext, it's not too hard to try some different orderings of columns. You first need to guess how long the key is; if you know the length of the key, it's even easier. Then you write it out and try different orderings of those columns and see if you get words that make sense. You'll see that in an exam, maybe in a quiz. So both of these techniques are rearrangements of characters, with no substitutions, and hence quite easy to break. Any questions on either of them? What's the meaning of this rectangle? Nothing in the encryption. I just mean that if you want to decrypt and you know the depth is three, you break the ciphertext into three chunks, because the depth is three, and then you write the first chunk on the first row. That's what this rectangle is for, just for decryption. Any other questions on either of those two ciphers? Why do we need the X? Because we need to fill out the last row, and you can see this X ends up in the middle of the ciphertext. We need each row to be full so that when we decrypt we can rearrange the columns correctly and get the original plaintext back. So in this case we just fill it out with X. If I was missing two letters at the end, I'd put two X's at the end, as an example.
It could be another letter. In this case I used the letter X, and in Playfair I inserted an X between the L's. It can be any letter, but it needs to be known by both the person encrypting and the person decrypting, so it's normally specified as part of the algorithm. The algorithm as described here should say: if you have spaces at the end, fill them out with the letter X. Or, if you're using a different algorithm, fill out with the letter G. It's up to the users. As long as the encryptor and decryptor know which letter it is, it's okay. Some other questions? It depends on the question. You could probably decrypt this without a key. You would just have to try some different depths. So you'd need some common sense in trying, but you know it's not a depth of one. In an exam I have given a question like this: here's the ciphertext, maybe there were 24 characters, and I said in the question that there was an equal number of letters in each row, so no padding was needed. With 24 characters you know the depth must divide 24, so 2, 3, 4 or 6, for example. A depth of 5 would not work with 24 characters, because 24 is not a multiple of 5. So I give some extra information sometimes. In a question like this one, I've given the ciphertext and said the key is 6 digits. I don't say what the key is, only that it is 6 digits. And then what do you do to break that? Yeah, you count the letters. Let's try. Last 10 minutes, maybe. What have we got on the slides? No, I don't think we'll have time to try. We'll try next week to break one of these ciphers; it needs a bit of explanation. Let's finish this topic. So both ciphers just rearrange letters. Not very secure. But try this one. This is rows-columns again, an example with our plaintext and a key, and we obtain some ciphertext. Then we take that ciphertext and encrypt again using the same key. So we do it twice.
For example, on this one we used the key to encrypt our plaintext and we got this as ciphertext. What we would do is encrypt this again with the same key, and we get some different ciphertext. That's what we do in this case. What these numbers represent is the positions of the letters. The letter A is in the first position in the plaintext. Letter T is in the second position. Letter Z, at the end, is in the twenty-eighth position. That's the ordering of the plaintext letters. We just number them, one through to twenty-eight. After we encrypt once, we have rearranged those letters. And the rearrangement is that the first letter, letter A, moves to... How do we read this? Where's our letter A? The first letter moves to this position. The third letter in our plaintext, this T, moves to the first position in the ciphertext. This is just describing what happens. It's not describing the algorithm. It's the result of the algorithm: we rearrange the letters, and if we look at the ordering of the letters, the rearrangement looks like this. The third letter becomes the first, the tenth letter becomes the second, and so on. Just look at these numbers. Do you see a pattern? Who sees a pattern in these numbers? Do you see some sequence? Plus seven. Three plus seven is ten, plus seven is seventeen, plus seven is twenty-four. Four plus seven is eleven, plus seven, plus seven. Okay? There's some pattern here. Two plus seven. The key length is seven digits; we have seven columns here. So there's some relationship to this ordering. The point is that we can see some structure or pattern in here. And if you can see some structure or pattern in ciphertext, there's a potential for an attacker to use that information to discover the key and the plaintext, to break it. Now we take this ciphertext and apply the same cipher using the same key, and we get this ciphertext. Effectively, we reorder more.
And the third character now ends up somewhere here. The first character, which moved to here, now ends up here, and so on. Look for the pattern in the sequence of numbers at the bottom, on your slides or on the screen. It was reasonably obvious for the second set; people picked it up in about a minute. Now try the one at the bottom. Minus eight, minus four. Anyone? At least between the numbers, it's not so obvious. All right: minus eight, minus four, plus twenty-two, minus three. So it's not so obvious in the differences between numbers. I cannot see a pattern there. It's much harder to find some relationship in the ordering of those characters, and that makes it harder to break that cipher. The idea with encryption is to take some structured plaintext and produce some ciphertext which has no structure, which is effectively random. So if there's no structure or pattern in the output, that's good. And what we see here is an important principle of real ciphers. We've taken our simple cipher, our simple rows-columns, encrypted once, and we still see some pattern: not very secure. Encrypt again, and the pattern is not so obvious: it's more secure. That's the important principle: apply simple operations multiple times, and that adds security to our cipher. Many ciphers in use today are built by combining simple operations. This one is a combination of two transpositions. It gets even better if you combine transpositions with substitutions. Encrypt using a Playfair cipher, then apply rows-columns on the output, then encrypt with the Playfair cipher again, and then rows-columns again, and the final ciphertext will be much, much stronger than if you did just one of them. This concept of combining transpositions and substitutions is what's used in real ciphers today, and it's one of the main points that we take from these classical ciphers. If you find the pattern in the second sequence, then write down an equation that describes it.
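The position sequences read out above can be regenerated with a few lines of Python. The lecture only says the key has seven digits; the key 4, 3, 1, 2, 5, 6, 7 below is an assumption, chosen because it reproduces the orderings read out in class (3, 10, 17, 24, ... after one pass and 17, 9, 5, 27, ... after two). The helper names are made up for the example.

```python
def transpose(items: list, key: list[int]) -> list:
    """One rows-columns pass: fill rows of len(key), read columns in key order."""
    cols = len(key)
    if len(items) % cols:
        raise ValueError("length must be a multiple of the key length")
    out = []
    for label in range(1, cols + 1):
        out.extend(items[key.index(label)::cols])
    return out


def steps(seq: list[int]) -> list[int]:
    """Differences between neighbouring entries."""
    return [b - a for a, b in zip(seq, seq[1:])]


key = [4, 3, 1, 2, 5, 6, 7]        # assumed seven-digit key
positions = list(range(1, 29))     # plaintext positions 1..28

once = transpose(positions, key)
twice = transpose(once, key)

print(once[:8], steps(once)[:3])    # [3, 10, 17, 24, 4, 11, 18, 25] [7, 7, 7]
print(twice[:5], steps(twice)[:4])  # [17, 9, 5, 27, 24] [-8, -4, 22, -3]
```

After one pass the steps within each column are all plus seven; after the second pass the steps no longer show an obvious regularity, which is the point being made about repeating simple operations.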
If you come up with an equation, I'll post it on the email list. That effectively finishes our topic. The last two items, which we'll cover next week, are just some other examples. In fact, we won't go through rotor machines. We'll just do a brief example on steganography. And then we'll move on to real ciphers, and we'll see these same concepts repeated: transpositions, substitutions, and combinations of the two. Let's continue that next week.