We're talking about the basics of ciphers. Ciphers are the algorithms for encrypting and decrypting our information. And we're talking about classical ciphers, ones which are very old but demonstrate the basic concepts. Last week we gave an example of the Caesar cipher. If our input can be any lowercase English letter, one of 26 values, then with the Caesar cipher we get the output ciphertext by shifting each letter K positions to the right, wrapping around so that when we get past Z we come back to A. So that's the general concept. We can express that mathematically, but I think most people understand that. And of course, to decrypt, we do the opposite. With decryption, we must get the original plaintext back. It's no good if I start with plaintext, encrypt, and then someone decrypts with the right key but doesn't get the plaintext back. That's unsuccessful. So to decrypt with the Caesar cipher, we shift back to the left. So we had an example of that. There are many other classical ciphers. I'm just selecting two here to demonstrate some concepts. The Caesar cipher was a substitution cipher. That shift effectively means we substitute one letter with another letter from the alphabet, or from the character set. So the character set is our 26 letters. What did we encrypt? We encrypted hello Steve as our plaintext last week. We had a key of D, or three. And the first letters of the ciphertext were K-H-O-O-R and so on. So the first letter of plaintext was H, the first letter of ciphertext was K. K is not in the plaintext. Hello Steve does not have the letter K. What we did is substitute the letter H in the plaintext with the letter K in the ciphertext. That's a substitution. We take one letter of the plaintext and replace it with a letter from the possible character set. In general, we don't have to replace just one letter with one other. We can replace a set of letters with a set of other letters, and it can be more complex.
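The Caesar shift described above can be sketched in a few lines of Python. This is only an illustration, assuming lowercase letters with no spaces; the function names are made up for this example.

```python
import string

ALPHABET = string.ascii_lowercase  # the 26-letter character set

def caesar_encrypt(plaintext: str, key: int) -> str:
    """Shift each letter `key` positions to the right, wrapping from z back to a."""
    return "".join(ALPHABET[(ALPHABET.index(c) + key) % 26] for c in plaintext)

def caesar_decrypt(ciphertext: str, key: int) -> str:
    """Decryption is the opposite: shift back to the left by the same key."""
    return caesar_encrypt(ciphertext, -key)

ciphertext = caesar_encrypt("hellosteve", 3)   # key D, or three
print(ciphertext)                              # khoorvwhyh
print(caesar_decrypt(ciphertext, 3))           # hellosteve
```

The `% 26` is what handles the wrap-around from Z back to A, and it also makes the negative shift in decryption come out right.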
So the cipher tells us how we replace, how we substitute one for another. Another operation is transposition: we rearrange the characters. Rail fence is a simple example of a transposition cipher. You take your plaintext letters and you write them in rows. It doesn't have to be in diagonals, but you write the first letter on the first row, the second letter of the plaintext on the second row, the third letter on the third row, the k-th letter of the plaintext on the k-th row, and then you go back to the first row. Then you get the ciphertext by reading row by row. Let's show a quick example to illustrate. First, just to remind us what we had with the Caesar cipher: we had our plaintext from last week and we took our key to be the letter D, or the number three. Remember, we're just mapping letters to numbers. And we got our ciphertext with the Caesar cipher. Using Caesar, we got something like K-H-O and so on, ending H-Y-H. You should have this already. And now let's try a different one, rail fence. It's a similar plaintext, but a little bit longer, which will help with the example. Let's add another word at the end. That's our plaintext. We're going to use the rail fence cipher, and let's take a key of three, for simplicity. Here our key is a number. It indicates the number of rows we're going to write our plaintext in. So to get the ciphertext, we write the first letter of the plaintext on the first row, the second letter on the second row, the third letter on the third row, and then go back to the first row. We can visualize it like this. We just spell it out: H-E-L, so three rows, and come back to the first row. L-O-S-T-E-V-E-S-E. Someone tell me when I make a mistake. C-U-R-I-T-Y. So we do that. And to get the ciphertext, we read one row at a time. So the first row is H-L-T-E-C-I. Those are the first six letters of the ciphertext. So I've just written the plaintext in rows.
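Here is a rough Python sketch of the row-by-row rail fence described above (letter i goes on row i mod k, not the zigzag variant you may see elsewhere). The function names are just for illustration, and the plaintext is the one spelled out above with the spaces removed.

```python
def rail_fence_encrypt(plaintext: str, rows: int) -> str:
    # Letter i is written on row i % rows; the ciphertext is the rows read in order.
    return "".join(plaintext[r::rows] for r in range(rows))

def rail_fence_decrypt(ciphertext: str, rows: int) -> str:
    n = len(ciphertext)
    plaintext = [""] * n
    pos = 0
    for r in range(rows):
        row_len = len(range(r, n, rows))      # how many letters landed on row r
        plaintext[r::rows] = ciphertext[pos:pos + row_len]
        pos += row_len
    return "".join(plaintext)

ciphertext = rail_fence_encrypt("hellostevesecurity", 3)
print(ciphertext)                         # hltecieoesutlsvery
print(rail_fence_decrypt(ciphertext, 3))  # hellostevesecurity
```

The first six ciphertext letters, hlteci, are the first row, which matches what we read off by hand.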
And now we read row one, row two, row three to give us our ciphertext. And what's this? It's our rail fence cipher, where the key is the number of rows. If we had a key of four, we'd write it in four rows. And if we... yeah, we'll leave it there. This is a transposition cipher. There's nothing complex about the cipher. It's just showing a different operation: we've taken the plaintext and followed some algorithm that rearranges it. If you look at the ciphertext, the same letters appear as in the plaintext. We haven't substituted. We haven't replaced. We've just rearranged the letters. So there were, what, one H and two L's in the plaintext. There's one H and two L's in the ciphertext. There's no difference in the set of letters. Whereas with our Caesar cipher, we have a different set of letters in the ciphertext. So this is demonstrating our two basic operations, substitution and transposition. Very simple. Many real ciphers use these operations, but we'll see they are more complex in that they repeat the operations. On its own, rail fence is easy to break. What do you need to do to break it, to find the plaintext given the ciphertext? If you don't know the key, what do you do? You could brute force. Okay, we can always brute force. Brute force is trying all keys. Well, what keys have we got? One row? If we just wrote it in one row and read it off in one row, our ciphertext would equal the plaintext. Two rows, three rows, four rows, any number of rows we could have written it in. But of course, the number of rows cannot be greater than the number of characters. So try all numbers of rows. But again, when we have a longer plaintext, instead of trying all rows, all keys, we try to be a little bit more intelligent and determine what the likely key is.
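As a sketch of that brute force, assuming the same row-by-row rail fence as in the example, we can simply decrypt with every possible number of rows and look through the candidates. The helper names here are hypothetical.

```python
def rail_fence_decrypt(ciphertext: str, rows: int) -> str:
    # Undo the row-by-row rail fence: put each row's letters back at i % rows.
    n = len(ciphertext)
    plaintext = [""] * n
    pos = 0
    for r in range(rows):
        row_len = len(range(r, n, rows))
        plaintext[r::rows] = ciphertext[pos:pos + row_len]
        pos += row_len
    return "".join(plaintext)

def brute_force_rail_fence(ciphertext: str) -> dict:
    # The key (number of rows) can be anything from 1 up to the message length.
    return {rows: rail_fence_decrypt(ciphertext, rows)
            for rows in range(1, len(ciphertext) + 1)}

candidates = brute_force_rail_fence("hltecieoesutlsvery")
for rows, guess in candidates.items():
    print(rows, guess)
```

With 18 letters there are only 18 keys to try, and the candidate for three rows is the one we recognize as the plaintext. The code does not pick the right one for us; it assumes, as we do in the lecture, that the attacker can recognize the correct plaintext.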
That's cryptanalysis: using some knowledge of the algorithm and some knowledge of the structure in the plaintext and ciphertext to try and estimate what the key is without trying all possible keys. Similar to the Caesar cipher, and we'll see this with most of the classical ciphers, the ciphertext looks random. But if you have a very long plaintext message, and most plaintext messages are not a simple short three words but maybe quite long, then if there's some structure in the original plaintext, with a transposition cipher we just rearrange that plaintext, so there's still some structure left in the output ciphertext. And the structure here is the number of letters, the frequency of particular letters. In any language, some letters occur more frequently than others. In English, the letter E occurs more frequently than any other letter. I'll give you some statistics shortly, but in general, if we look at a very long text, E occurs most frequently. So if E is the most frequent letter in the plaintext, with a transposition cipher E will also be the most frequent letter in the ciphertext. We don't hide that structure when we do a transposition, so on its own it doesn't add any security in terms of hiding that structure. We'll see with different substitution ciphers that we can start to hide the structure. So what good is a transposition, just rearranging letters? We'll see that by combining substitutions and transpositions, that is, do a substitution and then a transposition and then repeat with another substitution and another transposition, we can get a much more complex cipher. That is, it's much harder to see the structure of the plaintext in the ciphertext. So let's try and move on to that. So far we've just shown two examples of the main operations used in many ciphers, many block ciphers: substitution and transposition. Let's look at another variation of the Caesar cipher.
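To make the point about structure concrete, here is a small Python check on our short example plaintext. It is only a sketch: the real effect shows up on long texts, and the variable names are made up for this example.

```python
from collections import Counter

plaintext = "hellostevesecurity"

# Transposition: row-by-row rail fence with 3 rows (letters are only rearranged).
transposed = "".join(plaintext[r::3] for r in range(3))

# Substitution: Caesar shift by 3 (each letter is replaced by another).
shifted = "".join(chr((ord(c) - ord("a") + 3) % 26 + ord("a")) for c in plaintext)

print(Counter(plaintext).most_common(1))    # [('e', 4)]
print(Counter(transposed).most_common(1))   # [('e', 4)]
print(Counter(shifted).most_common(1))      # [('h', 4)]
print(Counter(transposed) == Counter(plaintext))  # True
```

The transposition leaves the letter counts exactly as they were. The Caesar cipher changes which letter is on top, but the shape of the counts is the same, which is why it is still easy to attack.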
With the Caesar cipher, we use one key and each letter is shifted by the value of that key. So each letter is shifted by three positions. It's very easy to break the Caesar cipher. A simple extension of the Caesar cipher is to use a key which is not just one value but multiple values, for example a word, where each value of the key determines how much the corresponding plaintext letter shifts by. I'll demonstrate that, and then we'll talk about this cipher. Let me get a nice example. Let's stick with our original plaintext, hello Steve security, and now our key is not a single letter but a word. So we think of a word. Anyone? Choose a key? Not very imaginative, are we today? A key, a word, someone choose a word. Love, okay. So same as with the Caesar cipher, each letter in the key corresponds to a number. I'll show you the mapping of letters to numbers in a moment so you can remember. And what we do to get the ciphertext is apply the Caesar cipher letter by letter. We take the plaintext H and the key L and get our ciphertext letter using the Caesar cipher. For the next plaintext letter, E, we use the key O to get the ciphertext. Then L with V, L with E, and now we repeat the key, or the keyword in this case. We repeat it enough times that we have the same number of characters in the key as in the plaintext. What have we got in the plaintext? There are, what, about 20 letters in our plaintext. Our key is four letters, so we need to repeat it five times. I haven't lined it up very well; you'll do it better. And to get the ciphertext, we take H and L and use the Caesar cipher on them. Now let's bring up our mapping of letters to numbers. Let's find it first. It's just a different cipher. We'll see how useful it is in a moment. So just remember, for example, we've got H and L, so H is 7 and L is 11. I'll write the numbers for the first case.
H is the number 7 and L is 11, which means we take the letter H and shift it 11 positions to the right to encrypt with the Caesar cipher. Starting at H and shifting 11 positions to the right brings us to 18, or S. So the output ciphertext will be S for the first letter: 7 plus 11 is 18, or S. For the second letter we have E and O. E is 4 and O is 14, so we take the letter E and shift it 14 positions to the right, which brings us to where? 4 plus 14 is 18 again. S, just by chance, okay? So this shift to the right is just adding the key, if we think about the numbers. E is 4, the key is 14, so the ciphertext is 4 plus 14, 18, or S. What's the next letter in the ciphertext, where L and V are the inputs? Work out the next letter in the ciphertext. The plaintext letter is L, the key letter is V. What is the ciphertext letter? If you're counting out the shift across all the positions, yeah, you'll get there. Anyone have the letter? Rather than shifting to the right, which you can do, it's sometimes easier to think of this cipher mathematically. What we really do is add the plaintext number to the key number. But to wrap around when we get to the end, the mathematical operation is mod 26. Because we have 26 characters, once we get to the value 26 we bring it back to zero; 26 actually corresponds to zero. So what do we have? L is 11 and V is 21. We take L and shift 21 positions to the right, which brings us to position 11 plus 21, 32. But we only go up to 25, so 32 mod 26 is 6. We wrap around to G, letter 6 in that case. So the output ciphertext would be G, which is really just 11 plus 21 mod 26: 32 mod 26 is 6, or the letter G. And we keep going. Again, it's just a Caesar cipher, but now the key letter changes for each character of the plaintext. L and E, the fourth letter: we'll do that one and then we'll stop. L is the plaintext letter. E is the key letter. L is 11. E is 4. 11 plus 4 is 15. The ciphertext letter is 15, or P. And I'm not going to do the rest.
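The add-then-mod-26 arithmetic above can be written as a short Python sketch. It assumes lowercase letters only, with a = 0 through z = 25, and the function names are just for this example.

```python
def vigenere_encrypt(plaintext: str, keyword: str) -> str:
    out = []
    for i, p in enumerate(plaintext):
        k = keyword[i % len(keyword)]          # repeat the keyword along the plaintext
        p_num = ord(p) - ord("a")              # h -> 7, e -> 4, l -> 11, ...
        k_num = ord(k) - ord("a")              # l -> 11, o -> 14, v -> 21, e -> 4
        out.append(chr((p_num + k_num) % 26 + ord("a")))
    return "".join(out)

def vigenere_decrypt(ciphertext: str, keyword: str) -> str:
    out = []
    for i, c in enumerate(ciphertext):
        k_num = ord(keyword[i % len(keyword)]) - ord("a")
        out.append(chr((ord(c) - ord("a") - k_num) % 26 + ord("a")))
    return "".join(out)

print(vigenere_encrypt("hell", "love"))                 # ssgp
print(vigenere_encrypt("hellostevesecurity", "love"))   # ssgpzgoigsninimmem
```

The first four letters come out as S, S, G, P, the same values we worked out by hand: 18, 18, 32 mod 26 = 6, and 15.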
The point isn't the details of this algorithm. We'll see the point as we analyze and compare. So we apply the Caesar cipher, but change the key. Now, what the user needs to do is choose a key. This is the keyword. Love is the keyword. And the algorithm says we just repeat the keyword to generate the key, such that the key is as long as the plaintext. Now, let's compare using the plain Caesar cipher and this modified one. Just look at the first word, hello. In fact, the first four letters. With our normal Caesar cipher, look at the first four letters, H-E-L-L. We have two L's in the plaintext; look at the ciphertext. The two L's map to two O's. So with the normal Caesar cipher, each plaintext letter always maps to the same ciphertext letter. In this case, whenever we have an E, we get an H as an output. Whenever we have an L, we get an O as an output. That's the Caesar cipher. The problem with that is that an attacker can look at the ciphertext and work from the expected statistics of letter frequencies: normally the most common letter in plaintext is E, and here the most common letter in the ciphertext is H, so let's guess that H corresponds to E. And that makes it very easy for the attacker to guess, or not to guess, to work out the plaintext without trying all keys. Whereas if we look at this new modified Caesar cipher, when we have repetitions in the plaintext, and the simple example here is the two L's, look at the ciphertext: we get different letters as output, G and P. So even though we have repetitions in the plaintext, we don't necessarily get repetitions in the ciphertext. And that increases the security of the cipher. It makes it harder for the attacker to look at the ciphertext and determine the plaintext. This cipher is called the Vigenère cipher. Maybe it's on one of the slides. But it's just a modification of the Caesar cipher.
It's almost perfect, in that it's almost the case that it's impossible for the attacker to find the plaintext given the ciphertext. There's one limitation though. If we try and line up our letters, and I didn't do it very well, but every four letters we're going to repeat the... Did I do that right? Yes? Ah, okay. Every four letters we repeat the key. Now, with the two L's, since we have two different key letters, we'll get output which is different. Okay, that's useful. But look at this case. We have an E and an E. And then here another E and E. That output will be the same, because we take the same input plaintext letter and, using the same key letter, we get the same ciphertext letter. And the result is that there are still some patterns, some structure, in the ciphertext. Let's imagine this is very long now. Again, E is the most frequent letter in the plaintext. Then every time there's an E encrypted with the same letter in the key, and we have two occurrences here, maybe there are others later, we'll get the same ciphertext letter as output. And the attacker can use that knowledge to try and guess: okay, this output letter occurs the most frequently, so most likely that corresponds to E. And they use statistics of the plaintext to try and work out the length of the keyword. Once they guess the length of the keyword, they can start to try and guess what the keyword is. So it turns out it's quite easy to break this cipher. But it's better than the Caesar cipher because in many cases we produce more variation in the ciphertext even when we have the same input plaintext. So the problem with this cipher is that we use a keyword which is shorter than the plaintext, and we have to repeat the keyword. As a result, we may get repetitions in the ciphertext. How to make this cipher better? Any suggestions? Use a sentence as the keyword. So don't just use a word.
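Going back to the repetition problem for a moment, here is a small Python sketch that finds where the same plaintext letter lines up with the same key letter in our example. The plaintext and keyword are the ones from the lecture; the helper and variable names are made up for illustration.

```python
from collections import defaultdict

def vigenere_encrypt(plaintext: str, keyword: str) -> str:
    # Add plaintext number and key number mod 26, repeating the keyword.
    return "".join(
        chr((ord(p) - 97 + ord(keyword[i % len(keyword)]) - 97) % 26 + 97)
        for i, p in enumerate(plaintext)
    )

plaintext, keyword = "hellostevesecurity", "love"
ciphertext = vigenere_encrypt(plaintext, keyword)

# Group positions by (plaintext letter, key letter, ciphertext letter).
positions = defaultdict(list)
for i, (p, c) in enumerate(zip(plaintext, ciphertext)):
    positions[(p, keyword[i % len(keyword)], c)].append(i)

repeats = {k: v for k, v in positions.items() if len(v) > 1}
print(repeats)  # {('e', 'o', 's'): [1, 9], ('e', 'e', 'i'): [7, 11]}
```

These are the two pairs of E's pointed out above: each pair sits under the same key letter, a multiple of four positions apart, and so produces the same ciphertext letter.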
Use a longer sentence, for example. Imagine our plaintext is a one megabyte file, a lot of text, a large document. What do you do for your key? A long, long sentence? But if it's a sentence of, say, 10 words, it will still have repetition. So if your plaintext is a large file, millions of characters, and you use a sentence for your keyword, you'll still have to repeat the keyword. And you'll still get potential repetitions in the ciphertext. So what you do is choose a keyword which is as long as the plaintext. If I've got a one megabyte file to encrypt, I choose a key which is one megabyte in length. And even better, you make it random. You don't choose a word, you choose a random set of characters. If you do that, if you choose a key which is the same length as the plaintext and completely random, then the ciphertext will be unbreakable. It's the perfect cipher. Unbreakable in that there's no information that an attacker can use to get the plaintext without the key. And unbreakable in that, in theory, you can't do a brute force attack on it. So the perfect cipher is simply, again, the Caesar cipher, but with a key that is as long as the plaintext and random. It's called the one-time pad. Perfect cipher, unbreakable security, but very impractical. The main reason it's not used in practice is this: I've got a one megabyte plaintext. I need to choose a one megabyte key, and I must deliver that one megabyte key to you beforehand so you can decrypt that one megabyte file. Or in other words, say I've got a five gigabyte DVD I want to encrypt. The key itself must be five gigabytes as well. It's very inefficient to distribute that key to people. If the key was a word, I could write it down on a piece of paper and give it to you, and then you can decrypt. But if the key is a five gigabyte random sequence, how do I give it to you?
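A minimal Python sketch of the one-time pad over our 26 letters looks like this. It assumes lowercase letters only and uses Python's `secrets` module as the source of randomness; the function names are hypothetical, and a real pad must never be reused for a second message.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase

def generate_pad(length: int) -> str:
    # A random key with one letter for every letter of the plaintext.
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def otp_encrypt(plaintext: str, pad: str) -> str:
    if len(pad) != len(plaintext):
        raise ValueError("the pad must be exactly as long as the plaintext")
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]
        for p, k in zip(plaintext, pad)
    )

def otp_decrypt(ciphertext: str, pad: str) -> str:
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % 26]
        for c, k in zip(ciphertext, pad)
    )

plaintext = "hellostevesecurity"
pad = generate_pad(len(plaintext))
ciphertext = otp_encrypt(plaintext, pad)
print(pad, ciphertext, otp_decrypt(ciphertext, pad))
```

Notice the pad is as long as the message, so the key distribution problem is visible even in this toy: to send 18 letters securely I first have to get 18 random letters to you securely.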
So the problem with this perfect cipher is that the key is too long. And the key must be random. Generating random values all the time is hard. We'll come to random numbers later, but generating continuously different random numbers is not easy. So we've gone a little bit off the lecture notes and what we were intending to cover. We can take the Caesar cipher, the simplest cipher, modify it a little bit to use a long random key, and we get a perfect cipher. But the trade-off that always occurs in security is that perfect security is impractical: inconvenient for performance and for usage. Hence it's not used very often. It's only used for very short texts or very secure applications. Let's give some more examples, but instead of doing them on here, I've got some software that will calculate for us, to save us some time. Let's see, what have I got? I've got a book that I've downloaded in plain text. I don't know what it was, some free book, with many characters in it. Let's count the letters. I've got some software called crypto which does different classical operations. One of them is that it counts the letters of the book. It takes some time. It counts all the letters and shows me the count of the letters here.
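The software used in the lecture is not shown, but counting letters is easy to sketch in Python. The function name and the short sample text below are only stand-ins; for a real book you would read the whole file into a string first.

```python
import string
from collections import Counter

def letter_frequencies(text: str):
    """Return (letter, count, percent) tuples, most frequent letter first."""
    letters = [c for c in text.lower() if c in string.ascii_lowercase]
    counts = Counter(letters)
    total = len(letters)
    return [(letter, n, 100 * n / total) for letter, n in counts.most_common()]

sample = "Hello Steve security"   # stand-in for the contents of a downloaded book
for letter, n, percent in letter_frequencies(sample):
    print(f"{letter} {n:3d} {percent:5.1f}%")
```

Spaces, punctuation and upper or lower case are ignored, so only the 26 letters are counted. Sorting by count is what lets us read off the most frequent letter at the top.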
Okay, so the letter A occurred 36,000 times in that book, E 4000 times, Q 437 times. We can do that on any source, so this is just one example. Let's be a bit more convenient and sort it based on percentages, not on the absolute values. As a percentage of all the letters in this book, E occurs 12% of the time. E is the most frequent letter in this book. T occurs 9% of the time, the next most frequent letter. So we see the most frequent letters in this English book. I don't show the lowest letters, but Q and X will be the bottom two, I think; usually they're very infrequent. This is typical of most English texts. You can do the analysis on other books or other sources, or you can collect some books and do the analysis of all of them, and you'll see E is usually the most frequent letter. If you do it in Thai, with a different character set, you'll see some character is the most frequent. Some are more frequent than others in any language. We can do it not just on letters but on pairs of letters, called digrams: if we look at each pair of letters, we see which pair occurs most frequently in this example. And then on triples of letters, or trigrams: THE occurs the most frequently in this book. In other sources it may vary a little bit, but there will usually be some common trigrams towards the top and some down the bottom which are very infrequent or don't occur at all. The attacker uses this knowledge to break the Caesar cipher, the Vigenère cipher and other classical ciphers. What they do is look at what occurs most frequently in the ciphertext and then start to try to map it back. Okay, if this is the most frequent letter in the ciphertext, then most likely it corresponds to E in the plaintext, and they do the mapping backwards to work out what shift turned E into that letter. This is called frequency analysis, and it can be done on most plaintext, not just on English but on different languages, and similar
approaches are used on images and on other files which have some structure. The attacker uses that structure in the plaintext to try and work out what the key and what the plaintext was. When we encrypt, we'd like ciphertext that doesn't have any structure, and by not having structure we mean it looks completely random. For example, if we encrypt this book, we'd like ciphertext where the frequency of each letter is about the same, that is, the letter E occurs with the same frequency as the letter Z and the letter Q and the letter T. Because if they all occur about equally often, in different positions, then there's no structure in the ciphertext. So that's our goal. Just to finish this, I'll give one more example which you can go and read about yourself. I will not go through it here. It's on this website, and it's linked to from our course website. It takes a Caesar cipher and we do a brute force attack on it. That's easy: try all 26 keys, and from the 26 possible plaintexts we find 17, meeting lobby at 10, as the correct one. So we can do a brute force attack, but more complex analysis can be done by looking at the letter frequency, the frequency of E's. This is just worth reading in your own time; it's too hard to go through in the lecture. The example takes some longer ciphertext, using a cipher with 26 factorial possible keys. A brute force is not practical, but we as the attacker look at the frequency of letters in the ciphertext and we see T is the most frequent in the ciphertext. Then we start to make guesses. Okay, if T is the most frequent in the ciphertext, most likely that corresponds to E in the plaintext, so T maps back to E. We can use similar analysis and go through some steps. Again, it's hard to see on the screen, but you go through the steps and you start to work out letters in the ciphertext mapping back to the plaintext, and then you start to
recognize words like "this" and "internet". It takes 10 or 20 minutes by hand, and you know the plaintext. So using this frequency of letters, it's very easy to do an attack on these classical ciphers. With some computer support it's almost instantaneous; you can write a program to do the attack for you. The exception is the one-time pad, which is perfect in that you cannot do such an attack. So, I'm going off track. Have a look at that one in your own time. I want to come back to our lecture notes. No, one more example before the lecture notes, sorry. I've done a similar attack where I took a plaintext and counted the letters in it. There were exactly 552 letters in my example plaintext to start with. Then I encrypted it. When I analyzed the plaintext and looked at the most frequent letters through to the least frequent, for the most frequent letter, the first one, there were something like 74 occurrences, and for the least frequent letters there were zero occurrences. This plot is showing the letters from the most frequent through to the least frequent. It's not saying what letters they were; that's not the point. It's showing, with the blue line for the plaintext, that in our original plaintext some letters were more frequent than others. Okay, and then I encrypted that plaintext using the Vigenère cipher, the extension of the Caesar cipher that we just covered. I encrypted it, and then I encrypted it again using the rail fence transposition cipher, the one where we write in rows. So it's Vigenère and then rail fence. I looked at the output, and that's after round one, the red one. What happened is that there were still some letters which occur much more frequently than others. There are about 40 instances of the most frequent letter, and all letters occurred in the output, the least frequent with 5 or 10 occurrences. But then I encrypted again using the same ciphers, the Vigenère and the rail fence. I applied it
again on the output of the first set of encryptions, the first round. So after encrypting again, round two, the most frequent letter occurred about 30 times and the least frequent about 10 or 11 times. And then I encrypted again, and after round three, the green one, the most frequent letter is here and the least frequent here. What's happening? We see that after repeating the encryptions, so we take some plaintext and encrypt it again and again, the frequency of letters in the resulting ciphertext becomes more spread out. That is, there are not so many letters which are much more frequent than others. We see a flattening of this curve. From a security perspective the ideal case would be to keep going. The ideal output would be that every letter occurs an equal number of times. That's this horizontal line here, at about 21 times: 552 letters, 26 letters in total, and 552 divided by 26 is about 21. The ideal case for security would be this solid line. What we see is that if we start with a plaintext, it's nowhere near that; there are some much more frequent letters. We encrypt, and we encrypt again and again and again, and we get closer and closer to this ideal case. This is demonstrating the concept which is used by many real ciphers today. They take simple operations, substitution and transposition, and they apply them on the plaintext, and then they apply them again in a second round, and again and again and again, so much that the resulting ciphertext looks completely random and does not have any of the statistics that the input plaintext has. That is, it approaches this horizontal line. A completely random ciphertext would have the same number of A's as Z's, because if it's random then one should not occur more frequently than another. That is this horizontal line. I haven't shown you the steps for doing this. I'm just trying to explain the concept that is used in ciphers now. And there are two main points. Most or many block ciphers use substitutions and
transpositions. Substitutions replace one character with some other character. Transpositions rearrange characters in the input. And they repeat those operations. So they take very simple operations and repeat them, and we can get very secure ciphers, such that in practice only brute force attacks will break them. And if we go back to our lecture notes now: in yours I made a mistake. I had the Caesar cipher here, but I've changed that to the Vigenère cipher. We're not going to do it, but that's what my plot was from. I took some plaintext and I applied the Vigenère cipher. What is the Vigenère cipher? It's that modification of the Caesar cipher where we used love as the keyword. Take that and then apply the rail fence; that's round one. Then do those two steps again, round two, and again, round three, and start looking at the output. I did that for an example to give the plot that I showed you before. We will not do it in class. It's a product system: we have multiple rounds of transpositions and substitutions. Let's get on to real ciphers. I'll just see if we have an example. We'll come back to this one, but here's DES. Here's a real cipher, one that has been one of the most popular ciphers in the world. It's no longer recommended; we'll talk about its limitations when we come back to it in a moment. You don't have to understand this. I'm just trying to show you some concepts. This is the encryption stage, and it's broken into rounds. There were 16 rounds in DES. In each round we do the same steps: we take some input, apply some operations, some substitutions and transpositions, get some output, and repeat. It's similar to before, where I showed the case after round 1, round 2, round 3. We just repeat the same steps, and DES did it 16 times. What do we do in each step in DES? Round 1, for example, is this green box. Again, you don't have to understand this, but of these blocks, which I know are hard to read here, some are transpositions, called permutations (a permutation is a transposition, a rearrangement), and some are substitutions,
and a box that does permutations is sometimes referred to as a P-box, and a box that does substitutions is an S-box. If you study the details of DES, you'll see the S-boxes and P-boxes are all that's performed in here. So these blocks correspond to substitutions and transpositions, and there are only 5 or 6 blocks in here, and that makes up one round. Then in the second round we just do the same again, and we do it 16 times in DES. DES is a real cipher which just uses the same concepts as the classical ciphers. The details we're not going to cover. So let's go back and talk more generally, and summarize from an attacker's perspective. The aim of the attacker is to find the plaintext and/or the key. If you find the key, you can automatically get the plaintext, and you can potentially get future plaintext very easily. Say two people are communicating and they're sending many messages over a period of a week or a month, and they're using the same key. If I'm the attacker and after two days I intercept a message and I decrypt and get the plaintext, well, that's good, I've found the plaintext. But it's even better if I can find the key they're using, because now I can of course find the plaintext for that one message, but I can also easily decrypt all future messages which they encrypt using that key. So finding the key is of benefit. We assume the attacker knows the ciphertext, that is, they can discover the ciphertext, and they know the cipher being used, the algorithm. And the attacker in some cases may know about pairs of plaintext and ciphertext which were encrypted using the same key that we're trying to discover. For example, some military organization is sending secret messages about upcoming attacks, and the plaintext messages say: send a bomb to XY coordinates at time Z. That's the structure of the plaintext, a message telling whoever to send the bomb or send the missile, with the plaintext messages containing the coordinates and the time
for the attack, for a future attack. So they're instructions, but they encrypt that plaintext with some key and send the ciphertext. Someone, the attacker, wants to intercept the ciphertext and work out when the upcoming attack is going to be and where, so they can move, for example. So they intercept some ciphertext. They cannot read it; they cannot decrypt it. But then they see that tomorrow the bomb lands at these XY coordinates at this time. So even though they didn't decrypt the ciphertext to get the plaintext, they have the ciphertext and now they know what the plaintext was, because they know that the bomb went off at these coordinates and at this time. So they know the original plaintext must have said these XY coordinates and this time. So in some cases the attacker can work out what the plaintext would have been after the fact. That's not very useful in that case, because the bomb has already exploded, but in future cases they can now use the information that this plaintext mapped to this ciphertext. They still don't have the key. They know a pair of plaintext and ciphertext, but they don't know the key. It turns out that knowing pairs of plaintext and ciphertext can help you in finding the key, and can help you in finding the plaintext for future ciphertext values. So in some practical attacks the attacker may know pairs of plaintext and ciphertext but not know the key. In brute force, we try all possible keys. How do you prevent that? Make sure the key space is large enough that trying them all will take too long or will be too costly. Cryptanalysis exploits some characteristics of the algorithm to work out what the plaintext or key is. So the frequency analysis of those simple ciphers is cryptanalysis. It's not doing brute force; it's using some structure of them to work backwards from the ciphertext to find the plaintext and/or the key. And we assume that if the attacker decrypts and finds some possible plaintexts, they can recognize which one is the correct
one so now let's move into given these concepts move into real ciphers and specifically symmetric key encryption I think we've seen this picture very similar before symmetric key encryption sender and receiver use the same key so there's symmetry between the keys we take some plaintext we have a cipher we encrypt that plaintext using a shared secret key k we send the ciphertext to the recipient they decrypt the ciphertext using the same key as what was used for encryption so k and k the same values it's a shared value just one value so this is symmetric key encryption is commonly used for data confidentiality and encryption in many cases so for it to work we need a good algorithm to encrypt a strong algorithm such that if the attacker knows the ciphertext they know the algorithms it must be hard for them to find the key or the plaintext even if they know pairs of past plaintext and ciphertext somehow they've discovered some ciphertext values and the corresponding plaintext but they don't know the key yet it should be hard for them to find the key and for this to work the key must be of course secret if someone yells out the key in the class then it's no longer considered a secret between the two users the key must be secret otherwise the system fails symmetric key encryption we distinguish between two types block ciphers and stream ciphers so the algorithms we use block and stream ciphers block are the main ones and the difference between block and stream ciphers is the plaintext we operate at a time so what we do is we take some plaintext encrypt and get ciphertext if and we only operate on a particular length of input plaintext block ciphers typically 64 or 128 bits there are some variations but 64 is common which means we take 64 bits of plaintext encrypt to get ciphertext using our algorithm and then if we have more plaintext to encrypt we operate 64 bits at a time encrypt the next 64 bits encrypt the next 64 bits and so on until we're finished stream ciphers 
usually operate one byte at a time they take 8 bits at a time and encrypt it turns out that the algorithms used are slightly different generally stream ciphers stream ciphers can be implemented much faster to be much faster than block ciphers at least in the past nowadays there's not much difference in terms of performance they're becoming very similar in performance and stream ciphers and block ciphers are used for many of the same purposes let's forget about stream ciphers for now block ciphers we'll see one example of a stream cipher a little bit later one of the most popular block ciphers in real use was the data encryption standard, DES developed or designed about 30-40 years ago in the US and was developed as a standard for the US government which meant all the government departments in the US had to use it to encrypt their data but many other countries used it because they wanted to inter-operate with the US organizations so DES became a worldwide or used worldwide it operates on 64 bits at a time so in our classical ciphers I was using examples of letters, our 26 letters of English our real ciphers operate on binary values and what DES did is we take our plain text as binary and we take 64 bits apply the DES algorithm to encrypt to get some ciphertext and 64 bits of ciphertext is produced and then take the next 64 bits of plain text and encrypt with DES so we operate at a block at a time if so our input plain text now we assume is binary so if it's some text message we map say using an ASCII table the characters to bits DES used a key length of effectively 56 bits the key was actually 64 bits but 8 were not used for encryption so it meant effectively there was 56 bits in the key now a brute force attack try all keys how many keys in DES how many do we need to try 2 to the power of 56 if there are 56 bits in a key then how many possible key values are there or we can vary the bits between 0 and 1 so it becomes 2 to the power of 56 possible key values so in 
theory a brute force attack on DES requires an attacker to try and decrypt 2 to the power of 56 times do we have a table that showed those numbers have we gone through it sorry I jump ahead a few slides if our key length is 56 bits there are 2 to the power of 56 possible keys so a brute force attack on DES in theory what we do is we take our cipher text decrypt with key 1 check the plain text makes sense no decrypt with key 2 key 3, key 4 and in the worst case we have to try all 2 to the power of 56 keys okay we have to try them all that's a brute force attack how long does it take so it depends upon how many keys to try and how fast our computer is that's making the decryption attempts and we'll return to this table later but for the DES example we have a 56 bit key if my computer can decrypt 1 billion times per second it can try 10 to the power of 9 keys every second then with 2 to the power of 56 keys to try it takes what 2 or 3 years 833 days of my computer trying to decrypt for a brute force attack but if my computer would work 1000 times faster or maybe I had 1000 computers or 1000 times the money to spend on the computers then it would take just 20 hours to do a brute force attack on DES and if it was a million times faster than this I'd do it in a minute okay we'll come back to some further examples of brute force but the 56 bit key of DES turns out to be too short to withstand brute force attacks computers nowadays can do it in minutes hours I'll show some examples later of computers which were built originally to be just for breaking DES let's go back to DES so it had a key length effectively of 56 bits it does permutation, some initial and final permutation, a permutation is a transposition rear-range characters rear-range bits 16 rounds each round involves permutations and substitutions transpositions and substitutions the simple operations that we saw demonstrated with the CSER and rail fence cipher more complicated but the same concepts are used in 
DES let's not worry about Fistool turns out decryption in DES is almost the same as encryption the advantage of that we use the same algorithm to decrypt as to encrypt we just use things in a different order means you can implement the encryption say in hardware or software and you don't have to implement a separate piece of software or hardware to decrypt because it's the same so you just implement once and that software or hardware can both encrypt and decrypt that's a practical advantage the algorithm of DES is considered secure in most cases it's a good design it works well the main problem with DES is the key length is too short such that nowadays it's easy to do a brute force attack so we said there are two types of attacks brute force and cryptanalysis cryptanalysis takes advantage of the algorithm brute force is just trial keys DES is generally considered secure from cryptanalysis the algorithm is strong but insecure in terms of brute force the key length is too small the key size is too short it's longer recommended and only used in old applications but we use for many years so in the 70s 80s and 90s was very common and only started to be phased out and so in the 90s and 2000 and so on so that was one block cipher the concepts of DES have been applied in other ciphers the other ciphers have similarities to DES one extension of DES was triple DES easy apply DES three times and use a different key each time so instead of having a single 56 bit key effectively you've got a what three times the length which is 168 bit eight bit key so a simple way to increase the key length is to encrypt multiple times each time you encrypt use a different key do I have a picture? 
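
The idea of applying a cipher three times with three different keys can be sketched in a few lines. This is only an illustration under stated assumptions: `toy_encrypt` and `toy_decrypt` are hypothetical stand-ins with an 8-bit key, they are not DES and are not secure, and only the chaining of the three keys is the point. Decryption undoes the three steps in the reverse order.

```python
# Toy stand-in for DES: NOT secure, 8-bit keys, hypothetical names.
# It only illustrates how triple encryption chains three keyed steps.

TOY_KEY_BITS = 8   # key length of the toy cipher (assumption for the sketch)
DES_KEY_BITS = 56  # effective DES key length, as stated in the lecture


def toy_encrypt(block: bytes, key: int) -> bytes:
    """Encrypt each byte: XOR with the key, then add the key modulo 256."""
    return bytes(((b ^ key) + key) % 256 for b in block)


def toy_decrypt(block: bytes, key: int) -> bytes:
    """Undo toy_encrypt: subtract the key modulo 256, then XOR with the key."""
    return bytes(((b - key) % 256) ^ key for b in block)


def triple_encrypt(block: bytes, k1: int, k2: int, k3: int) -> bytes:
    """Encrypt with key 1, then key 2, then key 3."""
    return toy_encrypt(toy_encrypt(toy_encrypt(block, k1), k2), k3)


def triple_decrypt(block: bytes, k1: int, k2: int, k3: int) -> bytes:
    """Decrypt in the reverse order: key 3, then key 2, then key 1."""
    return toy_decrypt(toy_decrypt(toy_decrypt(block, k3), k2), k1)


plaintext = b"hellosteve"
ciphertext = triple_encrypt(plaintext, 17, 200, 99)
recovered = triple_decrypt(ciphertext, 17, 200, 99)

print("recovered:", recovered)
print("toy combined key bits:", 3 * TOY_KEY_BITS)
print("triple DES combined key bits:", 3 * DES_KEY_BITS)
```

The combined key is the three values together, which is where the 168 bits for triple DES comes from: three keys of 56 bits each.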
Triple DES was an extension of DES. Because many people had implementations of DES and many people used DES, it made sense to reuse the knowledge and the software and hardware, so people designed triple DES. You take your plaintext and encrypt with normal DES using key one, 56 bits. Then you apply DES again using a different key, key two, 56 bits, and then key three, 56 bits, and you get your ciphertext. So effectively your key is these three values. There are options to have a shorter key if needed. There are variations, such as encrypt-decrypt-encrypt, which have some advantages in practical use, but applying DES three times gives us a longer key and makes brute force practically impossible: 168 bits.
If we flick forward: 56 bits took 72 seconds on my network of ultra-fast computers. 128 bits takes 10 to the power of 16 years, and triple DES has 168 bits, so even longer, about 10 to the power of 28 years. Note that the universe is about 10 to the power of 10 years old. No one's going to wait for triple DES to be broken by brute force. So a very simple extension of DES: use it three times and the brute force problem is solved.
The problem is it's three times slower than DES. We always care about both security and performance. Every time I want to encrypt a large document, DES takes some time, and triple DES takes three times as much time as DES. So that was the practical problem with triple DES. It's used in some cases, but it's no longer recommended because there are equally secure algorithms that are faster, that perform better.
What did we skip? DES is no longer recommended. Triple DES is secure but doesn't perform so well compared to others nowadays. DES operations: we're not going through this; it's just to show that DES is really made up of multiple rounds of transpositions and substitutions.
There are other ciphers, and the most common one in use today, and recommended, is the Advanced Encryption Standard, AES. DES was developed by the National Institute of Standards and Technology in the US, NIST. Well, maybe I'll skip that: DES was standardised by NIST. It was originally designed by IBM and the NSA, and standardised by NIST, so they define how it works. And so was triple DES. So that was the improvement, and then they realised at some point that we need a better algorithm, faster and just as secure as triple DES. So what NIST did was hold a competition for people to propose designs, and they selected the winner, one called Rijndael, and in 2001 it became the standard. So it's been around for 10 plus years.
AES operates on blocks of 128 bits at a time. So if I have a 1 megabyte file to encrypt, AES takes the first 128 bits and encrypts, then the next 128 bits and encrypts, and so on. It allows for different key lengths, 128, 192 or 256 bits; you could choose a key length, and you still can today. It uses substitutions and permutations. It uses a different approach than DES but still uses our basic operations. It's used today in many applications. You'll see it in Wi-Fi: in wireless LANs you encrypt your data. Disk encryption software or file encryption software now uses AES. Many internet applications use or support AES. It's very common in software and hardware today and recommended for use, and the 128-bit key length is still considered secure.
And some others. There are many; some of them are listed here, by different people, with different block lengths and different key lengths. Many of them follow a similar structure to DES, called the Feistel structure, so they all follow similar design principles. Some are free to use; some have patents, which means maybe you need a license to use them in hardware or software. Some are faster than others.
In the last five minutes let's just come back to brute force attacks. Brute force attacks try all keys in the key space. The key space is the entire set of keys that a user can choose from. So how do we measure a brute force attack? Well, how much time does it take? That depends upon how many keys we need to try and how long it takes to try one key. So how big is the key space, and how fast is our computer?
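
To make "try all keys in the key space" concrete, here is a minimal sketch of a brute force attack on the Caesar cipher from last week, where the key space is only 26 keys. The function names are hypothetical, and the way the attacker recognizes the correct plaintext (looking for a word they expect to appear) is an assumption made for the sketch.

```python
import string

ALPHABET = string.ascii_lowercase  # the 26-letter character set


def caesar_decrypt(ciphertext: str, key: int) -> str:
    """Shift every letter back by `key` positions, wrapping around at 'a'."""
    return "".join(ALPHABET[(ALPHABET.index(c) - key) % 26] for c in ciphertext)


def brute_force(ciphertext: str, expected_word: str):
    """Try every key in the key space; stop when the plaintext looks right.

    Returns (key, plaintext, attempts), or None if no key produced a
    plaintext containing the expected word.
    """
    attempts = 0
    for key in range(len(ALPHABET)):
        attempts += 1  # one decryption = one operation
        candidate = caesar_decrypt(ciphertext, key)
        if expected_word in candidate:
            return key, candidate, attempts
    return None


# Ciphertext from last week's example (plaintext encrypted with key 3).
result = brute_force("khoorvwhyh", "steve")
print(result)
```

With 26 keys this finishes immediately. The same loop against a 56-bit key would need up to 2 to the power of 56 passes, which is why the size of the key space matters.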
Usually, because we don't want to worry about the details of how fast the computer is, we look at the number of operations. If I have a ciphertext and I want to try all keys, how many decrypts do I need to do? I decrypt the ciphertext with key one: that's one operation. I decrypt the ciphertext with key two: another operation. I try all keys. How many operations do I need? A k-bit key requires 2 to the power of k operations, and the next slide gives some examples for different key lengths. The key space is simply 2 to the power of k. We'll see a special case down the bottom.
And then I give some actual times based upon some speeds of computers, not of any specific computer, just some numbers. Say we have a computer that can do one billion decryptions per second; this is approximately how long it would take to do a brute force attack. With a 32-bit key there are 2 to the power of 32 possible keys to try. 2 to the power of 32 is about four billion, 4 by 10 to the power of 9. So if I can decrypt at 10 to the power of 9 keys per second and I've got four billion to try, it takes me about four seconds. If my computer is one thousand times faster, so I can do one trillion decryptions per second, then it's going to take one thousand times less time to do a brute force attack. And then 10 to the power of 15 decrypts per second. So these are some example numbers for time. DES, we saw, is nowadays considered breakable by brute force. 128 bits, with 10 to the power of 15 decrypts per second, is still impossible to break with a brute force attack, and anything longer is impossible.
How fast is a computer? We're out of time, but next lecture I'll show you. For example, for a normal new PC maybe we're talking about less than a billion per second, maybe one hundred million per second. So my computer is less than this. Let's say I have ten computers and I can get one billion per second; then this is the speed I can get. If I have ten thousand computers, then maybe I can get this speed, and I can do it all in parallel. Or I'm a government and I can have ten million computers, or supercomputers that are equivalent to having ten million Intel CPUs; then maybe we're at this speed. With ten million PCs decrypting all at the same time, ignoring communications between them, it's still impossible to break. So just look at the scale, and we see that 128 bits or larger is impossible for brute force attacks.
Let's stop there, and on Thursday I'll show you another example of brute force, a real brute force attack, as well as how to encrypt on our computer. Don't leave yet though. Just as a reminder, if you leave, then check the website. This week you have the practice quiz, not marked; do it in your own time. And some exercises, again not marked; do them in your own time. The exercises follow from last week: I said try this virtual network software, so there are some instructions there, try it, and try this OpenSSL software. Follow this link and you'll see an explanation, because next week there'll be homework using these two, homework that will be assessed. So if you can't use the software, it'll be much harder when you have the homework. I'll say a bit more about it on Thursday.
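
As a reference for the brute force numbers in this lecture, the worst-case time is just the key space, 2 to the power of k, divided by the number of decryptions per second. The sketch below reproduces that arithmetic. The function names are hypothetical, the three speeds are the example speeds from the lecture rather than measurements of any real computer, and a year is taken as 365 days.

```python
SECONDS_PER_DAY = 24 * 3600
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY  # assumption: 365-day year


def brute_force_seconds(key_bits: int, decrypts_per_second: float) -> float:
    """Worst-case time to try every key of a k-bit key space."""
    return 2 ** key_bits / decrypts_per_second


def human_time(seconds: float) -> str:
    """Pick a readable unit for a duration given in seconds."""
    if seconds < 60:
        return f"{seconds:.1f} seconds"
    if seconds < 3600:
        return f"{seconds / 60:.1f} minutes"
    if seconds < SECONDS_PER_DAY:
        return f"{seconds / 3600:.1f} hours"
    if seconds < 3 * SECONDS_PER_YEAR:
        return f"{seconds / SECONDS_PER_DAY:.0f} days"
    return f"{seconds / SECONDS_PER_YEAR:.1e} years"


KEY_LENGTHS = [32, 56, 128, 168]   # bits
SPEEDS = [1e9, 1e12, 1e15]         # decryptions per second (example values)

for bits in KEY_LENGTHS:
    times = [human_time(brute_force_seconds(bits, s)) for s in SPEEDS]
    print(f"{bits:>3}-bit key: " + " | ".join(times))
```

For a 32-bit key at one billion decryptions per second this gives about four seconds. For the 56-bit DES key it gives roughly 834 days, 20 hours and 72 seconds at the three speeds, and for 128 bits at 10 to the power of 15 per second about 10 to the power of 16 years, in line with the figures above.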