We finished last week with a few quick examples of transposition techniques. These classical ciphers are being used to demonstrate concepts that are used in real ciphers. With the substitution techniques, we went from the very basic Caesar cipher up to the one-time pad. The one-time pad is the best. It's perfect for security. You can't break it. The only problem with the one-time pad is that the key must be as long as the plaintext. If I want to encrypt a large file, my key must be the same length as that file. And there are two problems with that. First, distributing large keys is hard. How do I get the key to you? We said we need some mechanism for distributing keys in a secure manner, and that's hard. Second, it turns out that generating long sequences of random bits is not easy. We sometimes take for granted how random numbers are generated: you call the RAND function in your programming language, but what's the algorithm for creating random numbers? It's quite challenging to create a truly random sequence of bits. So the one-time pad is what we call unconditionally secure: under no conditions can you break it. But it's not practical to use, so we need to make a trade-off: maybe less secure, but more practical to use. Those were the substitution techniques. Transposition is rearranging, and we went through quick examples of rail fence and rows/columns. With rail fence, we write our plaintext across a set of rows, three rows in this case, and then read it off row by row to get the ciphertext. So it's just a way to rearrange the plaintext. And the second one, rows/columns: what do we do? We wrote our plaintext across a set of columns and then we read the columns, but the order in which we read the columns is determined by the key. The key is that sequence of six digits there, the column ordering. Read the second column first because the 1 is there. Read this column first, this column second, and this one third.
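The rows/columns procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rows_columns_encrypt` and the parameter `col_key` are made up for this sketch, the plaintext is assumed to be already padded to fill the last row, and the example values are the "attack postponed" plaintext and key 4 3 1 2 5 6 7 that come up a little later in this lecture.

```python
def rows_columns_encrypt(plaintext: str, col_key: list[int]) -> str:
    """Write the plaintext row by row under the key, then read columns in key order."""
    n_cols = len(col_key)
    if len(plaintext) % n_cols != 0:
        # Assumption of this sketch: the caller pads the plaintext first.
        raise ValueError("pad the plaintext so it fills the last row")
    # Column c holds every n_cols-th letter, starting at offset c.
    columns = [plaintext[c::n_cols] for c in range(n_cols)]
    # col_key[c] is the number written above column c: read the column
    # labelled 1 first, then the one labelled 2, and so on.
    read_order = sorted(range(n_cols), key=lambda c: col_key[c])
    return "".join(columns[c] for c in read_order)


ciphertext = rows_columns_encrypt("attackpostponeduntiltwoamxyz", [4, 3, 1, 2, 5, 6, 7])
print(ciphertext.upper())  # TTNAAPTMTSUOAODWCOIXKNLYPETZ
```

Decryption is deliberately left out, since working it out is the suggested exercise.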
So E, Y, Y, A, and so on. We get the ciphertext. A good thing to practice with that one is decrypting. I don't show the algorithm for decrypting. You should try to decrypt all of these ciphers. Some are obvious; some are easy to run backwards. Basically, decryption should be the inverse operation, and the easy way to test whether you understand decryption is to check that you get the original plaintext back. So I suggest you try decrypting and see if you can work out the decryption algorithm in that case. Transposition on its own does not protect against frequency analysis, because the same characters of the plaintext exist in the ciphertext. We do not hide any of the statistics of the letter frequencies. If there are 12% E's in the plaintext, then with transposition techniques on their own there will be 12% E's in the ciphertext. We don't substitute, we just rearrange. So that's the weakness of these, but transpositions play an important role in real ciphers. So let's summarize with a couple of concepts. This slide illustrates a concept we use in real ciphers, using the rows/columns transposition. There was an original plaintext, "attack postponed until two am", and we padded it out with "xyz". So that was our plaintext. And we had a key, 4 3 1 2 5 6 7. We applied the rows/columns transposition: that is, we wrote the plaintext across a set of seven columns, row by row (why seven? because our key contains the numbers 1 to 7), and then read column by column. And if you encrypt that, you get this ciphertext. You can check later: TTNA, and so on. Now look at the numbers here. The other way to think of that transposition is as rearranging letters: the first letter in the plaintext moves to another position in the ciphertext. And that's what these numbers show. The first letter of the plaintext, 01, which was an A: after applying our transposition cipher, where did it move to? After the first transposition, where is 01?
Why can't I see it? Here it is. What position is it? 13. So that cipher takes the letter A in the plaintext, the very first letter, and if you map where it ends up, it ends up here at position 13 in the ciphertext. That's what these numbers show. The third letter in the plaintext, T, the second T there, ends up as the first letter in the ciphertext. So that's shown here: letter number 3 ends up at position 1 after we encrypt. So this is the original ordering of the letters, 01 through to 28, just wrapped across two rows. After we encrypt, we end up with this ordering of those letters. Number 3 is in the first position, number 10 is in the second position, and so on. Look at this set of numbers. Can you see a pattern? Look closer. "No" is the wrong answer. Find the pattern. Plus 7. The numbers just refer to the letter positions in the plaintext: 3, 10, 17, 24. 3 plus 7 is 10, 10 plus 7 is 17, 17 plus 7 is 24. So maybe there's some pattern, and then you see the next four: 4, 11, 18, 25. Another pattern of plus 7: 4 plus 7 is 11, and so on. So we see that there is some pattern in the ciphertext. That's bad. When we encrypt something, the ciphertext should appear random, and random means there is no pattern. When we can easily see a pattern, that suggests the cipher is not so good at encrypting. The encryption should hide the patterns; it should make the output appear random. So the point here is that applying this simple cipher produces ciphertext which has a quite easy-to-recognize pattern. If we write the letters as numbers, as in this case, it's not hard to see that it's plus 7. Why 7? The key goes up to 7, meaning that in our cipher there are 7 columns. So there's some relationship there. You could go and work through the algorithm and see why it's 7 and why there are 4 of those entries in each run. So it's not so good, because there is a pattern in the ciphertext. But now we take that ciphertext, TTNA and so on.
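The position numbers on the slide can be reproduced by running the transposition on the numbers 1 to 28 instead of on letters. The sketch below assumes the same rows/columns procedure and key as above; `ciphertext_order` is a name invented for this illustration.

```python
def ciphertext_order(n_letters: int, col_key: list[int]) -> list[int]:
    """For each ciphertext slot, give the 1-based plaintext position that lands there."""
    n_cols = len(col_key)
    positions = list(range(1, n_letters + 1))
    # Same write-by-rows, read-by-columns procedure, applied to position numbers.
    columns = [positions[c::n_cols] for c in range(n_cols)]
    read_order = sorted(range(n_cols), key=lambda c: col_key[c])
    return [p for c in read_order for p in columns[c]]


order = ciphertext_order(28, [4, 3, 1, 2, 5, 6, 7])
print(order[:8])           # [3, 10, 17, 24, 4, 11, 18, 25]
print(order.index(1) + 1)  # 13: the first plaintext letter ends up in slot 13

# Each run of 4 comes from one column, so neighbours inside a run differ by 7.
runs = [order[i:i + 4] for i in range(0, len(order), 4)]
print(all(b - a == 7 for run in runs for a, b in zip(run, run[1:])))  # True
```

The step of 7 is the number of columns, and each run has 4 entries because the 28 letters fill 4 rows.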
We encrypt it again using the exact same cipher and the same key, and we get this output, NSC and so on. The correspondence between the original plaintext letters and the rearrangement is in the last set of numbers towards the bottom of the slide, which means that the very first letter of our plaintext, 01, ends up at position 13 after one application of our encryption, but after two it ends up here at position 22. So it's just rearranging again. What's the pattern in the second output? Anyone see it? Look at that second sequence of numbers and see if you can find some relationship between successive numbers, or some form of pattern. I cannot see it. If you see it, tell me. Applying the encryption operation the second time has started to hide the pattern of the plaintext. To me it appears more random than the first output ciphertext. And this is a concept commonly used in ciphers today: repeating simple operations, simple encryption steps. This transposition cipher is quite simple, just rearranging by rows and columns. In the first instance it didn't provide much security, because we see that there's some pattern in the output, but applying the same simple cipher again improves the security. It mixes things up more, making it harder to recognize any pattern in the output. And that's a concept that we use: applying simple operations multiple times can improve the security of an encryption algorithm. That's true not just for transpositions; the same can be said for substitutions. And in fact the general approach is to combine them. Do a substitution cipher, maybe the Caesar cipher or Vigenère. Take the output of that and do a transposition like this rows/columns one, then take the output of that and do another Vigenère cipher. That is, apply substitutions and transpositions one after the other, and then repeat again. This concept of repeating the encryption starts to produce more random-looking ciphertext each time.
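To see the effect of the second pass, the same transposition can be applied twice to the position numbers and to the letters. This is a sketch under the same assumptions as before; `rows_columns` and `plus_seven_steps` are illustrative names, and counting "+7" neighbours is just one crude way to check for the obvious pattern, not a measure of randomness.

```python
from collections import Counter


def rows_columns(items, col_key):
    """One rows/columns pass over any sequence; returns a list in read-out order."""
    n_cols = len(col_key)
    columns = [list(items[c::n_cols]) for c in range(n_cols)]
    read_order = sorted(range(n_cols), key=lambda c: col_key[c])
    return [x for c in read_order for x in columns[c]]


def plus_seven_steps(order):
    """Count neighbouring entries that differ by exactly +7."""
    return sum(1 for a, b in zip(order, order[1:]) if b - a == 7)


KEY = [4, 3, 1, 2, 5, 6, 7]
once = rows_columns(list(range(1, 29)), KEY)
twice = rows_columns(once, KEY)

print(once[:8], plus_seven_steps(once))    # [3, 10, 17, 24, 4, 11, 18, 25] 21
print(twice[:8], plus_seven_steps(twice))  # [17, 9, 5, 27, 24, 16, 12, 7] 0
print(twice.index(1) + 1)                  # 22

plaintext = "attackpostponeduntiltwoamxyz"
double = "".join(rows_columns(rows_columns(plaintext, KEY), KEY))
print(double.upper())                         # NSCYAUOPTTWLTMDNAOIEPAXTTOKZ
print(Counter(double) == Counter(plaintext))  # True: letter frequencies are unchanged
```

The last line is a reminder that however many times we transpose, the letter frequencies of the plaintext are still there, which is why substitutions are combined with the transpositions.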
So real ciphers today are built upon those principles: combining transpositions and substitutions, plus repeating that multiple times. Basically we loop: encrypt, encrypt the ciphertext, encrypt that output, and keep going multiple times. And that leads us to the real ciphers, the block ciphers, that we'll look at. We'll go through one example of a real cipher and see those principles in place. Any questions so far on classical ciphers? We've got substitutions and transpositions as the basic operations, and the same applies in real ciphers. What we do is combine them: do a substitution operation, do a transposition operation, and then do it all again, and again, and that builds up the real ciphers you have today. What's the problem with doing it again and again and again? We've done two transpositions; what if we do a hundred? It's more secure, but what's the problem? It takes time. With security we care about the trade-off between how secure something is and its usability, how easy it is to use, and one common measure of usability is how well it performs.
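The loop of "substitute, transpose, and do it all again" can be sketched as a toy. Everything here is an assumption made for illustration: the names, the choice of Vigenère as the substitution step, the keyword "lemon", and the step counter. It is not a real block cipher and not secure; it only shows the structure and that the work grows with the number of rounds.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def vigenere(text: str, keyword: str) -> str:
    """Substitution step: shift letter i by the keyword letter i (keyword repeats)."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + ALPHABET.index(keyword[i % len(keyword)])) % 26]
        for i, ch in enumerate(text)
    )


def rows_columns(text: str, col_key: list[int]) -> str:
    """Transposition step: write by rows, read columns in key order."""
    n_cols = len(col_key)
    columns = [text[c::n_cols] for c in range(n_cols)]
    read_order = sorted(range(n_cols), key=lambda c: col_key[c])
    return "".join(columns[c] for c in read_order)


def toy_product_encrypt(text: str, keyword: str, col_key: list[int], rounds: int):
    """Repeat substitution followed by transposition; also count the simple steps done."""
    steps = 0
    for _ in range(rounds):
        text = rows_columns(vigenere(text, keyword), col_key)
        steps += 2  # one substitution pass plus one transposition pass
    return text, steps


plaintext = "attackpostponeduntiltwoamxyz"
for rounds in (1, 2, 100):
    ciphertext, steps = toy_product_encrypt(plaintext, "lemon", [4, 3, 1, 2, 5, 6, 7], rounds)
    print(rounds, steps, ciphertext)
```

A hundred rounds cost fifty times the work of two rounds, which is the performance side of the trade-off.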
If I encrypt a file and it takes me one day to encrypt it, then that's not so good for the user, not so convenient. So we need algorithms that are secure but also fast, and there's a trade-off there. What else, before we move on to the next lecture? Just returning to what we've covered: we've gone through, I think, seven classical ciphers. Which one's best? It's a common quiz question: tell me the best of all seven. The one-time pad. Another concept to always think about is that the idea with ciphers is to produce random output. If it's random, then there's no structure of the plaintext represented in the ciphertext. Plaintext always has structure. We don't have random messages that we want to send to people. Who sends random emails to their friends? No one, because a random message contains no useful information. So when we're communicating we have structured messages, but so that the attacker cannot determine what the message is, we want the ciphertext to have no structure, to appear random. Having random ciphertext is what we aim for, and the one-time pad does that. The one-time pad takes our structured plaintext and shifts each letter by a random number of positions based upon the key. Remember, the one-time pad was the Vigenère cipher, but with a key that is as long as the plaintext and random. What was the Vigenère cipher? The Caesar cipher, but with different keys. So it's all built on the Caesar cipher. We take the letter I in the plaintext and shift it by S positions, where we map these letters to numbers, and we get the letter A. We take the letter N, shift by I positions, and get the letter V. The one-time pad is exactly the same, but instead of using a keyword we use a random sequence of characters. Take the letter I, shift by a random number of positions, and we'll get a random output. Take the letter N, shift by a random number of positions, and we'll get a random output. Can we write an equation for the one-time pad? Try to write an equation for the one-time pad.
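The letter shifting just described can be sketched as follows. The function names are invented for this illustration, only lowercase letters are assumed, and the random key is drawn with Python's `secrets` module, which is a convenient stand-in here rather than the truly random source a real one-time pad needs.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase  # a=0, b=1, ..., z=25


def shift_encrypt(plaintext: str, key: str) -> str:
    """Shift each plaintext letter forward by the matching key letter, wrapping at 26."""
    if len(key) < len(plaintext):
        raise ValueError("this sketch needs one key letter per plaintext letter")
    return "".join(
        ALPHABET[(ALPHABET.index(p) + ALPHABET.index(k)) % 26]
        for p, k in zip(plaintext, key)
    )


def shift_decrypt(ciphertext: str, key: str) -> str:
    """Inverse operation: shift each letter back by the matching key letter."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) - ALPHABET.index(k)) % 26]
        for c, k in zip(ciphertext, key)
    )


# The lecture's example: I shifted by S gives A, N shifted by I gives V.
print(shift_encrypt("in", "si"))  # av

# One-time-pad style: a fresh random key letter for every plaintext letter.
message = "attackpostponed"
pad = "".join(secrets.choice(ALPHABET) for _ in message)
ciphertext = shift_encrypt(message, pad)
print(ciphertext, shift_decrypt(ciphertext, pad) == message)
```

With a repeating keyword this is the Vigenère cipher; with a random key as long as the message, used once, it is the one-time pad.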
It's a common written quiz question, maybe even an exam question: an equation like the one we wrote for the Caesar cipher. Go back to the Caesar cipher, our very first one, and write an equation for the one-time pad. The hint is that the one-time pad uses the Caesar cipher, but it changes the key for each plaintext letter we encrypt. Again we're assuming that letters are mapped to numbers; that's how we write the equation. In this case P, the plaintext, corresponds to the plaintext letter, and we map that to a number: A becomes 0, B becomes 1, and so on. That's how we can interpret it as an equation. We can do it in different ways, but maybe the simplest way to think of it, and I'll generalize it, is: the ciphertext at position i equals the plaintext at position i plus the key at position i, all mod 26, that is, C_i = (P_i + K_i) mod 26. That's effectively the one-time pad. What I mean is that the plaintext P, the string, is made up of letters P_1, P_2, P_3, ..., P_i, up to P_n. If the length is n, an n-character plaintext, we can think of it as just n letters. Similarly for the key in the one-time pad we have K_1, K_2, K_3, ..., K_i, and since the key is the same length as the plaintext, up to K_n. To get the ciphertext, all we do is take P_1, the letter but represented as a number, and similarly the first letter of the key, also represented as a number, add them together, mod by 26, and we get our ciphertext C_1. So in fact the one-time pad, this perfect cipher, is really an extension of the Caesar cipher, one of the earliest known ciphers. It's just that K is no longer fixed. In the original Caesar cipher, K was always the same value for every letter of plaintext. In the one-time pad K is changing and it's random: K_1 is a random letter, K_2 is a random letter, and so on. Of course, nowadays we don't deal with just English characters. When we have encryption on computers we usually use binary as the input, so the input is a sequence of bits. If it's an English letter that we want to encrypt, say A, we look at the
binary representation, maybe using ASCII to encode it as eight bits. If it's some other language, then again there are encodings to binary. If it's an image, there are encodings to binary. So our real ciphers today just operate on binary. Let's consider the one-time pad in binary form. What would the equation be if we're using binary? The equation we had was the form for, let's say, just English, and just one case, say lowercase: we were not using both upper and lower case, just 26 letters. What if we use the one-time pad but with binary? What's the general equation? Someone tell me: how many letters, how many characters, are there with binary? Two. So how do we change the equation? Mod 2. Easy. It's exactly the same, but with mod 2 instead of mod 26. Why do we mod by 26? To do the wrap-around when we shift. If we've got 0 and 1, then the input plaintext symbol is either a 0 or a 1, so we need to mod by 2 to wrap around. So the equation becomes C_i = (P_i + K_i) mod 2, where the plaintext is a sequence of zeros and ones. Let's consider some possible values. What are the potential values of P_i? 0 or 1. Good. What are the potential values of K_i? 0 or 1. So in fact we've got four combinations: if my plaintext bit is 0, my key bit could be 0 or 1, and if my plaintext bit is 1, my key bit could be 0 or 1. Let's look at the ciphertext we get when we encrypt. Plug those values into our equation. What do we get? 0 plus 0 is 0, and 0 mod 2 is 0. 1 plus 0 is 1, and 1 mod 2 is 1. 0 plus 1 is 1, and 1 mod 2 is 1. 1 plus 1 is 2, and 2 mod 2 is 0. Easy. Do you recognize what operation that was? It's XOR. That's the same as P_i exclusive-or K_i. 0 XOR 0 is 0, 1 XOR 0 is 1, 0 XOR 1 is 1, and 1 XOR 1 is 0. So in fact the one-time pad, when
we're using binary, is in fact the same as taking the exclusive-or of the plaintext and the key. So the one-time pad in binary is really just C_i = P_i XOR K_i. That's very nice, because implementing exclusive-or in hardware is very fast and very easy to do. Say you have a plaintext that is a sequence of a million bits. To provide perfect encryption, all you need to do is generate a one-million-bit random number, the key, exclusive-or it with your plaintext, and you'll get ciphertext that is unbreakable. So it's a very, very simple cipher. The problem is that the key is too long. But exclusive-or becomes an important operation in encryption. One way to encrypt something is to take your plaintext and XOR it with a random sequence. So in practice we often think of the one-time pad as just XORing the plaintext with a random key, and we'll see that come up later. What if the plaintext is not a binary number, for example a string of words? Then convert it to binary. Can't we convert everything to binary nowadays? ASCII encoding converts text to binary, and all the different encodings, UTF and so on, convert different languages into binary. What about images, pictures? Again, we can save pictures on our hard disk, and that is just a binary representation of an image. The algorithms of JPEG and so on specify how to convert colors to binary representations. Video we can convert to binary, voice we can convert to binary; everything today we can convert to binary. So when we encrypt, we're dealing with binary input and binary ciphertext. So the one-time pad is XOR. Let's move on to the next topic, block ciphers, and see some of these concepts applied in real ciphers. Ah, but wait, did I skip some slides earlier? I think I did. We've done brute-force attacks; we've talked about them. I think we've said that this 26 factorial is actually the size of the key space. There are a couple of slides here. This one's important: how do we measure the security of an algorithm? How can we say one algorithm is
more secure than another? Well, first we can talk about unconditionally secure, that is, impossible to break. If we have the ciphertext and there's no information in it from which we can find out the plaintext or the key, then we call the cipher that produced that ciphertext unconditionally secure. The only known algorithm that does that, the only one, is the one-time pad. It is unconditionally secure, but as we've said several times, it's not very practical because of the key it needs. So in fact talking about unconditionally secure is not very useful for comparing ciphers, because there's only one cipher in the world that is unconditionally secure. All the others are conditionally secure, secure under certain conditions. So we need another way to measure, and we talk about computationally secure. We say a cipher is computationally secure if the cost of breaking that cipher exceeds the value of the information we obtain from breaking it. An example: I encrypt the password for my bank account using some cipher and you get the ciphertext. What you want to do is decrypt that ciphertext to find my password; then you can access my bank account and take all my money. You know it's a strong cipher, so you go and buy a new computer to try to break it. You spend a hundred thousand baht on some new computers, some fancy hardware, and you eventually break it. You've spent a hundred thousand baht, you get the password to my bank account, and you steal all ten thousand baht from it. Is that cipher computationally secure? We would say it is computationally secure, because the cost for you to break it, a hundred thousand baht, exceeds the value of the information, the ten thousand baht that you can get from my bank account. So if we can value information, if we can put a number on what the information is worth, then we
can talk about whether the cost of breaking it exceeds that value. We want a cipher that is too expensive to break relative to the value of the information. That sounds easy, but it's very hard to put a cost on information. The value of the password for my bank account corresponds to how much money is in my bank account. But what's the value of the password to your Gmail account? That's hard to put a number on. What's the value of encrypting some confidential information for a company? Again, it's hard to put numbers on the value of information, so it's hard to measure that. The other way is that the time required to break the cipher exceeds the useful lifetime of the encrypted information. Say the encrypted information is the location where some military attack is going to occur: tomorrow some military is going to drop a bomb at some location. The people who are against that military want to find out where the bomb will be dropped, so that they can avoid being there or can move. The bomb is going to be dropped tomorrow, and they need to know the location. So the people who want to find out the location try to break the cipher protecting that encrypted information, and it takes them two weeks to break it. Is that a good cipher? We would say that's a good cipher, because it takes them two weeks to break it and the information is only valid for one day, because the bomb is dropped after one day. We need to break the cipher within one day, otherwise the bomb drops on us. So here it's about time: if we call this cipher computationally secure, the time required to break it must be longer than the useful lifetime of the information. Again, that's hard to measure. You encrypt an email: how long should that email stay encrypted for? It's hard to put numbers on the lifetime of particular information. What's the lifetime of an email? What's the lifetime of a file you want to encrypt
and protect from someone accessing? If we can measure those things, we can talk about an algorithm that is computationally secure, but it's hard to measure those things. So it's hard to put a number on it and say this algorithm is more secure than another, or this algorithm is computationally secure, but we need some way to compare. What else did we miss? I think just one more slide. Right, we covered all that. This one, and we may see it later when we see some attacks on a real cipher. So far we've asked: what does the attacker know? What do we assume? We assume that they know the ciphertext; somehow they can intercept and find the ciphertext. That's a given. And we assume that they know the algorithm used; they know the cipher. So if I'm using the Vigenère cipher and you're trying to attack and find the plaintext that I've sent to someone else, you assume that you know I'm using the Vigenère cipher and that you know the ciphertext I generated. When attackers apply cryptanalysis to try to defeat a cipher, we classify the types of attack in different ways, and this is one classification, based upon what they know in advance. The normal assumption is called a ciphertext-only attack: the attacker only knows the ciphertext and the algorithm. We'll go through the other four. In all of these cases we assume that they know the algorithm and the ciphertext; algorithm and ciphertext are always known. If the attacker can defeat your cipher using just the encryption algorithm and the ciphertext, then that is the weakest kind of cipher, and we'll compare it with the others in a moment. But in some cases the attacker may be more successful at defeating the cipher if they know some more information. They don't just know the algorithm and the ciphertext; maybe they also know some past plaintext-ciphertext pairs that were generated using the same key. This is called a known-plaintext attack. An example, and I think we've mentioned
it before, but we'll try to write it again. The scenario is that A is sending data to B, and some ciphertext was sent. Let's call that ciphertext C_i. That ciphertext was obtained by A encrypting some plaintext P_i with some key, which I'll just denote K: that gave us C_i = E(K, P_i). So what A did was take some plaintext, encrypt it with the key, and get the ciphertext, and then send the ciphertext to B. Now consider the attacker, the malicious user. What do they know? Well, we always assume they know the algorithm; they know E, and similarly the decryption algorithm. The encryption and decryption algorithms usually go together, so there's one name that refers to both, a cipher, and it's known to the attacker. And we assume that they know C_i. That's the normal information we assume the attacker knows. But in some cases they may know more. For example, with known plaintext, they may know that in the past A has sent messages to B, and they may know one of the plaintext messages, maybe plaintext P_1, and the corresponding ciphertext, so a pair. And they may know multiple pairs; the more the better. So in this case we assume the malicious user knows the normal information and also, somehow (we don't say how), they've learned one of the old plaintext values, P_1, and the corresponding ciphertext C_1, which was obtained using the same key K. So C_1 was obtained by A encrypting P_1 with key K, and somehow the malicious user has learned both of those values. They haven't learned the key. They want to find the key, or P_i, but they know some old pairs. So the challenge is to find K or P_i, and a known-plaintext attack is when the malicious user does know some of the past plaintext-ciphertext pairs. The more information the attacker knows, generally the easier it is for them to perform some attack, and so the more pairs they know, the easier attacks can be. How do they know the plaintext without knowing the key? Maybe the message was released through some other means, so
maybe plaintext P_1 became available. Of course, we intercepted C_1 yesterday. So today, as the attacker, we've intercepted C_i, and yesterday I intercepted C_1 and C_2. So I know C_1 and C_2, but I don't know P_1 or P_2. How do I learn them? Maybe the information was context- or time-sensitive: P_1 was important yesterday, but today it's been released. C_1 is old and the plaintext is old, so maybe there's a way that I can learn the plaintext from the past communications. It's like that lifetime issue: if the plaintext has exceeded its lifetime, then it may be released, and there may be a way for us to find it. So this assumes that the attacker can find past pairs. It's called known plaintext because we know some plaintexts as well. There are some variations on that. Known plaintext means that we know some past plaintext and ciphertext pairs; we always know ciphertext because we can intercept it. Chosen plaintext is the same, except that those plaintext messages were chosen by the attacker. I won't draw that again; it's the same in that we know some pairs, but I chose the value P_1. That can make it easier for the attacker to find weaknesses in the algorithm. The concept is easy, really. Maybe A is some service that encrypts information and sends it to someone. All I need to do is somehow get A to encrypt a message that I choose, P_1. Although I don't know the key that A used, I can then observe the ciphertext that A sends to B, C_1. So I learn that when A encrypts P_1, the value I chose, we get C_1. Again, I don't know the key; I want to find it. This is a chosen-plaintext attack, in that the attacker chooses the values of P and finds the corresponding values of C. Chosen ciphertext is similar, but the attacker gets to choose the ciphertext and obtains the corresponding plaintext, so it's the backwards way. Maybe I choose some ciphertext which I know triggers a weakness in the algorithm: the algorithm doesn't work on a particular ciphertext, or it's weaker for some ciphertexts than others. So what
I try to do is get the key holder to decrypt that ciphertext using their secret key. If I can do that and get the plaintext from the decryption, then that can help me in breaking the cipher. So a chosen-ciphertext attack is usually even better from the attacker's perspective. Again they know pairs of plaintext and ciphertext, but in this case they got to choose the exact ciphertext. Chosen text is a combination of those two: the attacker can choose plaintext and find the corresponding ciphertext, and/or choose ciphertext and find the corresponding plaintext. The first point: the more information the attacker knows, generally the easier it will be for them to defeat the cipher using cryptanalysis. And it's even better for them if they can control the information that is encrypted, so they can choose the values of plaintext, because some ciphers may be weaker for particular values. They may be strong for most values, but there may be some special-case values for which those ciphers reveal a weakness. So if the attacker can get the users to encrypt or decrypt those values, it can be easier for the attacker. When people study ciphers and try to compare which one's stronger than another, they often classify them this way: is the cipher subject to a chosen-text attack, is it subject to a chosen-ciphertext attack, and so on. So take a cipher where a chosen-ciphertext attack is possible versus a cipher where a ciphertext-only attack is possible. Which one's the better cipher: one where it's possible to defeat the cipher with only ciphertext, or one where it's possible to defeat the cipher only if we have chosen ciphertext? We would like a cipher that is not subject to ciphertext-only attacks. A cipher that is subject to a ciphertext-only attack is the weakest of them all, because that's the cipher where the attacker only needs the basic
information. So a cipher that is subject to a ciphertext-only attack is the weakest; a cipher that is subject to a chosen-ciphertext attack but not a ciphertext-only attack would be stronger. That gets confusing, even for me. The simplest way to remember it: the more information the attacker knows, the easier it is for the attacker. We will see some real examples of how people compare ciphers with respect to this known information after we go through some real ciphers, which we will do now. Let's look at the next topic.
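As a small illustration of "the more information the attacker knows, the easier it is", here is a sketch that compares a ciphertext-only attack with a known-plaintext attack. The Caesar cipher is used only because it is the simplest cipher from this lecture; the secret key 11, the messages, and all function names are hypothetical values chosen for the example, and attacks on real ciphers are far more involved.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def caesar(text: str, k: int) -> str:
    """Shift every letter by k positions (a negative k shifts back, i.e. decrypts)."""
    return "".join(ALPHABET[(ALPHABET.index(ch) + k) % 26] for ch in text)


def ciphertext_only_candidates(ciphertext: str) -> list[str]:
    """Ciphertext only: try every key; the attacker must still spot the real message."""
    return [caesar(ciphertext, -k) for k in range(26)]


def known_plaintext_key(p_old: str, c_old: str) -> int:
    """Known plaintext: one old (plaintext, ciphertext) pair reveals the key directly."""
    return (ALPHABET.index(c_old[0]) - ALPHABET.index(p_old[0])) % 26


# What A does (the attacker never sees secret_k).
secret_k = 11
p1, c1 = "hello", caesar("hello", secret_k)  # an old pair the attacker has learned
c_i = caesar("attackatdawn", secret_k)       # today's intercepted ciphertext

# Ciphertext-only attacker: 26 candidates to inspect.
print(len(ciphertext_only_candidates(c_i)))  # 26

# Known-plaintext attacker: the key falls out of the old pair.
k = known_plaintext_key(p1, c1)
print(k, caesar(c_i, -k))                    # 11 attackatdawn
```

Both attacks succeed against Caesar, which is why it is so weak; the point of the sketch is only that the old pair removes the guessing.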