We saw the Caesar cipher introduced last week. We're using the English alphabet, 26 characters, and there's no difference between uppercase and lowercase: in our input plaintext the case does not matter, and in the output ciphertext the case does not matter. On this slide, and in textbooks and so on, you will often see the plaintext presented in lowercase and the ciphertext in uppercase, but in terms of applying this equation we just have 26 letters, okay? Uppercase A is the same as lowercase a; it's just that we sometimes use uppercase to present the ciphertext, and it's not so important.

The specific Caesar cipher shifts our letters by three positions. To express that mathematically we map the letters to numbers, and we need to define that mapping. Here I've defined the letter a as equivalent to the integer 0, b as 1, and so on. It could be different, but then we'd need to adjust the equations. Once the mapping from letters to numbers is defined, we can express the encryption operation via this equation: we take the integer value of a plaintext letter, p, and we add the key. In the specific Caesar cipher the key is the number of positions we shift, which is three.
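As a rough illustration of the encryption equation, here is a minimal Python sketch. It assumes the mapping a = 0, b = 1, ..., z = 25 defined above and takes the result mod 26 so that shifts past z wrap around to a. The function name `caesar_encrypt` and the choice to print ciphertext in uppercase are assumptions made for this example, not something from the slide.

```python
import string

ALPHABET = string.ascii_lowercase  # a=0, b=1, ..., z=25


def caesar_encrypt(plaintext: str, k: int = 3) -> str:
    """Encrypt one letter at a time with C = (p + k) mod 26.

    Case is ignored on input; ciphertext is shown in uppercase only as a
    presentation convention. Non-letters are passed through unchanged.
    """
    out = []
    for ch in plaintext.lower():
        if ch in ALPHABET:
            p = ALPHABET.index(ch)       # letter -> integer
            c = (p + k) % 26             # shift and wrap around
            out.append(ALPHABET[c].upper())
        else:
            out.append(ch)
    return "".join(out)


print(caesar_encrypt("hello"))     # KHOOR
print(caesar_encrypt("xyz"))       # ABC (wraps around the alphabet)
print(caesar_encrypt("hello", 4))  # LIPPS (generalized shift, k = 4)
```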
We add three, and because we need to wrap around (for example, if we have plaintext letter y and shift three positions, it wraps around to b), we take the result mod 26, because we've got 26 characters in the alphabet. In a generalized Caesar cipher k can take any value: we can shift by four positions, k equals 4, or by ten positions, k equals 10. So k is the key. If I know my plaintext, I choose a key, the number of positions to shift by, and I encrypt one letter at a time to get some ciphertext. I can send someone the ciphertext, and if they know the key they can shift the ciphertext back k positions and get the plaintext. If they don't know the value of k, they don't know how many positions to shift to get the original plaintext. We saw at the end of last week that with only 26 possible keys it's quite easy for an attacker, a malicious user, to try all possible keys; one of them should give a plaintext message that we understand. That was a brute force attack.

For the decryption operation we take the ciphertext value, subtract the key, and take the result mod 26. There were a couple of questions at the end of the lecture last week: what is minus 24 mod 26? If we have c equal to 0 and subtract a shift of 24, we get minus 24 mod 26. What do we get? Two. So, some basic modular arithmetic. If we can write a multiplied by some integer, plus b, equal to c, with b between 0 and a minus 1, then c mod a is the remainder b. The number c can be a positive or negative integer. For our example, with modulus 26 and minus 24 mod 26, the way to look at modular arithmetic is that this equation must be true.
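The decryption step and the negative-number case can be checked with a small Python sketch. It relies on the fact that Python's `%` and `divmod` already return a remainder between 0 and 25 for a positive modulus; in some other languages the remainder operator returns minus 24 here, so the result would need adjusting. The function name `caesar_decrypt_value` is an assumption for this example.

```python
def caesar_decrypt_value(c: int, k: int) -> int:
    """Decrypt one letter value with p = (c - k) mod 26."""
    return (c - k) % 26


# The definition: 26 * q + r == -24, with the remainder r between 0 and 25.
q, r = divmod(-24, 26)
print(q, r)                 # -1 2
print(26 * q + r)           # -24, so the equation holds

# Ciphertext value 0 (letter a) with a shift of 24 decrypts to 2 (letter c).
print(caesar_decrypt_value(0, 24))  # 2
```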
26 times some integer, positive or negative, plus the remainder equals minus 24. That's another way to look at the definition of modular arithmetic. It holds true if that integer is minus 1: 26 times minus 1 is minus 26, plus 2 is minus 24, which is what we have here, so the remainder is 2. So be careful when you have negative numbers. Go back to the basic definition of the mod operator to see what to do with them.

We finished with a brute force attack, where I showed an example in which we tried all possible keys; one of them gave the plaintext and the rest gave random-looking strings that we couldn't make sense of. From that we guessed the key. So that's one type of attack. Let's go back and describe these types of attacks.

We've mentioned some of the terminology that we use with ciphers. With the Caesar cipher we substitute one letter with another letter. Which one do we replace it with? It's defined by the Caesar cipher as the one that's three positions along. So we use substitution in the Caesar cipher. Another form is transposition, where we rearrange letters. And a product system is when we combine them together. We went through this terminology last week.

Now let's look at what type of attacks we can perform on different ciphers, what cryptanalysis we can perform. The objective of an attacker is to recover the key and the message. If we can recover the message, the plaintext, then that's good. It's even better if we can recover the key, because if we have the key then we can easily find the plaintext, and we may be able to find subsequent plaintext messages in the future. I will often refer to this other user as an attacker. Maybe a malicious user.
It doesn't mean that they are a bad person. It means that's the entity that's trying to find out what the key or the plaintext is when we perform encryption. So, an attacker or a malicious user. What they want to do is find the key or the plaintext message. How can they do that? What approaches do they have?

They can perform a brute force attack, and that's what we saw with the Caesar cipher: try all possible keys. If you have the ciphertext and try to decrypt with all possible keys, then one of those keys should be the correct key and should give you a plaintext that you can understand. That assumes that you can detect the plaintext message. For example, if you expect the message to be in English, then one of those keys should give you a plaintext which is in English. The rest should be random-looking, not English words. So it assumes that the attacker knows what they're looking for in the message: not the exact message, but the structure of that message, what language it is, what format it is. If so, a brute force attack involves trying all possible keys on the ciphertext until some plaintext is obtained that we can understand.

The other type of attack is a more intelligent attack. We can think of brute force as a dumb attack, in that we just try all possible keys. A more intelligent attack is one where we take advantage of the algorithm that is used for encryption and try to deduce what the plaintext or the key is, to work it out without trying all possible keys. That's cryptanalysis.
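The brute force attack on the Caesar cipher can be sketched in a few lines of Python. This is only an illustration: the tiny word list `KNOWN_WORDS` is a hypothetical stand-in for "the attacker can recognize English", and the names `shift_text` and `brute_force` are made up for this example.

```python
import string

ALPHABET = string.ascii_lowercase

# Hypothetical, tiny word list standing in for "plaintext we can recognize".
KNOWN_WORDS = {"we", "attack", "at", "dawn", "the", "and", "of", "to", "in", "is"}


def shift_text(text: str, k: int) -> str:
    """Shift every letter of text by k positions (negative k shifts back)."""
    out = []
    for ch in text.lower():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + k) % 26])
        else:
            out.append(ch)
    return "".join(out)


def brute_force(ciphertext: str):
    """Try all 26 keys; keep the candidate with the most recognizable words."""
    best_score, best_key, best_plain = -1, None, None
    for k in range(26):
        candidate = shift_text(ciphertext, -k)
        score = sum(word in KNOWN_WORDS for word in candidate.split())
        if score > best_score:
            best_score, best_key, best_plain = score, k, candidate
    return best_key, best_plain


ciphertext = shift_text("we attack at dawn", 3)
print(ciphertext)               # zh dwwdfn dw gdzq
print(brute_force(ciphertext))  # (3, 'we attack at dawn')
```

The scoring step is the part that encodes the assumption from the lecture: without some way to recognize the correct plaintext, trying all the keys tells the attacker nothing.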
If we can find the key, then generally we can find out all the past messages which were encrypted with that key, and all future messages which are encrypted with that key. That's why finding the key is often desirable. If you find just the plaintext but not the key, then you've found the contents of just one message. If you find the key, you may find the contents of many messages.

We will see some examples of cryptanalysis through some of the classical ciphers, and we've already seen one example of a brute force attack. When it comes to real ciphers in use today, not the classical ones but the ones used in computers, we will not see any detailed examples of the cryptanalysis. It's quite complex, and I do not follow and understand even some of the attacks. But we can classify the attacks based upon what information the attacker knows.

Normally we assume the attacker, the malicious user, knows the encryption algorithm and the decryption algorithm, that is, the steps that are done to encrypt and decrypt. For example, if the Caesar cipher was used, the attacker knows that equation. We also assume the attacker knows the ciphertext. They want to find the key or the plaintext. That's the goal.

If the attacker knows just the algorithms and the ciphertext and nothing else, we call that a ciphertext-only attack. But their life can be a little bit easier if the attacker also knows other information, even if they don't know the key and the plaintext. A known plaintext attack is one where we know the algorithms, the ciphertext, and one or more plaintext-ciphertext pairs.
So in addition to the basic information, somehow the attacker has found that some plaintext, let's say P1, when encrypted with some key K gives ciphertext C1. The attacker knows the values of P1 and C1. That's what we mean by a plaintext-ciphertext pair. In the past, the normal user A has taken some plaintext P1 and their secret key K, encrypted that plaintext, and sent the ciphertext across the network, and the recipient B has decrypted it and got the plaintext. That's the normal operation. When that happened, the attacker could have intercepted the ciphertext, so the attacker knows C1. A known plaintext attack assumes that the attacker somehow also discovered P1, without knowing the key. So the attacker also knows the corresponding plaintext.

How can that happen if we don't know the key and cannot break the cipher? In some communications the message may become public after it has been communicated. For example, and it's maybe not the best example, a plaintext message is sent from one army commander to another saying the location where they're going to drop a bomb. One commander specifies the GPS coordinates, encrypts them, and sends the ciphertext to the other, and they drop the bomb. The attacker only sees the ciphertext; they don't know the original plaintext. But after that bomb has been dropped, the attacker knows its location, and therefore they can guess what the plaintext was, because they now know what the coordinates were. Before the bomb was dropped they didn't know, but after it's been dropped they can work out what the plaintext was. That's the concept at least.
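How much a single pair helps depends on the algorithm. For the Caesar cipher it gives the key away immediately, because C1 = (P1 + K) mod 26 can be rearranged to K = (C1 - P1) mod 26. A minimal Python sketch, with made-up function names and a made-up example pair (this attack was not worked through in the lecture):

```python
import string

ALPHABET = string.ascii_lowercase


def recover_key(p1: str, c1: str) -> int:
    """Recover a Caesar key from one known plaintext/ciphertext letter pair."""
    return (ALPHABET.index(c1.lower()) - ALPHABET.index(p1.lower())) % 26


def decrypt(ciphertext: str, k: int) -> str:
    """Shift every letter back k positions; non-letters pass through."""
    out = []
    for ch in ciphertext.lower():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) - k) % 26])
        else:
            out.append(ch)
    return "".join(out)


# Known pair: the attacker learned that plaintext 'h' was sent as 'K'.
k = recover_key("h", "K")
print(k)                     # 3

# The same key then decrypts other ciphertext sent with it.
print(decrypt("DWWDFN", k))  # attack
```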
So in some cases, some time after the ciphertext has been sent, you can think of its secrecy as having expired; it's no longer relevant, and in such cases it's possible to find out what the plaintext was. If an attacker knows pairs of plaintext and ciphertext, it can help them find the key, depending upon the algorithm. So in a known plaintext attack the attacker, or analyst, has more information than in a simple ciphertext-only attack. Generally, the more information the attacker has, the easier it is for them to find the key.

A chosen plaintext attack is similar, except the attacker somehow got to choose the plaintext message, and they find out the corresponding ciphertext. The attacker chose a plaintext message P1, somehow got the normal user to encrypt it with their secret key, and then intercepted the ciphertext. So the attacker knows both the plaintext and the ciphertext, and the advantage is that they can choose a plaintext which has some structure that helps them find the key. Again it depends upon the algorithm, but being able to choose the plaintext can be useful to the attacker. If the algorithm has some bugs or deficiencies, then maybe some plaintext combinations cause those errors to occur, and if the attacker can choose such a plaintext and find the corresponding ciphertext, they can try to work out what the key was. So with chosen plaintext the attacker not only knows the plaintext-ciphertext pair, but they chose the plaintext in that pair.

In practice, with real ciphers, these types of attacks may be possible only when there are many pairs, not just one. The attacker doesn't know just one plaintext-ciphertext pair but many different pairs.
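To see why choosing the plaintext can help, here is a hedged Python sketch against a simple letter-substitution cipher, where the key is a secret mapping from each letter to another letter. The `make_oracle` function is a hypothetical stand-in for "somehow getting the normal user to encrypt a message of the attacker's choice"; nothing here comes from the lecture slides. The attacker chooses the whole alphabet as the plaintext, and the single resulting pair reveals the entire mapping.

```python
import random
import string

ALPHABET = string.ascii_lowercase


def make_oracle(seed=None):
    """Hypothetical legitimate user: encrypts any message under a secret mapping."""
    rng = random.Random(seed)
    secret = list(ALPHABET)
    rng.shuffle(secret)                      # the secret key: a random mapping
    table = dict(zip(ALPHABET, secret))

    def encrypt(plaintext: str) -> str:
        return "".join(table.get(ch, ch) for ch in plaintext.lower())

    return encrypt


def chosen_plaintext_attack(encrypt):
    """Choose the alphabet itself as plaintext and read off the inverse mapping."""
    chosen = ALPHABET
    ciphertext = encrypt(chosen)
    return dict(zip(ciphertext, chosen))     # ciphertext letter -> plaintext letter


encrypt = make_oracle(seed=1)
inverse = chosen_plaintext_attack(encrypt)

intercepted = encrypt("meet me at noon")     # a later message under the same key
recovered = "".join(inverse.get(ch, ch) for ch in intercepted)
print(recovered)                             # meet me at noon
```

A known pair from an ordinary message would reveal only the letters that happened to appear in it, which is the sense in which a well-chosen plaintext gives the attacker more.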
The more they know, the easier it possibly is for them to break the cipher. When I say break the cipher, I mean find the key, find the secret.

The next level up is if they can choose the ciphertext and get the corresponding plaintext; that can be useful as well. And chosen text is a combination: they can choose a plaintext and get the corresponding ciphertext, and choose some ciphertext and find the corresponding plaintext. As we go down the list, the attacks toward the bottom are easier for the attacker, because they know more information.

When people evaluate the security of real ciphers, they often state their requirements with respect to these types of attacks. They may say this cipher is secure against chosen plaintext attacks, known plaintext attacks and ciphertext-only attacks, but it may not be secure against some chosen ciphertext or chosen text attacks. So you may often see ciphers classified in terms of their security with respect to these attacks. If a cipher can be broken with a chosen text attack, it doesn't mean it can be broken with a chosen ciphertext attack. Similarly, if a cipher can be broken with a chosen plaintext attack, it may not be breakable using a known plaintext attack, because the chosen plaintext attack assumes the attacker has more information. Another way to think of it: a cipher that is resistant to a chosen plaintext attack is more secure than a cipher that is resistant to just a known plaintext attack. So we can use these types of attacks to measure or rank the security of ciphers.

I don't have any specific examples of these attacks now; as we go through some ciphers later we'll see how knowing pairs, for example, can help. After we go through the real cipher DES, we'll see a case where, if we know multiple pairs of past plaintext and ciphertext, we can perform some attack.
We need to go through some other details first. So now we come to how we measure how secure something is. I give you a cipher; tell me how secure it is. There are two general approaches.

First there's the concept of unconditionally secure. A cipher that is unconditionally secure means that if the attacker has the ciphertext, they will never be able to find the plaintext or key. There are no conditions under which it can be broken. That's our desirable security level. Unfortunately there's only one known cipher that has that property. It's called the one-time pad, and we'll look at it through some examples: if we have the ciphertext, even with a brute force attack we cannot find out what the plaintext was. So that is our perfect level of security. The problem with the one-time pad is that it's very inefficient to use in practice. It's perfect in terms of security, but with computers and large messages it's very inefficient, so it's not practical to use.

If our cipher is not unconditionally secure, then we need some other way to measure how secure it is, and we talk about a cipher that is computationally secure. What does that mean? We say a cipher is computationally secure if either of these two hold. The first: the cost of breaking the cipher exceeds the value of the encrypted information. As an example, I use a cipher to encrypt the password for my bank account. Inside my bank account I have a million baht. You find the ciphertext, the encrypted password. Your goal is to decrypt that ciphertext; then you'll find my bank account password, and then you can steal all my money.
If you build a supercomputer, or use a large network of computers, to try a brute force attack on this cipher, and you spend two million baht on these new computers, run the attack, and finally find my password, then you can take all my money. That cipher would be considered computationally secure, because it cost you two million baht to find my password but you only obtained one million baht. There's no gain in it for you. The cost of breaking that cipher exceeded the value of the encrypted information, which in that case, my password, was one million baht.

The other aspect is from the perspective of time. If the lifetime of the encrypted information is shorter than the time it takes to break the cipher, then it's also computationally secure. Coming back to the army case: the military encrypt their plans for an attack which is going to occur tomorrow. The opposition come along and find the encrypted plans. If they can decrypt them, they can find out where the attack will occur and do something about it. But if it takes them five days to decrypt, it's of no value, because the attack occurs in one day's time. After five days that information is no longer useful. So if our encrypted information has a shorter lifetime than the time it takes to break the cipher, we also call that computationally secure.

The problem in practice is: how much is information worth, and what's the lifetime of information? It's very hard to estimate. In my two examples we could. I could estimate the value of the money in my bank account. What about my exam? What is the value of my exam when I encrypt that? It's hard to put a financial value on that.
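The two criteria can be written down as simple comparisons. The sketch below just restates them in Python using the numbers from the bank-account and military examples; the function names are made up for illustration, and the hard part in practice is estimating the inputs, not evaluating the comparison.

```python
def cost_exceeds_value(cost_to_break: float, value_of_information: float) -> bool:
    """Criterion 1: breaking the cipher costs more than the information is worth."""
    return cost_to_break > value_of_information


def time_exceeds_lifetime(time_to_break: float, lifetime_of_information: float) -> bool:
    """Criterion 2: breaking the cipher takes longer than the information is useful."""
    return time_to_break > lifetime_of_information


# Bank account: 2 million baht of computers to steal 1 million baht.
print(cost_exceeds_value(2_000_000, 1_000_000))   # True

# Military plans: 5 days to decrypt, but the attack happens in 1 day.
print(time_exceeds_lifetime(5, 1))                # True

# Either criterion holding is enough to call the cipher computationally secure.
print(cost_exceeds_value(2_000_000, 1_000_000) or time_exceeds_lifetime(0.5, 1))  # True
```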
The lifetime of the exam we can put a value on. I encrypt the exam and post it on a website. The lifetime of that exam is from now until when the exam is held. After the exam is held, that information has no value anymore. So in real life it's often very hard to estimate the value of information and the lifetime of information, and therefore it's very hard to know whether the attacker can find the information within its lifetime, or find it and obtain that value. It's also hard to estimate how much effort is needed to break a cipher; how long it takes depends on many different parameters. But we need some way to measure the security of systems, and this is what we have. Unconditionally secure: even with the ciphertext we cannot find the plaintext or key. But there's only one example that provides that level of security, so the other case, computationally secure, is important. It's hard to put numbers to it, though.

We saw the brute force attack last week. A brute force attack essentially depends upon the key space and the computing capabilities. It involves trying all possible keys, with the hope that one of those keys will be the right one and give you plaintext that you understand. This slide gives some examples depending upon the key size, measured in bits. A 32-bit key gives us 2 to the power of 32 possible keys. On average we only need to try half of them: sometimes we'll find the key quickly, sometimes we'll get to the end before we find it, so on average we only need to go halfway. This column gives how long it would take if we could try one million keys per second, and this column if we could try one trillion keys per second, which is one thousand billion keys per second.
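The numbers in such a table follow from a short calculation: half the key space divided by the number of keys tried per second. A minimal Python sketch, using the two rates mentioned above (one million and one trillion keys per second); the function name and the 365-day year are assumptions for the example.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def average_brute_force_seconds(key_bits: int, keys_per_second: float) -> float:
    """Average time to find a key: try half of the 2**key_bits possible keys."""
    keyspace = 2 ** key_bits
    return (keyspace / 2) / keys_per_second


for bits in (32, 128):
    for rate in (1e6, 1e12):
        seconds = average_brute_force_seconds(bits, rate)
        years = seconds / SECONDS_PER_YEAR
        print(f"{bits}-bit key at {rate:.0e} keys/s: {seconds:.3e} s ({years:.3e} years)")
```

With these assumptions a 32-bit key takes about 2147 seconds (roughly 36 minutes) at one million keys per second, while a 128-bit key at one trillion keys per second takes on the order of 10 to the power of 18 years.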
So it depends upon how fast your computer can decrypt. We see that a key length of 128 bits should be long enough: even at these speeds a brute force attack is going to take far more than thousands of years to find that key. So 128 bits or longer is one of the recommended key lengths in real ciphers.

With the Caesar cipher the key space was 26; there are 26 possible keys, and therefore it's very easy to perform a brute force attack, assuming that we can recognize the plaintext. When we decrypt we'll get 26 different plaintext values, and assuming we can recognize which one is the correct one, brute force works. That's commonly true: we either recognize the language, or, if it's some image or some file format, we or our computer can recognize that format. So it assumes that there's some structure in the plaintext, and pretty much all messages that we communicate in practice have some structure. They're not random bits. Even if we compress things, which effectively randomizes the information, we can uncompress and find the structure again.

So with the Caesar cipher, how do we improve against a brute force attack? Well, we could try to hide the encryption algorithm, to make it difficult for the attacker to know what the algorithm is. In practice that doesn't work very well. When it comes to real ciphers, when we need to encrypt information and send it to someone else, we need implementations of those ciphers in software or hardware. Someone needs to implement them, and therefore keeping the algorithm secret from the rest of the world is very hard. If I want to send someone an encrypted message, both of us need to know the cipher. Unless I'm going to implement it myself in software, usually we'll use a cipher that someone else has implemented, and
therefore someone else knows about that algorithm. So in practice the algorithms are usually public; only in very special cases are they secret. So that doesn't help us.

We could try to hide the structure of the plaintext, but again there are limited ways we can do that, and an attacker can easily find out what the structure was. Say I send you a message and, assuming you can understand Russian, I write that message in Russian instead of in English or Thai. The attacker who decrypts will get some plaintext values. How do they recognize the correct plaintext if they cannot read Russian? Well, they find some way to translate from Russian to English or to Thai, and they can find out. So usually the attacker can recognize the structure of the plaintext.

So the only real way to improve is to increase the number of keys, and that's what a general monoalphabetic cipher does: it increases the number of keys available compared to the Caesar cipher. With the Caesar cipher we just shift our letters along k positions. With a monoalphabetic cipher we have an arbitrary substitution of one letter for some other letter. Let's write it down and go through a detailed example.

We have our alphabet, the set of possible characters. With a monoalphabetic cipher we select a mapping from each possible character in the plaintext to one of the characters in the same alphabet. For example, I select that the letter a maps to f. So when I choose the key, I select a mapping: I choose b to map to one of the other 25 characters, c to one of the other 24 characters, and so on for the rest of them. I will not go through all of them, but we do that for every letter: we select a mapping from each of the letters in the alphabet to one of the other letters, and that
mapping is our key. So, this specific mapping: maybe I'll go through and see if I can do them all. Okay, not so easy, but I think it's close. That's just some random mapping that I've chosen from each letter to one of the other letters. Of course there should be 26 unique letters here; I should just check I haven't made a mistake and included one letter twice. That's one possible mapping, one key in this monoalphabetic cipher. Of course there are other mappings that are possible.

Now if we want to encrypt some plaintext, we just follow that mapping. Take the plaintext hello: the letter h becomes M, e becomes G, l becomes C, so we get C C, and o becomes I. So we have our ciphertext, MGCCI, and we can think of the key as this specific mapping. One way to write the key would be to write out this sequence of characters. Another mapping would correspond to a different key. For example, if at the end, instead of y mapping to a and z to q, it was the opposite, and all the others were the same, that would be a different key. It's a different mapping, and potentially I'd get different ciphertext: not for this plaintext, but for others.

So the number of mappings is the number of arrangements of letters we can have here, which is 26 factorial, as we mentioned last week. The first letter, a, can map to one of 26 possible values; b can map to one of 25 possible values, since it cannot map to the one that a maps to; c to one of 24; and so on. That gives us 26 factorial possible keys, which is about 4 times 10 to the power of 26, compared to 26 for the Caesar cipher. Now we've averted a brute force attack; a brute force attack is effectively impossible in this case.

What do you do to attack in this case? If we cannot do a brute force attack, what can we do? Divide by the number? No, there's no equation here; it's just
simply a mapping from one letter to some other letter. What can you do? You need to perform a more intelligent attack that takes advantage of some knowledge about the plaintext and the algorithm. Brute force will not work because it will take too long. In theory it will work, but in practice it takes too long with 26 factorial keys. So now we need to use some cryptanalysis.

An attack on a monoalphabetic cipher involves exploiting the characteristics of the plaintext and the corresponding ciphertext. In my example plaintext, hello, there were two l's, and in the ciphertext there are two C's, because with this specific key an l will always map to C. Whenever we have an l in the plaintext we'll always get a C in the ciphertext, and the same for the other letters. If we had a long plaintext, a message thousands of characters long, then for every instance of an e in the plaintext we'd see a G in the ciphertext. There's a one-to-one mapping between characters in the plaintext and the ciphertext, and we use that to determine, given some ciphertext, what the plaintext is.

We take advantage of the fact that normally there is some structure in our input plaintext message. Suppose it is an English phrase or an English document. In theory this works for a single word, but practically it works when we have a large number of words. If you take a large English document and count all the a's, all the b's, the c's and so on, and work out what percentage each forms of all the letters, then you'll most likely get some statistics like this. The most frequent letter in most English documents is e; around 12 percent of all characters in English are e. Of course it varies with the document, but if you take a large piece of text and count the number of e's you'll
get close to around 12 percent e's. These statistics are, I think, from the textbook; they used some large legal database, some large legal documents, counted all the characters, and found 12 percent of the characters were e. The next most frequent character was t, then we get the letters a and o, and then some others, and among the least frequent characters we see j, q, x and z. That's typical for most English documents, at least most large ones. By large I mean at least a paragraph; then you'll start to approximate these statistics.

What that means is that if our input plaintext has that structure, we'd expect our ciphertext to have those same statistics. I look at my ciphertext (it'll be longer than this one if we have a large plaintext) and I find its most frequent character. If, for example, it is G, and about 12 percent of the characters in the ciphertext are G, then I may make an educated guess that G in the ciphertext corresponds to e in the plaintext, because I expect the most frequent character in the plaintext to be e, at about 12 percent.

I can do that with other letters as well. I count the other letters and find the second most frequent character in the ciphertext. Maybe I find it's the letter S, occurring about nine percent of the time. Then I make the assumption that it most likely corresponds to the letter t in the plaintext, because the second most frequent letter expected in English is t. We keep going like that, and we can start to make some guesses as to what the mapping is from one letter to another, and we'll
start to work out what the possible mapping is for all 26 characters, that is, the key. Here we can use not just the frequency of letters but also, in English, the frequencies of pairs of letters, digrams, and the frequencies of triples of letters, trigrams.

Let's have a look at some examples. It would differ depending upon the size of the source and where the source documents are from. I've downloaded some text documents; they're some old books, I think. So, three text documents containing English. In fact they're quite old, 100-plus years old, so the English is maybe not as common as what we see today. They're about 600,000 characters. Let's count the frequency of letters in them. I've got some software that can do it, but it's not too hard. If I remember how to do this: the first one, and I'll sort by percentage. This crypto package is just some scripts that count the number of letters in the input text file and sort them by the highest percentage.

It counts them and gives us these percentages. There were in fact 470,000 English letters; some of the other characters were spaces, commas and other punctuation marks, so it only counts the English letters. 12 percent of those letters were E, 9 percent were T, 7 percent O, and so on. That's just the top 15 or so letters. We can do the same for the other documents; they're just different books. For this one, a completely different book from a different source, we also see around 12 percent E's and 8 percent T's, maybe a slight difference in the ordering, but you see similar frequencies of letters. And the last one: around 13 percent E's in this case, 8 percent T's. So the most frequent letter is easily the letter E in all three cases. When you have a large source of English text, most likely you'll have around 12 percent E's. Because our monoalphabetic
cipher has a one-to-one mapping from plain text letter to cipher text letter, we also expect that there'll be a letter in the cipher text which occurs about 12 percent of the time, the most frequent, and from that we'd assume it's the letter E in the plain text. We can also count pairs of letters, digrams: look at every pair of letters and see which pair occurs most frequently. In this case the pair TH occurs around four percent of the time. There are many different combinations of pairs, in this case HE, ER, IN and so on. So those are the digrams, and we can do it for trigrams, triples of letters. The most common in that source was THE, much more than the others: THE, AND, ING. You would recognize that those triples are most common in the English language. So if we had a long plain text and somewhere in there we had AND, and somewhere we had THE, not just the word THE but also as part of a larger word, then for example T would map to N, H to M, E to G. When we look at our long cipher text and count the trigrams, we'd most likely see that the sequence NMG would be the most frequent of all trigrams, around maybe three or four percent. If we find that the sequence NMG is the most frequent, then we make this guess, or estimate, that those three letters correspond to the plain text letters THE, and then we replace them in our cipher text and we start to see parts of the plain text, and we can eventually go through and work out what the plain text is. So by performing this analysis based upon the frequencies of letters, digrams, trigrams, the expected structure of the plain text, we can break a monoalphabetic cipher without performing a brute force attack. It doesn't just apply to English, of course. All languages have some structure; there are more frequent letters or characters than others, so it's not dependent upon the language. And not just words: even in other types of messages we can see some
structure in those messages, some expected structure. So we use the frequencies of letters, digrams and trigrams, and expected words, to analyze the cipher text to work out what the plain text is. Once we know the plain text, we know the mapping from cipher text to plain text; we know the key in this case. And once we know the key we can decrypt any cipher text which was encrypted with the same key, so we're successful as an attacker. So that's a problem with monoalphabetic ciphers: a brute force attack is impossible in practice, but some analysis is very easy. You can write software that will do it in a matter of seconds or minutes for a large text. Either today or next week I'll try and find an example that we can go through just to demonstrate that.
So how can we overcome this? The problem is that one letter always maps to the same letter in the cipher text. Whenever we have an E here, we'll always get G in the cipher text, and as a result the letter G in the cipher text will most likely occur 12 percent of the time, and from that we guess that it corresponds to the letter E. So we count the most frequent letter in the cipher text and then make the educated guess that it corresponds to the letter E, and we use that to make some further guesses about the subsequent letters. How do we get around that? One way: try to encrypt multiple letters of plain text at a time. Don't map one letter to one other; for example, consider a pair of letters and find a mapping. We'll see an example of that. And another: use multiple cipher alphabets, so we'll see polyalphabetic ciphers shortly as well. We'll come back to this one later.
Next cipher, an example of solution one: it encrypts multiple letters of plain text at a time. Monoalphabetic takes one letter, produces one letter of cipher text, encrypts one at a time; so does the Caesar cipher. The Playfair cipher encrypts a pair of letters at a time, two at a time. Let's explain it with an example, and the example is on this slide. We've got the answer;
we'll go through the steps to get that answer. We have some plain text that we want to encrypt. We have a keyword: the key is not just a number or a single letter, it's in fact an entire word in the Playfair cipher, a keyword. We're going to encrypt that plain text to find some cipher text, and if everything works in the example on the board we should get this cipher text, so that's the answer. So we're going to encrypt hello with the keyword thailand. What we do is write the keyword down in a five by five matrix. With English we have 26 characters, and in a five by five matrix we have 25 elements, so we need to combine two characters. In our case we'll treat the letters I and J as being the same letter; we'll see how that works shortly. So we write down our keyword in a five by five matrix, row by row, and we do not repeat letters. The idea is that this matrix is going to contain all letters of the alphabet once; it will never contain them twice, with the exception that we'll combine the letters I and J. We've got 26 letters; if we combine I and J we've got 25 letters. So we write down our keyword: T, H, A, I, L, A. Well, we already have the letter A in the matrix, so we don't write it again, and we move on to N and D. That's the first step. Then we fill out the rest of the matrix with the remaining letters of the alphabet in alphabetical order. We already have A, so B, C; we already have D; then E, F, G; we already have H and I, and we treat I and J the same, so J goes in the same element as I; then K; we already have L; and so on through the alphabet, skipping N and T, which we also already have. So that's our Playfair matrix. Written out row by row it is T H A I/J L, then N D B C E, then F G K M O, then P Q R S U, then V W X Y Z. I created it by writing my keyword followed by the rest of the letters in the alphabet such that each letter occurs only once, and we have this one special case of I and J: we treat them the same and write them in the same element. Now we use this matrix to encrypt our plain text. Our plain text in our example is hello. We're going to encrypt a pair of
letters at a time, so we don't do a single letter; we operate on pairs of letters. For the algorithm to work we need some special rules. First, we're going to operate on a pair of letters at a time, but if we have two letters in a pair which are the same character, which we do have in this example, L and L, we don't want them in the same pair; our algorithm will not work. So what we do is split them up by introducing some special character, and a common special character is one of the least frequently used characters, say X. So this was my plain text, and I will change it: HE is the first pair; instead of LL, I will separate them with an X, so LX, LO. I have three pairs now. I don't want two letters that are the same in one pair, hence we introduce this special character. Now I'm going to encrypt each pair in turn: encrypt HE and get some cipher text, encrypt LX and get some cipher text, and encrypt LO and get some cipher text. What we do is look up the pair in the matrix. Each letter of the pair is replaced by the letter in the matrix which is on the same row as the current letter and the same column as the other letter in the pair. Let's see how that works. HE: look at them in the matrix; we have H and E. The cipher text becomes L and D, because on the same row as H and the same column as E is the letter L, and then for the second letter in the pair, E, the same row as E and the same column as H is D. So we get the cipher text pair LD; the plain text was HE, the cipher text is LD. Then we do that for each subsequent pair. LX: what are we going to get? Try it. Same row as L and same column as X is the letter A; then the same row as X and the same column as L is Z. So AZ. And then LO; here's a special case. If both letters in the pair are in the same column, then we simply take the letters underneath. If we're in the same column then the
cipher text becomes EU, because we cannot do the same row and column, so we simply move down. So LO, the last pair, becomes EU. If they were on the same row, for example if we had plain text QU, we move to the right and we wrap around. So if our plain text was QU, the cipher text would be what? RP: from Q we move to the right to R, and from U we wrap around to P. Same column, move down; same row, move to the right; and in both cases we wrap around to the other end where necessary. And that's our cipher text. We've encrypted and we get LD AZ EU. So we would send that, and the receiver must also of course know the keyword. What does the receiver do? They know the keyword, they write down the matrix, and they get the same matrix. The receiver knows the cipher text, and, you can check, they can do the opposite to find the original plain text. When they decrypt, they would find the plain text is HELXLO. Then it's up to the receiver to do some intelligent thinking: what is the word HELXLO? Most likely it's not that; most likely it's hello, because here we have LXL. What's a word in English that contains LXL? I cannot think of any. So the receiver has to be smart enough to recognize that this X was introduced to split the two L's up, so they remove it and they get the original plain text hello. That works, I think, in most cases, because introducing that X should not give us a word that is common. So that's the Playfair cipher, and note it operates on a pair of characters at a time, as opposed to the monoalphabetic cipher that operated on one character at a time, and that makes it better against these frequency attacks of looking at the frequency of letters, digrams and trigrams. Any questions on how to do that? Once the matrix is set up it's quite simple. If A and X line up in the same column and you have to work out what AX becomes: BA. It moves down in the same column and we wrap around. Okay, and those rules are
described here, so there are some special cases. Treat I and J as the same letter so we can fit 26 letters in a 25 element matrix. A pair of letters in our plain text cannot be the same character, therefore we split them up with an X. Replace each letter with the letter on the same row and in the other letter's column; or move down if they're in the same column, and move to the right if they're on the same row. You should be able to encrypt and decrypt with the Playfair cipher. After the break, which we won't go to just yet, you will have some examples of encrypting with these ciphers, so you can do some examples after the break. It turns out that it's harder to break than the monoalphabetic cipher, but still possible with the same type of analysis, the relative frequencies of characters. So by still looking at digrams, trigrams and expected words, it's possible to break Playfair.
Something in the last five minutes: let's just introduce polyalphabetic ciphers. We had a monoalphabetic cipher where we used just one alphabet, one set of letters: A maps to F, B could map to any of the other letters not including F, so any of the 25 other letters, and so on. With a polyalphabetic cipher we can have multiple alphabets, a set of monoalphabetic ciphers. There are different ones: the Vigenère cipher we'll go through; the Vernam cipher we will not, it's in the textbook; and there are others; and we'll finish with the one-time pad, which is a simple extension really. Remember the one-time pad is the unconditionally secure cipher, the best one in terms of security. The Vigenère cipher is quite simple: it's simply the generalized Caesar cipher applied to each letter. Remember with a Caesar cipher we have a key k which can take one of 26 values; the key can be from 0 to 25 or, if we map them to letters, from A to Z. With the Vigenère cipher we just use a Caesar cipher but we use a different key for each letter. Let's quickly try before we have our break. Plain text letter S, sorry, I,
key S. What are the values? Remember the Caesar cipher: c equals p plus k, mod 26. These are integers. p is I; what's the value? Starting from 0, it's 8. S, the value of S, I can never remember: 18. So the first letter in the plain text is I, from internet technologies; the first letter in the keyword is S; the keyword is Sirindhorn here. So we apply the Caesar cipher on that plain text letter: 8 plus 18 is 26, 26 mod 26 is 0, and the letter for 0 is A. So we encrypt the letter I with a Caesar cipher where the key k is S, and we get the letter A as the output, uppercase A. Then we do that on the next letter, N, but we use a different key for our Caesar cipher; the key comes from the next letter of the keyword. N is what? N is 13, is it? And the key is I, which is 8, so we get 21, which hopefully is V. So we apply the Caesar cipher on each letter, where the key comes from the keyword. Very simple: it's just an extension of the Caesar cipher, but we change the key every letter. Originally, in the Caesar cipher, we use the one key for all letters; in the Vigenère cipher we change the key each letter. To determine what the key is, we choose a keyword, and for practical purposes, if I choose a keyword it may be shorter than the plain text message I want to encrypt. In this case I have a keyword, and to make it the same length as the plain text I just repeat the keyword as necessary. In this case I had to repeat it twice. If I had internet technologies and applications, then I'd have Sirindhorn, Sirindhorn, Sirindhorn, and it may just keep repeating so that we have the same number of key letters as plain text letters, and we just encrypt with the Caesar cipher, changing the key each time we move on to the next letter in the plain text. As a result there are multiple cipher text letters for each plain text letter. There's T, T, T here, three T's; the cipher text is K, H, K, different cipher text letters. For the E's the cipher text is M, L, R, V. With our monoalphabetic cipher and Caesar cipher,
when we have the same plain text letter we get the same cipher text letter, but now with our polyalphabetic cipher, the Vigenère cipher in this case, with the same plain text letter we don't necessarily get the same cipher text letter. So this frequency analysis is much harder to apply in this case, because the frequencies of letters in the cipher text are not the same as the frequencies of letters in the plain text. Much harder to break. The one-time pad is essentially the same, except we have a random key: instead of a known keyword, we choose a random key. Let's have a break now, and then after the break I'll give you some examples on those ciphers we've gone through. You'll do some examples and then we'll talk about the one-time pad.
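As a recap of the letter, digram and trigram counting used in the attack on the monoalphabetic cipher, here is a minimal sketch in Python. It is not the crypto package used in the lecture; the function name `ngram_frequencies` and the sample sentence are made up for illustration. It assumes, as in the lecture, that only the 26 English letters are counted and case is ignored, and it additionally assumes that n-grams may span word boundaries once spaces and punctuation are dropped.

```python
from collections import Counter

def ngram_frequencies(text, n=1, top=5):
    """Return the `top` most common letter n-grams as (ngram, percent) pairs."""
    # Keep only the 26 English letters and ignore case.
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    # Slide a window of width n over the letters (n=1 letters, 2 digrams, 3 trigrams).
    grams = ["".join(letters[i:i + n]) for i in range(len(letters) - n + 1)]
    total = len(grams)
    if total == 0:
        return []
    counts = Counter(grams)
    return [(g, 100.0 * c / total) for g, c in counts.most_common(top)]

# Hypothetical sample text; a real analysis needs a much longer source.
sample = "The weather in the north is colder than the weather in the south."
for n in (1, 2, 3):
    print(n, ngram_frequencies(sample, n, top=3))
```

Run on a long cipher text, the top entries for n = 1, 2 and 3 are the candidates to guess as E, TH and THE respectively.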
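The Playfair steps from the worked example (build the matrix from the keyword, split the plain text into pairs, then apply the column, row and rectangle rules) can be sketched in Python as below. The function names are hypothetical. Two details are assumptions that the lecture does not cover: a single letter left over at the end is padded with X, and Q is used as the filler when the letter being padded is itself X.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def clean(text):
    """Lowercase, keep only a-z, and fold j into i (they share one matrix element)."""
    return ["i" if ch == "j" else ch for ch in text.lower() if ch in ALPHABET]

def build_matrix(keyword):
    """Keyword first, then the remaining letters in order, each letter once (25 in total)."""
    matrix = []
    for ch in clean(keyword + ALPHABET):
        if ch not in matrix:
            matrix.append(ch)
    return matrix  # flat list; row = index // 5, column = index % 5

def make_pairs(plaintext):
    """Split into pairs, inserting a filler so no pair holds the same letter twice."""
    letters, pairs, i = clean(plaintext), [], 0
    while i < len(letters):
        a = letters[i]
        if i + 1 < len(letters) and letters[i + 1] != a:
            pairs.append((a, letters[i + 1]))
            i += 2
        else:
            # Doubled letter, or an odd letter at the end (assumption: pad it too).
            pairs.append((a, "x" if a != "x" else "q"))
            i += 1
    return pairs

def transform(pairs, matrix, step):
    """Apply the Playfair rules; step=+1 encrypts, step=-1 decrypts."""
    out = []
    for a, b in pairs:
        ra, ca = divmod(matrix.index(a), 5)
        rb, cb = divmod(matrix.index(b), 5)
        if ca == cb:        # same column: move down (or up), wrapping around
            ra, rb = (ra + step) % 5, (rb + step) % 5
        elif ra == rb:      # same row: move right (or left), wrapping around
            ca, cb = (ca + step) % 5, (cb + step) % 5
        else:               # otherwise: keep own row, take the other letter's column
            ca, cb = cb, ca
        out.append(matrix[ra * 5 + ca] + matrix[rb * 5 + cb])
    return out

def encrypt(plaintext, keyword):
    return " ".join(transform(make_pairs(plaintext), build_matrix(keyword), 1)).upper()

def decrypt(ciphertext, keyword):
    letters = clean(ciphertext)
    pairs = list(zip(letters[0::2], letters[1::2]))
    return "".join(transform(pairs, build_matrix(keyword), -1))

print(encrypt("hello", "thailand"))     # LD AZ EU
print(decrypt("LD AZ EU", "thailand"))  # helxlo
```

As in the lecture, decryption returns helxlo and it is left to the receiver to recognize and remove the filler X.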
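The Vigenère example can be checked with a short Python sketch. It applies c = (p + k) mod 26 to each letter, with k taken from the repeating keyword. The function names are hypothetical, and the keyword is taken here to be sirindhorn, which is an assumption: it is the spelling that reproduces the values quoted in the lecture (I and N encrypt to A and V, the three T's to K, H, K, and the E's to M, L, R, V).

```python
def vigenere(text, keyword, sign):
    """Shift each letter by the matching keyword letter; sign=+1 encrypts, -1 decrypts."""
    # Map keyword letters to integers with a=0, b=1, ..., z=25.
    key = [ord(k) - ord("a") for k in keyword.lower() if "a" <= k <= "z"]
    if not key:
        raise ValueError("keyword must contain at least one letter")
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    out = []
    for i, ch in enumerate(letters):
        p = ord(ch) - ord("a")
        k = key[i % len(key)]  # repeat the keyword as often as necessary
        out.append(chr((p + sign * k) % 26 + ord("a")))
    return "".join(out)

def encrypt(plaintext, keyword):
    # Cipher text shown in uppercase, following the convention on the slides.
    return vigenere(plaintext, keyword, 1).upper()

def decrypt(ciphertext, keyword):
    return vigenere(ciphertext, keyword, -1)

cipher = encrypt("internet technologies", "sirindhorn")
print(cipher)                         # AVKMEQLHKRUPEWYRNWVF
print(decrypt(cipher, "sirindhorn"))  # internettechnologies
```

With a one-letter keyword this reduces to the generalized Caesar cipher, for example `encrypt("y", "d")` shifts Y by three positions and wraps around to B.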