In last week's lecture, we introduced some concepts of encryption. Encryption takes plaintext and a key and applies an algorithm that transforms that plaintext into ciphertext based upon the key. The idea is that if someone intercepts the ciphertext and sees it, they cannot get the plaintext unless they have the correct key. So the algorithm must be such that, given the ciphertext and the algorithm, it is hard to find the key or the plaintext. We had a couple of examples of old ciphers, the Caesar cipher, I think, but that was just to illustrate some concepts. And we started to talk about attacks and keys, some types of brute force attacks, and we'll get back to that today. From the attacker's perspective, the goal is to find the plaintext or the key. They have the ciphertext and they have the algorithm, the cipher, so we assume the attacker knows the algorithm being used. They want to find the plaintext, what was encrypted, and the key, because the key should be secret. Now in some attacks against systems, the attacker may know more: they may know other pairs of plaintext and ciphertext, maybe discovered through other means, and that can help them against some algorithms. There are two basic approaches for the attacker. The first is a brute force attack: you are trying to find the key or the plaintext, and there is a limited set of keys, so try them all. If you try them all, one of them will be the correct key and will give you the plaintext that was encrypted. The other approach is a little bit more intelligent, cryptanalysis, where we try to take advantage of the characteristics of the algorithm, some weaknesses perhaps, to work out what the plaintext or key is. So we analyze the algorithm to try to derive the plaintext or key without having to try all possible keys.
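To make the "try them all" idea concrete, here is a minimal sketch of a brute force attack on the Caesar cipher. The function names and the sample message are assumptions made for this illustration, not something from the lecture slides. The attacker knows the algorithm and the ciphertext and simply tries every one of the 26 possible keys.

```python
import string

ALPHABET = string.ascii_lowercase  # 26 letters, so only 26 possible keys


def caesar_encrypt(plaintext, key):
    """Shift each lowercase letter forward by `key` positions; leave other characters alone."""
    out = []
    for ch in plaintext:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            out.append(ch)
    return "".join(out)


def caesar_decrypt(ciphertext, key):
    """Decryption is a shift in the opposite direction."""
    return caesar_encrypt(ciphertext, -key)


def brute_force(ciphertext):
    """Try every key in the key space and return all (key, candidate plaintext) pairs."""
    return [(key, caesar_decrypt(ciphertext, key)) for key in range(26)]


# Hypothetical message and key, chosen only for the demonstration.
ciphertext = caesar_encrypt("attack at dawn", 3)
candidates = brute_force(ciphertext)

for key, candidate in candidates:
    print(key, candidate)
```

One of the 26 printed lines is readable text, and that is how the attacker recognizes the correct key. With a key space this small the attack is instant, which is why the size of the key space matters so much for real ciphers.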
A brute force attack is simple but time consuming: if there are many keys, it takes a long time to try them all. Cryptanalysis is hard, but if you can find a weakness in the algorithm, it can be much more effective than a brute force attack. We mentioned a little bit about brute force attacks and the number of keys last week, and we will see that again, but before we go through more issues of attacks, let's go into more depth about symmetric key encryption. In symmetric key encryption, the model is that we take plaintext, denoted P, and we encrypt it using a key. The key is known by both the encryptor and the decryptor. So we say it's a shared secret key. The key must be secret: no one else can know it apart from the people encrypting and decrypting. And it's shared: both user A and user B have the same value. So there's symmetry between the two users with respect to the keys. That's why we call it symmetric key encryption: it's the same key on both sides. We encrypt the plaintext with this key K and get ciphertext C, we send the ciphertext to user B, and the algorithm must be such that when we decrypt using the same key, we get the original plaintext back. If we don't, then it's ineffective. The attacker's objective is to take the ciphertext, assuming they know the algorithms used for encryption and decryption, and find the plaintext or the key. They don't know the key, and they need to find it. This was, and still is, the most commonly used form of encryption for all types of data. There's another form, called asymmetric key encryption, where the keys used on either side are different, so there's asymmetry between the keys. We'll see that later in this topic, not today though. Symmetric is the main form used; the other one is used for some special purposes. We'll compare them after we go through the second.
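The symmetric model just described, encrypt P with K to get C and decrypt C with the same K to get P back, can be sketched as follows. This is only a toy: it XORs the data with a repeating key, which is not a secure cipher and is not how DES or AES work. The names `xor_bytes`, `encrypt` and `decrypt` are assumptions for this sketch.

```python
import secrets


def xor_bytes(data, key):
    """Toy cipher: XOR every data byte with the key, repeating the key as needed.

    XOR is its own inverse, so the same function both encrypts and decrypts.
    This is for illustration only and is NOT secure.
    """
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


encrypt = xor_bytes
decrypt = xor_bytes

# Shared secret key K: 16 random bytes (128 bits), known to both A and B.
key = secrets.token_bytes(16)

plaintext = b"Hello, this is a secret message."   # P
ciphertext = encrypt(plaintext, key)               # C = E(K, P), computed by user A
recovered = decrypt(ciphertext, key)               # D(K, C), computed by user B

# An attacker who guesses a different key does not get P back.
wrong_key = bytes(b ^ 0xFF for b in key)
garbage = decrypt(ciphertext, wrong_key)

print(recovered)
print(garbage)
```

The point of the sketch is only the symmetry: the same value of `key` is used on both sides, and decrypting with any other key produces something other than the original plaintext.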
For this to work, to be secure, it must be hard for the attacker to find the plaintext or key. Impossible preferably, but in practice impossible is not possible. That is, making an algorithm such that it's impossible in theory to find the plaintext or key is very hard or very inconvenient. So usually there are ways to measure the strength of the algorithm: think about how much effort it would take an attacker to find the plaintext or key. And again, this uses shared secret keys. Sender and receiver must both have the shared secret key. How do they get it? How does each user know the value of K? Let's say user A on the left chooses the secret key K, a 128-bit number chosen randomly. I use my computer to generate it. How do I get it to user B? We could write it down, and when we visit them we give it to them on a piece of paper. That's manual delivery of the key. It's possible, but not very convenient of course, especially if someone's on the other side of the world. But be careful: we assume that there is some way to distribute the key to the receiver B. So far we assume that A and B know some shared secret and no one else knows it, otherwise it's not a secret. We may see at the end of this topic that there are some ways to automatically distribute that key to B in a secure manner. Of course we can't just send them the key in an email or in some message over the network, because if we send the key unencrypted then the attacker could intercept it and see the key, and it's no longer secret. And we cannot encrypt the key, at least using this technique, because to encrypt the key we need another key, B must know that key, and we have the same problem again. So for now let's assume that they have somehow exchanged keys. Later we'll see a technique that we can use to do it automatically. And again, secret key means no one else knows it.
Within symmetric key encryption there are two main types, block ciphers and stream ciphers, and by definition they differ in how much plaintext they encrypt at a time. With all the algorithms in use today, we have some input plaintext, and what they all do is break that plaintext into chunks, usually fixed-size sequences of bits. With computers today we operate on plaintext and treat it as bits. So the difference between block ciphers and stream ciphers is how many bits or bytes they operate on at a time. What normally happens is that if we have a large file, two megabytes in length for example, and we want to encrypt it, the algorithm takes that file and splits it into blocks. In a block cipher the typical size of each block is 64 or 128 bits. Different algorithms use different sized blocks. In a stream cipher the algorithm usually encrypts one byte, or in some cases one bit, at a time. So eight bits or one bit. If we're using a block cipher and we have a large file, we break it into, say, 64-bit blocks and encrypt one block at a time: get some ciphertext output, encrypt the next block, and so on. It's similar in a stream cipher, but we encrypt one byte at a time. The practical difference is that the stream ciphers that have been developed are generally much faster in terms of implementation than block ciphers. Much faster in that you take your input plaintext and encrypting it happens in less time. So when we care about encrypting something in real time, stream ciphers usually make sense. For example, take two different cases. I have a file on my disk, a large file, and I want to encrypt it, so I apply an algorithm that encrypts it. Usually we'd use a block cipher. The time it takes to encrypt is not so important: whether it takes one second or two seconds probably doesn't matter for me.
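The splitting step can be sketched in a few lines. This is a minimal illustration under stated assumptions: the helper name `split_into_blocks` is made up for the sketch, the "file" is just two megabytes of zero bytes standing in for real data, and no encryption is performed. It only shows how many units each kind of cipher would work through.

```python
def split_into_blocks(data, block_size):
    """Cut `data` into consecutive chunks of `block_size` bytes.

    If the length is not a multiple of block_size, the last chunk is shorter;
    a real block cipher would typically pad it, which this sketch ignores.
    """
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


# Stand-in for a two-megabyte file (assumed here to mean 2 * 1024 * 1024 bytes).
file_bytes = bytes(2 * 1024 * 1024)

blocks_64_bit = split_into_blocks(file_bytes, 8)     # 64-bit blocks  = 8 bytes each
blocks_128_bit = split_into_blocks(file_bytes, 16)   # 128-bit blocks = 16 bytes each
stream_units = len(file_bytes)                       # stream cipher: one byte at a time

print("64-bit blocks: ", len(blocks_64_bit))
print("128-bit blocks:", len(blocks_128_bit))
print("stream units:  ", stream_units)
```

So for the same file, a cipher with 64-bit blocks runs its algorithm twice as many times as one with 128-bit blocks, and a byte-oriented stream cipher processes one unit per byte.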
Even one minute, if it's a very large file. Alright, I'd like it to be fast, but I can put up with a little bit of delay. So block ciphers are mainly used for file encryption. Stream ciphers were really developed for real-time streaming of data. For example, you're talking to someone across a network using voice over IP or some application that's streaming data in real time. As you talk, your computer converts your voice to bits, and those bits have to be sent straight away so that the receiver gets them within a short delay. There a stream cipher may make sense, because the bits that are generated are encrypted quickly and can then be sent. So it introduces a much smaller delay than a block cipher. Stream ciphers are mainly applied for streaming data across a network, where we want to do things quickly. So there are some differences in the algorithms between the two. But in fact nowadays, because computers are so fast and the algorithms have been improved, block ciphers are quite fast as well, in some cases almost the same speed as stream ciphers. We will list some examples of block ciphers, and you'll see them mainly in use, but we'll also mention some stream ciphers. Stream ciphers usually make use of an exclusive OR with a random number, but we will not get into those details. So just be aware that some people talk about block and stream ciphers to classify them. Let's look at some block ciphers. The main one that really became popular, designed in the mid 1970s, was the Data Encryption Standard. It was designed by IBM, the NSA had some input, and it was eventually standardised by what's now called NIST, the National Institute of Standards and Technology in the US.
It became popular because it was made a standard by the US government. That meant that many US government departments used it when they encrypted data, and hence many of the companies that needed to deal with the US government had to use this standard. It spread across the world and was probably one of the most widely used encryption ciphers in the world. In fact, many of the ciphers that have been built since then have similar concepts to DES, the Data Encryption Standard. What DES does is take our plaintext input and split it into blocks, and the block size is 64 bits. The algorithm itself just encrypts 64 bits at a time. So if you have a file, it splits it into 64-bit blocks, encrypts each one, and gets ciphertext out. One simple way to get the resulting ciphertext is just to combine the ciphertext from each of those blocks. You take the first 64 bits of your file, encrypt, and get 64 bits of ciphertext. You take the next 64 bits of the file, encrypt, and get another 64 bits of ciphertext. You keep going, and the resulting encrypted file is the combination of all those 64-bit pieces of ciphertext. There are in fact other ways to combine those output blocks, but we may see that later. DES is designed to take 64 bits of plaintext in and give 64 bits of ciphertext out. The key was actually 64 bits in length, but eight of those bits were unused for encryption, so effectively it's 56 bits in length. We're not going to go through the design. If you want to know the design you need to sit through my security and cryptography course for three or four lectures, but that's not what we'll go through today. What the design looks like is this. The algorithm specifies how you transform the plaintext to get ciphertext. This is just to illustrate that there are many components to the algorithm.
It's too small to see, but the red rectangle at the top takes 64 bits of plaintext in, and down the bottom here it produces 64 bits of ciphertext. The way this cipher works, and many block ciphers today, is that they use two common operations, substitution and transposition. We tried to illustrate them last week. With the Caesar cipher, we replace one letter with another letter: that's substitution. The other cipher we used, I think, was the rail fence cipher, where we wrote the letters in rows. That's transposition, where we rearrange letters. So even though we went through simple ciphers, DES uses the same concepts but on binary values. It combines operations of substitution and transposition. Transposition is rearranging. Another word for that is permutation: a permutation is an arrangement of a set of elements. The other thing that DES and many other ciphers do is take these simple operations of transposition and substitution and repeat them again and again to make the output ciphertext more secure. This is too much detail for this course, but the idea is that in this red rectangle there's a round, round one, which is this big green rectangle here, where we do some permutations and substitutions and get something out. Then in round two we just repeat that, taking as input the output of round one, and we repeat it again and again, with the idea that applying these operations multiple times makes it harder for the attacker to take the ciphertext and work back to the plaintext. DES uses 16 rounds of applying some complex operations: exclusive ORs, left shifts and so on. It's quite complex to go through, but many people consider the design to be quite secure. People have analyzed it and tried to find weaknesses in the design of this algorithm, and most people consider it to be quite secure.
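The idea of repeating substitution and permutation over several rounds can be illustrated with a toy cipher on an 8-bit block. To be clear, this is not DES: the block size, the substitution table, the bit permutation, the round keys and all the names below are arbitrary choices made for this sketch, and DES itself arranges its rounds differently. It only shows the pattern of mixing in a key, substituting, permuting, and doing that 16 times.

```python
# Substitution table on 4-bit values: an arbitrary rearrangement of 0..15.
SBOX = [0xE, 0x4, 0xD, 0x1, 0x2, 0xF, 0xB, 0x8,
        0x3, 0xA, 0x6, 0xC, 0x5, 0x9, 0x0, 0x7]
INV_SBOX = [SBOX.index(v) for v in range(16)]

# Bit permutation: output bit i is taken from input bit PERM[i].
PERM = [1, 5, 2, 0, 3, 7, 4, 6]


def invert(order):
    """Build the permutation that undoes `order`."""
    inv = [0] * len(order)
    for i, src in enumerate(order):
        inv[src] = i
    return inv


INV_PERM = invert(PERM)


def substitute(block, table):
    """Replace each 4-bit half of the 8-bit block using the table."""
    high, low = block >> 4, block & 0xF
    return (table[high] << 4) | table[low]


def permute(block, order):
    """Rearrange the 8 bits of the block according to `order`."""
    out = 0
    for i, src in enumerate(order):
        out |= ((block >> src) & 1) << i
    return out


def encrypt_block(block, round_keys):
    """Each round: XOR with a round key, substitute, then permute."""
    for rk in round_keys:
        block ^= rk
        block = substitute(block, SBOX)
        block = permute(block, PERM)
    return block


def decrypt_block(block, round_keys):
    """Undo the rounds in reverse order, with each step inverted."""
    for rk in reversed(round_keys):
        block = permute(block, INV_PERM)
        block = substitute(block, INV_SBOX)
        block ^= rk
    return block


# 16 hypothetical 8-bit round keys; a real cipher derives these from the secret key.
ROUND_KEYS = [(0x3A * (r + 1)) & 0xFF for r in range(16)]

plain_block = 0x48  # one byte of plaintext
cipher_block = encrypt_block(plain_block, ROUND_KEYS)
print(f"{plain_block:08b} -> {cipher_block:08b}")
print(f"decrypts back to {decrypt_block(cipher_block, ROUND_KEYS):08b}")
```

Each individual step is simple and easy to undo if you know the round keys. The security argument for real ciphers is that after many rounds, working backwards without the key becomes very hard, and an 8-bit toy like this one obviously gives no real security.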
That is, if you get ciphertext as output, it's very hard to work backwards and find the plaintext unless you know the key. The problem with DES is that the key was too small. With 56 bits, at the time, in the 1970s, computers could not try all the possible keys in a reasonable time. To try all possible keys: you have 56 bits in each key, so there are two to the power of 56 possible keys. How many is that? Well, start with two to the power of 50. Two to the power of 30 is about a billion, two to the power of 40 is about a thousand billion, so two to the power of 50 must be close to a million billion, one followed by 15 zeros, and two to the power of 56 is 64 times that. In the 1970s that was considered too many keys for a computer to try within a reasonable time. But as computers got faster, it started to become possible to build a computer that could try all possible keys, and if you can try all possible keys, you'll eventually find the correct key if you're the attacker, and you'll be able to decrypt. So DES became subject to a brute force attack. From that perspective it was insecure, but from the perspective of the algorithm it was considered secure. It's no longer recommended because the key is too short. We'll see examples of good key lengths shortly. When people realized that DES was maybe not so good anymore, they started to design new ciphers. One of them was triple DES, which basically used DES but used it three times, with a different key each time. So apply a DES operation three times with three keys: if one key is 56 bits, then three keys triples that to 168 bits, and a brute force attack on 168 bits is much, much harder, not possible in practice. The problem with triple DES is that it is three times slower than DES. And when DES was designed, it wasn't designed for speed on the computer architectures we have now, so today it's quite slow.
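The arithmetic behind "too many keys to try" is just the size of the key space divided by the rate at which keys can be tried. Here is a minimal sketch of that calculation. The trial rates of 10^9 and 10^12 keys per second are illustrative assumptions about a hypothetical attacker's hardware, and the function name is made up for the example.

```python
SECONDS_PER_HOUR = 3600
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR
SECONDS_PER_YEAR = 365 * SECONDS_PER_DAY


def seconds_to_try_all_keys(key_bits, keys_per_second):
    """Worst-case brute force time: the whole key space divided by the trial rate."""
    return 2 ** key_bits / keys_per_second


# DES: 56-bit key, assumed rate of one billion keys per second.
des_days = seconds_to_try_all_keys(56, 10**9) / SECONDS_PER_DAY

# Same key, but an assumed 1000 times faster attacker (or 1000 machines in parallel).
des_hours = seconds_to_try_all_keys(56, 10**12) / SECONDS_PER_HOUR

# Triple DES with three independent keys: 168 bits, at the faster assumed rate.
triple_des_years = seconds_to_try_all_keys(168, 10**12) / SECONDS_PER_YEAR

print(f"DES at 10^9 keys/s:         {des_days:.0f} days")
print(f"DES at 10^12 keys/s:        {des_hours:.1f} hours")
print(f"Triple DES at 10^12 keys/s: {triple_des_years:.2e} years")
```

At the first assumed rate DES takes roughly 834 days in the worst case, and at the second about 20 hours, which is why a 56-bit key is no longer considered long enough. Every extra key bit doubles the work, so 168 bits is out of reach by an enormous margin.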
So again, triple DES is available today and considered secure, but it's not so common because it's quite slow. Soon I'll show some examples on a computer of how long it takes to encrypt, so we'll put some numbers to "slow". In the 1990s, the US government set out to develop a new standard for encryption. There are others around, but because the US government creates a standard that everyone within the US government must use, it becomes widespread. So they created the Advanced Encryption Standard, AES. They actually held a competition where many people from across the world submitted their algorithms, people compared them to select the best one, and eventually AES was created. It uses similar concepts to DES: multiple rounds and multiple operations of substitution and permutation. It operates on a block size of 128 bits, so if you've got a one-megabyte file, it's split into blocks of 128 bits and encrypted one block at a time. It allows for different key sizes: if you are happy with a 128-bit key you can use that, but if you really want to be secure you can choose a longer key. The trade-off is performance: the longer the key, the slower it is to encrypt. It supports three different key lengths. AES is widely used today. In terms of encrypting data there are other algorithms, but one of the most widely implemented ones is AES, the Advanced Encryption Standard. If you use full disk encryption on your laptop, where your operating system encrypts the contents, or file encryption, usually they use AES.
Wi-Fi encryption makes use of AES, and many protocols used for communications across the internet use AES, so this is a widely used encryption standard. The purpose of this course is not to go through how they work. At this stage it's just to mention some of them, so that when you hear of AES you know it's a block cipher. There are many others. This slide just lists some names of different algorithms that people have designed over the years. They all have similarities in design, in that they use either 64 or 128-bit blocks, and they have different key sizes. This Feistel structure was a common design structure that was used for DES and continued in other algorithms. There's no need to know about all of them; they're just some examples. So for this course we're not going to study the details of the algorithms. What we're going to assume is that when we want to provide IT security, there are some symmetric key algorithms that are secure and that we can use and trust. AES is one, and there are others. And really the key points to know about symmetric key encryption are captured in these assumptions. In fact these assumptions and some others are in one of the handouts. They're on the slides, but there's also a handout a little bit later that lists them all in just two pages. I'll just bring it up so you're aware of it; you'll find it somewhere in your handouts. This is a two-page list of assumptions and principles that we're going to arrive at by the end of this topic on cryptography, and then we're going to use them as we go through the subsequent topics in this course. So when we talk about passwords, encryption in the internet, or denial of service attacks, we'll always come back to these assumptions so that we can make some arguments about what's secure and what's not. So have that one ready when we go through the next topics. So what are they, to get started? In symmetric key encryption we use the same secret key K for encryption, which we usually denote as the function E,
and decryption, the function D. Often that secret key, which is shared between the two users A and B, we'll write as K subscript AB. If it's shared between users D and C, then K subscript DC, just indicating that A knows the key and B knows the key. So that's the notation. We encrypt plaintext P with a key and that produces ciphertext. Using our notation, we apply our encryption function E, the inputs are a key shared between A and B and the plaintext, and the output is ciphertext C. When we decrypt that ciphertext, if we use the correct key, then we assume that it will produce the original plaintext. So we have some algorithm such that it will always produce the original plaintext, and the decryptor, the person who does the decryption, will be able to recognize that the plaintext is correct and therefore that the key is correct. So we write the decryption as: we take the ciphertext as input, we take the same key that was used for encryption, K AB, we apply our decryption algorithm D, and we get the plaintext. We assume that if we try to decrypt the ciphertext using the incorrect key, we will not produce the original plaintext; we'll produce something different. And more importantly, we'll be able to recognize that what's produced is not the correct plaintext, and therefore recognize that the key we've used is wrong. So let's say A and B have a shared secret key K AB. A encrypts using K AB and gets the ciphertext C. Then another user comes along, user X, who knows the ciphertext C. If they try to decrypt that ciphertext using a key other than K AB, some other value, then when they decrypt they will get the wrong plaintext. In fact we'll assume they'll get some random, garbage-looking plaintext, and they'll be able to recognize that this is not the correct plaintext. That's this point: if the decryptor uses the wrong key, one that wasn't used for encryption, they will not get the correct plaintext, and they will know that. So these are assumptions that we'll use as we
go through and apply the concepts of cryptography to build up security mechanisms. Any questions on those assumptions? Again, later, make sure you know where to look them up so we can be clear on how they're used in different security mechanisms. What are E and D? What are the algorithms? Well, for example, AES implements an encryption and decryption algorithm, as do DES, triple DES and many of the others. That's what we mean by E and D in this case. What is K AB? Well, with AES it's usually a 128-bit, 192-bit or 256-bit random number. So when I talk about a key now, we're talking about a binary value of some length, and usually when we choose a key we choose a random value. I don't choose a key of all zeros, because if I choose the key according to some structure then it's more likely that an attacker can guess that key. In the same way that you never choose passwords from words, you always choose random passwords. Correct? No? Alright, passwords are different, but in theory with passwords you should choose a random sequence of characters, because it's harder for someone to guess. Well, same here: we choose a random sequence of bits, because it's harder for someone to guess. Now in theory, choosing random passwords is good but not very convenient for us. With keys, it's not really us choosing them, it's the computer. A piece of software chooses the key, and it's easy to choose and easy to remember because it's usually saved on disk. So we usually use random keys for block ciphers and stream ciphers. We'll come to passwords in our next topics. So given these assumptions, what about the attacker? Well, we've said there are two types of attacks, brute force or cryptanalysis: try all keys, or try to find some weaknesses in the algorithm. Brute force means trying all keys in the key space. The key space is the set of all possible keys, so if I have a k-bit key then the key space is 2 to the power of k keys. That's all possible values. With a brute force attack, we usually measure how good the attack is based upon how long it takes, and the
time it takes to try all keys depends on two main factors: how many keys we need to try, 2 to the power of k if we have a k-bit key, and how fast our computer is. If I try 1 billion keys using my laptop versus 1 billion keys using some supercomputer, they'll take two different amounts of time. But usually when we talk about the strength of algorithms we don't look at the time. In fact we look at how many operations it takes: how many decryptions we need to apply, or really how many keys. So we usually approximate and say that the strength of this algorithm depends upon how many operations we need to do to defeat it, to break it with a brute force attack. That's purely dependent upon the key length. So we usually ignore the fact that different computers take different amounts of time and just assume that they're all the same. We'll see examples of the different speeds on different computers shortly. With cryptanalysis we need to find weaknesses in the algorithms, and that depends upon the specific algorithm. There are different methods and different analysis techniques that people take advantage of, meet-in-the-middle attacks and other named attacks, which we will not go through. But generally, with well-designed algorithms, it's hard to find attacks that will defeat them. For the algorithms in use today there are some known attacks, but in theory they're not much better than brute force, so they're not very practical. When we look at a cryptanalysis attack, we measure the strength of the attack, and therefore the strength of the algorithm, based upon a few things. How many operations it takes to defeat the algorithm, how many decryptions it takes for example. The amount of memory we need to defeat the algorithm: the more memory we need, the stronger the algorithm. And maybe the amount of information known by the attacker: the more the attacker knows from previous plaintext and ciphertext, the easier it is for the attacker. We'll see some numbers for them
shortly as well. On brute force attacks, this table puts some numbers to the time, given different theoretical computer speeds. You can ignore the last row here; for this course it's not relevant. I use it in a different example in my other course, so ignore the last row. The first column, in those first six rows, indicates the key length. For example, if we use DES, that's a 56-bit key. The key space in the second column means how many possible keys there are, and it's quite simple: a 56-bit key means two to the power of 56 possible keys. The normal version of AES has a 128-bit key, so two to the power of 128 possible keys. The last three columns say: if we have a computer that can try keys at a particular speed, how long would it take? We give three different examples of computers. For example, say we can try one billion keys per second; ten to the power of nine is one billion. If we have a computer that can do that speed, with DES it would take 833 days to try all possible keys. That's easy to calculate. If there are two to the power of 56 possible keys and we try ten to the power of nine keys per second, then we just divide and we get the number of seconds. We can try that. I don't know if my calculator will handle it. Actually, I do. Again I bring up my trustworthy bc calculator. Two to the power of 56, that's the number of keys. How many is it? That's how many. And if we can do one billion keys per second, ten to the power of nine keys per second, then that's how many seconds it would take us to try them all. Well, seconds: what can we do? Convert to minutes, divide by 60, convert to hours, convert to days: 833 days. So that's the calculation of the time. Sorry if it doesn't show very well there; I'll move that across. Sorry, the projector's not aligned. That's a bit easier to see: 833 days. So that's all it is, the key space divided by the rate, and that's this one. What if we had a faster computer, 1000 times faster, or we have 1000 computers? We distribute this
task across many computers, and that's easy to do, because with a brute force attack we try all keys and we don't have to try them in any particular order. We can try some keys on one computer and another set of keys on a different computer at the same time, so we can parallelize this problem quite easily. So the next column says that if we could do in total 10 to the power of 12 keys per second, then our 56-bit key would be down to 20 hours. Well, that's reasonable, if we have that processing power. And the next column is: what if we could do 10 to the power of 15 per second? Well, we're down to seconds. So 56-bit keys, and even 64-bit keys, are considered too weak for encryption today. How fast is your computer? Which column do you think it's near? Do you want to have a guess? For a laptop or a new PC today, and we'll see it depends a bit on the algorithm, it's less than 10 to the power of 9. So they're nowhere near here. We'll see shortly that on my laptop it's, I think, one million per second or tens of millions per second, not one billion. So a standard laptop is not going to break DES on its own. But if we buy some dedicated hardware that's programmed to break DES, and we spend a bit more money than on just a normal CPU, then we can do it, especially if we're a company or a government. So these 56 and 64-bit keys are considered too short to be secure. The smallest key length for AES is 128 bits. Even with 10 to the power of 15 keys per second, it would take you 10 to the power of 16 years to break it, so that's considered long enough. And similarly, increasing to 256 bits, 10 to the power of 54 years. Just as a note, the age of the universe is about 10 to the power of 10 years, so we're not going to do it before the universe ends. There's an error somewhere there: I think this five seconds is wrong. You can work out the exact value. It's not five seconds, I think it's five of some other unit of time. I haven't fixed that. Brute force is easy to do, but it is
easy to prevent by just making the key long enough, generally 100 bits or longer. Nowadays people use 128, and even 256 is commonly used. Most algorithms are slower with longer keys, so that's the trade-off: more secure but slower to encrypt. We'll go through how fast by looking at an encryption on my computer. That is, I'll use some software to encrypt something very simple, and we'll then get some measure of how fast it is in practice, just on my laptop, which is a few years old now. You can do the same, maybe later in some homework tasks. So I'll use some software to encrypt some message, and then we'll measure the speed and see how long it takes to encrypt. The commands I'm going to use are in one of the handouts, or linked to on the website, so you can see them and run them on your own. Some of your homework tasks will involve doing something similar to this. The software we're going to use to encrypt is called OpenSSL, and it's widely used, not just in the way we're going to use it: it's used by other software, so many web servers and many applications use OpenSSL to encrypt data. Now in the last few months there have been some flaws found in it, including a serious flaw that meant that people using it were revealing their keys or some secret information to other people. But it's still one of the most widely used encryption libraries available. We're just going to use it because it's easily available and it's something you can try on your own computer. So first I need some plaintext, and I'm going to copy some of the commands to save a bit of time. You'll know what they mean. I'm just going to create a file called plaintext.txt, and it's going to contain some text. The plaintext is our message here. That creates the file, and just to be sure, there's our message, the plaintext we want to communicate to someone. We're going to encrypt it. Just to get some details: the number of characters in this is 72 characters. How many bytes? Anyone want to guess how
many bytes is this file, this plaintext? Normally a computer will encode text characters as a byte each. One character, one byte: 72 bytes. That's the standard encoding, and we can check: it's 72 bytes. How many blocks if we use DES? Remember, our block ciphers take the plaintext and split it into blocks. How many blocks are we going to have to encrypt here? Two? Four? This is 72 bytes. How many blocks? Eleven? Twelve? Two? Four? I'll say when you get it correct. Look back at DES. Nine? Sounds okay. If we just jump back, DES operates on a block size of 64 bits, 8 bytes per block. So DES encrypts 8 bytes, produces 8 bytes of ciphertext, then applies the algorithm to the next 8 bytes of plaintext. AES works on 128 bits. I'm going to encrypt with DES just to get started. So with 72 bytes, and DES doing 8 bytes at a time, there'll be 9 blocks in this plaintext. We'll not see it, but what DES will do is take the first block and encrypt it using its algorithm, take the next block and encrypt it with the algorithm, and combine those output blocks together. Before we encrypt, maybe we should look at the plaintext. We said it's 72 bytes. xxd is just a program to look at a file in binary or hexadecimal form. DES and other ciphers all operate on bits, not on letters. The language doesn't matter; they just treat everything as a sequence of bits. First, that's the file in hexadecimal. For H we have 48 in hexadecimal. Not so important at this stage. Let's look at it in binary. Let me remember the command; you don't need to know these commands at this stage. There's the file, but in binary form. The first column just says where we are in the output, and then these columns are the binary form of this file. So the first letter, H in Hello, uppercase H, is represented by these eight bits. You can work that out: if you go back to ASCII encoding, you can look up the table and find the letter
uppercase H maps to, what's this, 72 I think in ASCII. Yes. That's all this is: just the ASCII encoding of these characters to binary. So what DES is going to do is take, what have we got, 64 bits at a time. Here we have 32 bits and another 32 bits. DES encrypts those 64 bits and gets some ciphertext, then it does the same for the next 64 bits and gets some more ciphertext, and at the end it just joins those blocks of 64 bits together, and that gives us our result.
We need to encrypt, and we have our plain text to encrypt. What else do we need? A key. Anyone want to choose a key? Well, we should choose a random key. We should let our computer choose a random key for us, so we need some way to choose a random number. There are different ways to do it. On my computer, this program OpenSSL actually has a random number generator that can output in hex. There's a random number, and another random number. How long is it? In hex, that's 8 bytes, 64 bits. DES actually takes a 64-bit key as input. Although we said the key is 56 bits, of those 64 bits only 56 are used, but we must still generate a 64-bit key. So that's our 8-byte key, and I'll use the first one, the one that starts with 57 and ends with 33.
The way the algorithms work when we have more than one block is that we join the blocks together. The simple way to join the ciphertext blocks together is just to concatenate them, but there are other algorithms to join them together. In fact, in practice most of them take a third parameter. So we have the plain text, a key, and a third parameter that initializes the algorithm for joining the ciphertext blocks together. We won't explain too much about it, but it's often called an initialization vector: something to initialize our algorithm. I'm going to use this second random number for that. Don't worry too much about that yet. So we've got the plain text, I've got a key, this 57 value, and I'll also use this EE value as the initialization vector. Again, you don't need to remember this; I'll just copy it. OpenSSL is what we'll use to encrypt. To encrypt, we
choose an algorithm; it supports many different algorithms. DES is the algorithm, and we must also specify the way that we combine the blocks together. Here we're using the basic technique called ECB, electronic code book. Again, not so important for you, not yet. What does -e mean? I can't remember; we may see later. Take my input plain text, produce some output ciphertext. So, so far: we're going to use the software called OpenSSL, encrypt using the algorithm DES, the input is my plain text, and the output is going to be a file called ciphertext.bin. I need to specify the key and this initialization vector. That's -iv, and I'll just copy and paste these values, and the key is -K. And another option: sometimes the software will add some padding in there, which allows us when we decrypt to detect if it's correct or not, but I don't want to do that here; it's of no use to us. So I need to add a special option, -nopad. It's just a feature of this software, not so important.
We encrypt. It's encrypted. We now have ciphertext.bin, and the length is 72 bytes. When we encrypt, we get ciphertext of the same length as the plain text, unless we introduce some padding in there, but we haven't done that. Let's look at the ciphertext. First we'll look at it in hexadecimal. There it is: the ASCII encoding and the actual hexadecimal values. We could look at it in binary; it shouldn't make any sense. The idea of encryption is to take some structured plain text, which makes sense, and convert it into something that appears random, such that when someone has that random-looking ciphertext they have no way to work out what the pattern of the plain text was. If you look at the sequence of hexadecimal characters, you shouldn't see any pattern in there. Anyone see any pattern? Sure? Okay, let me make sure I haven't done anything wrong. So the idea of encryption is to convert structured plain text into ciphertext that looks random, with no pattern in the output. Look at these hexadecimal characters. Any pattern? Hard to
tell, isn't it? Look closely. You can either look at the ASCII form, where the dots are the unprintable characters in ASCII, or at the hexadecimal form. In fact, look at the fourth row and the seventh row: they are the same. That's a problem, and that's a weakness in this encryption. When we have ciphertext which has a pattern (it's very unlikely for a random process to generate this number of identical characters), it gives an opportunity for an attacker to try and deduce what the plain text was. In fact, there's a weakness in how we encrypted this such that the output ciphertext does have a pattern, and it's to do with the way that those blocks were combined.
We encrypted 64 bits at a time. That's two rows, actually no, one row in hexadecimal. We encrypted and produced one block of ciphertext, then another block of ciphertext, and in total we produced nine blocks of ciphertext, 72 bytes. It turns out in this case the fourth block is the same as the seventh block. Why? Well, let's look at our plain text and you may see why. Why is the fourth block in the ciphertext the same as the seventh block in the ciphertext? Because the fourth block in the plain text is the same as the seventh block in the plain text. In our original message it was space, "secret", space. It just turned out that when we broke our message into blocks, we ended up with two blocks which were the same. The way that the algorithm works is that we encrypt a block and get ciphertext, encrypt the next block and get ciphertext. So if we encrypt two blocks which are the same, with the same key, we'll get the same ciphertext output, and that's a problem.
The problem can be overcome. We will not do it here, but it can be overcome by combining the output blocks in a different manner. In our case we just concatenated the output blocks together, but there are other algorithms, called modes of operation, that will combine them in different ways. In the command that I used, the way that I combined them was called
ECB. That's the basic way to combine the blocks. There are better ways; this one's insecure, but it's a very simple one. So that's just an aside: you need to choose not just the algorithm to encrypt, but also the algorithm to combine the blocks together after you encrypt, and there are some that are better than others.
What else can we say about this? Let's decrypt, just to make sure it works. I have my ciphertext. I'll decrypt: openssl enc, the same algorithm, DES with ECB, but specify the -d option. The -e option was for encrypt, and -d is for decrypt; I remember now. The input is the ciphertext. The output, let's call it something else, maybe the received message. That is, this is what the receiver does. The sender, user A, takes the message, encrypts it, and sends the ciphertext across the network. Then user B receives the ciphertext and must decrypt it. So we decrypt and we're going to get some output, and we need to use the same values for our initialization vector, let's hope I can find them, and for the key. So we must have them at the receiver. And -nopad, I think. Let's just look at the received file. Okay, it worked. That's good, our algorithm works. Of course, when we decrypt we must get the exact same plain text, which we have in this case.
But that's not so interesting. Let's decrypt again. I know it's a bit hard to see, but let's just change one thing: let's change the key. We encrypted with a key that ended with 33. Let's change it to 34 when we decrypt. What's going to happen? Can I decrypt? Well, we will be able to decrypt, but what's the output going to be? So here's what I've done. The -d means decrypt our ciphertext, and I'm going to write the result to the file called received2.txt. The IV is the same as before. The key is almost the same as before, except for this last digit, 4. When we encrypted it was 3; now I've changed it to 4 to decrypt. So I'm using a different key to decrypt. What's going to happen? Maybe the last bits of each block will be different? Okay, we've got a
different key here, so I think you know that you're not going to get the same plain text if we use this different key. Let's do it and have a look. We do that, and I'll open the file with our trustworthy xxd, received2. There it is. It's hard to see, but this is the received plain text after decrypting in this case. It's effectively random; it doesn't make any sense whatsoever. The dots here mean unprintable ASCII characters that we cannot display, like control characters. So we didn't just get some bits changed; effectively, the whole output changed. One small change in the key means nothing works: we decrypt and we get plain text which is really random with respect to our original plain text. And that's what we'll assume is always the case: if we use the wrong key, no matter how close it is to the original key, what we get when we decrypt will be recognizably wrong. I recognize that this is not the message someone sent to me, because no one sends me random messages. Sending people random messages communicates no information. So this decryption failed, and we know that because the message makes no sense. Most secure algorithms have that property: change just one thing in the key, or change one part of the ciphertext, and when you decrypt you'll get random output, recognizably wrong.
Questions on encrypting and decrypting? You don't need to remember all this code, not yet anyway.
So what's insecure about this approach that we just used? We should be able to recognize maybe two things which were wrong or insecure. We used DES with ECB, electronic code book. What's wrong with DES? In general, the key is too small. With DES the key, it's hard to see here, but if we convert it, it's 8 bytes, or 64 bits, and in fact only 56 are used. So if you go back to our brute force table, we can buy hardware that will break that within a short amount of time. So someone can guess my key, or calculate my key, by just trying to decrypt many
times: just change the key until they get output that makes sense, where there are some English words, and then they've found my key. That's the problem with DES.
The other problem, and we haven't touched upon it much and won't have time to: when we use block ciphers, the way we combine the blocks is important. In this approach we just concatenated them; that's what ECB does. The result was, if I bring back one of the earlier outputs, that when we have blocks which are the same in the plain text, like the fourth one was "secret" and the seventh one was "secret", we get identical blocks in the output when we encrypt. So the fourth one is this Q9W9, and the seventh is exactly the same. That's the problem with the way that we combined blocks. The algorithm, ECB, is the problem; there are better ones that we should have used. We're not going to touch upon those algorithms anymore, we won't have time, but if we just jump forward, where are they, they're described here: how to use block ciphers on large pieces of data. There are things called modes of operation. One is called electronic code book. It's very simple, but not secure. There are others which are considered secure in different situations, CBC, CFB and others. There are a few pictures on the next few slides, but we will not cover them. In a homework you may need to choose one, so just be aware that you choose not just DES or another encryption algorithm; you also choose a mode of operation. Usually there are some default ones which are automatically chosen, or I'll recommend one to choose which is secure.
Okay, in the last 15 minutes, some other practical things. How fast can I decrypt? Well, OpenSSL has a speed test built in. What it does is run a batch of operations and tell us how long they take. So what we're going to try is to see how fast my computer will decrypt or encrypt using AES with a 128-bit key, using the mode of operation called CBC, which is a common combination. What it does is, for three seconds, it does as many as it
can on blocks of different sizes, so 16-byte blocks, 64-byte blocks and so on, and at the end it will give us some summary statistics. Maybe I should zoom out so it's a bit easier. Sorry, it wraps around; I'll do it again. We'll just take one of these as an example. It's just doing as many as possible on my computer. Take this number first. There are some small differences between them, but this says that my computer can encrypt using AES at 61 megabytes per second. So if you have a 61-megabyte file, 61,000 K, you can encrypt that within one second. That's one measure of speed: how much data can we encrypt per second? If you have a large multi-gigabyte file, then it will take many seconds if we use AES.
The other way to look at it is how many encryptions per second. It doesn't come out so well, but maybe this number, what is that, 11 million? Really, 11 million encryptions in three seconds. It ran for three seconds and counted, and it could do 11 million in three seconds. So in one second that's a little bit less than four million. So my computer can do about four million operations per second. If I was going to try a brute force attack on a key, you can think of my computer as being able to try about four million keys per second. There's some variation, and encrypt and decrypt take about the same time, so that's a speed of about four million per second, 4 by 10 to the power of 6. Remember back to here: this was a calculation for 10 to the power of 9. So my computer is about 250 times slower than this column, which means if I tried to brute force DES it wouldn't take 833 days, what's that, a bit over two years; it would take hundreds of years on my laptop. But if I had a better computer it would be faster, and in fact if I had access to hundreds of computers I could speed that up by a factor of hundreds, because we can parallelize.
Let's look at other examples. My computer is not
designed, or rather the software is not designed, to be fast for encryption. There's actually a mode that takes advantage of the hardware. I have to enable it, with the EVP option, I think. Nowadays CPUs have a special instruction in the instruction set for AES encryption, so there are some hardware optimizations. So I'm doing the same test. What's this? Before I had, what, 68 or 61 megabytes per second; now it's up to 366 megabytes per second. So your CPU nowadays, in the last few years, includes an instruction to encrypt with AES, and AES only, so that is designed to be faster in hardware.
Another example: some people, of course, build dedicated hardware to defeat ciphers. Not a general purpose CPU, but hardware that's just for decrypting. I may have included this in your handouts, I can't remember. No, maybe a long way through. Okay, yep, it's there. It's just an example of some hardware people have developed over the years to try and defeat different ciphers. Back in 1998, the first people started to design hardware to defeat DES. In 1998 the EFF developed some hardware that cost less than 250,000 US dollars. That sounds expensive to us, but for an organization or a government that's not very expensive. It could do 80 billion keys per second on DES. My computer did something like four million keys per second on AES. This was a long time ago, and it could do 80 billion keys per second. So it was hardware dedicated to decrypting DES, and it could basically brute force DES in a couple of days. That meant that at that time DES was insecure; it could be broken by brute force.
But what about AES, which is widely used today? In 2006 a company, SciEngines, developed some hardware to try and brute force AES. They used FPGAs, so dedicated hardware just for operating on AES, and they used a bunch of them. They had something like 120 FPGAs on these boards here, single boards. They implemented them
and they could do something like 400 million keys per second per FPGA, so multiply that by 120 for the entire system. Actually, sorry, this one was on DES; the next one's AES. So they broke DES in eight days, but it only cost ten thousand dollars. Ten thousand dollars is not much for most people who want to try and break DES. So DES was truly broken at that time, at that speed, and quite cheap to break.
If we try and extend that, we come to Moore's law. What's Moore's law? Moore's law is the idea that computers get faster every year or two; that is, the number of transistors we can fit on a chip gets larger and larger. Roughly, it says that computers double in speed every one and a half years. It's not exact, but let's assume that today I buy a computer for 30,000 baht, and in one and a half years' time, for the same amount of money, I can buy a computer which is twice as fast as today's. That's what it says. Or the other way: today I buy a computer for 30,000 baht and it runs at some speed; in one and a half years I spend half as much money and I can buy a computer of the same speed. The cost halves every one and a half years for the same speed. Given that approximation, then today, or rather in 2013, as I haven't updated this in the last 18 months, it's quite easy to break DES: hundreds of dollars to buy hardware to break it.
What about AES? The same company, SciEngines, built dedicated hardware to brute force AES. It could do about 500 million keys per second per FPGA, it cost about $100 per FPGA, and this piece of hardware used 128 FPGAs, field-programmable gate arrays. There are some other statistics there. About half a billion keys per second: remember, the first column of our table was a billion keys per second, so each FPGA is about half that speed. Just to get to the main point: how long does it take, if they can do that speed, to break AES? Look at this column. AES has 2 to the power of 128 keys; that's the key space. They can do about 64 billion keys per second per $15,000, so
that is, the hardware costs about $15,000 and you can do 64 billion keys per second. If we extrapolate, then to break AES for $15,000 it will take us 10 to the power of 20 years. But if we spend more money we can buy more of these machines. If we spend $15 million, we increase the capacity by a factor of 1000 and therefore decrease the time by a factor of 1000: 10 to the power of 17 years. If we spend $15 billion, then 10 to the power of 14 years. So even a government that wants to spend $15 billion just to break AES is still going to take forever to do it. This is just illustrating that even with the most dedicated hardware, AES-128 is still secure against brute force.
Remember, computers get faster and faster. So if I use AES-128 today, what about in 15 years' time? Can someone break it then? Well, if you extrapolate again using Moore's law, you see that if you spend $15 billion in 15 years from now, in 2028, it will cut you down to just 100 billion years to break it. So again, it's still secure. And if you don't use 128-bit keys, because you're a little bit paranoid and want to use 256-bit keys, then it's going to take 10 to the power of 49 years to break. So AES is considered secure against brute force attacks. That's just something about the speed of computers in practice and brute force attacks: if we make the key long enough, we can defeat them, and we are not realistically subject to a brute force attack.
That almost finishes it. Those were some examples of how fast we can do it today. There are also cryptanalysis attacks on these ciphers. Brute force is one way, the dumb way; cryptanalysis is to look at the algorithm and try to find a better way, and people have come up with some better ways. Focusing on AES, just to give an idea: for a brute force attack on AES you need to try 2 to the power of 128 keys. There's an attack, in theory, that can cut that down to 2 to the power of 126 keys. So if you measure and compare the attacks against brute force, two to the
power of 126 compared to two to the power of 128 is about four times faster. So instead of, what do we have, a hundred billion years, it would be four times faster, maybe 25 billion years. Cutting it down by a factor of four means nothing; it's still secure. You can see details of these attacks in these references. The attacks in theory can be a little bit faster, but in fact they require extra resources to work: they require either extra memory, a large amount of RAM, or known data based upon past encryptions, past plain text and ciphertext pairs, which is impractical in most cases. So today AES is considered secure against brute force and secure against any publicly known cryptanalysis, and many people still use it.
To finish today: we assume that the attacker knows all algorithms that are in use. They know that we're using AES, and they know the implementation details of AES. We assume the attacker knows everything about the algorithms, and they know any parameters of the algorithm which are public. We assume the attacker can intercept any message sent across the network; anything sent across the network, let's assume the attacker can find it. But we assume the attacker doesn't know a secret value. If we say something is a secret value of the algorithm, like a secret key, then let's assume the attacker has no way to learn it through some other means. And let's assume from now on that brute force attacks are impossible if we use a reasonable size key, reasonable size being maybe more than 80 bits. So with AES-128, let's assume brute force is not possible. Later, when we talk about using these techniques to provide security mechanisms, you can't just say, oh, let's brute force it, because it will take you forever to do it. We'll use these assumptions as we go through the rest of the course.
Next lecture, tomorrow, we'll start from here. We'll look not at encryption but at authentication of data: proving who you're communicating with.
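The brute force estimates quoted in this lecture are simple arithmetic: divide the size of the key space by the number of keys tried per second. The following is a minimal sketch in Python, not something shown in the lecture. The function name brute_force_years and the constant names are made up for illustration, the rates are the approximate figures quoted in the lecture rather than new measurements, and the result is the worst case where every key has to be tried.

```python
# Sketch only: reproduces the lecture's back-of-envelope brute force estimates.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def brute_force_years(key_bits, keys_per_second):
    """Worst-case years to try all 2**key_bits keys at the given rate."""
    return 2 ** key_bits / keys_per_second / SECONDS_PER_YEAR


# Rates as quoted in the lecture (approximate figures, not re-measured).
LAPTOP_RATE = 4e6   # keys/second, from the laptop speed test
FPGA_RATE = 64e9    # keys/second for about $15,000 of FPGA hardware

des_on_laptop = brute_force_years(56, LAPTOP_RATE)
aes128_15k = brute_force_years(128, FPGA_RATE)
# Spending a million times more ($15 billion) buys a million times the rate.
aes128_15b = brute_force_years(128, FPGA_RATE * 1e6)
# Moore's law approximation: speed doubles every 1.5 years, so 15 years
# gives 10 doublings.
aes128_15b_later = brute_force_years(128, FPGA_RATE * 1e6 * 2 ** (15 / 1.5))

for label, years in [
    ("DES on the laptop", des_on_laptop),
    ("AES-128, $15,000 of hardware", aes128_15k),
    ("AES-128, $15 billion of hardware", aes128_15b),
    ("AES-128, $15 billion, 15 years on", aes128_15b_later),
]:
    print(f"{label}: {years:.1e} years")
```

On average the correct key turns up after trying about half the key space, so the expected time is roughly half of each figure. That does not change the conclusion: DES falls to modest hardware, while AES-128 stays far out of reach.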