Let's recap what we've gone through in the course so far and see where we are. We've spent a lot of time up until now looking at encryption, with two basic approaches: symmetric key encryption and, more recently, public key encryption or public key cryptography. So we focused on taking a plaintext message, encrypting it and sending the ciphertext with the aim of providing confidentiality, and that's a key thing for cryptography. We started with classical ciphers to show the concepts. We went through some block ciphers, DES, and spoke about AES; random numbers, as used in encryption and for other purposes in cryptography; and stream ciphers, which are still symmetric key encryption. Then we shifted to public key encryption. Remember that symmetric key encryption is generally considered to be fast, so algorithms like AES and triple DES are fast when we are encrypting data. They're considered secure, in that we don't have ways to break them in normal cases. But the problem with symmetric key encryption is that we need to have a key at both sides, A and B. So the problem is how do we get that key from A to B securely, and that led to public key cryptography: RSA and Diffie-Hellman. The advantage of public key cryptography is that we can encrypt with a public key and decrypt with just the corresponding private key. So long as we have the correct public key, and it can be made public, we don't have to keep it secret, then we don't have this problem of key exchange. But in fact, with the man-in-the-middle attack we just saw, there is still a problem with key exchange in public key cryptography. If someone gives you a public key, how do you know it is their public key, and not a man in the middle saying it's Steve's public key? That's what we saw with the attack on Diffie-Hellman. So there is still a problem with key distribution in public key cryptography. We also need some way to authenticate messages. Someone sends a message, I want to make sure it hasn't been modified.
Even if I don't care whether it's confidential, sometimes I'd like to send a message where I don't care if you read it; I just want to make sure no one modifies it along the way, or I want to make sure that it came from the right person. That's what leads to authentication. We will look at authentication in general and at one specific technique called message authentication codes, and then see another technique that relies on hash functions and see how hash functions are used to provide digital signatures. So the next two topics are about authentication. Then we'll return to some issues of key distribution: how do I get a key from A to B, some practical ways, some protocols for doing it? And the final topic will be a couple of examples of those techniques in some internet protocols. So we're moving to authentication. Although this topic is titled message authentication codes, first we'll talk generally about authentication. What are the requirements? We'll talk about how we could authenticate using symmetric key encryption, and then we'll introduce a different technique for authentication: message authentication codes, or MACs. And the next topic will look at a third technique using hash functions. If you remember back to some of the very first lectures... yes? All right. If we compare, say, symmetric key encryption like AES versus public key cryptography, which one is used more in practice? They are both used, because they are commonly used in conjunction with each other. A very common setup is that we use public key cryptography to exchange keys, and then, with those keys that we've exchanged, we encrypt the data using symmetric key encryption. The idea is that symmetric key encryption is fast, so when we have a large amount of data, use that. But to get that initial key, we use public key encryption for the exchange. So when you communicate with a website using HTTPS and you're downloading a large file or web page, that data needs to be encrypted.
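As a rough illustration of that setup, here is a minimal sketch in Python of a Diffie-Hellman style exchange that ends with both sides holding the same symmetric key. The prime `P`, the base `G`, and the function names are assumptions chosen for the example, not values from the lecture, and the parameters are far too small and unvetted for real use. The exchange is also unauthenticated, so it is exposed to exactly the man-in-the-middle attack mentioned earlier.

```python
import hashlib
import secrets

# Toy parameters (assumptions for illustration only): P is the Mersenne
# prime 2**127 - 1 and G = 3 is an arbitrary base. Real systems use
# standardized, much larger groups.
P = 2**127 - 1
G = 3

def make_keypair():
    """Return (private, public) values for one party."""
    private = secrets.randbelow(P - 3) + 2   # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public

def derive_symmetric_key(own_private, peer_public):
    """Compute the shared secret, then hash it down to a 16-byte key."""
    shared = pow(peer_public, own_private, P)
    return hashlib.sha256(shared.to_bytes(16, "big")).digest()[:16]

a_priv, a_pub = make_keypair()   # user A
b_priv, b_pub = make_keypair()   # user B

# Only the public values cross the network; each side derives the key locally.
key_a = derive_symmetric_key(a_priv, b_pub)
key_b = derive_symmetric_key(b_priv, a_pub)
print(key_a == key_b)
```

The resulting 16-byte value would then be used as the key for a fast symmetric cipher on the bulk data.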
Commonly today it's encrypted using AES, that is, symmetric key encryption. But before you download the web page, there's a key exchange where we use public key encryption. So they're both used. Can anyone name the six attacks? They're not specifically listed here. Remember the six attacks from the first lectures? We talked about passive attacks: release of message contents, that is, disclosing the message contents, and traffic analysis. Release of message contents: how do we stop that attack? Encrypt our data. That's simple. That's what we spent all the last lectures, the last topics, on. Before I send the data, I encrypt it such that if someone intercepts the message, they cannot decrypt it and get the message contents. So disclosure, or release of the message contents, how do we solve that? Encryption. Traffic analysis we haven't really talked about, and we won't much, but we can also use encryption to try to hide patterns, plus some other techniques like adding in some random messages to hide the patterns of our normal communications, or delaying messages and sending them at different times. The idea with traffic analysis was that the attacker observes the sequence or the frequency of messages. By modifying the frequency of our messages, we can hide that pattern. But remember the other attacks, the active attacks. Pretending to be someone else: masquerade. Modification: someone sends a message, and you modify it before it gets to the destination. What else was there? Masquerade, modification, replay attack: someone sends a message, you take it and send a copy later. And denial of service. So the first three we'll look at a little bit. Denial of service we will not look at in any depth in this course. If you want to learn about denial of service attacks, you need to sit through the IT security lectures, where we look at denial of service attacks in depth. So those are the active attacks, where the attacker intercepts a message and then modifies it, or pretends to be someone else.
Well, this from three through to eight gives some specific cases of those attacks. Masquerade pretend to be someone else. How do we stop that? We don't. We can't stop someone pretending to be someone else. What we do is that we use techniques such that if they do, we can detect that they have. So with the first two attacks or disclosure especially, we can stop the message being disclosed by encrypting. But for the active attacks, the normal approach is we can't stop the attacks, but we use techniques such that the receiver can detect if an attack has taken place. And for Masquerade, we use in general message authentication. We authenticate the message that was sent to us. Content modification. Someone sends a message. Someone in the middle modifies it along the way. How do we detect that? Authentication. Sequence modification. Someone sends five messages, one, two, three, four, five. And the attacker rearranges the order such that the receiver receives one, three, four, five, two. And that has some purpose for the attacker. How do we stop that? We use message authentication. We authenticate based upon the sequence number. Related to that is timing. Some very detailed attacks can be successful if the attacker can modify the timing at which messages are received. Maybe attacks on financial systems. So again, authentication is used. What's repudiation? Denying something. Source repudiation. The sender denies they sent the message. Destination. The destination denies they received the message. How do we stop that? We use authentication techniques, but the specific name of those techniques is referred to as digital signatures. We sign a message. So these attacks, three through to eight, we mainly use authentication. What does it mean, authentication? We send a message from source to destination. The destination, when they receive the message, they want to verify, usually two things, or one of the two. 
They want to verify that the contents of the message haven't been modified. We sometimes refer to that as data authentication. And they want to verify that the source of the message is who they claim to be. That's source authentication. You receive a message and it says it's from Steve. You want to verify: is it truly from Steve, or from someone pretending to be him? So we sometimes talk generally about authentication and sometimes specifically about data or source authentication. How do we do that? How do we provide authentication? There are different approaches. We can use symmetric key encryption in some cases. So the techniques that we've used so far to encrypt for confidentiality, we can also use to provide authentication. We'll show some examples in this topic. There's a new technique that we'll talk about, which is closely related to symmetric key encryption, referred to as message authentication codes, MACs for short. So we'll talk about those concepts. And then there's another approach that uses hash functions. Everyone knows hash functions from some computer science course or some data structures course. We'll return and talk about properties of hash functions and see how they can be used for authentication, often combined with public key encryption to get what we'll call digital signatures. The last two are in the next set of slides. The first two are in this set of slides. First we'll look at authentication using symmetric key encryption. And let's start with an example. Exit out of here. I've got a plaintext message that I created earlier this morning. It's just in a file called plaintext1.txt. It wraps around. So I've got this message in there. I'm not going to show you just yet. I'm going to encrypt it using symmetric key encryption and send it, and we'll look at what happens at the receiver under different attacks. First, how long is the message? 72 bytes. The file size is 72 bytes, so the message is 72 bytes. Just for simplicity, I'm going to use DES.
DES in ECB mode. First I'll encrypt that. DES, remember, uses a 64-bit key (of which 56 bits are effective key bits and the rest are parity), and we should choose a random key. Well, actually, I chose a key before to encrypt. I've already encrypted it. I won't show you how I encrypted, because then I'd show you the key. Okay, so I want to hide the key for now, but I've encrypted this and I created ciphertext1.bin. It's also 72 bytes. The way I encrypted it was, I'll show you the command: encrypt using DES in ECB mode. The input was the plaintext. The output was the ciphertext. It just wraps around. I specified an IV, and just to make things a bit shorter, the IV that I used was this value. I specified a key, and I won't show you the key. And I added the OpenSSL option -nopad. The idea here is that what I wanted to do is encrypt this 72-byte plaintext using DES in ECB mode. 72 bytes. How many bits in a block in DES? Maybe in your exam you had to remember the block size of DES. 64 bits. DES takes 64 bits at a time, or eight bytes at a time. It encrypts, and then encrypts the next eight bytes, depending upon the mode of operation. ECB is the naive mode of operation, where we just take one block, encrypt it, and we don't connect the blocks together. How many blocks in our plaintext? Not four. DES is 64 bits or eight bytes per block. How many blocks in our plaintext? Nine. We've got 72 bytes of plaintext, eight bytes per block, so there are nine blocks. Because I've got an exact integer multiple of the block size, 72 is nine times the block size, I don't need any padding. So I explicitly say no padding. If I had 73 bytes, I would need to pad out so that I build it up to 80 bytes. So just in this specific case, the reason I add -nopad to the command is that I don't want any padding. I don't need it. And I obtain ciphertext that is also 72 bytes. I send you the ciphertext. You receive the ciphertext and you decrypt. So let's try some cases of decrypting. I will not run that command.
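The block and padding arithmetic above can be checked with a few lines of Python. This is only a sketch; the function names are made up for the illustration. The last function assumes the common PKCS#7-style rule, where padding is always added, so an input that is already a whole number of blocks gains one full extra block.

```python
BLOCK_SIZE = 8  # DES block size in bytes (64 bits)

def block_count(length, block_size=BLOCK_SIZE):
    """Number of blocks needed to hold `length` bytes (ceiling division)."""
    return -(-length // block_size)

def padded_length(length, block_size=BLOCK_SIZE):
    """Smallest multiple of block_size that is >= length."""
    return block_count(length, block_size) * block_size

def pkcs7_padded_length(length, block_size=BLOCK_SIZE):
    """Length after PKCS#7-style padding, which always adds 1..block_size bytes."""
    return (length // block_size + 1) * block_size

print(block_count(72))          # 9 blocks, no padding needed
print(padded_length(73))        # 80: 73 bytes must be built up to 10 blocks
print(pkcs7_padded_length(72))  # 80: with padding enabled, 72 bytes grow by a block
```

The last line suggests why the ciphertext stays at 72 bytes only when padding is switched off.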
You receive the ciphertext, so now we decrypt. Add the minus D option to say let's decrypt. The input is the ciphertext that you received. The output, let's call it received one. We specify the IV. You know that, let's say it's public. And the key that you use to decrypt, what are you gonna use? Well, you don't know the key at this stage, I know it. But if you receive a message and you decrypt, or if you're an attacker and you receive a message and decrypt, what happens if you don't know the key? Well, brute force, well, you can sit there and brute force the 64 bits, maybe you can do it in a few days if you really try hard. But if I moved up to AES in 256 bits, you would never get it. So let's try a different key, all right? It needs to be 64 bits, I'll try a key. I know it's wrong, but let's try one. And no pad, what's going to happen now? Error, no pad is okay in this case because we have an integer multiple of the block size. If we didn't, we would need to use padding. Any errors? No, the software doesn't report any errors, why? It's okay, I can tell you the key is not right. Well, the software doesn't know whether or not there are any errors in the decrypt because it just applies the DES algorithm, decrypts with this key and gets 72 bytes as output. And that's one of the reasons or tricks of using no pad here. It means that there'll be no padding and there'll be no extra information added to detect errors. And we'll return to that later. If you don't use no pad, maybe the software will report error and I think you've done that or you've seen error if you've made a mistake using OpenSSL in one of the homeworks. But here it doesn't report an error. Let's look at the received ciphertext, sorry, the received plaintext. I'll just make it a bit smaller. There's the received plaintext. Do you think this is the correct plaintext? Hands up for yes. One person, one's got his hand on the head. I'll count that as up, two people. Hands up for no. Why would you say no? 
Why no? It doesn't make any sense, all right? That is, what do you see here? This is the hexadecimal form. Do you see any patterns in the hex? Not really, I don't, you have to look closely. This is the ASCII form of those corresponding hexadecimal values. When you see a dot, it doesn't mean it's the dot character. It normally means it's a non-printable character in ASCII. In ASCII, there are characters which are keyboard characters which can't be printed on the screen in a single character like backspace, escape, and so on. That's what the dots usually mean. But this doesn't seem to make much sense and I'll give you a hint that the message was in plaintext English. So here, we have some form of authentication. Why? You received a message, you decrypt, you realize it doesn't make sense. Therefore, something must have changed from when the user that encrypted it and between they've sent the ciphertext to you. You assume that the user that encrypted the plaintext had an original, had an English plaintext message. Therefore, when you decrypt with the correct key, you'll get that recognizable plaintext. Here, you don't. Therefore, you assume something's changed. What may have changed in this case? The key in this case. So we may guess that the key that was used is wrong because we don't recognize the output. Let's try again. Yep, right, right. So I think you recognize the question is about, well, we're assuming that the plaintext makes sense. What if we expect to get a plaintext that doesn't make sense? Correct, what if, so the question is maybe, how do we know if this makes sense or not? Someone has sent us something, we've decrypted, are we sure this doesn't make sense? Well, if I know it was an English message in plaintext, in text, then I recognize it doesn't make sense. What if it was part of an image, it was a small image? What do we expect to see if it was a small image and we decrypt? Do images have JPEGs, do they have any structure in them? 
Do you think an image, a JPEG, is just random bits? No, an image is not random bits. Images, for example, have some structure. Let's see if I can show you an image. So that's an important point. Here's an image, the SIT logo. It's a JPEG file, there's the image. Let's say that was the plaintext. That was encrypted, the ciphertext was sent, you decrypt, and you see the output. How do you know whether or not what you got from the decryption is correct? Well, let's see what a JPEG looks like. I use xxd to see the raw binary, or the hexadecimal values. It's large, so I'll just show it page by page. This is the JPEG file. Does it have any structure or any patterns? Well, there's a lot of one zeros, one zeros in hex at the start. And if we scroll through, so this is the binary form of that image, we see a lot of zeros here. There is some pattern in the JPEG. Why? Two reasons. We need to encode the pixels using some algorithm, and many pixels will be the same, so when we encode them, we'll get the same output. And also in images, normally at the start of the file, there is something to indicate that this is a JPEG and that it uses this particular JPEG encoding. So the first maybe 5, 10, or I don't know, 100 bytes may be a defined structure which is the same or very similar in every JPEG. So in fact, images do have some structure. In theory, when we decrypt, we can check whether the decrypted value contains that structure. What about other information? Videos are effectively images, so they're similar. What type of messages could someone send which have no structure? Steganography? Well, those have some structure. What messages would A send to B such that when B decrypts, the real value is random? What's the content of the authenticated message? I want to send you an email, all right, that has some structure, it has some letters in it. I send you an image, it has some structure. A video, the same.
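To make the idea of a defined structure at the start of a file concrete, here is a minimal Python sketch that checks decrypted bytes against the well-known leading signatures of a few image formats. The table and the function name `guess_format` are assumptions for the illustration. Checking only the first bytes is far weaker than validating the whole file, and the table has to grow with every format the receiver might expect.

```python
# Leading byte signatures ("magic numbers") of some common image formats.
SIGNATURES = {
    "jpeg": [b"\xff\xd8\xff"],
    "png": [b"\x89PNG\r\n\x1a\n"],
    "gif": [b"GIF87a", b"GIF89a"],
    "tiff": [b"II*\x00", b"MM\x00*"],
}

def guess_format(data):
    """Return the name of the first format whose signature matches, else None."""
    for name, prefixes in SIGNATURES.items():
        if any(data.startswith(prefix) for prefix in prefixes):
            return name
    return None

# Hypothetical decryption results: one starts like a JPEG, one does not
# match any known signature.
print(guess_format(b"\xff\xd8\xff\xe0" + bytes(16)))
print(guess_format(b"\x13\x37\xc0\xde" + bytes(16)))
```

A result of `None` does not prove the decryption failed; the plaintext may simply be in a format the table does not cover, or it may have no structure at all.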
Normally, we send messages which have some meaning, therefore they're not random. Think of a case when someone sends you a random message. Has anyone ever sent you a random message? No, we're checking, we're decrypting. So once we decrypt, if it's double encrypted, we double decrypt. We get the original, we get some plaintext. What type of plaintext would be random? Random plaintext: why would you send a random message to someone? A key, all right. So one simple case: I generate a key, a random value, and I send that encrypted. That's a case where, when we decrypt, we will not be able to recognize whether it's the original one or not. So many messages have some structure, such that when we decrypt, we can recognize if it's correct or not. But some messages don't. And the problem with relying on the structure is that somehow we have to check. I receive and decrypt a message, and I want to check: is this a JPEG? I need to check against the JPEG standard and see whether it matches the format or not. Maybe I need to check against a PNG or a TIFF image or a GIF image or all the other image formats. That becomes very complicated. So in many cases, the structure of the message can be used to determine if we've decrypted correctly, but in general that becomes very complicated and we need other ways to determine it. And that's what MACs are about: other ways. So I think the point, and you'll see it on the slides, is that in some cases with symmetric key encryption, we can detect if the decrypted ciphertext is correct or not. I'll use the correct key. I know the key in this case. So here I decrypted and I get a plaintext message that makes sense. So what does that tell me about the key I just used? It's the correct key. And in the previous case, when I decrypted and I got a message that didn't make sense, it told me that either the key used was incorrect or something else was wrong.
In the previous case, when I decrypted and I recognized that the plaintext was wrong, I know one of two things: either the key was incorrect, or the initial value, so maybe that's one of three things. So key, initial value, and often the key and the initial value go together or are kept secret together, or something else was wrong. What? All right, not the cipher mode; let's assume we know it's DES and ECB. The ciphertext was modified. Let's say I encrypted our plaintext and obtained ciphertext. That ciphertext was sent, but then modified along the way, such that ciphertext1.bin was not the original one. Even if the correct key was used, we'd get random-looking output. So in general, if we can recognize the output and recognize when it's wrong, then we know either the key was wrong or the ciphertext was wrong. That is, we decrypted with the wrong key, maybe the wrong key was used to encrypt, or someone modified the message along the way. We'll use that knowledge to provide authentication. But the point being raised is: how do we know whether this is correct or incorrect? In many cases it's possible, there is some structure in the message, but in practice it can be quite complicated to detect that structure. So we introduce extra information, and that's what MACs do. They add some extra information so we can be sure whether what we decrypt is correct or not. That basically covers this part on authentication using symmetric key encryption. So the example was, and we'll finish this just before the break: if we encrypt a message with symmetric key encryption and send the ciphertext, then when we decrypt, we expect to get a recognizable plaintext as output. The encryption provides confidentiality, but it also provides authentication, if we can recognize the correct plaintext. If the correct plaintext is obtained, it means that we've used the right key.
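The two failure cases, a wrong key and a modified ciphertext, can be reproduced without any external library using a toy block cipher. The sketch below is an assumption-laden stand-in, not DES: it is a four-round Feistel network whose round function is built from SHA-256, working on 8-byte blocks in ECB mode with no padding. It is not secure and only mimics the behaviour seen in the demonstration: decryption never reports an error, it just returns bytes.

```python
import hashlib

BLOCK = 8  # bytes per block, matching the DES block size

def _round(half, key, i):
    """Toy round function: hash of key, round number and half-block, cut to 4 bytes."""
    return hashlib.sha256(key + bytes([i]) + half).digest()[:4]

def _xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def _crypt_block(block, key, rounds):
    left, right = block[:4], block[4:]
    for i in rounds:
        left, right = right, _xor(left, _round(right, key, i))
    return right + left  # final swap lets decryption reuse the loop with reversed rounds

def encrypt(data, key):
    assert len(data) % BLOCK == 0, "no padding: input must be whole blocks"
    return b"".join(_crypt_block(data[i:i + BLOCK], key, range(4))
                    for i in range(0, len(data), BLOCK))

def decrypt(data, key):
    return b"".join(_crypt_block(data[i:i + BLOCK], key, reversed(range(4)))
                    for i in range(0, len(data), BLOCK))

# Hypothetical 72-byte English plaintext (nine blocks) and keys.
plaintext = b"Please transfer ten dollars to Steve before the lecture ends today.".ljust(72, b".")
key, wrong_key = b"rightkey", b"wrongkey"
ciphertext = encrypt(plaintext, key)

print(decrypt(ciphertext, key))   # the original message
wrong = decrypt(ciphertext, wrong_key)
print(wrong)                      # 72 unrecognizable bytes, and no error raised

tampered = bytearray(ciphertext)
tampered[0] ^= 1                  # attacker flips one bit in the first block
recovered = decrypt(bytes(tampered), key)
print(recovered)                  # first block garbled, the rest intact
```

One thing the sketch makes visible: in ECB mode only the block containing the modified bits decrypts to garbage, so a receiver who inspects just part of the output could miss the change.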
And if we've decrypted with key K, then there's only one other person in the world that could have encrypted with that key K: the one that I've shared the key with. If the key is shared between A and B, and B successfully decrypts with the key K, then the message must have been encrypted by user A. So we authenticate the source. We know it came from A. And we know nothing's been modified along the way, because if the ciphertext was modified along the way, then it wouldn't successfully decrypt. So assuming we can recognize the correct plaintext, symmetric key encryption provides authentication and confidentiality. But this assumption is an issue. In general, we can't always assume that. Sometimes we can't recognize the correct plaintext. I showed one example, we discussed a couple. This is another one, very quickly. If we decrypt the ciphertext here, and we get this plaintext, do you think the key is correct? Do you think we've got the correct plaintext? Probably yes, because you recognize that plaintext. If we decrypt this ciphertext and get this plaintext, do you think it's correct? Do you think this is the correct plaintext? What would you assume? Maybe not, because it just looks like random characters. It has no recognizable structure. So that's the concept. If we can recognize the structure, we can authenticate the source and the data. The same applies for binary data, but it becomes harder. Do we know if the decrypted value, this value, is correct or not? Well, I don't know. Is it part of an image? What type of image? A video or some other format? Or was it a key? It's hard to know if it's correct or not. Recognizing the correct plaintext is possible if there's some structure in the original plaintext. And many messages do have some structure. But sometimes automatically detecting that structure, having your computer detect whether it's correct or not, is very difficult.
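One crude way a computer might try to recognize text is to measure how many of the decrypted bytes are printable ASCII. The Python sketch below does that; the function names and the 0.95 threshold are arbitrary assumptions for the illustration. It also shows the weakness of the approach: a correctly decrypted random key would fail this test, even though nothing went wrong.

```python
import string

# Byte values that count as printable ASCII (letters, digits, punctuation, whitespace).
PRINTABLE = set(string.printable.encode("ascii"))

def printable_fraction(data):
    """Fraction of bytes in `data` that are printable ASCII."""
    if not data:
        return 0.0
    return sum(byte in PRINTABLE for byte in data) / len(data)

def looks_like_text(data, threshold=0.95):
    """Heuristic only: True when nearly all bytes are printable."""
    return printable_fraction(data) >= threshold

english = b"This message is ordinary English text, nothing more."
binary_like = bytes(range(128, 200))   # stand-in for random-looking output

print(looks_like_text(english))
print(looks_like_text(binary_like))
```

A heuristic like this has to be rewritten for every kind of message, which is the practical difficulty being described.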
To make it easier to detect, we sometimes add extra information to the message. For cryptographic mechanisms, that extra information is called a message authentication code. So we'll see after the break how we add a MAC to the message so that when we decrypt, we can be sure that we've got the correct plaintext.
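As a preview of the idea, here is a minimal Python sketch using HMAC with SHA-256 from the standard library. HMAC is one specific MAC construction and may not be the one presented after the break; the key, message and function names are made up for the example. The sender computes a short tag from the shared key and the message, and the receiver recomputes it and compares.

```python
import hashlib
import hmac

def make_tag(key, message):
    """Compute an authentication tag over the message with the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(key, message, tag):
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)

# Hypothetical shared key and message.
key = b"key shared by A and B"
message = b"Pay Steve 10 dollars"
tag = make_tag(key, message)       # sent along with the message

print(verify(key, message, tag))                    # unmodified message
print(verify(key, b"Pay Steve 99 dollars", tag))    # modified in transit
print(verify(b"some other key", message, tag))      # sender without the shared key
```

The receiver no longer has to recognize any structure in the plaintext: a tag that verifies is the signal that the message is intact and came from a holder of the key.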