 You don't have this, so I'll just ask you a few questions to check what we know so far about encryption. I'll state something up here, and in doing this we'll review the terminology and notation that we use with encryption. You have a sheet that lists the set of assumptions we're making, and that's what we want to arrive at. Let's go through some cases and try to answer what is known, or what is assumed, when something happens. I'll show you what happens and then we'll discuss the outcome.
The first case: at the start, some user, call them X, has some plaintext P1. They encrypt it and send the ciphertext C1, which is the output of the encryption, to Y. So X sends a message to Y. How was C1, the first ciphertext, obtained? In the notation we use, E is the encrypt function. It could be DES, AES, there are many different algorithms, but we say it's some encryption function. It takes two inputs: the plaintext, here P1 (we'll see different plaintexts as we go through), and a key.
What does KXY mean? If I have a quiz, maybe an in-class quiz next lecture or a Moodle quiz, how would you interpret KXY? It's a key, K for key, and the XY suggests that this is a shared secret key known by both X and Y. This is the notation we'll commonly use, and it suggests we're using symmetric key encryption. I say the function E, the encrypt function, but we now know there are two types of encryption: symmetric key encryption, where both sides use the same key, and public key encryption, where we encrypt with one key and decrypt with another. We'll use different notation to identify secret keys, K, and later public and private keys, PU and PR, just so it's clear.
So here we use symmetric key encryption, and we send the ciphertext to Y. What happens? Let's consider what happens if someone, Z, another user, intercepts that ciphertext and tries to do this operation. What can we assume, or what do we know, if they try that operation? They're trying to change the key? Not necessarily change the key, but you've noticed the key they use. What else is different here? Right: E means encrypt, D means decrypt. So Z intercepted the ciphertext C1 and tried to decrypt it. Maybe Z is attempting an attack to discover the plaintext, because it was confidential and they want to learn it. What happens when Z tries to decrypt it? What does Z know afterwards, and what do they not know? What's the outcome of this operation? Something goes wrong, correct. Why? Because the key doesn't match what was used for encryption.
When Z tries to decrypt the ciphertext, note that a different key is used here: Z decrypts using KXZ. Because the ciphertext was obtained by encrypting with KXY, something will go wrong when we decrypt, and we'll assume two things. First, Z cannot learn the plaintext P1; they will not get P1 as output. Second, Z will know that the decryption was incorrect, that the plaintext is not the original one. The decryption will probably produce random-looking output, and from that Z can recognize that they didn't get the plaintext. So two things we assume happen here: Z does not learn P1, and Z knows that it doesn't have P1, because the decryption didn't work. Say it returns an error, or it returns something that doesn't make sense; there are ways to ensure that will be the case. So why doesn't it work? Because the keys used to encrypt and decrypt are different.
Why did Z use a different key? Because they don't know KXY. If they did, it wouldn't be a secret between X and Y. I'm just highlighting the assumptions we make when we use symmetric key encryption.
Now the same message is sent: C1 is sent from X to Y. Y receives a message containing some ciphertext; let's denote it CR. C1 was sent, CR is received, and we don't necessarily know whether CR and C1 are the same. They may be, but maybe someone tried to modify the ciphertext along the way. Let's check. Y tries to decrypt the received ciphertext using KXY and gets some plaintext they recognize, maybe an English message. What does Y know now? Y knows the key used is the correct key: because we got recognizable plaintext, we know the decryption worked. It didn't go wrong this time.
So, from Y's perspective: we know we used the correct key, because the decryption worked. We know that the output plaintext is what X sent, and therefore the ciphertext received is what was sent. It hasn't been modified along the way; if it had been modified, it wouldn't decrypt. Because it does decrypt, we assume there's been no attack on it along the way. So we know two things. First, we obtain the real plaintext, and we know it is correct and unmodified. Second, the fact that the key used to encrypt was KXY tells Y that the message came from X. The only people in the world that know the key KXY are X and Y. So if user Y receives a message that was encrypted using that key, then unless Y sent it to themselves, it must have come from the other person, X in this case. And that's authentication: we know where the message came from.
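
The cases so far (Y decrypting with KXY, Z decrypting with KXZ, and a ciphertext modified in transit) can be sketched in Python. This is only a toy and not DES or AES: the function names `E` and `D`, the SHA-256 keystream, and the HMAC tag that makes "the decryption went wrong" detectable are all assumptions made for illustration, not something shown in the lecture.

```python
import hashlib
import hmac

def _keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256(key || counter) blocks. Illustration only."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def E(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt, then append a tag so a receiver can recognise a correct decryption."""
    body = bytes(p ^ k for p, k in zip(plaintext, _keystream(key, len(plaintext))))
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return body + tag

def D(key: bytes, ciphertext: bytes):
    """Return the plaintext, or None if the key is wrong or the ciphertext was modified."""
    body, tag = ciphertext[:-32], ciphertext[-32:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    return bytes(c ^ k for c, k in zip(body, _keystream(key, len(body))))

K_XY = b"secret shared by X and Y"   # hypothetical key values
K_XZ = b"secret shared by X and Z"
P1 = b"meet at noon"

C1 = E(K_XY, P1)
print(D(K_XY, C1))                   # Y, correct key: b'meet at noon'
print(D(K_XZ, C1))                   # Z, wrong key: None
C_R = bytes([C1[0] ^ 1]) + C1[1:]    # one bit changed along the way
print(D(K_XY, C_R))                  # Y detects the modification: None
```

Returning `None` plays the role of "unrecognizable plaintext": the receiver does not learn P1 and knows that it did not.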
So our assumptions on encryption are used to build up confidentiality and authentication. Now a slightly different case: Y receives the ciphertext and tries to decrypt, but the plaintext they get is unrecognizable. It's not an English message, it's random bits. What do we assume if that was the outcome? Something's gone wrong. If we get unrecognizable plaintext, the decrypt did not work. Why not? Because the inputs did not match what was used originally. Either the ciphertext was modified along the way, maybe by an attacker changing it between user X sending and user Y receiving, and a modified ciphertext will not decrypt correctly with KXY. Or the message was encrypted using a key other than KXY. If either of the two inputs is incorrect, decryption will produce unrecognizable plaintext and Y will detect an attack. So again, that provides authentication: if the attacker does try to modify something, we detect it at the receiver.
That's how we use symmetric key encryption to provide confidentiality, since no one can decrypt unless they have the key, and authentication, since if someone modifies the message the receiver will detect the modification. If someone tries to send us a message pretending to be X but doesn't have X's key, the receiver will detect that it came from someone pretending to be X, not from X.
Questions on symmetric key encryption? We need to understand the basic assumptions so we can move on. No questions; we've spent a couple of lectures on it. Then let's move on to the newer part. What happens here? New plaintext, P2. New ciphertext, C2. What's the difference? If I show you the equation at the top in a quiz, what can you tell me about what happened or what was used? What type of algorithm? Don't be shy today.
It's very cold, so there's no reason to sleep. You should be awake. What's PU? Public key. In the notation PUX, PU stands for public key, and this tells us that the encryption algorithm E is different from the previous one. Here we're using public key encryption, or asymmetric key encryption. That's the first thing to recognize. We're encrypting the plaintext P2 with PUX and we get C2.
User Y receives C2. What do they know? They know C2, they've received it; C2 is the ciphertext. Do they know anything else? If so, what? We have X, Y and Z as the users in these examples, and we assume that we always know the public keys of other people, by definition. So Y knows the public key of X, because it's public. What else do they know? Or what do they not know, and why? Does Y know P2? Someone's nodding their head. Everyone will have a vote. If you don't put your hands up, then you will get an extra question in the quiz. I recognize your faces, so I'll remember your names. Don't worry, two options, yes or no. Does Y know P2? Hands up for yes. Does Y not know P2? We have a few people voting. Good.
Why does Y not know P2? You voted no; why? They know C2. What about P2? Can anyone help them? It was sent from X, yes, and it was encrypted by X. Can Y decrypt C2? If they can decrypt it, then they will find P2, correct? But you said they can't find P2, and now you've explained to me that they can. The question is, can Y decrypt C2? With public key encryption, if C2 was obtained by encrypting using the public key of a user, then it can only be decrypted using the corresponding private key of that same user. C2 was obtained by encrypting using the public key of X.
The only way to decrypt that is using the private key of X; PRX, we would denote that. And the private key of X is known by whom? By X. Private means secret, known just to X. So Y does not know the private key of X. When Y receives C2, they cannot decrypt it because they don't have the private key of X. So in this case Y does not know P2. Some people voted for that but didn't have the right reasoning: it's because to discover P2 you need the private key of X, and only X has that. Remember, with public key encryption, if something is encrypted with one key in the pair, say the public key, it can only be decrypted with the other key in the pair, the private key of the same user.
What if X receives C2? Do they know P2? Yes. Why? Because P2 was encrypted with the public key of X, and whoever has the private key of X can decrypt. X has the private key of X, so they can decrypt and learn P2. So X knows P2; Y, if they receive it, would not. This provides confidentiality. To achieve confidentiality with public key encryption, if you want to send a secret message to someone, you encrypt using their public key. Let's say Z sent this message to X. Z encrypts the message using the public key of X so that only X can decrypt it, because only X has the private key. That's the role of public key encryption for confidentiality.
In most algorithms it works the same if we use the keys in the opposite direction. If you encrypt with the private key of X, you can only decrypt with the public key of X. So what happens here? Someone encrypted P3 using the private key of X and obtained C3. What if Y receives C3? What does Y know? Do they know P3? This is logic; there's not much algorithm detail to go through, just remember the rules. If you're saying no, why can't they know P3? What do they need to know P3?
P3 was encrypted with the private key of X. To decrypt the ciphertext, what do you need? You need the public key of X. Remember, the keys are in pairs: if you encrypt with one, you can only decrypt with the other. It was encrypted with the private key of X, so if you have the public key of X, you can decrypt. Who has the public key of X? Everyone; it's public. So when Y receives C3, they can decrypt and learn P3. So yes, they do know P3. There's no confidentiality here. The message is effectively public; anyone can see what it is.
What else do we know? If it successfully decrypts with PUX, what does Y know? This is an important role of public key encryption. Y knows the plaintext and who created it: it must have been created by X. If P3 was encrypted with the private key of X, and we can successfully decrypt the ciphertext with the public key of X, then the only person who could have created that message is X, because the only person in the world who has the private key of X is X. So this performs a form of authentication. When Y receives C3, they decrypt with the public key of X. They learn P3, but they also learn that P3 definitely came from X. No one else could have sent this message, because no one else could have encrypted with the private key of X. That's the role of public key encryption for authentication.
Going back, you see we're using the keys in the opposite order. For confidentiality, encrypt with the destination's public key. For authentication, encrypt with the source's private key. X encrypts the message with their private key and sends it to someone; the receiver can verify it came from X by decrypting with the public key of X. So we've got two roles of public key encryption. Did I miss a case? Maybe I have. Right, there's another one. Y received C3; what if Z receives C3? Maybe Z is malicious. What can they do?
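
The two directions of key use can be tried with a toy version of RSA. This is a sketch under loud assumptions: the primes are tiny, there is no padding, messages are plain integers smaller than n, and the helper name `apply_key` is made up for this example. It only illustrates the rule "encrypt with one key of the pair, decrypt with the other".

```python
# Toy key pair for X: tiny primes, textbook RSA without padding. Illustration only.
p, q = 61, 53
n = p * q                            # 3233, shared by both keys of the pair
e = 17                               # public exponent
d = pow(e, -1, (p - 1) * (q - 1))    # private exponent (modular inverse, Python 3.8+)
PU_X = (e, n)
PR_X = (d, n)

def apply_key(key, value):
    """Textbook RSA: the same modular exponentiation serves as E and as D."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

# Confidentiality: anyone can encrypt with PU_X, only X can decrypt with PR_X.
P2 = 1234
C2 = apply_key(PU_X, P2)
print(apply_key(PR_X, C2))           # 1234

# Authentication: only X can encrypt with PR_X, anyone can decrypt with PU_X.
P3 = 2021
C3 = apply_key(PR_X, P3)
print(apply_key(PU_X, C3))           # 2021
```

A successful decryption of C3 with PU_X is what tells the receiver, whether Y or Z, that the holder of PR_X created it.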
Same equation. Does Z know P3? Yes, because you need the public key of X to learn P3, and it's public. So yes, Z knows P3. Can Z verify it came from X? Yes, because if it decrypts with the public key of X then it was encrypted with the private key of X, so Z also knows it came from X. From Y's and Z's perspectives, they know the same. Everyone can verify this message, and everyone can see the contents.
We can combine the two forms of public key encryption to provide both confidentiality and authentication. Start with the inner part: take our plaintext message P4 and encrypt it with the private key of X. Let's say user X does this. Then take the output of that and encrypt all of it with the public key of Y. We get some ciphertext C4 and send it to Y. Y receives C4. What do they know? Work from the outside first, from Y's perspective. To decrypt the outer layer, Y needs the private key of Y. Y has that, so they can decrypt the outer layer. Then they can decrypt the inner layer, because that needs the public key of X, which is known by Y. So they learn P4, and they know it came from X because it was encrypted with the private key of X. This is a way for Y to receive the plaintext and know for sure that the message came from X.
What if an attacker, Z for example, receives the message? Can Z see the plaintext P4? No. Why not? The outer part is encrypted with the public key of Y, so Z needs the private key of Y to decrypt it, and Z doesn't have that because it's private to Y. So Z cannot decrypt the outer part, and therefore cannot see the inner part or the plaintext. So we have confidentiality: other users cannot see the plaintext. And we have authentication.
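
The layered construction, the inner layer under the private key of X and the outer layer under the public key of Y, can be sketched with the same kind of toy RSA. Everything here is an assumption for illustration: the tiny primes, the helper names `make_keypair` and `apply_key`, and the choice of a larger modulus for Y so that the inner result fits inside the outer encryption.

```python
def make_keypair(p, q, e=17):
    """Toy RSA key pair from two small primes. Illustration only, no padding."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))   # modular inverse, Python 3.8+
    return (e, n), (d, n)

def apply_key(key, value):
    """Textbook RSA: modular exponentiation with the given key."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

PU_X, PR_X = make_keypair(61, 53)   # n = 3233
PU_Y, PR_Y = make_keypair(89, 97)   # n = 8633, larger than X's modulus

# X builds C4: inner layer with PR_X (authentication),
# outer layer with PU_Y (confidentiality).
P4 = 42
inner = apply_key(PR_X, P4)
C4 = apply_key(PU_Y, inner)

# Y works from the outside in: PR_Y removes the outer layer, PU_X the inner one.
inner_received = apply_key(PR_Y, C4)
recovered = apply_key(PU_X, inner_received)
print(recovered)                    # 42
```

Z holds PU_X and PU_Y but not PR_Y, so Z has no key that removes the outer layer.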
The user that gets the plaintext can verify who sent it. So we can combine the two modes. Remember how public key encryption is used; it's quite simple. Encrypt with one key and you can only decrypt with the other key in the pair, and this works in both directions. If you encrypt with the public key you can only decrypt with the private key, and if you encrypt with the private key you can only decrypt with the public key of the same key pair. To provide confidentiality, encrypt with the destination's public key. To provide authentication, encrypt with your private key. This is very useful in many applications today, especially for signatures.
You don't have those slides; the idea was just to think about and remember those rules and then apply them to solve some problems. Any questions before we return to the slides?
Public key encryption: we have a pair of keys, and the rules are shown here. For confidentiality, encrypt with the public key of the destination; only the person who has the corresponding private key can decrypt. For authentication, encrypt with the private key of the source; we can verify who sent the message using their public key. If it decrypts with PUA, it must have been encrypted with PRA, which means it must have been sent by A. That's how we apply authentication, and that's about all we want to know about public key cryptography. There are a few more slides.
Where is it applied? Where is it useful? We can encrypt messages for confidentiality using public key encryption, but in practice public key encryption algorithms are generally much slower than symmetric key encryption algorithms. So if you want to encrypt a large message, you would use a symmetric key encryption algorithm.
So both can provide confidentiality, but symmetric key encryption is much faster. Public key encryption is very useful for authentication, for signing things: a digital signature. When we only need to encrypt a small amount of data, the time doesn't matter so much. We'll come to digital signatures on the later slides.
There are different algorithms. The same is true of symmetric key encryption; we mentioned AES, DES, Triple DES, and a list of others on one slide. For public key encryption, one of the most prominent algorithms is RSA, which is still used. There are some others, and the security of all of them depends upon the difficulty of solving some mathematical problem, like factoring large numbers into primes or computing discrete logarithms. We're not going to go into those algorithms, but they have completely different designs from the symmetric key encryption algorithms, so that we can use one key to encrypt and another key to decrypt.
So there are different algorithms. RSA is a common one. Later, in some topics, you'll see mention of things like Diffie-Hellman and maybe elliptic curve cryptography, primarily used for digital signatures, which we'll cover shortly, and also key exchange, which I'll also show an example of. The design of the algorithms and the requirements of the algorithms we're not going to cover. There are a few slides on the details of public key cryptography which we're going to skip over, and the details of RSA we will definitely not cover. RSA is one popular algorithm; it's not so hard to understand, but we will not cover it in this course. We will see examples of its use with software.
Regarding performance, public key cryptography algorithms are generally much slower than symmetric key cryptography. So when we have a large amount of data, or time is very important, we would encrypt using symmetric key algorithms. So what role do public key algorithms play? They are very useful for sending secret keys. Let's show an example of that before we summarize the assumptions.
Let's come back to our two users, A and B. A has a lot of data to send to B, and because encrypting with symmetric key cryptography is much faster than with public key, we'd like to encrypt this data, gigabytes of data, using a symmetric key algorithm. An example, a very common one, is AES, the Advanced Encryption Standard, so that's a specific symmetric key algorithm. To encrypt the data we need a key known by both A and B. Let's say A chooses a secret key KAB. What we're going to do is encrypt the data using a symmetric key algorithm, in this case AES, with the key KAB, and send the ciphertext to B. B would decrypt using which key? KAB. How does B know KAB? A chose KAB, a random number 256 bits long. How does B
know those 256 bits? In practice, how do we get this key from A to B? Send it to them? If we send the key to B unencrypted and someone intercepts it, they learn the key, so we can't send the key unencrypted. B needs the key, and A chose one. They haven't met each other before, they are on opposite sides of the world, and they need to send it across a network. So they must encrypt this key. How can they encrypt it so that they can send it to B? They could encrypt using another symmetric key encryption algorithm, but they would need another key to encrypt with. So what we do is encrypt the secret key using public key encryption. Let's see how it's applied.
We have another algorithm that we're going to use, a public key algorithm, and an example of a popular one is RSA. With that, every user has their own key pair. A knows PUA and PRA; they generate their own key pair, the public key and private key of A. B has a key pair, PUB and PRB. What else does A know? We'll assume that A knows PUB. How does A know PUB?
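
A minimal sketch of the idea introduced here, encrypting the secret key KAB with public key encryption and the bulk data with a symmetric algorithm, might look like the following. It is a toy under several assumptions: the RSA key pair is built from two well-known Mersenne primes rather than random primes, there is no padding, and a SHA-256 based XOR stream (the hypothetical `stream_xor`) stands in for AES, which is not in the Python standard library.

```python
import hashlib
import secrets

# Toy RSA key pair for B from two known Mersenne primes. Illustration only:
# real RSA uses randomly generated primes and a padding scheme.
p, q = 2**127 - 1, 2**521 - 1
n = p * q
e = 65537
d = pow(e, -1, (p - 1) * (q - 1))    # modular inverse, Python 3.8+
PU_B, PR_B = (e, n), (d, n)

def rsa(key, value):
    """Textbook RSA: modular exponentiation with the given key."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

def stream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher standing in for AES: XOR with a SHA-256 keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(x ^ s for x, s in zip(data, stream))

# A's side: choose a random 256-bit K_AB, send it under PU_B, then send the data.
K_AB = secrets.token_bytes(32)
C1 = rsa(PU_B, int.from_bytes(K_AB, "big"))   # small input, slow algorithm
data = b"a lot of data " * 1000
C2 = stream_xor(K_AB, data)                   # large input, fast algorithm

# B's side: recover K_AB with PR_B, then decrypt the data with K_AB.
K_received = rsa(PR_B, C1).to_bytes(32, "big")
recovered = stream_xor(K_received, C2)
print(K_received == K_AB, recovered == data)  # True True
```

An eavesdropper who captures C1 and C2 has PU_B but not PR_B, so in this model they cannot recover K_AB and therefore cannot decrypt C2.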
It's public. Maybe they posted it on a web page, or they put it in the signature of every email they send. Because it's a public key, it's not so hard to distribute, so we'll assume the public key of B is also known by A. There are some challenges in distributing public keys: someone could try to create a fake public key of B. We'll not cover that yet; we'll assume that A knows the real public key of B. Maybe towards the end of the course, when we look at web security and digital certificates, we'll see some ways to deal with fake public keys. But when we say a key is public, we assume anyone can know that value. Similarly, B knows the public key of A. So that's what's known up front.
We need to get the key KAB to B. What we can do is encrypt it to create some ciphertext, using a public key algorithm, and I'll just write the name here, RSA, to send it confidentially to B. What key will we use to encrypt it, so that no one else can see it? A has four keys to choose from there. Which one is it going to use? With public key cryptography for confidentiality, we encrypt with the public key of the destination, PUB. Using the public key of B we encrypt KAB, the secret value that we chose, and we send the ciphertext to B.
B tries to decrypt the ciphertext C1. What do they use as the key? It was encrypted with PUB, and they are B, so they use PRB. When they receive a message that was encrypted with public key encryption, they try to decrypt it with their private key. Who knows PRB? Only B, therefore only B can decrypt. It decrypts successfully because corresponding keys in the pair were used to encrypt and decrypt, so we get as output the original plaintext. What was the plaintext? KAB. Now B
knows KAB. An attacker who intercepted the ciphertext would not be able to decrypt it, because they don't have PRB, so an attacker cannot learn the secret key. Now that B knows KAB, the data can be encrypted using the symmetric key algorithm; say C2 is the second ciphertext. That is sent, and B can decrypt because B also knows KAB. When B decrypts the ciphertext using KAB, they should get the data as output.
So here we are combining symmetric key encryption and public key encryption. We use public key encryption to send the secret key KAB from A to B, and we use symmetric key encryption to encrypt the data. The reason we use symmetric key encryption for the data is that it is faster than public key encryption, so when we have a large amount of data this will be faster than using public key encryption for everything. That is a very common way to combine the two techniques: key exchange, or key distribution.
Any questions on symmetric and public key encryption? To be precise, where I used D, the decrypt function, I should have written D subscript RSA here and D subscript AES there. I encrypted the key using RSA as the algorithm and I encrypted the data using AES as the algorithm, just as an example of the different algorithms available. Any questions before we move on to the hard part? We will introduce some more complexity in a moment, so make sure we are clear on this, otherwise the next part won't make any sense. OK, no questions. I hope most people have got there.
So public key encryption in practice is commonly used for this purpose, key exchange. The benefit is that the key itself is usually quite small, say 256 bits. We encrypt a small amount of data using the slow public key encryption, and then we encrypt a large amount of real data using the fast symmetric key encryption. And the benefit is that we can use the public key encryption to exchange the
key, because the key used to encrypt is known by everyone. This all works on the assumption that A knows the public key of B. If the attacker could make A think it has the public key of B when in fact it is the public key of Z, then an attack can succeed; a man-in-the-middle attack may be possible. We will see that in a later topic when we look at web security.
The assumptions about public key encryption are listed here, as well as on the handout, and are similar to what we have said along the way. Decrypting with the right key will produce the original plaintext, and we will be able to recognize that it is correct. Decrypting with the wrong key, a key from a different key pair or not the opposite key in the correct key pair, will produce the wrong plaintext, not the original one, and the decryptor will recognize that it is wrong. These are similar to the assumptions we made before.
Let's look at the remaining topics. There are three topics here: key management, digital signatures and random numbers. We are going to focus on digital signatures. What is key management about? Key management is how to get keys from A to B. We just saw an example: how do we get KAB from A to B? We encrypted it with public key cryptography. That is one form of key management, but there are others, and there are issues of using different levels of keys, master keys and session keys, how long to use keys, and so on. We will not discuss key management in depth today; we will look at it a little when we come to passwords, and maybe when we return to web security, securing access to websites.
We will go through digital signatures. Random numbers: how do you generate a random number? If you need to write some software to generate a random number with your computer, you call the function RAND. What if I ask you to implement that function, and you don't have a library that already has an implementation? How would you implement this random function? What do you think that function does when you call RAND or the
random function in your programming language? There are two basic approaches. One is the common approach, where it just follows some algorithm, maybe does some calculations, to generate what we call a pseudo-random number: not really random, but close to random, because a computer cannot follow an algorithm to produce something truly random. If we follow a deterministic algorithm, the output is deterministic and can be predicted. But there are algorithms designed such that the output they produce is close to truly random. The other way, which is a bit harder, is to collect information from the environment: measure the background noise from electrical components, key presses and the time between key presses, the time between hard disk accesses, and so on. All of that information combined exhibits some form of randomness, and we use that to generate random numbers.
The point is that it is not easy to generate random numbers. Secure random numbers are ones that are hard for someone to predict, so there is a lot of study of creating good random numbers in a computer. It is important in security, and many practical flaws have arisen in random number generators, but for this topic let's assume that we've got some way to generate what we call cryptographically secure random numbers. We won't cover that in any depth.
What we'll do to finish this topic and this lecture today is look at digital signatures. Digital signatures combine public key cryptography, encrypting with a private key, with another thing, and for that we need to go back to hash functions. So we're going to jump back to hash functions to understand digital signatures. What's a hash function? You all know this; you've studied it in some early computing course. What are the properties or the characteristics of a
hash function? Where do you use it? Maybe you saw it in an algorithms course. A simple way to think of a hash function is as a function that takes one input, some message, and produces a usually small output, and that small output is generally considered unique; that is, in practice two different inputs will produce two different outputs. We'll look at the properties of hash functions and then see why they're important for security and digital signatures. So we're going to jump a few slides here; I'll try to get directly to hash functions.
Again, we won't look at specific hash functions. We'll just state the assumptions we're going to make and the principles behind them. A hash function takes a variable-length block of data and returns a fixed-size hash value, denoted by lowercase h there. So the hash function, uppercase H here, takes a message as input, of any length, and returns as output a fixed-length, usually small, hash value. The hash value is sometimes called a digest or even just a code. We'll often use uppercase H for the hash function and, a little confusingly, lowercase h for the hash value, or simply the hash.
The idea is that the function is designed such that when you hash different messages, the output produced is effectively random. If you hash two messages which are very similar to each other, the two outputs will be completely different; they'll look like random sequences of bits. We use hash functions for a number of cryptographic operations, so there are some functions designed especially for cryptography. We call them cryptographic hash functions, and they are designed to meet, in practice, some properties that we'll list here. For a cryptographic hash function, we'll assume it is computationally infeasible... What does that mean? Infeasible: not possible. Computationally not possible means maybe in theory it's possible, but in practice
our computers cannot solve it. We'll come back to that, but basically it is very hard to do the following. Given a hash function, if we know a hash value h, it should be very hard for someone, given just the value of the hash, to go back and find the original message. A hash function takes a message as input and returns a hash value as output; that's normal. A property of that function should be that if you know the output you cannot find the input. This is called the one-way property: it should be easy to calculate one way but not the other.
That is, if I know the hash function, the algorithm being used, and I know a message M, the data I want to hash, then it's easy to find the hash value as h = H(M): we apply the hash function to the message and get the hash value. It should be easy to do that. But if we know the hash function and we know a hash value h, it should be hard, impossible in practice, to find M such that H(M) equals h. That's one of the desired properties of this hash function. If we know the message we can calculate the hash, but if we just know a hash value we cannot go backwards and find the original message. We need a hash function which has this property, and there are hash functions that in practice exhibit it. It's called the one-way property because it's easy to calculate the function in one direction, the hash of M, but the inverse is hard.
Let me show you the output of a hash function. You want to copy that down? A little bit faster. OK, let's calculate the hash of some data to illustrate the idea. I have a message; it wraps around on the screen. The file contains the message, that's M, and I'm going to apply a hash function to it to calculate the hash value. There are different hash functions available, some of which you may have heard of. Does anyone know the name or the abbreviation of a hash function? You've probably
heard of it somewhere or seen it. MD5 is one hash function. You may see hash functions used for integrity checks or file checks. Another popular one is SHA, S-H-A, the Secure Hash Algorithm. MD5 is an older hash function, and there are a few others, but we'll mainly use those two in examples. So I have some software that uses MD5. The hash is sometimes referred to as a checksum, as in error detection, so the software is called md5sum. I'll just note the message is this; the dollar sign at the end is not part of it. md5sum calculates the hash of what's inside the file. So this is applying the hash function: H of the message returns a hash value, and the value is this.
OK, that is a hash value, given in hexadecimal. Do you see any pattern in those hexadecimal digits? It looks random, and that's the desired property of a hash function: you take some message which is structured, and it produces a random-looking hash value. MD5 does that. Can someone count them? How many hex digits? 32 hex digits. One hex digit is 4 bits, so 32 times 4 is 128 bits. MD5 takes a message as input and always produces 128 bits as output, and in practice it's a random-looking output.
The idea of the one-way property is that if I give you this value and ask you to find the original message, that's hard to do. If you have a secure hash function, it's hard to go backwards: it's easy to calculate the hash, but hard, given the hash, to find the message. Here's a hash value. So your challenge: given that, find the original message. I calculated it by hashing a small message, and with a secure hash algorithm used correctly, it would take you forever with current computing power to find the original message. Well, not forever, but it would take you too long, similar to brute-force attacks on ciphers. So a secure hash function means that if I give you this, you would not be able to, in practice, find the
original message. You can't go backwards. I know what the original message was: it was simply the word "security". But if you didn't know that, you wouldn't be able to find it. So that's the one-way property.
The other property concerns two different messages, M1 and M2. If we apply the hash function to both of them... let's go back. It's hard for someone to find two messages which have the same hash value; it's practically impossible that two different messages have the same hash value. A variation of that is: if I give you one message M1 and its hash value, your challenge is to go and find another message, one that makes sense in practice, that has the same hash value. So a secure hash algorithm will have this property, that it's practically impossible to find two messages with the same hash value. That is, it is collision-free: for practical purposes there are no collisions between the hash values.
Back to our example again. Message one: I give you this message, and you know the hash value; that's easy to calculate. Your challenge then is to find another message, different from this one, that produces the exact same hash value. The collision-free property says it's practically impossible to do that. If you have a secure hash function, you cannot find another message which produces 91D and so on. You could try; that is, you could do something like a brute-force attack and try many messages, but it would take too long to find one.
Similarly, if you have two different messages, we said we produce random outputs. Are message one and message two the same? They're different: after "please", the full stop is in message one; it's not in message two. So they're slightly different, almost the same, differing by a single character. The hash function works on the binary form of the message, not on the text. So when we calculate the MD5 sum, the MD5 hash, of message two, what do you think we'll get? 1D2FE and so on. Applying a hash to different inputs should
produce random-looking outputs; even if the two inputs are almost the same, the outputs should be completely different. There's the hash of message two. Compare the hashes of message one and message two: the two inputs are almost the same, but the two outputs are effectively random. There's no correlation between those two output hash values. So that's another property of hash functions: two different inputs, no matter how small the difference, will produce two different, random-looking output hash values.
Now, some algorithms have weaknesses, such that it's not always true that you cannot find another message that produces the same hash value. Those algorithms are considered insecure, and MD5 is considered insecure today. So there are other hash algorithms. SHA, the Secure Hash Algorithm, has gone through different variations; there are different versions of it. SHA-1 is a different algorithm; it produces a hash value of, I think, 160 bits. There's also SHA with a 256-bit output, and there's SHA-512, and so on. So there are different algorithms, and some have weaknesses with respect to security: MD5 is considered insecure, SHA-1 also has some limitations, and SHA-2 and beyond are considered secure. But from our perspective, they produce outputs such that, given just the output, it's practically impossible to find the original message (the one-way property), and given one message, it's practically impossible to find another message M2 that produces the same hash value (the collision-free property). We'll assume there are hash functions that have those properties.
Let's now see how they're used in cryptography. They're used in different cases, but we'll illustrate their use for a digital signature. This shows some examples of hash functions used with symmetric key encryption. It mentions that MD5 is one hash algorithm; it generates a 128-bit hash and is no longer considered secure, since there are attacks possible. SHA was the newer algorithm, and it's gone through different variations: SHA-1, SHA-2, SHA-3. SHA-1 has some
limitations, SHA-2 is considered secure, and SHA-3 is growing in use. So there are different algorithms and different attacks on them, and the attacks involve defeating those properties. If you can go backwards, if you can defeat the one-way property, that can be used for cryptographic attacks; or if you can find collisions, that can be used for attacks.
So let's now see how hash functions are used with digital signatures. Let's go directly to signatures. The aim of a digital signature is to be able to prove to anyone that a message originated from, or was approved by, a particular user. It's a digital form of our handwritten signature. When you sign a piece of paper, the idea is that the signature acts as evidence that you agree with or approve of that document. Someone can see that this document has your signature, and they can prove that you agreed to it. That's the idea of a signature. We want the same concept for digital data, and we'll use public key encryption to do that.
We cannot use symmetric key cryptography for digital signatures; if we tried, there's a flaw. Let's say I want to sign a file: I encrypt that file using symmetric key cryptography with a shared secret key, and later I want to prove that it came from a particular user. The problem with symmetric key cryptography is that the key is known by two people, not just one. To illustrate: if you obtain this ciphertext and you know it is encrypted with key KAB, that is, it successfully decrypts with key KAB, which user created this ciphertext? If it successfully decrypts with key KAB, it means it was encrypted with a key known by both A and B, so it was created by either A or B; we don't know which one. With a digital signature, we want to be sure that it's signed by one user. We cannot achieve that
with symmetric key cryptography, because when we encrypt something with key KAB, it could have been created by either A or B. So we do not use symmetric key cryptography here.
With public key cryptography, if this ciphertext successfully decrypts, who created it? If it successfully decrypts with the public key of A, then it must have been created by A, because there's only one person in the world who knows the private key of A: user A. So that's the concept we use for a digital signature, but in practice, rather than encrypting the entire plaintext to sign it, we use the hash algorithm. So this is the concept: the problem with the symmetric approach is that A or B may have created it, as we'll see in the slides; with this one, it can only be from A. This is the concept of a digital signature, the theory behind it.
In practice, we don't encrypt the plaintext itself; we encrypt a hash of the plaintext. Again, we can prove that only A created it, since only A has the private key of A, but encrypting just the hash has some practical advantages. How is that different from encrypting the entire plaintext? Let's say my plaintext is a 5 gigabyte file. If I encrypt the entire 5 gigabyte file, it takes a lot of time, and I need to send the entire encrypted form. If I encrypt a hash of that 5 gigabyte file, a hash function such as MD5 produces just 128 bits as output. I only need to encrypt those 128 bits, and therefore it's much faster. So in practice we don't encrypt the entire plaintext for a signature; we encrypt the hash of the plaintext, and we'll see why that provides the same security.
So the concept is shown here. We talk about signing a message: you sign a message by encrypting with your own private key. If I want to sign something, I encrypt with the private key of Steve. Here we encrypt a message with the private key of A. The output of that encryption we refer to as the signature, so I'll denote it as S. We usually send both the original message and the signature; if we don't
want confidentiality, we send both. The receiver verifies the signature; they check if it's real. The way to verify a signature is to decrypt the signature using the public key of the sender. If it successfully decrypts, then the message is verified; if not, we don't trust the message.
So that's the concept that achieves the security aims, but in practice the way we really use digital signatures is that we sign a message by encrypting a hash of the message with our private key. I want to sign a message M: I calculate the hash of that message and encrypt the small, fixed-size hash value with my private key. I get the signature as output, and I send both the message and the signature to the other side. The other side decrypts the signature with the public key of the sender, here the public key of A. They get some value as output, and they compare it to the hash of the message received. If they match, the signature is verified; if they don't match, verification fails.
We will look at why attacks on digital signatures are unsuccessful, though we will not cover it today; we will do some homework on that, and I think everyone has a chance to try it in the quiz. It depends upon the properties of the hash function: because the hash function has the one-way property and the collision-free property, attacks on the digital signature are not successful. That's why we introduced the hash function. There are different algorithms to do the encryption; RSA is a common one, but there are others.
We'll finish with this assumption about digital signatures: a digital signature of a message M is the hash of that message encrypted with the private key of the signer, the sender. The signature I'll denote as S, and we normally send the message and the signature. It's no good just sending the signature; like when we sign a document, we send that document with the signature at the bottom. So we send the message with the signature,
and a receiving entity can verify that message by decrypting the signature with the public key of the sender. If it successfully decrypts and the hash values match, then it's verified.
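To make that last assumption concrete, here is a rough Python sketch of signing as S = E(PRA, H(M)) and of verification by comparing D(PUA, S) with the hash of the received message. Treat it as an illustration under assumptions, not as how real software does it: the names H, sign and verify are made up for this example, the hash is SHA-256 from Python's hashlib, and the "encryption" is textbook RSA with no padding. The key is built from two well-known Mersenne primes only so that no extra libraries are needed, which makes it completely insecure.

```python
import hashlib

# Toy RSA key pair for user A (assumption for illustration only).
# P and Q are publicly known Mersenne primes, so this key is NOT secure.
P = 2**521 - 1
Q = 2**607 - 1
N = P * Q                              # modulus, part of both keys
E = 65537                              # public exponent: PU_A = (E, N)
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent: PR_A = (D, N); needs Python 3.8+


def H(message: bytes) -> int:
    """Hash a message of any length to a fixed-size value (SHA-256, as an integer)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(message: bytes) -> int:
    """S = E(PR_A, H(M)): encrypt the hash of the message with A's private key."""
    return pow(H(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Decrypt S with A's public key and compare with the hash of the received message."""
    return pow(signature, E, N) == H(message)


m = b"I approve this document."
s = sign(m)

print(verify(m, s))                              # True: the hashes match
print(verify(b"I approve that document.", s))    # False: message changed, hashes differ
```

Only the fixed-size hash goes through the slow public key operation, whatever the size of the message, which is the practical advantage described above. Real systems use a padded signature scheme and properly generated keys rather than this bare calculation.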