We're now talking about how to verify that a message we received hasn't been modified and that it comes from the right sender, that is, authentication. The receiver receives a message and wants to make sure that nothing has changed, and that the person who sent it is who they claim to be. There are different approaches for doing that. We saw last week, towards the end of the lecture, that we can use symmetric key encryption to provide authentication. The concept is that when I receive ciphertext, I decrypt it with the key that I've shared with user A. If it successfully decrypts, that means it must have been encrypted with that key, meaning it must have come from user A (or from me). If it doesn't decrypt using the key that I share with A, then the message either did not come from A or has been modified along the way. So symmetric key encryption provides authentication. The problem is that although it works, it can be expensive in terms of computation in some cases. Sometimes we don't need to encrypt the entire message; sometimes we just want to authenticate and not provide confidentiality. So there are other techniques which provide authentication but don't provide confidentiality. Message authentication codes are one and hash functions are another. We'll briefly mention MACs and spend a bit more time on hash functions. Then, after we go through another encryption approach, public key encryption, we'll spend more time on hash functions and on digital signatures, an important part of authentication used in the Internet.
So we may go quickly through some of these slides if they're not absolutely necessary. We went through that one last week. Message authentication codes are similar to symmetric key encryption. We have some algorithm, some function F. We take our message M and a shared secret key K, and we apply the algorithm to get a code called a MAC, a message authentication code, as the output. It's usually short, maybe a 128-bit value. The source takes their message and calculates, using the key, the MAC for that message, and they send both the message and the MAC across the network. Then the receiver verifies the message by checking the MAC: they take the received message and recalculate the MAC using the same shared secret key. If the MAC calculated by the receiver matches the received MAC, then they have verified that nothing has been modified. That's shown in this diagram. The source has a message M. Here the function is shown as C, so the function F on the previous slide is shown as C here. They take the message and a shared secret key and apply the function, and as the output they get what we call a MAC, sometimes called a tag: a short code, usually 128 bits or in the order of that, that should be unique for this message and key pair. They send the original message and the MAC value across the network to the destination. This is the concatenation operation; two bars is short for concatenate. Take the message, not encrypted, just a plaintext message, combine it with the MAC and send both of them across the network. This is what's received by the receiver: the message M, and this grey box represents the MAC of that message, the MAC of M using key K. When they receive this, the receiver needs to verify: has the message been modified, and did it come from the right person? Maybe someone pretending to be user A sent this message.
To verify, they take the received message and calculate, using the same function and the same key, the MAC, and they compare that output to the received value of the MAC. If they are the same, everything is assumed to be okay. If they are different, we assume something's gone wrong. Maybe someone sent a fake message, maybe someone's changed the message, so if they're different we disregard the message and maybe try some other means to get the message. It works assuming the MAC function is such that a malicious user cannot determine the MAC value without knowing the key. It's the same principle as symmetric key encryption. If we calculate a MAC which matches the received value using this key, then the received value must have been created with this key, and the only other person who has this key is the original sender. If a malicious user tries to pretend to be our user A, they don't have the key K to make the MAC. It's the same with symmetric encryption: if a malicious user tries to send a message to me, they don't have the key of the original user they're trying to pretend to be. Let's go back. So a MAC function, F here, C on the next slide, is some algorithm that takes a key and a message and produces a usually short, unique or practically unique, value. That is, if we change the key we'll get a different value, and if we change the message we'll get a different value. So if someone tries to pretend to be a user, then to get the same MAC they need to know the key and the message. If the message is known they still need to know the key, and if they don't know the key they cannot generate the MAC value. This is one part we're not going to go into in any more detail. We'll see some more examples of authentication after we go through hash functions, which are very similar, and after we get to digital signatures.
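As a rough illustration of the send and verify steps just described, here is a minimal Python sketch. It assumes HMAC with SHA-256 from the standard library as a stand-in for the generic MAC function (F or C on the slides); the key and the messages are made up for the example.

```python
import hashlib
import hmac


def make_mac(key: bytes, message: bytes) -> bytes:
    """Compute a tag over the message using the shared secret key.

    HMAC-SHA256 is an assumed stand-in for the generic MAC function.
    """
    return hmac.new(key, message, hashlib.sha256).digest()


def send(key: bytes, message: bytes) -> tuple[bytes, bytes]:
    """Sender side: the plaintext message travels together with its MAC."""
    return message, make_mac(key, message)


def verify(key: bytes, message: bytes, received_mac: bytes) -> bool:
    """Receiver side: recompute the MAC and compare with the received one."""
    expected = make_mac(key, message)
    return hmac.compare_digest(expected, received_mac)


# Hypothetical key shared only by A and B.
shared_key = b"key shared by A and B"

message, tag = send(shared_key, b"pay 100 baht")

# Unmodified message from A: the MACs match.
ok_genuine = verify(shared_key, message, tag)

# Attacker changes the message but reuses the old MAC.
ok_modified = verify(shared_key, b"pay 900 baht", tag)

# Attacker computes a fresh MAC, but with a key that is not the shared one.
forged_tag = make_mac(b"attacker key", b"pay 900 baht")
ok_forged = verify(shared_key, b"pay 900 baht", forged_tag)

print(ok_genuine, ok_modified, ok_forged)
```

Only the first check should succeed. The other two fail because the attacker does not have the shared key.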
If we need this we'll return and go through the details later in another topic, but I think we won't need it for the other topics. At least know that a MAC can be used for authentication. That is, in practical situations, if we use a good message authentication code function and I receive a message with a MAC, I can verify: has it been modified, and did it come from the right source? The right source is the person who has the same shared secret key as me. If it is modified along the way I'll be able to detect that, and if it comes from some malicious user pretending to be user A, user B will be able to detect that. We'll see some related examples after we go through hash functions and that will become a bit clearer. As I said, we want to skip some parts here so we can get into the practical details later. There are different MAC functions, the same as there are different encryption functions like AES, DES and so on. Some of the MAC functions are listed here: OMAC, PMAC, UMAC and so on. One of the common ones used in practice today, and in the Internet, is called HMAC. It turns out it's a MAC function that uses a hash function, so we need to explain hash functions for this to make sense, and that's in the next few slides. So in practice, to provide a MAC for authentication, hash functions are often used. We will talk about hash functions shortly. We won't look at the attacks on MACs; we'll just summarise how much effort it takes to brute force a good MAC to defeat it. If we use a key with k bits and the output value, the code or the tag, is n bits in length, then attacks take approximately the minimum of 2 to the power of k and 2 to the power of n. That means if I have a 64-bit key and a 128-bit code, so k is 64 and n is 128, the amount of effort to break the MAC is equivalent to 2 to the power of 64.
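The rule of thumb above can be written down as a small worked calculation. This is only a sketch of the arithmetic; the function name is made up, and the third case uses a deliberately short tag to show the tag length becoming the limit.

```python
def mac_attack_effort_bits(key_bits: int, tag_bits: int) -> int:
    """Approximate brute-force effort, as an exponent of 2.

    Effort is about min(2^k, 2^n), so the exponent is min(k, n).
    """
    return min(key_bits, tag_bits)


# (key length k, tag length n) pairs in bits.
for k, n in [(64, 128), (128, 128), (64, 20)]:
    effort = mac_attack_effort_bits(k, n)
    print(f"k={k}, n={n}: about 2^{effort} = {2 ** effort:,} operations")
```

Whichever of the two lengths is shorter decides how strong the MAC is.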
So think of brute force attacks on our ciphers: the amount of effort to brute force a cipher depends upon the key length. With a MAC it depends upon the minimum of the key length and the tag length. If my tag is 20 bits, so n is 20, and my key is 64 bits, my MAC is only as strong as the tag, which is very weak; 2 to the power of 20 can be brute forced quite easily. So we need to choose the tag length to be long enough and the key length to be long enough such that brute force is not possible. Typically the key lengths are the same as for encryption algorithms, 64 bits or 128 bits, and the code lengths are also 128 bits. Let's look at hash functions, and some of the characteristics of MACs will start to make more sense. What's a hash function? Everyone should know the basics of hash functions; you should have covered them in some data structures or algorithms course in earlier years. We have some function that takes a variable length input, our message M for example, and produces a fixed length, usually short, output called the hash value. So we have a message M, say a 1 megabyte message, we apply a hash function on that message, and we get some short hash value, lowercase h on this slide, as the output. Short means, let's say, a 64-bit or 128-bit hash value. It should have some characteristics, and the practical ones are that if I apply the same hash function on two different messages I'll get two different hash values as output, and the hash value should be random looking, so there should be no structure in the hash value. If I hash a message, the output hash value shouldn't look like it depends upon the message content. So I hash one message and get one hash value; I hash a slightly different message and I should get a different hash value. If I hash the same message again I get the same hash value; it's a function.
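These basic characteristics can be checked with a few lines of Python. The sketch below assumes SHA-256 from the standard library as the hash function, so the output is 256 bits rather than the 64 or 128 bits mentioned above; the messages are arbitrary.

```python
import hashlib


def h(message: bytes) -> str:
    """Hash a message of any length to a fixed-length hexadecimal value."""
    return hashlib.sha256(message).hexdigest()


short_hash = h(b"hello")            # 5-byte input
long_hash = h(b"x" * 1_000_000)     # roughly 1 megabyte input
same_again = h(b"hello")            # same input hashed a second time
changed = h(b"hellp")               # slightly different input

# The output length is fixed regardless of the input length.
print(len(short_hash), len(long_hash))

# Same input gives the same hash; a small change gives a different hash.
print(short_hash == same_again, short_hash == changed)

# Count how many of the 256 output bits differ after the small change.
diff_bits = bin(int(short_hash, 16) ^ int(changed, 16)).count("1")
print(diff_bits)
```

With a good hash function the last count should be a large fraction of the output bits, which is what "random looking" means in practice.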
A cryptographic hash function is one that's used for cryptographic techniques in security, and it usually has stricter requirements on what the output of the hash function should be. Generally those requirements are called the one-way property and the collision-free property. They are similar to what we've just said, but there are some strict measures of them. If we have a message and we take the hash, we get a hash value. It should be hard, given the hash value, to find the original message; that's the one-way property. It should be easy to take a message, calculate the hash and get a hash value, but it should be hard to take that hash value and find the original message. So it should be easy to go one way but hard to go back in the inverse direction. A function for which this property holds is usually required. Another way to state it: for a cryptographic hash function it should be hard to find some message that maps to a known hash value. So if I have a hash value as an attacker, it should be hard for me to find another message that produces that hash value, which is really: given a hash value, find a message. If we have that one-way property in our hash function we can achieve different security objectives. Another property is collision free: it should be hard to find two different messages, M1 and M2, that produce the same hash value. If our hash function has this property we say it's collision free; if it has the property that given a hash value I can't find a message, it has the one-way property. They have different benefits in cryptography. We will often use hash values for authentication. Let's look at this example in a bit of detail to see how we can use a hash function for authentication. This is similar to the MAC; instead of using a MAC we are using a hash function.
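To get a feel for what a collision is, here is a toy Python sketch. It assumes a deliberately weakened hash, SHA-256 truncated to 16 bits, so that a collision can be found in well under a second. The function names and the message format are made up for the example, and nothing here is an attack on a real hash function.

```python
import hashlib


def tiny_hash(message: bytes, bits: int = 16) -> int:
    """Toy hash: keep only the first `bits` bits of SHA-256 (assumption)."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collision(bits: int = 16) -> tuple[bytes, bytes, int]:
    """Hash distinct messages until two of them share a toy hash value."""
    seen: dict[int, bytes] = {}
    counter = 0
    while True:
        message = f"message {counter}".encode()
        value = tiny_hash(message, bits)
        if value in seen:
            # Two different messages, M1 and M2, with the same hash value.
            return seen[value], message, counter + 1
        seen[value] = message
        counter += 1


m1, m2, tried = find_collision()
print(m1, m2, tiny_hash(m1) == tiny_hash(m2), tried)
```

With only 2 to the power of 16 possible outputs a collision is guaranteed after at most 65,537 messages, and usually turns up much sooner. A real cryptographic hash function uses an output long enough that this kind of search is not practical.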
We want to send a message from A to B, and we want to allow B to verify that the message hasn't been modified and that the message comes from A, not from some malicious user. So A takes the message and applies a hash function on that message, assuming we have some chosen hash function, which produces a short value as an output, the hash value. Then we encrypt that hash value with a symmetric key cipher, with a key shared between A and B. No one else has K; it should be a shared secret, and the malicious user should not have K. We encrypt the hash value and send that encrypted hash value along with the message. Remember this is concatenation: take the message and the encrypted hash value and send them across the network, to be received by B. The malicious user is here, and they intercept what was sent. Can the malicious user read the message contents in this example? Can the malicious user see M? Hands up for yes. Hands up for no. Try again, and I want to see all the hands over these two questions. In this case the malicious user intercepts at this point, as it is being sent across the network. They don't have the secret key, they don't know K; that's our assumption, only A and B know the secret. Can the malicious user see the contents of the message? Hands up for yes. Hands up for no. Okay, the answer is yes, they can. Let's try and draw that and make sure that's clear to everyone. It's a similar picture but we'll use some extra notation. We have our user A sending a message to B, the normal scenario. We start with a message, and using our approach we take a hash of the message and then encrypt that hash value. I won't draw the boxes, but we take the message, and in the lower half we get a hash of the message encrypted with some key, and here's the message, and we combine them with the concatenation. That's my approximation of our picture. What's sent across the network is M concatenated with the 
encrypted hash value. That's what is sent across the network in this example, and that's what you need to try and read from the diagram: the message concatenated with the encrypted hash value. What the source did was take the message, calculate the hash of that message, encrypt that hash value, combine the result with the original message, and then send it all across the network. So that's this step. Now someone intercepts this. They can see the message; it's here, it's not encrypted. So if we send this across the network and someone intercepts it, then of course they can read the message. The only way we can prevent them from reading the contents is to encrypt the message, and we did not encrypt the message. So in this example we're not providing any secrecy of the message. It's not our aim. I don't care if someone reads the message; it's public, anyone can read it. But I want to make sure that no one changes the message and that no one pretends to be me sending the message. That's our objective in this example: we want to provide authentication, we don't want to provide confidentiality. In some applications we have that requirement; we don't always need to keep our message secret. So the answer to our question was yes, the attacker can see the message contents. We'll come back to the attacker in a moment. The objective is to make sure the attacker can't change the message without it being detected. What the receiver does, when they receive the message plus the encrypted hash value, is take a hash of the message received, decrypt the encrypted hash value received, that's here, using hopefully the same key as it was encrypted with, and compare the values. If they match, everything's assumed to be okay; if not, assume something went wrong. Let's see what an attacker can do. Let's introduce an attacker into this scenario and see whether they can, for example, modify the message. So again we 
have A and B, and we'll have some malicious user. Now A sends the message, which is M, same as before, concatenated with the encrypted hash value. Instead of writing K, let's be more precise and say it's the hash of M encrypted with KAB. That's what's sent across the network, where KAB means this key is shared and known by A and B. The malicious user will not know this key, because otherwise it's not secret. So KAB is the secret key shared between A and B. A sends that to B. Let's consider the case that the malicious user intercepts it before it gets to B, so it goes instead to the malicious user. The malicious user's aim is to modify the message, send it to B, and hope that B thinks this modified message came from A. That's our aim as the attacker. So we're going to send a modified message to B. What can we do as an attacker in this case? Let's try some different things as the attacker. Let's say the message is "decrease the malicious user's salary by 10,000 baht", and this is the encrypted hash value. The malicious user changes the message to "increase" instead of "decrease", so just changes one part of the message. Let's denote the modified message as M prime, meaning it is a modified version of M. Then what do they concatenate it with? They must concatenate it with something encrypted with some key, the hash of some message. What can they do as the attacker when they change the message? They've got different approaches. What they could do is not modify this part. Let's say the message is one megabyte and this value is 128 bits. All the attacker does is replace the first one megabyte with their new message, take the last 128 bits and add them to the end. Let's see what happens if we do that: the attacker just changes the message but doesn't change the last part. We keep this the same as before. We don't know the key, so what we do as the attacker is just take whatever these bits are. I 
don't know what the key is, but I know that the hash of the original message encrypted with KAB produces 128 bits, so I just take those 128 bits, add them to the end of my new message, and send that to B. When B receives this they perform the verification steps. Back to our diagram, the verification steps of B are: whatever we receive, take the hash of the received message, decrypt the last part with the key shared between A and B, and compare. Let's try that. B receives this, and they take a hash of the received message, which is M prime, and they decrypt the last part. When we decrypt this part, what do we get? B received a message they think is from A. What key do they use to decrypt? KAB. I think the message is from A, therefore I decrypt with key AB. Decrypting this with key AB will produce the original plaintext: if we encrypt this value with key AB and then decrypt it with key AB, we get the hash of M back as the output. So I've done the two steps. What I'm trying to draw here is, at the top, take the hash of the received message, and at the bottom, decrypt using the key, and compare the outputs. The hash of the received message and the decrypted part: are they the same? No, assuming our hash function has the property that if we hash two different messages we get two different hash values. That's what we said about a cryptographic hash function: we want the property that we cannot get collisions. A collision is when we hash two messages and get the same hash value. If it has the property that such collisions are not possible, then B calculates the hash of M prime and obtains the hash of M, and they should be different values, because the hashes of two different messages should be two different values. That's our property. Therefore they compare 
them and they're different, and now B recognizes something's gone wrong. They assume there's some attack. They don't know what's gone wrong, but they know something has, and they don't trust the message. So this is how we've used hash functions to provide authentication in the case that the malicious user tried to change the message. If the malicious user changes the message, B detects that change, and we take advantage of the property of the hash function to do that detection. What else can the attacker do? Send a different key? Okay, let's try a modified key. Let's go back and see if, from the attacker's point of view, we can modify the message and fool B into thinking that they've received a message from A. So from the malicious user's perspective, we modify the message as before, and this time we recalculate the hash value. We've got the modified message, we calculate the hash of M prime, and we encrypt with what key? Can we encrypt with KAB? No, we cannot, because the malicious user doesn't know KAB; it should be secret between A and B. So let's say we encrypt with some other key, the key of the malicious user, whatever we want to call it. We cannot use KAB. We send that to B. What does B do? B does the verification. They take the hash of the received message and they get the hash of M prime, and they decrypt this part. I'll not try and write it again, but they decrypt all of that using what key? KAB. B receives a message they think is from A, therefore they'll try to decrypt using the key shared with A, KAB. What happens? This was encrypted with some malicious key and then we decrypt it with a different key. We will not get this as an output. We'll get an error or we will not get the same plaintext. That's the property of our encryption: that if we encrypt 
plaintext with one key and then decrypt the corresponding ciphertext with a different key, we will not get the original plaintext as an output. So we will not get H of M prime as an output here, and when we compare it to this value they'll be different, and therefore B has detected something's gone wrong. So in that case the attack by the malicious user was detected. Any other way we can be malicious? Sorry, repeat the message, send it again? So A sends this message to B, the malicious user intercepts it and B receives the message, and then tomorrow the malicious user sends the same message to B. Yes, that's possible; that's a replay attack. Remember we listed some attacks, and one of them was the replay attack, where you replay an old message. B still thinks the message is from A. The way to stop such an attack is to include, maybe inside the message, some timestamp. A replay attack is just resending the same message again, and using the hash doesn't allow us to detect such an attack. But we don't get to modify the message in a replay attack, and if we include a timestamp, in practice we can often detect replays: I know that I received this message yesterday and I've received it again today with the wrong time, so maybe it's an attack. So yes, we could replay, but it doesn't allow the malicious user to modify the message. Can we modify the message some other way? I think if you explore the possibilities you'll see that if we want to modify the message, we either need to know the key KAB, and we assume we don't know that, so that's not a possible attack, or we need to find a modified message that has the same hash value as the original message. That was our previous case. If the malicious user could find an M prime whose hash value is the same as the hash value of M, then our attack would be successful. So that leads to our property of hash functions. We need a hash function such that it's practically 
impossible for an attacker to find two messages with the same hash value. If we have a message "decrease the malicious user's salary by 10,000 baht" and the hash value was H of M, and the malicious user found another, different message, say "increase the malicious user's salary by 10,000 baht", and the hash value of this modified message was the same as the hash of the original message, then this attack would be successful. B would receive the message, think it's from A and hasn't been modified, and would accept it. So the property we require in this case is that a malicious user cannot find a message with the same hash value as the original message; they cannot find a collision in the hash space. If we have that property then this scheme is successful in detecting attacks. Just going back: for a cryptographic hash function it should be hard for an attacker to find two messages that produce the same hash value. If they can, then they can perform an attack that will go undetected. We will see another example of hash functions when we look at digital signatures. This is a similar diagram to what we just saw. In this case we used a hash function combined with symmetric key encryption, so this E and D are symmetric key encryption and decryption. In fact there are alternatives to doing this; this is just one example. What are some real hash functions, the algorithms? MD5 is a common one that has been around for a long time. Message Digest 5 was developed by Ron Rivest; you'll see his name come up later. It creates a 128-bit hash value, so you take any message, apply the MD5 algorithm and you get a 128-bit hash value. The aim was that if you apply the algorithm on two different messages you get two different hash values. But that's impossible in theory for any hash function, because if you take any length input and map it to a fixed, small length output, then there are many more possible inputs than there are possible outputs, therefore multiple inputs must map to the same 
output. But in practice a 128-bit hash value gives us 2 to the power of 128 possible hash values, and as long as the number of inputs used in practice is much less than that, then we can start to achieve no collisions in practice. How many files are there in the world? I don't know, but are there 2 to the power of 128 files in the world? Maybe not. So the chance of getting two messages that produce the same hash value must be very, very low. It's not impossible in theory, but it needs to be practically impossible. MD5, generating a 128-bit hash value, was commonly used and still is commonly used in many applications. You download a file from the Internet, and sometimes the website will include the hash of the file, to allow you, once you've downloaded it, to check that the file you downloaded matches the file that was on the server, in case there were errors with the download or in case someone has intercepted and modified the file along the way. That was the idea. It turns out MD5 is subject to attacks; there are known attacks such that it's no longer recommended today for secure applications. Let's look at an example. Different software will calculate the MD5 hash value for an input. I've got several examples. I've got some plaintext message; that's my file, that's the input. And I've got a program called md5sum; it calculates the MD5 hash of some input, and there's the hash value in hexadecimal: 32 hexadecimal digits, 128 bits. So that's the hash value of this text. If I take a different message I should get a different hash value. Let's try a different message. Here's our different message; I've changed one bit. This text message can be represented in binary, and if I change just one bit, it changes, in ASCII, the H to an F. We can check the binary and you'll see that just one bit has changed in the input. We take the hash of that second file, and it is going to be what? What's it going to be, 
anyone? Is it going to be similar to this, the same, or completely different? Completely different. That's our goal: with two different messages we produce two different, and effectively random, hash values. There's no relationship between these two. Even though only one bit is different in the input, many bits are different in the output. That's the goal of our hash function, and that's true in this case. There are other hash functions; we'll see SHA and an example of it. But it turns out that although that case worked well, two different inputs giving two different outputs, MD5 has known weaknesses: it becomes relatively easy to find two different inputs that produce the same output, and if we can do that we can defeat the authentication mechanisms that we use it for. So MD5 is considered insecure from that perspective. I've got two different files, file 1 and file 2, both .txt, the same size, 128 bits. They're not in fact ASCII, they're just binary files, so I won't show the contents as text; I'll look at the binary form of those files. xxd is a program that shows the binary contents of a file in hexadecimal. The first one is that, and the second one is this. Are the files the same? This is not the hash, it's just the file contents. This column is the ASCII, this is hexadecimal. You see they're the same most of the way, but in fact there are some small differences; I think there are five or six bits which are different. Can we see at this point, where it's BDF2 in one and BD72 in the other, in hexadecimal? There's a bit that's different here. So the files are slightly different: 128 bits in length, but several bits are different. They're almost the same but different. Let's calculate the hash of each of them. Here's the MD5 of the first file. What should we get for different file inputs? What's the hash value of the second file? It's the same. This is the weakness of MD5: it's possible to find two different inputs that produce the same hash value. Someone else found these; they done 
a lot of analysis to find these two values. They differ by several bits, but when you apply the MD5 hash on both of them you get the same hash value. In that case our authentication scheme will not work, because if this was the original file, the attacker would modify the original file to be this one, this M prime, and send that on to B. B would do the verification, notice the hash values match, and therefore accept the message. MD5 has this weakness in that you can find collisions between different inputs. That's why it's no longer recommended. SHA, the Secure Hash Algorithm, was developed as an improvement and has gone through several variations: the original SHA, SHA-1, SHA-2, and SHA-3 is being developed at the moment. SHA-2 is commonly recommended at the moment as being a secure hash function. SHA-1 and SHA-0 have some theoretical limitations and are not recommended in most cases. SHA-3 is being developed as the next version. They produce different output lengths, and you can choose the output length. MD5 was 128 bits. The basic output length of SHA was 160 bits, but you can choose 224, 256 and up to 512 bits as output. The longer the output, the less chance of collisions. That's just some details about SHA. You've all set up your virtual network, because there's a homework assignment that's going to use it this week, due next week. So here's the base file, which is 547 megabytes. We can run sha1sum, which calculates the hash using the SHA algorithm. It's the hash of the contents of the file, not the file name. If you modify the file you'll get a different hash value, and I'm not going to do that. There are other hash algorithms, but the main ones recommended today are SHA-2, and you'll see MD5 in use as well. We will not cover this slide; this is a little bit more theory about those properties of one-way 
property and the ability to have no collisions, collision free. This is some of the theory of it. It turns out that in terms of brute force attacks, preventing collisions is the hardest thing to do. Say we have a hash length of n bits; MD5 was 128 bits. To defeat the one-way property, a brute force attack requires 2 to the power of n operations; with MD5, 2 to the power of 128 operations. To do a hash 2 to the power of 128 times would take forever. But to defeat the collision-free property, required by some applications, it takes 2 to the power of n/2 operations, so with MD5 being 128 bits, 2 to the power of 64 operations. From the attacker's perspective, defeating the collision-free property is much, much easier than defeating the one-way property. So when we want to prevent collisions we need to consider that half of the hash length indicates how much effort a brute force attack takes. But we will not cover any of this theory, nor this slide. Later during this course we will see some different examples of where we use hash functions, like in digital signatures, virus detection and passwords. So hash functions are used in different applications, and we will see passwords, viruses and digital signatures in this course. How fast can you do an MD5 collision attack today? That's a good task for you to find out. I will not give a demo here, but today you can buy hardware, many people buy GPUs, graphics processors, and defeat MD5 quite quickly and for a very low cost. MD5 is considered broken, and you can do billions of hashes per second quite easily with standard hardware. So to check that the message we received has not been modified and comes from the right person, we've got different approaches: symmetric key encryption; message authentication codes, which we briefly mentioned without many details; and hash functions, which are quite important for that. We'll see that there's something called 
digital signatures, which combine hash functions with public key encryption. So first, let's talk about what public key encryption is. It's not just for authentication; it's generally for encrypting. Remember Caesar. From the beginning of when ciphers were known up until the last 40 or 50 years, all of the encryption algorithms were using symmetric key cryptography. DES and AES are both symmetric key encryption, that is, source and destination use the same shared secret key. Both of them must have the same key. In the 60s and 70s, around that time, different organizations and people started to design or develop a new technology for encryption, called public key cryptography today, or asymmetric key cryptography. It was first publicly reported by Diffie and Hellman in 1976; they wrote a paper about public key cryptography. One of the first, and still one of the most popular, algorithms that implemented it is RSA, by three guys called Rivest, Shamir and Adleman. Rivest also developed MD5 and several ciphers. And it turned out later, people found out that even though they were the first to publicly introduce this, there are security organizations that secretly developed similar techniques in the past.
So now really we say there's symmetric key cryptography and public key cryptography, two different approaches. In symmetric key cryptography, we use the same secret key for both encryption and decryption. In public key cryptography, also called asymmetric, we use one key for encryption and a different but related key for decryption. So we have asymmetry between the keys. We don't have the same key at both sides; we have two different keys, but the keys are related somehow. What we need in most cases for public key cryptography is that, of the two different keys, one will be made public, everyone knows it, and the other one will be private, it's kept secret. Therefore, we'll see, for it to work it must be hard to find the private key if
you know the public key. Actually, we'll see this come up in a later slide, I think. Let's get into the details. So we have two keys now, two different keys, not one shared secret key: a public key and a private key. They are used in different ways depending on what we want to achieve. If we want to have secret messages, that is confidentiality, I have a message, I want to send it to you, and I don't want anyone else to be able to read the message except you. That's my aim. Then how we use public key cryptography is that, for secrecy, I encrypt the message using a public key, and it will turn out to be your public key. I encrypt with a public key, send to the other person, and they decrypt with their private key. So for secrecy, we encrypt with a public key and decrypt with a private key.
For authentication, the case where I want to send a message to someone, I don't care if someone else reads the contents, but I want to make sure that the receiver can verify that it came from me, not from someone pretending to be me. For authentication we use the keys in the opposite order: I will encrypt with my private key, send to you, and you will decrypt with my public key. We'll see this in a number of examples. The main point so far is that we have two keys, so we think of having a key pair, a public and a private key. The private key, as the name suggests, must be kept secret. No one else knows it; only one person in the world knows that private key. But the corresponding public key, everyone can know it; it doesn't matter who knows it. So we say that each user now has a pair of keys. User A we will often denote as having a pair of keys: the public key of user A and the private key of user A. We use PU to mean public key and PR to mean private key. So think of each user as having their own pair of keys.
Given that, let's see how we provide confidentiality. So this is the aim: user A on the left wants to send a message to user B on the right, and the message should be kept
confidential, such that no one else can read the contents of M, the message. That's the aim. If someone intercepts the ciphertext, they shouldn't be able to find the original message. How do we do that? We take our message, our plaintext M, we use a public key encryption algorithm, and we encrypt the message using the public key of the destination. So remember, this is B, this is A. A encrypts the message using the public key of B and gets some ciphertext as output, where the ciphertext is the encryption of the message using the public key of B. A sends that across the network. B receives this ciphertext. Now they want to read the message, so they decrypt using their private key. And the algorithm must be designed, and the keys must be chosen, such that if we decrypt with the private key of B, we will get the original message as the output. So we need an algorithm where that property holds: if a message is encrypted with the public key, and then the ciphertext is decrypted with the corresponding private key, we should get the original plaintext as the output. If not, then this system won't work.
Assuming that works, encrypt with one key, can only decrypt with the other key, what can an attacker do to read the message? The attacker intercepts the ciphertext. They want to find M. What do they do? Well, to decrypt the ciphertext they need to use a key. Which key? They need to use the private key of B. And by definition, the private key of B will not be known by the attacker, because it must be private to B only. So assuming our algorithms are designed correctly and our keys are chosen correctly, the attacker can intercept the ciphertext but will not be able to get the message M, and we've achieved our aim of confidentiality. So to keep a message secret, encrypt with the destination's public key, and the destination will decrypt with their private key. Always remember that ordering of keys for confidentiality. And the algorithm must be such that only the person with the
private key can successfully decrypt. If I receive this ciphertext and try to decrypt it using some other private key, my private key for example, then I'll get an error; at least, I will not get the original plaintext, I'll get some other random text. It's similar in symmetric key cryptography: if I intercept a message and decrypt it with the wrong key, I will not get the plaintext as an output. That's the property of the algorithm: if you decrypt with the wrong key, you will not get the plaintext.
The other common way public key crypto is used is in authentication. I don't care who reads the message in this case, but user B, when they receive a message, wants to be sure that it came from A, not someone pretending to be A. So what do we do? User A takes the message and encrypts it using the private key of A. I'm user A, I encrypt with my private key, I send the ciphertext to B across the network, and B decrypts with the corresponding public key. If it was encrypted with the private key of A, B thinks, here's a message, it came from A, I will decrypt with the public key of A. And if we're using a secure public key algorithm, it should be such that if we encrypt with one key, it will only successfully decrypt with the other key in the key pair. So if I encrypt with the private key of A, I can only decrypt with the public key of A. If I try to decrypt with some other key, it will not work; we'll get an error.
So how does this provide authentication? Let's say a malicious user wants to pretend to be A. Some malicious user is here. They send a fake message saying this is from A, and they send it to B. They encrypt that message with the private key of who? Well, they cannot encrypt with the private key of A, because they don't know it. So maybe they encrypt with the private key of the malicious user and send it to B. If B thinks it's from A, they'll decrypt with the public key of A, and they'll get an error. Let's look at those two examples. Well, let's look at that attack first, just to make this clear.
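The two key orderings described so far can be sketched with textbook RSA and toy numbers. The lecture has named RSA but not yet explained it, so the choice of RSA here is an assumption, as are the tiny key values, the names PU_A, PR_A, PU_B, PR_B and the helper apply_key. This is an illustration of the ordering of keys only, not a secure implementation.

```python
# Toy key pairs built from tiny primes (illustration only; real keys are
# thousands of bits long). Each key is a pair (exponent, modulus).
PU_A, PR_A = (79, 3337), (1019, 3337)   # hypothetical user A, modulus 47 * 71
PU_B, PR_B = (17, 3233), (2753, 3233)   # hypothetical user B, modulus 61 * 53


def apply_key(value, key):
    """Textbook RSA step: value raised to the key's exponent, mod the modulus."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


M = 65  # the message, encoded as a number smaller than both moduli

# Confidentiality: A encrypts with B's PUBLIC key, B decrypts with B's PRIVATE key.
c_secret = apply_key(M, PU_B)
recovered_by_b = apply_key(c_secret, PR_B)

# Authentication: A encrypts with A's PRIVATE key, anyone checks with A's PUBLIC key.
c_signed = apply_key(M, PR_A)
verified_by_anyone = apply_key(c_signed, PU_A)

print(recovered_by_b == M, verified_by_anyone == M)
```

Note that textbook RSA never raises an error when the wrong key is used; it simply returns a different number. Practical schemes add structure to the message so that a wrong key can be recognized.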
So let's say we have our two users, A sending a message to B, but we have a malicious user who is going to perform an attack and pretend to be A. So they are in fact going to send a message to B. It doesn't come from A, it comes from our malicious user, but they're pretending to be A. So what do they do? They have their message, and they encrypt that message with some key. We want to provide authentication in this example, the same as this case. The malicious user wants to send a message to B, pretend to be A, and wants B to think it came from A and trust the message. So they must encrypt the message with some key. What key can they encrypt with? For authentication we should encrypt with a private key. So whose private key can the malicious user encrypt with? They cannot encrypt with A's private key and they cannot encrypt with B's private key, because they don't know them; they are private. Let's say they encrypt with the malicious user's private key, their own private key. They send this message to B, but the from address in the message is user A. B receives a message: ah, from user A, let's verify the message. So what B does is decrypt. What do we have? This is the received message; what was sent is exactly the same as what is received. B received that and thinks it's from A, so they decrypt it using the public key of A. It will be unsuccessful; that is, an error will be returned, or at least B will be able to recognize that it didn't decrypt correctly, because of the property we require of our algorithm. It's the same as with symmetric key, except we have two different keys: if we encrypt a message with the private key, it will only successfully decrypt if we use the corresponding public key. By the corresponding public key I mean the key in the same pair. That is, this was encrypted with the malicious user's private key, so it will only successfully decrypt using the malicious user's public key. It will not successfully decrypt using the public key of A. So when B tries to use the
public key of A, they'll get some error from this decryption, which tells them something went wrong: it didn't come from A. So they'll detect that and realize, okay, don't trust this message.
Any questions on the concepts of public key cryptography so far? We haven't spoken about the actual algorithms yet; we'll see some later. But the assumptions we're making are that every user has a key pair, every user has their own public and private key, and the assumption about the algorithm is that if we encrypt with one key in the pair, we can only successfully decrypt with the other key in the same pair. If I encrypt with Steve's private key, I can only successfully decrypt with Steve's public key. If I try a different key to decrypt, I'll get an error. That's the assumption. And in fact, in most cases the ordering of the keys doesn't matter: if I encrypt with Steve's public key, you can only successfully decrypt with Steve's private key; if I encrypt with Steve's private key, I can only successfully decrypt with his public key. So as long as we use the corresponding key in the key pair, we'll successfully decrypt.
And we use that for both. For confidentiality: encrypt with the public key of the destination, decrypt with the destination's private key. We can only decrypt the message if we have the private key of B, and only B has the private key of B, and therefore only B can decrypt the message. No one else in the world can decrypt this ciphertext. And for authentication, to verify where the message came from: A encrypts with A's private key, B verifies by decrypting with the public key of A. Anyone can verify. Everyone has the public key of A; it's public, everyone knows that value. So anyone can decrypt this ciphertext. It doesn't provide confidentiality; all that it provides is the ability to check that this message must have come from A. If it decrypts with the public key of A, then it must mean it was encrypted with the private key of A, which must mean user A encrypted it. Therefore it's
proof that this message came from A. This is an important concept in a digital signature: we sign a message using our own private key. We'll see some details of the algorithms being used in the next lecture.
Let's just summarize. It turns out that with public key cryptography, the algorithms available today for encrypting large amounts of data are very slow compared to symmetric key cryptography. They're very slow and therefore not very convenient to use. It turns out they're mostly used for authentication, to prove that this message came from a particular person; digital signatures is the concept. And there are some other applications of public key cryptography, like key exchange, which we may see in another topic. There are different algorithms available. RSA is a very popular one, but there are others. There's elliptic curve cryptography, which uses some different mathematics. Diffie-Hellman is an algorithm for exchanging keys. There's a Digital Signature Standard for signing documents. We'll see them come up later.
And I think this summarizes the requirements of our algorithms. So for general public key cryptography, first, we need to be able to easily generate a pair of keys. Every user will need their own key pair, so we need to be able to generate them quite easily. Your homework task will be to generate your own key pair; you use some software, it takes one second to generate. So that's easy. Given a key pair, it should be easy for one user, A in this case, to encrypt some message using a public key. Encryption should be easy, that is, fast. That's a practical requirement, not a security requirement. Similarly, if some message is encrypted with the public key of B, then B, who has the private key of B, should be able to decrypt: take the ciphertext, decrypt with the private key, and we should get the original message or plaintext back. That's a requirement, and it's a fundamental requirement. That is, encrypt plaintext to get ciphertext,
decrypt the ciphertext, and we must get the original plaintext. That's a normal requirement of encryption. But now, from an attacker's perspective, or the security of the algorithm: an attacker may know the public key; it's public, anyone can know it. If they know the public key, and they know the ciphertext, again, the ciphertext is sent across a network, and they know the algorithm, it should be practically impossible for the attacker to find the corresponding private key, because if they can, it's no longer private. So given a public key and given the ciphertext, it should be impossible to find the private key. And note that the public key and private key are related somehow. They're not just random; they have some mathematical relationship. So if you know the public key, it should still be hard for the attacker to find the private key. Similarly, if the attacker knows the public key and they know the ciphertext, it should be hard to find the original plaintext. They shouldn't be able to decrypt unless they have the correct key. We require that. Here it says computationally infeasible, practically impossible. In theory it may be possible, but in practice it must be so slow that it would take millions of years to do. This one is optional, not so relevant at this point in time.
So we've introduced a new concept for cryptography which uses two different keys, and it's very important in many systems today in the internet. And it turns out both symmetric key cryptography and public key cryptography are used in practice today, but often for different purposes. They have different advantages and disadvantages; we'll see some of them over this course. What we'll do on Thursday is look a little bit at attacks, give an example of a public key cryptography algorithm and the security of it, and I think we'll skip key management and try to finish this topic on digital signatures, which combine public key encryption and hash functions. We'll see how they're combined to provide some form of signature, and
let's hope we can finish that on Thursday. Has everyone done the homework?
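The requirement that the private key must be computationally infeasible to find from the public key can be made concrete with a toy example. The sketch below assumes textbook RSA with a deliberately tiny modulus; the key values and the function recover_private_exponent are hypothetical and chosen only for illustration. With such a small key the attacker can factor the modulus by trial division and derive the private key, which is exactly what large key sizes are meant to prevent.

```python
def recover_private_exponent(public_key):
    """Attack a toy RSA public key: factor the modulus by trial division,
    then compute the private exponent from the factors."""
    e, n = public_key
    p = next(f for f in range(2, n) if n % f == 0)  # smallest prime factor of n
    q = n // p
    phi = (p - 1) * (q - 1)
    return pow(e, -1, phi)  # modular inverse of e (needs Python 3.8 or later)


PU_B = (17, 3233)                    # hypothetical public key, known to everyone
ciphertext = pow(65, 17, 3233)       # intercepted ciphertext for the message 65

d = recover_private_exponent(PU_B)   # the attacker now holds B's private exponent
plaintext = pow(ciphertext, d, 3233)
print(d, plaintext)
```

The trial division loop takes at most a few thousand steps here. For a modulus of realistic size the same loop would not finish in any practical amount of time, which is the sense of "computationally infeasible" used in the requirements.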