 So welcome to this last session or second last session, this is a half session and then we'll continue after tea with the rest of this thing. In the afternoon we just have some question answers, maybe something about your feedback and then of course giving out distribution of the certificates. So I tried to keep things a little bit simple because I didn't know what the audience was like or what to expect. And then I was pleasantly surprised to find that many people were either finished, who had finished their PhDs, some were doing their PhDs, some plan to do their PhDs, some were working on cryptography and so on and so forth. So I decided to make at least a little bit of this lecture a little bit high level, high level in a sense, a little bit more intense than the previous lectures because I felt that there was a pretty sophisticated audience around. So if you find it a little spicy just forgive me, but at least the first part will be okay. But the second part we'll talk a little bit about some of the research that our students here in IIT are doing on both elliptic curve cryptography and on side channel attacks on AES for example. And then hopefully we'll have time to show you something that we are doing in the area of clickjacking. So the discrete log problem, I think most of you know about this, how many of you know? What's the discrete log problem? Okay, I think this is an important thing that one should know. The discrete log problem and as usual we always focus on application. So what is the theory and then finally how do we apply it? Let P be a prime number and let G be a primitive root of P. So basically a generator of the set ZP star. The function FA is equal to G raised to a mod P is called modular exponentiation with base G and modulus P. So basically I have a prime number to start with and then I have a generator. So what is the generator of this group? So there is a group actually. So let me just very, very briefly talk about what is a group. 
A group is two things. A set, let's say the set is G itself. So this is the group which comprises a set and then which also includes an operator. So let's say the operator is plus. And then this group respects these four properties closure, associativity, identity and inverse. Basically these properties are closure is if A and B belongs to then A star B or A plus B belongs to G. So this is one property. So this is closure. Then the next property is associativity. You can associate the parentheses in different ways and you'll still get the same answer. Associativity, identity. There exists a unique element. Let's call it little E such that for all A belonging to G, it is true that E star A is equal to A star E is equal to A. So this is the identity element, a specific unique element inside this group. And finally the inverse. For all A belonging to G, there exists a unique element B such that what? A star B is equal to the identity. So these are the properties of a group. And the reason I brought this up is because the discrete log problem relates to a group. So in this particular thing, when I talked about a prime number and so on, I was really referring to the group. So a special case of this is the group with the operator, multiplication, modulo p. So I assume everybody knows what is modulo arithmetic. So this is the group that we are talking about. And we'll refer to this group as Zp star. So we say that this group has a generator, which is, let's say the generator is x if. So x is a generator if x, x square mod p, x cube mod p and so on. These things up to x to the power of p minus 1 mod p. This gives you all the elements in this group. So it's a generator if x is a generator, if x, x square mod p, x cube mod p, et cetera, et cetera, et cetera. Right up to x raised to p minus 1 mod p comprises all the elements. All of these things are different and they completely cover all these elements in the group. So this is the definition of a group. 
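As a small sketch of the generator definition just given, one can list the powers of x modulo p and check that they cover every element of Zp star. The function name `is_generator` is made up for illustration, and this brute-force check is only sensible for toy primes, not the thousands-of-bits primes used in practice.

```python
def is_generator(x, p):
    """Return True if x, x^2, ..., x^(p-1) mod p cover all of 1..p-1."""
    seen = set()
    value = 1
    for _ in range(p - 1):
        value = (value * x) % p  # next power of x, reduced modulo p
        seen.add(value)
    # x is a generator exactly when all p-1 powers are different
    return len(seen) == p - 1


print(is_generator(2, 29))  # 2 is used as a generator mod 29 in the lecture
print(is_generator(4, 29))  # 4 = 2^2 only reaches half of the group
```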
This is the definition of a group generator. So with this background, let us see what these two problems are, modular exponentiation and the discrete log problem. So once again, let P be a prime number. Let G be a generator of this group, the group Zp star, which has elements 1, 2, 3, right up to P minus 1 with operation multiplication modulo P. When you say group, you have to define two things. You have to define the set and you have to define the operator. So this operation, F of A is equal to G raised to A mod P, for A lying between 0 and P minus 1, is called the modular exponentiation function with base G and modulus P. The interesting thing is the reverse operation. So just like in normal arithmetic you have exponentiation and the reverse is the logarithm, here also, but it's a different kind of logarithm because it's modulo P. This is called the discrete log. So the inverse, A is equal to the log of Y to the base G, where Y is G raised to A mod P, is called the discrete logarithm. Modular exponentiation is not very difficult because there are clever ways to make it fast. On the other hand, the discrete log problem is supposed to be infeasible. Infeasible not over just any group, but over very large groups, and over very specific very large groups. So the first thing we ask ourselves is what are those specific groups over which the discrete log problem is difficult. So an example of the discrete log, I take the elements 1, 2, 3 up to 28 and the operation is multiplication modulo 29. So is this visible to everybody? So 2 raised to 1 is equal to 2. 2 raised to 2 is equal to 4. This is like normal arithmetic, 2 raised to 3 is 8. 2 raised to 4 is 16, but then 2 raised to 5 is 3. Why is it 3 and not 32? It's all modulo 29. So like this you can go on, and then you can ask yourself the question, the discrete logarithm of 18 to the base 2 mod 29 is what? So the question is, what is that exponent such that if I raise 2 to that power, I get the number 18?
I look for where I find 18, and the answer is 11, since 2 raised to 11 is 18 mod 29. So I don't know whether this is visible. So I was given the number 18. So first and foremost you are given the prime number 29. You are given a generator. There are various mathematical tests to find the generator. I won't get into that here. So you are given the prime, you are given the generator 2, and then you are given this number 18. And then you have to find out, what is this exponent? This is the discrete log problem. I'm given the prime, the generator and the result, and I need to find the exponent. So that's the reverse of the case where I was given the prime, the generator and the exponent, and I find the result. That is modular exponentiation. On the other hand, if I'm not given the exponent but I'm given the result and I have to find the exponent, that's the inverse problem, which is the discrete log. And that is the thing that is very time consuming, which is generally infeasible if I've chosen a proper group. Now this is a toy example with just 28 elements. There's no big deal in this. But I can have something with around 2 raised to 1000 elements, for example. And over that sort of group the discrete log problem is going to be very difficult. So every one of these crypto schemes is based on the difficulty of some problem. For example, RSA is based on the difficulty of factorizing a large number. When I say large, I'm talking about thousands of bits. A large number that is itself a product of two large primes. So we'll see now crypto schemes which are based on the discrete log problem, the difficulty or the infeasibility of the discrete log. Another example, let P be 131, a prime number. It happens that a generator in this group is 2. Then for the discrete log problem you can write a little program. Ask your students to write a little program to find the discrete log of 72, or the discrete log of anything, to the base 2 mod 131. It turns out to be 17, since 2 raised to 17 mod 131 evaluates to 72.
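The "little program" suggested above could look something like the following. This is only a brute-force sketch for toy primes (the function name `discrete_log` is an illustrative choice); it simply walks through the powers of g until it hits the target, which is exactly the search that becomes infeasible for large groups.

```python
def discrete_log(g, y, p):
    """Find a with g^a mod p == y by trying a = 0, 1, 2, ... (toy sizes only)."""
    value = 1  # g^0
    for a in range(p - 1):
        if value == y:
            return a
        value = (value * g) % p  # move on to g^(a+1) mod p
    return None  # y is not a power of g modulo p


print(discrete_log(2, 18, 29))   # lecture example: 2^11 mod 29 = 18
print(discrete_log(2, 72, 131))  # lecture example: 2^17 mod 131 = 72
```

The loop takes up to p minus 1 steps, so for a prime of around 1000 bits this approach is hopeless, which is the whole point of the scheme.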
So if this problem is very difficult, can we make use of it? So the first thing, which many of you have already seen, is Diffie-Hellman key exchange. And then also Diffie-Hellman authentication. And can we also use this thing for encryption and decryption and signature generation and verification? And the answer is yes. So how many of you have seen the application of discrete log to encryption and to signature generation? One, okay, a couple. So that's an interesting thing. And also when we go to elliptic curve cryptography we will need to use the same kind of logic as we use over here. So in the case of elliptic curves it's called the EC discrete log problem. In this case it's called discrete log. Diffie-Hellman is one of the first ideas, published in 1976, of this business of public key and private key. As we had seen on the very first day, this is a very powerful idea that gives us not just encryption but also gives us signatures etc. So the problem is, two users need to exchange a key securely. They need to actually agree upon a key securely. Assume A and B have agreed on a P and G beforehand, or they decide on these during the current session. Then how do they proceed to obtain a common key? So in pictures, to make it a little simpler and faster. In pictures you have A and B over here, and assume that both know the values of P and the generator G. So the first guy, A, she chooses an integer little A, which is a number between what? Any arbitrary integer or something special? 1 to P minus 1. Exactly. Less than P minus 1. So she chooses a number between 1 and P minus 1. Hopefully she doesn't choose 1 and simple things like that. And then she computes this G raised to A mod P. And then she sends it across. If the other guy doesn't know G and P, she sends across G, P and G raised to A mod P. And then this guy chooses a B and computes the so-called partial key G raised to B mod P and sends that thing across.
And then she computes in turn she takes the so-called this is like a public key G raise to B mod P. Knowing the public key I cannot find the corresponding private key. There is a mathematical relationship so it's like that's why I say this is the first idea of a public key private key pair. This is like a private key and this is like the corresponding public key. Knowing the public key you cannot find the private key. Why? Because of the hardness of the discrete log problem. So she sends this across and guess what he does? He takes that quantity, raises it to the power of, she raises it to the power of her private key little A and she gets G raise to A B mod P. The same thing this guy does. He takes what she has given him. G raise to A mod P is what she sent across and his private key is little B and he raises it to the power of little B and both now agree on a common secret which is G raise to A B mod P. So here's a very simple example which you can just try or ask your students to try. Let's again take the prime number 131 and G equals 2. Let the random number chosen by A be 24. So this is her private key, 24. Her partial key, public key is 2 raise to 24 mod 131 which is 46. B chooses his own private key, let's say it's 17. His public key is 2 raise to 17 mod 131 which is 72. Now what exactly happens? After receiving B's partial key A computes the shared secret as 72 which is B's public key which he sent across. He could have sent it in a certificate also. So 72 to the power of 24 mod 131 which is 13 and likewise after receiving A's partial key B computes 46 which is this number here to the power of 17 which is his private key. Only he knows his private key so he can compute this thing and now both sides have agreed upon a common secret. This is one very extensively used method of computing a common secret between two parties. There's a standard well-known protocol or actually a trick to be able to attack this thing and you can figure out why this attack works. 
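The Diffie-Hellman example with P equal to 131 can be checked in a few lines. This is just a sketch with the toy numbers from the lecture; the variable names are illustrative, and Python's built-in three-argument `pow` does the modular exponentiation.

```python
p, g = 131, 2          # public parameters agreed on by A and B

a_private = 24         # A's random number (her private key)
b_private = 17         # B's random number (his private key)

a_public = pow(g, a_private, p)   # A's partial key, g^a mod p
b_public = pow(g, b_private, p)   # B's partial key, g^b mod p

# Each side raises the other's partial key to its own private key.
secret_at_a = pow(b_public, a_private, p)   # (g^b)^a mod p
secret_at_b = pow(a_public, b_private, p)   # (g^a)^b mod p

print(a_public, b_public)         # 46 72
print(secret_at_a, secret_at_b)   # 13 13
```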
So guess what? A standard man in the middle attack. So as usual she chooses a little A and the corresponding G raised to A mod P and sends this across, and then the attacker, what does he do? He stands in the middle, he chooses a number little C, he computes G raised to C mod P and sends that thing across. And then B thinks that this quantity G raised to C mod P has come from her, but actually it's been changed by the attacker. In any case B computes G raised to B mod P after choosing a little B and sends that thing across. Same old trick, the attacker intercepts it and replaces that value G raised to B mod P with G raised to C mod P, the same thing that he had computed here. And when he sends it across, she happily computes this number G raised to C mod P, raised to the power of her secret little A, and she gets G raised to AC mod P, and B computes G raised to BC mod P. And now guess what happens in the rest of this thing. She sends something encrypted with this secret, G raised to AC mod P. It's a secret shared between her and the attacker. So every single message that she sends, the attacker decrypts it with this key, reads it, perhaps even modifies it, and resends it encrypted with the other secret, G raised to BC mod P, which he shares with B. B responds, and that too is intercepted by the attacker. It's an active man in the middle attack. Everything is intercepted, read, possibly changed, re-encrypted and sent. How do you solve this problem? Authenticated Diffie-Hellman key exchange. So the problem over here is, I didn't really know that when I got this quantity it was sent by the attacker. I was under the impression this thing was being sent by A, but actually it was sent by the attacker. So if I can somehow authenticate this message, I'm in good shape. So that's what we call authenticated Diffie-Hellman key exchange. Let's try to use this discrete logarithm idea for encryption and for signature generation, etc.
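To see why the man in the middle attack on unauthenticated Diffie-Hellman works, here is a small sketch with the same toy parameters as before. The attacker's value c = 5 is an arbitrary choice made up for this illustration, as are the variable names.

```python
p, g = 131, 2
a = 24   # A's private key
b = 17   # B's private key
c = 5    # attacker's own secret (arbitrary value, for illustration only)

g_a = pow(g, a, p)   # what A sends; intercepted by the attacker
g_b = pow(g, b, p)   # what B sends; intercepted by the attacker
g_c = pow(g, c, p)   # what the attacker forwards to both sides instead

key_at_a = pow(g_c, a, p)   # A believes this is shared with B
key_at_b = pow(g_c, b, p)   # B believes this is shared with A

# The attacker can compute both keys from the values he intercepted.
attacker_key_with_a = pow(g_a, c, p)   # g^(ac) mod p
attacker_key_with_b = pow(g_b, c, p)   # g^(bc) mod p

print(key_at_a == attacker_key_with_a)  # True
print(key_at_b == attacker_key_with_b)  # True
print(key_at_a == key_at_b)             # False: A and B share nothing
```

Nothing in the exchanged values tells A or B who actually sent them, which is the gap that authenticated Diffie-Hellman closes.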
So as usual, given a large prime P and a generator G, so we are talking now again about a prime P that is thousands of bits long. A generator in this group Zp star. Zp star is all the elements between what and what? Zero and P. Yes or no? What is this group? What does it contain? One to P minus one. Not one to P, not zero to P, not zero to P minus one, but one to P minus one. So it's got P minus one elements. And the operation is multiplication modulo P, not P minus one. Multiplication modulo P. An ElGamal private key is an integer A, A lying between one and P minus one. So how do I generate an ElGamal key pair? Choose a random integer between one and P minus one. And then the corresponding public key, so the public key is not just one value. In the case of RSA, what is the public key? I've got E and D, encryption, decryption. In my RSA certificate, what is included when I say public key? Is there just one quantity or two things? What are those two things? E and N. E and N, N is the modulus. Exactly in the same way, you have to include not just the public key, but also the parameters. In this case, there are two parameters as mentioned before, the prime number and the generator in that group. And then this alpha, which is the public key. G raised to A mod P, this is the public key. Once again, the private key, the public key. Once you know the private key, it's relatively easy to find the public key. Use modular exponentiation. If you know the public key, it's very hard to get A. Why? You have to crack the discrete log problem. So once again, please keep that in mind. From here to here, modular exponentiation. From here to here, discrete logarithm. Why did I say modular exponentiation is easier? Exactly. Because you can use the well-known square and multiply algorithm.
You compute G, G squared mod P, G raised to 4 mod P, et cetera, et cetera, et cetera. And then you choose the terms corresponding to ones in the binary representation of A. You just represent A as a binary number and look at the ones over there. And take only those powers that correspond to ones in the binary representation of A, and multiply them together. So the same kind of idea we're going to use in elliptic curve cryptography. That's why it's important to understand this well. Okay, how do we do encryption? So we know her public key, P, G, alpha. Her private key is little A. Once again, the private key is over here, the public key is over here. So let P, G, alpha be the public key of A. To encrypt an arbitrary message M less than P minus 1 to be sent to A, B does the following. So it's assumed that B has her public key via a certificate, for example. So B chooses a random number R between 1 and P minus 1 such that R is relatively prime to P minus 1. You might want to think later on as to why this is necessary, that R is relatively prime to P minus 1. The details are in the book. Then B computes, so it's very interesting. The ciphertext corresponding to this message M has got two components. This is the first component and the second component. So the first component is G raised to R mod P. And the second component is, take M multiplied by alpha, the public key, raised to the power of R. This is the random number that you just chose. You use it over here and use it over here. So what he does is he takes the message, he takes her public key, he takes the random number which he just generated and he computes M times alpha to the power of R mod P. That is the second component of the ciphertext corresponding to this message M. So the ciphertext is two numbers, C1 and C2, and he sends this to A. So this is a very interesting thing.
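Going back to the square and multiply remark at the start of this part, here is one way it might be written out. This is a sketch of the right-to-left binary variant (the function name is illustrative); in real Python code you would just call the built-in `pow(g, a, p)`, which does the same job.

```python
def square_and_multiply(g, a, p):
    """Compute g^a mod p using the binary representation of a."""
    result = 1
    power = g % p  # holds g^(2^i) mod p: g, g^2, g^4, g^8, ...
    while a > 0:
        if a & 1:                       # this bit of a is a one
            result = (result * power) % p
        power = (power * power) % p     # square to get the next power
        a >>= 1                         # move to the next bit of a
    return result


print(square_and_multiply(2, 24, 131))  # 46, A's partial key from earlier
```

The number of loop iterations is the number of bits in a, so even an exponent thousands of bits long needs only a few thousand modular multiplications.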
You would have a different pair of these numbers if he chose some other R. So corresponding to the same message, if he encrypts the same message 10 times you will get 10 different ciphertexts. Just imagine, unlike in the case of RSA for example. Any guesses about the decryption? So this was the encryption. Let me try to decrypt in the following way. If I know C1, of course I know P and I know G, so I can get R, correct? And then I know the public key, raise it to the power of R and so on. I know this, I know alpha, I know R, I've just got R from this. So I just move this on this side, alpha R inverse over here and I get the message. Correct or not? Impossible, right? Why? I was just trying to see whether you're alert. There's no way I can do this. I cannot, even though I have P, I have G from the certificate, I cannot simply, and I've got C1 because B send this thing to A. A cannot just take P and G and the C1 and compute R because it's virtually impossible to crack the discrete log problem. So that's not the way she decrypts it. What is the way to decrypt it? Yeah. So take C1, raise it to the power of minus A. What is C1? G raised to R. C1 is G raised to R. The following, she takes that C1, raises it to the power of her private key, but minus that. Minus a private key, raises it to the power of that. That is basically raises it to the power of her private key and then takes the inverse of that. Everything modulo P. And then multiplies it by the second component of the ciphertext. So you can very well see now what's going on. C1 to the power of minus A multiplied by C2 is G raised to R mod P to the power of minus A. And you multiply this by M times alpha raised to R mod P. So by the laws of modulo arithmetic, I can extract that mod P right outside and G raised to minus A R, G raised to A R and M. These two things cancel out very nicely and I'm left with the message. So that's how she recovers the message. Any questions so far? How do we know this thing is secure? 
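The cancellation described above can be written out as a short derivation. The symbols are the ones used in the lecture: private key a, public key alpha = g^a mod p, random number r, ciphertext (C1, C2).

```latex
\begin{align*}
C_1 &\equiv g^{r} \pmod{p}, \qquad
C_2 \equiv M\,\alpha^{r} \equiv M\,g^{ar} \pmod{p} \\
C_1^{-a}\,C_2 &\equiv \left(g^{r}\right)^{-a} \cdot M\,g^{ar}
  \equiv g^{-ar}\,g^{ar}\,M
  \equiv M \pmod{p}
\end{align*}
```

Only the holder of a can form C1 raised to minus a, which is why decryption needs the private key while encryption needs only alpha.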
So as I said before, one possibility is, let's see if we can recover R. So I sit down there. I mean, I'm tapping the line. I know what is C1 and C2. C1 is this stuff. I try to recover R from somewhere. So M is equal to C2 times alpha raised to minus R. However, recovering R, so I can probably, this is what I'm hoping to do to get R from this. And then once I get R from this, I use it over here to get M. But this is impossible because of the infeasibility of the discrete log problem. Now that is one problem. The other, as I said before, is in the case of key exchange is unauthenticated Diffie-Hellman key exchange. So there are different kinds of problems. Now there are some problems with the implementation software. Guess what? Use the same random number twice to encrypt two different messages M and M prime. Then it may be possible to launch a known plaintext attack. So most specifically, the problem is stated as follows. If the attacker knows M, so there is a plaintext, ciphertext pair M and C. M comma C, the plaintext and the corresponding ciphertext is C. And then what I'm looking for is what is this other plaintext corresponding to ciphertext C prime. So the attacker knows M and the ciphertext corresponding to plaintexts M and M prime. Let's call them C and C prime respectively. Then it is possible for the attacker to compute M prime if my implementation software makes the mistake of using the same random number twice. So this is not a very difficult problem. You can just try it and see how if I use the same random number R to encrypt two different messages M and M prime. And furthermore, if the hacker knows M and knows C and knows C prime, then he can recover M prime. This is known as a known plaintext attack. So you can very well see the mathematics of it, how to recover M prime. To recover M prime, if you know M, C and C prime, assuming that the software was buggy enough to use the same random number two or more times. Okay, so this is one example. 
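The known plaintext attack left as an exercise above can be sketched as follows. If the same r is used twice, then C2 is M times alpha^r and C2 prime is M prime times alpha^r, so dividing one by the other removes alpha^r. The function names and the second message value 20 are made up for the illustration; the modular inverse uses Fermat's little theorem, which assumes p is prime.

```python
def elgamal_encrypt(p, g, alpha, m, r):
    """Toy ElGamal encryption: returns (g^r mod p, m * alpha^r mod p)."""
    return pow(g, r, p), (m * pow(alpha, r, p)) % p


def recover_second_plaintext(p, m, c, c_prime):
    """Given known plaintext m with ciphertext c, recover m' from c'
    when both ciphertexts were produced with the same random number r."""
    c1, c2 = c
    c1_prime, c2_prime = c_prime
    if c1 != c1_prime:
        raise ValueError("different r was used; the attack does not apply")
    c2_inverse = pow(c2, p - 2, p)            # inverse of c2 mod prime p
    return (c2_prime * c2_inverse * m) % p    # m' = c2' * c2^-1 * m


p, g, alpha = 131, 2, 14      # public key from the lecture's example
r = 33                        # the buggy sender reuses this for both messages
c = elgamal_encrypt(p, g, alpha, 75, r)         # attacker knows m = 75
c_prime = elgamal_encrypt(p, g, alpha, 20, r)   # attacker wants this one
print(recover_second_plaintext(p, 75, c, c_prime))  # 20
```

Note that the attacker never learns r or the private key; the reuse alone is enough.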
Let P be the prime number 131 and G equals 2. Let A's private key be 97. So her public key is 2 raised to 97 mod 131, which is 14. And if the message to be sent is 75, let the sender B choose the random number R equals 33. So this is one of the things that he chooses. And as I said before, don't choose the same number twice. So B computes C1 as G raised to R mod P, G is 2 in this case, so 2 raised to 33, the random number, mod 131. He gets 103. And for C2 he takes her public key 14, raises it to the power of 33, the same random number, multiplies it by the message 75, and gets 51. And on the other side, how does she decrypt? So the fundamental thing is, in decryption, you must use the private key. Otherwise it doesn't make any sense. Anybody could decrypt a message. So here's where she uses the private key. So she takes that C1, as you can see in the previous thing, C1 was 103. She uses that 103 over here, raises it to the power of minus her private key and then multiplies by C2, everything modulo 131. And lo and behold, she's able to decrypt the message and get 75. So 75 was the original message. So a very simple example to see how you can verify ElGamal encryption. And finally, ElGamal signatures. So this is another very popular way of doing signatures. And exactly the same kind of logic and same kind of thinking is used to generate an elliptic curve signature. So how many of you have seen elliptic curves before? So because there were some questions during the lunch break about elliptic curves, I decided to just include a little bit of it. So the next session will be a little bit spicy, a little bit research oriented. Some of the kinds of things that I've been doing with my students over here at IIT, both in the area of side channel attacks and ECC optimization. So to complete this example, using the discrete log now for signatures.
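The worked ElGamal encryption example above (P equal to 131, private key 97, message 75, random number 33) can be reproduced with a short sketch. The function names are illustrative. Raising C1 to the power minus a is done here as C1 to the power P minus 1 minus a, which gives the same value modulo a prime P by Fermat's little theorem.

```python
def elgamal_encrypt(p, g, alpha, m, r):
    """Return the ciphertext pair (c1, c2) for message m and random r."""
    c1 = pow(g, r, p)                  # g^r mod p
    c2 = (m * pow(alpha, r, p)) % p    # m * alpha^r mod p
    return c1, c2


def elgamal_decrypt(p, a, c1, c2):
    """Recover m as c1^(-a) * c2 mod p using the private key a."""
    c1_to_minus_a = pow(c1, p - 1 - a, p)   # same as c1^(-a) mod prime p
    return (c1_to_minus_a * c2) % p


p, g = 131, 2
a = 97                      # A's private key
alpha = pow(g, a, p)        # A's public key
c1, c2 = elgamal_encrypt(p, g, alpha, m=75, r=33)

print(alpha)                           # 14
print(c1, c2)                          # 103 51
print(elgamal_decrypt(p, a, c1, c2))   # 75
```

Running the encryption again with a different r gives a different pair for the same message, which matches the remark that ten encryptions give ten different ciphertexts.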
So what did we use the discrete log idea for? Let's summarize. We used the discrete log idea for, number one, key exchange. It's the so-called Diffie-Hellman key exchange. We used it for encryption. It's called ElGamal encryption. And there are many variations of that thing. And now we are finally using it for the third application, ElGamal signatures. So as usual, let A and this triplet P, G, alpha be her private and public keys respectively. Now she's got to sign a message, so what does she do? Something tells you that to sign a message, you need to use your private key. If you don't, something is wrong. So to sign a message M, she does the following. She computes the hash of the message, a cryptographic hash. She chooses a random number R as usual, R lying between 1 and P minus 1. Choose this very carefully. Don't repeat it. And such that R is relatively prime to P minus 1. She then computes X. So once again, the signature has got two numbers, X and Y. The first number, X, is the generator raised to the power of the random number, mod P. And then she computes Y, the second component, which is H of M minus A times X, where she uses her private key. She must use her private key. So H of M minus AX, multiplied by R inverse, mod P minus 1. And her signature on this message is the pair of quantities. So what she sends to B for verification is the original message M and the signature, which is now two quantities, two integers, X and Y, computed in the following fashion. And you might want to tease yourself and ask yourself, how do I verify this message? So I'm giving you the answer. Check it at home. For signature verification, the following check is performed. Just like we talked about RSA signature verification, this is ElGamal signature verification. What do you do? You've got the message. Now who's verifying? Somebody else, this guy B. Something tells you that to verify the signature, you have to use her public key. To sign, she uses her private key.
To verify, that other guy has to use her public key, which is alpha. Don't forget, the public key is G, P and alpha. These three things are inside the certificate. So he gets her signing certificate, pulls out this alpha, pulls out the G and the P and then performs this computation. He's also got the message, obviously. So he computes the hash of the message and then takes the generator and raises it to that, G raised to H of M. He also computes this, her public key to the power of X. So X and Y are, let's go back, X and Y are the two components of the signature. So she has sent what? She has sent the message and she has sent the signature, whose two components are X and Y, and of course the certificate. So he gets P and G and alpha from the certificate. X and Y are the two components of the signature, and he has the message itself. After that he computes this thing. So then he checks whether this is equal to this, and you can actually verify using those two equations that if indeed this thing is the same as this thing, then the signature is verified. So in the textbook it just goes through about four or five steps as to why this is sufficient for verification. So that's a quick run through the discrete log problem and its applications to these three different things: key exchange, encryption and decryption, signature generation and verification. Now the next thing that I'll be talking about just after the tea break is, what are the attacks on AES? We've been boasting all this while that AES is very secure. But believe it or not, security is an open-ended subject. You can never say anything is perfectly secure. After seeing so many security problems, I feel very hesitant to say this thing is perfectly secure. So about four or five years ago, when students asked me, is AES secure?
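For checking the ElGamal signature at home, here is a small sketch. The signing formulas are the ones from the lecture. The verification check used is the standard ElGamal one, g^h(M) equal to alpha^x times x^y mod p, which is presumably what the slide shows. Everything else is an assumption made for this toy: the hash is SHA-256 reduced mod P minus 1, the function names are invented, and `pow(r, -1, n)` for the modular inverse needs Python 3.8 or later.

```python
import hashlib
from math import gcd


def toy_hash(message, p):
    """Assumption for this toy: SHA-256 of the message, reduced mod p-1."""
    digest = hashlib.sha256(message).digest()
    return int.from_bytes(digest, "big") % (p - 1)


def elgamal_sign(p, g, a, message, r):
    """Sign with private key a: x = g^r mod p, y = (h - a*x) * r^-1 mod (p-1)."""
    if gcd(r, p - 1) != 1:
        raise ValueError("r must be relatively prime to p - 1")
    h = toy_hash(message, p)
    x = pow(g, r, p)
    y = ((h - a * x) * pow(r, -1, p - 1)) % (p - 1)
    return x, y


def elgamal_verify(p, g, alpha, message, x, y):
    """Verify with public key alpha: g^h == alpha^x * x^y (mod p)."""
    if not 0 < x < p:
        return False
    h = toy_hash(message, p)
    return pow(g, h, p) == (pow(alpha, x, p) * pow(x, y, p)) % p


p, g, a = 131, 2, 97
alpha = pow(g, a, p)                       # public key, 14
x, y = elgamal_sign(p, g, a, b"hello", r=33)
print(elgamal_verify(p, g, alpha, b"hello", x, y))                  # True
print(elgamal_verify(p, g, alpha, b"hello", x, (y + 1) % (p - 1)))  # False
```

The check works because alpha^x times x^y is g^(ax) times g^(ry), and ry is h(M) minus ax modulo P minus 1, so the exponents add up to h(M).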
I said of course, it is very secure, everybody has tried it out and this and that, until I started encountering papers where they hacked into AES. So once I saw that and I saw cache-based side channel attacks, I challenged my students to redo some of those attacks and see whether they are actually feasible in practice. So that happens to be a very difficult subject, and it's especially important in the context of cloud computing, where you might have multiple users on the same core, or multiple users on different virtual machines but sharing the same core. So what we did was multiple processes on the same core. One process is the attacker process, one process is the victim process, and we'll tell you how we actually attacked AES. So that's after the break.