 Good morning everyone, welcome to session 1 of day 8. We start talking about some of the research, research problems in cyber security, a very limited set of problems and some of the things that we are doing here in this department at IIT Bombay. So, I will just straight away go to. So, there were a couple of questions by the audience, can you discuss elliptic curve cryptography, can you introduce it and talk about some of it. So, this is a rather ambitious task today to actually introduce elliptic curve cryptography and then look at some of the research that we have done some very very interesting research problems on optimizing what is called scalar multiplication in ECC. So, before we get to the heart of this problem, let us just introduce ECC. So, what is elliptic curve? We consider elliptic curve over 3 different fields. So, we define briefly what was a field, if I have to talk about fields in great detail it will take several lectures. So, what I did was to summarize or very briefly introduce the idea of a field and before that the idea of a group. So, if I was I were to recall the definition of a field a field has got a set the set can be a finite set it can be an infinite set and then there are two operators the generic plus operator and the generic multiply operator. In a nutshell the field the set of elements together with the generic plus operator constitutes a group. And as mentioned there are four group properties namely closure, associativity, the existence of an identity and the inverse. So, once again the set of elements in the field together with the generic plus operator constitutes a group. Also the elements in the field minus the additive identity. So, exclude the additive identity and that set together with the generic multiply operator again constitutes a group. We call that the multiplicative group of the field and the earlier group is referred to as the additive group of the field. 
And then there is also the property of distributivity that I mentioned some lectures ago. Now, there are three fields to consider. One is a well-known field, the field of real numbers, with the ordinary arithmetic operators plus and multiply. This is the field of reals which everybody is familiar with from school days. The two fields that are actually of interest to us are finite fields: the prime field and the binary field. The prime field is nothing else but this: you choose a prime number p, let us say p = 7, and take all the integers 0, 1, 2, 3, 4 right up to 6, with the operators being addition modulo 7 and multiplication modulo 7. So, that is another kind of field besides the real numbers. Then there are binary fields. I will not get into the details of binary fields because it is a little time consuming to deal with all the operations; I briefly mentioned binary fields in the context of AES some lectures ago. So, we will look at what the elliptic curve operations are in each of these fields, and even before that, what an elliptic curve is, that is, the equation of an elliptic curve. An elliptic curve over reals includes all points (x, y) that satisfy the equation y^2 = x^3 + ax + b. The coordinates x and y and the coefficients a and b are all real numbers. We refer to the graph of the points that satisfy this equation as an elliptic curve over reals. We say over reals because the coordinates and the coefficients a and b are real numbers. The constraint is that 4a^3 + 27b^2 should not be equal to 0. So, this is the equation of the elliptic curve: you are given the coefficients a and b, you figure out which x and y satisfy the equation, you plot them, and that is the elliptic curve.
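As a small illustration of the prime field with p = 7 described above, here is a minimal Python sketch. The function names `f_add`, `f_mul` and `f_inv` are made up for this example and do not come from the lecture; it assumes Python 3.8 or later for the modular inverse.

```python
# Toy prime field F_7: the elements are 0..6, and the two operators are
# addition modulo 7 and multiplication modulo 7.
P = 7

def f_add(x, y):
    """Generic 'plus' operator of the field: addition modulo P."""
    return (x + y) % P

def f_mul(x, y):
    """Generic 'multiply' operator of the field: multiplication modulo P."""
    return (x * y) % P

def f_inv(x):
    """Multiplicative inverse modulo P (zero has no inverse)."""
    if x % P == 0:
        raise ZeroDivisionError("the additive identity has no multiplicative inverse")
    return pow(x, -1, P)  # three-argument pow with exponent -1 needs Python 3.8+

# Every non-zero element has an inverse, so the non-zero elements form
# the multiplicative group of the field.
inverses = {x: f_inv(x) for x in range(1, P)}
print(inverses)
```

The dictionary printed at the end pairs each non-zero element with its inverse, which is the "existence of an inverse" group property for the multiplicative group.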
It is interesting that the elliptic curve also includes the so-called point at infinity, which we designate as O. This is not zero, this is the point at infinity, and as we will soon see, this element is actually the identity of the elliptic curve group. I will come to that in a second; just note for now that there is a very special point. It is not the point with coordinates (0, 0) or anything like that. It is an abstract point, the so-called point at infinity, which is also part of the elliptic curve. So, here is an example of a curve you can plot yourself: y^2 = x^3 - 5x + 8. I think some of my students plotted this in an Excel sheet. So, this is what the curve looks like. Interestingly enough, if I change the coefficients a little bit, the curve looks completely different; it looks like it has two separate pieces. Once again, this is an elliptic curve over real numbers, and there are an infinite number of points on this curve. Now, once we have defined the curve and understand what it looks like, the next thing is the operations on this curve. The first is what is referred to as unary negation, unary as opposed to binary, because there is just one operand. If the coordinates of P are (x, y), then -P, the negative of P, has coordinates (x, -y). The next thing we need to define is point addition. I would like to add two points P and Q on the elliptic curve. Suppose the coordinates of P are (x1, y1) and the coordinates of Q are (x2, y2); what are the coordinates of the sum of P and Q? Say those coordinates are (x3, y3). Then what is the expression for x3 in terms of x1, y1, x2, y2, and what is the expression for y3 in terms of x1, y1, x2, y2?
So, we need to derive that, but first there is a nice geometrical interpretation for the addition of 2 points in the curve. So, to add 2 points draw a straight line connecting them let the third point of intersection between this line and the elliptic curve be the x 3 comma minus y 3 then the unary negative of this is actually the point that we are interested in which is x 3 comma y 3 this is the sum of p and q. So, let us just look at this there are some degenerate cases which are handled separately, but just let us look at the most important cases first. So, here is an example I am adding 2 points a and b now what should I do join them with a simple straight line segment and extend that line segment until it hopefully hits the curve at a third point. So, almost always it will, but occasionally one of them is a tangent and so on and so forth. So, it may not hit the curve at a third point, but those are degenerate cases for the time being let us talk about the normal case I connect a and b and it hits the curve at this point which is actually minus c. So, I need to find the negative of this point to obtain the sum of a and b and the negative of a point as mentioned on the previous slide is if the coordinates are x y then the negative of that point is take the same x coordinate, take the y coordinate and take the negative of that y coordinate that is the mirror image along the x axis. So, I join these 2 a and b it hits the curve at a third point and then I take the mirror image along the x axis. So, that is what that is what that is gives me this thing. So, this coordinate is x 3 y 3 then this is x 3 comma minus y 3. So, this is a geometrical interpretation of the addition of 2 points on the elliptic curve. 
So, this is the elliptic curve over reals, that is, over the real field. These are two arbitrary points A and B that I have chosen. I want to add them, so I connect them, see where the line intersects the curve at a third point, and then take the mirror image, and I get the sum A + B. Now, you can go through the derivation, it is all in the textbook, to find the coordinates of R, the sum of P and Q. If the coordinates of P are (x1, y1), the coordinates of Q are (x2, y2) and the coordinates of R are (x3, y3), then you can express x3 and y3 in terms of the known quantities: x3 = s^2 - x1 - x2 and y3 = -y1 + s(x1 - x3), where s is the slope. From high school coordinate geometry you know how to find the slope of the line connecting two points (x1, y1) and (x2, y2): s = (y2 - y1) / (x2 - x1). So, this s is nothing else but the slope of the line segment connecting P and Q. So, we talked about unary negation, then point addition, and the third thing is doubling of a point. Again there is a nice geometric interpretation: to double a point P, draw a tangent at P. Let the point of intersection between the tangent and the elliptic curve be (x2, -y2); then the point 2P has coordinates (x2, y2). Let us look at this geometrically. Before, I asked the question what is A + B; now I am asking what is A + A, that is, twice A. So, I draw a tangent at this point, and hopefully it hits the curve at some other place. Then I take the mirror image of that point as usual, and that is twice A.
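The addition formulas just stated can be tried out on the example curve y^2 = x^3 - 5x + 8. The following is only a sketch over floating-point numbers; the helper names and the two sample points are chosen for this illustration, and the degenerate cases (a vertical line, or adding a point to itself) are deliberately left out.

```python
import math

# Example curve over the reals from the lecture: y^2 = x^3 - 5x + 8
A, B = -5.0, 8.0

def on_curve(pt):
    """Check the curve equation up to floating-point rounding."""
    x, y = pt
    return math.isclose(y * y, x ** 3 + A * x + B, rel_tol=1e-9, abs_tol=1e-9)

def negate(pt):
    """Unary negation: mirror image along the x axis."""
    x, y = pt
    return (x, -y)

def add_distinct(p, q):
    """P + Q for two points with different x coordinates (the normal case)."""
    x1, y1 = p
    x2, y2 = q
    if x1 == x2:
        raise ValueError("degenerate case: vertical line or doubling")
    s = (y2 - y1) / (x2 - x1)      # slope of the line through P and Q
    x3 = s * s - x1 - x2
    y3 = -y1 + s * (x1 - x3)
    return (x3, y3)

p = (1.0, 2.0)           # 2^2 = 1 - 5 + 8, so this point is on the curve
q = (-1.75, -3.375)      # another point on the same curve
r = add_distinct(p, q)
print(r, on_curve(r))
```

The sum lands back on the curve, and the third intersection of the line with the curve is `negate(r)`, which matches the geometric picture of "intersect, then mirror".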
So, you can think of it as adding A to itself. A and A are next to each other, basically the same point, so the line segment connecting them is the tangent at that point. I see where it intersects the elliptic curve, and the mirror image of that point is twice A. So, these are the three important operations as far as cryptography and security are concerned: unary negation, point addition and point doubling. Please remember these three things. For unary negation, I take any point on the elliptic curve, and the negative of that point is simply its mirror image along the x axis. For adding two points A and B, I join them, see where the line intersects the elliptic curve, and take the negative of that point; that is A + B. And when I want to double a point, I draw the tangent to the curve at that point, see where it hits the elliptic curve, and take the negative of that point. So, these are the three fundamental curve operations. Just as we wrote down the coordinates of P + Q, here are the coordinates of the point 2P: the coordinates (x2, y2) of 2P are expressed in terms of the coordinates (x1, y1) of P, and these are the expressions for x2 and y2. Now, we do not want to learn any of this by heart. We just want to understand what is going on and be able to use these equations to do encryption, decryption, signatures, key exchange and so on. Elliptic curve cryptography is now one of the hot items in crypto; many of your browsers, probably all of them, support ECC. It is the new thing that is likely to replace RSA in the days to come, because it is computationally more efficient and likely to be more secure as well.
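The transcript does not reproduce the doubling expressions from the slide. Assuming the slide shows the standard tangent-slope formulas, s = (3 x1^2 + a) / (2 y1), x2 = s^2 - 2 x1, y2 = -y1 + s(x1 - x2), a minimal sketch on the example curve y^2 = x^3 - 5x + 8 looks like this (the function name `double_point` is made up for the example):

```python
import math

# Example curve over the reals: y^2 = x^3 + A*x + B with A = -5, B = 8
A, B = -5.0, 8.0

def double_point(p):
    """2P via the tangent at P (assumes y1 != 0, i.e. a non-vertical tangent)."""
    x1, y1 = p
    if y1 == 0:
        raise ValueError("vertical tangent: 2P is the point at infinity")
    s = (3 * x1 * x1 + A) / (2 * y1)   # slope of the tangent at P
    x2 = s * s - 2 * x1
    y2 = -y1 + s * (x1 - x2)
    return (x2, y2)

p = (1.0, 2.0)
two_p = double_point(p)
print(two_p)
# Check that the result satisfies the curve equation.
x, y = two_p
print(math.isclose(y * y, x ** 3 + A * x + B))
```

The only difference from point addition is the slope: the chord slope is replaced by the slope of the tangent, obtained by differentiating the curve equation.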
So, having defined these operations the next very important statement is that the points on the elliptic curve form a group. So, we know what is a group once again a group is a set together with an operator and the four properties namely closure associativity the existence of an identity in the group and the existence of an inverse for each point. So, when we say the elliptic curve points form a group then we of course, ask ourselves the what is the operator and the operator is the plus that is the addition of two points on the group. So, when I make a statement that the elliptic curve points form a group then the question is what is the set the set is the points of the on the EC which satisfy those equations that I mentioned before y square is equal to x cube plus A x plus B. So, those are the points on the EC. So, that is the set in this group and then the operator the operator is simply the plus operator that is the addition of two points which is defined in the previous slides both geometrically and through those equations. This is the group operator if I have got two elements in the group then the sum of those two elements is given by R the coordinates are given by these two things. So, this important statement that the EC points form a group. Now, if they form a group the next question is is the discrete log problem feasible on this group or not. Fortunately, it turns out that for carefully chosen groups the elliptic curve discrete log problem is infeasible. I am going to state this without proof. So, you can derive this mathematically and you can show this actually, but it so this turns out to be a very important result. If the discrete log problem is going to be infeasible over carefully chosen elliptic curve groups then well that opens the door to doing key exchange encryption decryption and so on. So, let us see whether we can get to that. 
So, before that I just want to point out that for the purpose of cryptography the curves of interest are not elliptic curves over reals, because those are infinite groups, but elliptic curves over finite fields: Z_p, the so-called prime field, and the so-called binary field. We work with both of them. The equations we wrote down for unary negation, addition and doubling over reals hold in the same form for elliptic curves over prime fields, and there is a slight modification for elliptic curves over binary fields. So, very quickly, elliptic curves over the first finite field, the prime field. The formulas for addition are the same as those for ECs over reals, exactly the arithmetic we have seen before, but all the operations, including inverses, are modulo p operations. The equation of the elliptic curve is also the same, y^2 = x^3 + ax + b mod p; the only thing is that we now keep adding this mod p. The constraint on the slide has a misprint; it should read 4a^3 + 27b^2 not equal to 0 mod p, the same condition as on the earlier slide. So, the elliptic curve consists of the points (x, y) that satisfy this equation for a given coefficient a and a given coefficient b, and of course the field is known, some particular field that is very carefully chosen. In practical elliptic curve cryptography this p is a very large number, something like 200 bits at the very least. So, around 200, 300 or 400 bits, but much smaller than what you use in RSA: in RSA you use 1000 or 2000 bits, and over here you use 200, 300 or 400 bits.
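To make "the same formulas, but everything modulo p" concrete, here is a sketch of negation, addition and doubling over a toy prime field. The curve y^2 = x^3 + 2x + 2 mod 17 and the point (5, 1) are illustrative choices, not parameters from the lecture, and are far too small for real use. Division becomes multiplication by a modular inverse (Python 3.8+), and `None` stands in for the point at infinity.

```python
# Toy curve y^2 = x^3 + 2x + 2 over F_17 (illustrative parameters only).
P_MOD, A, B = 17, 2, 2
O = None  # the point at infinity, i.e. the group identity

assert (4 * A ** 3 + 27 * B ** 2) % P_MOD != 0  # non-singularity constraint

def ec_neg(pt):
    """Unary negation: (x, y) -> (x, -y mod p)."""
    if pt is O:
        return O
    x, y = pt
    return (x, (-y) % P_MOD)

def ec_add(p, q):
    """Point addition, covering doubling and the degenerate cases."""
    if p is O:
        return q
    if q is O:
        return p
    x1, y1 = p
    x2, y2 = q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return O                                   # P + (-P) = O
    if p == q:
        s = (3 * x1 * x1 + A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    y3 = (s * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

G = (5, 1)
print(ec_add(G, G))          # doubling
print(ec_add(G, ec_neg(G)))  # None, the point at infinity
```

Note that the slope is never computed with ordinary division; the inverse of the denominator modulo p is used instead, which is what "including inverses" refers to.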
The interesting thing is that the number of points on the elliptic curve is of the order of p, where p is the size of the field. So, with that little introduction to elliptic curves over prime fields, the next thing is elliptic curves over binary fields. In this case the equation is slightly different. Over prime fields we had y^2 = x^3 + ax + b. For elliptic curves over binary fields there is another term introduced on the left, and the term in a is not ax but ax^2: the equation is y^2 + xy = x^3 + ax^2 + b. As usual, the coefficients and the coordinates x and y are all elements of this particular field, GF(2^m). GF stands for Galois field; the number of elements is 2^m, always a power of 2, and as I explained some lectures ago you can represent these elements as binary strings of length m. Now, key exchange using elliptic curves. This is the first application. Now that we know the mathematics behind elliptic curves, or at least the equations, we can write programs to do all those operations. I can write a program to generate some of the points, or at least to check whether a particular point belongs to an elliptic curve, and I can write programs to find the negative of a point, to add two points and to double a point. We could give that to our students as an assignment, for example. Once I have all that, the next question is what I do with this elliptic curve. So, the first thing is key exchange. The thinking is very similar to Diffie-Hellman key exchange; the only difference is that I am using a different group.
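The programming exercise mentioned above, checking whether a point belongs to a curve and generating its points, can be sketched by brute force for a toy prime-field curve. The parameters y^2 = x^3 + 2x + 2 mod 17 are an illustrative assumption; brute force like this is only possible because the field is tiny.

```python
# Toy curve y^2 = x^3 + 2x + 2 over F_17 (illustrative parameters only).
P_MOD, A, B = 17, 2, 2

def on_curve(pt):
    """True if the affine point (x, y) satisfies the curve equation mod p."""
    x, y = pt
    return (y * y - (x ** 3 + A * x + B)) % P_MOD == 0

def affine_points():
    """Enumerate every (x, y) in F_p x F_p that lies on the curve."""
    return [(x, y)
            for x in range(P_MOD)
            for y in range(P_MOD)
            if on_curve((x, y))]

points = affine_points()
# The group also contains the point at infinity, which has no coordinates,
# so the group size is one more than the number of affine points.
group_size = len(points) + 1
print(len(points), group_size)
```

For this curve the group size comes out close to p = 17, in line with the remark that the number of points is of the order of p.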
So, you are given an elliptic curve over Z_p to start with. The prime number p is given to you, and so are the coefficients a and b. Both equations, for elliptic curves over prime fields and over binary fields, involve two coefficients, and those coefficients are given so that the curve gets defined. So, these three things are given: p, a and b. You are also given a generator point G, a base point, which is a point that generates a subgroup of the elliptic curve group. When I say generator, as I defined it some lectures ago, it means that if I take G, then 2G, 3G, 4G and so on, I get a particular group generated by G, and that group is a very large subgroup of the elliptic curve group. G is not necessarily a generator of the entire elliptic curve group, but it is a generator of a very large subgroup. So, if the curve has around 2^200 points, then the group generated by G will also have around 2^200 points or so. Given all this, I am ready to start a Diffie-Hellman key exchange. A chooses an integer a, which is her private key, and then she computes her public key as aG. Notice how similar this is to what you already know. In the regular Diffie-Hellman key exchange some days ago, A also chose her private key to be an integer, and she computed her public key as g^a, that is, she raised g, a generator of that prime group, to the power of her private key. That little g was an integer; this big G over here is a point on the elliptic curve.
We do a very similar thing over here in this case she chooses little A a private key a very large integer and then the public key is not an integer, but the public key is a point on the elliptic curve. So, what is this notation really mean? This is the point g. So, once again I will repeat the points on the elliptic curve form a group we are talking about points in this group normally we are used to talking about integers in a group now there is another kind of group we are talking about which are points on the elliptic curve each point has two coordinates. Then we choose up we very carefully choose another point g which is a generator of a very large subgroup of this group. So, that is a generator of this large subgroup this is completely analogous to what we did in regular Diffie-Hellman where we chose a g little g as a generator of that group that prime group. So, there we did g raise to a which is g multiplied by g multiplied by g a times modulo p here we are doing capital G plus capital G plus capital G a times I can do this because I know how to double a point I know how to add a point and so on. So, she can easily compute the software on her system can easily compute a times g this operation is referred to as scalar multiplication. And it is seen very very often in elliptic curve operations in cryptographic operations you see this now in key exchange the same kind of thing you will see in encryption and decryption in signature generation and in signature verification. In the same way B chooses an integer B is private key which he keeps secret and then computes his public key which is g plus g plus g B times again completely analogous to what we did in the regular Diffie-Hellman case there he chose an integer B and then he took the generator and multiplied g multiplied by g multiplied by g B times here we are taking this generator and adding it g plus g plus g B times. 
The addition operation, as we saw, is well defined, but it is a fairly complicated operation involving many field operations. And then what happens? A has aG and B has bG. A gives B her aG and B gives A his bG; they exchange their public keys. Then A multiplies B's public key, which she has just received, by her private key, and B performs the corresponding operation. So, what has A sent to B? A sends aG, and what does B do with it? He takes that aG and adds it to itself b times. Let me just write it down so that it is clear. What is given? You are given a field, Z_p, so this p is given, and you are given the elliptic curve equation over this field, which basically means you are given the coefficients a and b. These are elements of the field. We know the equation of the curve; we just wrote it down before. So, we are given a, b and p, and for an elliptic curve over a prime field everything is modulo p. We are also given a generator G, which is a point on this curve of very large order. That is to say, if I compute G, 2G, 3G and so on, I do not get a repeat until around 2^200 steps. These elements are not infinite, they are finite, but there is a very large number of them, and all of them form a subgroup. I keep going this way until I hit the identity element, and the identity element of the elliptic curve group is the point at infinity that I talked about before. So, this is the subgroup generated by the point G. All this is given to both A and B beforehand; now they have to proceed with the actual protocol. So, what does A do? She chooses an integer, little a, which lies between 1 and the cardinality of the group generated by G, and B does the same thing with little b.
So, this is the first thing: she chooses little a and he chooses little b. Then she computes aG, which is basically G + G + G, a times, and exactly the same thing is done by him to get bG. Then they send these across. Notice how similar this is to what you already know from Diffie-Hellman key exchange: aG is like her public key and bG is like his public key. And what do you think they do after that? He takes her public key aG, which is a point, and adds it to itself again and again, b times. What does he get after he does that? He gets b(aG). She does exactly the same kind of thing with his public key and gets a(bG). And guess what, now both sides have a common key, abG. This is what is called elliptic curve Diffie-Hellman key exchange; the protocol is a variation of what you have seen before. So, both sides share the common secret abG. There are algorithms to convert a point to an integer or an integer to a point, so you can convert abG into an integer, and that becomes a secret key that they share for the duration of the session. So, this is like a session key. Now, the interesting thing is that the discrete log problem on this group is very difficult; it is infeasible. If somebody is able to get aG, and knows G, that person will still not be able to get a. This is one of the key points. So, here is the EC discrete log problem. We are given the elliptic curve and the generator G. These are carefully chosen, and you do not have to choose them yourself; there are several standards which give them to you. So, you are given G, and now suppose you are able to tap the line. A was sending her public key, which is aG, so you tap the line and you get this value.
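The exchange described above can be walked through end to end on a toy curve. Everything specific here is an assumption made for illustration: the curve y^2 = x^3 + 2x + 2 mod 17, the base point (5, 1) whose subgroup has only 19 elements, and the fixed private keys 3 and 7. Scalar multiplication is done by plain repeated addition, exactly as "G + G + G, a times", which is fine for tiny numbers only.

```python
# Toy parameters (illustrative only, far too small to be secure).
P_MOD, CURVE_A, CURVE_B = 17, 2, 2
G = (5, 1)     # base point; it generates a subgroup of 19 points
O = None       # point at infinity

def ec_add(p, q):
    """Point addition over F_p, including doubling and P + (-P) = O."""
    if p is O:
        return q
    if q is O:
        return p
    x1, y1 = p
    x2, y2 = q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return O
    if p == q:
        s = (3 * x1 * x1 + CURVE_A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    return (x3, (s * (x1 - x3) - y1) % P_MOD)

def scalar_mult(k, pt):
    """kP as pt + pt + ... + pt, k times (naive on purpose)."""
    result = O
    for _ in range(k):
        result = ec_add(result, pt)
    return result

def brute_force_dlog(target):
    """Eavesdropper's view: try a = 1, 2, ... until aG equals the target.
    Feasible here only because the toy subgroup has 19 elements."""
    acc, a = G, 1
    while acc != target:
        acc, a = ec_add(acc, G), a + 1
    return a

a_priv, b_priv = 3, 7                 # private keys (fixed for the demo)
a_pub = scalar_mult(a_priv, G)        # A sends aG
b_pub = scalar_mult(b_priv, G)        # B sends bG
shared_a = scalar_mult(a_priv, b_pub) # A computes a(bG)
shared_b = scalar_mult(b_priv, a_pub) # B computes b(aG)
print(a_pub, b_pub, shared_a, shared_b)
```

Both sides end up with the same point abG. The brute-force search recovers a from aG instantly on this toy group; with a subgroup of around 2^200 points the same loop would never finish, which is the whole point of choosing the curve carefully.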
So, you know G and you know aG. The question is, can you find the secret a? The answer is that it is infeasible to find a. Now, there are many attacks on the EC discrete log problem, the Pohlig-Hellman attack and Pollard's rho for example, but those attacks take a very, very long time; they are infeasible as of now. We also write this problem down in the following way: we say that a is the EC discrete log of aG to the base G. So, instead of just discrete log we say EC discrete log. Notice that aG is a point, G is also a point, and a is an integer. If this problem were feasible, recall that we were sending aG and bG across. If from G and aG we could find a, and similarly from G and bG we could find b, then of course it would be trivial to compute the common session key between the two, which is abG. So, it is because of the hardness of the EC discrete log problem that an attacker cannot hack into this and obtain the common session key. So, that was the EC discrete log problem and EC Diffie-Hellman key exchange, where both sides agree on a common shared secret. Let us go to another application of elliptic curve cryptography, elliptic curve encryption. Assume that A wishes to send a message to B, encrypted with B's public key, which B would decrypt using his private key. As usual, assume that you are given a standard. There are standards for choosing these elliptic curves and the base point; there may be 10 or 15 such elliptic curves that can be used for cryptography. So, assume that both A and B know these standards and that one of the standard elliptic curves, together with its base point G, is used by both of them.
Now, assume B has chosen the curve and the base point and both sides know about it. Then B chooses his private key b; only he knows what this value is. This is just like the private key in RSA: you do not tell anybody about it. From it, using elliptic curve scalar multiplication, he can compute bG, which is the corresponding public key. That is the starting point. A needs to send an encrypted message to B, so she needs to have B's public key, bG. How does she proceed to actually encrypt the message? To encrypt a message m that she wants to send to B, she first represents the message as a point, capital M. There are algorithms to represent a message as a point, and there is also a variation, which I mention in the text, that does not even need to convert from a message to a point, but in this basic example I take the message and represent it as a point. Then she chooses a random integer k and computes a pair of points. She takes this random integer and does the scalar multiplication of k and the generator G. This kG is the first part of the encrypted text; notice that the encrypted text is now a point on the elliptic curve. The second part of the encrypted text is another point: the message point M plus a second term. Obviously, to encrypt a message for somebody I need to use that person's public key, so she takes his public key bG and performs a scalar multiplication of it by the random number k that she just chose. So the second part is M + k(bG). Altogether that is one scalar multiplication, another scalar multiplication, and the addition of two points.
Now, I urge you to look at the similarity between this and the ElGamal encryption that we covered some days ago; it is very similar. If you go back to those slides you will see that we took a random number r and computed g^r. Over here we take a random number k, and we are not computing g^k but G + G + G, k times, because the group operation now is plus, the addition of two points, not modular multiplication. So, this is the first element, and in exactly the same way there is a completely analogous statement for the second element in regular ElGamal encryption; this is EC ElGamal. So, this is the ciphertext corresponding to the plaintext. The plaintext was represented as a point, and it has now been converted into ciphertext with two components, both of which are points on the elliptic curve. This is what she sends to B, and B wants to decrypt it and get back the original message. So, what should he do? Let us look at this carefully and figure out what B has to do to recover the original message M. He takes the second component of the ciphertext, and from it he subtracts b times kG. What is little b? It is B's private key. Going back to the earlier slide, B has chosen the elliptic curve, the group and the generator point, and has chosen a private key as well. That is the little b he now uses for decryption. He multiplies it, by scalar multiplication, with a point. Where did that point come from? It came from the ciphertext he just received: he takes the first component of the ciphertext, kG, and performs the scalar multiplication b times kG. This is something only he can do, because only he knows the value of his private key.
So, the second component minus b(kG): as you can see, b(kG) is the same thing as k(bG), so the subtraction gives you M, the message represented as a point. Then you do the reverse transformation and convert from capital M, a point on the curve, to little m, the message. I am not describing how you actually convert an integer to a point; there are several algorithms to do it, and I leave it to you to look at those. As I said before, there is a variant called ECIES, the elliptic curve integrated encryption scheme, described in the text, which does not need to convert from an integer to a point. So, we have now talked about elliptic curve encryption. It is very similar to the ElGamal encryption we have seen before, except that we do not use the multiplicative group modulo p. This is a different group: its elements are elliptic curve points and the operation is the addition of two elliptic curve points. Now, I am not going to talk about signature generation and signature verification; they are very similar to what we saw before in regular ElGamal signatures. What I am going to do next is look at the operation of scalar multiplication. I see it appearing everywhere. In encryption I do one scalar multiplication to get kG and another to get k(bG); in key exchange I computed aG and bG, and there again that is scalar multiplication. So, it is used all over the place and it is a time-consuming operation. Let us see how we can optimize it. That led to some of the research that we did: how do we optimize elliptic curve scalar multiplication?
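Before moving on to the optimization, the EC ElGamal encryption and decryption summarized above can be sketched on a toy curve. All concrete values are assumptions for illustration: the curve y^2 = x^3 + 2x + 2 mod 17 with base point (5, 1), the private key b = 5, the fixed "random" k = 7, and a message that is simply taken to be a curve point already, since the message-to-point conversion is not covered in the lecture.

```python
# Toy parameters (illustrative only, not secure).
P_MOD, CURVE_A = 17, 2
G = (5, 1)
O = None  # point at infinity

def ec_neg(pt):
    return O if pt is O else (pt[0], (-pt[1]) % P_MOD)

def ec_add(p, q):
    """Point addition over F_p, including doubling and P + (-P) = O."""
    if p is O:
        return q
    if q is O:
        return p
    x1, y1 = p
    x2, y2 = q
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return O
    if p == q:
        s = (3 * x1 * x1 + CURVE_A) * pow((2 * y1) % P_MOD, -1, P_MOD) % P_MOD
    else:
        s = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD) % P_MOD
    x3 = (s * s - x1 - x2) % P_MOD
    return (x3, (s * (x1 - x3) - y1) % P_MOD)

def scalar_mult(k, pt):
    """kP by repeated addition (naive, adequate for toy sizes)."""
    result = O
    for _ in range(k):
        result = ec_add(result, pt)
    return result

def encrypt(msg_point, receiver_pub, k):
    """Ciphertext pair (kG, M + k * receiver_pub)."""
    c1 = scalar_mult(k, G)
    c2 = ec_add(msg_point, scalar_mult(k, receiver_pub))
    return c1, c2

def decrypt(ciphertext, receiver_priv):
    """M = c2 - b * c1, where subtraction is addition of the negative."""
    c1, c2 = ciphertext
    return ec_add(c2, ec_neg(scalar_mult(receiver_priv, c1)))

b_priv = 5
b_pub = scalar_mult(b_priv, G)   # B's public key bG
M = (3, 1)                       # message already represented as a curve point
k = 7                            # fixed here; in practice a fresh random integer
ciphertext = encrypt(M, b_pub, k)
recovered = decrypt(ciphertext, b_priv)
print(ciphertext, recovered)
```

Decryption works because b(kG) and k(bG) are the same point, so subtracting one cancels the other and leaves M.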
So, what is the problem statement again? The problem is to compute P + P + ... + P, some point, let us call it P, added k times, where k is a very large number, something like 2^200 for example. So, some very huge, monstrous number. This kind of thing, where I take a point and add it to itself many, many times, occurs very frequently. For example, over here I am adding G + G + ... + G, k times. What is k? It is an integer. How large is that integer? Roughly 2^200. So, it is a very huge problem and something that definitely needs to be optimized. Again, what am I doing here? I am taking the point bG, which is B's public key, and adding it to itself again and again, k times. What is k? An integer. How big is it? Around 2^200. So, I see this problem occurring again and again, and that immediately suggests that I should think about ways to optimize it. So, P + P + ... + P, k times, is denoted kP, and this operation is scalar multiplication. Now, when you see this you begin to think that there is something similar in RSA. For example, in RSA we had to take the ciphertext c and compute c^d mod n, which is very similar to what I have over here. What did we do in the case of RSA? Most of you might be aware of this: we used something called the square and multiply strategy; again it is described in the text. I am just taking a small number, 6789 to the power of 82. In the case of RSA this number would be about 2000 bits long, raised to the power of a number that is again about 2000 bits long. It is a huge, monstrous problem and we need to optimize it.
So, in the case of RSA I use the square and multiply strategy. Instead of doing this crazy kind of thing, multiplying 6789 by itself 82 times, which is very time consuming, I took 6789 and squared it modulo n, where n is the modulus in the RSA case. I am not showing the mod n over here to reduce clutter, but this really means 6789^2 mod n. Then I took that number and squared it again, so I got 6789^4 mod n. Again I squared that number, and so on and so forth. And then what did I do? I selectively chose the appropriate terms from this list and multiplied them. I looked at the binary representation of 82, and I took the terms corresponding to the ones in that binary representation and multiplied them. So, the second least significant bit was a 1, and then a 1 several bits after that, and so on: 2 + 16 + 64 is actually 82. So, I selectively chose some elements from over here and multiplied them, again all of this modulo n, and that product modulo n is the same as 6789^82 modulo n. In order to do this I had to do this repeated squaring, and that is why this technique is referred to as square and multiply in the context of RSA. Now, in the case of elliptic curve cryptography the square is replaced by a double and the multiply is replaced by an add, except that the doublings and adds are not simple arithmetic operations; they are doublings of points on the elliptic curve and additions of points on the elliptic curve. So, can we use the same kind of trick for scalar multiplication of EC points? That is the question, and the answer is, of course, yes. So, now, if I have the representation of the scalar k, what is the total number of doublings and the total number of adds?
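The repeated squaring and the selective multiplication described here can be written down in a few lines. This is a minimal sketch, not production RSA code: the modulus n = 10403 is a hypothetical small value chosen only so the example runs, and the function name is made up for illustration.

```python
def square_and_multiply(base, exponent, n):
    """Compute base**exponent mod n by repeated squaring.

    The loop walks the binary representation of the exponent from the
    least significant bit; `square` holds base^(2^i) mod n and is
    multiplied into the result only where bit i of the exponent is 1.
    """
    result = 1
    square = base % n
    while exponent:
        if exponent & 1:
            result = (result * square) % n
        square = (square * square) % n
        exponent >>= 1
    return result

n = 10403                      # hypothetical small modulus, for illustration
print(bin(82))                 # 0b1010010, i.e. 82 = 64 + 16 + 2
print(square_and_multiply(6789, 82, n) == pow(6789, 82, n))
```

Python's built-in three-argument pow already does modular exponentiation efficiently; the hand-written loop is there only to make the selection of terms by the bits of 82 visible.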
So, the doublings correspond to the squarings in the case of RSA and the adds correspond to the multiplications in the case of RSA. The thing is to take the binary representation and work with that; the number of doublings is one less than the size of that scalar in bits. So, there are two things you are doing: the first is all the squarings and the next is the multiplications. My question is how many of each we need to do, and that is related to the size of the scalar. If it is a 100 bit scalar you do the squaring approximately 100 times, 99 times to be exact. And how many times do you do the multiplication? The multiplications are related to the number of ones in the binary representation, of 82 in our example. Get these facts very clear, otherwise the rest of the slides will be confusing. So, what is the problem? The problem is to compute 6789^82 efficiently, and I mean this modulo n. The first step was, instead of doing the highly inefficient thing of multiplying by itself 82 times, to do these squarings and these multiplications. Now, the question is how many squarings and how many multiplications I should do. The number of squarings is precisely the number of bits used to represent the scalar, minus 1. In this case it takes 7 bits to represent the scalar 82, so that is the number of squarings or doublings that I do, that minus 1, which is 6. And how many multiplications do I do? That depends on the number of ones in the binary representation of 82. There are 3 such ones, so I do 2 multiplications.
So, that is exactly what I am writing in the next slide. The number of doublings is the length of the scalar minus 1, and the number of additions is the Hamming weight of the scalar minus 1. The Hamming weight, as you may know, is the number of ones in the binary representation of a number. On average, if I take random numbers and look at the number of ones, I will find that it is half the size of that scalar in bits. So, if there are n bits to represent the scalar, the number of ones is roughly n/2 and the number of additions is therefore n/2 - 1. So, these are elliptic curve additions and these are elliptic curve doublings. Recall again that the fundamental operators in elliptic curve arithmetic were unary negation, addition of two points and doubling of a point. Now, I am doing something more fancy, which I see I have to do again and again in encryption, decryption, key exchange and so on: this thing called scalar multiplication, multiplying a point by a scalar. So, if the scalar is k and I need n bits to represent k, then to do my scalar multiplication I need n - 1 doublings, and this incurs a cost of (n - 1)D, where D is the cost of a single doubling. It also incurs roughly n/2 - 1 additions, approximately half the size of that scalar in bits, and A is the time for a single addition. So, n is the number of bits in the binary representation of the scalar k, and D and A are respectively the costs incurred in point doubling and in point addition. So, the total cost is (n - 1)D + (n/2 - 1)A. It is much better than the cost of combining that point with itself again and again, that is P + P + ... + P or P times P times ... times P, depending on which group you are using. If you are talking about elliptic curves it is P + P + ... + P, or k times the point P.
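The two counts stated here, n - 1 doublings and Hamming weight minus 1 additions, can be checked by instrumenting a left-to-right double and add loop. This is only a sketch: the function name is made up, and plain integers under ordinary addition stand in for curve points so that the result kP is easy to verify; real point addition and doubling would be passed in the same way.

```python
def double_and_add(k, point, add, double):
    """Left-to-right double and add; returns (k*point, doublings, additions).

    `add` and `double` are the group operations. Starting from the most
    significant bit, every further bit costs one doubling, and every
    further 1 bit costs one addition on top of that.
    """
    bits = bin(k)[2:]          # binary representation, most significant first
    acc = point                # accounts for the leading 1 bit
    doublings = additions = 0
    for bit in bits[1:]:
        acc = double(acc)
        doublings += 1
        if bit == "1":
            acc = add(acc, point)
            additions += 1
    return acc, doublings, additions

# Integers under addition as a stand-in group: "point" 1, so k*P == k.
result, doublings, additions = double_and_add(
    82, 1, add=lambda a, b: a + b, double=lambda a: 2 * a)
print(result, doublings, additions)
```

For the scalar 82, which has 7 bits and three ones, the counters come out as 6 doublings and 2 additions, matching the slide.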
So, P + P + ... + P, k times: instead of doing that, which is very time consuming and for all practical purposes impossible, I do this. By analogy with square and multiply in the case of RSA, here it is called double and add. So, this field is a very hot and important area of research. It has been researched for many years, and there are many different tricks and techniques from some of the best people at some of the best universities in the world. There are optimizations at different levels. The first thing is that all of those elliptic curve point additions and point doublings require field operations. The underlying field operations are things like multiplication, squaring and inversion, and inversion is one of the worst; it takes the most time. We know one method for doing inversion, which is also in the book, and that is the extended Euclidean algorithm. So, you try to optimize that, and of course you try to optimize multiplication and squaring, but inversion is by far more time consuming than these two. So, the first thing is to optimize the underlying field operations. We are talking about a curve operation, scalar multiplication, that involves elliptic curve addition and elliptic curve doubling, which in turn involve field operations such as field multiplication, field squaring and field inversion. So, if I want this to go fast, I try to optimize multiplication, squaring and inversion over field elements. Remember, there are two kinds of operations: curve operations and field operations. Curve operations are expressed in terms of field operations.
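As a reminder of what field inversion looks like in a prime field, here is a minimal sketch of inversion with the extended Euclidean algorithm. The function name and the prime p = 17 are assumptions made for illustration; this is a textbook version and not an optimized implementation of the kind used in real libraries.

```python
def mod_inverse(a, p):
    """Inverse of a modulo p via the extended Euclidean algorithm.

    Maintains the invariant r_i = (something) * p + t_i * a, so when the
    remainder reaches gcd(a, p) = 1, the matching t is the inverse of a.
    """
    r0, r1 = p, a % p
    t0, t1 = 0, 1
    while r1 != 0:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        t0, t1 = t1, t0 - q * t1
    if r0 != 1:
        raise ValueError("a has no inverse modulo p")
    return t0 % p

p = 17                                   # small prime, illustrative only
print([mod_inverse(a, p) for a in range(1, p)])
```

Each loop iteration involves a division, which is part of why inversion is so much more expensive than a field multiplication or squaring.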
So, I try to optimize this as a first cut. The second thing, a very interesting idea: because inversion is so time consuming, there are some beautiful ways in which you can translate the coordinates. You can translate from regular coordinates, which are called affine coordinates, to projective coordinates, and many researchers have come up with different kinds of projective coordinates for the two fields, binary fields and prime fields. There are standard projective coordinates, Jacobian coordinates, Lopez-Dahab coordinates (these are for binary fields), Chudnovsky coordinates, and so on and so forth. This greatly reduces the cost of doing elliptic curve encryption. So, this is one level of optimization and this is another. It is a very rich area. Another thing that people have done is to look at special kinds of elliptic curves; for example, you have Koblitz curves and twisted Edwards curves. Another very creative idea: instead of using point doubling, can we write down the mathematical equations for point halving, and then also point tripling, quintupling and so on? So, there are tens and tens, maybe hundreds of papers on all of this; definitely hundreds of papers on all of these techniques. One interesting idea which is very well known is what is called wNAF. NAF stands for non-adjacent form, and we are going to focus our attention on this right now because some of our work involved taking this to the next level. We use something called near factorization, which is an idea that we thought of investigating over here. I will describe it in a second. We use it in conjunction with this representation of a scalar, which is called window NAF. The interesting thing is that you will not have two adjacent non-zeros in the representation of the scalar. I will take an example shortly.
So, the goal of all this is to reduce the number of point additions. If we go back to the earlier slide, we see that the total cost has two parts, the cost of the doublings and the cost of the additions. We do not try to decrease the first; we try to decrease the second, the number of additions. As a first cut, the use of the simple non-adjacent form reduces this from n/2 to n/3. So, that is the first thing. Our question is, can we reduce it further? You can reduce it further by using what is called window NAF, as we will see. The next question is, can we reduce it even further? That is what our scheme, called near factorization, tried to achieve. So, the goal is to reduce the number of point additions. What is the motivation for NAF? Just a very simple, slightly extreme example: there is an opportunity to reduce the number of additions if there are many clusters of consecutive ones in the binary representation of the scalar. So, what is this k? It is a scalar; I am going to multiply the point P by the scalar. What is meant by kP? P + P + ... + P, k times. How big is k? Roughly something like 2^200 or 2^300, a random number of that size. Now, if the scalar is this, 111110 in binary, then using the double and add strategy with this representation I start with 2^5 + 2^4 + 2^3 and so on down to 2^1. These powers correspond to the doublings, and I see the coefficient is 1 all over the place. Now, can I not represent this in a slightly different way? I can represent it just as 1 times 2^6 minus 1 times 2^1. This number is actually 31 multiplied by 2: the five ones are 31 and there is a 0 at the end, so it is 62. So, this 62 is really 2^6 - 2, that is 64 - 2, and it is also the same thing as 2 + 4 + 8 + 16 + 32.
So, instead of this long kind of representation, can I not have a much shorter representation? That is the first realization, because this way I reduce the number of additions. Suppose this was my scalar and I multiplied it by a point P. Then I would have to compute P, 2P, 4P, 8P, 16P, 32P, and in the second case I would actually have to go right up to 64P, but then I have fewer additions or subtractions. In the first case I have to add 2P + 4P + 8P + 16P + 32P, which is too many additions, while in the second I just compute 64P and 2P. I still have to do all those doublings, but in terms of additions I just have a single addition or subtraction. Subtraction has the same cost as addition because unary negation, as shown before, is a very trivial operation for elliptic curves. If you remember, if the coordinates of a point are (x, y), then the coordinates of the negative of that point are (x, -y). So, it is very easy to do unary negation, and addition or subtraction makes no difference; the cost is the same. Here I see there are so many additions, while over there there is just one. So, this is the motivation for NAF. What I try to do is find clusters of ones in the representation of the scalar; the more such clusters and the larger their sizes, the better it is for me. Now, without getting into too much detail, I can use something that is even better, called window NAF. In the case of window NAF I will have a sparser representation, which gives me even fewer additions. In the simple binary case there are only two things that you can see in the representation of a scalar, either a 0 or a 1. When I take a scalar and represent it as NAF, there are three things that I might see: 0s, 1s and minus 1s.
So, this representation, for example, would not be all this, but 0, 1-bar, 0, 0, 0, 0, 1, reading from the least significant digit, where 1-bar means minus 1. That is the NAF representation of the number 62. There are three things, either 0s or 1s or minus 1s, while in the case of wNAF you will have more than three things. For example, you will have 0, then you can have 1 and 3 and also minus 1 and minus 3; that is a window size of 3, and so on and so forth. I am not going to discuss that in very great detail because it will take time, but just recall that many more integers can appear in the wNAF representation of a scalar than just 0, 1 and minus 1. So, this is wNAF, and the nice thing about it is that it reduces the number of additions even further. There is something called pre-computation cost which also has to be factored in, but it still reduces the number of additions considerably. So, in practical elliptic curve cryptosystems you might use a larger window size, which is probably among the best choices, instead of just using a window size of 2. Now, some interesting properties of window NAF: there are at least w - 1 zeros between two non-zeros in the wNAF representation of a scalar. What is the average non-zero density of a wNAF representation of k? If you took a normal random n-bit binary number k, then the average number of non-zeros would be n/2. In the case of wNAF the number of non-zeros, that is the Hamming weight, is n/(w + 1). So, if you take w = 2, our simplest NAF case, then this number becomes n/3, not n/2, and that has implications for performance. So, let us get back to that slide. If I have simple wNAF with window size 2, then the number of additions becomes n/3. If I have a window size of 3 it becomes n/4, and so on and so forth.
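A short sketch can make the NAF and wNAF representations concrete. It assumes the standard digit-by-digit construction of the width-w NAF, which was not spelled out in the lecture: whenever the remaining scalar is odd, take its residue modulo 2^w, map it into the signed range, and subtract it so that the next w - 1 digits are zero. The function name is made up for illustration, and digits are returned least significant first.

```python
def wnaf(k, w=2):
    """Width-w NAF digits of a positive integer k, least significant first.

    With w = 2 this is the ordinary NAF with digits in {0, 1, -1}; with
    w = 3 the non-zero digits are odd values in {-3, -1, 1, 3}, and so on.
    """
    digits = []
    while k > 0:
        if k & 1:
            d = k % (1 << w)            # residue modulo 2^w
            if d >= (1 << (w - 1)):
                d -= (1 << w)           # move into the signed range
            k -= d                      # makes the next w - 1 digits zero
        else:
            d = 0
        digits.append(d)
        k >>= 1
    return digits

def weight(digits):
    """Number of non-zero digits, i.e. the Hamming weight."""
    return sum(1 for d in digits if d != 0)

print(wnaf(62, w=2))                    # NAF of 62: 64 - 2
print(weight(wnaf(62, w=2)), bin(62).count("1"))
```

For 62 the binary form has five ones while the NAF has only two non-zero digits, which is the saving in additions that the slide illustrates.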
But you do not get all this for free. There is a cost to be paid, and that is the pre-computation cost. I am not going to discuss it in great detail, but there is another little cost over here, which is the pre-computation cost. It is generally much smaller than the cost of the additions, and that cost increases greatly as n increases, where n is the size of the scalar. So, as I said, when you have scalars of size 200, 300 and 400 bits it is good to have not only window NAF but larger window sizes. The Hamming weight of the scalar for wNAF is n/(w + 1). If w is equal to 2, the simplest NAF case, this becomes n/3. If the window size becomes 3, it becomes n/4, and so on and so forth, and therefore the number of additions keeps decreasing. So, here is the strategy that we experimented with and investigated; we call it near factorization. What we did is take the scalar k and represent it in the form k = d times q plus r, where d is the divisor, q is the quotient and r is the remainder. And then we did not select everything under the sun, because that would be too time consuming. We deliberately chose divisors that have a small Hamming weight, a Hamming weight less than or equal to 3. So, say k is a 200 bit number; then you take all possible values of a divisor which is 100 bits long and has only 3 or fewer ones. So, imagine a 100 bit number and put 3 ones inside it at different places: one at the most significant bit and the other two at different places. You choose a d, and once you know d and you have k, you automatically obtain the quotient and the remainder. Once you are given the dividend and the divisor it is trivial to find the quotient and the remainder. So, now you try different values of the divisor, subject to the Hamming weight constraint being small, for example less than or equal to 3, and you try all such divisors.
So, you iterate over the space of these divisors and you find out in which case the sum of the Hamming weights of d, q and r is the least. Now, why do we need that? We will see, because these are the steps of our simple approach. First and foremost, represent k as d times q plus r. Iterate through the entire space of divisors, 100 bit strings where the Hamming weight is 3 or less, and choose the one where the sum of the Hamming weights of d, q and r is minimized. That is the first step, step 0. And then comes the scalar multiplication. You are given k and you have chosen d, so q automatically follows. You take that q and perform a scalar multiplication of q by the point P that is given to you; let us call the resulting point big Q, so Q = qP. Then you perform the next operation: take that point Q and compute d times Q, and also r times P, where d and r come from the decomposition. So, in the second step, step number 1, you compute this Q; P is already given to you, it is the point to be operated upon. The two scalars d and r fall straight down from step 0, and your final scalar product is kP = dQ + rP. So, this is what you wanted. Now, the question is, is this any improvement over just computing kP with simple wNAF? So, we are comparing our strategy, where we do all this work, and asking whether it is better in terms of the number of additions than kP with k represented as a wNAF integer, using different window sizes. Do you get fewer additions using our approach or using that approach? So, now for the results.
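The search in step 0 and the recombination in step 1 can be sketched as follows. This is only an illustration of the idea as described in the lecture, not the published algorithm: the function names are made up, plain binary Hamming weight is used as the cost (the lecture combines the idea with wNAF), the divisor is taken to be half as long as the scalar with its top bit set, and integers under addition stand in for curve points when checking kP = d(qP) + rP.

```python
from itertools import combinations

def hamming_weight(x):
    """Number of ones in the binary representation of x."""
    return bin(x).count("1")

def candidate_divisors(bits, max_weight=3):
    """All `bits`-bit numbers with the top bit set and at most max_weight ones."""
    top = 1 << (bits - 1)
    for extra in range(max_weight):          # ones besides the top bit
        for positions in combinations(range(bits - 1), extra):
            yield top + sum(1 << pos for pos in positions)

def near_factorize(k, weight=hamming_weight):
    """Return (cost, d, q, r) with k = d*q + r minimizing the summed weights."""
    bits = max(2, k.bit_length() // 2)       # divisor about half as long as k
    best = None
    for d in candidate_divisors(bits):
        q, r = divmod(k, d)
        cost = weight(d) + weight(q) + weight(r)
        if best is None or cost < best[0]:
            best = (cost, d, q, r)
    return best

k = 0xDEADBEEFCAFEBABE                       # arbitrary 64-bit example scalar
cost, d, q, r = near_factorize(k)
P = 5                                        # integer stand-in for a point
Q = q * P                                    # step 1: Q = qP
print(d * Q + r * P == k * P)                # kP = dQ + rP
print(cost, hamming_weight(k))
```

For a 200 bit scalar the same search visits a few thousand 100 bit divisors, which is why restricting the divisor to Hamming weight 3 or less keeps step 0 cheap.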
So, what we did is we took 200 random integers. Actually we took many more, but we are just showing 200 over here. We looked at the Hamming weight of each of those integers; in the binary case, as I said, the average value will be n/2. So, we took 200 bit integers, and you can see the Hamming weights have been plotted. We take those 200 integers, sort them on the basis of their Hamming weights, and simply put them in line over here: the smallest Hamming weight here, the next one here, and so on and so forth. For the simple binary representation, just take all those 200 integers, represent them as binary strings, look at which one has the smallest Hamming weight, the next Hamming weight, etcetera, and just plot them. We know very well that on the average half of the bits will be 0 and half of the bits will be 1. So, roughly 100 bits equal to 1 will be the middle point, sort of; some have a Hamming weight that is less and some have more. The next thing is that we represent them as wNAF integers where w is equal to 2. So, take that very same point that we have put on the x axis over here. Now, for that same point, look at its Hamming weight if you represent it as a wNAF integer where w is equal to 2. The Hamming weight falls drastically, and we know very well that the average will be roughly one third of the scalar length, as I just mentioned, not one half. One half of the scalar length, where the scalar length is 200, is 100; that is the binary case. One third would be about 67 or so. So, that is precisely somewhere here, the average Hamming weight of these random numbers, each of size 200 bits.
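The first two curves of this experiment are easy to reproduce in spirit. The sketch below is not the code behind the plotted results; it simply draws 200 random 200 bit integers (with a fixed seed, an arbitrary choice so that the run is repeatable) and compares the average binary Hamming weight with the average NAF weight for window size 2. The helper name is made up for illustration.

```python
import random

def naf_weight(k):
    """Number of non-zero digits in the NAF (window size 2) of k."""
    weight = 0
    while k > 0:
        if k & 1:
            d = 2 - (k % 4)      # +1 if k % 4 == 1, -1 if k % 4 == 3
            k -= d
            weight += 1
        k >>= 1
    return weight

random.seed(1)                   # fixed seed, only for repeatability
BITS, SAMPLES = 200, 200
# Force the top bit so that every scalar really is 200 bits long.
scalars = [random.getrandbits(BITS) | (1 << (BITS - 1)) for _ in range(SAMPLES)]

binary_avg = sum(bin(k).count("1") for k in scalars) / SAMPLES
naf_avg = sum(naf_weight(k) for k in scalars) / SAMPLES
print(binary_avg, naf_avg)
```

The two averages should land near one half and one third of the scalar length respectively, in line with the n/2 and n/3 figures quoted above.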
So, these are the different Hamming weights, and then using our approach, which is not just wNAF but near factorization used in conjunction with wNAF, we get an even lower Hamming weight, which reflects on the cost. So, the cost of doing a scalar multiplication is naturally reduced; it is somewhat lower. So, this is another result. Why just look at 200? 200 is one scalar length, but there are many schemes that require more security, so you would have higher scalar lengths and key sizes of 250, 300 and so on. So, we looked at a large spectrum of key sizes, with scalar lengths ranging beyond 200 to, say, 400, and then we compared our scheme with using pure wNAF. So, the comparison now is wNAF, there is a w here, versus our scheme, which is a combination of near factorization and wNAF. The Hamming weight is on the y axis. The Hamming weight for the wNAF scheme is over here, and our Hamming weight is considerably less, which means less time to do the scalar multiplication because the additions decrease. So, our scheme performs considerably better, and not just for a window size of 2; as I said before, if you go to a window size of 3 you do better than with a window size of 2. So, this is window size 2 using pure wNAF. The dashed lines are wNAF and the solid lines are our approach. So, now you have reduced it from here to here, and correspondingly our approach, the solid line, is somewhere here. So, it does better, and so on and so forth. You take different window sizes; larger window sizes get better and better, and our approach, near factorization plus wNAF, does better than simple wNAF. Now, this is a nice table that summarizes the results. We are looking at the number of additions.
First and foremost, the scalar size is 200. This part of the table is pure wNAF and this part is near factorization plus wNAF. We are looking at different window sizes. As I said before, in actual practice, for example if you take the OpenSSL code and look at how they have done elliptic curve operations, they use larger window sizes of, say, 5. And then we look at the additions and the doublings; these are the two contributors to the total cost. So, this is the total cost. There are some interesting things to see now. This part is pure window NAF and this part is our approach. The total cost is the thing to look at; as we increase the window size the cost gets lower and lower, and at a window size of 5 you get the optimal cost over here. So, this is the number of additions and this is the number of doublings. Now, as it turns out, an interesting thing is that the cost actually increases after some time. With a window size of 6 it increases, and the reason is what I mentioned: there is some pre-computation cost which has been factored into the total cost. The pre-computation cost increases as the window size increases; it actually increases exponentially. It starts off pretty small and then it starts to shoot up when you get beyond a certain point. So, it does not make much sense to go to window sizes of 6 and beyond, for example; about 5 is the best. So, this is our approach, and you can see we have 36.4 and window NAF alone has 39.3. So, there is quite a bit of improvement, somewhat under 10 percent, if you look at the cost of additions compared to pure window NAF. So, this is basically our approach. We are about to finish this part on our research. I just thought I should mention some of the work that we have been doing over here; we are working in different areas of security and cryptography.
So, this paper was published a few months ago in a conference on VLSI and embedded systems. This research is related to side channel attacks. These are my former students who just finished last year: Jyoti Gajrani, who is a faculty member, I believe somewhere in Rajasthan, I think in Ajmer, and Pooja Majumdar and Samprit Sharma. This paper of ours is "Challenges in implementing cache-based side channel attacks on modern processors". So, I will talk a little bit about this. Many different schemes can be attacked this way: AES, RSA and DSA. DSA stands for Digital Signature Algorithm, basically a variant of ElGamal signatures. So, DSA can be attacked, and ECDSA, the elliptic curve variation of that. So, this is a paper from January 2014, that is this year. Then there is another paper that is to be presented in August; this is the one related to what I just talked about, optimizing elliptic curve scalar multiplication with near factorization. This is by my former B.Tech students Pratik Podar and Achin Bansal, and it is to be presented in Vienna, Austria, at the end of next month. So, this one is on side channel attacks, this one is on elliptic curve optimization, and then another bunch of things that we are doing is on web security. We did not have much time to actually demonstrate this to you; if my student is around I can ask him to come and demonstrate the clickjacking attack. This is another attack, called clickjacking, and we looked at defenses against both clickjacking and XSS. As I mentioned before, there are defenses on the server side and there are defenses on the client side. What we have been doing for a few years is writing browser extensions. As I mentioned before, Chrome has an XSS filter between the rendering engine and the JavaScript interpreter, while Internet Explorer has an XSS filter between the networking interface and the rendering engine.
What we decided to do, to protect against all kinds of XSS vectors like partial injection, multi-point injection, even HTML injection and so on, is to put filters in both places. So, we have got that, but we could not find an API that could intercept the inputs to the JavaScript interpreter. Our extension is a browser extension for the Firefox browser, and we could not find an API that could pick out inputs to the JavaScript interpreter. So, we have an approximation of our design; we have implemented it completely, tested it thoroughly, and we are writing a paper on it. The two places where we put the filter are between the rendering engine and the JavaScript interpreter, and between the networking interface and the rendering engine in the browser. This took about a thousand plus lines of JavaScript code. The whole thing is completed, it has been tested, and now we are finishing the paper on it. The paper is titled "Two for the price of one: combined browser defense against XSS and clickjacking". The authors are students who have just finished: Naman Jain, who was there for the five-day workshop, together with Nikhil Limjay and Ritubala Karchvahe. So, this is something that is almost complete. So, these are three things that we have been fairly active in, in the area of security and cryptography: side channel attacks, browser security and basically web security, and elliptic curve optimization. And then these are some of the students and the work they have been doing over the years, such as the work on elliptic curve optimization with Achin Bansal and Pratik Podar. They finished quite a few years ago and have got top jobs in different places, as have all of these students in fact. So, you can see the list of people. DD stands for dual degree; we used to have a DD program, but we have since stopped it. And these are the most recent students, who have just passed out. So, there are some very interesting things over here.
So, the blue thing means we are writing the paper; it is either complete or about to be completed. This work is also very interesting. It is all on side channel attacks. We had cache-based side channel attacks, which are timing based, and these have been combined with what are called lattice-based attacks. There is a whole field of cryptography, which we will not have time to touch upon at all, called lattice cryptography. So, we are using lattice attacks and we are combining these two things. It turns out that these timing attacks or cache-based attacks can reveal certain bits of the secret key. And if you know certain bits of the secret key and you have a sufficient number of messages, then you can use lattice-based techniques to get the rest of the bits of the key. So, some very nice experiments have been done and reported in their theses; they just finished a month ago. This is work that I would invite other people also to look at and to do. So, I would recommend and suggest to the faculty that I am talking to that you and your students start to look at some of these things, and you should also communicate with some of the people who are working on them. All of these students have graduated now, but there are many students who are continuing some of this work. So, are there any questions about any of this?
How can you find a generator point in elliptic curves?
Yeah. So, let me just get back to this. Let us start with a group, something like Z_p*. There are very well known tests for a generator. For example, I choose a random number x, I take the prime number p and I take p - 1, and I find all the prime factors of p - 1.
So, let us suppose the prime factors of p - 1 are p_1, p_2, etcetera. Then, for each prime factor p_i, I take (p - 1)/p_i and raise g to that power, and if the result is not equal to 1 in any of these cases, then g is a generator. Let me write it down. Let us suppose that I have this group and I would like to find a generator for it. I take some integer as the candidate g; say I start with the number 2, then I take 3, and so on. Before that, I take this p and I look at all the prime factors of p - 1. Let us suppose the prime factors are p_1, p_2, up to p_k. Then I perform the following test. Is g a generator? I do not know; I just try it. Is g^((p - 1)/p_1) congruent to 1? Again, this is all mod p. If it is not congruent to 1, then I take the next one, (p - 1)/p_2, and so on and so forth, and ask the same question again. Suppose it is not 1 in every case, for all of these up to k; then I conclude that g is a generator for this group. There is an analogous procedure to check for generators in arbitrary elliptic curve groups. So, first and foremost choose a p, say p is 31, then take g equal to 2 and see if it is a generator. Very often g equal to 2 is a generator, but at times it will fail; so go through these tests and see whether they hold or not. If these tests hold, then it must be a generator; that means the value should not be 1 in any of these k cases. Once again: take the prime factors of p - 1, prime factors and not just any factors, p_1, p_2, etcetera, right up to p_k, and then perform this test: what is the value of g^((p - 1)/p_i) mod p? If it is 1, then the test fails and g is not a generator. If the value is different from 1 in every single one of these k cases, then we conclude that g is a generator.
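The test described above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the lecture: the function names `prime_factors`, `is_generator` and `find_generator` are made up for this example, p is assumed to be an odd prime, and trial division is only practical for small p.

```python
def prime_factors(n):
    """Return the distinct prime factors of n (trial division, small n only)."""
    factors = []
    d = 2
    while d * d <= n:
        if n % d == 0:
            factors.append(d)
            while n % d == 0:
                n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def is_generator(g, p):
    """Check whether g generates Z_p^*, assuming p is an odd prime.

    g is a generator exactly when g^((p - 1)/q) mod p is different from 1
    for every prime factor q of p - 1.
    """
    if g % p == 0:
        return False
    return all(pow(g, (p - 1) // q, p) != 1 for q in prime_factors(p - 1))


def find_generator(p):
    """Try the candidates 2, 3, ... in order and return the first generator."""
    for g in range(2, p):
        if is_generator(g, p):
            return g
    return None
```

For p = 31 the prime factors of p - 1 = 30 are 2, 3 and 5. The candidate g = 2 fails, because 2^5 = 32 is congruent to 1 mod 31 and therefore 2^15 is also 1, while g = 3 passes all three tests, so `find_generator(31)` returns 3.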
Now, there is an analogous procedure for other groups like elliptic curve groups. In general, you do not need to actually find the elliptic curve or a generator. As far as we are concerned, from a security perspective and an applications perspective, we are given the elliptic curve and we are given a generator; what we have to do is implement all those operations, and implement them efficiently.
So, I have one question on SQL injection.
Yeah.
So, how can we detect second-order SQL injection? It is actually difficult to detect because it is activated later. So, how can we prevent it?
I would imagine right now. So, there is another speaker who is just about to come here. A very quick introduction: this is Toshendra Sharma; he owns a start-up, he is the one who started this company called Vigilant.
So, second-order SQL injection is where we embed a particular script inside the page on a permanent basis. For example, you may have seen user forums where people visit content that is permanently stored inside the page or in the database; the information is fetched from the database and shown to the user in the web browser. In this way, hackers can permanently store a script inside the page, and whenever users visit that page they get exploited according to the exploit stored inside the page. So, I have a demo to show this. We can show you stored XSS, and how I am going to change the password of a user without even letting him know that his password has been changed. So, this will be the complete demo of second-order XSS. So, that said, ask if you have any doubts about that.
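On the prevention side of the question, one common approach is to treat everything that comes back out of the database as untrusted: bind it as a parameter in every later query, and escape it when it is written into a page. The following is a minimal sketch under assumptions, not the speaker's demo: the `comments` table and the functions `save_comment`, `render_comments` and `demo` are hypothetical names chosen for illustration.

```python
import html
import sqlite3


def save_comment(conn, text):
    """Store user input with a bound parameter, never by string concatenation."""
    conn.execute("INSERT INTO comments (body) VALUES (?)", (text,))


def render_comments(conn):
    """Build an HTML fragment, escaping every stored value on output."""
    rows = conn.execute("SELECT body FROM comments ORDER BY id").fetchall()
    return "\n".join("<p>" + html.escape(body) + "</p>" for (body,) in rows)


def demo():
    """Store a script payload and show that it is rendered as inert text."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE comments (id INTEGER PRIMARY KEY, body TEXT)")
    save_comment(conn, "<script>alert('x')</script>")
    return render_comments(conn)
```

With the escaping in place, the stored payload reaches the browser as text (`&lt;script&gt;...`) rather than as an executable script tag.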