So, welcome back. We are now going to begin our session on the basics of cryptography. As I said before, for many of you who have been teaching this, it will be somewhat repetitive and you might already know it. So we will go a little fast, and I have deliberately skipped some of the mathematics because that would take a long time. But if there is any need for it, please feel free to ask me to explain or clarify anything about the mathematics that is not included in these slides. So, here is a dampener before we begin: whoever thinks his problem can be solved using cryptography does not understand his problem and does not understand cryptography. What is the meaning of this? Say you are a security consultant, somebody comes to you and says, I have got this security problem, and you say, use cryptography. That is nonsense. You must be very careful about what you suggest and what you advocate as far as security is concerned. Cryptography does not solve every problem under the sun. So I have made a list of what I will be doing. One is to cover the basics of cryptography: some of the basic terms, what a substitution cipher is, what a permutation cipher is, and so on. Then a little bit about secret key cryptography and AES, and something about public key cryptography, without going into the mathematical details of RSA and ECC. If you require those, please let me know and I will cover them in one of the subsequent lectures. But I have not planned to cover them here because I do not think there will be any further discussion. And then we go on to the cryptographic hash. All through this discussion there will be comments on how we apply these things: how the cryptographic hash is applied, how secret key cryptography is applied, and so on.
So cryptography is the science of disguising messages so that only the intended recipient can decipher the received message. The sender transforms the message, or plaintext as it is called, into ciphertext, a process referred to as encryption, and the reverse process of converting the ciphertext back to plaintext is called decryption. Mathematically you represent it as a function: the encryption function operates on the plaintext with the encryption key, little e, and that gives you the ciphertext C. Then on the receiver side you take the ciphertext and apply a decryption function on it with the decryption key, little d, and you get back the original plaintext. The first case is symmetric key cryptography, where e is equal to d. So if we go back to that slide, you will see that little e and little d are the encryption key and the decryption key. A little question for you: in the context of public key cryptography, which one is the private key and which one is the public key? Let's start with the public key. Is it little e or little d? Little e is the public key and little d is the private key. Always remember that to decrypt the message you need to use the private key, so little d must be the private key. So symmetric key cryptography is one kind of crypto, where the encryption key is the same as the decryption key. The sender and receiver share a key, which is why it is also called secret key cryptography, and you know many of the best known examples of this: DES, AES, RCFLO, RC4. Where is RC4 used? Sorry? It is a stream cipher, RC4, but is there any particular application where you see it? Wireless LANs. Wireless LANs or WLANs, you see RC4 as the stream cipher used over there. Any guesses as to why a stream cipher is used in wireless applications rather than a block cipher? What is the difference between a block cipher and a stream cipher? Block cipher, we use, it's different for all the block cipher.
Yeah, so basically in a stream cipher, encryption is done bit by bit, and in a block cipher, a particular block is taken and then we encrypt it. Yes, so bit by bit in the case of a stream cipher and block by block in the case of a block cipher. You take the message, chunk it into multiple blocks, and encrypt one block at a time. What is the block size? These are some general questions that your students are going to ask you. What is the block size in the case of RSA? 1024. Is that the minimum or the maximum? Maximum? You can't have a block size higher than that in the case of RSA? With today's computing power, 1024 is the minimum possible block size in the case of RSA. Anything less than that will compromise RSA; 768 or 512 is definitely insecure as far as RSA is concerned. Block size in DES: 64 bits. Block size in AES: 128. So these are the different block sizes. The basic idea is you take the message, you chunk it up into blocks, encrypt each block separately, and then send it out. In the case of the stream cipher, you operate a bit at a time. So how does a stream cipher work in general? You have some box that generates a pseudo random stream, and you take that pseudo random stream and exclusive-OR it with the plaintext, and the result is the ciphertext. Maximum chunk length is a byte in a stream cipher. In the stream cipher, yes, it could be a byte actually. It will be a byte. There will be a pseudo random key sequence generator. Exactly, exactly. So that's the main idea: to generate a pseudo random key sequence from a single key. That is called a seed key. Yes, so there's a key and there's a key stream. You take a key, which could be something like 128 bits, for example, and an initialization vector, and there's some fancy algorithm which converts those two into an arbitrary length bit stream. So how many of you have seen RC4 and the algorithm for RC4?
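As a rough sketch of the stream cipher idea just described, here is RC4 written out in Python: the key is expanded into a pseudo random key stream, which is then exclusive-ORed with the plaintext one byte at a time. The function names are my own, and prepending the initialization vector to the key (which is how WEP combines the two) is shown only as one possible way of mixing them. RC4 is no longer considered secure, so this is for illustration only.

```python
def rc4_keystream(key: bytes):
    """Yield an endless pseudo random byte stream derived from `key`."""
    # Key scheduling: start from the identity permutation of 0..255 and
    # shuffle it under control of the key bytes.
    state = list(range(256))
    j = 0
    for i in range(256):
        j = (j + state[i] + key[i % len(key)]) % 256
        state[i], state[j] = state[j], state[i]
    # Output generation: keep shuffling and emit one key stream byte per step.
    i = j = 0
    while True:
        i = (i + 1) % 256
        j = (j + state[i]) % 256
        state[i], state[j] = state[j], state[i]
        yield state[(state[i] + state[j]) % 256]


def rc4_crypt(key: bytes, data: bytes) -> bytes:
    """XOR `data` with the key stream; the same call encrypts and decrypts."""
    return bytes(d ^ k for d, k in zip(data, rc4_keystream(key)))


if __name__ == "__main__":
    key = b"Key"
    iv = b"\x01\x02\x03"  # hypothetical per-message initialization vector
    ciphertext = rc4_crypt(iv + key, b"begin operation at noon")
    print(ciphertext.hex())
    # The same key and IV regenerate the same key stream and undo the XOR.
    print(rc4_crypt(iv + key, ciphertext))
```

Because encryption is just an XOR with the key stream, calling `rc4_crypt` a second time with the same key material recovers the plaintext.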
Okay, so there's a pseudo random generator that generates this arbitrary bit stream, which is a function of the original key and the initialization vector. So even if you use the same key again, the initialization vector hopefully is different, and you will get a different key stream. So in symmetric key cryptography e is equal to d, and these are some of the examples. A person who wishes to communicate with N other parties would have to maintain N separate secret keys, one per user. So this becomes a key management nightmare. If I've got 1000 people I'm talking to and I've got to securely maintain 1000 keys, that's not a good idea. That's something that's a little bit challenging for me. A possible solution to that is public key cryptography. The encryption key e, as we just said, is the public key and the decryption key d is the private key. Examples of public key cryptography that are very much used today are RSA and ECC. ECC stands for elliptic curve cryptography. RSA is Rivest, Shamir and Adleman, the guys who came up with this scheme. So the advantage of using public key cryptography is key management. All I've got to keep secret is my private key. The public key should be known to everyone. On the other hand, what is the disadvantage of public key cryptography? The most significant disadvantage is that public key cryptography is substantially slower, approximately 1000 times slower. Just remember this: about 1000 times slower than secret key cryptography. So now I've got some disadvantages of public key cryptography and some disadvantages of secret key cryptography. What should I do if I want to encrypt a message to send from A to B? Can I get both the advantage of simple key management and also better performance? How do I do that? Use public key for transfer of the key. Exactly, so use public key cryptography for that. First you generate a random key, say a random 128-bit key, so let me write that down.
Generate a random number K; the message I've got to send is M. I generate this random number on the fly, the so-called session key. Then I encrypt this K. I'm A and I'm sending to B, so I encrypt this key K using what? Whose public key? B's. So I use B's public key to encrypt this key and I send it across. First thing, generate this K and encrypt it using B's public key, so B can decrypt it using his private key. And what does this give me? On the receiver side, you do a decryption of the thing that you just received, which is this stuff, and you get the key K. So K was generated by A and decrypted by B. And now subsequently every single message M1, M2, M3 will be encrypted with K, so that it can be decrypted by the other guy. So this K is referred to as a session key. Now look at the advantage of doing it this way. The only time consuming step was wherever I used public key cryptography. So this is the step that takes time, and the decryption. In the case of RSA, do encryption and decryption take the same amount of time? Yes or no? The details are in the book. I don't want to get into it in a big way over here because it'll take time. But encryption is much faster than decryption. Decryption uses the private key, and the private key typically is how many bits? Roughly the size of the modulus. And the modulus, N, is around 1024 bits at the very least. If you want a higher security scheme in RSA, you make it 2048 or even 4096. So decryption is going to take a lot of time because the key size is about 1024 or 2048 bits. But the public key, this thing, is much smaller. For example, it might be only about 10 bits. So that is why encryption is faster than decryption. But even encryption using RSA is much, much more time consuming than encryption or decryption using DES or AES.
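The session key exchange described above can be sketched as follows. This is a toy under loud assumptions: the RSA primes are tiny textbook values chosen so the numbers stay readable, there is no padding, and `keystream_xor` is a made-up stand-in for a real secret key cipher such as AES. None of the names come from the lecture.

```python
import hashlib
import secrets

# Toy RSA key pair for B (assumed values; real moduli are 1024 bits or more).
P, Q = 61, 53
N = P * Q                            # modulus n = 3233
E = 17                               # small public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent, about the size of n


def keystream_xor(session_key: int, nonce: int, data: bytes) -> bytes:
    """Stand-in secret key cipher: XOR data with a hash-derived key stream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        seed = f"{session_key}:{nonce}:{counter}".encode()
        stream.extend(hashlib.sha256(seed).digest())
        counter += 1
    return bytes(d ^ k for d, k in zip(data, stream))


def sender_setup():
    """A picks a random session key K and encrypts it with B's public key."""
    k = 2 + secrets.randbelow(N - 2)   # 2 <= K < N
    return k, pow(k, E, N)             # the one slow public key step for A


def receiver_setup(encrypted_key: int) -> int:
    """B recovers K with the private key; this is the slowest step."""
    return pow(encrypted_key, D, N)


if __name__ == "__main__":
    k_at_a, encrypted_k = sender_setup()
    k_at_b = receiver_setup(encrypted_k)
    # Every later message M1, M2, M3 uses only the fast secret key cipher.
    for nonce, message in enumerate([b"M1", b"M2", b"M3"]):
        ciphertext = keystream_xor(k_at_a, nonce, message)
        print(keystream_xor(k_at_b, nonce, ciphertext))
```

Note how the public exponent `E` is only a few bits long while `D` is close to the size of the modulus, which is the reason given above for encryption being faster than decryption.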
And I'm saying RSA is about 100 to 1000 times slower. And of course, decryption is even worse. So here you can see I get the advantages of both key management and performance. For key management, all A needs to do is keep his private key secure, and B needs to keep his private key secure, and each can talk to anybody he wants after that. And in terms of performance, the only time consuming things are this and especially this. All the rest of the messages I send, say 50 messages thereafter, don't take much time because they are encrypted using secret key cryptography. Okay, so before we go on to secret key cryptography and how it's actually done, there are two very basic ciphers that we need to talk about. One is the substitution cipher and one is the transposition cipher. So don't forget these two terms: substitution cipher, and the second one is either transposition or permutation cipher. So what is a substitution cipher? The most basic example of this is what's called the Caesar cipher. Assuming that we want to transfer text in the English language, you just take each letter and substitute a fixed other letter for it. So for example, D for A, E for B, A for X and so on. You take every letter and you go three forward in the alphabet, so D, for example, is substituted for A and so on. So "what is your name" becomes ZKDW, etc., a very simple idea. Z is substituted for W, K for H. So there's an obvious pattern in this: H, I, J, K, you go three forward. And that's how you get the ciphertext. So this is a very, very basic, trivial, simple example of a substitution cipher, the so-called Caesar cipher. Now obviously we can't use this; it goes back some two thousand years, to Julius Caesar. Why can't we use this? Because one of the ways in which you can hack into this is by some statistical method.
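A minimal sketch of the Caesar cipher just described, shifting every letter three places forward and wrapping around at the end of the alphabet (so X becomes A). The function name and the choice to leave spaces untouched are my own.

```python
import string


def caesar(text: str, shift: int) -> str:
    """Shift each English letter by `shift` places, wrapping around at Z."""
    result = []
    for ch in text:
        if ch in string.ascii_uppercase:
            result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        elif ch in string.ascii_lowercase:
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            result.append(ch)  # spaces and punctuation pass through unchanged
    return "".join(result)


if __name__ == "__main__":
    ciphertext = caesar("WHAT IS YOUR NAME", 3)
    print(ciphertext)              # ZKDW LV BRXU QDPH
    print(caesar(ciphertext, -3))  # shifting back by three decrypts
```

Since there are only 25 useful shifts, an attacker could also simply try them all, even before bringing in any statistics.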
You will just look at the ciphertext. And in the ciphertext, if you see too many letters of a particular type, you will conclude that most probably that letter corresponds to one of the frequently occurring letters in the English language. And it turns out that E is one of the most common, followed by T, A, O, I, and N. So the vowels figure very prominently over there, but in addition also the T. So these are some of the most common. If in the ciphertext I see a lot of Qs, for example, I suspect that Q might be E or T or A, one of the more frequently appearing ones. Then if the next most common is, let's say, F, then maybe that would be one of those also. And like that it won't take me much time to write a program to be able to decrypt this thing. So you're using statistical patterns of the letters in English text to be able to decrypt this thing. So obviously this is a useless kind of cipher. But it serves the purpose of illustrating what exactly is meant by a substitution cipher. There are some others, and I will give you all these slides, but I won't talk about them in the interest of time. Besides the substitution cipher, the other kind of cipher is a very simple thing called a transposition cipher. This is simpler than a substitution cipher. It's basically a rearrangement of the characters or bits in a particular message. One way of doing this is to arrange the characters of the text in a matrix and then shuffle the rows and columns in some predetermined order. So here's a military message: begin operation at noon. Put this in the form of a matrix, B, E, G, I, N and so on, in row-major form. Just put it over there. So you've got a five by four matrix, and then you shuffle the rows and columns in some way. So now you have shuffled the rows over here. B, E, G, I has moved from the first row to the third row, and so on and so forth.
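The matrix shuffle can be sketched like this, with a column shuffle of the same kind applied after the row shuffle. The particular `ROW_ORDER` and `COL_ORDER` are hypothetical: the transcript only tells us that the first row, B E G I, ends up as the third row, so the rest of the ordering is an assumption made for illustration.

```python
from collections import Counter

ROW_ORDER = [2, 4, 0, 1, 3]  # assumed: new row i is old row ROW_ORDER[i]
COL_ORDER = [1, 3, 0, 2]     # assumed: new column i is old column COL_ORDER[i]


def transpose_encrypt(text: str, row_order, col_order) -> str:
    """Write text row by row into a matrix, then shuffle rows and columns."""
    ncols = len(col_order)
    if len(text) != len(row_order) * ncols:
        raise ValueError("text must exactly fill the matrix")
    rows = [text[i:i + ncols] for i in range(0, len(text), ncols)]
    shuffled = [rows[r] for r in row_order]
    return "".join("".join(row[c] for c in col_order) for row in shuffled)


def transpose_decrypt(cipher: str, row_order, col_order) -> str:
    """Undo the column shuffle, then the row shuffle."""
    ncols = len(col_order)
    rows = [cipher[i:i + ncols] for i in range(0, len(cipher), ncols)]
    restored_rows = [""] * len(row_order)
    for pos, row in enumerate(rows):
        original = [""] * ncols
        for new_col, old_col in enumerate(col_order):
            original[old_col] = row[new_col]
        restored_rows[row_order[pos]] = "".join(original)
    return "".join(restored_rows)


if __name__ == "__main__":
    plaintext = "BEGINOPERATIONATNOON"  # 20 letters: 5 rows of 4
    ciphertext = transpose_encrypt(plaintext, ROW_ORDER, COL_ORDER)
    print(ciphertext)  # AIRTONNOEIBGOENPNTOA
    print(transpose_decrypt(ciphertext, ROW_ORDER, COL_ORDER))
    # A transposition only moves letters around, so the counts are unchanged.
    print(Counter(plaintext) == Counter(ciphertext))
```

The last line makes the key property visible: every letter of the plaintext is still present in the ciphertext, just in a different position.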
So you just shuffle these rows, then you shuffle the columns, and you can keep repeating this several times until you get something that most people can't make any sense of. So that message becomes A, T, N, O, T and so on. So this is a perfect example of a transposition or permutation cipher. Everybody knows the word permutation. You are just permuting the characters or bits inside this message. So if you've got five A's in the original message, you'll have five A's in this message. On the other hand, with a substitution cipher, if I had five A's in the plaintext, would I have five A's in the ciphertext also? Not at all. I mean, it could be anything. Instead of five, you can have eight A's and so on, because you can have any substitution function. Whereas here you're taking the original bits and you're just pushing them around in an arbitrary sort of way. So we just talked about block ciphers versus stream ciphers. With block ciphers, the plaintext is split into fixed size chunks called blocks and each block is encrypted separately. Typically all the blocks in the plaintext are encrypted using the same shared key. Now unlike block ciphers, stream ciphers typically operate on bits in the message. Practical stream ciphers like RC4 typically generate a pseudo random key stream. So remember the difference between these words, key and key stream. The key is of fixed length, for example 128 bits, while the key stream can be of arbitrary size. The key stream is a function of the fixed-length key and a per-message bit string. The ciphertext is obtained by performing an exclusive-OR operation between the plaintext and the key stream. Okay, so with that basic introduction to stream versus block ciphers and transposition versus substitution, let us see how you can synthesize one of the most widely used things, something you find implemented even in your browser. So for secret key cryptography, typically your browser would implement at least DES and AES.
So a guy called Shannon, who is well known in the area of data communications, came up many, many decades ago with this idea that you can build a very powerful secret key cipher if you keep alternating the stages of the S-box and the P-box. S stands for substitution, P stands for permutation or transposition. You keep doing S and P followed by an operation that involves the round key, and you keep doing this many times, let's say 10 times, and you'll get something that is very powerful. So that was his thesis and indeed it turns out to be correct. So once again: alternating stages of S-boxes, P-boxes and boxes that perform a simple operation involving the round key. You keep doing this repeatedly and you'll get something that is very secure. But what is an S-box, a P-box and a round key? We've talked about substitution already, and the transposition you can represent this way: P simply takes all the input bits of the plaintext and permutes them to get the output. So that's the P-box. And the S-box takes the bits, let's say three input bits, and sends them to a, what is that thing that converts from three to eight? Decoder. So it sends them through a decoder and you get eight outputs. Those eight are then permuted, and then further encoded using an encoder. There is a slight mistake on the slide; one of those links is not there at the end. So you pass it through an encoder so that you get only three bits. That is basically what a substitution box is. Now, that is one way of representing it conceptually, but does anybody have a guess for how you would implement an S-box? You have studied DES, right? You've studied it and you teach DES and AES in class. So there is definitely a substitution operation involved. How is that implemented, let's say for example in software? Is it using some hardware device like this? Table lookup. Table lookup. So let's look at this thing. You invariably use table lookup.
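To make the two views of an S-box concrete, the sketch below builds a 3-bit S-box the conceptual way (decoder, rewiring of the eight lines, encoder) and checks that it gives the same answer as a plain table lookup. The table `SBOX3` and the P-box wiring `PBOX` are arbitrary values picked for illustration, not taken from the slide.

```python
SBOX3 = [3, 6, 0, 5, 7, 1, 4, 2]  # hypothetical: input value i maps to SBOX3[i]
PBOX = [2, 0, 3, 1]               # hypothetical: output bit i is input bit PBOX[i]


def sbox_via_decoder(x: int) -> int:
    """Conceptual S-box: 3-to-8 decoder, permuted wires, 8-to-3 encoder."""
    lines = [1 if i == x else 0 for i in range(8)]  # decoder: one line is high
    rewired = [0] * 8
    for i, level in enumerate(lines):
        rewired[SBOX3[i]] = level                   # wire i goes to position SBOX3[i]
    return rewired.index(1)                         # encoder: which line is high


def sbox_lookup(x: int) -> int:
    """Software S-box: a table with 2**3 rows, each holding a 3-bit value."""
    return SBOX3[x]


def pbox(bits):
    """P-box: the same bits come out, only their positions change."""
    return [bits[src] for src in PBOX]


if __name__ == "__main__":
    for x in range(8):
        print(x, sbox_via_decoder(x), sbox_lookup(x))
    print(pbox([1, 0, 0, 1]))
```

The decoder picture is useful for understanding, but the lookup version is a single array access, which is why software implementations favor it.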
If you look at OpenSSL, this is the software that implements many of these crypto algorithms and sits inside your browser. If you look at the code for it, you will see they make extensive use of table lookups. So this is Shannon's idea. Notice what I've done: a permutation box followed by a couple of S-boxes, because the S-box is expensive, so I have to split it up into multiple pieces. And then another permutation, another set of S-boxes, and so on and so forth. The round key operation is not shown over here, but here's the other way of representing this thing. So you start with a round key operation. What is a round key? Is it different from a square key? What does a round mean? You generate the sub-keys. Yes, exactly. So that's exactly the point. The word round here is not as opposed to square. Round means stage. So each of these secret key operations, encryption or decryption, involves multiple stages. How many stages in the case of DES? 16. 16. And in the case of AES? 10 stages. At least 10. So it would involve many of these stages, and each stage is exactly like any other stage, generally speaking, with some small variations. But each stage has a separate round key. And as he says, each of those round keys is obtained or derived from the original key. So that's what's being shown in this picture. Let's say you start with a round key operation. So there is the first round. How many rounds are there? There seem to be three rounds. They look almost similar to each other with some slight variation. So the first step is a round key operation. Typically, you exclusive-OR the round key with the input, in this case the bits of the plaintext, and that's the output. Then the output goes through multiple S-boxes. Then after the S, what is the next thing there? The permutation.
Notice that this thing is going there and something is going somewhere else. That is nothing else but a permutation of the bits, or a transposition of the bits. And then again the round key operation, again the S-boxes, again a permutation, and so on and so forth. And you do this 10 times or 12 times and so on and you get a pretty secure cipher. So just to go back to the S-box: an S-box or substitution box is a device that takes as input a binary string of length M and returns a binary string of length N. It doesn't have to be that M is equal to N. And this is the important thing: it's typically implemented, at least in software, using a table of 2^M rows with each row containing an N-bit value, because I've got M inputs and N outputs. There are 2^M possible values for the inputs, so there should be 2^M rows in this table, and each row is of size N bits because my output is of size N bits. If there's any question about this, just ask. How do you design an S-box and a P-box? Very good question. How do you design the S-box? How do I know if this S-box is good or not? Just choose an arbitrary S-box like this. Is this a good S-box? Take the input 0000, make it 0000. Take one, make it two. Take two, make it one, and so on. How do I know whether an S-box is good or not? What about the avalanche effect on the output? Avalanche effect? What does that mean? How to do this variation? It's one of the input data to be collected in this. So let me just say one thing. It turns out that the reason why AES came about was because DES began to be attacked in the 1990s. Through the 1990s there were some serious attacks on DES, starting with differential cryptanalysis and linear cryptanalysis. And subsequently, in 1998, a special-purpose machine was built to attack DES by brute force. That is why 56-bit DES will never be supported by any bank today. The least you should use is 128-bit AES or triple DES.
So these are the things. This S-box turns out to be one of the most important components in this entire thing. If you can attack the S-box, you can attack the entire cipher. So what is required is that there be no linearity between bits of the plaintext, the ciphertext and the key. If I can find a linear mathematical expression involving the exclusive-OR of some bits of the plaintext, some bits of the ciphertext and some bits of the key, maybe bits of the round key or the original key, then I'm in trouble. So you want the S-box to be highly non-linear. Therefore, a lot of design has to go into creating this S-box. If you're teaching an advanced course on security, you might want to ask your students to look at the S-box design in the case of AES and see whether there are any linearities in that design. Write a program to find if there are any linearities. Or, even if a particular expression is not exactly linear, take an expression involving some bits of the plaintext, some bits of the ciphertext and some bits of the key: over any key, any input and any output, does it evaluate to zero most of the time or to one most of the time? If it does, we say it has a heavy bias towards either zero or one, and that's the beginning of a linear cryptanalytic attack. So this S-box design is a highly complicated thing which you would do in an advanced course, but be sure that it's an important aspect of the design of a secret key cipher. So this kind of thing is just shown here as an example. This would never be done. What is the number of inputs, by the way, in AES? How many inputs go into each S-box in AES? Just think for a minute. I don't want to catch you by surprise, but you must have taught this. There must be an S-box in that whole thing. What is the size? Just think properly. Let's just look at a few significant facts about AES. Block size? 128 bits. Okay. Key size? 128 bits.
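The program suggested above for hunting linearities might start out like the sketch below. It works on the S-box alone: for every pair of input and output bit masks it counts how often the XOR of the selected input bits equals the XOR of the selected output bits, and reports the largest deviation from one half. Two 4-bit S-boxes are compared. `SWAP_SBOX` extends the toy mapping mentioned earlier (0 to 0, 1 to 2, 2 to 1) by assuming it simply swaps the two low bits, and `DES_ROW_SBOX` is the first row of the DES S-box S1. All function names are my own.

```python
def parity(value: int) -> int:
    """XOR of all the bits of `value`."""
    return bin(value).count("1") & 1


def max_linear_bias(sbox, in_bits: int, out_bits: int):
    """Return (bias, input_mask, output_mask) with the largest |bias|.

    bias = (number of inputs where the masked bits agree) - half of all
    inputs. Zero means no linear relation; +/- half is a perfect one.
    """
    size = 1 << in_bits
    best = (0, 0, 0)
    for in_mask in range(1, size):
        for out_mask in range(1, 1 << out_bits):
            agree = sum(
                parity(in_mask & x) == parity(out_mask & sbox[x])
                for x in range(size)
            )
            bias = agree - size // 2
            if abs(bias) > abs(best[0]):
                best = (bias, in_mask, out_mask)
    return best


# Assumed toy S-box: swap the two low bits (0->0, 1->2, 2->1, 3->3, ...).
SWAP_SBOX = [(x & 0b1100) | ((x & 1) << 1) | ((x >> 1) & 1) for x in range(16)]

# First row of DES S-box S1, used here as a 4-bit to 4-bit S-box.
DES_ROW_SBOX = [14, 4, 13, 1, 2, 15, 11, 8, 3, 10, 6, 12, 5, 9, 0, 7]

if __name__ == "__main__":
    print("swap S-box:", max_linear_bias(SWAP_SBOX, 4, 4))
    print("DES row   :", max_linear_bias(DES_ROW_SBOX, 4, 4))
```

A bias of 8 out of 16 inputs means some output bit is an exact linear function of the input bits, which is the worst case. A full linear cryptanalytic attack chains such S-box approximations across rounds to relate plaintext, ciphertext and key bits, which this sketch does not attempt.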
128 bits; you can have higher also, but let's talk about the simplest one: 128 bits. Now let's talk about the S-box. Do you know how the S-box works in AES? There are some slides on it later, but I'll just talk about it a little earlier. This 128 bits is represented as a matrix. What is the size of that matrix? The plaintext is 128 bits, and it's represented as a four by four matrix of bytes. Each byte is eight bits, so eight multiplied by four multiplied by four is 128. So that's how you represent this thing. And then it goes through multiple steps: byte substitution, something to do with rows, something to do with columns, that is, row shift and column mixing, and the round key operation. That substitution is what I'm talking about when I look at the picture which I just showed. That substitution is a critical thing. So how is that substitution done? Just think for a minute. Is it similar to this picture? This is a template. I'm not saying everything should look exactly like this, but the idea is there: substitution, permutation, round key operation. Now in the case of substitution, I see those S-boxes somewhere in the middle over there and I see multiple ones of them. What exactly is happening in the case of AES? In the case of AES, there'll be 128 bits of input and 128 bits of output. There'll be 10 of these rounds. What is the S-box exactly? How many S-boxes in each stage, in each round? 16, right? So you take those 128 bits, represented as a four by four matrix, and each byte is going to be substituted. That's why it's called byte substitution. Each byte is substituted by looking at some matrix. Lookup table. Okay, so we'll come to it in greater detail. So this is an example of an S-box, and this is a P-box, nothing but a permutation of the bits. And then a round key operation.
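As a hedged sketch of the byte substitution step: the code below lays a 16-byte block out as a four by four matrix and replaces each of the 16 bytes through a 256-entry lookup table. The lecture only says a lookup table is used; the way the table is generated here (multiplicative inverse in GF(2^8) followed by a fixed affine transformation) and the column-by-column layout follow the published AES specification, FIPS 197, not the lecture. Function names are my own.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
        b >>= 1
    return result


def gf_inverse(a: int) -> int:
    """Return a^254, the multiplicative inverse (0 maps to 0)."""
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def rotl8(x: int, n: int) -> int:
    """Rotate a byte left by n bits."""
    return ((x << n) | (x >> (8 - n))) & 0xFF


def sbox_entry(x: int) -> int:
    """Inverse in GF(2^8), then the affine transformation with constant 0x63."""
    b = gf_inverse(x)
    return b ^ rotl8(b, 1) ^ rotl8(b, 2) ^ rotl8(b, 3) ^ rotl8(b, 4) ^ 0x63


SBOX = [sbox_entry(x) for x in range(256)]  # built once, then only looked up


def to_state(block: bytes):
    """Arrange 16 bytes as a 4x4 matrix, filled column by column."""
    if len(block) != 16:
        raise ValueError("an AES block is exactly 16 bytes")
    return [[block[row + 4 * col] for col in range(4)] for row in range(4)]


def sub_bytes(state):
    """Byte substitution: 16 independent table lookups per round."""
    return [[SBOX[byte] for byte in row] for row in state]


if __name__ == "__main__":
    state = to_state(bytes(range(16)))
    for row in sub_bytes(state):
        print([f"{byte:02x}" for byte in row])
```

So each AES S-box has eight input bits and eight output bits, and the same table is applied to all 16 bytes of the state.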
So from the original key, you derive all those round keys, 10 round keys or 16 round keys; all of them are a function of the original key. And then, in a particular round, you typically take the round key and whatever is the input to that round, you exclusive-OR them, and that's the output. So for example, if I go back to that picture, look at the first rectangle that you see out there; that is the round key operation. You take each bit and you combine it with the corresponding bit of the round key. So for example, there you would have nine bits of plaintext and a round key which is nine bits, and you exclusive-OR the first bit, P1, with the first bit of the round key to get the first output bit, and so on and so forth. So that is the round key operation, a very straightforward operation. The interesting thing is, how did you get the round keys from the original key? Now that again is a big story. So you can read the text for the details of how you get them in the case of AES.
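Putting the pieces together, here is a toy version of the three-round picture: a 9-bit block, a round key XOR, three 3-bit S-boxes side by side, and a 9-bit permutation, repeated once per round key. Everything specific is an assumption made for illustration: the S-box table, the P-box wiring, and above all the round keys, which are simply listed by hand instead of being derived from an original key by a key schedule.

```python
SBOX = [3, 6, 0, 5, 7, 1, 4, 2]              # hypothetical 3-bit S-box
INV_SBOX = [SBOX.index(v) for v in range(8)]
PBOX = [0, 3, 6, 1, 4, 7, 2, 5, 8]           # output bit i is input bit PBOX[i]
INV_PBOX = [PBOX.index(i) for i in range(9)]
ROUND_KEYS = [0b101100111, 0b010011010, 0b111000101]  # assumed, not derived


def substitute(block: int, table) -> int:
    """Run each 3-bit chunk of the 9-bit block through the S-box table."""
    out = 0
    for chunk in range(3):
        value = (block >> (3 * chunk)) & 0b111
        out |= table[value] << (3 * chunk)
    return out


def permute(block: int, wiring) -> int:
    """Move bits around: output bit i takes the value of input bit wiring[i]."""
    out = 0
    for i, src in enumerate(wiring):
        out |= ((block >> src) & 1) << i
    return out


def encrypt(block: int, round_keys) -> int:
    for round_key in round_keys:
        block ^= round_key               # round key operation
        block = substitute(block, SBOX)  # S-boxes
        block = permute(block, PBOX)     # P-box
    return block


def decrypt(block: int, round_keys) -> int:
    for round_key in reversed(round_keys):
        block = permute(block, INV_PBOX)
        block = substitute(block, INV_SBOX)
        block ^= round_key
    return block


if __name__ == "__main__":
    plaintext = 0b110010101
    ciphertext = encrypt(plaintext, ROUND_KEYS)
    print(f"{ciphertext:09b}")
    print(f"{decrypt(ciphertext, ROUND_KEYS):09b}")  # back to the plaintext
```

Decryption just runs the same three operations backwards with inverted tables, using the round keys in reverse order. A real cipher has far larger blocks, more rounds, carefully designed S-boxes and a proper key schedule.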