So, there are some basic terms here in security. One of them is risk, another is trust, another is vulnerability. In my reading and thinking about this area, I have found these three terms very important, and I'll tell you briefly why. Risk has to do with why you want to secure a system at all. If there is nothing in the system to protect, nothing in the database, then why protect the system in the first place? So it really depends on your perception of how sacred, how precious the data is that you are trying to secure. If it's not very precious, if it's just emails from friends with nothing private in them, then why would you spend thousands and lakhs of rupees making that data secure? So one of the things is risk. Another is trust, and I'll talk more about this. In fact, there is a standard called WS-Trust, Web Services Trust. And the third term that I find very interesting is really vulnerabilities and attacks.
So what exactly is a threat? What exactly is a vulnerability? A threat to a computing system is a set of circumstances that has the potential to cause loss or harm. A vulnerability is a weakness in the design, implementation, or procedures of a security system that may be exploited to cause loss or harm. So a threat by itself is not a problem, okay? It becomes a problem when there is a vulnerability that it can exploit. The weakness could be in the design of a system or in its implementation. It could simply be in procedures that have to do with social relationships, social engineering problems and things of that sort. So the key word here is vulnerability, and weakness is probably its closest synonym. A weakness in what? It could be in the design, in the procedures, in the implementation, et cetera.
Now what are the vulnerabilities that are exploited to actually result in attacks? We already know some of these things. A human who exploits a vulnerability in a computer system perpetrates an attack on the system. You are all familiar with tens and tens of different kinds of attacks, from denial of service attacks to phishing attacks to cross-site scripting attacks, et cetera. And all of these attacks, incidentally, exploit some vulnerability in the system, so you might want to think about what these different vulnerabilities are.
I've tried to taxonomize some of these attacks. You will figure out later why I call them type one, type two, and type three. Under type one I've included buffer overflow, phishing attacks, cross-site scripting attacks, SQL injection attacks, and a variety of worm and virus attacks. Look at this list carefully. I'm going to show you another list, and you tell me why you think some things are on one list and some on the other. There is nothing sacred about this taxonomy. I've just tried to group the attacks into types. Type two are man-in-the-middle attacks, replay/reflection attacks, impersonation, eavesdropping, message corruption, denial of service, and pharming attacks. And then I've got a third type also. See if you can figure out what is common among all the type one attacks and what is common among all the type two attacks.
So what ties together buffer overflow, cross-site scripting, and SQL injection? Cross-site scripting is a special kind of phishing attack. A phishing attack is basically one where the user is induced, by social engineering techniques for example, into doing something with, say, an email message.
So he gets an email message. It appears to be from somebody he knows, from his bank, for example. The message induces him to click on a certain link, and he clicks on it. And the next thing you know, cookies from his browser have been stolen, or some files have been deleted, et cetera. So these are examples of phishing attacks, okay? To some extent you need interaction, some action on the part of the user. He receives an email or visits a website, there is a link there, he is induced to click on it, and lo and behold, some harm is caused.
Now look at another type one attack, SQL injection. How many of you have heard of SQL injection attacks? So what exactly is going on in an SQL injection attack? Give me a typical scenario. There is a... Inputting a value... Exactly, you are inputting a value. Where? Into a form, for example, on the internet. You receive a form, and it looks very innocent. You have to enter your name, your login name, your password, whatever it is. And then what happens? What does the attacker do? The back end is affected. Sorry? The back end is affected. So this is actually the database in an organization or corporation. It's typically the sanctum sanctorum of the organization, because that's where all the sensitive, critical data is stored. Now we're trying to launch an assault on this very sanctum sanctorum, the database itself, and one way of doing this is the so-called SQL injection attack. So the attacker types in something. What is that something? He's asked, for example, for the password, and you would imagine he types the password. But in addition, he types something that is also part of an SQL query.
So that thing, which is a password cum part of an SQL query, goes to the other side, that is, to the server side. And then what happens there? It is very vulnerable. Sorry? It is very vulnerable. Yes. So the application itself takes it and constructs an SQL query, which is then delivered. You know these multi-tiered architectures. You've got the application tier, which takes this input through a servlet or JSP or whatever, constructs an SQL query from it, and hands it over to the database tier. And now you've got an SQL query that has been partly fabricated by the attacker. It could include things like "or 1=1", which makes the condition always true. Say I don't know this person's password. I know his login name is John. So for the password I type in some arbitrary thing, then an apostrophe, and then "or 1=1", which guarantees that this clause of the SQL query is true. So the query will execute, and the next thing you know, the attacker gets all sorts of information, because the predicate evaluates to true. He gets information about people's passwords and so on and so forth.
Now what is the vulnerability in this particular scenario? Yes? Sir, I'm not quite sure, but the SQL query attack works because of a vulnerability in the programming. Exactly, exactly. That is the key word, programming. The vulnerabilities are in people's programs. So in the SQL injection attack in particular, what should have happened, and what happened instead? What should have happened is that the application program should have looked at the input the user was giving and said, this doesn't seem like normal input. There are all sorts of funny characters, like an apostrophe, and part of an SQL statement, "or 1=1". How can that be somebody's password?
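The query construction just described can be sketched in a few lines. This is only an illustration under stated assumptions: Python's built-in `sqlite3` module stands in for the servlet/JSP and database tiers, and the `users` table, its columns, the sample rows, and the function names are all hypothetical. The second function shows one common alternative, a parameterized query, next to the string-pasting version.

```python
import sqlite3

# Hypothetical in-memory database standing in for the database tier.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (login TEXT, password TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("john", "s3cret"), ("mary", "hunter2")],
)

def login_vulnerable(login, password):
    # Sloppy coding: the user's input is pasted straight into the SQL text.
    query = (
        "SELECT login FROM users WHERE login = '" + login +
        "' AND password = '" + password + "'"
    )
    return conn.execute(query).fetchall()

def login_parameterized(login, password):
    # The values are passed separately, so they are never parsed as SQL.
    query = "SELECT login FROM users WHERE login = ? AND password = ?"
    return conn.execute(query, (login, password)).fetchall()

# The attacker does not know John's password and types this instead.
attack = "x' OR '1'='1"
rows_vulnerable = login_vulnerable("john", attack)    # predicate is always true
rows_safe = login_parameterized("john", attack)       # treated as a literal string
```

With the pasted-together query the condition holds for every row, so the attacker gets back all the logins. With the parameterized query the same input is just an odd-looking password that matches nobody.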
It could be a genuine password, of course, but in general that is highly unlikely. So the application should have had the intelligence to check all this input and then reject that particular user. Instead, it just takes the input and constructs the SQL query dynamically, and therein lies the problem. So each of these attacks that you can see, buffer overflow, phishing, cross-site scripting, SQL injection, and a variety of worm and virus attacks too, not all of them but many of them, are due to sloppy coding.
On the other hand, look at these other attacks I've listed, such as impersonation, which might be caused by replay attacks, for example, or pharming attacks. What are pharming attacks exactly? You've seen phishing attacks, where you try to fish out information such as a user's customer ID if he's doing internet banking. But what are pharming attacks? You are taken to some other, different web page. It's caused basically by vulnerabilities in one particular protocol. Which protocol? DNS. So the domain name service, which is actually a translator from a domain name, say www.it.itb.ac.in, to some IP address. Somebody has messed with this translation. There's an attacker who has probably poisoned the cache in the DNS server, for example.
So you see now these attacks over here, things like pharming attacks and denial of service attacks. There are a variety of denial of service attacks. What might these be caused by? You've heard of the SYN flood attack, the simplest and earliest example of a denial of service attack, or of a distributed denial of service attack. You've got a victim, and he's bombarded with requests to establish a connection with him, right?
So you've got one particular victim, which is bombarded with requests. They could come from one source, or from a variety of sources, in which case it's known as a distributed denial of service attack. He's bombarded in the sense that there are multiple requests to establish a connection, and each connection involves the three-way TCP handshake: there's a SYN, a SYN-ACK, and an ACK. Every time you get a SYN, you think it's from a legitimate user at the other end. You reserve some buffer space and so on, and you respond with a SYN-ACK, that is, with those two flags set in the TCP packet. You've reserved all this space, but you're getting a barrage of these requests, so you actually run out of memory. You might also run out of computational power. You might also run out of network bandwidth. So the attacker is actually crippling, paralyzing this victim. He can't do anything, because he's just trying to service all these requests, which are in fact phony requests from an attacker, or from an attacker who is controlling other machines, what are called zombies.
So herein lies another vulnerability, not a DNS vulnerability, but a vulnerability in another networking protocol, which in this case happens to be TCP. Spoofing attacks exploit a vulnerability too, because a source IP address can be spoofed. There is no way you can prevent that from happening, no easy way at least, because the TCP and IP protocols were designed 50 years ago, before all these hackers and others came onto the scene. The designers did not anticipate at that time that security would be a problem. That is the case both for networking protocols and for software. These things were designed with correctness number one, performance number two, robustness and reliability number three, but never with security.
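Going back to the SYN flood for a moment, here is a toy model of why reserving space on every SYN is a weakness. It is a sketch under simplifying assumptions, not how any real TCP stack is written: the class `ToyServer`, its fixed-size table of half-open connections, and the address strings are all made up for illustration, and timeouts and other defenses are left out.

```python
from collections import deque

class ToyServer:
    """Toy model of a TCP listener with a fixed half-open connection table."""

    def __init__(self, backlog_size):
        self.backlog_size = backlog_size
        self.half_open = deque()    # connections still waiting for the final ACK
        self.established = []

    def on_syn(self, source):
        # Each SYN reserves a slot before the sender has proven anything.
        if len(self.half_open) >= self.backlog_size:
            return False            # table full: this SYN is dropped
        self.half_open.append(source)
        return True                 # the server would now reply with SYN-ACK

    def on_ack(self, source):
        # The final ACK of the three-way handshake frees the reserved slot.
        if source in self.half_open:
            self.half_open.remove(source)
            self.established.append(source)
            return True
        return False

server = ToyServer(backlog_size=4)

# The attacker sends SYNs from spoofed addresses and never sends the ACK.
for i in range(10):
    server.on_syn(f"spoofed-{i}")

# A legitimate client now tries to connect and is turned away.
legit_accepted = server.on_syn("legit-client")
```

The table fills after four phony requests, and from then on every new SYN, including the legitimate one, is refused.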
Nobody thought of it. What we've seen in the last 10 years or so, all sorts of denial of service attacks, worm attacks, virus attacks, et cetera, was not anticipated 50 years ago. One of the earliest worms, the Morris worm, was in 1988 or so. As a result, it has now been found that a lot of software might be correct, might perform with acceptable levels of performance and reliability, but might still be insecure. It's the same thing with networking protocols. So if you look carefully, the type two attacks, the examples that I've given, are essentially attacks that stem from vulnerabilities in different networking protocols. TCP, for example, and ICMP and UDP could result in denial of service attacks. DNS could result in pharming attacks, and so on and so forth.
And then there are other types of attacks, which could be caused by a variety of reasons, things like web defacement. I've included them because you might have noticed them in connection with some of the worm and virus attacks, such as the Code Red worm, a fast-moving internet worm, and also the Slammer worm. Some of these caused web defacements, for example. Some of them caused identity theft. You must have heard of many cases where a merchant's database has been raided and a whole bunch of credit card numbers has disappeared from there, or has been obtained by some attacker. So identity theft, data theft, file deletion, and disk erasure.
The reason I'm pointing out all this is that web services arrive, in the year 2000 or so or shortly before that, in this very dangerous environment with all these worms and viruses and other kinds of attacks. And many people have said that security is going to be one of the critical issues in this whole thing. Because another thing about these web services is that they all sit on top of SOAP, which sits on top of HTTP, at least in most cases. And the HTTP protocol usually passes through firewalls.
So now there is a hole through all these firewalls. You're going to allow HTTP traffic, because you want outside people to access your web server, but as part of this traffic you're also going to allow web services and SOAP traffic and so on, because SOAP sits on top of HTTP, which is on port 80 or 8080. So aren't you now opening a big hole in your enterprise and allowing all sorts of malware to enter? This is the big question and the big concern.
So how do we address some of these concerns? In connection with attacks and vulnerabilities, there are protective measures. A control is an action, a device, a procedure, or a technique that reduces or eliminates a vulnerability. Take a procedure. For example, a procedure in an organization could be that everybody should have passwords that are at least eight characters long, and not all of the characters can be alphanumeric. You should have punctuation marks and other things also. Another could be that the password should be changed at least once in two months, and so on. So these are procedures, and you train people to follow them. A control could also be an action. It could be a device, like a firewall or an intrusion detection system, or a technique that reduces or eliminates a vulnerability. And a threat is blocked by control of a vulnerability. You control the vulnerability. You make sure that the vulnerability doesn't exist.
There are, of course, detection mechanisms, prevention mechanisms, and recovery mechanisms. You should look at each of these three and ask, in the context of buffer overflow, for example, would you prescribe detection or prevention or recovery? For each of the problems cited, DNS attacks, et cetera, what would be your modus operandi? Would you choose detection, prevention, recovery, or some combination of these?
So these are things to think about. As far as internet communications are concerned, different kinds of attacks are possible just on the communication line itself. You can intercept the communication. You can interrupt it. You can modify packets as they pass. You can fabricate packets that never existed and inject them, so that the receiver thinks they're coming from the authentic sender.
Now you must have seen this list. How many of you have seen this list? Okay, so these are some of the features of secure communication. You want to ensure authentication, integrity, confidentiality, non-repudiation, access control, and availability. Let us very briefly look at some of these, and then we'll see which of them are supported in web services and which standards are used to support them.
Authentication, very simply, is the process of determining whether someone or something is, in fact, who or what it is claimed to be. If I say my name is John and I log in, how does the other side know that it is John who has logged in and not somebody else? The most basic thing that we are familiar with is that John types his password. This is based on something that he knows. He could be using a smart card instead of, or in conjunction with, his password, in which case it would be something that he has, like an identity card or a driver's license. And then it could be something that he is, like a biometric, which is getting increasingly popular. Nowadays you have combinations of these things, which is called multi-factor authentication. You might, for example, have a passport or a smart card with your fingerprint embedded inside, stored there in digital form. And to activate the smart card, you might need a PIN. I don't know how many of you have debit cards and use them at point of sale terminals, but some of these actually have multi-factor authentication.
You might need the smart card itself, so what you have, plus what you know, a PIN to activate the smart card. And some of the banks over here are now thinking of smart cards that also have the fingerprint inside. So a person in a rural village, for example, who might not know how to type in a PIN, might just put a finger on the fingerprint reader and be authenticated that way.
Integrity. Integrity is the assurance that data sent is uncorrupted, that is, it is received without modification, insertion, deletion, or replay. Whatever was sent by the sender has been received at the receiver's side without even a bit being changed. Confidentiality is the protection of transmitted data from passive attacks such as eavesdropping. Somebody who is tapping the line should not be able to figure out what it is that you're sending.
And finally, non-repudiation is protection against denial, by one of the entities involved in a communication, of having participated in all or part of the communication. In other words, two parties A and B are communicating, and A has sent a message. It should not be possible for A to deny later on, in court for example, that A actually sent this message, this purchase order, say. So either he sends it and denies having sent it, or he doesn't send it and says that he did. In both cases, I want to figure out whether he actually sent it or not. So I need this feature called non-repudiation. And how do we guarantee non-repudiation in secure communications? What is the mechanism for that? Sorry? Acknowledgement. What else? Digital signature. An acknowledgement can be fudged. I can always write an acknowledgement myself, because we can spoof IP addresses and so on. So I can't really take it to a court of law and tell the judge, here is my evidence.
You want something that you and only you can create. The digital signature is the thing that gives you that kind of protection, and we'll talk about it briefly next.
There are some other attributes of security. Authorization is finding out whether the person, once identified, is permitted to have the resource. Access control is closely related to it. It's a more general way of talking about controlling access to a web resource. It's based not just on who you are but on a variety of other criteria, such as network address: if you come from this bunch of addresses you're allowed, otherwise you're not. If you log in at this time of the day, you can use the printer, otherwise you can't. If you're using this particular browser, you can proceed, otherwise not. That kind of thing.
And then availability. Availability is the property of a system or system resource being accessible and usable. I want to be able to use the telephone line 24 hours a day, 24 by seven. And not just accessible and usable, but in keeping with certain performance specifications. I would like to use the internet 24 by seven, but if it's going to be very, very slow, then it's almost unavailable to me, right? There's no point saying I can use it if I have to sit there for three minutes before I get a response from the terminal. That's why performance comes into the picture. So the resource is accessible and usable, in keeping with performance specifications, upon demand by an authorized entity.
So this is a summary of these different terms: confidentiality, authentication, trust, integrity, non-repudiation, authorization, and auditing. It's basically what we have said, just quick summaries of each. And these are some of the other terms related to network security.
System security, data security, database security, operating system security, program security. You can take your pick, but I sometimes prefer the term system security, because it sort of includes all of these.
Okay, so we're coming to the question that was just posed. Let me introduce very briefly the pillars of e-security, the two pillars that make all of this possible. Digital signatures, MACs. How many of you know what a MAC is? I'm not talking about the MAC layer in a network. Not medium access control, that's another MAC. This is the message authentication code. So my question to the audience is: a message authentication code, what is it and what does it guarantee? Does it guarantee non-repudiation, yes or no? How do you compute a MAC? Okay, we're going to come to all of this in a second, before I answer this gentleman's question.
So the main pillars are cryptography and this wonderful thing called a cryptographic hash. Cryptography is the science of keeping messages secure. There are two functions to consider: one is encryption and the other is decryption. In the context of web services we would want, for example, to encrypt some part of the message, not necessarily the entire message. And different parts of the message could be encrypted with different keys. So you've got to think about how we actually guarantee these kinds of features in a typical web service.
Encryption is this function E applied to P. What do P and C stand for? P is the plaintext and C is the ciphertext. Capital E is the function and little e is the key. So the function E, with the encryption key e, takes the plaintext and renders it as something disguised, which is the ciphertext. Decryption is the opposite process. It takes the ciphertext and gives you back the plaintext, using the decryption key d and, of course, a decryption function capital D.
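Written out in symbols, the two functions look like this. The subscript notation is an assumption made here for compactness. The lecture itself only names the functions E and D, the keys e and d, the plaintext P, and the ciphertext C.

```latex
C = E_e(P), \qquad P = D_d(C) = D_d\bigl(E_e(P)\bigr)
```

The second equation just says that decrypting with the right key undoes the encryption.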
Now there are two types. One is called secret key cryptography, the other public key cryptography. Secret key cryptography is also called symmetric key cryptography, because e is equal to d: the encryption key and the decryption key are the same. The best examples of this are DES and AES. How many of you are familiar with AES and know how it works? AES stands for Advanced Encryption Standard, and DES for Data Encryption Standard. DES came up in the 70s. It started off with IBM and then became a standard. Then it was found in the late 90s that DES is rather insecure. You can use a supercomputer and break it. So NIST, the National Institute of Standards and Technology in the US, decided they wanted another standard very soon, and they started soliciting proposals for it. There were many semi-finalists and five finalists, and out of those five finalists a team of mathematicians and cryptographers from Belgium finally won. Their scheme is what is now called AES, which is more secure and has better performance than DES.
Very quickly, what is the key size for DES? DES is supported, for example, in a protocol that you all know, SSL, the protocol for web security. Now the question is, can we also use SSL for web services? For many first generation web applications, like internet banking, you use SSL. Whenever you type https there, that s is transparently setting up an SSL socket connection between the browser and the server. You don't see what's going on, but that's what is actually happening. The two sides are negotiating and agreeing upon secret keys and so on. It's a very interesting protocol, SSL, and it makes use of all the things that we're talking about here.
So they still use DES today, for example, but in maybe two years people are going to switch more and more to AES. So I have a very quick question for you. Your students ask you, what is the key size of DES? What's your answer? 56-bit keys, and that is very insecure. So use 128-bit keys today. Very often, if you're doing internet banking with your bank here or with a bank abroad, the bank will complain and say, we can't continue, because you're using just 56-bit DES. The older browsers might not support 128-bit encryption, and then the bank can't continue the conversation because, as I mentioned, 56-bit DES has become insecure. So the solution is to go to a cipher with 128-bit keys, or to use something called triple DES. And even better than that is to use AES. These are examples of secret key cryptographic algorithms.
Now the disadvantage of secret key cryptography is this. Let's suppose we have 40 people in this room. If I want to communicate with anyone here, I would have to share a separate key with each individual, a separate key with her, with her, with him, and so on, which is a management nightmare. The solution to that is to use public key cryptography. In public key cryptography, the two keys are distinct but related to each other. Nevertheless, knowing one of them, you can't deduce the other. The symbol e is used for the public key. Actually e stands for encryption key. So e is the public key, which is supposedly known to everyone. And d is the decryption key, which is private. As this gentleman was saying, this is something that you have to keep securely to yourself. Now the question is, how do you keep it secure? And what happens if it is not secure? What's the answer to that? No, no... Okay, can I remember it? Is that a practical thing to do? Can I remember my private key?
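To put numbers on the key management nightmare mentioned a moment ago, here is a small calculation for the room of 40 people. The function names are made up for illustration, and the count assumes that every pair of people who might communicate needs its own shared secret.

```python
def pairwise_secret_keys(n):
    # With secret key cryptography every pair needs its own shared key:
    # n choose 2 keys in total.
    return n * (n - 1) // 2

def public_key_pairs(n):
    # With public key cryptography each person has one key pair.
    return n

people = 40
keys_i_must_hold = people - 1                    # one shared key per other person
keys_in_total = pairwise_secret_keys(people)     # across the whole room
key_pairs = public_key_pairs(people)
```

For 40 people that is 39 shared keys for me alone and 780 across the room, against 40 key pairs with public key cryptography.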
So notice that we've now talked about three things. Please get your fundamentals straight. We talked about a secret key, in secret key cryptography. Now we are suddenly talking about a private key and a public key. I'm saying that the public key is supposed to be known to the whole world. Why? We'll come to that. And I'm also saying there's a private key, which is supposed to be known only to you. Now I asked, how do you keep that private key really private? One answer was, learn it by heart and don't show it to anyone. Don't show it to anyone is correct. No, no, wait, wait. But you said learn it by heart, and I'm terrified by that prospect. Why? I'm saying it's impractical for me to remember this private key. Why? 128 bits. What's there to remember in 128 bits? There are a lot of private keys. No, no, remember just your own private key. What's the problem? I'm saying 128 bits is not a problem. I can write down 128 bits in hex on this paper. But am I talking sense in the first place? That's my question. Is it really 128 bits?
Your student asks you, what is an RSA key size? The public key, the private key, the modulus. What's your answer? What is practical today? They used to use 512-bit keys. That's completely outdated now. It's totally insecure. You have to use at least 1,024 bits, and it's recommended to use 2,048 bits. Now can you remember that? And it's not just remembering it. I've got to use it in a program, right? What am I supposed to do with a private key? What do people do with a private key? Is it necessary for you to know what your private key is? No. What do you do with it? What is it used for? Tell me two or three different things that you'd use the private key for. Authentication. Let us see how. It's not so obvious how. Tell me, what exactly is the operation that uses a private key? Encrypt something. Let's see. I want to send you a message. Am I going to encrypt it with my private key?
Yes. My heavens, this is going to kill me. So I want to send you a message, and I encrypt it with my private key. And then you decrypt it with my public key or your public key? My public key. So this thing goes to you. But I said before that everybody has my public key, so all of you can read the message. Please get this straight. If I encrypt it with my private key, everybody can read it, correct? So what's the sense in doing that? I can authenticate that it has been sent by the... No, I'm talking about encryption. I want to send you a message and I don't want anyone else to read it. Okay, let's start with confidentiality, since this is the most obvious. Tell me quickly, because we are running out of time. We have to know these fundamental things, okay? Because you're going to actually program this in your... You encrypt it with my public key. Very good. So that's the first application. I was asking you what I am going to do with my private key, and this is the first answer. I want to send her a message which is supposed to be secret. So I encrypt it with her public key, and then she decrypts it with her private key. That is the first application of the private key. Somebody sends you a message encrypted using the receiver's public key, and the receiver decrypts it using his or her private key.
Any other very compelling example? No, be more specific. Non-repudiation. How? Suppose you send a message to me. You have to encrypt it with my private key. Sorry, your private key. I decrypt it with your public key. Very good. Right, exactly. So I send a message to her and I encrypt the entire thing using my private key. So I send you both the message and the encrypted version, right? Yes. Nothing? Well, then... But then how do you know somebody hasn't changed some bits along the way? Two levels. First it is encrypted with my... Very good, very good.
So I think we understand what she's saying. She says, I've got a message, I encrypt it with my private key, and then I encrypt it with her public key. Is that correct? Then she'll take the message, decrypt it with her private key, and then use my public key to actually get the message back. So two steps like this. Now, let me warn you about this. This is correct, but there's a problem with it. What is the problem? Okay, that's it. So there are lots and lots of problems, as you can see. Just imagine, these are problems in regular web applications, forget web services. He mentions the problem of how I actually disseminate my public key. Very good, very good. We are coming to that in a second. You said registration authority. Before that, the term to use is digital certificate. We will all disseminate our public keys using something called a digital certificate, and these are typically issued by somebody called a certification authority.
Now the main problem with doing what was just mentioned, namely encrypting and decrypting everything using public key cryptography, is that it is very expensive. It is very time consuming. So we want some combination of secret key operations and public key operations. I told you the secret key operations have this management nightmare: I have to remember the shared secret for everybody I communicate with.
Sorry? Exactly, I got distracted with something else. The question that this gentleman asked, and that she is now asking, is how I keep my private key really private. Somebody answered, why don't you remember the whole thing? And as I said, what is the problem with that? It's too large, to start with. The second thing is that you don't need to remember it. It has to be used, and that's why I asked you what the applications are: doing things like decryption, or doing things like signing.
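Both uses of the private key, decryption and signing, can be seen in a toy RSA sketch. Everything here is an assumption for illustration: the primes are textbook-sized and hopelessly insecure, the function names are made up, there is no padding, and the signature is applied to a SHA-256 digest reduced to fit the tiny modulus rather than to the whole message. It is meant only to show which key is used for which operation.

```python
import hashlib

# Toy RSA parameters (far too small for real use).
p, q = 61, 53
n = p * q                       # modulus
phi = (p - 1) * (q - 1)
e = 17                          # public exponent: (e, n) is the public key
d = pow(e, -1, phi)             # private exponent (needs Python 3.8+)

def encrypt_for_owner(m):
    # Anyone can do this: it needs only the public key.
    return pow(m, e, n)

def decrypt(c):
    # Only the key owner can do this: it needs the private key d.
    return pow(c, d, n)

def digest(message):
    # Toy digest: SHA-256 reduced modulo n so that it fits the tiny modulus.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n

def sign(message):
    # Signing applies the private key to the digest of the message.
    return pow(digest(message), d, n)

def verify(message, signature):
    # Verification needs only the public key.
    return pow(signature, e, n) == digest(message)

# Use 1: confidentiality. A short secret (say a session key) is encrypted
# with the receiver's public key and recovered with the private key.
session_key = 1234              # hypothetical value, must be smaller than n
wrapped = encrypt_for_owner(session_key)
recovered = decrypt(wrapped)

# Use 2: signing. Only the holder of d can produce this signature.
order = b"purchase order #1"
signature = sign(order)
signature_ok = verify(order, signature)
```

Note that only the short session key and the short digest go through the expensive public key operation, which is one way of combining secret key and public key operations.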
Now the question remains, how do I keep it private? I don't want anyone to see it. So what do I do? Where do I store it? I can have my own secret key with which I can protect it, and use it whenever I need it. Keep it where? Keep it on your hard disk. Okay. Whenever I want to use it, I can use it. Okay. So you have your private key, you store it on your hard disk and you encrypt it using a secret which is based on some PIN or password, which is fine. The fact of the matter is you're actually storing it on your hard disk, and there have been many hacking attacks which try to retrieve things from the hard disk. A probably better way, which is not so much used as yet, is exactly what he's saying: the smart card. So the smart card can, or is advertised to, store the private key absolutely securely so that absolutely nobody can take the smart card and retrieve the private key from inside. That is exactly why you need a processor on the smart card. What is the meaning of that? You need a processor on the smart card because there's a private key on the smart card. Does that make sense? Look at my statement. I said, I'm storing the private key on my smart card. Therefore I need a processor on the smart card. Does this statement make sense? The smart card needs some... Okay, you have to read from it, so what? It needs some kind of... Why do I need a processor on it? If I need the data, I can read it. I can read from a normal credit card, right? You store your credit card number on the credit card. You can read it. Is there a processor on the credit card? To check that the smart card user is the right person? Why can't I do it just like the credit card? I have that magnetic stripe. I store my information on that magnetic stripe. Why do I need a processor? Give me a convincing answer to this. Why do I need a processor? Most of the smart cards are so-called processor smart cards. That is to say, they have processing ability.
Some of them even have a full Java virtual machine on them. Yes, I've dated it. To do what exactly? What is the operation that needs to be done on the smart card that you need a processor on it? If I'm just storing my information, credit card number or this number or that number, I just use a mag stripe kind of card like your credit card. For what? I can take it to an ATM. The ATM can generate a key and load it onto the smart card. Why do I need a processor on the smart card? Give me a convincing reason from the point of view of security. If something has to change, I can take it to an ATM and change it over there. Why is it so important to have a processor and to come up with such a fancy architecture, you know, Java Card virtual machine and all this other stuff? Okay, let me give you the answer because we don't have much time. The answer is that the private key is stored on the smart card. I don't want the private key to leave the smart card to go to the smart card reader or to the PC that's connected to the reader. I'm so paranoid about this private key that I don't want it to leave the smart card. I don't want anybody to be able to figure it out. Perhaps even I don't know what's inside. Because of that, any operation that involves the private key has to be done on the smart card. And what are those operations? We just talked about some of them. One is decryption. You don't actually use it in the way that was mentioned, on the whole message, but you do decrypt the session key. And the other thing, what else do you do with the private key? Most important, you use it for which operation? Signing, right? To generate a digital signature, you use the private key. That must be done on the smart card. So you need a processor on the smart card to do any kind of operation that involves the private key because you do not want the private key leaving the smart card.
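To make the smart card argument concrete, here is a minimal sketch of the interface a processor card offers to the host. Everything in it is an assumption made for illustration: the class name `ToySmartCard`, its method names, the textbook RSA arithmetic without padding, and the two Mersenne primes used as a toy key. A Python object cannot really hide a value the way tamper-resistant hardware does, so the sketch only shows the shape of the idea: data goes in, results come out, and there is no call that returns the private key.

```python
import hashlib
from typing import Tuple


class ToySmartCard:
    """Hypothetical model of a processor card (illustration only, not secure)."""

    def __init__(self) -> None:
        # ASSUMPTION: two well-known Mersenne primes stand in for randomly
        # generated primes. A real card would generate its key on the card.
        p, q = 2**127 - 1, 2**521 - 1
        self.n = p * q
        self.e = 65537
        # Private exponent: kept inside the card object, never returned.
        self.__d = pow(self.e, -1, (p - 1) * (q - 1))  # needs Python 3.8+

    def public_key(self) -> Tuple[int, int]:
        """The public key may be handed out freely."""
        return self.n, self.e

    def sign(self, message: bytes) -> int:
        """Hash the message and apply the private key on the card."""
        h = int.from_bytes(hashlib.sha256(message).digest(), "big")
        return pow(h, self.__d, self.n)

    def decrypt_session_key(self, wrapped_key: int) -> int:
        """Recover a session key that was encrypted with the public key."""
        return pow(wrapped_key, self.__d, self.n)


# Host side: the PC only ever sees the public key and the results.
card = ToySmartCard()
n, e = card.public_key()

session_key = 0x0123456789ABCDEF          # toy session key
wrapped = pow(session_key, e, n)          # sender encrypts it with the public key
recovered = card.decrypt_session_key(wrapped)

signature = card.sign(b"pay Ramesh 100 rupees")
expected = int.from_bytes(hashlib.sha256(b"pay Ramesh 100 rupees").digest(), "big")
signature_ok = pow(signature, e, n) == expected
```

The host only ever sees the public key and the results of the two operations, so the private key never has to leave the card.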
There is a possibility of compromise if the private key ever left the card. Okay, so these are some of the very basic features of public key cryptography. It's easy to generate the encryption key and the decryption key. It's easy to compute both the encryption function and the decryption function, but there are certain challenges over here. It's computationally infeasible to find D, the decryption key, given the corresponding encryption key. And it's computationally infeasible to find the plaintext given just the ciphertext and the encryption key without knowing the decryption key. The other pillar of e-security is the so-called cryptographic hash. And its purpose is to produce something called a fingerprint or a digest of a message. Now, the message could also be a document. So you take the document and you reduce it to a very small size. What is the size of a typical cryptographic hash? How many bits? 128 in MD5, 160 in SHA-1, and SHA-1 and MD5 are now somewhat insecure. So there are newer kinds of cryptographic hashes. One example is SHA-256, which is 256 bits. So just like your fingerprint is such a small thing taken from a normal sized human being, it is the same thing here: you can take an arbitrary size message, which could be one kilobyte, one megabyte, whatever, and you can reduce it to a fingerprint, which is your cryptographic hash. Knowing the fingerprint, you can't go in the reverse direction and try to deduce what the original message is. That problem is intractable. So these properties are summarized over here, starting with the one-way property. So the cryptographic hash is different from the regular hash that you might have studied in your database courses and algorithms courses. This is a different kind of animal because of these properties. Given an X, it's easy to compute H of X, but given a Y, it's computationally infeasible to go in the reverse direction and find any X that maps to that Y.
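The digest sizes quoted above can be checked with a few lines. This is a small sketch using Python's standard `hashlib` module; the sample messages are made up for the illustration, and on a system where MD5 is disabled by policy the `md5` entry would have to be dropped.

```python
import hashlib

message = b"An arbitrary-size message, for example a whole document."

# Digest length in bits for each hash function named in the lecture.
digest_bits = {
    name: hashlib.new(name, message).digest_size * 8
    for name in ("md5", "sha1", "sha256")
}
print(digest_bits)  # {'md5': 128, 'sha1': 160, 'sha256': 256}

# The fingerprint has the same size whether the input is one byte
# or one megabyte.
short_fp = hashlib.sha256(b"x").hexdigest()
long_fp = hashlib.sha256(b"x" * 1_000_000).hexdigest()
same_length = len(short_fp) == len(long_fp) == 64  # 64 hex digits = 256 bits
```

Computing the fingerprint is the easy direction. Nothing in the module goes the other way, from a digest back to a message, which is the one-way property.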
Now you can think about why these properties are necessary. There's a property called weak collision resistance. Given an X, it's computationally infeasible to find a different Z such that H of X is equal to H of Z. So given one message, even though there are an infinite number of messages that map to the same hash value, it's virtually impossible to find any other one of them. Forget that, even an easier challenge is infeasible for you, which is called strong collision resistance: find any two messages X and Z so that H of X is equal to H of Z. I'm not saying take my X and find a Z. I'm saying you just choose any two messages you want and prove to me that you can find two distinct messages so that they both map to the same hash value. Even that problem is computationally infeasible in the case of cryptographic hashes. Now these properties are used to actually construct secure digital signatures and secure MACs and so on. So to guarantee the integrity of a message, you create a MAC, which is a message authentication code. What is the definition of a message authentication code? It's the cryptographic hash of the message concatenated with a secret key, the secret that the two parties share with each other. So this guarantees you integrity as well as authentication. So that's why it's referred to as a message authentication code. And then if you want, in addition, the property of non-repudiation, you would go in for a digital signature. So very briefly, I'm sure many of you know this. What is the definition of a digital signature? How do I construct a digital signature of a message X? So all of these things are supported, by the way, in web services using different standards. So we have students over here who actually will sign messages and so on.
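Going back to the message authentication code for a moment, the definition given above can be written down almost literally. The sketch below shows that form, a hash over the message concatenated with the shared secret, next to HMAC from Python's standard `hmac` module, which is the keyed-hash construction normally used in practice. The function names `simple_mac`, `hmac_tag` and `verify`, and the choice of SHA-256, are assumptions for the illustration.

```python
import hashlib
import hmac


def simple_mac(message: bytes, secret: bytes) -> bytes:
    """MAC in the form described in the lecture: H(message || secret)."""
    return hashlib.sha256(message + secret).digest()


def hmac_tag(message: bytes, secret: bytes) -> bytes:
    """Same idea using the standard HMAC construction."""
    return hmac.new(secret, message, hashlib.sha256).digest()


def verify(message: bytes, secret: bytes, tag: bytes) -> bool:
    """The receiver recomputes the tag with the shared secret and compares."""
    return hmac.compare_digest(hmac_tag(message, secret), tag)


shared_secret = b"secret known only to sender and receiver"
tag = hmac_tag(b"transfer 100 rupees", shared_secret)

accepted = verify(b"transfer 100 rupees", shared_secret, tag)   # untouched message
tampered = verify(b"transfer 900 rupees", shared_secret, tag)   # bits changed on the way
outsider = verify(b"transfer 100 rupees", b"wrong secret", tag) # party without the secret
```

Because both parties hold the same secret, either of them could have produced the tag. That is why a MAC gives integrity and authentication but not non-repudiation.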
They have APIs to sign different parts of a message and so on. So you have the XML Encryption standard, the XML Digital Signature standard, and so on and so forth. WS-Security and so on, we're just going to come to that very soon. But first we should know what a digital signature is. Are you sure it's a MAC or something else? Just take the cryptographic hash of the message, right? So you're given an X, just take H of X, the cryptographic hash, and then encrypt it with the sender's private key. So that's the definition of a digital signature. Is it clear to everyone? So take the message X, create a fingerprint of that message, H of X, using one of those hash functions like MD5 or SHA-1, and then take that quantity, which is a 128-bit or 160-bit quantity, and encrypt it with the signer's private key. Now, why does it become a signature? Because it satisfies some of the properties that a normal manual signature satisfies, or presumably satisfies. There are people who forge manual signatures; nevertheless, it should be authentic. So when I see your signature, say for example you sign a check and you go to the bank, what does that guy do? He checks with something in his database to see whether it is authentic. Is this really Ramesh's signature or somebody else's? So is it authentic? It should not be forgeable. So there's a subtle difference between these two. Even though it looks like Ramesh's signature, it doesn't mean it was signed by Ramesh. It could have been signed by some great artist who can forge everybody else's signature. So it should not be forgeable. It should not be possible to repudiate a signature. So I sign this sale deed, say with a manual signature, and tomorrow I go to the customer and say, hey, I never actually signed it. You should not be able to do that. It should not be reusable. There's a signature, I take her signature on one check and I take a checkbook and I plant it on some other check over there.
It should not be reusable. So all of these are properties that should hold true of manual signatures, and one should check and see whether they hold true also for digital signatures. And then once this document is signed, you should not be able to alter it. So sometimes in sale deeds you alter certain things, but then you sign next to the alteration. Now what is the corresponding thing in the case of digital signatures? Are all these properties met? So do a check and see whether all these properties are met or not. Is it authentic? Can you forge it? So let's answer some of those basic questions. Why can't you forge somebody else's digital signature? What's the quick answer to that? Exactly. No, public or private? You don't have that person's private key. So only he can generate a signature. So if there's a document M, he and only he can generate a digital signature on that document M. Nobody else can because it uses the private key, and I don't know and I cannot find out that other person's private key. It's just too securely stored. It's not reusable, et cetera, et cetera. So there are many mathematical properties that we cannot go through because of lack of time. I'm not putting down any equations, but otherwise this whole thing is very mathematical. And then from there to this idea of certificates. So very briefly, what are certificates? What is PKI? What are certification authorities? So in the case of web services, this whole thing will actually be generalized to STSs. What does STS stand for? Security token service. So a security token service is a generalization of the idea of a PKI, a public key infrastructure. So let us see what the problems are over here. As we've just mentioned before, you need to have these keys generated. So somebody has to generate the public key and the private key for you. Turns out that you can generate it yourself using some standard Java APIs.
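The definition of a digital signature used in this lecture, hash the message and then apply the signer's private key to the hash, can be tried out with a toy example. This is a sketch under loud assumptions: it uses textbook RSA with no padding, SHA-256 as the hash, and two well-known Mersenne primes in place of randomly generated ones, so it is insecure and only shows the mechanics. The names `fingerprint`, `sign` and `verify` are made up for the illustration.

```python
import hashlib

# Toy key pair. ASSUMPTION: fixed Mersenne primes, illustration only.
P, Q = 2**127 - 1, 2**521 - 1
N = P * Q                              # public modulus
E = 65537                              # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))      # private exponent (Python 3.8+)


def fingerprint(message: bytes) -> int:
    """H(X): the cryptographic hash of the message, as an integer."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(message: bytes) -> int:
    """Signature = H(X) transformed with the signer's private key."""
    return pow(fingerprint(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone with the public key (N, E) can check the signature."""
    return pow(signature, E, N) == fingerprint(message)


deed = b"I owe this person 10 crores"
signature = sign(deed)

authentic = verify(deed, signature)                           # signer's document
altered = verify(b"I owe this person 20 crores", signature)   # document changed
reused = verify(b"some other check", signature)               # signature replanted
```

The last two checks correspond to the properties listed above. Since the signature is computed from the hash of the document, it does not verify on an altered document and cannot be lifted onto a different one.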
If you just go to the Sun website, you'll find many APIs to generate public keys and private keys. Okay, that's one thing. The second thing is, if I want to send you some information and I want it to be confidential, then I need your public key. Now here is the problem. So I want to send out this information and it should be secret. So I need her public key. So I ask her, what is your public key? Now somebody intercepts this conversation and gives me their public key. I don't know that it has been intercepted by somebody else. So I take that person's public key and I encrypt the message, and that person can then decrypt it, correct? So the point was, how do I know that this public key really belongs to her and nobody else? I must have some trust, I must have some assurance that it is hers and nobody else's. So how is that obtained? That is obtained using a digital certificate. So somebody has to create these digital certificates, and the parties that create them are called certification authorities and registration authorities. So there are certain certification authorities in India, for example: IDRBT in Hyderabad is one, TCS is another, and so on and so forth. So there are quite a few of these who can issue you digital certificates for a variety of purposes, including financial transactions, et cetera. Maybe even your organization issues you digital certificates for applications within the organization, to send secure mail, for example, to people within the organization, et cetera, et cetera. So the PKI is involved with all of those things, with creating digital certificates. So once again, basically what is a digital certificate? In a nutshell, in one sentence, you can take this home and think about it. What is a digital certificate? So we've seen that a certification authority creates it. How does he create it? What is inside it? What does he have to do? Do you pay him money for doing nothing?
What is inside the digital certificate? It's a digest. What does this word digest mean? Fingerprint, hash. Is that it? Notice the words certificate, certification authority. There is something about it that he and only he can do. So signature, signing what? So he's certifying, he's signing, but signing what? Just the public key? So I just take it to him and say, my public key is this. And he issues me a document saying, public key is this, signed. Yes, some sort of identity is involved in this. So it's basically a binding, a mapping between the person's identity and the person's public key. So if your name is Ramesh Patel, for example, then this certificate says that Ramesh Patel's public key is this huge fancy number. And how do I know to trust it? It's just like when I wanted to send her a message. I needed her public key. So she sends me her certificate. Now I have to do certain things when I receive the certificate. It's just like in SSL between the client and the server. I have to validate that certificate. Is it a true authentic certificate? So actually the certification authority has to sign it. And I have to verify the signature on that certificate. So at the very least, in a nutshell, the certificate says, this is Ramesh Patel's certificate. His public key is this. And it has been signed by this particular CA, certification authority. I will verify the CA's signature on the certificate to make sure that it is actually valid. And also there is something called a validity period. Okay, this certificate is valid between the 1st of January 2008 and the 31st of December 2008. So I will check the validity period. It's called a digital certificate. There are different classes of this. So there's class one, class two, class three, and so on. So the basic thing is you just go to VeriSign or something and you send them email stating, my name is Bill Gates. And this is my public key. And you pay them $10 or something and they'll give you a certificate.
They won't check whose name it is and so on. They'll just think that you are another Bill Gates living in India, correct? And they'll give you a certificate saying so. So that is an almost worthless sort of certificate. Maybe if you've got some friends or something, you can use that certificate to send them email. Now, they might want to do some more verification, to verify that indeed it is you. So that's a class two certificate. So then they will look at more things. That's why I said the minimal certificate is just some identity, which could be your name. But I might want to have a broader identity, which means your profession, your place of work, your organization, your email address, et cetera, et cetera. So this word identity has become a very complicated term these days. So I want to include more and more of your identity, possibly also your photograph, inside it. So there are many, many options in these digital certificates, which follow a certain standard called X.509. So I'll just go through these slides very quickly. PKI functions include things like key generation and distribution, certificate issuance and distribution, certificate validation, and one particularly tricky problem is certificate revocation. So what happens now is, as he said before, you actually lose your private key. Somebody has compromised your hard disk. You are not very careful. This is human nature. Not all of us are very, very careful. So we lose the private key. Now what happens? What is the problem with losing the private key? For example, I've lost my private key. Why am I so concerned about it? Do you think that anybody can send? Not only send, somebody can sign for me. Say the document reads that I owe this person over there 10 crores. I never signed it, but you, with my private key, can create a fictitious document, sign it on my behalf and take me to court the next day. Correct? What is there to protect me? So what should I do if I've lost my private key?
Inform the CA about that particular certificate corresponding to the private key. Mind you, the private key is not inside the certificate. The corresponding public key is inside the certificate. So then the CA knows, and the CA informs everybody that this certificate from now on is invalid. In other words, even though the nominal validity period of this certificate has been January the 1st this year to December the 31st, nevertheless from today onwards, February, whatever, 22nd, it is now invalid. It has been effectively revoked. So this is the problem of certificate revocation. Now how does this work? First and foremost, you have to inform the CA. Let's suppose you've done your duty and you've informed the CA. Now the CA has to inform everybody else, so that somebody doesn't sign for you and it gets verified somewhere else, in a bank or something. So the CA has to issue what are called certificate revocation lists. Now that itself is a big logistical problem. How do I disseminate these certificate revocation lists in a timely fashion? Okay, very good point. Put it on a website. Now let me just tell you the circular reasoning with that. When I first encountered this business of certificates and public keys, I had the very same question. Why are you making a big fuss about certificates and so on? Yeah, I want to find out what your public key is. Your name is Ramesh, whatever, a full name. Put it on a website and say, Ramesh whatever, public key is this. Why do I even need a certificate? I need your public key in the first place for a variety of things, including verifying your signature. So I absolutely need your public key. So put the public key on the website. Why have a certificate? The reason I have a certificate, what is the main reason? What is the most compelling reason? Not only identity, what is the most important reason? I want to go offline, right? When I'm verifying your signature, you just send me a certificate.
I don't have to go to a trusted third party or somebody else. It's just between the sender and the receiver. So that was the purpose of having certificates: to go offline. I don't need another party. Otherwise I have to go to this website and see your public key and so on. So that was the purpose of introducing certificates in the first place. And now you're saying, to find out if it's revoked or not, go to a website. Hey, in that case I might as well have gone there to just see your public key, right? So you see the circular argument. So can we think of a solution that is offline? A certificate revocation solution that is offline and not online. Because otherwise it defeats the purpose of going to certificates in the first place. So something to think about. Very difficult problem. Lots of interesting solutions have come up. Okay, I'll just skip some of this because we are running out of time. This is the X.509 certificate. It includes the owner's name, the owner's public key, the validity period, so issue and expiry dates, the certification authority's name, the kind of signature that the CA uses, et cetera, et cetera. Additional information it could include is, for example, whether the certificate is to be used for financial purposes or professional purposes or whatever it is. It might include your photograph and other things inside, your address and other further details of your identity.
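The checks a receiver makes on a certificate, as described in this part of the lecture, can be sketched in a few lines: verify the CA's signature over the binding of identity and public key, check the validity period, and check the revocation list. This is a toy model, not the X.509 format. The `Certificate` fields, the `serial` number used to identify revoked certificates, the issuer name "Toy CA", and the textbook RSA key built from two Mersenne primes are all assumptions made for the illustration.

```python
import hashlib
from dataclasses import dataclass, replace
from datetime import date

# Toy CA key pair. ASSUMPTION: fixed Mersenne primes, insecure, illustration only.
_P, _Q = 2**127 - 1, 2**521 - 1
CA_N, CA_E = _P * _Q, 65537                      # CA public key
_CA_D = pow(CA_E, -1, (_P - 1) * (_Q - 1))       # CA private key (Python 3.8+)


@dataclass(frozen=True)
class Certificate:
    serial: int               # hypothetical identifier used by the revocation list
    owner: str                # identity being certified
    owner_public_key: int     # the key bound to that identity
    issuer: str               # name of the certification authority
    not_before: date          # start of the validity period
    not_after: date           # end of the validity period
    signature: int = 0        # CA's signature over all the fields above


def _digest(cert: Certificate) -> int:
    """Hash of every certified field (everything except the signature)."""
    body = "|".join(str(v) for v in (
        cert.serial, cert.owner, cert.owner_public_key,
        cert.issuer, cert.not_before, cert.not_after))
    return int.from_bytes(hashlib.sha256(body.encode()).digest(), "big")


def issue(serial, owner, owner_public_key, not_before, not_after) -> Certificate:
    """What the CA does: sign the identity-to-public-key binding."""
    unsigned = Certificate(serial, owner, owner_public_key, "Toy CA",
                           not_before, not_after)
    return replace(unsigned, signature=pow(_digest(unsigned), _CA_D, CA_N))


def validate(cert: Certificate, today: date, revoked_serials: set) -> bool:
    """What the receiver does before trusting the public key inside."""
    signature_ok = pow(cert.signature, CA_E, CA_N) == _digest(cert)
    in_period = cert.not_before <= today <= cert.not_after
    not_revoked = cert.serial not in revoked_serials
    return signature_ok and in_period and not_revoked


cert = issue(1001, "Ramesh Patel", 0xC0FFEE, date(2008, 1, 1), date(2008, 12, 31))

ok = validate(cert, date(2008, 2, 22), set())
expired = validate(cert, date(2009, 1, 1), set())
revoked = validate(cert, date(2008, 2, 22), {1001})
forged = validate(replace(cert, owner="Bill Gates"), date(2008, 2, 22), set())
```

The first two checks need only the certificate and the CA's public key, so they work offline. The third needs an up-to-date revocation list, which is exactly where the circular argument about going back online comes in.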