 So, in web-based systems, the certificate is issued to a web server. Certificates are used in other systems as well, so it's not just limited to web-based security and HTTPS, but we're focusing on that. So for a web server, the identity is the domain name or the ID of the web server, the domain name. The public key is usually an RSA public key, RSA is the algorithm used for public key encryption, but there are others possible. The timestamp is some date and time, and it's not just a single value, usually it's two values, the start time and the end time. So this equation or statement here is just a simplification of the main pieces of information in a certificate. But in web servers, there's a standard X509 which gives the exact format of the certificate. So it contains some more information as we saw in the example yesterday. And the signature signed by some certificate authority. The data is signed. But let's step back and think, well, why do we need a certificate again? Just to make it clear, why not just have the public key? And let's look at an attack, a very simple attack is if the web server simply sent the public key instead of the certificate to the browser. So the idea is that the browser needs to get the public key of the server so that it can encrypt a secret so that we can exchange a secret key. So that's our aim. Browser gets the public key of the server. And we're saying we use a certificate to do that, but we'll go back and say, well, what if we didn't use a certificate and simply sent the public key? That is, we have our server and it has a key pair, PRS and PUS, that's an R, and we'll have our browser. And the aim is to get the public key of the server to the browser. Because we need that to encrypt some information, in particular a secret value so that we can then use symmetric key encryption. So let's try a simple case where we just send the public key from the server to the browser. 
So the simplest case, without using certificates, could work like this. For example, with SSL, we could use a protocol before or after the browser and server say hello to each other, then the server sends its public key to the browser. And the browser will then use the public key to encrypt some information and send it back to the server. For example, encrypt a session key or a secret key. Let's see an attack on this and show why it's not the right way to do it. And the attack involves, a simple attack involves a man in the middle or a woman in the middle. So let's say now we have, we still have the browser. We have the server, which has its key pair, but we have an attacker in the middle. And the attacker can intercept messages because this communication is across a network, the internet. The attacker, let's assume that they can get the message before it gets to the browser and modify messages. Maybe I should just jump back one more step. Sorry, we'll continue that one. Let's delete all that. I'll write it again in a moment. What happens after the browser receives the public key of the server? The idea is that we send back some encrypted value. The browser generates a secret key, K, shared between the browser and server, KBS, and sends it back. But of course, we cannot send a secret key across the internet unencrypted. We encrypt it with what? We cannot encrypt using symmetric key cryptography because that's our problem, that we don't have a shared secret key yet. So we encrypt it with a public key of the server using asymmetric key cryptography or public key cryptography. So this was the goal. Server has a public key. It sends to the browser. When the browser gets that, it now generates a secret key. For example, a 256-bit key for AES, KBS, encrypts that using a public key of the server and sends that back to the server. Only the server can decrypt. So the server learns KBS and now the data transfer between the browser and server is encrypted using KBS. 
The subsequent messages then we encrypt any data using KBS in either direction. And importantly, the difference what's happening here, this is symmetric key cryptography. This is public key. For example, RSA, the public key algorithm and symmetric key algorithm could be AES. Why do we use symmetric key for encrypting data? It's generally much, much faster. So when we're sending packets across the network, then encrypting with AES or a similar symmetric key algorithm is much faster when we have data. So we'd like to use that. But the problem is that the browser and the server don't have a shared secret. So the first two steps are to establish that shared secret KBS. So this is what we want to do. But we want to see why it doesn't work or what attack is possible on this. Any questions on this behavior first? So when we receive this, the server also knows KBS. And they can encrypt data with KBS and only the browser can decrypt. And similar, the browser can encrypt data with KBS and only the server can decrypt. Any questions before we move on? May I just browse through your lecture notes and see what I included? So that's the normal, well, that's the intended operation. Now let's look at an attack on that, specifically a man in the middle attack. Same scenario. If we use this protocol, but there's another entity involved, the attacker. The server sends its public key to the browser. But the attacker is going to intercept that. So before it gets to the browser, the attacker gets that public key. And what can they do? Does anyone have an idea of how the attacker may attack this protocol? What will you do as a malicious user? Change the public key to whose? We could change it to the attacker's public key. So what we want to do as the attacker now is to make the browser and server think that they have a shared secret key, KBS, that only they know, but we also know as the attacker. And then any data sent between the two could be also decrypted by the attacker. 
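The intended exchange can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the RSA here is textbook RSA with tiny fixed primes, the `xor_stream` function is a made-up stand-in for AES, and the key is 128 bits rather than 256 so that it fits the toy modulus. None of the names come from a real library.

```python
import hashlib
import secrets

# Toy RSA key pair for the server (PU_S, PR_S). The primes are far too
# small for real use; they only keep the arithmetic easy to follow.
P, Q, E = 2**61 - 1, 2**89 - 1, 65537
N = P * Q
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent
PU_S, PR_S = (E, N), (D, N)

def rsa_apply(key, value):
    """Textbook RSA: raise value to the key's exponent modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

def xor_stream(key, data):
    """Stand-in for AES (assumption): XOR data with a SHA-256 keystream."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        block = key.to_bytes(16, "big") + counter.to_bytes(4, "big")
        stream += hashlib.sha256(block).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

# Step 1: server -> browser: PU_S, sent in the clear.
# Step 2: browser picks K_BS and sends it encrypted with PU_S.
k_bs = secrets.randbits(128)
wrapped = rsa_apply(PU_S, k_bs)

# The server recovers K_BS with its private key.
k_server = rsa_apply(PR_S, wrapped)

# Step 3: data in either direction is encrypted with K_BS.
ciphertext = xor_stream(k_server, b"GET /grades")
plaintext = xor_stream(k_bs, ciphertext)
```

The slow public key operation is used once, for the key, and everything after that uses the fast symmetric operation.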
And the way to do it is that the attacker intercepts this public key, which is sent in the clear, and changes it to another public key, the public key of the attacker, PUA. But the browser doesn't know that. It's just received a message thinking it's from the server: here is the public key of the server. So the browser at this stage has no way to know that this is not the public key of the server but the public key of the attacker. Because it's just a key, it's got no way to know whose key that is. So the browser thinks PUA is the public key of the website it's trying to access. And now it generates a key that it's going to share between the browser and the server, say some random 256-bit key for AES, encrypts it and sends it back to the server. But again it's going to be intercepted. In the normal operation, we encrypt the key using what value? The public key of the web server. Well, what we think is the public key of the web server, the one that we just received. The browser received PUA and thinks it's for the website it's trying to access, so it encrypts KBS with PUA. What does the attacker do? Can they decrypt? Yes. It's encrypted with the public key of A, and the attacker knows the private key of A, so they can decrypt and they learn KBS. So now they know that value. And they send a message on to the server. What do they send to the server? What we'd like as the attacker is not just to learn KBS but to make the browser and server think that they have the correct key. So the attacker encrypts the key KBS using the public key of the server, PUS, and sends that on. The server has in the past sent its public key to the browser and then later receives a response from the browser, or what it thinks is the browser but is actually via the attacker, and it's the expected response: some key encrypted with the public key of the server. So the server decrypts it with its private key and learns KBS, and the server doesn't recognize that anything's gone wrong.
It sent a public key, it received an encrypted key KBS. Similar the browser doesn't recognize anything's gone wrong, it received a public key, it generated a secret key and encrypted that with the received public key. So now because the browser and server haven't identified anything going wrong, they now use KBS to encrypt their data. For example the server encrypts using symmetric key cryptography, some data using KBS. They send it to the browser but the attacker again intercepts. Can the attacker decrypt, yes the attacker can decrypt because they also know KBS and they learn the data and send it on, they can send it on unmodified, the same data or they could even modify the data because they have the key and the browser receives this and decrypts because the browser knows KBS, they can decrypt and they also get the data thinking nothing's gone wrong. But the data that was encrypted and sent from the server to the browser has also been learned by the attacker. So this is what's called a man in the middle attack. The attacker has intercepted the messages for the really the secret key exchange and allowing them to now anything sent between browser and server encrypted can be decrypted by the attacker. And the problem is the distribution of the public key in the first step. Any questions on that man in the middle attack? Easy, so far? Understand good. So this is why we don't just send the public key. The problem in this case was that when the browser received this public key, the modified public key, it had no way to know that this public key was not that of the server. It just trusted the public key it received. So it went wrong here. So the role of digital certificates is that we don't just send the public key from the server to the browser. We send a signed public key. That's what a certificate is, a signed public key. Signed by someone that the browser already trusts. So what would happen? And we'll see it that we send the certificate from server to browser. 
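The man in the middle attack just described can be traced step by step in code. Again this is a sketch under assumptions: textbook RSA with small fixed primes, a 128-bit toy key, and helper names (`make_keypair`, `rsa_apply`) invented for the illustration.

```python
import secrets

def make_keypair(p, q, e=65537):
    """Toy RSA key pair from two (far too small) primes."""
    n = p * q
    d = pow(e, -1, (p - 1) * (q - 1))
    return (e, n), (d, n)

def rsa_apply(key, value):
    """Textbook RSA: raise value to the key's exponent modulo n."""
    exponent, modulus = key
    return pow(value, exponent, modulus)

PU_S, PR_S = make_keypair(2**61 - 1, 2**89 - 1)     # server
PU_A, PR_A = make_keypair(2**107 - 1, 2**127 - 1)   # attacker

# 1. The server sends PU_S; the attacker replaces it with PU_A.
received_by_browser = PU_A

# 2. The browser encrypts K_BS with whatever key it received.
k_bs = secrets.randbits(128)
to_server = rsa_apply(received_by_browser, k_bs)

# 3. The attacker decrypts with PR_A, learns K_BS,
#    and re-encrypts it with the real PU_S.
k_attacker = rsa_apply(PR_A, to_server)
forwarded = rsa_apply(PU_S, k_attacker)

# 4. The server decrypts as usual and notices nothing.
k_server = rsa_apply(PR_S, forwarded)
```

At the end all three parties hold the same key, which is exactly the outcome the attacker wanted.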
If the attacker tries to modify the certificate, the browser will detect that. That's what we want to happen. If it's not modified, then the browser can trust that the public key is the correct one. So let's see that and see what happens when we use a certificate instead of just the public key. We send the certificate of the server. What's in a certificate? What is CS? What's in the certificate of the server? What's the important thing in there? What's the one most important thing? Public key of the server. So we're actually sending the public key of the server in the certificate, but we don't just send it as is, we send it signed by someone else. So the purpose is to get the public key of the server to the browser, so we send the certificate, but remember the certificate is signed by an authority. I will not write the full equation because we have it here. This is CS. It includes the identity of the server. It's public key. That's what we want to get to the browser. Some timestamp to give some indication of how long this is valid for. And all signed by some authority, the certificate authority. The attacker intercepts the certificate, and let's say they modify it. What can they modify it to? They want to get a public key to the browser that's different than that of the server. Let's see. So the certificate and we'll write it in full. Remember it contains the ID of the server concatenated now. It originally had the public key of the server, but as the attacker, let's change it to the public key of the attacker. There's a timestamp, and then it's all signed, it's going to be long, we'll fill in the missing parts. So this is the full certificate. The original certificate had the public key of the server here, and it was signed using the private key of the authority, and it had the public key of the server here. What can the attacker do? It wants to change the public key of the server to PUA, so it changes that here. That's possible. 
What does it do for the other parts, the missing parts? What can they try? Well, there's different approaches. They could not modify this part. Okay? Just leave it as is. The signature part. Leave it the same as before. What was it before? When we received it, CS, it was private key of the CA, and this was PUS. So in this case, the attack was simply modify the public key here. That's all the attacker did. The signature portion, they didn't encrypt again. They cannot do that. All they did is they copied and pasted the signature from CS and attach it to this fake CS, the one with PUA. The browser receives this certificate from supposedly the server. What does the browser do? It decrypts what? Okay. It checks. It tries to verify the signature. So the steps of verification, two steps, decrypt the signature part with what key? The public key of the CA. So this assumes that we know the public key of CA. We decrypt as the browser this portion. What do we get? We get the hash of the ID of S, PUS and T, and we compare that to the hash of the rest, the hash of IDS, PUA and T. Not the same input to the hash, therefore the hash values will not be the same. And that indicates to the browser that something's gone wrong. Don't trust it. So if the attacker just changes the public key, then the browser will detect that and ignore or present an error to the user saying something's wrong with a certificate that you received. So that attack doesn't work because the browser detects it. Changing the public key on its own didn't work. Can the attacker try something else? Just going back. We want to change the public key. That's our aim as the attacker. Can we try and change it here? What can we try? Well if we do, we know that we need to change, if we change the public key, we need to change the contents of the hash. We need to recalculate the hash. So let's change it to PUA. But if we change that and calculate the hash of IDS and PUA and T, we need to encrypt that with some key. 
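The first attempt, swapping the public key but keeping the old signature, can be sketched as follows. The assumptions are the same as before: textbook RSA with small fixed primes, SHA-256 as the hash, and a hypothetical domain name and validity dates. The certificate is just a tuple (ID, public key, T, signature), not real X.509.

```python
import hashlib

def make_keypair(p, q, e=65537):
    """Toy RSA key pair from two (far too small) primes."""
    n = p * q
    return (e, n), (pow(e, -1, (p - 1) * (q - 1)), n)

def digest(identity, public_key, validity, modulus):
    """H(ID || PU || T), reduced so it fits the toy modulus."""
    data = f"{identity}|{public_key}|{validity}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus

def sign(private_key, identity, public_key, validity):
    d, n = private_key
    return pow(digest(identity, public_key, validity, n), d, n)

def verify(ca_public_key, cert):
    """Decrypt the signature with PU_CA and compare with our own hash."""
    e, n = ca_public_key
    identity, public_key, validity, signature = cert
    return pow(signature, e, n) == digest(identity, public_key, validity, n)

PU_CA, PR_CA = make_keypair(2**89 - 1, 2**127 - 1)   # certificate authority
PU_S = (65537, (2**31 - 1) * (2**61 - 1))            # server public key
PU_A = (65537, (2**13 - 1) * (2**17 - 1))            # attacker public key

validity = ("2015-01-01", "2016-01-01")              # hypothetical dates
genuine = ("www.example.com", PU_S, validity,
           sign(PR_CA, "www.example.com", PU_S, validity))

# The attacker swaps in PU_A and copies the old signature unchanged.
forged = (genuine[0], PU_A, genuine[2], genuine[3])

genuine_ok = verify(PU_CA, genuine)
forged_ok = verify(PU_CA, forged)
```

The copied signature still decrypts to the hash of the original fields, so it no longer matches the hash of the modified ones.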
As the attacker, we need to encrypt it with a private key. The point is that we cannot encrypt it with the private key of the authority. We don't know the private key of the authority. We can try any other private key, call it PRX. Whose key it is doesn't matter; it's some other private key, meaning it will not be the private key of the authority. Now the browser receives and verifies. When the browser receives a certificate, it decrypts with what key? The public key of the authority. So always, to verify the certificate, you decrypt with the public key of the authority. So PUCA would be used to decrypt this, but it was encrypted with PRX, which is not PRCA. Therefore it will not successfully decrypt, and then when we check the hash, they will not match again. We will not get this hash value as output because the wrong keys are used. So the browser will again detect the change. Changing the signature component is not possible for the attacker because the attacker doesn't know the private key of the authority. The attacker cannot re-sign the modified message correctly because they don't have the private key. So that will not work either. The browser will detect that. Anything else? Anything that the attacker can try to defeat this, or from another perspective, under what cases will the attack be successful? What does the attacker need to do? If they know the private key of the authority, they can be successful, but our assumptions are that we don't know the private key of other entities. So the attacker shouldn't know the private key of the CA, otherwise it's not private. So anything else? What about changing the public key of the CA? Let's try. So we said for this to work, if I just go back to the previous one, in this case to verify, what does the browser do? We said it decrypts the signature component using the public key of the CA. I'll just write S for the signature component, that's this component. That's how it verifies.
Importantly it uses the public key of the CA. And that assumes that the browser knows this already. The public key of the CA is known by the browser up front. Now what about the attack? In this attack, the attacker signs the modified certificate using its own private key, PRX, not the private key of the CA because it doesn't know that. For this attack to be successful, we must somehow get the browser to decrypt using the matching public key, PUX. When the browser verifies, they decrypt the signature component using what key? They should use the public key of the authority. If we can, as the attacker, get the browser to think that the public key of the authority is PUX, then this attack will be successful. If we can fool the browser to think, ah, the public key of the authority is PUX, then the browser verifies, decrypts with PUX, compares the hash, finds that they are the same and trusts this modified certificate. So the security of this system depends, or one part of it depends, upon the browser having the correct public key of the authority. If the browser has PUX thinking it's the public key of the authority, the attack is successful. Any questions on that part of using certificates? The browser must have the correct public key of the authority, otherwise the attack will be successful. So that leads to the issue of how does the browser get the correct public key of the authority? Any suggestions? When your browser has some public key of the authority, how does it know it is the correct public key, not a fake public key? We get the same problem. We get the public key of the authority, how can we prove that it's the correct one? Well, we could get it signed by someone else, someone we trust. We trust that other one because we have their public key. But we have the same problem. How do we know that that's the correct public key?
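The second attempt, re-signing with some other private key PRX, can be sketched the same way. The example shows both outcomes: the browser holding the real PUCA rejects the certificate, while a browser fooled into storing PUX as the authority's key accepts it. As before, this is textbook RSA with small fixed primes, and the domain name, dates and helper names are assumptions made for the illustration.

```python
import hashlib

def make_keypair(p, q, e=65537):
    """Toy RSA key pair from two (far too small) primes."""
    n = p * q
    return (e, n), (pow(e, -1, (p - 1) * (q - 1)), n)

def digest(fields, modulus):
    """Hash of the certificate fields, reduced to fit the toy modulus."""
    data = repr(fields).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus

def sign(private_key, fields):
    d, n = private_key
    return pow(digest(fields, n), d, n)

def verify(trusted_ca_key, fields, signature):
    """The browser decrypts with whatever key it believes is PU_CA."""
    e, n = trusted_ca_key
    return pow(signature, e, n) == digest(fields, n)

PU_CA, PR_CA = make_keypair(2**89 - 1, 2**127 - 1)   # real authority
PU_X, PR_X = make_keypair(2**61 - 1, 2**107 - 1)     # attacker's signing key
PU_A = (65537, (2**13 - 1) * (2**17 - 1))            # key the attacker wants accepted

# Modified certificate body: ID_S, PU_A, T (hypothetical values).
fields = ("www.example.com", PU_A, ("2015-01-01", "2016-01-01"))
forged_signature = sign(PR_X, fields)

rejected = verify(PU_CA, fields, forged_signature)   # browser has the real PU_CA
accepted = verify(PU_X, fields, forged_signature)    # browser fooled into trusting PU_X
```

The only thing that differs between the two calls is which key the browser believes belongs to the authority.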
So at some point we must, at the browser level, trust a public key, implicitly trust it without any proof, and just assume that the public key is the correct one. In practice how that's done is that the browser makers, the people who make Firefox, Internet Explorer, and so on and release it, load into the browser some pre-trusted public keys of authorities. So it's effectively hard-coded into the browser software. These are the public keys of authorities that this browser trusts. So we need that. And they're called authority keys, as we'll see in the browser. So back to our slides. We said that the browser verifies the signatures using the public key of the authority, and this assumes that the browser already knows and trusts PUCA. If it doesn't, then we have a problem. And typically we'll see that the public key of the authority, in some cases it's hierarchical, it's signed by someone else, but at the top of the hierarchy it's signed by the entity itself, so it's a self-signed certificate. That is, in this case, the public key of the CA is signed with the private key of the CA. This is a self-signed certificate. And that's needed for digital certificates to work with web browsing, but it presents a hole, or a potential weakness, in the system. Any questions before we return to our real web browser and look at some of these keys? What we've gone through is why we use certificates rather than just sending public keys. If we just send public keys, a man in the middle attack is very easy. If we use certificates, the public key we receive is verified. It's signed by someone we trust. Let's look at the certificates in a browser. Did everyone look last night at the certificates in their browser? Have a look. With a five-day weekend, there's plenty of time to have a break from playing and squirting people with water and have a look at your browser certificates. Let's look at some of them. It's available in most web browsers, maybe just with a different interface.
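The hierarchy with a self-signed certificate at the top can be sketched as a small chain check. This is only an illustration: textbook RSA with small fixed primes, a made-up `Cert` record with no timestamp, and hypothetical authority names. The point it tries to show is that the root's own signature proves nothing; the root is trusted only because its key is preloaded.

```python
import hashlib
from dataclasses import dataclass

def make_keypair(p, q, e=65537):
    """Toy RSA key pair from two (far too small) primes."""
    n = p * q
    return (e, n), (pow(e, -1, (p - 1) * (q - 1)), n)

@dataclass
class Cert:
    subject: str
    public_key: tuple
    issuer: str
    signature: int = 0

def digest(cert, modulus):
    data = f"{cert.subject}|{cert.public_key}|{cert.issuer}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % modulus

def issue(subject, public_key, issuer, issuer_private_key):
    cert = Cert(subject, public_key, issuer)
    d, n = issuer_private_key
    cert.signature = pow(digest(cert, n), d, n)
    return cert

def signed_by(cert, issuer_public_key):
    e, n = issuer_public_key
    return pow(cert.signature, e, n) == digest(cert, n)

def chain_is_trusted(chain, preloaded):
    """chain runs from the server certificate up to the root certificate."""
    for cert, parent in zip(chain, chain[1:]):
        if cert.issuer != parent.subject or not signed_by(cert, parent.public_key):
            return False
    root = chain[-1]
    # Anyone can self-sign, so the root counts only if its key is preloaded.
    return (preloaded.get(root.subject) == root.public_key
            and signed_by(root, root.public_key))

PU_ROOT, PR_ROOT = make_keypair(2**89 - 1, 2**127 - 1)
PU_SUB, PR_SUB = make_keypair(2**61 - 1, 2**107 - 1)
PU_S = (65537, (2**13 - 1) * (2**17 - 1))

root = issue("Example Root CA", PU_ROOT, "Example Root CA", PR_ROOT)  # self-signed
sub = issue("Example Sub CA", PU_SUB, "Example Root CA", PR_ROOT)
server = issue("www.example.com", PU_S, "Example Sub CA", PR_SUB)

with_preloaded_root = chain_is_trusted([server, sub, root], {"Example Root CA": PU_ROOT})
without_preloaded_root = chain_is_trusted([server, sub, root], {})
```

The same chain passes or fails depending only on what the browser had loaded in advance.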
In some web browsers, the certificates are managed by the web browser itself, like Firefox, and in others, like IE and I think Safari, all the certificates are managed by the operating system. Essentially there's a database of certificates. There are different types, but the two main types we will see are the certificates of the authorities, as well as certificates of servers that we have trusted along the way. We'll return to them in a moment. Let's first look at the authorities. What you notice, this is the list of all the certificates. There are many. There's not just one authority certificate. There are many authorities. That's to make the management of certificates practical across the world. The way that it works is that the websites get their server certificates issued by some authority which is local or convenient for them. If we have a quick browse through, you may guess the countries of some of these authorities. I guess Turkey, Germany, different companies. Some are government organizations. Comodo is a large company that issues certificates and security software. Who else do we see in here? The China Internet Network Information Center is an authority in China which just recently apparently issued a fake certificate in the name of Google. Hence some people will no longer trust it, because the authorities must be trusted. The authorities have the role of signing other people's certificates. What the authority does is it signs the certificate for www.facebook.com. But if I go to the authority and say, I own the domain facebook.com, and the authority signs my certificate, then I can be an attacker and intercept traffic to the real Facebook website. So it's up to the authority to check that the person who claims to own this domain actually does own it. And if they don't check accurately, then that can allow attacks. So there are many different authorities, companies, and government organizations.
And in fact, I actually will go to Start.com which we saw yesterday. The authorities may have multiple certificates. That is, there's a hierarchy. So the Start.com company has multiple authority certificates where they sign their own certificates and then use the sub-certificates to sign websites. How do you get a signed certificate? You graduate next year, you go and set up your website for your company and you're going to use HTTPS on your website so no one can intercept all the data. How do you go and get a certificate? What are you going to do? Contact the authority. NIC, what's NIC? THNIC, the Thai Network Information Center. So many countries will have a national, or sometimes called a network, information center which does things like give out domain names, give out IP addresses, and also maybe give out certificates. So what you do as a website owner, you go to an authority and say, I want a certificate for my website, and that authority will ask for some identification or some proof that you own that domain. So they will need some proof that you are the owner of www.mynewwebsite.com and they will check that, and there are different ways to confirm that, and then they will sign the certificate for you. Some authorities will do it for free, some will charge you money, and some will do very simple checks that you own the website. That is, they will check that you can maybe modify a file on your website. Others will do more comprehensive checks and check your identity or passport, check your company documentation and so on to be more confident that you are the correct owner of that website. Now let's look at what happens if we don't have a correct certificate. Close this and let's visit a website. Any suggestions for a website? Maybe one of your favorites, registration. The registration website supports HTTPS, so we visit that to view your grades. Let's hope this works. And we get this message that we saw in the previous lecture.
Untrusted connection. Let's look at some of the details of this. So this is the web browser giving us some warning saying, really, the certificate I just received from the web server reg.sit.tu.ac.th has a problem. What does it say? Let's look at the technical details. reg.sit.tu.ac.th uses an invalid certificate. What's invalid about it? The certificate is not trusted because it is self-signed. The certificate is only valid for seven, so we'll see if we look at the certificate that it's got some name in there, seven. And the certificate expired four years ago, okay? So there are multiple problems with this certificate. Our browser checks it. It received the certificate from the web server, but when it does the verification, things don't pass the verification checks. So that's why it presents this warning. It's self-signed, meaning the public key of registration is signed using the private key of the registration website. They signed it themselves. And what the browser expects, it will only trust certificates that are signed by someone who's in the list of authorities. So if the browser receives a certificate signed by an authority, it doesn't present this warning. It automatically accepts it. But if it receives a certificate signed by someone else who's not an authority, that's when you get this warning. What do you do next? Well, you shouldn't trust it, okay? That is, from a technical point of view, this could be someone doing a man-in-the-middle attack. This could be the attacker, somewhere between my browser and the web server, intercepting the traffic and modifying the certificate. There's no way for me to know that. I cannot detect an attack in this case. If we go and look at the certificate, what we can do, if we do implicitly trust this, we hope that no one's performing an attack. I don't know, maybe there's a student in here who's intercepting my Wi-Fi traffic, and they could be, but let's hope not.
Then we can view the certificate and see, of course, the expiry date. So the timestamp, it begins and lasts just for seven days. And it's issued by some common name, seven. Okay, I don't know why they named it that. The common name for the website should be the domain name. It should be reg.sit.tu.ac.th. But the people who created this certificate gave it some different name. And this is a self-signed certificate. Anyone can create a self-signed certificate. All you do is you create your public key, and some software like OpenSSL will generate the certificate for you. And that's probably what happened here. And if we look, we see it's issued to seven and it's issued by seven, so that's why we call it self-signed. The authority is the same as the subject. Why would SIT use such a certificate? Cheap, it's free, okay? Lazy, okay, that is right. Getting a certificate often requires some payment, that is, you pay an authority to sign a certificate. There are some services provided for free, but they may only have limited services, so that's one reason. They really should have a correct certificate, okay? But as you see, this one's been used for so long, and the hope, I guess, from the SIT computer center is that when people are accessing registration no one is doing a man in the middle attack. But there's no way to prevent that. But even other organizations may have certificates which are invalid, so you may come across them in some cases. And then it's up to the user to decide whether they'll trust it or not, but there's no way to confirm that there's no attack taking place. The other reason why SIT may not have it, and if you visit your other favorite website, ICT, you'll also see it's a self-signed certificate, and that's even worse because it's run by me. The reason is not so much because it's cheap, it's because to get the certificate, you actually cannot do it yourself. You need the owner of the parent domain to do it. That is tu.ac.th.
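The three problems the browser reported (self-signed, wrong name, expired) correspond to three separate checks, which can be sketched like this. The `Cert` record, the hostname `reg.example.org`, the trusted issuer name and all the dates are hypothetical; the lecture only gives the common name "seven", a seven-day validity and an expiry roughly four years in the past. Signature verification itself is left out of this sketch.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Cert:
    subject: str        # common name the certificate was issued to
    issuer: str         # who signed it
    not_before: date    # start of the validity period
    not_after: date     # end of the validity period

def problems(cert, hostname, trusted_issuers, today):
    """Return the list of reasons a browser might warn about this certificate."""
    found = []
    if cert.issuer not in trusted_issuers:
        reason = "self-signed" if cert.issuer == cert.subject else "unknown issuer"
        found.append(f"not trusted: {reason}")
    if cert.subject != hostname:
        found.append(f"only valid for {cert.subject}")
    if not (cert.not_before <= today <= cert.not_after):
        found.append("expired or not yet valid")
    return found

# Made-up dates: valid for seven days, checked about four years later.
cert = Cert("seven", "seven", date(2011, 3, 1), date(2011, 3, 8))
report = problems(cert, "reg.example.org", {"Example Root CA"}, date(2015, 4, 10))
```

Any one of the three entries in `report` would be enough for the browser to show the warning.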
So the Thammasat University Computer Center must get the certificate, and that takes a lot of effort for us to go through them and convince them to get that, and we're in the process of trying to convince Dr. Cormor and others to do that slowly. Going back to now, if we add the exception here, confirm, then of course we get taken to the website, but there's no guarantee there's no attack taking place. Now we view the certificates. The tab for servers shows these ones, especially the self-signed ones that have been accepted recently. So this registration one is temporarily accepted. I didn't confirm it permanently. So that's stored here. So the next time I visit registration, it will not prompt me with that warning for some period of time. Maybe when I close my browser, it will disappear and it'll prompt me again, but if I select the option to permanently accept it, it will be added permanently here and it will never warn me about it again. You want this list to be empty, ideally, but in practice you may have something there. Okay, let's return to what we've summarized, what we know about digital certificates. So we've seen a few examples. It's maybe easy if you look in your browser. You can see all the details that we said. The concept of a certificate is to store the ID, the public key, a timestamp and the signature of that information, but the implementation actually includes a specific format, and that's defined in the X.509 standard. It includes other things as well, like the algorithm, and the notation is slightly different, and you see that in the example in the web browser. We saw that the certificates have an expiry date. So the idea is maybe I get a certificate issued for one year and after one year I need to renew it or get a new one. But if something goes wrong, let's say the private key of my web server is compromised so you should no longer trust the public key, then there are ways to revoke existing certificates.
That is, you can go back to the authority and say, please revoke, please remove my certificate from the system, and there are some complex procedures, but essentially the authority distributes a certificate revocation list, some list of all the certificates which are revoked that people shouldn't trust. That may not work well in some cases because it's difficult to get an updated list, but there are ways to revoke certificates. So in practice, when you create your web server, how do you obtain a certificate? Well, you go to an authority and you prove your identity, and that may be as simple as going to the website of the authority and following some steps, maybe submitting your personal information and maybe doing some operations on your own website to try to prove to that authority that you own that website, really that you own that domain name. Or it may be more comprehensive validation where you submit your company documents and other identification to the authority and they check it to confirm that you're the correct owner. There are free services that do that and there are commercial ones where you pay to get some validation, pay per year, for example, from several US dollars to maybe hundreds of US dollars per year for certificates. How does your browser obtain the certificate of the authority? It's preloaded into the browser. So when you download Firefox, it includes a list of certificates of the authorities in the browser, and the same with the other browsers. So then it's the responsibility of the browser developer to make sure that the authorities which are included are trusted authorities, because of all those authorities which are preloaded into the browser, if one of them is malicious, then they could sign a facebook.com certificate for someone who's not Facebook and allow others to do attacks. So the authorities must be trusted.
And in fact, the browser manufacturers or the browser developers will go through a comprehensive set of steps to make sure that those authorities meet some requirements of being trustworthy, and sometimes they will remove them. But as you see, there are maybe 100 plus authorities listed in my browser. I don't know most of those organizations. So I place my trust in the developer of the browser in that case. And that's a little bit of a problem. And what if the certificate of the authority who signed the web server certificate is not in the browser? As we see, we get this warning pop up saying untrusted connection. What do you want to do about it? And then it's left to the human user to make a decision. And that's when you should say, no, I don't trust it, and not visit the website. But unfortunately, many people automatically accept it and still trust it. But whenever you see that warning, think in the back of your head, is this an attack? And I advise, if you see the warning for a website that you visited before where you haven't seen the warning, that could be a problem. All right, a warning for Facebook, for example, that's probably a very strong indication of an attack. A warning for ICT and registration, well, you trust me. No, you shouldn't, but we have some limitations. So some issues, things that go wrong with certificates in practice. The web server owners must have their identity verified by authorities. And sometimes those checks are not very rigorous. So someone who doesn't own the website may get the certificate signed for them. That's a problem. The authority's private key, remember PRCA, must of course be kept private. And authorities will usually go through steps and have audits to show that their private key is not stored in some online accessible computer, it's stored in some vault somewhere.
That is, it's disconnected from the network so that no one can find and compromise the private key of the authority. Because if someone can compromise the private key of the authority, they can sign fake web server certificates and do attacks. With the preloaded certificates in browsers, there are many of them loaded there, so you place some trust in the browser developer. Another problem is that when you get an invalid certificate, you get a warning message in your browser. Many people will just automatically accept that warning and still trust the website. And that can be a problem because then people may be attacked. The algorithms used in certificates should be strong. And most of them are, but there are some weaker ones that are still in use. So those are some of the problems that can occur with digital certificates. To finish this topic, we know web browsing in the normal mode uses HTTP over TCP. There's no encryption, no authentication. So to provide those security services, what we do is we use SSL or TLS as a layer between HTTP and TCP. We insert this security protocol and that's really what HTTPS is. HTTP messages are sent using SSL, and the SSL messages are then sent using TCP. And to encrypt data between browser and server, we use symmetric key encryption. And therefore the browser and server must have a shared secret key. But to get that shared secret key, we use public key encryption. And therefore the browser must trust the server's public key. And we've shown today that if we just send the public key from server to browser, a man in the middle attack is very easy. Therefore we get the public key of the server signed by a trusted third party, a certificate authority. And that leads to digital certificates. X.509 is the format. They're used in web browsing, but other applications can use certificates as well. Email, for example, or some services where you download software, where that software may be signed using a certificate.
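To make the signing idea concrete, here is a minimal sketch in Python of an authority signing a server's identity and public key, and a browser checking that signature. It is a toy under several assumptions: it uses textbook RSA with two small, well-known primes and no padding, the certificate is a plain dictionary rather than X.509, and the names `sign_certificate`, `verify_certificate`, "PU_S" and "PU_attacker" are made up for the example. Nothing here is secure enough for real use.

```python
import hashlib

# Toy key pair for the certificate authority (textbook RSA, illustration only).
# Both primes are well-known Mersenne primes; real keys are far larger and use padding.
P = 2**61 - 1
Q = 2**31 - 1
N = P * Q
E = 65537                             # public exponent: (E, N) plays the role of PU_CA
D = pow(E, -1, (P - 1) * (Q - 1))     # private exponent: plays the role of PR_CA


def digest(identity: str, public_key: str, valid_until: str) -> int:
    """Hash the certificate fields down to a number smaller than N."""
    data = "|".join([identity, public_key, valid_until]).encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N


def sign_certificate(identity: str, public_key: str, valid_until: str) -> dict:
    """The authority signs the hashed fields with its private exponent D."""
    signature = pow(digest(identity, public_key, valid_until), D, N)
    return {
        "identity": identity,
        "public_key": public_key,
        "valid_until": valid_until,
        "signature": signature,
    }


def verify_certificate(cert: dict) -> bool:
    """The browser checks the signature using only the authority's public values."""
    expected = digest(cert["identity"], cert["public_key"], cert["valid_until"])
    return pow(cert["signature"], E, N) == expected


cert = sign_certificate("www.example.com", "PU_S", "2030-01-01")
# A man in the middle swaps in their own public key but cannot redo the signature.
forged = dict(cert, public_key="PU_attacker")

print("genuine certificate accepted:", verify_certificate(cert))
print("forged certificate accepted:", verify_certificate(forged))
```

The genuine certificate should be accepted and the forged one rejected. The attacker can change the public key, but without the authority's private value `D` they cannot produce a signature that matches the changed fields.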
The use of certificates relies a lot on the trustworthiness of the authorities. If you can't trust them, then attacks are possible. It also relies on the actions of the users, especially that when you see a warning message you don't automatically click accept, and that maybe you don't trust the website. And if certificates are compromised, then man in the middle attacks are possible. What we'll do in the next topic, next week, is look at some other aspects of web security, but more from the perspective of applications. When you create a web application, say using PHP and HTML, where someone logs into a website, what attacks can someone do on your application to get it to do some bad things or unexpected things? So web application security will be the next topic.