Today we turn to transport layer security. We will look at SSL and TLS; I assume everyone has heard those terms. Specifically, we look at the protocol architecture, the handshake protocol, and the record protocol. I will explain the relationship between SSL and TLS a little later. Then we will spend some time on certificates, which get more and more interesting, and on HTTPS: what is the relation between HTTPS and SSL/TLS? We will have a break somewhere in here; I don't know exactly when, we'll see. After the break, at the latest, we will continue with SSH. There is a protocol architecture there as well, but where SSL/TLS has a handshake protocol and a record protocol, SSH has a transport protocol, a user authentication protocol, and a connection protocol. So it is somewhat different.

Let's start with SSL/TLS. SSL stands for Secure Sockets Layer. It has been around for a long time: it started with Netscape around 1995. Netscape was at that time the big internet browser. It has disappeared completely, but it was the company that developed much of the technology for the web.

What does it do? It provides privacy and data integrity. Privacy means that if you send data over the web, it gets encrypted. Data integrity means that you can verify the authenticity of the remote party and that no one can change the data on the way. Netscape used this for browsing the web, so it is web technology. It is used when you go to websites where you have to pay, which people started to do 20 years back, so that you can be sure that the payment goes to the right person and that no one can change your payment. That is the history of it.

There have been a couple of versions; SSL went up to version 3. Originally, around 1995, it had limited strength regarding encryption. Who has an idea how big the key size was around that time? 128 bits? Who has something else in mind? Well, you're right, you're right.
So it's even lower. How low? 60? Too difficult: 40. 40 bits. What basically happened is that they had 128 bits for the US and 40 bits for export. At that time, cryptography was still considered a weapons technology. Buying weapons in the US is something you can do on every corner, but exporting them to other countries was not allowed. So at that time, we only had 40 bits, and the NSA could easily break them. Now we have much bigger key sizes, so you see how much the NSA has advanced in these 20 years. Isn't that a nice way of saying something?

OK, SSL was a huge success, and then it moved to the Internet Engineering Task Force (IETF), the group that makes the internet standards, to turn it into the Transport Layer Security protocol. The IETF started from SSL version 3. If you look at the coding in the packets, you see the version number. SSL version 1 was 1, version 2 was 2, version 3 was 3. And TLS went to 3.1, and then 3.2, et cetera. So you see that history even in the version numbering.

As I said, it is the IETF that standardizes it. SSL is still the name that most people know, but the protocol is basically TLS. Anyone who still uses actual SSL software is completely outdated. Still, many people talk about SSL when it is really TLS.

There is not one version of TLS; there are multiple versions. I see that something is no longer on the screen here. It started, of course, with version 1.0. Then we had 1.1 and 1.2. And 1.3 is not yet a standard but an Internet-Draft, so people are still discussing improvements. I'm not a real expert on 1.3, but my feeling is that they changed quite a few things. From 1.0 to 1.1, I think, there are really only minor improvements that make things clearer, whereas 1.3 looks rather more different. OK, so there are multiple versions.
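The version numbering just described can be written down as a small lookup table. This is only a sketch: the function names `protocol_name` and `parse_version` are made up for illustration, and the table only covers the versions mentioned above, with each version sent as a major byte followed by a minor byte.

```python
# Map the (major, minor) version bytes seen in packets to protocol names.
# The pairs follow the numbering described above: SSL 3.0 is 3.0,
# TLS 1.0 continues as 3.1, TLS 1.1 as 3.2, TLS 1.2 as 3.3.
WIRE_VERSIONS = {
    (3, 0): "SSL 3.0",
    (3, 1): "TLS 1.0",
    (3, 2): "TLS 1.1",
    (3, 3): "TLS 1.2",
}


def protocol_name(major: int, minor: int) -> str:
    """Return a readable name for a wire version, or mark it as unknown."""
    return WIRE_VERSIONS.get((major, minor), f"unknown ({major}.{minor})")


def parse_version(two_bytes: bytes) -> str:
    """Decode the two version bytes as they appear on the wire."""
    if len(two_bytes) != 2:
        raise ValueError("expected exactly two bytes")
    return protocol_name(two_bytes[0], two_bytes[1])


if __name__ == "__main__":
    print(parse_version(b"\x03\x01"))
    print(parse_version(b"\x03\x03"))
```

So a packet that carries the bytes 3 and 1 is TLS 1.0, even though the number on the wire still looks like an SSL version.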
If you communicate with someone, it is important to know which version you are using; some versions are not recommended.

OK, what is the goal of Transport Layer Security? You can find it in the RFC for TLS 1.2, which is RFC 5246. It specifies four main goals. First, cryptographic security, which means that data gets encrypted. Now, a question for you: do they do it with symmetric or with public key encryption? Who has an idea? Symmetric, what do you think? Yes. Like most protocols, they start with public key cryptography. But that is, as I told you last week, computationally quite expensive. So after they have negotiated the first secure communication channel, they switch to symmetric keys, because that is much faster. I'll come to that a little later.

The second goal is interoperability. That means that if people make different implementations, these should be able to work together. So there are no implementation-specific things in the RFC.

The third is extensibility, which primarily means that you can easily add new encryption or authentication algorithms. TLS gives you the framework and says: here is the field where you specify which encryption mechanism you use. But the details of that mechanism are defined elsewhere, so you can easily change it.

And the fourth is relative efficiency, which has to do with session caching. Let me explain that by looking at how TLS/SSL is used. You use it to go to websites: for example, if you want to buy something, or if you go to your bank, you want a trusted connection. With the original HTTP version, you basically had to create a new connection for every object that you fetched. With current HTTP, you can keep the connection open for a longer time. But initially it was one connection for every object that you download. And who has an idea, if you go to nu.nl, how many objects you get from there, and to how many different servers they point? So you say 30 to 40.
That was indeed the case a couple of years ago. We recently analyzed how many DNS queries you do. You start with an empty DNS cache, you go to nu.nl, and you look at how many different queries that triggers. It was, if I remember well, roughly 200. So far above 100, which is impressive. If for every one of those you first have to do the public key negotiation and derive a symmetric key from it, that takes a lot of time. So what do they do? They cache sessions. If you have already created a trusted connection with the web server, and you stop and go there again a little later, it will reuse the symmetric key material that you negotiated before. That really speeds up your web experience. That is why they do session caching: they use public key cryptography in the beginning and then move to symmetric keys, and that first step is costly.

OK, let's look at the protocol architecture. There are multiple ways to draw this. The key point is that you usually run it over TCP, the connection-oriented transport protocol. On top of that, you have the record protocol, which takes care of confidentiality and message integrity. So the encryption and the authentication part are here in the record protocol. Then you have a couple of protocols which people mostly draw on top of that: the handshake protocol, the change cipher spec protocol (although this will be removed, I understood, in version 1.3), the alert protocol, and the application protocol. The handshake protocol takes care of the initial creation of the symmetric keys. Change cipher spec is used at the moment you update your keys. Alert is for when something goes wrong. The application protocol is the web protocol, or something like that.

OK. If you take Wireshark and look at what you see on the line, then over time you basically see the following. Say this is your client and this is the web server to which you connect.
The first step is a peer negotiation, where you tell each other which algorithms you support. In the exercise for next week, which, by the way, I will publish on Friday, so you can't start yet, you will look at servers and the algorithm support that they have. So that is the first step. After the negotiation, you know which encryption and authentication mechanisms you will use, and then you create your symmetric keys. You do the authentication, where you often use certificates based on public key cryptography. And then the data starts. So you start with the public key, and at the end you use the symmetric key.

Let's look a little deeper at this first protocol, the handshake protocol. What does it look like? The client first sends a message, the ClientHello message. If you open Wireshark, that's what you see. It has a couple of fields. The SSL version: I already told you that the numbering continues from SSL to TLS. The session ID: that's the thing I explained earlier; if you do caching, you have an ID which you can reuse later. Then you have a random value, which is used in setting up your encryption. Then the encryption algorithms that you support: here you give a list, so you can say I support DES, triple DES, AES, and so on. And you specify whether you want to use compression or not. I think in version 1.3 there is no compression anymore.

The server then says: OK, this is the version that I support. So you can negotiate down. The client says, I can do this and this and this, and the server then decides: OK, let's do that. Then again a session ID, another random value, the cipher that you agree upon, and whether you do compression. So that's a relatively simple handshake. After that, you have to verify that the website is indeed a trusted website and create your new symmetric keys. So how does that work?
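Before going into that, the hello exchange just described can be sketched in a few lines. This is a toy model under stated assumptions, not real TLS code: the class and function names are invented for illustration, the cipher names are plain labels rather than real suite identifiers, and the rule "take the first client proposal that the server also supports" is an assumption, since a real server may apply its own preference order.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass
class ClientHello:
    version: Tuple[int, int]   # highest version the client supports
    session_id: bytes          # empty when no cached session is offered
    random: bytes
    ciphers: List[str]         # in client preference order
    compression: List[str]


@dataclass
class ServerHello:
    version: Tuple[int, int]
    session_id: bytes
    random: bytes
    cipher: str
    compression: str


def first_common(offered: Iterable[str], supported: Set[str]) -> Optional[str]:
    """Return the first offered item the other side also supports."""
    for item in offered:
        if item in supported:
            return item
    return None


def answer_hello(hello: ClientHello,
                 server_version: Tuple[int, int],
                 server_ciphers: Set[str],
                 server_random: bytes,
                 session_id: bytes) -> ServerHello:
    """Build the server's reply: negotiate the version down and pick one cipher."""
    cipher = first_common(hello.ciphers, server_ciphers)
    if cipher is None:
        raise ValueError("no common cipher")
    compression = first_common(hello.compression, {"null"})
    if compression is None:
        raise ValueError("no common compression method")
    # Both sides end up on the lower of the two versions.
    version = min(hello.version, server_version)
    return ServerHello(version, session_id, server_random, cipher, compression)
```

With a client that offers AES, triple DES, and DES at version 3.3 and a server that only knows triple DES and DES at version 3.1, this sketch settles on version 3.1 with triple DES.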
It depends a bit on the specific choices that you make. The interactions on the red lines always happen. The blue ones happen depending on the context, and they can also be combined. So you can send a certificate and a server key: if you use the X.509 public key infrastructure, then you include the certificate here as well as the server key, but they are included in one thing. The server can also ask the client: hey, client, I also want to see your certificate. That is optional and in most cases not done. Usually clients don't have one, but in a business environment you can use that. And then the server ends with a ServerHelloDone. The client sees this and sends its certificate back, if that was requested. The keys are then generated, and at the end comes the certificate verify.

After this, you enter your data phase, where you exchange encrypted data. I'll come to that a little later. But first, what happens if you have a connection open for a long time? You can change your cipher spec, or you can change your keys. I never tried it, but I think you should even be able to change from triple DES to AES. So that's something you can do during the data phase.

Now I want to say something about ciphers. This is a picture that was presented at the Internet Measurement Conference. You don't see it here, but you see it on the slides that you can download. The authors looked at which encryption and authentication algorithms are in use. The beginning is usually the encryption stuff. And if you look at these, what do you think is interesting? You first. Yeah, it still includes many old algorithms. Do you know some of them? Yeah. RC4: you should not use that anymore. MD5 is still being used, but you had better not. I think there's triple DES somewhere, but not plain DES. DHE is ephemeral Diffie-Hellman; the elliptic curve variant is ECDHE. RSA you have here.
For AES, you see key sizes of 128 and 256; they're roughly equal in number. What is also interesting is that you see RSA with a null option here, and people still use that. It is intended for testing purposes, but it can be that people configured things wrongly, so that you can still get RSA with no encryption at all. This is a measurement that they did, I think, in 2011. And this, for example, is worrying: you see people use RC4, which is outdated. They measured at two places, and it is 25 to 30%. Not bad. Or not good, sorry.

This is a slide I just copied from Wikipedia. It gives you an overview of which algorithms you should use in which version and which you should not use. I'm not an expert in cryptography; it is something I find difficult. It's always slightly different from what I expect. But there are at least a couple of things that you can see. RSA is still something that is OK. Diffie-Hellman with RSA is also OK. There are a couple of things which are not OK; that's this stuff, of course. Well, something is off here, but it doesn't matter too much. This is the authentication; I was thinking about the encryption. So this is the encryption part. RC4 is mentioned here. You see it has been insecure since the beginning and is still insecure, so you should never use it. What is secure? Some variants of AES are secure, and Camellia, which you used for the last exercise. Triple DES EDE: you shouldn't use that anymore. DES CBC: you shouldn't use that anymore. So there are many choices, and you can easily make the wrong ones, and what we saw from this measurement is that lots of websites still use outdated stuff.

OK, let's now move to the protocol for exchanging data. What does it look like? This is your application data, so this is the data that comes from your web application.
The first thing the TLS record protocol does is this: since the data may be huge, you split it into blocks of at most 16 kilobytes. So you fragment it. That also has to do with the efficiency of TCP, et cetera. Then you may optionally compress it. There are some rules that you have to satisfy if you want that, and there is only one compression algorithm, DEFLATE, that you can use. OK, so now we have such fragments, possibly compressed. Next, you add a message authentication code at the end. Then you encrypt the whole thing, and then you add a header in front. This is what happens for the second fragment; you do the same for the third, et cetera.

OK, certificates. I want to say something about that. Who is running a web server? Wow. Who is running a secure web server? So who has one? You hope so. OK, yeah, that's the right answer. The answer that you run a secure one is wrong, because everything is breakable. OK, so who has ever gone through the procedure of trying to get a certificate? Good, so not too many. So what do you do? It's a few steps. First, you create a certificate for your website yourself. Then you ask a certificate authority to sign the certificate. And then you give the certificate to clients when they access your website. We saw earlier where in the protocol you include that. So let's dive a little deeper into that. How do you do that?

No, before I do that: this is another study, also from IMC and also from 2011, in which they analyzed the validity of certificates that you see on the web. There are also tools for that. They measured at different places, or they used different data sets. What you see is that between 60 and 90% has a valid chain. You see expired certificates: that is still somewhere between 10 and 30%. So many people have an expired certificate.
Self-signed certificates: that's even more. I must admit I also run something myself where I just have a self-signed certificate, because I didn't want to spend the money on buying one. OK, you may do this if you are sure. Yeah, you have a question? Yeah. They provide a new way to get free, properly signed certificates instead of self-signed ones. Yes, now that you say it, I remember that. But I've never played with it. Who has played with it? It's not open yet. You can queue for it. Well, the other thing is that if you are from a university or the like, you can relatively easily get these certificates via TERENA. It's just a web form. But my feeling, though I'm not 100% certain, is that the university should have a policy on that. And our university, like most universities, doesn't have such a policy. We were somehow able to get one anyway; you just click on it and you have it. But what you say is a good point. I knew about that, I looked at it, but I never played with it. It's going to be there soon, I think. It is important. OK, so that would get rid of the self-signed certificates.

What is more worrying is a couple of other things: that the root certificate is not in the root store, or that there is no root certificate at all, or other errors. So the landscape doesn't look too good, I would say.

Good. How do you create one? Well, there is a piece of software that you run on your machine, your Linux machine or whatever, with which you generate a certificate, and in doing so you create a public and a private key. Of course, the private key should never be shared. What should you do with such a private key? Assume you are a bank, so this is something where security is important. Where should you store your private key? Yeah. In a hardware security module. OK, then you do it with... Yeah, I guess they're pretty much the same.
Yeah. OK, so you're basically saying you have dedicated equipment for that. Then it should be secure. What you should never do, if you run your own computer, is store the key somewhere on that computer. It's better to put it on a USB stick, take the USB stick out, and keep it in another place. Yeah. I know of places where you need two or three keys to combine, and they all have separate lockers. Yeah. I'm not sure if you refer to that, but there is forward secrecy, which basically means that if somebody captures all your data and at a certain moment gets into your machine and is able to get your key, then with forward secrecy they should not be able to decrypt the things they stored in the past. That is something which gets more and more important nowadays. And for that, you also need multiple keys or multiple secret things, so if one thing gets broken... Yes, exactly. Well... Yeah. Where is it hosted, China? I don't know. Yeah, perfect. It's great.

OK. So you have this. Then you generate a certificate signing request, in which your public key is readable, and you sign it using your private key. Then you submit it to a certificate authority. There are many certificate authorities. They check the integrity of the request, and they verify the website that you want to authenticate; I'll come a little later to how they do that. Then they generate the certificate, signed with the CA's private key, and send it to you, and after that you are basically ready to use it in your communication.

But as you see here, they verify something. How do they verify? There are multiple ways of doing that, and depending on where you read, you see slightly different terms. One is the domain validated certificate.
That's the easiest. It's called DV SSL. The certificate authority sends a validation mail to the webmaster. So if you have access to the e-mail of utwente.nl, then you can get a certificate. It's not very strong, but it's relatively easy, and for many private individuals this may be sufficient. For a bank, it's certainly not sufficient.

The second is the organization validated certificate, where they check more. They check whether the organization really exists. In the Netherlands they go to the Kamer van Koophandel, the Chamber of Commerce, and check there whether your company exists. But it can still be that you request a certificate although you're not entitled to. The University of Twente exists, but you may not be the one who is supposed to request the certificate for the University of Twente.

So the third category is the extended validation certificate, where they really check whether the one who requests it is authorized. As you can imagine, the first is cheap and this one is really expensive. I'm not sure about the one that you mentioned, this open source initiative; I don't know exactly what they do, but I think this level is not likely. Well, yes, this requires a human. And even this one requires a human, although you can partially automate it. Yeah. You often see that they use DNS for that. Well, if you have access to the DNS server, then you're probably... Yeah, yeah.

How do you know, when you're in a browser, which kind of certificate you have? Yeah? If there is a certificate, you see a lock icon. By the way, you know the favicon.ico for websites? In the past I saw very nice cases of people who put exactly that lock there as their favicon. Very confusing. Most people may not notice it, so that's a good... Sorry? Yeah. So you just have the lock as the favicon for your website? Yeah. So this sign by itself doesn't say too much. The other thing: what color do you see with which certificate?
The problem is that extended validation is usually shown in green, but if you look at different browsers, domain validated is shown in different colors. So this is extremely confusing for people. I assume I can't explain this to my sister; she would not understand it, she would get lost. So this is a complex matter.

OK, so this is basically the checking of the certificate. Once you have got the certificate from the certificate authority, you install it on your website. Every time a client connects to you via SSL/TLS, you send back your signed certificate, and the client then has to check whether the certificate authority is trusted. How do you know if something is trusted? Yeah? Yeah, but how do you know if a CA is trusted? Yeah, then you can manually check, yeah? I'm not sure if it's the operating system; at least it's your browser. But how many trusted certificate authorities are already preconfigured in your computer? Who has an idea? 880. Yeah, it is usually more than 100, and if I look through that list, I have absolutely no clue whom I should trust. So if someone wants to do an attack later, they just add a certificate authority, and then they are free to manipulate whatever they want. 350, wow. But you have a Mac on which you run Linux. So basically, human beings can't know which certificate authorities to trust.

There are also a couple of other things. You have the certificate revocation list, so certificates which are no longer valid can be revoked. I think one of the famous examples was when at a certain moment someone had obtained something for Microsoft. But what is probably the best example of wrong certificates? DigiNotar. Who knows DigiNotar? Who doesn't know DigiNotar? Oh, then I have to tell that story. DigiNotar was a Dutch company.
Basically, I think it was created by a group of lawyers, so there were around 50 of them, and there were also about five people who did cybersecurity. They issued certificates, and they were in this list of trusted CAs. The problem was that at a certain moment their system was hacked, most likely by the Iranian security agency. People in Iran used Google Mail, and the Iranian security agency was interested in what they were exchanging. Everything was encrypted, and they wanted to read it, so they needed keys or whatever. So they... Yes, yes. The Arab Spring was a time when the governments of all these countries around the southern Mediterranean and the Middle East were really afraid. So you can somehow understand it. So they got into DigiNotar and used DigiNotar to create a fake Google certificate. Google found that out a little later, because they had built in some extra checks.

But the problem was that all the important Dutch government services basically went via DigiNotar. The one who was responsible for this was at GOVCERT; I forgot his name. He once gave a very nice presentation: they discovered this on a Monday morning, and I think it was Friday evening when the then minister, Donner, gave a TV press conference in the middle of the night to say that we could no longer trust anything in the Netherlands. In that presentation he explains how the awareness of the impact grew over those five days. And that's really shocking, because you basically can't trust anything you do with taxes, exports, whatever: all your certificates may have been compromised. So that was an interesting thing. DigiNotar is a good example of how not to do something. Probably just as good as Dutch football was. So not a good... They basically didn't protect their servers. So you could... There were only a few people. And... No, no. It is a procedural audit, yeah.
And what you see is that in the company 50 people were lawyers and five were technicians. The lawyers were all well paid. The technicians were treated like: OK, you're there as well. And they had been complaining for a long time about outdated equipment and so on. There are some nice YouTube videos on that; I should put them on the website, because it is just interesting to see what can go wrong. They are, by the way, not the only ones who were compromised. A couple of others have been compromised as well.

Finally, there is also something called the Online Certificate Status Protocol, OCSP. That is what your system uses regularly: it connects to see if the certificate is still valid. After DigiNotar, people suddenly started to rely on this much more, and that was interesting. I have an Apple. If you went onto the Wi-Fi in the train, then, although they don't encrypt, they do send a certificate or something like that. And it was a kind of deadlock, because my system first wanted to check the certificate before I could go on the Wi-Fi, but it couldn't check it because it wasn't on the Wi-Fi yet. So at that time you had to manually switch this check off again, which is not a good idea, of course.

Good. Heartbleed: who has heard of Heartbleed? Who has not heard of Heartbleed? Good, at least there is someone I can tell something. Heartbleed was the big fun of 2014, I guess early 2014. You can't remember exactly? Oh, yeah, April 2014. Sorry. You're reading my slide. That's not what you're supposed to do. Joke. OK. So it is a vulnerability in the OpenSSL stack. OpenSSL is a stack that is used by nearly everyone in the world. It is maintained by a handful of people. I think there was one who was employed full-time and a few who were employed part-time.
They had just enough income to pay the heating and the rent for their equipment, but not enough to pay the people. Fortunately, their business model was that they also gave consultancy to companies on how to use it, and in that way they made some money. But the development of the software was done on a completely voluntary basis by a very small team of people, whereas the entire world relied on this infrastructure being there. No government is spending money on this open source stuff. Then the people who implemented it made a mistake. I think the mistake was introduced somewhere in 2012, one and a half to two years earlier, but it was found in April 2014.

What basically happened is this; I took this from Wikipedia, and I also took this picture from Wikipedia because I think it's relatively clear. You have the normal case and the attacker. There is a heartbeat request in TLS; I'm not sure exactly where they introduced it, in 1.1 or maybe even 1.2. It is a kind of keep-alive message. Once in a while you send the message: hey, are you still up? If you're still up, I send you a few octets, and you send them back to me. So you sent a packet which contained "blah", and you said that the string you sent has a size of four, if you did it well. However, the server did not check whether this was indeed correct. So what did the server do? It just took this and copied, from the start of the buffer, as many characters as you asked for. So if you filled in not four but 40,000, while you only sent four, then the server stored only "blah", but you requested 40,000 bytes from the beginning of "blah". And the server sent that back. It started with "blah", and then you had a typical case of reading past the end of the buffer: blah, blah, blah, blah. And somewhere in there you usually also had the secret keys of your server.
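The missing length check can be shown with a toy model. This is not OpenSSL code and not a real heartbeat message: server memory is modeled as one `bytes` object in which the four-byte payload is followed by other data, and the function names and the "secret" string are invented for illustration.

```python
def heartbeat_vulnerable(memory: bytes, payload_start: int, claimed_len: int) -> bytes:
    """Echo claimed_len bytes from where the payload starts, trusting the sender."""
    return memory[payload_start:payload_start + claimed_len]


def heartbeat_checked(memory: bytes, payload_start: int,
                      actual_len: int, claimed_len: int) -> bytes:
    """Same echo, but discard requests that claim more than was actually sent."""
    if claimed_len > actual_len:
        return b""  # discard instead of replying
    return memory[payload_start:payload_start + claimed_len]


if __name__ == "__main__":
    payload = b"blah"
    # Whatever happens to sit behind the payload in the server's memory.
    memory = payload + b"...server-private-key-material..."

    print(heartbeat_vulnerable(memory, 0, 4))        # honest request
    print(heartbeat_vulnerable(memory, 0, 40000))    # attacker claims 40,000
    print(heartbeat_checked(memory, 0, len(payload), 40000))
```

The honest request gets its four bytes back. The request that claims 40,000 gets "blah" followed by everything stored behind it, while the checked version returns nothing because the claimed length exceeds what was sent.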
And in that way you could compromise everything that you wanted. Who has an idea how much the repair of this Heartbleed exploit cost? Or no, it's not even an exploit: the Heartbleed vulnerability. How much financial damage did it cause? People did make some calculations. Any clue? Yeah, a guess? Ten million. Ten million, OK. People made calculations based on data from one of the viruses in 2001, compensated for a few things, et cetera. They came to the conclusion that something like $500 million would not be an unreasonable estimate of what it cost to repair this. So what do you see? Six people, one full-time and a few part-time, doing work which is crucial for the entire internet community. No one wants to pay them. Why should you want to pay? You get it for free. But then the damage at a certain moment is $500 million. After that, a couple of companies understood that supporting this crucial open source software is something they should do. So companies like Google now finance them a bit more.

Yes, please. It's just SSL/TLS. You just send, as you would with HTTP, something to your secure web port, so 443. And you have this special heartbeat message, and you just fill in this field slightly differently. So it's 443, the encrypted... HTTPS. Yeah, HTTPS. So that's what they did.

A couple of things are interesting. People also tried to analyze who had been using this. And indeed, there were a couple of examples from before it became public in April 2014 where people had already tried to scan for it or exploit it: people found things that looked like the pattern you would use for this Heartbleed attack. Some people claim that the NSA had already been using this for one and a half years. That's not clear, because things other than the Heartbleed vulnerability or attack could also lead to these specific patterns.
So it's not 100% clear whether people exploited it, who exploited it, et cetera. What is also funny is that, of course, many researchers wanted to start scanning, and in the Netherlands, too, they started scanning. One of the things they found is that if you send such an attack to a specific kind of network storage system, I think it was even HP, but I'm not 100% sure about that anymore, it would crash. That was the perfect denial-of-service attack: you just send one packet and the server crashes. So scanning was something where you had to be careful.

OK, HTTPS. We already mentioned the term a couple of times. What is it? It was also introduced by Netscape. It's just an application on top of SSL/TLS: plain HTTP, which you run on top of SSL/TLS. Nothing more than that, nothing specific. So there is just a layer in between. It's in an RFC, and it's port 443; we just mentioned that one. What is important is that you use it between client and web server. So a server should have a certificate, and clients may hold certificates. So HTTPS is nothing special. It's just the web, but using TLS/SSL.
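To make "plain HTTP on top of SSL/TLS" concrete, here is a minimal sketch using Python's standard `socket` and `ssl` modules. The helper names `build_request`, `make_client_context`, and `https_get` are invented for illustration, and the host you pass in is up to you; the only TLS-specific step is wrapping the TCP socket before the ordinary HTTP request is sent.

```python
import socket
import ssl


def build_request(host: str, path: str = "/") -> bytes:
    """An ordinary HTTP/1.1 request; nothing in it is specific to HTTPS."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def make_client_context() -> ssl.SSLContext:
    """Default client settings: verify the server certificate against the
    trusted CAs on this machine and check that it matches the host name."""
    return ssl.create_default_context()


def https_get(host: str, path: str = "/", port: int = 443,
              timeout: float = 10.0) -> bytes:
    """Open TCP, put TLS in between, then speak plain HTTP over it."""
    context = make_client_context()
    with socket.create_connection((host, port), timeout=timeout) as tcp:
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            tls.sendall(build_request(host, path))
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

Calling `https_get` with a host name needs network access and returns the raw response, status line and headers included. If the `wrap_socket` step were left out and port 80 were used instead, the same request bytes would be plain HTTP.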