 So there are many things in IT security. We're covering a selection of topics. Last topic was what? Firewalls. One interesting, some interesting topics that we're essentially going to skip over is software security, how to develop software like applications that is secure, things like database security, how to set up a database to provide extra security. We're not covering that in these lectures. A lot of the management aspects of IT security, we're not covering in lectures, but in your assignment, you've got an example of using some IT risk management procedures to analyze risks, analyze attacks. So we're not going to have a specific lecture on IT management. We're not covering physical security. One aspect of securing IT systems is securing the physical devices. So if you have a server that serves important information, it stores financial information and so on, then the room where that server is stored should have some security mechanisms. We shouldn't allow any student to walk into the room where the server is stored and be able to plug in their USB and copy all the data. So physical security is another part of IT security. We're not covering it. We just don't have time to cover everything. Software security is relevant. And some of you may have heard of things like buffer overflow attacks, which is one aspect of software security, making sure that applications don't allow someone unauthorized access to systems. We're going to go to web security. But some of the issues that we'll see in web security also apply for software security. What about what is web security? Well, security of web applications. Web browsing, accessing websites in a secure manner. But also, as you know, many applications are provided via a web interface today. And how to develop applications, websites, that are secure. So there's this topic on web security. And then the next one will be web applications and some attacks on web applications. All right, let's get into the general topic. 
Now, to talk about web security, you need to know about web browsing. Any questions on web browsing? Everyone's an expert by now, both as a user and in understanding the technology? So a few slides here are just from my previous lectures on other topics, which most of you have seen before, I think. With web browsing, we have applications: browsers and servers. And the simple approach is that the user of a web browser sends a request for some content from a web server. The server sends back that content. And the protocol used is HTTP. So we all know this already. There are many different types of web browsers and also implementations of web servers. Some aspects of that communication: we use an application layer protocol, HTTP, to send a request and get a response. But in fact, HTTP uses TCP. So before the first request is sent, there's actually a TCP connection set up between the browser and server computers. So they set up a connection, then they exchange the request and response, and then close the connection once the data transfer is completed. It's more complex than that, in that there may be multiple requests and responses for images, for style sheets, and so on. Some of the identifiers that we'll see come into play, I hope everyone knows this: we have port numbers to identify the applications and IP addresses to identify the computers on the internet, specifically the interfaces of those computers. So those four addresses, the port numbers used by the browser and the server and the IP addresses used by the client and the server, are important in identifying communications across the internet. And one of our topics towards the end will look at some issues of privacy and how people can track different communications using this information. But this is nothing new to all of you. This is just a summary of HTTP, and again, stolen directly from ITS 323 last semester. 
So everyone's seen this before, but we know HTTP is a request response protocol. It's stateless, which means you send a request for a web page, the server sends back a response, then you send a request for another web page. When the server receives that second request, the response in theory is not related to the previous response or request. That is, subsequent requests received by the server, the server doesn't keep any state about the previous ones. Now, in the next topic, we'll see that that's not so useful for many applications, and we extend upon stateless communications with things like cookies and other forms of session management. But HTTP, think from the server's perspective, it receives a request, sends one response. The next request it receives, the response it sends has got nothing to do with any previous ones it's dealt with. It doesn't store state, but we'll see we can extend that so that it does later. What's the client? A web browser, the client, sometimes also called a user agent, so different names for your web browser, the formal name in HTTP as a user agent. Default server, port number 80. I think everyone knows this. The request messages have some structure, so do the responses. I don't think we'll see too much detail of that in this course, but they're just text messages, so plain text. And they have some generic first few lines in the request and response, and usually some optional header lines, some optional fields to give information between browser and server. And we're not going to go through these again. You can use a few slides here for reference if you need to look them up, but we send a request, we get a response. We have status codes, 200 OK, 404 not found, and so on. And if it's 200 OK, the response includes the actual web page, the file requested. Nothing new to any of you. So the next two or three slides are just for reference if you need to go back and look up some of the status codes, some of the headers. 
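Since the request and response are just plain text, a short sketch may help as a reminder of their shape. This is only an illustration in Python: the host name, path, user agent string and the helper names `parse_message` and `parse_status_line` are made up for the example, and a real browser or server handles many more cases.

```python
# Illustrative messages only: the host, path and user agent are invented.
REQUEST = (
    "GET /index.html HTTP/1.1\r\n"        # request line: method, path, version
    "Host: www.example.com\r\n"           # header lines follow
    "User-Agent: ExampleBrowser/1.0\r\n"
    "Accept-Language: en\r\n"
    "\r\n"                                # blank line ends the headers
)

RESPONSE = (
    "HTTP/1.1 200 OK\r\n"                 # status line: version, code, reason
    "Content-Type: text/html\r\n"
    "Content-Length: 18\r\n"
    "\r\n"
    "<html>Hello</html>"                  # the requested file, 18 bytes
)


def parse_message(text):
    """Split an HTTP message into (start_line, headers, body)."""
    head, _, body = text.partition("\r\n\r\n")
    lines = head.split("\r\n")
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return lines[0], headers, body


def parse_status_line(start_line):
    """Split a response status line into (version, code, reason)."""
    version, code, reason = start_line.split(" ", 2)
    return version, int(code), reason


if __name__ == "__main__":
    start, headers, body = parse_message(RESPONSE)
    print(parse_status_line(start), headers, body)
```

Note that nothing in either message records what was asked before: each request carries everything the server needs, which is the stateless behaviour described above.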
The headers are optional. In both requests and responses, the browser and server may include some extra information to try and improve the communications. Like the character sets that the browser prefers. The languages of the content it prefers. Some string to identify the web browser, the user agent string. The length of the content in bytes, and many other header fields, so these are just some that you may see in examples. Not so relevant yet. So we know web browsing is performed with HTTP. Web applications: what's a web application? A simple website is a set of static web pages, a set of files. Your browser requests a file. The server sends back the file. Static in that the file doesn't change unless someone manually edits it. So a HTML page, an image, a style sheet, for example, and other files. But I think you know nowadays most applications have dynamic content. The content changes depending upon who's accessing it, at what time they're accessing it, and what they've done in the past. So it's not just static content provided by web servers anymore; the content changes over time to provide a better experience for the users. So the content served to the browser changes depending upon the request. That is, one person visits a URL and the server sends back one piece of content; a different person visits that same URL and the server sends back different content, maybe based upon who's logged in. One person logs in, they see the quiz that they need to attempt, another person sees a different quiz. So the server sends back different content depending upon the request. So most applications are dynamic web applications. How is that provided? Well, different ways. You have either client-side functionality or server-side functionality. So client-side includes things like JavaScript. You send a request for a web page or file. 
The server sends back some file, including some JavaScript, which is then executed in your browser to provide some dynamic features, for example so that you can expand things inside the browser. So that's client-side functionality. Flash applications are another example: you request a URL, the Flash application is sent to your browser, and the Flash player loads that and plays it in your browser. And then server-side. You send a request to the web server. The server does some processing. So some code processes some data and sends back a response. There are many different ways to do server-side processing. Different languages: PHP, ColdFusion, Java, and many others. Everyone's done some server-side processing? Yes? In database management systems, maybe last semester with Dr. Tanarak, did you have to create a website? Using PHP? OK, so you've done server-side processing. That's it. That is, someone accesses a website. The website doesn't just send a static HTML page back. The web server loads the PHP code, which executes something, maybe reading a database, and sends back content based on that. That's all we're talking about here. So we can capture it broadly in a picture like this now. That is, we have our web browser. Our user just uses their web browser, sends the HTTP request to a server, and they get a response. But in server-side processing, there's much more happening at the server end. The web server receives a request, and it may forward that request to some engine that does some processing. Like the PHP interpreter. So if they request, say, a PHP file, the web server knows to send that PHP code to, or execute it via, what we call here an engine, which executes the code and maybe sends back some content to the server, some HTML that goes back to the browser. Or the engine actually reads data or pulls data from a database. I think many of you did that in your assignment or in database management systems. You have all your data in a database. 
You have some PHP code that connects to the database and does a query on the database, and based upon the results of that query, some HTML is sent back to the web browser. That's all we mean here. Of course, this engine depends upon the language being used, and the database depends upon the database software being used. It may be a database application. It may be a plain text file. But that's the concept. This will become especially important when we look at some of the attacks on web applications. When someone creates a website using this approach, how do people take advantage of a poor implementation of that website to attack the system? We'll see some attacks. Now I've drawn this in this picture as one computer. Of course, the web server and the database and the engine don't all have to be on one computer. They may be distributed across many computers. OK, so that's a very simple view of web applications. What are the problems with respect to security? Well, different problems. And we're going to cover most of them. First, the data sent between browser and server is sent across the internet. What if someone intercepts and sees the data sent in the request and the response? Then we have a security issue there. We'd like to encrypt the data so that even though some malicious person on the internet can intercept it, if the data is encrypted, they will not be able to decrypt it and see the original plain text message. So we need some form of encryption if we want to keep the data confidential. So that's one issue we need to deal with. And I think most of you know that we've used HTTPS. It's a secure form of web browsing. So we will look in this topic at how HTTPS works. Another thing, and related to HTTPS: when your browser is receiving data from a server, often it wants to know that it is communicating with the correct server, not someone pretending to be that server. So you send a request to www.mybank.com. 
When you get a response, you want to be sure from the user's perspective, you want to be sure that the response is from the server for www.mybank.com. You don't want it to be from someone who you send a request to your bank website, but someone on the internet intercepts that. And they don't forward to the real server, but they send back a fake response from their fake server. In that case, they send back some response, and then that malicious user in the internet can obtain your information. So you don't want that to happen. The user wants to be sure when they're communicating that it is the correct server that they're communicating with. And we'll see, again in this topic, we'll look at digital certificates and how that's commonly done in the internet today, how certificates are used for a browser to be sure who it's communicating with. So it's a form of authentication. The other thing in web applications, of course, many times you log into a web server. You log into an application. The Moodle web server to do the quiz, you log in. And that's some form of the server being sure who the browser is or the human is at the browser. So the server wants to be sure that they're communicating with the intended user. So when the server sends back the grades for this student from the request, it should be the grades for the correct person, not for someone else. How do we do that mainly in websites, using password authentication, logins? The server assumes it's the correct person if they can supply the correct or the pre-configured username and password. You log in to most websites using that technique. We know about how passwords work. We had a whole topic on that. So we'll not cover that in any more depth. We'll see a few examples of how it can be implemented. But the issues of passwords we know about already. We'll see a little bit in some aspects of, OK, how do we implement that in a website? 
This idea that once you log in, everything the server sends back to your browser is related to you. So there's some form of a session. So for that duration that you're logged in for, the server sends back content specific to you. So we need some form of session management. We'll see an example of that in the next topic. When the server does some processing, the user sends some information to the server, a request. With web browsing, you don't just have to request a page with web browsing, say, with forms. You can submit data to the server. With a HTML form, you don't just request data, you submit data, you post data. So if the server is going to receive data from the user, they must make sure that that data is valid and that the things that the server does is appropriate based upon that user. So we need some forms of authentication to make sure that the data that we receive is correct and even some forms of access control to make sure that a user cannot access things on the server which they shouldn't have access to. We'll see some examples of that in the next topic. And towards the end of the semester, this last one, often the user's actions, they want to be kept private. Private from who? Private from different people. The user, when they access different websites, different URLs, they want to keep their information private from others on the internet so others don't know which websites they're accessing. And in some cases, private from the server so that the server doesn't know which computer or which user in the world is accessing that server, another form of privacy. So sometimes that's an issue. And in the last topic, we'll look at some aspects of privacy. Not so much a privacy of your data, but privacy of what you're doing. So today and probably on Thursday, we're just going to deal with the first two. How does HTTPS work and what are certificates? Then there'll be more in the next topic on some of the other aspects. Any questions so far? 
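To make the session idea a little more concrete before the next topic, here is a minimal sketch in Python. Everything in it is hypothetical: the user names, passwords, quiz titles and the functions `login` and `get_quiz` are invented for illustration, passwords are compared in plain text only to keep it short, and how the session identifier travels between browser and server (cookies, for example) is deliberately left out.

```python
import secrets

# Hypothetical accounts and content. A real system would store salted
# password hashes, not plain text passwords.
USERS = {"alice": "correct-horse", "bob": "battery-staple"}
QUIZZES = {"alice": "Quiz A", "bob": "Quiz B"}

SESSIONS = {}  # session id -> username, kept on the server


def login(username, password):
    """Return a new session id if the credentials match, else None."""
    if USERS.get(username) != password:
        return None
    session_id = secrets.token_hex(16)  # hard-to-guess random identifier
    SESSIONS[session_id] = username
    return session_id


def get_quiz(session_id):
    """Same URL for everyone; the content depends on who is logged in."""
    username = SESSIONS.get(session_id)
    if username is None:
        return "401 Please log in"
    return f"200 {QUIZZES[username]}"


if __name__ == "__main__":
    sid = login("alice", "correct-horse")
    print(get_quiz(sid))             # content specific to alice
    print(get_quiz("made-up-id"))    # no valid session, no content
```

The point is only that the server keeps a small amount of state per logged-in user and uses it to decide what to send back and what to refuse.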
Everyone understands this general model of web applications? I'm sure everyone's developed some applications, so I think you have some knowledge there. Let's look at how secure web browsing works, HTTPS. How do we make sure that the data sent between browser and server cannot be read by someone else? How do we keep that data confidential? Well, we use HTTPS. What is HTTPS? It's really using the normal web browsing protocol, HTTP, but instead of running directly on top of TCP, it uses an extra protocol called SSL. The full name is Secure Sockets Layer. And there's a newer version, or it was improved to become what's called Transport Layer Security. We'll see the names of them in a moment. But HTTP over SSL, that's what HTTPS is. So really, to understand what HTTPS is, we know what HTTP is, requests and responses, and we need to look at what SSL is. That's what we'll do in this topic. Just some practical aspects. When you use HTTPS, how do you tell your browser to use this secure form of web browsing? You do so via the URL. The URL contains some scheme for accessing a server. Normally, that scheme is HTTP. But if you want to tell the browser to use the secure form of web browsing, then the scheme is HTTPS. When using HTTPS, the server operates differently than with normal HTTP, and the server listens on a different port. So a server that supports HTTPS uses port 443. Some servers support it, some don't. So not all web servers support HTTPS. They need to be set up to do so. And if you set it up, then it listens on a different port; port 443 is the standard port. What does it do? The idea is that once we've got HTTPS working, everything in the HTTP transfer, that request and response, is encrypted. So in the HTTP GET request, get this URL, which goes from browser to server, the URL is encrypted. The response is encrypted. All the forms, when you fill out a form in your browser and submit that data, all of that data is encrypted. Things like cookies. 
We'll mention them later, we'll explain what they are later. But cookies are encrypted. So all of the things that are normally communicated between the browser and server are encrypted when we use HTTPS. So when we send a request for some URL, some intermediate user in the internet sees only the ciphertext. They don't see the actual request. And similarly when the response comes back, containing some document, they don't see the contents of that document, they just see ciphertext. And assuming we're using this correctly, they cannot obtain the contents. With encryption, we also need to make sure that we have some form of authentication. That is, that we're talking with the right person. In practice, in web browsing, the server is authenticated using digital certificates. And we're going to cover what that means in this topic as well. So the browser making sure it's talking to the correct server, that's one form of authentication. We use certificates. And that's connected with SSL. But for the client, the browser, to be authenticated so that the server knows it's communicating with the correct user, the main form of authentication is using passwords. We send a password, say, in a form. There's a form on the web page, which has a box for entering the username and password. You submit that form, which sends the username and password in a HTTP request to the server. So that's built into HTTP and HTML. Let's look at SSL, Secure Sockets Layer. SSL was first developed for the Netscape web browser. So it was developed as an extension to provide security for web browsing, but became a standard across all web browsers. So it was first developed by a company. But then to make it standard for everyone, it went through several versions. And there's one that's standardized by the IETF called Transport Layer Security, TLS. So you'll see both names. For our purposes, we're going to treat them as the same. 
They differ across different versions, but the most recent version of SSL, or version three at least, and TLS are essentially the same. From our perspective, if you see SSL or TLS, assume they're the same, just different names. SSL provides security services, encryption and authentication, for application layer protocols, including HTTP. And only for those protocols that use TCP. So it assumes that the transport layer is TCP. And SSL will allow us to encrypt all the data we send using TCP for our application layer protocol. SSL actually consists of three or four different protocols in itself, used for different purposes. We'll mention them, but we will not go into the details of how SSL works. This is one viewpoint of SSL. And maybe we can draw it differently, just to make a very simple comparison first between normal web browsing and HTTPS from a stack perspective. So when you're doing normal web browsing, what happens? We're using HTTP. So that's the application layer protocol. The user at the web browser creates some data, or clicks on a link, and HTTP creates a request. That request is sent to TCP, the transport layer. And TCP then, to send it across the internet, delivers it to IP, the network layer, which then goes through the network interface card and out of the computer. I will not draw the lower layers. The data link layer and the physical layer are down below. So application, transport, network. And I haven't drawn the lower two layers. It goes across the internet, is received, and comes up the stack in the opposite direction. Nothing is encrypted. TCP does not provide any security mechanisms. IP does not. HTTP does not. They have no security mechanisms built in. What's HTTPS? Similar, but we insert this extra layer. Again, the application, your web browser, uses HTTP and creates a request, like a GET request, but then doesn't send that to TCP. And let's maybe just go back. 
Roughly, HTTP is part of your browser, whereas TCP and IP are part of the OS, the operating system. So the person who programs the browser, if you write your own web browser, uses some API to send packets using the operating system, some function call or some method that says send using TCP. So you just create the request in your application and call the function which says send this data with TCP, and the operating system deals with sending it with TCP and IP across the internet. With HTTPS, it's slightly different. We insert this new protocol in between. The normal HTTP request is created by the browser, but instead of sending that to the operating system and to TCP, it sends it to this new protocol, SSL, which will then encrypt everything. So it encrypts all the data, the request, and eventually when it comes back, the response. And SSL then sends to TCP, which goes through the normal process, sends to IP, and then sends out across the internet, where roughly now the separation between browser and OS is here. So HTTP is used as before, create a request, get a response, but instead of sending using TCP, it sends using SSL, which does the encryption of the data and the authentication, and sets up a secure connection, which then sends everything via TCP. So we insert this new protocol to provide the security service. From an implementation perspective, that's usually part of the browser or a library that's provided to the browser, some secure sockets layer. And then I'm not going to draw it, but the request goes to the other side. The other side, imagine, receives the data and passes it up to SSL, which decrypts and then sends that decrypted data to the web server. And when the web server creates the response, it creates the HTTP response, SSL encrypts it and sends it back to the web browser, which will decrypt. And from the browser's perspective, normal HTTP is used. But across the internet, everything's encrypted. Any questions so far? Easy. 
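The layering just described can be sketched with Python's standard `socket` and `ssl` modules, where the `ssl` module plays the role of the library that sits between HTTP and TCP. Treat this as a rough sketch under assumptions: the helper names `endpoint_for`, `open_channel` and `fetch` are invented here, the HTTP handling is deliberately minimal, and `fetch` needs network access to a real server to do anything.

```python
import socket
import ssl
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def endpoint_for(url):
    """Return (scheme, host, port), applying the scheme's default port."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return parts.scheme, parts.hostname, port


def open_channel(url, timeout=10):
    """Open a TCP connection; for https, wrap it in TLS before any HTTP is sent."""
    scheme, host, port = endpoint_for(url)
    tcp_sock = socket.create_connection((host, port), timeout=timeout)
    if scheme == "https":
        # The default context checks the server's certificate and host name.
        context = ssl.create_default_context()
        return context.wrap_socket(tcp_sock, server_hostname=host)
    return tcp_sock


def fetch(url):
    """Send the same plain HTTP request either way; only the channel differs."""
    _, host, _ = endpoint_for(url)
    path = urlsplit(url).path or "/"
    request = f"GET {path} HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    chunks = []
    with open_channel(url) as channel:
        channel.sendall(request.encode("ascii"))
        while True:
            data = channel.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

Notice that `fetch` builds exactly the same request text for both schemes. The only difference is whether the bytes are handed straight to TCP or to the TLS-wrapped socket first, which is the picture drawn above.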
We will not see many details of how SSL encrypts. We'll show a few slides, but not much. So that's what this picture tries to capture, that SSL is inserted between HTTP and TCP. But it's a little bit more complex because there are some other features of SSL. So there are four sub-protocols. The SSL record protocol is for doing the encryption of the messages and adding some message integrity checks so that we can check that the received message has not been modified, some data integrity. So everything sent is sent using this record protocol. That does the encryption. But there are other supporting protocols. Before we encrypt our data, what key do we use to encrypt? When my browser sends to the server, if we're going to encrypt our data, we need to use some key to encrypt with. Which key? We've got two options. One is public key cryptography, using the server's public key. The other option is symmetric key cryptography, but that requires the browser and server both to know some shared secret key. So yes, we need to use some public key cryptography, but that's generally slow. For fast encryption, usually we need to use symmetric key cryptography. So we need to agree upon keys and other security parameters. The handshake part of SSL is for doing that. Before we send any data, we do some negotiation between client and server saying, let's use these algorithms. Let's use these keys to do our encryption, to do our authentication, and negotiate some parameter values. And over the lifetime of the connection, you may change keys, you may even change algorithms. You may start out using one algorithm for encryption, and then change to a different algorithm. And over time, you may change keys. The benefit of changing keys is that it makes it harder for the attacker to work out what key you're using. So there's a feature called the Change Cipher Spec protocol for saying, let's change our keys, change the algorithm that we're using. 
And if something goes wrong, say we receive a packet that doesn't decrypt successfully, it contains an error, then there's an alert protocol for telling the other side: I just received a packet that did not decrypt correctly. And either they close the connection, or they somehow deal with that. So these support the secure connection. The record protocol is doing the encryption of all the data; the other three are supporting it: setting up that connection, authenticating both sides, because before we encrypt we want to make sure who we're communicating with, and doing some supporting things like informing of errors and changing keys or algorithms. How do they work, those four different protocols? We will not see many details. This just lists some concepts that are used. An SSL connection corresponds with a TCP connection. So with TCP, we know before we send data, we do a three-way handshake. We set up a connection between browser and server. Then we send data. And when we're finished sending data, we close the connection. So we have a TCP connection. There's also something called an SSL connection, which directly corresponds with a TCP connection. So we have SSL connections. And just as a client and server may have multiple TCP connections in parallel, you may have multiple SSL connections. So that's similar to TCP. But in SSL, we have what's called a session as well. A session is some association between, again, client and server, but it may include multiple connections. So the session is some overall association between client and server. And we set up the session using the handshake protocol. So when my browser wants to connect to a web server using SSL, it first uses some handshake, some exchange of packets, to say, let's set up a session, a secure session. And when they set up a secure session, they negotiate things like some identity of each user, some certificates, and we'll see what that means in the next part. Algorithms used, say for compression. 
Not for security, but for performance. Sometimes we want to compress the data. The ciphers that we will use, the cipher spec here, is which algorithms we are going to use for encryption and authentication. So the specific algorithms. And often we'll use what's called a master secret. So a session has this information associated with it. And we set that up using the handshake protocol. But within one session, SSL may create multiple different connections, where the connections have further information associated, like the specific keys to be used to encrypt and decrypt the data. The keys used for authentication. I don't think we covered it in the cryptography topic, but message authentication codes. So different keys are used for different purposes. Random values are used for authentication. Different sequence numbers are stored and kept track of. Initial values are used in different algorithms. So the specific values are stored per connection. Let's try and illustrate that. We have some browser with an IP address and port number and some server, and they're going to communicate. We will not show the exact protocol operation, but initially they establish a session between them. So with respect to SSL, they establish a session. They exchange some packets, some SSL handshake packets, that say let's set up a session. They store some information about the session at both endpoints, like the algorithms to be used and the identity of each entity. But then to transfer data, they can set up one or more connections. So now to transfer data, the browser establishes a connection within that session. And it transfers data, requests and web pages, images, and so on, using the algorithms related to the session, but using some specific values like different keys. There are some keys for that connection. And both sides store those keys. The idea is that we can have multiple connections within one session. I'll just draw a second one, connection two. 
And that would have its own set of keys associated with that second connection. Different values of keys, so different secret keys. So they encrypt using different keys. And we may have multiple connections, but all associated with this one session. The idea is to help with the performance. That is, you set up one session and agree upon the overall parameters. And then to set up connections for data transfer, you don't have to change the parameters. You just reuse the parameters for the session. The session may be long lived. The connections may be short, maybe just one packet exchange, a request and a response. Whereas the session may be there for a long period of time. So we can create connections and close connections, all within the one session. I just introduced that because we may see it in some of the examples, some of the information related to sessions and connections. How do we set up a session? Using the handshake protocol. There are specific rules for how to do that. For everything that's sent through an SSL connection, we use the record protocol. And the summary of what it does is we take our data from our application. So coming back to the first picture that I drew, here the application is the web browser. The application protocol is HTTP. So the data, for example, is the HTTP GET request. Get this URL or this file, so the details of the GET request are the application data. That's sent to SSL. And what the SSL record protocol does is take that and split it into fragments when necessary. So it will create fragments of a size that is suited for the encryption later on. By using fragments of some fixed size, it's much harder to determine the size of the request and the size of the response. So it hides some information about the size of the communication. Break it into fragments. And then for each fragment, optionally, you can compress it. That's not for security, it's just for performance. 
We can apply some zip compression algorithm, for example. We add some message authentication code. This is like adding, for authentication, some signature or some error detection check, so that the receiver, when they receive it, can check: has this been modified along the way? So this is for data integrity. We add a code. We take that, and we encrypt it, usually using our symmetric key cipher. So we get some ciphertext. We attach some header to identify this fragment. And this part is then sent to TCP. So all of that is done within the SSL record protocol, and that encrypted fragment is sent to TCP, which just sends it as a normal TCP segment across the internet. Anyone who intercepts will see that encrypted fragment. And assuming we have appropriate algorithms and keys, they shouldn't be able to find out the original data. That's the idea. What have we got left? Last slide on SSL, and then we'll show a quick example. The handshake protocol. Again, this is done before we transfer any data. We agree upon parameters. It also allows the client and server to authenticate each other. So before we do any encryption, we want to make sure we're communicating with the right entity. So there's authentication. We negotiate some parameters and algorithms, including some algorithms for exchanging keys. Some algorithms for authentication; they use SHA and MD5 and some variations of those. And some algorithms for encrypting the data. And there's not just one algorithm you must use. You usually can negotiate. Choose from a set. But these are just some of the common ones. So depending upon the client and server, they may have preferences for what algorithms they prefer to use. And they'll choose one that matches their preferences. So these encryption algorithms are symmetric key ciphers for encrypting our data. Whereas the MAC is for authentication, to make sure no one modifies the data. 
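Stepping back to the record protocol for a moment, its steps (fragment, optionally compress, add a message authentication code, encrypt, attach a header) can be sketched in a few lines of Python. This is a toy under loud assumptions, not SSL itself: the "cipher" is a made-up XOR keystream built from SHA-256 because the standard library has no AES and it must not be used for real security, the header is simplified to a type and a length, and the function names `protect` and `unprotect` are invented for this example.

```python
import hashlib
import hmac
import struct
import zlib

FRAGMENT_SIZE = 16384  # records carry at most 2^14 bytes of application data
MAC_LEN = 32           # HMAC-SHA256 output size, chosen for this sketch


def _keystream(key, seq, length):
    """Toy keystream (stand-in for a real symmetric cipher; NOT secure)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + struct.pack(">QI", seq, counter)).digest()
        counter += 1
    return bytes(out[:length])


def _xor(data, stream):
    return bytes(a ^ b for a, b in zip(data, stream))


def _mac(mac_key, seq, fragment):
    # The sequence number is included so records cannot be silently reordered.
    return hmac.new(mac_key, struct.pack(">Q", seq) + fragment, hashlib.sha256).digest()


def protect(data, enc_key, mac_key, compress=False):
    """Fragment -> (compress) -> add MAC -> encrypt -> prepend header."""
    records = []
    for seq, start in enumerate(range(0, len(data), FRAGMENT_SIZE)):
        fragment = data[start:start + FRAGMENT_SIZE]
        if compress:
            fragment = zlib.compress(fragment)          # performance only
        body = fragment + _mac(mac_key, seq, fragment)  # integrity check
        ciphertext = _xor(body, _keystream(enc_key, seq, len(body)))
        header = struct.pack(">BH", 23, len(ciphertext))  # 23 = application data
        records.append(header + ciphertext)
    return records


def unprotect(records, enc_key, mac_key, compress=False):
    """Reverse the steps, rejecting any record whose MAC does not verify."""
    data = bytearray()
    for seq, record in enumerate(records):
        _type, length = struct.unpack(">BH", record[:3])
        ciphertext = record[3:3 + length]
        body = _xor(ciphertext, _keystream(enc_key, seq, len(ciphertext)))
        fragment, mac = body[:-MAC_LEN], body[-MAC_LEN:]
        if not hmac.compare_digest(mac, _mac(mac_key, seq, fragment)):
            raise ValueError("record failed its integrity check")
        data += zlib.decompress(fragment) if compress else fragment
    return bytes(data)
```

Each element of the list returned by `protect` is what would be handed to TCP. Flipping a single bit in one of them makes `unprotect` raise an error, which is the situation the alert protocol exists to report.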
And the key exchange is for this initial step: if we don't have a shared secret key, we need some public key algorithm to exchange a key. So there are different ways to do that. And it goes through different phases for doing the handshake. Let's just go to a quick example of SSL in use. An example from Wireshark of a capture where I did two things. I accessed a web server using normal HTTP. And then I accessed it again using HTTPS. So we'll just have a quick look and highlight what HTTPS provides. The first few packets are the normal web browsing. So we see some DNS queries. We can zoom in maybe. Some DNS queries. The TCP connection is set up. SYN, SYN-ACK, ACK, then the request. So my browser sent a request to the server for some URL, in this case just slash, at some domain. So that's the request. And anyone who's in between the browser and the server can see that exact request. So anyone between those computers, for example, if the browser was my laptop and the server was somewhere in the US, if someone was intercepting my wireless traffic, or someone was plugged into the SIT switch and they could intercept all the traffic going through it, or someone in the internet service provider had access to the router and they intercepted my traffic, then they would see what we see here. They would see my request. It's not a great example. I have to redo this. They would see the response. Well, the response in this case was a Not Modified. But if it was a 200 OK, they'd see the exact web page. So they'd see everything that I communicate with the web server. Not very secure. Even worse, if this request and response was related to me submitting a username and password, my login to some website, you know from labs that you'd capture the actual username and password. So we move down to about five seconds later. Well, maybe not, 10 seconds later. I did it again, but this time using HTTPS. 
There was a DNS query, a normal TCP SYN, SYN-ACK between browser and server, to set up a TCP connection. And then we see here, where it's listed as TLS, SSL takes effect. So this is the start of the SSL handshake. It's my browser and server negotiating some parameters. And actually my client, the browser, says hello. It says hello to the server. The server says hello back and sends some information, a certificate. We're going to see the contents of the certificate shortly. Some other things we notice. TCP is used, and the port of the server is 443 here. So in the original client hello, browser to server, the source port is that of my browser, and the destination port, the server port, is 443. So we're no longer using port 80 for the web server. The client says hello. And inside that hello message, if we expand, are some parameters. We don't need to know them all. But one of the sets of parameters in the hello is my browser saying to the server, these are all the algorithms that I support. If we want to create a secure connection, we should use one of these. Let's choose one. So this list here is a set of combinations of different algorithms that my browser supports, different authentication algorithms and different encryption algorithms, for example AES 256 with SHA as a hash. So you don't need to know those algorithms. But this is just the client saying, this is what I support. When the server responds in its server hello, it chooses one. The server supports a set of algorithms. The client said what it supports, the server says let's use this one. So it responds with just one. So they're going to use this set of algorithms for this secure session. RSA is a public key algorithm that they'll use for key exchange, AES 256 in CBC mode is used for encryption of the data, and SHA is the hash algorithm. They also inform each other of other values, like random numbers. Many of these algorithms need random numbers to support their operations. So they choose some values and tell each other, some random bytes. 
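As a rough sketch of that negotiation, the client hello carries a list of algorithm combinations and the server hello answers with one of them. The lists and function names below are made up for illustration, and picking the first client-offered combination the server also supports is just one simple selection rule assumed here; real servers may apply their own preference order.

```python
# What the browser offers in its client hello, in its own order of preference
# (an illustrative list, not the one from the capture).
CLIENT_OFFER = [
    "TLS_RSA_WITH_AES_256_CBC_SHA",
    "TLS_RSA_WITH_AES_128_CBC_SHA",
    "TLS_RSA_WITH_3DES_EDE_CBC_SHA",
]

# What this hypothetical server is configured to support.
SERVER_SUPPORTED = {
    "TLS_RSA_WITH_AES_256_CBC_SHA",
    "TLS_RSA_WITH_3DES_EDE_CBC_SHA",
}


def choose_cipher_suite(client_offer, server_supported):
    """Return the first offered combination the server supports, or None."""
    for suite in client_offer:
        if suite in server_supported:
            return suite
    return None  # no common algorithms, so the handshake cannot continue


def describe(suite):
    """Split a name like TLS_RSA_WITH_AES_256_CBC_SHA into its three roles."""
    key_exchange, rest = suite[len("TLS_"):].split("_WITH_")
    *cipher_parts, hash_name = rest.split("_")
    return {
        "key_exchange": key_exchange,
        "cipher": "_".join(cipher_parts),
        "hash": hash_name,
    }


chosen = choose_cipher_suite(CLIENT_OFFER, SERVER_SUPPORTED)
print(chosen, describe(chosen))
```

The three parts of the chosen name line up with what we saw in the server hello: an algorithm for key exchange, a cipher for encrypting the data, and a hash.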
In addition to this hello, they send a certificate and go through a few more exchanges. So the client sends back some messages. Let's go back. And then at this point, application data. This is from my browser to the server. And Wireshark says application data. Let's look at what it is. Here it is. This is what's sent from the browser to the server. What was it? Anyone want to guess? It's encrypted data. This is the data sent from browser to server. We don't know what it is. It's encrypted. So all we see is some ciphertext. This is the actual ciphertext here. So after the initial setup, everything that they communicate is encrypted. So now this specific packet is one of these fragments. So we can see the ciphertext, but we cannot see what the original contents were. We don't know what the HTTP request was if we're the someone who intercepts. And everything from then on that we see is just encrypted data. It says application data, but it's just ciphertext that we cannot decipher. And that was the entire web page transfer, I think. That's it. So if you capture this exchange with HTTPS, you can see the negotiation of parameters, but they use algorithms such that even if you see that, you cannot determine the keys that were used. And then they start encrypting the data. And all you see is the ciphertext. And therefore, you cannot see the HTTP GET request. And you cannot see the HTTP response. So that's the purpose of HTTPS. Questions on SSL, HTTPS? Easy. You use it every day. Let's have a look and see what other aspects are needed to support it. One of the things that was mentioned, or came up a few times, was a certificate. We need some way to exchange keys. If we're going to encrypt using symmetric key cryptography, the browser and server must have the same shared secret key. How do they get the same key? Well, one creates the key, a random value, and sends it to the other. 
But of course, they cannot send it across the internet unless it's already encrypted. Because if they send a key across the internet, someone could intercept and see the key. So they need to encrypt that key before they send it across the internet. But, I think you realize, you cannot use symmetric key encryption to encrypt a symmetric key, because you need another symmetric key to encrypt that. And you need to exchange that. So we need to use public key cryptography. And that's where the certificate comes in, digital certificates. So our problem arises with web browsing and other applications, but we will use web browsing because it's the most common application of certificates. We want to use symmetric key cryptography to encrypt our data. It's fast. It's secure. But to do so, both endpoints must have the same key. And one way to exchange keys would be to manually go to the other person and write down the key or program it into their computer. But of course, that doesn't work across the internet. So we need some way to exchange a secret key between browser and server. We use public key cryptography. So going back to our original lecture on cryptography, in public key cryptography, everyone has a pair of keys. We have one key which is known by everyone. It's public. And the other key in the pair is a private key known only by you. And the nature of the algorithms is such that if we encrypt with a public key, someone can only decrypt if they have the private key. And we take advantage of that by choosing a shared secret key, encrypting it with a public key, sending it across the internet, and then only the person who has the private key can decrypt and get that shared secret. So we use public key cryptography to exchange a symmetric shared secret key. There are different ways to do it. There are different public key algorithms that will do that for us. RSA and DSA are two common ones. There's something called Diffie-Hellman Key Exchange. 
I think we will not look at elliptic curve cryptography. So there are different approaches for public key cryptography. We will just go through one. We will look at RSA because it's one of the most common ones. So we will use RSA to exchange a shared secret. And once we've exchanged that secret, then that secret is used for encryption of all our data. So we first exchange a secret and then encrypt everything using that secret. Let's just remind ourselves of the operations from cryptography: the operations we use, the notation, and the assumptions, starting with symmetric key cryptography. We can say that we create ciphertext by encrypting our plaintext with a shared secret key, let's say KAB, a key shared between users A and B. And we send that ciphertext, C, and then B, the receiver, can decrypt. To get the plaintext, they use a decryption algorithm with the same shared secret key, KAB. So we're assuming we have these encryption and decryption algorithms that work such that if we have some plaintext message and encrypt with the algorithm and KAB, we get a ciphertext. If we send the ciphertext, the receiver can only decrypt if they also have KAB. Therefore, KAB is a shared secret key. We want to do that in SSL and other applications because symmetric key encryption is fast. It's quite efficient to encrypt a lot of data. But the problem we have is how does B, the receiver, know KAB? How do they know it? Let's say user A chooses KAB and they want to send KAB to B, so B knows it. Then we can use public key cryptography for that, public key crypto. The idea is that we'll take some encryption algorithm. We will use RSA. So I'll note RSA as a subscript, and encrypt KAB; that's our plaintext that we want to send. And we'll get some ciphertext. What can we call it? We'll call it C1, so it's different from the original C. And send that across the network. So this step actually has to happen first. And B will decrypt. 
To get KAB, they will decrypt C1 using RSA or a similar algorithm. But with public key cryptography, what keys do we use here? Now, I haven't drawn it, but the left is A and the right is B. We want to get the plaintext, this secret value, KAB. A chooses a value, some random 128-bit number. That's the key. And they encrypt it and they send it to B, so that we can do the previous step, so that we can do symmetric encryption. What does A encrypt KAB with? Which key? The public key of the destination, the public key of B, and B decrypts using the private key of B. So that's the idea. Do this first. And then once we've done that, A knows KAB, they chose it. And B now knows KAB because they decrypted the message and got KAB. So now they can do symmetric key encryption using KAB, which is much faster than public key encryption. So how do we make sure this is secure? Well, we use the keys in this way, and we're assuming we have an algorithm such that if we encrypt with a public key, we can only successfully decrypt using the corresponding private key. So if we encrypt with B's public key, only the person that has the private key of B can decrypt. And by definition, that should be B. If someone else has B's private key, then it's no longer private. So we can send this C1 across the internet now. Someone can intercept C1, but they cannot decrypt it and get KAB because they don't have the private key of B. So that's what we want to achieve. Let's say A is the browser. It chooses a secret value, some random value. It encrypts it with the public key of the web server, sends it to the web server, and the web server uses its private key to decrypt and gets the secret value. And then the web server and browser communicate using some symmetric key cipher, using SSL. What's the challenge now? So assuming our algorithms are secure, what's hard about this? Or what presents a risk or a security problem? 
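Before looking at what can go wrong, here is a small sketch of the exchange just described: A picks a random 128-bit KAB, encrypts it with B's public key to get C1, and B decrypts C1 with its private key. This uses textbook RSA with two small known primes so it runs with only the standard library. The function names and key sizes are assumptions for illustration; real RSA uses much larger keys and padding, so treat this as a demonstration of which key is used where and nothing more.

```python
import secrets


def make_toy_rsa_keypair():
    """Build a toy RSA key pair (far too small and simple for real use)."""
    p, q = 2**61 - 1, 2**89 - 1         # two known Mersenne primes
    n = p * q
    e = 65537                           # common public exponent
    d = pow(e, -1, (p - 1) * (q - 1))   # private exponent (needs Python 3.8+)
    return (e, n), (d, n)               # (public key, private key)


def rsa_encrypt(public_key, message):
    e, n = public_key
    return pow(message, e, n)


def rsa_decrypt(private_key, ciphertext):
    d, n = private_key
    return pow(ciphertext, d, n)


# B, the web server, owns a key pair; PU_B is public, PR_B stays with B.
pu_b, pr_b = make_toy_rsa_keypair()

# A, the browser, chooses a random 128-bit secret and encrypts it with PU_B.
k_ab = secrets.randbits(128)
c1 = rsa_encrypt(pu_b, k_ab)

# C1 crosses the internet; only the holder of PR_B can recover KAB from it.
recovered = rsa_decrypt(pr_b, c1)
```

After this step both sides hold the same value, `k_ab` at the browser and `recovered` at the server, and can switch to the faster symmetric cipher.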
Assuming no one can decrypt unless they have the correct key, what can go wrong? And I didn't note it, but let's say the left side, A, is our browser, and B is the server, the web server. What can go wrong? Why do we need something else? Or what's missing? The signature? Sorry, what comes from the real server? The key. Which key? The public key of B. Look, the browser, user A, uses the public key of the server to encrypt. Whoever has the private key of the server can decrypt. Now, the problem that we have is, how do I know this key, this value, is indeed the public key of the server I want to communicate with? That's the challenge. Because if I think this PUB is the key of the web server, but in fact it's the key of some malicious user, then what happens? I think it's the public key of the web server. I encrypt my secret value. I send it to the server, but this malicious user intercepts. And since the malicious user chose the public key, they also know the private key and can decrypt. Maybe we can draw that to finish off today. So that's the idea. Let's see what can happen if we use the wrong key. So the browser, our user on the left, follows this process. They create C1 by encrypting, using RSA and a public key, some secret value that they want to share with the server. They think they have the public key of the server, but they don't. It's someone else's. It's the public key of some malicious user. They don't know that though. So they send this C1, but it's in fact intercepted by the malicious user. So the malicious user can decrypt, because the malicious user will have the private key. So the malicious user decrypts C1 using their own private key. They get the secret. So now they know the secret that the browser chose. What do they do next? They take that secret and encrypt it using the real public key of the server. Encrypt using RSA of, scrub that one out. That's not there then. Using the public key of the server, B in our case. 
The same KAB in this case, and they get, let's say, C2, and send C2 on to the server. This was C1 sent. So the two steps. The malicious user intercepts the one sent to the server. Because it was encrypted with the malicious user's public key, the malicious user can decrypt. So the malicious user gets to know the secret key chosen by the browser, KAB. But now what the malicious user does is take that secret value, encrypt it with the real public key of the server, PUB, and send that encrypted value, C2, to the server. The server thinks it's received this message from the browser and decrypts C2 using its private key, PRB. What do they get? KAB. So C2, which was KAB encrypted with the public key of B, is sent to the real server, who decrypts with the private key of B and therefore gets KAB. So the browser chose KAB. The server has KAB. They think they're OK. They both have this shared secret. So from now on, browser and server start encrypting their data using KAB. But this malicious user now performs what's called a man-in-the-middle attack. What they do is intercept everything sent between browser and server, and they can decrypt everything because they also know KAB. So the browser and server think everything's OK. They both have a shared secret key. But in fact, someone else also has that shared secret key. The malicious user knows KAB. So anything sent between browser and server across the internet can be decrypted by the malicious user. The destination should be the server? Yes. Remember, this is the browser, here's the server, this is the internet. So what happens is that, yes, the browser sends to the server. My browser on my laptop sends to the server in the US. But the malicious user is someone who's listening in on the SIT switch. One of our staff is malicious, and they plug into the switch there. And what they do is they intercept my request, or my packet I sent to the server. 
They intercept it, decrypt it, then create this new one, C2, and they send that on to the server. So they need to be able to intercept and modify it. Link it in. No. Look what the browser has sent. The browser has sent some secret value, encrypted. The server receives that secret value. What will happen next is the server will respond, encrypting with this KAB. And the browser will see that that's correct. So in this case, the browser is fooled into thinking that the public key of the server is PUM. It's not. That's the public key of the malicious user. But if the browser mistakenly uses this public key, then this attack is successful. Because both have the secret value, but the malicious user does too. So the problem here is that the browser used the wrong public key. And certificates are about trying to make sure that the browser will not use the wrong public key. We need some way to confirm that the public key we use here is indeed the public key of the server. It's not someone else's. And the way we do that is we get someone else, another entity, to sign the public key. Someone else that we trust, to confirm this is the public key of the server. And that's what digital certificates are. So this is just motivation. Why do we need to make sure that the public key is correct? So in summary, because we're out of time, we said we want to encrypt all of our data with symmetric key encryption. Therefore, we need a shared secret key. How do we get a shared secret key across the internet? Well, use public key encryption. Encrypt some chosen secret key using the destination's public key, and only the destination can decrypt and get that chosen secret value. But if the person who chooses the secret value uses the wrong public key to encrypt, then a malicious user can intercept, get the secret value, and then forward a message to the server such that neither the browser nor the server knows anything's wrong. They think everything's OK. 
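The attack we just walked through can be sketched in a few lines. Again this is toy textbook RSA with small known primes, and all the names (`pu_m`, `stolen`, and so on) are made up for this illustration. The only thing that goes wrong is that the browser encrypts with PUM instead of PUB.

```python
import secrets

E = 65537  # public exponent used for both toy key pairs


def toy_keypair(p, q):
    """Toy RSA key pair from two primes (illustration only, not secure)."""
    n = p * q
    d = pow(E, -1, (p - 1) * (q - 1))  # needs Python 3.8+
    return (E, n), (d, n)


def encrypt(public_key, message):
    e, n = public_key
    return pow(message, e, n)


def decrypt(private_key, ciphertext):
    d, n = private_key
    return pow(ciphertext, d, n)


pu_b, pr_b = toy_keypair(2**61 - 1, 2**89 - 1)     # the real server, B
pu_m, pr_m = toy_keypair(2**107 - 1, 2**127 - 1)   # the malicious user, M

# Browser A chooses KAB but is fooled into using PU_M as "the server's key".
k_ab = secrets.randbits(128)
c1 = encrypt(pu_m, k_ab)

# M intercepts C1, decrypts with its own private key, and learns KAB.
stolen = decrypt(pr_m, c1)

# M re-encrypts the same KAB with the real PU_B and forwards C2 to the server.
c2 = encrypt(pu_b, stolen)

# The server decrypts C2 with PR_B and gets KAB, so nothing looks wrong to it.
server_key = decrypt(pr_b, c2)

print(k_ab == server_key == stolen)  # browser, server and attacker all share KAB
```

Nothing in the cryptography failed here. The attack works purely because the browser had no way to check whose public key it was using, which is the gap that certificates are meant to close.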
Browser and server start communicating, encrypting using KAB, malicious user intercepts and decrypts everything. So our challenge that we'll cover Thursday, how do we make sure the public key used here is correct? That's what certificates will be used for. Questions to finish? Clear? No. Have a look at the.