 We're talking about how to send our data across the internet to keep that data confidential, but not only that, also to keep the actions that we're doing confidential. That is who we're communicating with, keeping that secret from others. So that's what we're focusing on here. And we've started by going through, actually we presented the basic approach if we just use HTTP. We saw web proxies, so we'll just recap. What this picture is showing is that when we use HTTPS, we know we have encryption of the data. So we have confidentiality of our data between you and the web server. Anyone in between you and the web server, such as the firewall or your ISP or other ISPs or other monitors, those other networks, anyone in between cannot read the data that you're sending between you and the server, because it's encrypted with HTTPS. So that's one security objective met. But they can monitor who you're communicating with. By looking at the addresses inside those packets, the source IP address and the destination IP address must be sent unencrypted so that the nodes know where to send it to. So anyone including the firewall, say on your local ISP or any other ISPs, anyone can monitor and see that it is you communicating with the server. So that's where we say it provides confidentiality of data but does not provide privacy of who's communicating. We looked at another approach using a web proxy. And a proxy is an entity that acts on someone's behalf. So we saw a proxy come up in terms of firewalls and it has a similar purpose here. A web proxy here is used to act as a client on our web browser's behalf and send the request to the real server on our behalf. And the basic concept of how the proxy is used, some other computer in the internet is running the web proxy, you send a request to the web proxy saying you want to visit some particular URL. So you use their web interface, maybe in the form you type in the URL that you want to visit on their website. 
That sends a request to the proxy and the proxy takes that URL that you want to visit, in this case www.s.com slash abc.html and then creates a normal HTTP get request to that actual website. So the proxy is contacting the real web server on your behalf. And the real web server responds to who it got the request from, sends back the web page and the proxy then forwards that web page back to you. It may modify that web page or it may not. For example if the proxy server is supported by advertising it may insert some ads inside of that web page so you see the ads as a way for them to make money. So that's the one way that a web proxy can be used. Just as an example, so we'll see, we'll mention a moment that, right, who runs the proxy? Who owns the proxy? Well, there are some free proxy servers, that is they will allow anyone to connect to them. Now how do they pay for the server and pay for the internet connection that they share to others? Well, one way is that they could make you view ads as you use them. They could do it in multiple ways, it could be as simple as inserting their ad content inside the web page in the response. Or maybe there's proxy services that don't support ads but you pay them $2 per month to use their server. Just an example of how it may operate. From the security perspective, you want to contact the server S but there's another entity involved in this communications, the proxy P, some server out on the internet, external to your local ISP. So if you remember back to the original diagram, the firewall entity here is just, imagine that that's the firewall run by your ISP. Maybe it blocks some websites. In this case, if the firewall is blocking you from accessing the destination, if you use the proxy, you can bypass that block. Because when you create the source, the packet, the data is going to be sent to the proxy and the data will include the address of the server you want to contact. The source IP address will be you. 
The destination IP address will be that of the proxy. Therefore the firewall would not block it. The proxy receives that and sends it on to the real server. So what the server receives is a packet from P, the proxy. So the server doesn't know it's you contacting it. The server thinks it's P contacting it. So the server cannot identify you, at least from the packet structure. We've bypassed the firewall in this case because the destination address doesn't match the blocked address. Those between the proxy and the server also cannot see that it's you that's communicating with the server, because if you look at the source and destination address in the packets, anyone who intercepts this packet sees it's from P going to S. Therefore they don't know that computer U is involved. And similarly here, if this router intercepts this packet, they see the source is you, the destination is P. So they don't know, at least from the IP addresses, that it's computer U communicating with the server S. They would need to do more complex analysis and look inside the packet to realise that. So it provides some level of privacy of your communications, because others do not know who's communicating with whom. There's no encryption in this case, so others can still read the data. So that doesn't provide any confidentiality of the data. In addition, the proxy knows it's you contacting the server. You contacted the proxy saying, I want to contact the server. So the proxy itself knows what you're doing, but others don't. A modification of that is to use HTTPS as well, to encrypt the data. Still the server cannot identify you, because it thinks the request is from the proxy. The firewall cannot block it, or not easily block it, because it thinks you're sending to P, not S. The data is encrypted using HTTPS, so the firewall or others out on the internet cannot read it. However, for this to work, the encryption is not end-to-end.
The encryption is across two segments: from you to the proxy, then the proxy decrypts and encrypts again from the proxy to the server. It's not quite clear in that diagram, but normally for the web proxy to support HTTPS, you encrypt the data, and because the data contains the address of the server, the proxy must be able to decrypt it so it knows where to send that data. Therefore, the proxy must be able to decrypt this packet, and then it can encrypt it again and send it on to the real server. So normally, if you use HTTPS with a proxy, it will use encryption from you to the proxy and a separate connection from proxy to server. The result in terms of security is that the proxy can see the data. It decrypts and can see the data you're sending. So there's no hiding from the proxy in this case. No hiding your data or who you're communicating with. Any questions on the proxy? Compared to not using a proxy, it allows the option to bypass the firewall and to hide who is communicating, but you must trust the proxy in this case because the proxy knows who is communicating and can read your data. So do you use a proxy to access your bank account online? Why not? Right. Even though you may use HTTPS to the bank web server, normally for the proxy to work correctly with HTTPS, it must decrypt the connection so it can learn the destination. And the way that it may do that is it may use fake certificates or self-signed certificates. So if you do use a proxy with HTTPS, what you may see on your computer is a certificate warning saying, you've received a web page from your bank, but it's not trusted, whereas normally it would be trusted. That's because the certificate is in fact signed not by the authority that you trust, but by the proxy. So free web proxies that do support HTTPS will either present your browser with a warning saying something's wrong here or they'll ask you to ... 
the proxy will ask you to load their certificate into your browser, but in either case you must trust the proxy if you use HTTPS. What if the malicious user is the proxy owner? Then they have seen your data and can see who you're communicating with. That is the problem. Right. So we must trust the proxy. If we cannot trust the proxy, if they are malicious, then this doesn't provide the security we're looking for. Okay. So yes, that is a problem. And there may be some proxy servers that say they will provide a free proxy, but what they may also do is monitor who you're communicating with and the data you're sending and use that for their own purpose. So you need to be careful using free web proxies, or any web proxies. It implies you trust that proxy. Just before we look at VPNs, let's look at that issue with certificates. See if we can capture it on this one. If we're using HTTPS with this, we have our proxy here, and with HTTPS the server would have a certificate signed by an authority. Some notation you may see in other cases is this: the certificate of the server is signed by some authority, I'll say CA, the certificate authority. Like in your homework, you have set up your server, www.myuni.edu, and you get it signed by the authority, me. That's the normal behavior. And normally without the proxy, when the server sends back a response to you, you have the certificate of the authority. This notation is just a shorthand to say this is the certificate of the authority. And who signs the authority's certificate? In the homework, you see it's a self-signed certificate, CA. At some point, we need a self-signed certificate, and that's preloaded into your browser, and it's trusted. In your homework, you needed to do that step manually. You needed to load my certificate into your operating system.
So the normal step without the proxy is when this certificate comes back, because it's signed by the authority, CA, and because you trust the authority, you already have its public key, then you'll trust the certificate of the server, S. But what if we're using a proxy? How could that work? The certificate of the server is sent back, and what the proxy may do is then ... any questions? This notation of two less-than and two greater-than signs is just showing that this is the certificate of some subject, signed by some issuer. Here we could have the certificate of the server signed by, say, the proxy. That's what normally will happen. That is, the proxy will sign the certificate of the server, and that allows us to have two secure connections using HTTPS. You have two separate HTTPS connections, one from U to the proxy, and then one from the proxy to the server. And for that to work, the proxy must be able to decrypt what it receives from the server, and what you send to the server, and to do so, the proxy would create a certificate of the server signed by itself, for example. Looking at HTTPS connection one, when the proxy sends this certificate to your browser, what does your browser do? If your browser receives the certificate of S signed by P, what will your browser do? Not trust it. It gives a warning. It will pop up a warning saying, you've received a certificate that's not trusted in this case. And that's what normally would happen if we use HTTPS, because we must use two different connections so that the proxy knows what server you're trying to contact, and that is included inside the packet. So the proxy must decrypt what you send to the server, therefore we need two separate connections, and the way that it works is that the proxy would create what is really a fake certificate for the server, signed by itself.
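As a rough illustration of the trust decision just described, here is a toy sketch in Python. It is only a model: there is no real cryptography or X.509 parsing, and the names `Cert`, `browser_trusts` and `trust_store` are made up for this example. A "certificate" is just a subject plus the name of whoever signed it, and the "browser" accepts it only if the signer is in its trust store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cert:
    subject: str  # who the certificate is for, e.g. the server S
    issuer: str   # who signed it, e.g. CA or the proxy P


def browser_trusts(cert: Cert, trusted_issuers: set) -> bool:
    """Toy check: accept only if the signer is already in the trust store."""
    return cert.issuer in trusted_issuers


# The browser ships with the CA's self-signed certificate preloaded.
trust_store = {"CA"}

real_cert = Cert(subject="S", issuer="CA")   # sent by S on HTTPS connection two
proxy_cert = Cert(subject="S", issuer="P")   # created by P for HTTPS connection one

direct_ok = browser_trusts(real_cert, trust_store)               # no proxy involved
via_proxy_ok = browser_trusts(proxy_cert, trust_store)           # browser shows a warning
after_loading_ok = browser_trusts(proxy_cert, trust_store | {"P"})  # user loaded P's certificate
```

With only CA in the trust store, the certificate signed by P is rejected, which corresponds to the browser warning. Adding P to the trust store removes the warning, but it also means every certificate P creates will be accepted, which is why this only makes sense if you trust the proxy.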
And normally your browser would not trust that certificate signed by the proxy, so that's why you would get a warning saying, don't trust this connection. If you wanted to continually use that proxy, and you did trust that proxy, then you could load into your browser a certificate of the proxy, for example, a self-signed certificate. And that would prevent the warnings from being displayed every time you use the web proxy. This is the normal server certificate used across HTTPS connection two between proxy and server. This one, the certificate of S signed by P, is used across connection HTTPS one between you and the proxy. And you need to either accept the warning when you receive that certificate, or load the proxy's self-signed certificate, for example, into your browser manually, if you trust that proxy. So that's a bit of an issue with using proxies. We must trust it. Any questions before we move on to VPNs? So the web proxy was really our first privacy option. Second one, VPNs, virtual private networks. And there's really two components of VPNs. The first is a concept called tunneling. And we mentioned that when we looked at IPsec. We put a packet from one layer inside a packet of the same or even higher layer. That's the concept of tunneling. And the second is that usually we use encryption of that inner packet. That is, we put an IP packet inside another IP packet. So we can think there's an inner packet and an outer packet. And usually we'll encrypt the inner one. And that is what's used for a virtual private network. There are different technologies for VPNs. In the previous topic, we mentioned IPsec. So that's one technology to support a virtual private network. Another one is secure shell. You can use secure shell to set up a connection from you to some special server. And that can be used to tunnel your traffic. But there are maybe three others which are more widely used for virtual private networks. And we'll not go through the details, but they're named, there's OpenVPN.
OpenVPN uses TLS at the transport layer. And there's two others which really use the lower layers, the data link layer: PPTP and L2TP, Point-to-Point Tunneling Protocol and Layer 2 Tunneling Protocol. They have similarities, but they are two alternatives to support VPNs. We're not going to compare the differences between them. We'll just talk about the concept of using a VPN using any of those technologies. The way that most of them work is that they create a virtual interface on your computer. You know, say in Linux, when you have a network interface, it has a name like eth0 or wlan1. What a VPN will do is create an additional interface, sometimes called tun for tunnel, tun0. So you have eth0 and tun0. It creates this additional interface. And what happens is that your applications send to this new tunnel interface, which then does some encryption and then sends internally in your computer to the real interface and out to the network. The details of how tunneling works and these specific technologies we will not get into today. Just look at the concept. With a VPN, we usually have some endpoint, the VPN server. So in this case, we want to create a virtual private network between you and some server out on the internet, and I denote it as V in this case. So the VPN is between you and V. Between V and S, it's just normal internet connectivity. And using a VPN with HTTP for normal web browsing, the way that it would work is that you want to send a packet to the server. So your application creates a packet. If you look at the inner one with the lines across it, that inner packet is the one that was originally created. The source address is set to the special address for your computer, which is created for the tunnel interface. I set it to V in this case, that of the VPN server. The destination is S, the real server you want to contact. And the data, the HTTP GET request, is included inside there. So think of that as the inner IP packet.
Source IP address is that of V, even though it's coming from you, we'll set it to that of V. And the destination is that of the server we want to contact. We take that IP packet and encrypt it all. That's what I mean by the lines across it. I mean that part of the packet is encrypted. And then we put all of that inside another IP packet. So we attach a header and we set the source address of this outer IP packet, coming from you, going to V. It gets sent across the real network. So it goes to our local router, sends it to the ISPs router. What about the firewall? What does a firewall do with that packet? The firewall has a rule to block anything to destination S. Does it match? No, the destination is V. So this is a way to also bypass the firewall. So we've assumed our firewall was set up just to block going to a particular destination. Because the destination is V, it bypasses that block. The packet eventually goes across the internet, it will get to V. The internet routing protocols will deliver it to V. V is the tunnel endpoint. It is the other end of the tunnel, of the virtual private network. What it does when it sees this packet, okay, I'm the destination, it realizes, all right, this packet was sent in a tunnel to me. It removes the outer header and decrypts what's inside. And what it's left with is this packet saying, here's some data from V to S. And sends that across the internet, it will go via the different ISPs and be received by the server. From U to V is the VPN. V to S is just normal internet access, forwarding of IP packets. U to V, we forward IP packets, but that IP packet has actually another IP packet encrypted inside it. It's called a virtual private network, because effectively from U through to V, even though we're using the public internet to communicate, because the data is encrypted, we can say that's private. So we're using a public network, but because the data is all encrypted, we call it a private network. 
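The encapsulation steps above can be sketched in a few lines of Python. This is a simplified model, not how any particular VPN technology is implemented: `Packet`, `Sealed`, `firewall_allows`, `client_encapsulate` and `vpn_server_forward` are hypothetical names, and `Sealed` is only a stand-in for encryption (it refuses to reveal its contents without the matching key name, but does no real cryptography). The addresses U, V and S follow the lecture's diagram, including the inner source address being set to V.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: object  # application data, or a Sealed inner packet


@dataclass(frozen=True)
class Sealed:
    """Stand-in for ciphertext: the inner packet is only readable with the right key."""
    key: str
    inner: Packet

    def open(self, key: str) -> Packet:
        if key != self.key:
            raise PermissionError("cannot decrypt: wrong key")
        return self.inner


def firewall_allows(pkt: Packet, blocked_dst: str = "S") -> bool:
    # The firewall in the example only looks at the outer destination address.
    return pkt.dst != blocked_dst


def client_encapsulate(data: str, tunnel_key: str) -> Packet:
    # Inner packet: source V (as in the lecture's diagram), destination S.
    inner = Packet(src="V", dst="S", payload=data)
    # Outer packet: from U to the tunnel endpoint V, carrying the encrypted inner packet.
    return Packet(src="U", dst="V", payload=Sealed(tunnel_key, inner))


def vpn_server_forward(outer: Packet, tunnel_key: str) -> Packet:
    # V removes the outer header, decrypts, and forwards the inner packet as-is.
    return outer.payload.open(tunnel_key)


direct = Packet(src="U", dst="S", payload="GET /abc.html")
outer = client_encapsulate("GET /abc.html", tunnel_key="k_UV")
forwarded = vpn_server_forward(outer, tunnel_key="k_UV")
```

Here `firewall_allows(direct)` is false while `firewall_allows(outer)` is true, and the packet that leaves V has source V and destination S, so the server never sees U's address.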
It's not a true private network. A true private network would be when I own the actual cables between U and V. So we call it a virtual private network. From our security objectives, what objectives have we met? The firewall has been bypassed. Destination was V, and the firewall only blocks the destination S. So the firewall cannot filter based upon just the destination address. The firewall or the device here cannot read the data. It's encrypted. Anyone between U and V cannot read the data. So let's say U is inside one country. V is in another country. The idea is that anyone in that country where U is cannot read the data because it's encrypted through the network internal to that country. What about the server? When the server receives the packet, the source is V. So we say the server cannot identify U. It thinks it's getting a message from V, but it actually originally came from U. So here's another form of privacy. You're hiding from the server. What about those between V and S? What can they see? Those between V and S can see the data. It's not encrypted across that segment. So others can read the data in this case. But they do not know it's U communicating with the server. This node here, for example, if it intercepts this packet, thinks it's V communicating with S. So there's privacy from those other nodes but not confidentiality of the data. Because with HTTP, this is just the HTTP GET request. It's not encrypted. The VPN server can read the data. The VPN server receives the packet, decrypts it and sees the data. And the VPN server knows that it's U communicating with S, because it is the endpoint of both of those segments. We can use HTTPS as well. Very similar, but just before U creates the original IP packet, it first encrypts that data. So I've shown it slightly differently here. The data is encrypted using HTTPS, put inside an IP packet, and that IP packet is then encrypted as well.
So there's two levels of encryption here. One from the application that encrypts the data. Yes, this data is encrypted two times. HTTPS encrypts it and then the VPN technology encrypts it. So there's a small overhead of that, but it's not a big issue. A small performance overhead. The benefit is that when the VPN server receives that entire packet, it decrypts and grabs the internal packet, but the data is still encrypted. So the VPN server cannot see the data, and the others between V and S cannot see the data, if we're using HTTPS in this case. So between HTTP and HTTPS, the difference is that with HTTPS, others and the VPN server cannot read the data. We're almost there in terms of achieving our objectives. We want green everywhere. That is, I'd like to achieve all those objectives. We're very close, but we still have the issue that the VPN server knows it's you communicating with the server. What about certificates when we use HTTPS? What is the issue with certificates? We don't have the same issue as web proxies. We said with a web proxy, if we used HTTPS, for the proxy to work it must be able to see the data because the data contains the address of the server. See this small plus S in here? So the proxy must be able to decrypt the data for the proxy to work, but that's not necessary for a VPN. The address of the server is carried in the IP header of the tunnelled packet, outside the HTTPS-encrypted data. The VPN server does not need to look inside the HTTP request to find it. So there's the difference here. The VPN server does not need to see the data when we use HTTPS, and therefore there's no issue with certificates. We just use HTTPS from you through to the server S. There's no problems with extra certificates. So that's even better. The server would have its certificate signed by the CA when it sends it back. It's using HTTPS. The VPN server doesn't care about the data in this case. There's no need to decrypt the data. There's no need to interrupt the HTTPS connection.
And therefore it just passes that certificate all the way back to you, the original browser. So we say in this case, it's HTTPS from one endpoint to another, from you through to S. It works as per normal HTTPS. There's no problems like with proxies. The problem with proxies is that they mainly work at the HTTP level. That is, both the data itself and the address of the server are included in the HTTP message. So for the proxy to know the address, it must decrypt and look at that value. That's the problem with using web proxies here. So that's why we have this strange issue of certificates with web proxies and HTTPS. How can we achieve our last objective? I want to communicate with a server without anyone else knowing I'm communicating with that server. Neither of the technologies, web proxies or VPNs, has achieved that yet. Maybe before that, some practical things about a VPN. There are some free VPN servers you can use. Alternatively, you pay, let's say, $5 per month to rent or to use a VPN server. And that server acts as the endpoint of your tunnel. There are different protocols, as we said, to choose from. We may see a slide later, but generally OpenVPN, PPTP and L2TP are the three most widely used in this scenario. The difference is mainly about support. For example, PPTP is widely used in Windows. Even the old versions of Windows had it implemented. I mean, you didn't need to load any software to use a VPN. Most mobile devices, Android and iOS, support PPTP and L2TP. If you open up your mobile phone now and find the network settings, you'll see VPNs and you'll usually see the choice of at least these two. Can anyone confirm? Find the VPN settings on your phone. Somewhere under the advanced network settings. I'm sure you know. Find the settings, VPN, select VPN and it should give you some options to create a new one. Can you? Add VPN configuration, and what can you choose from? When you do add VPN configuration, there's three to choose from at the top, right?
These three. L2TP, PPTP and IPsec are the three technologies your phone supports. Different operating systems may support different ones, but mobile phones usually support these two and IPsec. Windows, Linux and OS X on desktops will usually support the same ones. OpenVPN is not so widely built into operating systems. You need to install an extra client to use that. So there's the limitation, but OpenVPN can be considered slightly better in performance than the others. So there are some trade-offs between which one's best. So to finish, the last thing: we want to meet this final missing objective. Make it such that you can communicate with a server without anyone knowing who you're communicating with. And that leads us to Tor, the widely used technology to support such a feature. Tor means The Onion Router. It's not from Marvel Comics. The onion router is one of the original definitions. The idea is to use multiple servers, like multiple VPN servers. We'll see the picture shortly. And to encrypt the packet multiple times. And as you send it through, each of those special servers along the network will decrypt. You get the concept: as you send the packet through, each server decrypts, removes the outer layer, and so on. It's like an onion. If you have an onion and you peel the onion, you remove the layers until you end up with the internals. What can we say about it before we see the example? So Tor was designed for this anonymous communication in the internet. And the way that it works is that it uses what's called Tor relays. Similar to our VPN server or our proxy server, we have a different type of server and it's called a relay, a Tor relay. But we'll not use one, we'll use multiple. And Tor relays generally can run on any computer. And what you'll do is you'll choose a path of relays to go via. So instead of going to one VPN server and then on to the real server, you'll go via multiple Tor relays on the internet.
Let's see the picture and see if we can explain the concept through the example. So you want to communicate with the server. And in the internet, let's say here out in the internet, I've drawn T1, T2, T3 and even E, they are four Tor nodes, Tor relays. E is a special case, we'll talk about that. So T1, T2, T3 and E are Tor relays. And in addition, your computer acts as a Tor node. So that starts the connection. There's a lot of details, but we'll try and capture the main idea. What happens is your computer using Tor, it chooses a set of relays to go via. So there may be many relays out on the internet. And in this specific example, it's chosen to use one, two, three and E. Now be careful, in this diagram, think of these Tor relays as just computers anywhere on the internet. They're not connected directly to each other. So between T1 and T2 is maybe many other routers. T1 may be in China, T2 in France, T3 in Singapore, E may be in the US. And between them is another portion of the network. So although my diagram doesn't show that, the network's much bigger than shown here. So what happens? You create a packet and it's structured like this. You know you're going to send via T1, T2, T3, E and then to S. So it's gonna take this particular path of relays. So you create a packet. It has the original data that you want to send to the server. And it has a special header that includes the address of E. And all of that is included in another packet with a special header saying T3. And then that has T2. And then this part here is the IP packet, the IP datagram. So you're going to send this from you to T1, the first Tor relay. And the internals of this packet are encrypted multiple times. The idea is that each part is encrypted using a different key. So let's see how that will work. You send this packet to T1. Does the firewall block it? Destination is T1, firewall only blocks the destination to S so we bypass the firewall, that's easy. It gets to T1. 
T1, which is a Tor relay, knows this is a special Tor packet. And what it does is it removes the outer header and then decrypts. It can do that because it has the key. It's a little bit complex, but before all this happens, there's a key exchange between each pair of nodes, T1 and you, and so on. So T1 decrypts this outer packet. And the packet was encrypted such that T1 can see that the next destination is T2. So if you decrypt this part, you see that, okay, I need to send to T2. The rest is still encrypted. So when T1 receives the first packet, it decrypts and sees that the next relay is T2. But it cannot see the rest because it's encrypted with a key that T1 doesn't have. But T1 knows it needs to send what it has on to T2. It sends on to T2, T2 receives, decrypts, and sees that the next node is T3. T2 sends to T3, T3 will receive and decrypt and see the next node is E, another Tor relay. E will decrypt and see there's the data and send that in a normal IP packet. All of this up to E is in a special Tor packet, and it actually uses SSL. This last one is the normal IP packet sent with source address E, destination S, with, say, the HTTP GET request to the server. So first, there are a set of Tor relays between you and the server. You choose the relays to go via. Usually it's three. Here I have four. And you create a packet that's encrypted multiple times such that the relay that you send it to can decrypt, but they can only see who's next in the path. So T1 only sees that this packet must go to T2. T1 doesn't know what the data is. T1 doesn't know that the remaining nodes in the path are T3 and E. When T2 receives the packet, it decrypts. T2 knows that it came from T1. When it decrypts, it knows it needs to send to T3, but T2 will not know that after T3 is E because it'll be encrypted such that T2 cannot see that. And T3 receives from T2. It knows the next one is E but it doesn't know anything about the data because that is also encrypted.
When E receives it, of course it came from T3, E knows that the destination is the real server to go out onto the internet. The result is that if you look at any particular relay, let's say T2, T2 knows the packet came from T1. It knows it's going to T3. T2 doesn't know who the final server is. Doesn't know the exit node. Just knows T3 and T1. In fact, T2 doesn't even know it came from you. It just received a packet from T1. It doesn't contain anything about you in it and it decrypts and sends it onto T3. And that applies at each of those relay nodes that they only know about their partners, the one just before them and the one just after. And the result is that they don't know who's the original source or who the final destination is. T1 knows it came from you and is going to T2. They don't know it's going to the server. In fact, T1 doesn't even know if you are the original source or maybe you have forwarded it on from someone else. When T1 receives the packet from you, it could have been that someone sent it to you and then onto T1. T1 doesn't know either way. So T1 has no way to know who is that first node. Similar T2. It came from T1, it's going to T3 but the others it doesn't know about and so on E, the exit node knows it came from T3. It doesn't know it came from you. It does know it's going to that website but it doesn't know it's you that's contacting the website. The result is that the entities through the network do not know who the original source is and who the final destination is. No one knows both values and that provides our privacy of endpoints. E is a special relay. So these four are all relays but E is what we call an exit node. It's the point where the packet exits the Tor part of the network. So we're using the protocol that Tor defines to communicate between U, T1, T2, T3 and E but once it gets to E, then it gets the original data which is the HTTP get request and sends it across the internet from itself as a source and the destination is S. 
So E knows it's a request to a web page and it knows the website, but it doesn't know who sent that request. So E is an exit, it's the exit of the Tor network. Exit onto the real internet. How does it know the destination? The destination was encrypted in the packet. I haven't shown it here, okay? I don't have space to show the entire packets, but included in each of these packets will be who's the next point in the path. So the idea is that T1 will know the next point is T2, T3 will know it's E, and E will know the next point is S. So the packets will include that information. They are special packets that use SSL to do this encryption between the Tor nodes. Can we draw it again? The way that the packets are structured from you is that there are shared keys between you and the different relays. So you will have a key that is shared with T1, and T1 will also have that. So you will encrypt something and T1 will be able to decrypt it because you have a shared secret key. But you will also have a key shared with T2, a different key. This is set up beforehand, an exchange of keys with each of those nodes. So the source node has keys shared with the different relay nodes, and the source node in this case creates the packet such that T1 will be able to decrypt some parts, only the parts it needs to know that the next in the path is T2, and T2 will be able to decrypt the parts of the packet it needs to know that the next in the path is T3. And similarly, E will be able to decrypt the packet and see the original HTTP GET request packet, including the data. If we look at the packet, say from the perspective of T2, just as an example, there's the data, the HTTP GET request, which will indicate the destination S. This is the packet seen by T2, but all of this is encrypted, meaning once T2 gets this packet it knows it needs to send to T3. But it doesn't know what T3 is going to do with it because it's encrypted.
And similarly, at the previous step, T1 would have got a packet which said it needs to send to T2, but not who comes beyond T2, because the different parts of the packet are encrypted with the different keys. So when T3 receives this packet, it will only be able to decrypt the portion that was encrypted with key KT3, which is the portion that shows just the next node. So the packet that T3 will receive contains data, S, E, but when it decrypts, its key only lets it decrypt and see that the next node is E. The rest remains encrypted. T3 knows to send it to E, but it doesn't know who the server is nor what the original data was. The way that the packet was originally created at U was such that each relay could only decrypt the part that it needs to see to send it on to the next node. And similarly, E would decrypt and realize it needs to send to the server, and it sends the data to the server. The end result: no relay knows both the original source and the final destination. So they don't know who is communicating with whom. The only limitation in this case is this portion of the network from E through to S. This exit node is really where the Tor part finishes. So we're using Tor between U and E, but then from E through to S, it's just sent across the normal internet. It's an HTTP GET request and of course the data is not encrypted. So anyone between E and S, including E, can see the data. So the exit node can read the data. It knows what was in the request, and others between E and S can see the data. So when the exit node receives this, it decrypts and has the final packet, and it's not encrypted at that point. That's what E sends on to the server, just as a normal IP packet. That was with HTTP. If we move to HTTPS, effectively that data is encrypted. So those between E and S can no longer read the data. In addition, E, the exit node, cannot see the data. It knows it's going to the server but it doesn't know what the data is.
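The layered packet described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Tor actually builds its packets: the cipher here is a toy XOR stream derived from SHA-256 (an assumption chosen so the example needs only the standard library, and it is not secure), and the names `wrap`, `peel`, `toy_crypt` and the key values are hypothetical.

```python
import hashlib
import json


def keystream(key: bytes, length: int) -> bytes:
    """Derive a byte stream from a key. Toy construction for illustration, NOT secure."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def toy_crypt(key: bytes, data: bytes) -> bytes:
    """XOR data with the keystream. Applying it twice with the same key decrypts."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(path, keys, request: bytes) -> bytes:
    """Build the layered packet at the source U.

    path is the ordered list of hops, e.g. ["T1", "T2", "T3", "E", "S"].
    keys maps each relay name to the secret key U shares with that relay.
    """
    packet = request
    # Work backwards: the innermost layer is for E, the outermost is for T1.
    for relay, next_hop in reversed(list(zip(path[:-1], path[1:]))):
        layer = json.dumps({"next": next_hop, "payload": packet.hex()}).encode()
        packet = toy_crypt(keys[relay], layer)
    return packet


def peel(key: bytes, packet: bytes):
    """What one relay does: remove its own layer, learn only the next hop."""
    layer = json.loads(toy_crypt(key, packet).decode())
    return layer["next"], bytes.fromhex(layer["payload"])


# Hypothetical keys, assumed to have been exchanged beforehand.
relays = ["T1", "T2", "T3", "E"]
keys = {name: f"secret-shared-with-{name}".encode() for name in relays}
path = relays + ["S"]
request = b"GET /abc.html HTTP/1.1"

packet = wrap(path, keys, request)
hops = []
for relay in relays:
    next_hop, packet = peel(keys[relay], packet)
    hops.append((relay, next_hop))
    print(f"{relay} learns only that the next hop is {next_hop}")

# After E removes the last layer, the original request is left.
print(packet)
```

Each relay can remove only the layer made with its own key, so T2, for example, sees "next is T3" and an opaque payload. With plain HTTP the request is readable once E removes the final layer, which matches the limitation between E and S described above.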
So we've really achieved our aims in this case of keeping the data confidential between you and the server, preventing others from knowing who's communicating and even preventing the server from knowing it's you that's contacting the server. In this case, the server receives a packet whose source is E. It thinks it's the exit node contacting the web server, but in fact the request originated from you. So this is providing confidentiality of the data as well as privacy of actions. This is the best of the three that we've looked at in terms of achieving those security objectives. With a VPN, we trust the VPN server. If we compare with HTTPS over a VPN, the VPN server knows who you're communicating with, so you must trust that server. Similarly with the web proxy, the proxy knows who you're communicating with and can even read your data. So a little more trust is placed in the server with a web proxy. But with Tor it's really like having multiple servers, multiple VPN servers, T1, T2, T3 and E, and the way that the original source encrypts the packet is such that we don't have to trust any of them. Each one will only know the relay immediately before it and the relay immediately after it. They will not know both the original source and the final destination. Did I miss something on this slide? There are a lot of details about how to exchange the keys and the protocols used for communicating, and it uses SSL/TLS to communicate between each pair of Tor nodes, the relays. The exit nodes have a special role in that they are where the data comes out, back onto the normal internet. And the source node chooses the path, and there are some issues with how to choose the best path and so on. But in terms of our security objectives, Tor meets those. What's the problem with Tor? These relays may be anywhere in the internet. So they may be chosen such that they're not near each other.
T1 may be in the US, T2 is in Germany, T3 is in Singapore, you are in Thailand, E is in Slovakia and the web server is back in the US. So to go from your computer in Thailand to the server in the US, you're basically going all the way around the world to get that packet there. So performance is a major issue in this case. But that's the trade-off that you need to make to get that level of privacy. The header? Yes, okay, the packets can be larger than normal packets because we need to attach more information. Yes, that degrades the performance a little bit, but not as much as sending via the other side of the world just to reach a particular location. So yes, the extra encryption plus the extra headers are a performance penalty, but the main performance penalty is the fact that the path that we take between you and the server is, you can almost think of it as, a random path through the world. It's not the best path from you to the server. It also relies upon these nodes to forward a lot of traffic, and these nodes may just be someone's home computer. So if they are slow, then again, performance slows down. You can use Tor on your computer quite easily. I have it running on mine. It's running as a server on my computer which will accept connections from my web browser. So it's like a proxy running on my computer. So I can set up my web browser, Firefox, to use a proxy. The proxy is running on my computer, localhost. It's called a SOCKS proxy and we'll not cover what that means, but it's a way for applications, usually web browsers, to contact a proxy on your local computer or even a remote one. And the port that Tor uses is 9050. So there's Tor software running on my computer, waiting for someone to send to it. And when I set this up for my browser, my browser will send to the Tor software on my computer, and the Tor software on my computer will then establish a path, or a connection, through multiple relays to an exit node and send the data via there to a server.
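The browser setup just described (a SOCKS proxy on localhost, port 9050) can also be done from a program. The following is a minimal sketch, assuming Tor is already running locally on that port and that the third-party `requests` library is installed with its SOCKS extra; the helper names `tor_proxies` and `fetch_via_tor` are hypothetical, not part of Tor or of `requests`.

```python
def tor_proxies(host: str = "127.0.0.1", port: int = 9050) -> dict:
    """Proxy settings that point an HTTP client at the local Tor SOCKS proxy.

    The socks5h scheme asks the proxy to resolve hostnames, so DNS lookups
    also go through Tor rather than being made directly from this computer.
    """
    url = f"socks5h://{host}:{port}"
    return {"http": url, "https": url}


def fetch_via_tor(url: str, timeout: float = 30.0) -> str:
    """Fetch a page through Tor. Assumes Tor is listening on localhost:9050."""
    # Third-party dependency, imported here so the helper above works without it.
    # Install with: pip install "requests[socks]"
    import requests

    response = requests.get(url, proxies=tor_proxies(), timeout=timeout)
    response.raise_for_status()
    return response.text


print(tor_proxies())
```

Calling `fetch_via_tor` on a "what is my IP address" style page should report the address of the exit node rather than your own, which is the same check done in the browser demonstration.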
Let's hope it works. So I just want to first see what my IP address is, and my IP address is a 23 address, and I don't even recognize this country, anyone? I don't know. I think it's somewhere off the coast of Spain, is it? So there I am. What this is identifying is the exit node. This is the address of E. So the server is the web server for "what is my IP address". It's received a packet and the source was set to this 23 address. So the web server thinks it's communicating with the exit node, but it's actually coming from me. And I've got some software that will show us some details of the connection. This just shows us the Tor statistics, that is, the packets sent and received via Tor. So if I open another page, the packet rate is going up and down there because it's sending packets via Tor. And one thing that we can see is some connections. So in fact, the first one I think is the path that mine's using. So the 203.1301.209.66 is really my computer. It's the public IP address for SIT. And it's going via somewhere in Germany, one relay node, like T1, then another one in Germany, another relay node, and then there's one exit node in this case. So there are two relays and one exit node. The exit's in the US in this case. So there we've located the exit node. So that's the path that I'm using at the moment. And Tor will change that path over time; usually on the order of every 10 minutes, it will choose a different path. We may not see it now, but here we go. So I selected to get a new identity, which effectively changed my address. So now I'm coming from Switzerland from the perspective of that website. So I'm using a different path in this case. I have a different exit node. One last thing to demonstrate before we summarize. In addition to allowing access to normal websites, you can use Tor where there are no exit nodes but the websites or the servers are running as Tor nodes.
So you don't have to go out onto the internet, and they're called hidden services. Hidden services, sometimes referred to as the dark web. It's where servers are running on Tor nodes. So you can't even identify who the server is. The server is hidden as well. And there are different addresses that allow you to contact hidden services. Let's try one. They're called onion addresses. I don't know if this one will work. Usually they look random. They're created from some hash value. If that one doesn't work, we'll try another one. Okay, I got there. So that address, the special .onion address, is contacting from my node not a server on the public internet but a server running on a Tor node. So even the server is maintaining some anonymity in this case. It's remaining anonymous. That one is a web search engine. To finish, a comparison of those four approaches: using just normal internet access, basic, with HTTP or HTTPS, using a web proxy, using a VPN or using Tor. It's certainly useful to know, in life I think, but you don't need to remember it for the exam, okay? It's just a summary of what we've covered, and from the previous things we've gone through you know it already. So let's go through it and finish. What it's showing, though: data secrecy, is our data encrypted? Who can see our data? Bypass firewall, did we get past the firewall? We saw that in most cases it worked. Network privacy, can others see that it's you communicating with the server? Server privacy, does the web server know it's you that's contacting it? Log analysis, which I think we'll not talk about, is whether someone can obtain the logs of the servers and then do some later analysis to find out what happened.
And the last three are some convenience issues. Remember, we said we want it cheap, we want it easy to use, and we want it to perform well. So we'll compare. We'll not go through all of it, but with basic, without any of the techniques, we have no secrecy. We're blocked by firewalls, and there's no form of privacy. HTTPS gives us some confidentiality of data. With a web proxy and HTTP, our data's not secret, but we can bypass a firewall. We hide from others, they don't know it's you contacting the server, although the proxy knows it's you contacting the server. So there's some form of privacy, except you must trust the proxy in this case. So I put it as a question mark: you must trust the proxy. The server doesn't know it's you in this case. With HTTPS, you do get some data secrecy, but still the proxy can see your data. You bypass the firewall, and only the proxy knows it's you communicating. The server doesn't know it's you, but one thing we haven't mentioned is that whenever you contact a website, things like cookies and other parts of the web interface may identify you. We're just looking at the IP addresses. So especially using HTTPS, usually you'll log in with HTTPS and then the server can identify you. With a VPN, let's go straight to HTTPS. Data is encrypted, and we can bypass a firewall. No one knows it's you communicating, except the VPN server. You must trust the VPN server. Again, the server doesn't know it's you. With Tor, with HTTPS, the data is encrypted and we bypass the firewall. No one knows it's you communicating, and again, the server doesn't know it's you, unless you use some form of cookies or a login with HTTPS, and then the server may be able to identify you. So you need to be careful. The convenience issues, comparing the last three: Tor is free, you can use that for free. VPN servers are usually slightly more expensive than proxy servers, on the order of hundreds of baht per month if you want to rent or use a VPN or proxy server. In terms of ease of use, a proxy server is easy.
You just use your browser to visit the proxy website. With a VPN, you need to set it up. You saw it on your mobile phones: you need to set it up by entering the configuration of the VPN server. With Tor, you need to install some software. It's not too hard to install, but you need to do something. Performance: with a proxy server, it really depends a lot upon the proxy. Everything goes via the proxy. If the proxy is slow or in a bad location, your performance can degrade. Similarly with a VPN, it depends upon the VPN server. With Tor, it depends not just on one server, it depends upon all of those nodes. And that's usually much worse than the others in terms of performance, because the nodes can be anywhere. With a VPN or proxy, you may choose the location of those servers. So that gives a final summary of the privacy techniques. So we've gone through three techniques to overcome some of the limitations of using basic web browsing in the internet.