We've just talked generally about security mechanisms, or the approaches to implementing security in the internet. Often when we say security, we think about encryption. We want no one to read our data, so we encrypt. This topic on internet privacy options covers similar issues, but also addresses the other security requirement of privacy. That is not just no one seeing your data, but also no one knowing who you're communicating with. That's a slightly different issue. So we'll look at several options for how you can communicate across the internet and have your data secure, so no one can see the data, and in some cases no one can see who you're communicating with: privacy. As we go through the slides there are a number of different acronyms, so that's just for reference there; I think we know them or we've seen them already. This set of slides gives some background on what the internet is. I think you already know this, but there are some things that we'll need to recap to understand the privacy options. We know the internet is made up of a collection of networks. We know there are many different internet applications. Here's a picture of the internet. We can think there's you at one end point, and if we consider web browsing as the example (all the examples I think we'll talk about are web browsing), then you want to access a web server. We know we normally use HTTP to do that. We send a request to the web server for a web page, and the server sends back that web page. Of course, the internet is much more complex than what's shown here.
Another way we could view that, to look into some of the details inside that cloud, is this. You're at home, you have your home computer and a home router (these circles represent routers), and you connect via the LAN or the Wi-Fi to your home router. Your home router then has a cable or a link going through your telephone network or your cable network to your internet service provider, your ISP. So your home router, this first circle, connects into the router of your internet service provider. Think of this portion as the network operated by your ISP, the company you pay for internet access. That ISP may connect to another internet service provider, inside Thailand or outside the country, which connects to others and so on. So when we connect to a web server, our data may traverse multiple internet service providers across the globe. In general, your ISP connects to another, they send the data via those routers to some other ISP, and eventually to the ISP that the website operator uses, where the server is located. We're going to use this diagram to illustrate how different security techniques provide privacy. Any questions about the picture? We'll see it come up again, so you need to understand what everything is showing here: your computer, the web server we want to communicate with, and the circles, which are just routers. Another thing that we'll look at is that sometimes an internet service provider will use a firewall. The firewall, we know, is used to control the traffic going through it. So generally, the firewall may be used by your ISP or another ISP to control what you can access outside of your home network and, to some extent, what can come in to you. This is a firewall provided by your internet service provider or another ISP; not your home firewall, but one run by the ISP. And sometimes that firewall may be configured to block you from accessing some websites.
We know how that can be done. We've used iptables, for example, to block access to different services. We'll look at how that can be bypassed and what the issue is there. Before we go through some examples, the other thing that we need to remember is that we have IP addresses. Our computers have IP addresses. I will hide some detail about IP addresses: in practice, your computer at home usually has an IP address which is only relevant locally inside your home network, and then you have a public IP address which is relevant for the rest of the internet, because technologies like network address translation and private addresses are usually used. But to avoid having to explain that, even though you may have seen it in other subjects, we'll generally assume in all this discussion that every computer has a unique IP address in the internet, and that's a public IP address. We know about domain names; domain names map to IP addresses. It's also possible to go in the reverse direction. DNS generally maps from domain to IP, but given that database of mappings, I can do a reverse DNS lookup: given an IP address, what's the domain name? So if I know an IP address, I could find out the corresponding domain name. We know how IP works. That slide is just a reminder if we need to look at the structure of an IP packet. Web browsing: you know how web browsing works. We've covered that multiple times now, HTTP requests and responses. Okay, we'll not go through that. So we want to look at some of the security requirements in the internet, and especially privacy options. Internet security covers many different things. It's not just about encrypting data. If we go back to our very first topic in this course, I think, we talked about CIA: confidentiality, integrity, and availability. Confidentiality is keeping your data secret. When we send our data, we don't want anyone else to see that data.
The technique we use is encryption. Authentication is about checking that the entities we're communicating with are who they say they are. We've covered some forms of authentication: passwords for human authentication, and keys for authentication of computers. Data integrity is about making sure that the data is not modified or, if it is, that we can detect that, and we can use encryption or digital signatures to do that. Privacy I will define as keeping your actions secret. Privacy here, we'll say, is different from confidentiality. Confidentiality is keeping your data secret. Privacy is keeping your actions secret. The actions may be who you're communicating with and at what time you're communicating. So not necessarily the content of the communications, but how you're communicating. How do we achieve privacy? We haven't really talked about that. We know how to achieve confidentiality: encrypt the data. So we'll look at some aspects of privacy. Sometimes these words are used to mean the same thing. Sometimes people will say confidentiality and mean the same thing as secrecy or data privacy. But for this topic, we'll say privacy means privacy or secrecy of your actions, and especially the secrecy of who you're communicating with. So we'll focus on keeping the actions private and, of course, keeping our data private, confidential. That's the focus for this topic. Why do we want to do that? Well, I think you can think of many reasons, so these are just some examples. Why do we want to keep data confidential? If we're running a business, so our competitors cannot steal our ideas and overtake us; if we're accessing a bank, so that people cannot steal our money; and so on. Why do we want to keep our actions private? Well, there may be different reasons. I'm sure you can think of some. If you're working at some company and you start looking for a new job, maybe you want to make sure that your employer doesn't know you're looking for a new job.
You don't want them to know that you're accessing some jobs website. Okay? Maybe you're working for a company or an organization, you've discovered they're doing something illegal, and you want to report that. You want to do that without someone knowing you're reporting it. Maybe you're accessing websites to learn about some medical condition and you don't want people to know that you have that medical condition. So, again, this may be about keeping the information about which websites you're accessing private. Not necessarily the content that's being exchanged, just the fact that you're accessing a particular website. So there are reasons why we'd like to keep data confidential as well as keep our actions private. In this topic, when we talk about keeping our actions private, we'll mainly talk about keeping private which websites I'm accessing, so that others cannot learn which other computers I'm communicating with. That's what our aim will be. Let's have an example. Before we look at the requirements, first, something about identifying users. I open my browser and go to a website, whatismyipaddress.com; I think that's the domain. I visit this website from my browser here at SIT, and this website is located in the US, I think. What does this website learn about me? Well, it learns my IP address. We know that because when we send a packet, an IP datagram, from browser to server, it must contain my source IP address. So the website learns my source IP address: 203.131.209.66. If you have your phone open, then maybe you can visit the same website. What does it identify you as? Try whatismyipaddress.com and tell me what IP address you have. Here's a chance to use your phone in class for a valid reason. The same? I think if you go to this website, you'll be identified as having the same IP address as me. Is that right?
Some may not be identical; some may be similar if you're using the SIT internet. I'll explain why. So those that are using the SIT Wi-Fi are getting the same IP address. But we know that our computers have different IP addresses. What's happening here? Why is everyone being identified with the same IP address at this website? What's the technology that's doing this? Maybe Dr. Comet's taught you this. If I check with ifconfig, and you check in your setup, you'll see we have different IP addresses. Internal to SIT, we actually have what we call private addresses. Mine might be 10.6.211. Yours will be different internally. But when we send something outside of SIT, a public IP address is used instead. The way that it works is that SIT has one, or maybe several, public IP addresses which are used for everyone. This website identifies not the individuals but everyone as being the same IP address. That is because of the use of what's called network address translation. This website knows my IP address and it also knows where I'm from, Thammasat University. So it knows my ISP because, again, IP addresses are allocated to ISPs, and there's a public database that records which ISP uses which range of addresses. So that's how this web server knows to some degree who I am. It knows I'm from Thammasat University. It doesn't know I'm Stephen Gordon yet, because although I have a different IP address than others here, we're all reported as the same. But if someone wanted to find out who the exact person accessing this website is, they know it's from TU, and then maybe with the help of the computer center at TU they'd find out that the person using that IP address at this point in time was me. So it's not too hard to then provide that further mapping of who inside TU accessed that website at that point in time.
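To make the network address translation idea a bit more concrete, here is a minimal sketch in Python. It is a toy model, not how any real NAT device is implemented: the class name `Nat`, its methods, and the addresses (taken from documentation-only ranges) are all made up for illustration. It shows two things from the discussion above: the website sees one shared public address for everyone, and whoever holds the NAT log can map a connection back to the individual.

```python
from dataclasses import dataclass, field


@dataclass
class Nat:
    """Toy source NAT: many private hosts share one public IP address."""
    public_ip: str
    next_port: int = 40000
    # (private_ip, private_port) -> public_port
    table: dict = field(default_factory=dict)
    # log entries: (time, private_ip, public_port, destination)
    log: list = field(default_factory=list)

    def translate(self, time, private_ip, private_port, destination):
        """Rewrite the source of an outgoing packet and record the mapping."""
        key = (private_ip, private_port)
        if key not in self.table:
            self.table[key] = self.next_port
            self.next_port += 1
        public_port = self.table[key]
        self.log.append((time, private_ip, public_port, destination))
        # This pair is the only source the website ever sees.
        return (self.public_ip, public_port)

    def who_was(self, time, public_port):
        """What the network operator can answer later from its own log."""
        for t, private_ip, port, _destination in self.log:
            if t == time and port == public_port:
                return private_ip
        return None


# Hypothetical addresses: 203.0.113.10 is public, 10.0.0.x are private.
nat = Nat(public_ip="203.0.113.10")
lecturer = nat.translate(100, "10.0.0.5", 5000, "website")
student = nat.translate(100, "10.0.0.6", 5000, "website")
print(lecturer[0] == student[0])       # same public address at the website
print(nat.who_was(100, lecturer[1]))   # the operator can still tell who it was
```

The website alone cannot separate the two users by address, but the organization running the NAT can, which is why we go on to assume an IP address effectively identifies the user.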
This is an example showing that web servers can identify who the web browsers are. A server knows who is contacting it, and we'll assume in the further discussion that, without other mechanisms, it will know who is contacting it. We'll come back to the requirements. I want to talk about the assumptions before that. We're going to talk about how we can provide different security mechanisms, and when we do that we'll make these assumptions. When we use encryption we'll assume someone cannot break those encryption algorithms. If we use encryption then the data is confidential. The path that we take between you, your browser, and the server may change; that, I think, will not be so important. We'll assume that all computers and users can be uniquely identified by IP address. Now, our example just showed us that that's not entirely true. The website didn't uniquely identify me, it identified someone inside SIT or TU, but with a little bit of further work they could map it down and uniquely identify me. So we'll assume that an IP address uniquely identifies the user. And if internet service providers do block access to particular websites using a firewall, then they do so based upon IP address. That is, if we jump back to our picture here, say my ISP wants to block me from accessing this web server S. If you work for the ISP and you had to set up that firewall, what would you do to block the user from accessing that web server? You'd use iptables or similar software to add a rule that says: if the destination equals S, the IP address of S, and maybe the port number is 80, then drop the packet. So you could create a rule at that firewall that says if someone sends a packet with destination equal to S, then drop it. Let's assume that if a firewall does try to block something, it does it based upon the IP address. It's the simplest way to do it and the most common way to do it. Let's now go to the requirements. Some requirements that people would like to achieve with respect to security in the internet are listed here.
One: I don't want anyone but the web server to be able to read my data. If I contact a website, I don't want anyone else to read the data I transfer to that website and that it sends back. That's confidentiality, so we'd often like that. Another thing I would like is that sometimes I don't want others to know that I'm communicating with a particular server, or who I'm communicating with. I don't want others to be able to identify that I'm accessing www.facebook.com or some other website. That's about privacy of actions, and we may break that into two parts. I don't want others to be able to identify it when I'm actually communicating with a server, during the communications; that may be one requirement. A slightly different one is that I don't want them to be able to do it during the communications as well as after the communications take place. That is, I access a website today, and then in one month's time someone's trying to learn whether I accessed that website. Can they do that? That's this requirement. If I want this requirement, I want it such that they cannot in one month's time come back, check all the logs of the website, check the logs from the internet service provider, and find out whether I accessed that website one month ago. So we'll distinguish between whether they can monitor in real time and see I'm accessing it, versus whether they can find out what happened in the past. Another requirement that I may want is that I don't want the server to identify me. I want to access the website but I don't want the operator of the website to know it's me. And another thing is that sometimes I want to be able to bypass blocks at firewalls. A firewall may stop me from accessing some website and I want to bypass that. So these are some security requirements, and the reason we're covering them is not because I want to do all these things, like bypass firewalls, but just to learn about how the security requirements can be met and how the blocks can be established.
I may not want all of them at the same time. In terms of convenience, I'd like the technology we use to be free and easy to use, and I want it to perform well. So we'll compare some different solutions with respect to security and convenience requirements, and look at how well different technologies do with respect to those requirements. In the pictures that follow this notation is used: we'll see U and S, you and the web server; we'll see V, a VPN server, shortly; Tor we may not see today. So let's go straight to the first case. Here you're trying to access a website using HTTP, normal web browsing, and the firewall is configured with a rule that says: anything that comes in whose destination is S, block it, drop that packet. If the firewall is configured like that, can you access the website? No. You cannot access the website. So using just HTTP, if the firewall is enabled then we are blocked from accessing that website. There's nothing special about that, but we'll compare and see some other solutions where it will not be blocked even if the firewall is enabled. So that's the simplest case: we cannot send anything to S out of our ISP. What if the firewall is not blocking? The firewall lets you access the website; there's no rule there. So we send our data using HTTP. In the packet we send, the source address is mine, the destination is that of the server, and the data is not encrypted. With respect to some of the security requirements, what happens? First, the firewall, the device here run by the ISP, can see the server you're communicating with. This device sees that this packet is from you to this website, so the firewall could monitor and see who you're communicating with. If they wanted to, the firewall, or the ISP running the firewall, could read your data. So your data is not confidential and your actions are not private with respect to the firewall, or more generally the ISP running the firewall.
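The two plain-HTTP cases just described can be sketched in a few lines of Python. This is only an illustration under our stated assumption that the firewall decides on the destination address (and optionally the port); the `Packet` class, the `firewall` and `observe` functions, and the addresses are hypothetical names chosen for this example, not a real firewall API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    src: str        # source IP address
    dst: str        # destination IP address
    dst_port: int   # destination port, 80 for HTTP
    payload: str    # application data, unencrypted with plain HTTP


# Hypothetical addresses from documentation-only ranges.
U = "192.0.2.10"     # you
S = "198.51.100.7"   # the web server


def firewall(packet, blocked_dst=S, blocked_port=None):
    """Decide using header fields only; blocked_port=None means any port."""
    if packet.dst == blocked_dst and blocked_port in (None, packet.dst_port):
        return "DROP"
    return "ACCEPT"


def observe(packet):
    """What any device on the path learns from an unencrypted HTTP packet."""
    return {"who": (packet.src, packet.dst), "data": packet.payload}


request = Packet(src=U, dst=S, dst_port=80, payload="GET /index.html")

# Case 1: the rule for destination S is in place, so the request is dropped.
print(firewall(request))

# Case 2: no matching rule, so the packet passes, but the firewall (and
# anyone else on the path) sees both the endpoints and the data.
print(firewall(request, blocked_dst="203.0.113.99"))
print(observe(request))
```

Either way, with plain HTTP nothing is hidden from the device in the middle: it can drop the packet, or let it through and read all of it.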
If we would like the data to be confidential, or we'd like our actions to be private, in this case they are not. Also, anyone else out on the internet between your ISP and the server can see who you're communicating with, because if they intercept this packet they can identify from the source and destination addresses that it's you talking to this website, and they can see the data. And the server knows it's you, because the source address in the packet that the server receives is your IP address. So none of the security requirements are met in this case. This is the simplest case: just using HTTP, there is no security. So that's the simple one. What if we move to HTTPS? We know about HTTPS; we've studied how it works. Let's say the firewall is enabled. Does HTTPS help us? Does it allow us to bypass the firewall? No. With HTTPS the IP datagram still has your address as the source and the server's address as the destination; it's just that the data is encrypted. So if the firewall is blocking everything to destination S, your packet still doesn't get through. What I'm saying here is that HTTPS and plain HTTP are both unable to bypass the firewall. We said one of our requirements may be to bypass the firewall. These don't work; they don't bypass the firewall. What about the other requirements, if we use HTTPS and the firewall is not enabled? There's no rule to block you, but the firewall can still monitor the packets going through. The firewall can still see the server you're communicating with. The source and destination addresses are not encrypted. They cannot be: if we're sending an IP datagram, we need them to deliver it across the internet. So the firewall can still know who you're communicating with. In this environment, if I don't want others to be able to read my data, we can achieve that using HTTPS.
Others in the internet (these are the routers, maybe other internet service providers), similar to the firewall, can see who's communicating with who, but they cannot see the data; it's encrypted. The server still knows it's me: when the server receives the packet, the source address is your address. So the point is that HTTPS can provide confidentiality of data, but it does not provide secrecy of your actions. It doesn't hide who is communicating with who. Other techniques are needed for that. So that's stuff we already know. Any questions about that before we move into things we don't know? Okay. What we're going to do is look at some other techniques against these requirements. Can the firewall see the data? Can they identify who's communicating? Can others in the internet do that? Can the server identify me? One technique to try to help with security is called a web proxy. Quite simply, it's a website on the internet that you send your request for the real website through. That is, you access the web proxy and then you tell the web proxy to access this other website. That website, the final destination, will send the response back to the proxy, and the proxy will send the response back to you. Let's see an example. Sometimes you want to read the news. Let's see if we can find an example. I want to read the news, so I go to Google News and I click on the link, and let's see if this works. We get this. What's this? What happened? Anyone want to guess? Blocked by a firewall, by some internet service provider. What's happened is that I've tried to access some server (I'll show you which one in a moment), but my request has got to some firewall, either run by my ISP or maybe by an ISP beyond that, which has determined that I cannot access it. So it's blocked, and the firewall has also sent back a response saying I'm not allowed to. It was only a news website, nothing too dangerous. Now, some web proxies. Here's a random web proxy I found, that is, a website
where you can type in a URL and that proxy will send the request to the website for you. The website that I tried to visit here was a newspaper in the UK called the Daily Mail. So the Daily Mail is blocked. If I copy the URL of the Daily Mail and type it into the proxy, let's hope this works. It's getting close, and now I get to the website of this newspaper. It's not a very interesting newspaper, but what happened is that the request to the destination website was sent from the proxy. I send a request to the proxy, which is not blocked, and then the proxy sends the request to the final destination web server. Then the response comes back to the proxy, and eventually what I see is the website of the Daily Mail. So this has effectively bypassed the block of the firewall. If you try to visit that same website without going via the proxy, you will not get access. How does it work with respect to HTTP? Like this. Here's your computer, here's the proxy server, and here's the web server you want to access. What happens is that you send an HTTP request to the proxy server; in this case it's not a GET but a POST request, with a form field. When I use this proxy, if we go back, here's the form field. In this field I type in the destination URL. When I click Surf here, my browser sends a request to the proxy server, and one of the parameters in the form is the URL of the final server. We can visualize it like this: it's an HTTP request visiting the proxy, and a parameter inside there is the URL of the eventual destination, the server. The proxy gets that and now creates another HTTP GET request to the server hosting the page that I want to visit. That web server replies to the proxy: because the web server received a request from the proxy P, it sends a reply to the proxy P including the web page. When the proxy gets that web page, it creates a reply going back to me which contains that web
page. It may also contain some advertisements or some information about the proxy in there. So that's how it works from an HTTP perspective: the proxy acts as a server from my perspective and as a client from the destination website's perspective. If we use such a web proxy, how does that help with our security requirements? Let's see a proxy with just HTTP. The proxy is this blue router in the middle; it is some website that I'm going to use to access the eventual server S. We can think of it like this: I send a packet, I'm the source, I'm sending it to the proxy, and the data inside identifies the real web server I want to access. The firewall has a rule saying block anything which has a destination IP address of S. This packet has a destination IP address of P, so it will not be blocked. Typically, to keep them simple and fast, firewalls don't look inside the data. They just look at the destination address, because it's much, much faster to look at the destination address than to look through the data for server S. So even though the address of the server is inside the data, the firewall would not look at that. It just looks at the destination IP address, which doesn't match S, so it lets this packet through. The packet gets to the proxy. The proxy now creates another packet, from the proxy to the web server, containing the request. It goes to the web server and the response will come back. Now our security requirements. The firewall can read the data; nothing's encrypted, so there's no encryption here. Others out on the internet can read the data; again, no encryption means anyone can read the data. What else? The firewall cannot easily see who you're communicating with. The firewall thinks you're communicating with the proxy; it doesn't think you're communicating with server S. When I say easily here: in theory it could look at the detail of this packet, but in practice, if your ISP must handle
millions of packets per second, it doesn't have the resources to look at every individual packet. It's quite complex to do that. They can, but it's not so common. So usually the firewall would not know who you're communicating with, or rather it would not know you're eventually communicating with S; it thinks you're communicating with P. So that's one of our security requirements. What else? Others out on the internet, people out here who intercept between P and S, think it's P communicating with S. They don't think it's you communicating with S, because if they just look at the addresses, IP address P is sending a packet to S. So others who intercept this packet do not know it's you communicating with the server S. And the server doesn't know it's you. The server receives a packet whose source address is P; it doesn't know it's you who originally made the request. So that's another security requirement met, in that the server doesn't identify you. A different thing in this case is that the proxy can read the data that you sent to the server, and it knows it's you that's communicating, because the proxy knows that you sent the request to it and that it's a request to that server. So there's no protection against the proxy knowing who you're communicating with or what you're communicating. But we have achieved some different requirements: the server not identifying you, others in the internet not identifying you, and bypassing the firewall. Now, if you are working for an ISP and you want to set up the firewall so that you can truly block access, such that this will not work, what could you do? Some of you will get jobs with ISPs; you'll need to configure their firewall. Drop every packet? Then none of your customers can access the internet, and you're fired tomorrow. What else could you try? Block packets to particular proxies, to destination P, if you know what the proxy websites are. But that becomes hard because there are many proxy websites and you need
to, as the ISP, keep track of them, know about them, and update the list on a regular basis. Anything else you could try? You can get your firewall to inspect the contents of the packets, to look at the data and do some analysis. Every time a packet comes in, don't just look at the destination address; look at what's inside the data and see whether this is a request to a proxy for the blocked server. So that's possible. The problem with that is it slows down every packet that goes through the firewall, and maybe if it's too slow your customers become unhappy because the internet access is slow. So there's a trade-off there: if you check every packet, it slows down the network access through your ISP, and that's something you don't want to do to your customers. So there are solutions, but there are trade-offs with them. Last one for today: let's switch to using HTTPS with the proxy. Basically the data is encrypted, so the firewall now cannot read the data. With HTTP they could read the data; now there's no chance to read the data because it's encrypted. Others on the internet cannot read the data either. So using HTTPS we keep our data confidential, as well as meeting the other security requirements. Comparing the proxy with normal HTTP versus HTTPS, the difference is that we keep our data confidential with HTTPS. Now, this requires the proxy to be able to decrypt and encrypt again, so the proxy can still read your data and know who you're communicating with. You must trust the proxy in this case. On Thursday next week, during our last lecture, we'll look at using VPNs to do a similar thing and how VPNs can be used to achieve similar security requirements, and that will about finish us for this topic.
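As a recap of the web proxy cases, here is a small Python sketch of who sees what. It is a toy model under the assumptions of this topic (the firewall looks only at the destination address, and encryption cannot be broken); the names `Packet`, `firewall_allows`, `on_path_view`, and `proxy_forward` are invented for the example, and U, P, and S are simple stand-ins for the IP addresses of you, the proxy, and the server.

```python
from dataclasses import dataclass

U, P, S = "U", "P", "S"  # stand-ins for the three IP addresses


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str
    payload: str             # here: the final destination, or the request
    encrypted: bool = False  # True models HTTPS, False models HTTP


def firewall_allows(packet, blocked=S):
    """The ISP firewall checks only the destination address."""
    return packet.dst != blocked


def on_path_view(packet):
    """What the firewall or any other on-path observer learns."""
    data = None if packet.encrypted else packet.payload
    return (packet.src, packet.dst, data)


def proxy_forward(packet):
    """The proxy ends your connection, so it always gets the plaintext.

    It therefore learns both who asked (packet.src) and the real target.
    """
    target = packet.payload
    return Packet(src=P, dst=target, payload="GET /", encrypted=packet.encrypted)


direct = Packet(src=U, dst=S, payload="GET /")
via_http = Packet(src=U, dst=P, payload=S)
via_https = Packet(src=U, dst=P, payload=S, encrypted=True)

print(firewall_allows(direct))      # False: destination S is blocked
print(firewall_allows(via_http))    # True: destination is P, not S
print(on_path_view(via_http))       # the data, including S, is readable
print(on_path_view(via_https))      # only the addresses U and P are visible
print(proxy_forward(via_https))     # the server sees source P, never U
```

In both proxy cases the proxy itself knows everything, which is the trust issue just mentioned. A firewall that wanted to catch the HTTP case would have to inspect `payload` on every packet, which is the performance trade-off discussed above.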