 So the second part of this second lecture is on HTTP, the hypertext transfer protocol. So the protocol that really makes the web run. It is the standard web protocol that is used to transfer hypertext, and as we discussed in the first lecture, hypertext is essentially text with links with additional information. It doesn't only transfer hypertext, but in general hypermedia, so it can also transfer pictures, for example. An important feature of HTTP is that it's stateless, so if you send more than one request, the server does not know of a previous request. This is kind of nice to know for now, but when we later get to the server side topics, you might understand more in detail why this is useful. But in general, this makes, for example, the routing possible that different requests can have different routes, and this does not cause any problems. So that's one part of it. And one important thing of HTTP is that everything, absolutely everything, is sent and received in clear text. This includes passwords. So when you want to encrypt something, when you want to, for example, hide your passwords, you should always use HTTP that is going over a secured connection, and what you typically do is HTTPS. And in fact, nowadays most companies, most websites, force you to use HTTPS because HTTP has, of course, this flaw that everything is clear. So if we go to Firefox, you could, for example, do HTTP, colon double slash Google.com. This should give you an unencrypted connection. What you'll see is that actually this changes to HTTPS. So for example, Google, one of the companies is automatically redirecting you to a secure connection. And that's also why you see this lock. So be careful whenever you have a connection where the lock is not there. And in the practice session, I will most likely show you a live example of how this looks like and how you can actually look at the details of the connection to see that everything is in clear text. This can be problematic. 
For example, imagine you are in a cafe and you're using the public Wi-Fi. If you use an open connection, an HTTP connection, everyone who is on the same network, on the same Wi-Fi, could just read your password. So that's quite an implication. That's important to know, and that's really the difference we're covering here: HTTP is the regular transfer protocol, and HTTPS is more or less the same thing, just encrypted. Now, the way this works is when I request a website, for example, I want to go to Wikipedia and I want to get the website back. What I'll do is send the request, as we already discussed in the first part, and it looks something like this. It says GET /wiki/URL HTTP/1.1, then Host: wikipedia.org, and I get something back. So this is an HTTP request, and this is an HTTP response. And we'll look into the details of what those different things mean. The typical case of an HTTP request is the one I've shown you so far: you go to your browser and you open a website. That's by far the most typical request. But in fact, there are a lot of other ways. You can do the same with, for example, the Postman software, which we use in this course. And when you use any kind of app on your smartphone and it downloads some information, for example, weather information, you are obviously not opening a browser and opening a website. You're using some kind of program, and what that program does in the background is actually an HTTP request. So it doesn't always have to be Firefox or Chrome. HTTP requests are literally everywhere on the internet. Really, most applications that somehow interact with the web make these requests, so it's important to understand what HTTP does. The requests look something like that. And we can, I believe, also look at that in here. If you go to Firefox, you can go to Tools, Web Developer, Network. And Chrome, for completeness, we can also show; it has something very similar.
I just have to find it because I never do when I want to. Here on the view developer, you can go to developer tools. And you also get, for example, the network information. So it's the same thing in Chrome. In Firefox, you get this thing very cryptic. But now when I do Google.com and I start the request, you see all sorts of things going on here. And what's actually happening is every line here is a single request. So you see that when I open Google, it's not just one HTTP request is actually 18 of them. But to make it simpler, we just look at one thing for now. And if I click on one of them, you get the details. So you get this. And there are a number of different things that are relevant here. So one thing that you see is get. That's the so-called HTTP method. So HTTP knows a number of methods for different purposes. We'll get to them later and we will heavily get to them later in the server side lectures. Then you have the target. So of course you want to go to Google. So you have to give an address. These are URLs. They can have very different formats. And we'll look at them later, but essentially their addresses. So you tell the browser or you tell HTTP what is it that you would like to have. Then you tell which protocol version you have. This is what you see most often, HTTP 1.1. Sometimes you see HTTP 2, but it's actually not that common. So that's just telling, please get me the Google.com resource, the website using HTTP 1.1. And then you have a lot of so-called headers. So that's additional information, quite detailed technical stuff if you look at it. And we'll get into details there. So it tells you what kind of additional thing are relevant here. Depending on what kind of request you do, you might also have a request body. For example, again, easiest thing. If I am on Google and I want to actually search something, web development. And I press search or press enter. 
Of course, my request has to know that I'm searching web development and that that's in the body. So it's additional data that you need to provide. Because it's not enough to say Google.com, but you also have to say, please give me web development. The headers are meta information as discussed. For example, the first one here says what kind of response to accept. So here we say we want to have HTML or text back or other things. If the server instead sends me a video back, then something is wrong. Please don't accept that. So that's kind of things that are in there. Typical things that are in the header are, for example, as I just said, the accept. What kind of response do I expect? What is coming back? You typically, your browser automatically sends the version. For example, now when I use Mozilla, it says this request was sent using Mozilla 5.0 on a Macintosh with this operating system version, blah, blah, blah. So you actually send to Google quite a lot of information on who you are. Then there might be cookies. I'll discuss that later. You probably have heard that because nowadays, of course, on all the websites, you see these pop-ups. This website uses cookies. And things that are typically sent in the header are, for example, authorization information. So if you send your user and password when you log into Gmail, most likely it is in the authorization header. And the details here are not relevant right now. So those are typical things that you might want to send. Note that these things are not encrypted. So in this case, I'm saying use basic authentication. There are different versions you can use, but basic just means send the user and password. And then this here is my user and password. This is actually an encoding. It's not encrypted. So even though it looks fairly encoded, fairly encrypted, it's actually not. It's just converted. You can just convert it back and then you get the clear text. I'll show that in the lecture. Okay. 
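To make the point about basic authentication concrete, here is a minimal sketch, assuming a made-up user alice with the password secret (neither appears in the lecture). It only uses Python's standard base64 module, and the function names are hypothetical helpers, not part of any library. It shows that the value in the Authorization header is just an encoding that anyone can convert back:

```python
import base64
from typing import Tuple


def encode_basic_auth(user: str, password: str) -> str:
    """Build the value of an Authorization header for the Basic scheme."""
    raw = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"


def decode_basic_auth(header_value: str) -> Tuple[str, str]:
    """Recover user and password from a Basic Authorization header value."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    user, _, password = decoded.partition(":")
    return user, password


# Hypothetical credentials, for illustration only.
header = encode_basic_auth("alice", "secret")
print("Authorization:", header)   # looks scrambled, but is only Base64
print(decode_basic_auth(header))  # no key needed to get the clear text back
```

No secret is needed for the decoding step, which is why Basic authentication over plain HTTP exposes the password to anyone who can read the request.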
If everything works, you'll get a response back from the server that might look something like this: HTTP/1.1 200 OK. So again, this is our protocol version, the same one we just had. Then we have a so-called status code. We'll dive into those. And then we have some human-readable text that says everything is OK in this case. As before, we have headers. Some of them are the same as in the request, for example, the date header maybe. Some of them are typical for responses. And the response has a body: because we requested some kind of resource, the response should actually return the resource. If the request somehow doesn't work, then we should get some kind of error back, an error message or an error page or so on. But what you get back here in the body is really the HTML, the source code of your website. So again, this is something you can see. This is our request. And here you see the response headers. So this is what the server sent us back. Here are the request headers. And if you go to response, you see the actual response. Which is now, I don't know why it's not here. We can reload it. It's probably the wrong request, because there are so many. If we go to just Google.com, it's probably easier. So if we go to Google.com, you see this is the response we are getting. It's just the actual HTML code. Okay. As we already discussed, the response headers might be slightly different from the request headers. For example, a typical one is age. This is an estimate of how much time has passed since this response was generated. So it gives you some kind of information on how old the response already is when it gets from Google, or from Wikipedia, to your computer. There is also related information: the expires header tells you when this information expires. So for example, if I get the response after Thursday, the first of December '94, then I should throw it away. It's not valid anymore. So that might not be that important for websites.
But if you send the request, for example, to get the current weather. If you get the current weather two days later, it's not relevant anymore. So then you might want to have an expiry date. Let's ignore that for now. But if you want to change a resource, you might only allow certain things. And then you get the content type. That's an important one. It tells you what kind of resource did I give back to you? So for example, if the content type says text HTML, then you know that, okay, this is an HTML document. This is something that I can parse with the browser I can display. If it says instead something like JPEG, then you know it's a picture and so on. And if it's text, then you might want to know which character set is it or similar things. So basically, this helps you to choose the right application to display the response. And that's also why, for example, if I enter a URL with a PDF, your browser automatically knows that, okay, I should use Acrobat Reader or I should use some internal PDF reader to display this. That's the content type that gives you this information. And finally, another important one is relating cookies. So the response might tell your computer to save a cookie. And again, more on that later. Now I've dropped the URL before as a keyword, but this is what we're going to detail now. Whenever you enter a website, a web address in your browser, what you want to do is identify some kind of address that you are browsing to. And this is using a URL. So URL is any kind of reference to a resource. For example, an HTML document, an image, a video, a JavaScript file, whatnot. And they typically look something like that. So that's what you all know. And this URL has several parts. So the first one is the protocol. And as I've done in most of the examples, I've often skipped this. And that's because most browsers, if you just enter a regular website, it just adds HTTP automatically. But usually for a proper request, you have to enter the protocol. 
Then the second part is this en.wikipedia.org. That's the host name. So that's what we already discussed in the first part. That's the part that gets mapped to an IP address. So that's identifying the machine, the computer, where our resource is located. And then the last part is the path. So that's basically the folder on that machine where the resource is located. And of course, for instance, if I just do en.wikipedia.org, I just get to the main page. And that corresponds to the path slash. So it's just the root directory, the basic directory where everything starts. But typically you have an additional path. So that's what most URLs look like. The general case is slightly different. In general, you have something that's called a uniform resource identifier, a URI. We will not discuss the details of that, but it's very similar to a URL, and there is a lot of additional information that you can add. So we have already seen the protocol. What you have not seen is this one here. Sometimes you have something like Grisha, colon, password. So then directly in the URL, you can supply a user and a password used for authentication. So if you ever see this, one name, colon, another thing, at, that basically means authentication. Colon 80 is the so-called port. More details later, but it's basically the endpoint. So on the same host, on the same machine, in a way you have different targets, different slots. So the corresponding thing, if I take my post example again, is that you could send one letter to Reykjavik University and another letter to Reykjavik University, but to two different departments. It makes a difference whether you send a letter to the School of Computer Science or the School of Business. That's similar to the ports. So same address, but different endpoints. Then we have seen the path, and then we have some more stuff here. One of them is called the query. That's typically where parameters are provided.
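The parts of a URL listed so far can be pulled apart with Python's standard urllib.parse module. This is only a sketch: the URL below is a made-up example (the user name, password, query parameter and fragment are assumptions for illustration), not the one from the slides.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical URL that uses every part of the general form.
url = "http://user:password@en.wikipedia.org:80/wiki/URL?foo=1#section1"

parts = urlsplit(url)

print("protocol :", parts.scheme)      # http
print("user     :", parts.username)    # user
print("password :", parts.password)    # password
print("host     :", parts.hostname)    # en.wikipedia.org
print("port     :", parts.port)        # 80
print("path     :", parts.path)        # /wiki/URL
print("query    :", parts.query)       # foo=1
print("fragment :", parts.fragment)    # section1

# The query string is itself a list of name=value parameters.
params = parse_qs(parts.query)
print("parameters:", params)           # {'foo': ['1']}
```

The same function also splits URLs with other protocols, which is why it takes the protocol from the string rather than assuming HTTP.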
So if you need to tell the target, Wikipedia, for instance, that foo equals one, then you can put that into the URI. And finally, we have the so-called fragment. So the hash symbol and section one in this case. That's used to identify a part within the resource that the URL points to. So one example for this is if you go to Wikipedia and I directly open a Wikipedia URL. So that should bring me to the Wikipedia page on URL. And if I click on any of these, if I click on notes, for example, you'll see up here that this hash notes was added. And that basically tells the browser: you get URL back, you get this page back, please go to the part notes. So it's basically a section heading. And the difference is, you see, if I just open URL, it will bring me to the top of the page. If I open URL hash notes, it will directly jump to the right section. So that's what this fragment is used for. So this is the general thing. Many of these parts you don't typically see. And since we are discussing HTTP here, we always have HTTP at the front, but it's perfectly fine to have other protocols. For example, if you use a MongoDB database, you might need some kind of string that looks like this. That says, okay, please log into my database that is on the server DB host, on port 27 something, with a path like this. And then additional parameters: for authentication, please use SCRAM-SHA1. So that's a typical situation. You will see this in different cases. So it's not only HTTP, but you can have very different protocols. Okay. Now we jump into HTTP methods, but only briefly. These get much more important when we get to the server. But it's important to know that HTTP knows nine different kinds of requests, so different methods. And the most common ones, the ones that we'll use in this course, are get, post, put, patch, and delete. So get, as the name suggests, means you want to get a resource. You want to request a resource. Post, you want to send something.
So for example, if you want to post in an online guest book, you might do that. If you want to send an email through a contact form, you probably do a post. So this is typically used when you have a form and you want to submit data. Put, patch, and delete you probably haven't seen yet, because these are almost exclusively used in programs. In your web browser, you will never see them. The standard case is get and post. When you want to open a website, you do a get; when you somehow send information through a form, for example, you do a post. But as you see in the browser, you do not specify this in any way. You just see the details. If you again go to your network view, you actually see that everything is get here. I'm requesting some kind of resource. As I said, this will become much more important later on. These methods have different properties that are important. Again, something we'll discuss more later. But for instance, a get request has to be safe. And that means that when you run the get request, it does not cause any side effect on the server. It does not change anything. So if I, for example, request the Wikipedia page on URL, it sends me back the page. But the server will have exactly the same state afterwards. Nothing has changed. So that's a very important property. Then there's something that's called idempotence. And it means that whether you run the request once or 20 or 100 times doesn't change the result. And without going into details, you can imagine if you delete something, then it's gone. If you delete it 20 times, it's still gone. It's all the same as before. So the delete and the put methods have this property. But again, that's more important later on. And cacheable. I have mentioned caching in the network part.
But sometimes it's important to say that if, for example, I know that in my company, everyone uses Google all the time, maybe I can somehow save the response so that I don't have to ask Google every single time, but I can reuse the response. So get and post requests can be cacheable. So you can save them basically. Again, much more details later. The important thing here is later on when we choose methods for different purposes, we have to keep in mind what they should do. Now the final part in our puzzle are the response codes. So we have seen in the response header that we get this cryptic code that says 200. OK. And these are the HTTP response codes. So whenever you get a response from HTTP, you get this three digit integer and number that tells you how the response worked, what happened basically. And there are different classes to this. So the first number always tells you in general what happened. So earlier we had 200 and if it's anything with a two at the front, it's a success. So whatever we requested, actually we got back. It worked. If we get a one something response, it's some kind of information. If it's a three response, we have been redirected. So we have been requesting something from Wikipedia, but maybe they have actually redirected us to a different host. Four is the client error. So you have sent something that is wrong. You have, for example, requested a page that does not exist, or you have provided the wrong authorization. So you're not allowed to read that kind of URL. And if you get something with five, it means something is broken on the server. There was some kind of program error or similar. Now the important thing is that your client, for example, your browser needs to understand the first digit, because that's what tells you in general what has happened. The other two numbers are not that important. So you can actually, if you write an application, you can actually invent your own codes. 
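The idea that a client only has to understand the first digit can be sketched in a few lines of Python. The function names here are hypothetical helpers; HTTPStatus comes from Python's standard http module and only knows the registered codes, so a made-up code such as 299 has no standard text but still falls into the success class.

```python
from http import HTTPStatus

# Meaning of the first digit of a status code, as described above.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(code: int) -> str:
    """Return the general meaning of a status code from its first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not an HTTP status code: {code}")
    return STATUS_CLASSES[code // 100]


def describe(code: int) -> str:
    """Combine the standard text (if any) with the class of the code."""
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        # An invented code: no standard text, but the class is still clear.
        phrase = "unknown code"
    return f"{code} {phrase} ({status_class(code)})"


for code in (200, 404, 500, 299):
    print(describe(code))
```

A client written this way keeps working when a server invents its own code, because it falls back to the class instead of failing on an unknown number.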
But there are typical ones, and we'll just look at that. For example, if we look at 200 OK, that's what we've gotten earlier. This is the website from httpstatus.com. So you get information on all the status codes. And 200 OK tells you that the request has succeeded, and then there is lots of details, what you should get back and so on. 201 created, that's something we'll use heavily later on, but it basically means you have created a resource. It was successful, and now there is a new resource. 400 means there is a bad request, whatever that means. You have somehow not requested something properly. That's sometimes used when you don't have more details. 401 is a typical thing when you're not authorized. So for example, I'm trying to read the emails of someone else, and I'm not authorized to do that. So the server doesn't give me the emails back. The server gives me 401 unauthorized. 403 forbidden, that's maybe similar, but it somehow means I'm not allowed to access that resource for different reasons. Maybe I'm at the wrong place. You're only allowed to access this from within the company. 404 you have all seen, that means not found. You have given a URL that does not exist. I don't know that one. For example, if I do URL something else, Wikipedia will probably tell me 404. And in this case, instead of sending me the resource I wanted, they sent me some kind of error page. Wikipedia does not have an article. So that's the 404, and then other things you typically have seen are 500, the internal server error. That's again, somehow a very generic thing. Something in the server didn't work. 503 service unavailable, that's for example, often the case when the machine has shut down, the server has crashed, you will get 503. So just this number gives you some kind of information. But as we said, you can actually make up other numbers. Okay, now the final part of HTTP, we'll look into cookies. So HTTP as we have discussed is stateless. 
The server does not know of previous requests. But of course you all know from experience that in the World Wide Web we have states. So if you log into Gmail and you click on something, it still knows that you are logged in. It doesn't ask you to log in again. If you visit a website repeatedly, you might be logged in automatically. This one is of course a bit more problematic, but if you go to a website that you have visited before, you might actually get advertisement depending on what else you have done. So all of these cases you know, and of course that means there is some kind of state. The World Wide Web knows certain things you have done before. And that's essentially because of something that is called cookies. Now cookies were quite unknown until this started to come up. So when all the websites I think two, three years ago started giving you these pop-ups saying we use cookies to make the site simpler, find out more, agree, disagree and so on. So that's new legislation in the EU and in other countries as well. But essentially cookies are text. They are simple texts that are saved in your browser and that are set by the server and the client returns it. Now if we look at that, if we for example go to Google as simple as that and I go, I always have to find it. If I go to my network, I go to cookies, you'll see that actually there is something happening. The response and request cookies. There are lots of things here that I have not set myself. So someone has set these cookies. And if you go to storage, you can get the details on that. So you see that there are actually, Google has stored several cookies. For example, they have stored something that's called consent. That's most likely whether I have clicked agree or not on this little window that asks me whether or not I want cookies. There is often a whole lot of different stuff. You often find advertisement here for example. 
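Since cookies are just text that the server sets and the client returns, the whole round trip fits in a small sketch. This is an assumption-laden toy, not how a real browser is implemented: the cookie name UID, the value 5, the host names and the dictionary used as a cookie jar are all made up for illustration. Only SimpleCookie is a real class, from Python's standard http.cookies module.

```python
from http.cookies import SimpleCookie
from typing import Dict

# Toy cookie jar: host name -> {cookie name: cookie value}
CookieJar = Dict[str, Dict[str, str]]


def make_set_cookie(name: str, value: str) -> str:
    """Server side: build the Set-Cookie header line for a response."""
    cookie = SimpleCookie()
    cookie[name] = value
    return cookie.output(header="Set-Cookie:")


def remember(jar: CookieJar, host: str, header_line: str) -> None:
    """Client side: store the cookies from a Set-Cookie header line."""
    header_name, _, header_value = header_line.partition(":")
    if header_name.strip().lower() != "set-cookie":
        return
    parsed = SimpleCookie()
    parsed.load(header_value.strip())
    for name, morsel in parsed.items():
        jar.setdefault(host, {})[name] = morsel.value


def cookie_header(jar: CookieJar, host: str) -> str:
    """Client side: value of the Cookie header for the next request to host."""
    cookies = jar.get(host, {})
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


jar: CookieJar = {}
line = make_set_cookie("UID", "5")        # the server's response header
print(line)
remember(jar, "google.com", line)         # the browser stores the text
print("Cookie:", cookie_header(jar, "google.com"))     # sent back to the same host
print("Cookie:", cookie_header(jar, "wikipedia.org"))  # nothing for other hosts
```

Because the jar lives entirely on the client, nothing stops the user from editing or deleting its entries before the next request.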
Now cookies as I said are basically just text and the way this works is I request a website. I send the request to Google.com as we have already discussed and I get the website back, so far so good. What also happens is that the server sends back in the header it sends this set cookie field. And for example, it might say set cookie UID 5. What this means is that the server tells my client, my Firefox for example, to store a text that says UID is 5. And then this text is just here on my browser. But the next time I send the request, for example to Google.com, what happens is the browser sends all these cookies that come from the same host from Google.com back. So everything that is related to Google.com is being sent back to Google. So this way, if I do a second request to Google, Google will get the information user ID is 5 and then for example it knows, okay, it's the same user that earlier Googled about cats. Now the person is Googling dogs, so maybe the person is generally interested in pets or whatever. So this is really what it is. It's just text, but of course you can use that for a lot of different things. So you can, for example, use this to specify advertisement preferences or so on. This is something I'll show a lot about in the practice session, just to give you an idea. But there are some other things about this. First of all, cookies are on your computer, and that means you can change them. That also means that this has a security impact. So for example, I just made the example of a user ID. So I'm not sure which one of this is a user ID. Maybe none of them. But let's say that for example this one is something that identifies me as a user. Here's the value. This is the text. And I can just change this. So I can just put in here whatever I want. And if Google hasn't programmed their servers properly and I run this again, it could actually be a security risk. 
But something we'll discuss in the security lecture, but technically I could, for example, change my user ID. So you should, as a server-side programmer, on the server you should never assume that people are not changing their cookies. The other thing is that you can delete them. I can delete all my cookies. So if you write an application you should somehow make sure that you can use this application even if cookies are not working. So that's another key ingredient here. And last but not least, cookies have this kind of negative touch because you hear about them, you hear about security problems, you get all these messages nowadays, but they're actually not evil themselves. It's just text. And without them you would not have any states in the web. So it wouldn't be possible to remember login information or so on. The thing why they have such a bad reputation is that they are regularly misused or they are the reason for security problems and so on. But in itself they are important and they are necessary. Good. So this concludes the lecture two, the network and HTTP lecture. What we discussed is that when you open a website you essentially request a resource and you do that using the HTTP protocol. Now HTTP is a worldwide web protocol. It uses the internet and that means it uses the TCP IP stack and that is used to identify where to send your request, the destination, and it's used to route it to basically make sure it takes the right way. Whenever you do this request you specify the method. So what do you want to do with the resource? Where is the resource, the URL, and meta information headers. For example cookies as we discussed, or authorization information. And then if you have done everything correct then the server will respond with the right resource, for example the website, and some kind of headers, extra information. 
And then as the last part we discussed that HTTP itself is stateless, but using cookies you can basically remember what the client has done before and therefore you get the stateful web. So you get state in your web application. Okay, so that's it for this lecture. The next lecture will start with HTML, so it's important to get some tools installed. We will be using VS Code in this course. If you're familiar with something else, you can use any programming environment you like. You should have a browser installed. We will be using Firefox and Chrome, so those are the best ones to test your applications with as well. And you won't need it for the first assignments, but it's good to install Postman right away so that you can play around with HTTP requests yourself. So those are the things you should be doing. And the next lecture then, the next two lectures actually, are on HTML, which is all about structure. So what kind of information is in your website. And this will be the start of a series of very, very technical lectures. So it's a lot about code, coding and the different elements of HTML. Okay, so that's it for part two.