What we saw last lecture was a detailed example of TCP connection setup, and then an example of data transfer. The connection setup is the three-way handshake, and we saw how the initial sequence numbers are agreed upon: both sides select initial sequence numbers and then tell each other. We only covered the case where it works normally. There are many other scenarios which may happen, for example, if something goes wrong or one of them doesn't want to set up a connection, but we only cover the very basic case. Once we set up the connection, we can transfer data, and we saw an example in the lecture where we send the data and, as we've seen with other protocols, eventually an ACK comes back. The main point we saw was that with the data, we have sequence numbers, and the sequence numbers count bytes. So if the sequence number of my data segment is one and there are five bytes in that data segment, then we can think of those five bytes as labeled one, two, three, four and five. So the ACK that comes back says, thank you for those bytes, I now expect six. So the acknowledgement number depends upon the data sequence number and the amount of data. So we saw an example of that. There are other parts of TCP which are very important. Error control: what happens when we send data but it's lost across the internet and doesn't get to the destination? How do we detect that data loss and how do we retransmit? Well, we use a retransmission scheme like selective reject, which we've seen in earlier lectures. It's a variation on selective reject where we're allowed to send a window of frames, and if there's one missing, then the source will try to retransmit that one data segment. There are flow control mechanisms for controlling how many data segments we can send at a time before we have to wait for an ACK. So sliding window, which we saw earlier, is used in TCP.
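To make the byte counting concrete, here is a minimal sketch in Python of how the acknowledgement number follows from the sequence number and the amount of data. The function names are made up for illustration, and the sketch assumes no loss and ignores sequence number wrap-around, so it is only the basic case described above.

```python
def expected_ack(seq_number, data_length):
    """Acknowledgement number for a segment: the next byte the receiver expects."""
    return seq_number + data_length


def label_segments(first_seq, segment_sizes):
    """Return (sequence number, size, ack number) for back-to-back segments.

    Assumes nothing is lost and ignores 32-bit wrap-around.
    """
    labelled = []
    seq = first_seq
    for size in segment_sizes:
        ack = expected_ack(seq, size)
        labelled.append((seq, size, ack))
        seq = ack  # the next segment starts at the byte the receiver expects
    return labelled


if __name__ == "__main__":
    # Sequence number 1 with 5 bytes of data: bytes 1..5, so the ACK asks for 6.
    print(expected_ack(1, 5))
    print(label_segments(1, [5, 10]))
```

Running it for the segment with sequence number one and five bytes of data gives the acknowledgement number six, matching the example from the lecture.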
And congestion control, we haven't touched upon that. There are many details of how TCP controls how fast it sends. So it tries to avoid congestion in the internet. Those mechanisms are very important with TCP, but in this course, we're just giving a taster of different protocols, especially at the higher levels, and we're not going to cover those details. But be aware that what we've seen on TCP is just a small part of it. There are many more details to understand how it works. And this is just the repetition of this point that TCP sends data, or labels data with sequence numbers by bytes, not by segments. And TCP may decide to send our data in different sized chunks. So it performs some segmentation. If we have 10,000 bytes of data to send, TCP may send, for example, four segments of different sizes. The details of TCP algorithms determine how big those segment sizes will be. And this just was one final example of data with sequence numbers and the corresponding act numbers that came back, depending upon the size of the data in those TCP segments. And that finishes for what we're going to cover with TCP. There are a few slides on error control, but again, we will not introduce anything new here, other than say it uses a selective reject or a selective retransmission style ARQ scheme, which you already know how it works. There are many more details to understand precisely how it works. There's TCP flow control mechanisms, congestion control mechanisms, but again, not for this course. So that's a recap on what we've looked at for TCP. Let's go to our last topic, a very simple one because we've covered parts of it, Internet applications. Focusing on the top layer, so we've finally reached the top layer of our five layer stack for this course. Well, what are Internet applications? Applications that involve communication for them to work properly, communication across the Internet. So I install Microsoft Office on my computer and I use the word processor. 
I don't really need an Internet connection to do word processing. It just runs locally as a standalone application on my computer. That's not an Internet application. But I install the Firefox web browser on my computer. I can't do much with that unless I communicate across the Internet. I can browse local web pages, but I cannot access anything outside of my computer. So Firefox communicates with the web server, and we consider web browsing an Internet application. We can often separate the functions of an Internet application into three parts. Usually there's some user interface. Think of your web browser: the user interface lets you click on the buttons and menus and see the web page. If we looked at the source code of that application, there's some logic to implement the parsing of the HTML, how to display graphics and so on. And then there's the part of the web browser that does the communications with the other entity, in this case the web server. The communications between the two different applications, say from web browser to web server, follow some protocol, and this is an application layer protocol. So for our web browser to web server, for example, the communications implement an application layer protocol called HTTP, the Hypertext Transfer Protocol. So we're focusing on the communications part of Internet applications, and the protocols that we use to communicate between different applications. And there's a list of some example Internet applications and some example protocols associated with them. Not just protocols, sometimes some languages as well. So web browsing, we've seen many examples. HTTP is the application layer protocol used for web browsing. Email: depending on how we're accessing and sending email, there are different protocols used. SMTP for email transfer, and POP and IMAP to access an email server. And there are some formats of the messages that we send, email messages.
And there are some languages or formats, for example, MIME defines a format of the email messages. Instant messaging: there are different protocols, depending on whether you're using a Microsoft instant messaging client, some Google client or some other application for instant messaging. Some protocols like Jabber, MSNP and others. Streaming audio and video: maybe you're listening to online radio, so there's a radio channel that streams on the Internet and you listen on your client. They may use different protocols to stream that audio, and similarly with video. RTSP and RTP are some examples. File transfer: you can use web browsing or you can use specific protocols for file transfer, FTP, BitTorrent and others. Video and voice calls, voice over IP: there are different protocols available. And other applications. So these are just some example Internet applications and some example protocols used by them. There are many. We're going to quickly look at HTTP; quickly because we've already seen the basics of it through several examples, and we will not introduce much more. The others, well, again, you need to study more to understand how all the others work. You may see some in other courses, you may not. But there are many different application layer protocols. Most Internet applications follow a client-server model. So one application acts as the client that initiates communications to the other application, the server. For example, your web browser is the client. A web server application is the server, as the name suggests. And the application layer protocols make use of the transport layer. So our web browsing application uses HTTP, which uses TCP as the transport layer protocol. Different application layer protocols may use different transport layer protocols. And there are different ways of programming your application to use the transport layer. We will not cover that.
You may see it in a lab next semester, the concept of using sockets to write software to communicate across the Internet. All we want to do in this topic is give a brief example of web browsing using HTTP. And to make some sense of that, we will mention the concept of naming and the domain name system, DNS, as they're closely related. For the part about DNS and naming, we will not go into many technical details. You know most of it already because you use web browsing every day and you use domain names and URLs every day. So I think you understand the basic structure and how they're formatted. But just to be clear, always remember when our computers communicate across the Internet, they use IP addresses to identify who they're communicating with. So the packet sent, the IP datagram, contains an IP address of the source and the destination. And the routing uses IP addresses. So computers communicate using IP addresses. But we introduce a new address type, a domain name, which makes it easier for us human users to identify computers. Even though IP addresses identify computers, it's hard for humans to remember IP addresses. So what we use is human-friendly domain names, an additional address that identifies computers. And you know examples of domain names: the domain name for the SIT website, www.sit.tu.ac.th, is an example. And they have some hierarchical structure. You cannot use just any combination of characters; there's a structure. There are top level domains. For example, these are the domains which finish with .com, .net, .org, and there are others. In fact, the domain name space is growing. And there are also top level domains for individual countries, .th, .us, and so on. So each country normally has its own top level domain name.
And the hierarchical structure of domain names is that these top level domains are usually managed by some organization, commercial or government, and allocated to different organizations at the next level down. So we can talk about sub-domains. So an example for SIT: this is the domain name for the SIT web server, a human-friendly name for it. The SIT web server has an IP address, so whenever you want to contact it, you need to know its IP address. But for us humans, we just remember the domain name. The top level component is .th, the country code top level domain for Thailand. Within Thailand, the domain names are separated into different groups, where AC means what? What does .ac mean? Academic. In other countries, you may see .edu, for educational institutions; a similar meaning, for universities, schools, and so on. So all the academic institutions in Thailand can use this sub-domain, .ac. And then within that, of course, there are different academic institutions. We have TU, Thammasat University. So TU is allocated its sub-domain, and Thammasat University manages that domain. Everything under tu.ac.th is managed by Thammasat. And then they allocate sub-domains to different parts within TU. So we have SIT. There's the faculty of engineering, faculty of science, faculty of architecture, and so on. They'll have their own sub-domains. And then within SIT, SIT manages its own sub-domains. So there's www.sit to represent the SIT website. What about some others? Within SIT, other domain names. You know two, at least. Sorry? Well, you said one which, yes, is true, but not many people know about it. Fine? ICT, you use it every week. On ict.sit is the Moodle website. Another one, registration. I think most of you have visited the registration website. And student affairs, SA, and there are others.
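The hierarchy just described can be read straight off a domain name by splitting it at the dots, from right to left. This is a small Python sketch; the function name is made up for illustration, and it only shows the structure of the name, not who actually manages each level.

```python
def domain_hierarchy(domain_name):
    """List the levels of a domain name, from the top level domain downwards."""
    labels = domain_name.lower().rstrip(".").split(".")
    levels = []
    # Walk from the rightmost label (top level) to the full name.
    for start in range(len(labels) - 1, -1, -1):
        levels.append(".".join(labels[start:]))
    return levels


if __name__ == "__main__":
    for level in domain_hierarchy("www.sit.tu.ac.th"):
        print(level)
```

For www.sit.tu.ac.th this lists th, then ac.th, then tu.ac.th, then sit.tu.ac.th, and finally the full name of the web server.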
So they're allocated within SIT. So you can see there's some structure in the domain names. And that's about all we want to say about domain names. I think you know many examples. And I think it's quite easy to see the structure. It gets a bit more complex and how we allocate them, which organizations are involved, and so on. But that's the main point. And the simplest thing to remember is that each domain name can identify a computer on the internet. It's a bit more complex. It may identify multiple computers. But in the basic case, let's think it identifies some computer. What computer does www.sit represent? Where is it? How do we identify it? It's a server. How would we contact it? How would we send a packet to it, the SIT web server? OK, in fact, what we need to know to contact that particular server, it's a computer somewhere. I don't know where it is, but it's a computer somewhere. We need to know its IP address. Well, to send a packet to it, we need to know its IP address. My computer needs to know its IP address because every communications across the internet uses IP addresses. We don't use domain names to send in a destination of our IP packet. We must know the IP address. Now, well, yeah, so we need to know the IP address. When I say we, I mean the application or our computer needs to know the IP address. Now, from that comes how do we get from a domain name that a human user knows to an IP address which is needed to contact that server? And that's where DNS comes in. They're different. Usually, your computer automatically finds the corresponding IP address for a domain name. I've got a program called NSLookup which hopefully will work. Something's wrong. Many things. I can't see what I'm talking about. What I want to do is find what is the IP address for this domain name? So this domain name identifies a computer somewhere. To contact it across the internet, I need to know its IP address. When I say I, my computer needs to know the IP address. 
What is it? This software will try and look it up in some system and tells me the answer here, this last value. The corresponding IP address for the SIT web server is 115.178.61.153. So now, since I know the IP address, my computer can send an IP datagram to that destination. So always remember, our computers communicate using IP addresses. But humans often don't want to remember IP addresses, so we just remember the domain name. And the next thing is, well, how do we go from domain name to IP address? Well, we have what's called the domain name system, which organizes how to collect this information of mapping domain names to IP addresses. Look up the ICT server. We see the ICT domain name maps to a different IP address. There are two different computers somewhere on the internet. One hosts the SIT website. One hosts the ICT website. I can tell you the ICT one is on the third floor. They have two different IP addresses. So if we want to contact one of them, we need to know the correct IP address. How did this program NS Lookup find the corresponding IP address? So I gave it the domain name. It returned me the IP address, DNS. That's the next concept. The domain name system provides us that service. URLs, we'll come back to URLs. Let's stick with DNS. The domain name system, DNS, it specifies the structure of domain names, this hierarchical structure, and how to map domain names to IP addresses. That's the basics of DNS. Given a domain name, map it to the corresponding IP address of that computer. So in fact, you can think some, well, all devices on the internet have an IP address. Some of them also have a domain name that identify them. If we know the domain name, we need to know the corresponding IP. So DNS, the domain name system, how we perform this mapping is, in fact, there is a set of special servers across the internet that you can think store databases. A simple database of this domain name maps to this IP address. 
And it's a distributed database in that there's not just one database for the world. There are databases containing portions of that data of mapping domain names to IP addresses spread across different computers in the world. And then applications on your computer when you have a domain name will automatically look up and find the corresponding IP address. And the way that they perform that is they use the DNS protocol. There's a protocol called DNS that allows you to send a request or a query. Here's a domain name. Tell me the corresponding IP address. And that makes use of a DNS resolver, a resolver that gives a domain name and returns an IP address. There are many details about how to structure or distribute that information of the DNS database across multiple servers in the world. Again, we don't have just one server is distributed. The way that we distribute the information across DNS servers, again, is structured. And it's quite complex as how it works and how we make use of those distributed databases. There are a few notes here, but we're not going to cover that. We'll show you a few examples. And that'd be sufficient for DNS. So in fact, when I perform this NSLookup command, what happened was DNS was used. What my computer did, the DNS resolver has to find or look up in a database. What is the IP address for this given domain name? And you can think the basic way it does it is it checks the local cache on my computer. Have I cached this domain name in the past? Have I learned about it before? Maybe this morning earlier, I accessed ICT and I found the corresponding IP address. If not, if I don't have a value in my local cache, I send a request out across the internet to a special DNS server. And that DNS server will have a database, some domain names and IP addresses. And if the domain name ICT.sit is in that database, then the DNS server will send back a response saying, here is the IP address. 
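The lookup steps just described, check the local cache first and only then ask a DNS server, can be sketched in a few lines of Python. This is a toy model and not the real DNS protocol: the class name is hypothetical, the "server" is just a dictionary standing in for one DNS server's database, and real caches also expire their entries after some time. The two addresses are the ones that appear in the lecture examples.

```python
class SimpleResolver:
    """Toy resolver: local cache first, then one (simulated) DNS server."""

    def __init__(self, server_database):
        self.cache = {}                      # what this computer has already learned
        self.server_database = server_database  # stands in for a remote DNS server
        self.server_queries = 0              # how many times we had to ask the server

    def resolve(self, domain_name):
        name = domain_name.lower()
        if name in self.cache:
            return self.cache[name]          # answered locally, nothing sent
        self.server_queries += 1             # would be a DNS query across the network
        address = self.server_database.get(name)
        if address is not None:
            self.cache[name] = address       # remember the answer for next time
        return address                       # None means this server had no answer


if __name__ == "__main__":
    resolver = SimpleResolver({
        "www.sit.tu.ac.th": "115.178.61.153",
        "ict.sit.tu.ac.th": "203.131.209.82",
    })
    print(resolver.resolve("ict.sit.tu.ac.th"))
    print(resolver.resolve("ict.sit.tu.ac.th"))  # second time comes from the cache
    print(resolver.server_queries)
```

The second lookup of the same name does not contact the server at all, which is the point of the cache.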
And if that one doesn't have an answer, then there are further steps that can be taken to try and find the IP address. So the next thing we need to introduce, well, so again, what my computer does, given a domain name, it needs to find the IP address. And if it doesn't know it locally on my computer, it contacts a DNS server. So what we need to know is the IP address of that DNS server so we can contact it. And in my computer, if I look at the connection information, so I'm connected to the SIT wireless, here somewhere. Actually, when my computer was set up and it got an IP address from the SIT network, it also got IP addresses of the special DNS servers in SIT. The primary DNS, 10.10.10.9. And a secondary, think of a backup, an alternative if the first one doesn't respond. 192.168.20.103. These are the IP addresses of the special DNS servers inside SIT. So what happens now, when my computer has a domain name, it needs to find the corresponding IP address. What it does is it sends, using the DNS protocol, a message to one of the DNS servers, say the primary DNS. It sends a message to 10.10.10.9 saying, do you know the IP address for this domain name? And if that DNS server knows it, it will send back a reply saying, yes, here is the IP address. If we don't get a reply, the DNS protocol has mechanisms for trying again and trying other DNS servers. We'll see that in an example when we go through HTTP. So we'll see a more detailed example of how that works. The last thing that we skipped over was, OK, URLs. Domain names identify computers. In fact, we have URLs, Uniform Resource Locators, to identify resources that we often want to access on the internet. Usually files, generally resources. And I think you've seen many examples of URLs. You use them every day. Some very common, some formats are not so common. But there's some definition of the structure of a URL or more generally a Uniform Resource Identifier, URI. 
The general structure, and down the bottom I say most parts are optional and there are many exceptions, is that we have some scheme which identifies the application protocol we're using. Then we can give a user name, the host name, which is usually a domain name or an IP address of the computer we want to contact, optionally some port number, the port number of the application on that computer we want to contact, and some path to the resource, including the file name. And as an option we'll see some query that we can pass to that resource as well. This is best seen by some examples. You use them every day for web browsing. The scheme, http, specifies the protocol you use. The host or the domain name, www.example.com, and some path, /dir/file.html, identify a file on a particular computer in the internet. The URL says, access this file in this directory, dir, on the computer www.example.com, using the protocol HTTP. That's what that URL means. It tells us which computer to access, which file to access, and what protocol to use to access it. We don't have to have domain names, we can have IP addresses. Optionally, we can specify the port number. We see the :40240, meaning access that computer, in particular the application using port 40240. If we don't specify a port, like the first one, then the protocol has a default port that we use, port 80 for example for HTTP. So normally we don't specify a port when we're web browsing, but we may optionally give one. And there are some examples of using a query. You see that sometimes when you access databases through websites. And some other examples of URLs. Email: we can have a URL for an email, where we use the mailto scheme, and we have a username at some host. And we can have queries. And we can remotely log in. So we have URLs to remotely connect to another computer.
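If you want to see those parts pulled out of a URL, Python's standard library can split one for us. This is a minimal sketch using `urllib.parse.urlsplit`; the helper names and the small table of default ports are made up for illustration, and the user name and query in the second example are hypothetical values, not ones from the slides.

```python
from urllib.parse import urlsplit

# Assumed for this sketch: only the default port mentioned in the lecture.
DEFAULT_PORTS = {"http": 80}


def describe_url(url):
    """Split a URL into the parts discussed above."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "user": parts.username,   # None when no user name is given
        "host": parts.hostname,
        "port": parts.port,       # None when no port is given
        "path": parts.path,
        "query": parts.query,     # empty string when there is no query
    }


def effective_port(url):
    """Port actually used: the explicit one, else the scheme's default."""
    info = describe_url(url)
    if info["port"] is not None:
        return info["port"]
    return DEFAULT_PORTS.get(info["scheme"])


if __name__ == "__main__":
    print(describe_url("http://www.example.com/dir/file.html"))
    print(describe_url("http://user@www.example.com:40240/dir/file.html?key=value"))
    print(effective_port("http://www.example.com/dir/file.html"))
```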
Again, we can use a username, and we can optionally give a password to log in to another computer, at some host and some port number. So I think you've seen enough examples of URLs, you use them every day. We don't need to know the exact structure, but just be aware that they identify resources on the internet. We need that to see how HTTP works. We'll see the DNS example after we look at HTTP. So web browsing. Web browsing uses the Hypertext Transfer Protocol as the application layer protocol: HTTP, for transferring hypertext between computers. HTTP is a request-response protocol. It's very simple in that the client sends a request to the server, and the server sends back a response. And that's the interaction. Just request, response. If we want to get more data, more resources, then we send another request. And we just get one response for every request. So it's a simple protocol in terms of the types of messages we have. In the basic form, HTTP is stateless, in that I send a request for a resource and a response comes back; then for the next request I send, from HTTP's perspective, there's no relation between it and the previous request. It's just a new request. We talk about a client and a server. A client sends a request message. The server responds with a response message. Easy. Technically, the client is often called a user agent. It's an agent acting on behalf of the user. The port number used by servers, by default, is port number 80. So we've seen that before as well. What does an HTTP request or response message look like? They both have the same general structure, where there's some start line. They're just textual messages. So we don't look at them from a packet header perspective. We look at them from a text perspective, and we think about lines of text. So both messages have a start line, which will differ depending upon the type of message. Then some optional header lines.
So we can add some optional information to the request or response. Then an empty line, nothing. And optionally, the message body. Especially in responses, we'll see the message body is the content that was requested, say the web page. So the start line, we'll see on the next slide, differs for the request and response. The header lines in both requests and responses have the structure of a field name and some value. This is a simple example of an exchange of request and response in HTTP. We send a request from a browser to the server, and the server sends back some response. If we want more information, we send another request and get another response. It's one response for every request. Let's look at the structure of the request and response. In the request message, the first line, the start line, has three components: the method we're using to make the request, the URL of the resource we're requesting, and the version of the protocol we're using. So that's always in the first line of a request. There are different types of methods, depending on how we want to access this resource. The most common method we see is GET. When we want to retrieve a resource, we say we want to GET this URL. But there are other methods as well: POST, for example, is like sending data to a server, whereas GET is retrieving data from a server. So often we'll see GET, then a URL: I want to retrieve this URL, this resource. And then the version is the version of HTTP. Usually 1.0 and 1.1 are the common versions we'll see. That's the first line of every request: for example, GET, the URL, the version of HTTP. What about a response? In the first line of the response, we have these three values. The version of the protocol being used by the server, for example, HTTP/1.0. And some information about the response: a status code and a reason, a textual description of the reason for this response.
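Because the messages are just lines of text, the structure above (start line, header lines, empty line, optional body) is easy to take apart. Here is a rough Python sketch; the function names are made up for illustration, and it assumes a well-formed message with lines ending in carriage return and line feed, with none of the error handling a real browser or server would need.

```python
def parse_message(text):
    """Split an HTTP message into start line, header fields and body."""
    head, _, body = text.partition("\r\n\r\n")   # the empty line separates head and body
    lines = head.split("\r\n")
    start_line = lines[0]
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")     # each header is "field name: value"
        headers[name.strip()] = value.strip()
    return start_line, headers, body


def parse_request_line(line):
    """Request start line: method, URL (or path), protocol version."""
    method, url, version = line.split(" ", 2)
    return method, url, version


def parse_status_line(line):
    """Response start line: protocol version, status code, reason."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


if __name__ == "__main__":
    response = "HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
    start, headers, body = parse_message(response)
    print(parse_status_line(start), headers, body)
    print(parse_request_line("GET /dir/file.html HTTP/1.1"))
```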
So there are different codes with corresponding reasons. And I've listed some of the common ones we'll see here. The most common one, when everything's OK, the server sends back the status code 200. And the reason OK. Meaning your request was successful. And here is the response. So this is just the first line of the response. Others that I'm sure you've seen, 404 not found. You send a request to a server for some resource. If that server doesn't have the resource, that is you typed in the wrong URL or the wrong path, or the wrong file name, then the server may send back this response with the status code 404 and the status reason not found. Meaning the resource you requested was not found on the server. It's like when you access a web page that the wrong URL or it doesn't exist, 404 not found. Others you may see, 401 unauthorized. Maybe that resource requires a password to access it. And you haven't supplied the password. So you'll get back a response saying you're not authorized to access that resource. Maybe the server received the request, but doesn't want to process it for some reason. So 403 forbidden. 304 not modified is one you'll often see. The idea is that let's say I've requested a web page. I get a response back 200 OK, including that web page. And then one minute later, I request the same web page. What the server may do is send back a response 304 not modified, meaning the web page you just requested has not changed since the last time you requested it. Therefore, use the copy that's in your local browser cache. And my browser will use the local copy. So this is a way to improve performance, so we don't have to transfer the same web page all the time. You visit a web page. You download that web page. Your browser often will store it in its local cache. Then if you visit that same web page again, the server say hasn't been changed since the last time you requested. Use your local cached copy. It's a 304 not modified response. And there are many others. 
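Python's standard library happens to carry the same table of status codes and reasons, so we can print the ones mentioned here, and also sketch what a browser does with a 304 response. The functions `status_line` and `page_to_display` are hypothetical names for illustration; a real browser cache is considerably more involved.

```python
from http import HTTPStatus


def status_line(code, version="HTTP/1.1"):
    """Build the first line of a response from a status code."""
    return f"{version} {int(code)} {HTTPStatus(code).phrase}"


def page_to_display(code, response_body, cached_copy):
    """Decide what the browser shows, given the status code of the response."""
    if code == HTTPStatus.NOT_MODIFIED:   # 304: nothing changed, use the local copy
        return cached_copy
    if code == HTTPStatus.OK:             # 200: the response carries the page
        return response_body
    return None                           # 401, 403, 404, ...: no page to show


if __name__ == "__main__":
    for code in (200, 304, 401, 403, 404):
        print(status_line(code))
    print(page_to_display(304, "", "<html>cached page</html>"))
```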
This is just some of them. So there's some status code, a number, and then the reason, some textual description of the meaning of that status code. So a request starts with this single start line, for example, GET some URL. A response sends back the status code. Both request and response may have headers, which give some extra information for the request and response. Some examples of header fields. The date: the date and time of the request or of the response. The Host field identifies the domain name of the server. You'll see that in some examples. The browser may give preferences for the content that it is requesting: the preferred character set, the preferred encoding of the content, the preferred language of the web page. So the browser may indicate some preferences of what type of content it wants back. They're preferences; the server doesn't have to follow them. It sends back what it has available, but it may use those preferences that the browser specifies. Often when you need to access a resource which is password protected, the client, the browser, will send some username and password in some authorization field. The browser can send information about itself, what web browser it is and what operating system it is running on, in what's called the User-Agent field. Content length: how long is the content which comes back in the response. And there are many other fields. This is just some of them. What we want to do to finish this, to see how DNS and HTTP work, is just go through one final example. And we'll see most of these concepts in play. The example is from a capture that I did in the past. What I did was open my web browser, access a website, and capture the packets. And what I'm going to go through here is a selection of those packets, the main ones for that exchange. And we'll draw those packets as we go.
We're going to look at three computers involved in this exchange. When I did this, my laptop, which was running the web browser, had the IP address 192.168.0.4. So in the capture, when we see that IP address, that's my client; my web browser is running on that computer. We'll see shortly that I accessed a website on the server using IP address 203.131.209.82. Anyone want to guess what server that is? 203.131.209.82, I'll show you. It's the ICT server, this one. The ICT server corresponds to IP address 203.131.209.82. What I did in this example was visit the ICT website and download a web page. So I actually contacted the ICT server. And we're going to see there's another computer involved. In fact, there are several others, but the one that we'll focus on is this other one with IP 203.121.130.39. We'll see that come into play in a moment. So what do we see in the capture? The first packet. So again, imagine what I did on my laptop was open my web browser, type a URL in the address bar and press Enter. Actually, I'll give you the URL I typed in: http://ict.sit.tu.ac.th/~sgordon/ITS323/index.html. That's the URL I typed into my web browser address bar and pressed Enter. And then my computer started to send some packets, and the packets of interest are the ones I show in Wireshark here. The first thing that happens: I typed this address into my browser, meaning I want to access this index.html file in this path on this computer on the internet. That's what I'm telling my browser. Download that web page. Now, my browser needs to contact the server that has the web page. My browser knows the domain name of the server, but for my computer to send it a packet, it needs to know the IP address of the server. So the first thing that happens is that my browser triggers DNS to take place. That is, the DNS protocol now needs to find the IP address for that domain name, for ICT. And that's what we see in the first packet here.
This is a packet using the DNS protocol. We see it's sent by my computer, the client computer, 192.168.0.4. And it's sent to this destination, 203.121.130.39. And what does it send? It's a DNS message; it sends a query. You can't quite see it at the top, so let's expand here. What does the query contain? We'll keep expanding. It contains the domain name. So think of this as a request sent by my computer to someone, we'll see who in a moment, saying I'm looking for the IP address for ict.sit.tu.ac.th. It's a query for the IP address for that domain name. Who did I send it to? It's in fact the DNS server for my computer. In that case, the DNS server was 203.121.130.39. So just imagine some computer in the internet is acting as a DNS server. What happened is that my computer sent a query to that DNS server, querying for this domain name. In fact, in this example, we see in the second packet that my computer sent almost the same query to another DNS server, identified by 8.8.8.8. I actually had two DNS servers, and what my computer did, to get a fast response, was send a query to both, expecting a response from at least one of them. Let's forget about the 8.8.8.8 case, just to keep our example simple. The third packet here is a response. You see it's coming from this .39 IP address to my computer, and it's a query response. If we look at the contents of that response, it was a query for ICT and we see the answer. The answer to this query says the ICT domain name corresponds to IP address 203.131.209.82. So this is the response from the DNS server. My computer sends a query to the DNS server. The DNS server sends back the IP address for that domain name. That's packets number one and three. Packets number two and four, if we look at them, are the same, but just sent to a different DNS server: sent to the DNS server 8.8.8.8, and the response. My computer actually sent to two at the same time. Where did these DNS servers come from?
Well, they are part of the distributed DNS database. So think of a database with domain names and IP addresses. That database is distributed across many different servers in the world. These are just two of them. Why did I contact those two? Because my computer was configured to use those two as DNS servers. Currently my laptop on the SIT network is configured to use a different two. The ones I showed you before, can we see them still? It depends on what network you're on and how you've configured your settings. So if I did the same example now, the DNS servers in this case are 10.10.10.9 and 192.168.20.103. The example I did was actually a couple of years ago. I had different DNS servers, that's all. So in fact, you may configure your own DNS server. So just think of some computers on the internet that store part of this database mapping domain names to IP addresses. So the first four messages were DNS working. The response is the IP address of the server. Let's draw, I'll just draw the first packet and the response packet. So we have my browser, and the one on the right is one of those DNS servers. So the first thing that happened is I sent a request to that DNS server. So I send a request, and let's say this is a DNS request, and the DNS server sent back a reply. So this was a query and this was the DNS reply. The answer. Note that in this picture, it's not going to this middle server. Ignore the middle one in this case. It's going from my computer to the DNS server, the one on the far right. It's just that my line goes through the middle one. How did I know to send it to this particular server? Well, it was configured when I set up my network to use that DNS server. And the answer included the IP address for the ICT domain name, which was 203.131.209.82. So now that my computer knows the IP address of the destination server, it can now contact that server. So DNS really maps this domain name to the IP address. 
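To make the DNS query a bit more concrete, here is a minimal sketch in Python of how the bytes of such a query could be put together. It only builds the message and does not send anything. The function name build_dns_query and the query ID 0x1234 are made up for illustration, and the domain name is written as it appears in this transcript, so treat it as an assumption rather than a verified name.

```python
import struct

def build_dns_query(domain: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query asking for the A (IPv4) record of `domain`."""
    # Header: ID, flags (0x0100 = standard query with recursion desired),
    # then counts: 1 question, 0 answers, 0 authority, 0 additional records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length; a zero byte ends the name.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in domain.split(".")
    ) + b"\x00"
    # QTYPE 1 = A record (IPv4 address), QCLASS 1 = Internet.
    question = qname + struct.pack("!HH", 1, 1)
    return header + question

# Domain name as spoken in the lecture (assumed spelling).
query = build_dns_query("ict.sit.tu.ac.th")
print(len(query), query.hex())
```

In practice a client would send these bytes in a UDP datagram to port 53 of its configured DNS server and read the answer from the reply; that part is left out here to keep the sketch self-contained.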
Now what my computer will do is send an HTTP message to this server, a request message, requesting this resource, this index.html file. What transport protocol does it use? Yep, HTTP uses TCP as the transport protocol. DNS, not so important here, but it uses UDP. There's no connection setup. So now I want to send an HTTP request from my browser to the web server, requesting this resource. I'm going to use TCP as the transport protocol. Before I can send that request, I need to set up a connection. So the first four packets shown are DNS: request and reply for two different DNS servers. Look at the next three. We will not look at the details, but see from the summary information. Well first, my computer is sending to the ICT server: SYN, SYN-ACK, ACK. This is the three-way handshake for TCP. This is my laptop setting up a TCP connection to the web server. So we covered that in the previous lecture, how that works, and we'll see the port numbers, you can check the details. So we set up a connection by sending a SYN message to synchronize sequence numbers, getting a SYN-ACK in response, and then the final ACK. And then we can send data, which is the next packet here. Let's quickly draw the three-way handshake. This is from my computer to the web server: SYN, SYN-ACK, ACK. So that's the three-way handshake. And now we send the HTTP request, which is the data from TCP's perspective. And we'll see that it's sent from laptop to server. Turning to our capture, let's look at that data message. So it's sent from my computer to the ICT web server. It's a TCP segment. The data of that TCP segment is the HTTP request. So let's focus on the HTTP request. The first line of the request follows this format: the method, the URL or the path, and the protocol version. GET this web page using HTTP version 1.1. So that's the first line of the request. Think of the following lines of the request as the header fields for HTTP. They give some optional extra information. 
The Host field specifies the domain name of the server that we're contacting. The User-Agent field specifies information about my browser, Mozilla running on Ubuntu. The Accept fields give information about the type of content I would prefer to receive in response: the language, whether it's compressed or not. And there are other fields there. You don't need to know all of them. And that's it. In fact, there's a Connection field and this is the end of the request. This is actually a blank line. So that's the request for the resource. GET this index.html. I will not write the full URL, just index.html. So that's the request. What happens next? Server response. And it depends on different factors. But remember that this is actually a TCP data segment. It's a TCP segment containing data. When we look at the capture, we'll actually see there's a TCP ACK that comes back saying thank you for this TCP data, you can now send me more. So let's draw that and we'll see it in the capture. That's just an ACK for the TCP data. Okay, good point. Let's see what the next message is and we'll discuss that point, because there are different options there. Let's see what actually happened and then explain what could have happened. So the highlighted packet is our request from client to server, requesting the resource. The next packet is a TCP ACK. So the request is TCP data. The server sends back a TCP ACK. The next packet is also from the server. It is actually the HTTP response. It's from the server to my computer, and we'll see that the first line of the response is the protocol, the status code 200, and the status reason OK. And we'll see the details. Let's look at that response and then come back to our exchange. So this is the response to the request, the HTTP response. The first line of every response contains the version and the status code. In this case, the request was successful. We requested a resource. We got a successful response, 200 OK. 
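The request format described above (a first line with method, path and version, then header fields, then a blank line) can be sketched in Python as follows. This is only an illustration under assumptions: the function name build_get_request, the path /path/to/index.html and the User-Agent value are hypothetical, and the host name is written as it appears in this transcript.

```python
def build_get_request(host: str, path: str) -> bytes:
    """Assemble a minimal HTTP/1.1 GET request as the bytes TCP would carry."""
    lines = [
        f"GET {path} HTTP/1.1",            # request line: method, path, version
        f"Host: {host}",                   # domain name of the server
        "User-Agent: example-client/0.1",  # hypothetical browser identification
        "Accept: text/html",               # preferred content type
        "Connection: close",
        "",                                # the blank line that ends the header fields
        "",
    ]
    # Every line ends with CRLF; the two empty entries produce the final blank line.
    return "\r\n".join(lines).encode("ascii")

request = build_get_request("ict.sit.tu.ac.th", "/path/to/index.html")
print(request.decode("ascii"))
```

From TCP's point of view these bytes are just data, which is why the capture shows them inside an ordinary TCP data segment.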
The server also includes some header fields. Again, some information that can be useful for the browser. The date of the response, date and time, and something that identifies the software of the web server. The content length, how long the message body is: 2,856 bytes. And then after the other fields, this is the blank line. And then it's the actual message body. In a response, we include the resource that was requested, the web page. Wireshark calls it line-based text data. But when we expand, we'll see it's the web page that we requested. So the HTML for the web page. So the response contains the web page or the resource that was requested. It contains a start line, header fields, a blank line, and then the actual message body, which is the web page in this case. You see it's the ITS 323 web page in that case. Let's draw that. So we had the blue request, then we had a TCP ACK, the green one, and then the server sent the HTTP response. And importantly, the status code was 200 OK. And that response contains the actual contents, or at least partial contents, of index.html. We saw part of the HTML in the response. So this is actually a TCP data segment coming back. Now, a point was made before that what happened here was the server received a request, a TCP data segment, sent back a TCP ACK, and then a short time later sent back TCP data, which is the response. What could have happened, it didn't in this case, is that TCP can optionally combine these two segments into one. That's possible, it's called piggybacking, where we take the ACK for the previous data and we combine it with the next piece of data that we're sending. Instead of sending two segments, send the same information in just one segment, saving on how much we send across the network. So we could have seen in a different scenario just one segment here containing the data and also the ACK. It depends really on the timing of when the data is going to be sent back. 
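The response structure described above (start line, header fields, blank line, message body) can be pulled apart with a few lines of Python. This is a rough sketch, not how a real browser does it; the function name parse_response and the sample response, including the Server value and the tiny 13-byte body, are invented for illustration.

```python
def parse_response(raw: bytes):
    """Split an HTTP response into start line parts, header fields and body."""
    # The first blank line (CRLF CRLF) separates the header section from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    # Start line: protocol version, status code, status reason.
    version, code, reason = lines[0].split(" ", 2)
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body

# A made-up response with the same shape as the one in the capture.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: ExampleServer\r\n"
    b"Content-Length: 13\r\n"
    b"\r\n"
    b"<html></html>"
)
version, code, reason, headers, body = parse_response(sample)
print(version, code, reason, headers["content-length"], len(body))
```

Note that the Content-Length value matches the number of bytes in the body, which is how the receiver knows when it has the whole message even if it arrives in several TCP segments.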
In this case, there was a significant delay between the ACK and the data. What happens next? Okay, it depends, correct? And it depends on what? It depends on how many TCP segments we need to send the web page back. We saw that the content length was 2,856 bytes. So the web page is almost 3,000 bytes. Do we send that in one TCP data segment or do we split it across multiple TCP data segments? 3,000 bytes is generally too large for a single segment sent across the internet, but it depends on different factors, okay? We cannot always predict how many segments. That's not the purpose here. Let's just look and see what actually happened. So this was the first data segment. If we scroll down, this is the first part of the response, the highlighted orange packet. Then my computer sends a TCP ACK saying I've received that data. But then look at the next two. Both are sent by the server, and in fact they are a continuation of this response. That is, the web page, the HTTP response, was split across three TCP segments. I think three is all in this case if we look in the detail. This is the first TCP segment containing the first part of the web page. This one from the server to my computer was the second part. And this one, if we look in the data we'd see it's the last part of the web page. The web page was large enough to be split across three TCP segments. So it depends on how much content we need to send as to how that data transfer occurs. And we see some ACKs coming back. Let's draw them and see the next step. There was actually a TCP ACK in here. This 200 OK message was split across three segments. So draw them as two more TCP segments that were actually sent one after another. I'll just write data here. That is, those two data segments are actually a continuation of the first blue one, the 200 OK message. The web page is split across three TCP segments. And in fact, there are some ACKs as well at the end. But from HTTP's perspective, the blue ones, it's very simple. 
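As a small worked example of how one HTTP response can end up in three TCP segments, here is a sketch in Python that splits a response into segments and tracks the byte-based sequence and acknowledgement numbers. The 2,856-byte body length is from the capture; the 300 bytes of header fields, the maximum segment size of 1,460 bytes and the starting sequence number of 1 are assumptions chosen for illustration, so the segment sizes will not match the real capture.

```python
def segment(data_len: int, mss: int, first_seq: int = 1):
    """Split data_len bytes into segments of at most mss bytes.

    Returns a list of (sequence number, size, expected ACK number) tuples,
    where sequence numbers count bytes, not segments.
    """
    segments = []
    seq = first_seq
    remaining = data_len
    while remaining > 0:
        size = min(mss, remaining)
        # The ACK for this segment names the next byte the receiver expects.
        segments.append((seq, size, seq + size))
        seq += size
        remaining -= size
    return segments

body_len = 2856    # Content-Length seen in the lecture capture
header_len = 300   # assumed size of the status line and header fields
mss = 1460         # assumed maximum segment size

segments = segment(body_len + header_len, mss)
for seq, size, ack in segments:
    print(f"seq={seq} size={size} -> ack={ack}")
```

With these assumed numbers the response needs three segments, and the final acknowledgement number is one more than the total number of bytes sent. HTTP itself still sees a single response.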
Client sends a request for a resource to the server. The server sends back the response. From HTTP's perspective, there's one response. But depending upon the size of the response, the transport layer may split it into multiple segments, as we saw in this case. What happens next? All right, there are some ACKs. After the ACKs, what's next? Are we done? Okay. So from my browser's perspective, I typed in this URL. What's happened so far is that my computer looked up the domain name and found the corresponding IP address using DNS. I sent a query to a DNS server. And then I sent an HTTP request to that server requesting this resource. And the server sent back the resource. It sent back the actual web page. But to do that HTTP exchange, I actually had to set up a TCP connection, and the response actually had to be split into three TCP segments. And if we were done, we would eventually, and it may happen some time later, close the TCP connection. Because we've transferred the data, we close the connection. I don't think we'll see it in the capture. It happens a bit later. Let's see if we can see the web page. Just look at the first part of the web page in the first response. Here's the first part of the web page being sent from server back to my browser. The browser reads the HTML. Once it gets the second and the third part of the response, it's got the entire web page. And the browser parses the HTML and displays something on the screen. So it displays in the title of my browser, ITS 323 overview. And it displays in some bold large font, header one, Introduction to Data Communications ITS 323 and so on. It displays that on the screen. What font does it use for header one? Anyone know? Anyone know HTML? What could indicate what font to use, say for header one? It doesn't say the exact font here, but what would you normally do? All right, it may default to a value. Can you change the fonts of a web page? Well, yes. 
Can the website set the font? How do we do that? We can easily do it in the HTML, but we don't normally do that now. Where do we do it? In a CSS file, a cascading style sheets file. This is about HTML, not HTTP, but in this web page there's a link to a CSS file, which defines colors, font sizes and so on. Not important. What's important is that for this web page to be displayed by my browser correctly, the browser needs this CSS file to know what fonts to use, to know what colors to use to display on the screen. So what my browser does automatically is perform a request for this file from the web server. So my browser receives the web page and then looks through the HTML for links to other resources that it may need to display the content. For example, style sheets and images: if a web page has a link to an image, then to display it on the screen we need to download that image. So the browser will now automatically go and request those other resources, the style sheets, the images. In our capture, we see that happen here. So there's the response coming back, and then we see a little bit later, about 200 milliseconds later, my computer sends a request for the style sheet. That wasn't the human user typing in a URL. This was the browser automatically creating a new request. So we request the style sheet, and if we follow on we see the style sheet comes back in a response. This is the response. In fact, the style sheet's quite small. And if there were any images in the page, we would see requests for those images. Again, using HTTP: a request message, and the resource comes back. So the browser automatically requests some resources depending upon what's in the original resource, the original web page. So in our case, there was a new request then, a request for site.css. And there may be some ACKs in there, but eventually from HTTP there's a 200 OK that comes back that contains that style sheet file. 
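The way a browser looks through the HTML for other resources to request can be sketched with Python's standard html.parser module. This is a simplified illustration, not what a real browser does internally. The class name ResourceFinder and the sample page are made up; site.css and the title come from the lecture example, while logo.jpg is a hypothetical image added to show the image case.

```python
from html.parser import HTMLParser

class ResourceFinder(HTMLParser):
    """Collect style sheets and images that a page needs in order to be displayed."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # <link rel="stylesheet" href="..."> points at a CSS file.
        if tag == "link" and attrs.get("rel") == "stylesheet" and "href" in attrs:
            self.resources.append(attrs["href"])
        # <img src="..."> points at an image to download.
        elif tag == "img" and "src" in attrs:
            self.resources.append(attrs["src"])

# A made-up page in the spirit of the one in the capture.
page = """<html><head><title>ITS 323 overview</title>
<link rel="stylesheet" href="site.css"></head>
<body><h1>Introduction to Data Communications ITS 323</h1>
<img src="logo.jpg"></body></html>"""

finder = ResourceFinder()
finder.feed(page)
print(finder.resources)
```

Each entry found this way would then become its own HTTP request, with its own response, just like the request for the style sheet in the capture.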
And if the web page had images, there'd be a request for the image, the JPEG file, and the JPEG would come back. And whatever other content or objects are needed to display the web page, the browser automatically requests. And that's the basics of HTTP. For every resource, we have one request, one response. If a web page has images embedded, or style sheets that are linked to, then there'll be separate requests and responses for each of those resources. And the browser automatically makes those requests. This example is similar, but there's a different set of web pages I requested. It's showing not the TCP segments, but just the HTTP messages. In the first case I entered a URL, so I sent a GET request for some URL, for some resource. The web server sent back a reply. Then I requested another, so I clicked on a link in the first web page, some eight seconds later. My web browser sends a request for that next web page, the server sends back the actual web page, and then if that web page contains links to some resources like images, the web browser automatically requests those images, which are sent back so they can be displayed in my browser. And if I, for example, request a page that doesn't exist on the server, the server sends back a response, but in this case it will send back a different status code, 404 Not Found, for example. For every resource, we send one request and get one response. And you're now experts on HTTP. Well, you know the basics of HTTP. It's quite a simple protocol. The details of the different methods, the different status codes, how browsers use style sheets, images, and speed up performance, there are a lot more details, but the basics: send a request for a resource, receive that resource in response. If our resource links to other resources, then send a separate request and get a separate response. And we've got 50 seconds remaining for the course to finish, any questions? That's a hand going up. No, just scratching the head, okay, fine. 
We've gone through the internet applications reasonably quickly. There are many other applications. We just wanted to give a taste of the most common application, which is HTTP, and a little bit about DNS. What you'll see in the lab next semester is some more practical details of how DNS works, how web browsing works, how other internet applications work. That's next semester.