Welcome to the 24th lecture in the course Design and Engineering of Computer Systems. This week we have been discussing a few high-level details about the various layers in the network stack that make up the internet. In this lecture we are going to study the application layer in more detail. So let us get started. The application layer is what all of us see and interact with when we use the internet. When we use the World Wide Web, browse websites, use an e-commerce website, video streaming, or social media, all of these are user applications that live in the application layer of the internet. These applications exchange information with each other over the internet, and, as we have seen, application software is built using APIs like the socket API, which let us send and receive messages over the network. The application layer is defined by a set of protocols, which are nothing but rules by which the different components of an application understand each other. For example, if you want to book a ticket, you send a message with some information, and the server sends back a reply with some other information. There has to be an agreed format for these messages, an agreed order in which they are exchanged, and an agreed order in which information is written into the packets, so that both sides can understand each other. So an application layer protocol is like a common language that both sides speak. We have many such protocols in use today. The most popular one is HTTP, the Hypertext Transfer Protocol. It was initially developed for the web, to browse websites, but many other applications use it today.
Then you have the Simple Mail Transfer Protocol, SMTP, for email transfer, and there are many other such protocols. We will study some high-level ideas about these protocols in this lecture. Note that a protocol only specifies how we communicate. It specifies the language, not what content is exchanged in that language. With HTTP you can exchange HTML web pages, images, videos, and so on. So let us not confuse the protocol itself with the actual data being exchanged by the protocol. So what is HTTP? HTTP is a protocol that helps web clients, typically browsers, communicate with web servers, that is, the machines that actually host content such as web pages or videos. The NPTEL website that you are accessing, the YouTube website, all of these are served by web servers, and this content is served to web clients using HTTP. It is one of the most widely used application protocols on the internet. To understand how HTTP works, let us go back to our example of what happens when you access a web page. The user gives a URL. The URL contains a domain name like nptel.ac.in, and after that, the location of the content at that domain, for example nptel.ac.in slash some course slash some lecture. The user enters the URL in the browser. Then what happens? The server is listening for new connections at some public IP address on a well-known port: 80 for HTTP, or 443 for secure HTTP. The client obtains the server's IP address via DNS and opens a TCP connection to the server.
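To make the parts of a URL concrete, here is a minimal Python sketch that splits a URL into the domain name, the port, and the location of the content. The function name `split_url` and the example path are made up for illustration; only the default ports 80 and 443 come from the lecture.

```python
from urllib.parse import urlsplit

# Default well-known ports: 80 for HTTP, 443 for secure HTTP.
DEFAULT_PORTS = {"http": 80, "https": 443}


def split_url(url):
    """Return (host, port, path) for an http or https URL."""
    parts = urlsplit(url)
    # Use the port written in the URL if there is one, else the default.
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    # An empty path means the root of the site.
    path = parts.path or "/"
    return parts.hostname, port, path


# The path below is a hypothetical example, not a real page.
print(split_url("https://nptel.ac.in/courses/some-lecture"))
```

The host is what the client would hand to DNS, the port is where it would connect, and the path is what it would ask for in the HTTP request.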
So the server is there, and the client does the connect system call and establishes a connection to the HTTP server that is listening on port 80 at some IP address. Now the client sends an HTTP request. It tells the server, please give me some information about this web page, or this course, or this video. The server then sends an HTTP response back. These HTTP requests and responses are just application layer messages sent over, say, TCP sockets. TCP will add its headers, IP will add its headers, layer 2 will add its headers, all of that is happening, but the actual message is an HTTP request, and its format is specified by the HTTP protocol. Once the browser gets a response back from the server, it renders whatever is in that response, for example the HTML, so that you can see the page. This is a basic high-level view of what happens when you browse a web page. The data that the server sends back in an HTTP response can be text, like an HTML page, or images and multimedia files. One web page can have many different things: the HTML itself, some style files telling the browser how the page should be displayed, some embedded images and videos, and all of these are sent to you in HTTP responses. So these are the basics of HTTP. Now let us look at the HTTP request that the client sends to the server. HTTP is a request-response protocol, and the request is the first step.
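The exchange described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the helper names `build_get_request`, `parse_response`, and `fetch` are invented here, the request carries just two headers, and the parser does not handle chunked or compressed bodies the way a real browser would.

```python
import socket


def build_get_request(host, path="/"):
    """Build the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",   # request line: method, resource, version
        f"Host: {host}",          # header naming the site we want
        "Connection: close",      # ask the server to close when done
        "",                       # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw):
    """Split raw response bytes into (status_code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    code = int(status_line.split(" ", 2)[1])
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return code, headers, body


def fetch(host, path="/", port=80):
    """Open a TCP connection, send one GET request, read the whole reply."""
    with socket.create_connection((host, port)) as sock:
        sock.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:          # server closed the connection
                break
            chunks.append(data)
    return parse_response(b"".join(chunks))


# Parsing a canned response, so the example runs without a network.
canned = b"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<h1>hi</h1>"
print(parse_response(canned))
```

Calling `fetch` with a real host name would perform the DNS lookup, the TCP connect, and the request-response exchange in one go; it is left uncalled here so that the sketch does not depend on network access.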
The request can be of multiple types. It can be a GET request, which is used to get information, or it can be used to update information at the server. For example, if you are booking train tickets, you can POST information to the server saying these are my dates, this is my route, this is the place I want to go to. So your HTTP request can be GET, POST, PUT, and there are several other types like this. Then you specify the resource location: I want to get information about this course, or this product on an e-commerce website. So your GET request also names the resource you want. The request also has various header fields, which carry extra information in addition to the main message: what type of browser you have, what language you prefer, and so on. In addition, the request can have a body, which contains, for example, the values you want to post. So you have the request type, then some headers, then a body, and all of this together is your application layer message, your HTTP request.

At what granularity do you make HTTP requests? If you have multiple objects in a web page, say the HTML text, an image, a video, some script files, all of that cannot be fetched in one HTTP request. Every object on a web page is fetched via a separate HTTP GET request. Your browser makes one GET request for the main page, and then makes separate GET requests for each of the other items inside that page. So an HTTP GET request will only get you one item, and an HTTP POST request will only update one item. Now, for these multiple GET requests, do you have to open separate sockets? It is not necessary. If you already have one socket open between a client and a server, you can get one object, then another, then another over the same TCP connection. These are called persistent connections. Instead of opening a separate TCP connection every time, with a TCP handshake, connect, and accept, you open a connection once and fetch multiple objects using separate HTTP requests on it. The other thing you can do is open parallel connections to one web server and fetch one object on each connection at the same time, for faster performance. If your web page has 20 objects, you can open four connections in parallel and get five objects on each.

So this is about the HTTP request. Next is the HTTP response. We have seen a simple server socket program: accept a connection, then read from the socket. Once the server accepts your TCP connection and reads your HTTP request from the socket, it generates an HTTP response back to the client. The response has a status code indicating whether the server could successfully handle your request. A common status code is 200, which means OK. Another is 404, which means the server could not find the resource you asked for. There are various other status codes. The response also has header fields, such as the type of content being sent and a timestamp of when the content was last updated, and then the actual content. The HTTP protocol specifies how the status code, the headers, and the content are put into a message and sent to the client. Note that there are two ways of generating an HTTP response. Either the content the user has requested, say an image file, already exists at the server, in which case it is a static response, or the server has to generate the content for each request. For example, if the user has entered some search keywords on an e-commerce website, you may not have a ready-made file to send back. The server will look up its database, construct an HTTP response, and send it back to the client. Those are called dynamic responses.
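As a rough sketch of the difference between static and dynamic responses, the Python below builds response messages with a status code, a couple of headers, and a body. Everything named here (`STATIC_FILES`, `CATALOG`, `make_response`, `handle`, the `/search` path) is a hypothetical stand-in; a real server would read files from disk and query a real database.

```python
# Hypothetical static content and a stand-in for a product database.
STATIC_FILES = {"/index.html": b"<h1>Hello</h1>"}
CATALOG = ["red apple", "green apple", "banana"]


def make_response(code, reason, body, content_type):
    """Put status line, headers, and body together into one message."""
    head = (
        f"HTTP/1.1 {code} {reason}\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


def handle(path, query=None):
    """Return the response bytes for one request."""
    if path in STATIC_FILES:
        # Static response: the content already exists at the server.
        return make_response(200, "OK", STATIC_FILES[path], "text/html")
    if path == "/search":
        # Dynamic response: content is generated for this request.
        hits = [p for p in CATALOG if (query or "") in p]
        body = "\n".join(hits).encode("utf-8")
        return make_response(200, "OK", body, "text/plain")
    # The requested resource does not exist at this server.
    return make_response(404, "Not Found", b"not found", "text/plain")


print(handle("/search", "apple").decode("utf-8"))
```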
One thing to remember about HTTP is that it is stateless. There is a request, there is a response, and that is it. The server does not remember anything about the client. It will not remember that last week you made this request and it sent you that response. But then you might wonder: some websites track you, some websites know what you have done before on that website. How is that accomplished? It is accomplished through what are called HTTP cookies. When you make a request, the server sends back, along with the response, a special identifier for this user, called the cookie. The next time you make a request, you send this cookie to the server, so the server knows who you are and can recall some information about you. If you clear all your cookies, the server cannot recollect any information about you, so if you want more privacy you can disable cookies in your browser. HTTP responses can also be cached. Say you got some image file, and some other user is going to the same server and getting the same HTTP response. This is an idea we have seen again and again in the course: you can cache the response. Your browser maintains a cache, and there are also special caches in the network, so that if multiple users in a network are accessing the same content, it can be obtained from the cache instead of going to the server. The HTTP response headers specify whether an item can be cached and how long it stays valid in the cache.
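The cookie mechanism can be sketched as follows. This is an illustrative toy, not how any particular website works: the cookie name `sid`, the `SESSIONS` table, and the `handle` function are assumptions made for the example, and the "state" is just a visit counter.

```python
import uuid
from http.cookies import SimpleCookie

# Server-side table: cookie value -> what the server remembers about the user.
SESSIONS = {}


def handle(request_headers):
    """Handle one request; return (response_headers, visit_count)."""
    cookie = SimpleCookie(request_headers.get("Cookie", ""))
    sid = cookie["sid"].value if "sid" in cookie else None
    response_headers = {}
    if sid not in SESSIONS:
        # Unknown client: hand out a fresh identifier in the response.
        sid = uuid.uuid4().hex
        SESSIONS[sid] = {"visits": 0}
        response_headers["Set-Cookie"] = f"sid={sid}"
    SESSIONS[sid]["visits"] += 1
    return response_headers, SESSIONS[sid]["visits"]


# First request carries no cookie, so the server issues one.
first_headers, first_count = handle({})
# The client sends the cookie back, so the server recognises it.
_, second_count = handle({"Cookie": first_headers["Set-Cookie"]})
# A client that cleared its cookies looks like a brand-new user.
_, cleared_count = handle({})
print(first_count, second_count, cleared_count)
```

HTTP itself stays stateless here: each call to `handle` sees only the headers of that one request, and any memory lives in the server's own table, keyed by the cookie.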
Most web servers today actually generate dynamic content. If you are distributing static content, you can just hand it off to CDNs, which we have seen in a previous lecture. Most application servers and web servers today are focused on dynamic content, because that is where the most interesting computer systems are built. How is dynamic content served? If you want to search for some products on an e-commerce website, you enter some keywords, and your HTTP request conveys these keywords to the server. The server looks at the keywords, contacts databases or other servers, looks up some catalog of products, assembles the response, and then sends it back to the client. So there are many steps here. At the client side also, there could be scripts running that modify how this response is displayed. Suppose the server has sent a list of a hundred products. Your browser can give you an option to view only the first 10. That is done by client-side scripts running in your browser, for example JavaScript. In this way, dynamic content is generated at the server side and also dynamically displayed at the client side.
Now this is a complicated thing. It is not like a simple socket program where you read something from a socket and write something back. You have to talk to multiple components and construct the response dynamically. Therefore you have pieces of software today called web application frameworks that make it easy to build web servers or application servers that serve dynamic content. These frameworks have all the pieces required: the web server, the database, scripting languages to parse the request and construct the response in a wide range of programming languages, front-end tools for the scripts that display the dynamic web page at the client side, and the server-side tools needed for back-end processing. All of these are packaged together in a web application framework. When you use one, you do not have to worry about setting up your database or web server or opening a socket. All of that is handled for you, and you can focus on your application-specific logic: how to handle an HTTP request, what to fetch from the database. Even decisions like whether to have one thread per connection or use event-driven APIs are handled by the web application framework.

HTTP also has many optimizations today. Version 1.1 is evolving to version 2 and version 3, which have many features to improve performance. For example, HTTP/2 improves performance in several ways over HTTP/1.1. It has a more efficient way of transferring data: HTTP/1.1 sends data as text, but HTTP/2 sends it in a more compact binary format with compressed headers. The server can also push some objects instead of waiting for the client to ask for everything. If the server knows the client needs something, it can push it. You can also have multiple streams over the same TCP connection, so that you can quickly send multiple objects. If you upgrade your web server to HTTP/2, you get all of these features for improved performance. You might wonder, why do you need multiple streams in the same TCP connection? Once your TCP connection has done its slow start and settled into some stable state, you might want to fetch multiple objects in parallel on that one connection instead of opening multiple TCP connections. Of course, this has a problem. If a packet carrying one of the objects is lost, all the streams on that connection are blocked while TCP retransmits it. Even if the data for the other objects has gotten through, it cannot be delivered, because TCP only does in-order delivery. Therefore people are also moving away from TCP towards UDP-based transport protocols. HTTP/3 uses a new transport protocol called QUIC, which is built on top of UDP and adds reliability and congestion control in a somewhat different way from TCP. These are advanced topics that we do not have time to discuss in detail, but the thing to understand is that people are constantly working on how to make HTTP transfers faster, and newer versions of HTTP are evolving to handle these issues. There is also work being done on how to design the web page itself so that it loads faster, so that the important content comes first and the user is not staring at a blank screen for a long time. These are all active areas of research on how to make applications faster on the internet.

One other thing that you would have come across is JSON and similar formats, so I want to briefly explain what these are. Whenever you are sending big data structures or objects over protocols like HTTP, you have to serialize them. What does that mean? You need a way of deciding how to break up the object and stream it as a series of bytes. There are certain standardized formats to serialize objects and deserialize them at the other end. One popular format is JSON, which is a text-based format. Suppose you have some data structure, for example a banana represented as an object with a name, a type, and a color. JSON will serialize that object into a string, which is sent over the network. When you receive this string, you can parse it back into an object and access its fields. So you can easily go from a data structure to a string and back, and many programming languages support this. Note that both sides need to agree on the standard format of the string. If different people serialize and deserialize in different ways, you cannot recover the object on the other side. The other popular format in use today is protocol buffers. With protocol buffers you define a message, that is, a structure with all its fields and data types, and the protocol buffer compiler automatically generates code that converts this object into a byte string, converts a byte string back into the object, and modifies the various fields of the object. Your application just creates the object and gives it to this generated code, which takes care of converting it into an output byte string. There are several such serialization formats available, so that you as an application developer do not have to reinvent the wheel every time and figure out how to write your big data structure into a socket. All of that is taken care of if you use one of these standard data serialization formats.

The next application that we are going to study is email. We have another application protocol, SMTP, that deals with email. Just like websites are stored in web servers, email is stored in mail servers. A mail server is again nothing but a process listening on a well-known port. For email it is port 25, whereas for the web it is port 80. This software manages all the email that you get. Companies like Gmail, your own college, every organization can have its own mail servers to store email. When the user accesses the mail server, the user uses something called a user agent. If your browser is your user agent, it is called webmail, or you can have dedicated email clients like Outlook or Thunderbird. These user agents talk to a mail server in order to send and receive emails, and the protocol that a user agent uses to send email to the mail server is SMTP. Now why do you need a separate protocol? Why couldn't we use HTTP? Note that different protocols are designed for different purposes. When you send an email, you have to push it. You are not fetching anything. HTTP is designed mainly to get information from a website, whereas SMTP is designed to push emails to the other person. The semantics and the headers are different, and therefore you have different application layer protocols for different purposes.

So let us understand what happens when you send email from one email ID to another. You have a user A who uses a mail server at sender.com to send email to another user B at rx.com. The user agent of A, which is either a browser or some email client, first obtains the IP address of the mail server at sender.com. How? That is also done via DNS. DNS gives you the IP address of a website, and it also gives you the IP address of the mail server for a domain. For example, gmail.com has a website and also has mail servers that handle all of its email. Then the user agent opens a TCP connection to that mail server and sends SMTP messages over it to push the email to the mail server at sender.com. Then sender.com opens a TCP connection and sends SMTP messages to push the email to the mail server at the receiver's side. At a later point in time, user B contacts the mail server at the receiver's domain and fetches the email. Note that on the receiver's side you cannot use SMTP, because here you have to pull. Therefore there are other protocols available, like IMAP, or you can use HTTP to pull your email, which is like getting information from a website. So the user agent of B pulls the email, but all of the pushing, from the user agent of A to its mail server and between mail servers, happens via SMTP.
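The push from user A's user agent to its mail server can be sketched with Python's standard `smtplib` and `email` modules. The addresses and the host name passed to `push_to_mail_server` are placeholders taken from the example domains above, and the sketch assumes a server that accepts plain SMTP on port 25; real providers usually also require authentication and encryption, which are left out here.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, text):
    """Create an email with the usual headers and a plain-text body."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(text)
    return msg


def push_to_mail_server(msg, server_host, port=25):
    """Open a TCP connection to the mail server and push msg using SMTP."""
    with smtplib.SMTP(server_host, port) as smtp:
        smtp.send_message(msg)


# Placeholder addresses based on the example domains in the lecture.
msg = build_message("a@sender.com", "b@rx.com", "Hello", "Hi B, this is A.")
print(msg["From"], "->", msg["To"])
# push_to_mail_server(msg, "mail.sender.com")  # hypothetical host, not run
```

The last line is left commented out because it needs a reachable mail server. User B's side is not shown, since fetching the mail would use a pull protocol such as IMAP rather than SMTP.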
The one final concept I want to discuss in this lecture is remote procedure calls, or RPC. RPC is just a different way of thinking about client-server communication, where you call the server's code as if it were a function in your local code. Let us continue the same example of searching for products on an e-commerce website. One way of doing it is to send the keywords you are searching for in an HTTP request, and the server returns an HTTP response. The other way is this: if the server has some function defined in its code, like search, you can invoke that function at the remote server just like you would call a local function in your own code. This way of programming applications is called remote procedure call, or RPC. Now you might wonder: in my code I know what functions are there, but how do I know what functions are in the server's code? That is where various RPC libraries or RPC frameworks come in, which help you implement RPC in client-server applications.
For example, one thing you need is a common description of the interface. The client needs to know what functions the server has, what messages it can send, and what the arguments to these functions are. All of that is written in a common interface description language and agreed upon between the client and the server. From this interface description, the client knows what functions are available at the server. Then, when the client invokes a function like search, the RPC library takes care of everything. On the client side there is a small stub that serializes the call into a message, opens a socket, and talks to the server. On the server side, receiving this message over the socket and converting it into the arguments for the function is all done by the RPC library. You as the user just have to implement the functions at the server side and invoke them from the client side. So RPC is a very different way of programming. In some sense it is more intuitive, because you write your application in the form of functions and invoke those functions just like you would in a regular program, with the RPC library taking care of all the complications of remote execution. There are many RPC frameworks available today, gRPC being a popular one, and you have various choices for how to serialize and deserialize, whether to communicate over TCP or UDP, how to exchange messages, and whether the RPC is blocking, that is, whether the client blocks when it invokes a function, or whether it uses an event-driven API. All of these choices are available in different RPC libraries, and the RPC library will also take care of network failures.
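To show what a client stub and a server-side dispatcher do, here is a deliberately tiny RPC sketch in Python. It is not how gRPC or any real RPC library is implemented: JSON is assumed as the serialization format, the "network" is just a direct function call, and `Stub`, `server_dispatch`, `EXPOSED`, and the catalog data are all invented for the example.

```python
import json

# ---- Server side: the functions the server chooses to expose. ----
CATALOG = ["red apple", "green apple", "banana"]  # hypothetical data


def search(keyword):
    """Return all products whose name contains the keyword."""
    return [p for p in CATALOG if keyword in p]


EXPOSED = {"search": search}


def server_dispatch(request_bytes):
    """Deserialize a request, call the named function, serialize the result."""
    request = json.loads(request_bytes)
    func = EXPOSED[request["method"]]
    result = func(*request["args"])
    return json.dumps({"result": result}).encode("utf-8")


# ---- Client side: a stub that makes remote calls look local. ----
class Stub:
    def __init__(self, transport):
        # transport: a callable that carries request bytes to the server
        # and returns reply bytes. A real stub would use a socket here.
        self._transport = transport

    def __getattr__(self, name):
        def call(*args):
            request = json.dumps({"method": name, "args": args})
            reply = self._transport(request.encode("utf-8"))
            return json.loads(reply)["result"]
        return call


server = Stub(server_dispatch)
# Looks like a local function call, but goes through serialize/dispatch.
print(server.search("apple"))
```

In this sketch the client simply trusts that `search` exists at the server. With an interface description language, the stub would instead be generated from the agreed interface, so a wrong function name or argument type could be caught before anything is sent.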
When you do a local function call, there is no notion of packets getting lost. But when you make a request to a remote server, your request can get lost, and the server may never get it. Therefore your RPC client has to retransmit the request, and that is also taken care of by RPC libraries. Some libraries will simply execute the function at the server repeatedly. For that to be safe, your server code has to be idempotent, that is, repeating the function multiple times must have the same effect as running it once. Otherwise the RPC library has to somehow ensure that it invokes the function exactly once. We will study all of this in more detail in the reliability part of the course. The thing to note is that RPC is not as simple as running a local function, because the RPC library has to handle all of these other complications. Then why would you use RPC versus an application layer protocol? For example, if you want to get some information, will you use RPC or will you use HTTP? Well, the choice depends.
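The retransmission problem can be illustrated with a small simulation. This is a sketch under made-up assumptions: `add_credit` is a hypothetical non-idempotent server function, the lost reply is simulated with an exception, and the server avoids executing a repeated request twice by remembering request IDs, which is one possible technique and not necessarily what a given RPC library does.

```python
import itertools


class Server:
    def __init__(self):
        self.balance = 0
        self._seen = {}  # request id -> reply already computed for it

    def add_credit(self, request_id, amount):
        # On its own this operation is not idempotent: running it twice
        # adds twice. Remembering request ids makes a retry harmless.
        if request_id in self._seen:
            return self._seen[request_id]
        self.balance += amount
        self._seen[request_id] = self.balance
        return self.balance


class LostReply(Exception):
    """Simulates a reply that never reaches the client."""


def flaky(func, fail_first_n):
    """Wrap func so its first fail_first_n replies are lost after it runs."""
    remaining = [fail_first_n]

    def wrapper(*args):
        result = func(*args)      # the server did execute the call
        if remaining[0] > 0:
            remaining[0] -= 1
            raise LostReply()     # but the client never hears back
        return result

    return wrapper


_ids = itertools.count(1)


def call_with_retries(remote, amount, attempts=3):
    request_id = next(_ids)       # the same id is reused on every retry
    for _ in range(attempts):
        try:
            return remote(request_id, amount)
        except LostReply:
            continue              # retransmit the request
    raise RuntimeError("no reply from server")


server = Server()
remote = flaky(server.add_credit, fail_first_n=1)
print(call_with_retries(remote, 10), server.balance)
```

Without the `_seen` table, the retry would add the credit a second time and the balance would end up at 20, which is exactly the hazard of blindly repeating a function that is not idempotent.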
With RPC, the problem is that your client and server are tied together very closely. You cannot just open a connection to any server and request HTTP objects. You have to know what functions are implemented at the server: the function names, the number of arguments, the messages. All of that is closely agreed upon between the client and the server, so you have less flexibility. Whereas if you use protocols like HTTP, you have more flexibility. Any client can talk to any server, and you are not coupled with the server in the same way. That said, RPC is very versatile. You can handle many different types of applications. You do not need one protocol for web and one protocol for email. You can do everything with RPC, so in that sense it is very powerful. RPC communication can also be better optimized. For example, HTTP has so many headers because it is very general, and anybody can use HTTP for anything. With RPC you can customize what messages and what information are exchanged for your specific purpose. So RPC is more specialized and can be optimized better, but it is also less flexible than application layer protocols. We will revisit this discussion of RPC versus HTTP later in the course, when we think about end-to-end application design.

I would like to wind up this lecture on the application layer here. In this lecture we have covered some popular application layer protocols like HTTP and SMTP, we have studied how applications can serialize data to send over the network, and we have also studied the concept of remote procedure calls. To understand this better and actually see the various headers of these application layer protocols, I request you all to use Wireshark to capture packets coming in and out of your computer, say when you are browsing the web or sending email, and see what the various headers in HTTP are and what the message format is. All of that you can inspect in Wireshark. So that is all I have for this lecture. Thank you all, and see you in the next lecture.