I am a staff engineer at InMobi; shameless plug, we are hiring, so reach out to me afterwards. I'm here to discuss HTTP/2. It's not something extremely new to a lot of folks here, but while I was going through the RFC a few months back, I came across some very, very subtle changes which I feel are going to revolutionize the way we look at applications right now. Chances are that most of you have used HTTP in the last few minutes, and you obviously know what it is, so I won't discuss it too much. Just to give you an overview: traditionally, HTTP/1.1 (a lot of people talk about 1.0 as well) started around 1999 as a 170-odd page RFC. It kept growing and growing and is now, I think, close to 400 pages, but it essentially covers the fact that you make a TCP connection, you send a bunch of headers requesting an asset, the server sends you back a bunch of headers with that asset, and you ask again and again and again, and that just keeps on going. So we have a beautiful protocol called TCP, but then we are not at all utilizing it efficiently and are just finding hacks around it. For instance, you must be familiar with JS concatenation, where you put all your JS scripts into a single file; or CSS spriting, which is used in games, where you pack every image into a single big image and then use CSS to pull out chunks of it, just so that you make a single call to the server. Then there are other hacks like domain sharding: HTTP/1.1 limits the number of connections you can have to a single host, so what you do is create CNAMEs pointing to the same server such that you can fan out, bombard those CNAMEs, and fetch assets in parallel.
But what essentially happens is that you get the head-of-line blocking problem: until the first thing you asked for has come in, you're just lying there dead; you can't go forward. These problems become obvious as soon as you start writing an HTTP application. So a few smart folks at Google came up with a protocol called SPDY. The overview of SPDY is that you use a single TCP connection and then you have streams inside it, so you don't waste the crucial handshake round trips establishing new TCP connections. How this started off was that the world's adoption of Google's SPDY was not that great, and then there's this working group called httpbis ("bis" being Latin for "twice", as in a second revision, hence HTTP/2). They said: you know what, let's just pick this up from where it is. There was the SPDY/3 draft, they called it HTTP/2 draft 00, and they started from there. Right now it's at draft 17, which expires in August, and hopefully RFC 7540 and RFC 7541 will be up by then. Given that we're dealing with a 16-plus-year-old technology, they knew HTTP/1.1 would be in use for a few more decades, so everything is backward compatible: the URI schemes (your http:// and https://) will not change, the default ports stay the same, and even the handshake, which we'll see later on, is HTTP/1.1 compliant. And they made one smart decision: there will no longer be any minor versions. So it's 2, and then there's 3, and then there's 4; there's no 2.1 or 2.2 and so on and so forth. So how does the HTTP/2 handshake look? As I said, in order to retain backward compatibility, the browser sends a regular HTTP/1.1 request with the header Upgrade: h2c (h2c meaning HTTP/2 cleartext) and the base64-encoded HTTP2-Settings.
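To make that handshake concrete, here is a minimal sketch in Python of what the client's upgrade request could look like. The specific setting, its value, and the host name are my own illustrative choices, not taken from the talk; this builds the bytes only and does not speak to a real server.

```python
import base64
import struct

# The HTTP2-Settings header carries a base64url-encoded SETTINGS payload.
# Each setting is a 16-bit identifier plus a 32-bit value; here we encode
# one hypothetical setting, SETTINGS_MAX_CONCURRENT_STREAMS (0x3) = 100.
settings_payload = struct.pack(">HI", 0x3, 100)
http2_settings = base64.urlsafe_b64encode(settings_payload).rstrip(b"=")

# A regular HTTP/1.1 request that asks the server to switch to h2c.
upgrade_request = (
    b"GET / HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"Connection: Upgrade, HTTP2-Settings\r\n"
    b"Upgrade: h2c\r\n"
    b"HTTP2-Settings: " + http2_settings + b"\r\n"
    b"\r\n"
)
print(upgrade_request.decode())
```

An HTTP/2-aware server would answer this with 101 Switching Protocols; anything else simply ignores the Upgrade header and carries on with 1.1.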
So if the server is HTTP/2 compliant, it replies with a 101 Switching Protocols status and an empty line, and the client understands: okay, I'm talking to an HTTP/2 server, and henceforth we'll have an HTTP/2 session. If it is not an HTTP/2 server, it will just ignore the Upgrade: h2c header and continue with the traditional 1.1 "ask me and then I'll tell you" request pattern. But then you'll want to know what it looks like on the wire. Fortunately or unfortunately, it is all binary. No longer can you telnet to google.com on port 80 and type GET / HTTP/1.1. Everything is binary; you'll have to pull up tcpdump or Wireshark or whatnot and start looking into the protocol. The reason for going binary was frames. So why frames? Frames are the core unit of the HTTP/2 protocol (I'll just call it H2 for short). They allow you to, A, get rid of your textual spaces and junk characters, and at the same time they give you much better control over the framing, and the most important bit here is flow control, which I'll discuss further; it lets you utilize your streams much better. The generic frame header looks like this: nine octets, with 24 bits for the length of the payload, an 8-bit frame type, 8 bits of flags, one reserved bit for the handshake, and a 31-bit stream identifier; the rest of the stuff is your payload. There are roughly ten types of frames. The most important ones, I'd say, are HEADERS, DATA, and PUSH_PROMISE. Then there are SETTINGS and GOAWAY, which is H2's way of saying: get lost, that's a malformed protocol message you sent, I don't want to talk to you anymore. So how does the request header look?
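The nine-octet layout just described is simple enough to parse by hand. Here is an illustrative sketch (my own code, not a full frame parser) that pulls the length, type, flags, and stream identifier out of a raw frame header:

```python
def parse_frame_header(buf: bytes):
    """Parse the 9-octet HTTP/2 frame header.

    Layout: 24-bit payload length, 8-bit type, 8-bit flags,
    then 1 reserved bit followed by a 31-bit stream identifier.
    """
    if len(buf) < 9:
        raise ValueError("need at least 9 octets")
    length = int.from_bytes(buf[0:3], "big")                  # 24-bit length
    frame_type, flags = buf[3], buf[4]                        # type and flags
    stream_id = int.from_bytes(buf[5:9], "big") & 0x7FFFFFFF  # drop the R bit
    return length, frame_type, flags, stream_id

# Example: a SETTINGS frame (type 0x4) on stream 0 with an empty payload.
header = b"\x00\x00\x00" + bytes([0x4, 0x0]) + (0).to_bytes(4, "big")
print(parse_frame_header(header))  # (0, 4, 0, 0)
```

Masking off the top bit of the last four octets is what discards the reserved bit, which receivers must ignore.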
If you remember, in HTTP/1.1 you said: GET something-something, I'll accept gzip, and I am so-and-so user agent. Well, that has slightly changed. H2 has introduced something called pseudo-headers, which covers the few things that will always be there in every request: the method (GET, POST, and so on), the scheme (http or https), the authority (the host you're talking to), and the path. And on the response side, the status is a pseudo-header too. They are sent in the binary format, they start with a colon before the name, and you cannot get away without them: if you miss one, the frame is treated as malformed and discarded. Cookies are concatenated into a single string separated by semicolons, and it's the job of the receiver to split on those semicolons and take it forward. After this come your regular key-value headers, whatever you tend to use: content-type, content-length, cache-control, and so on and so forth. Those work as they do in the traditional way. The best part is the HPACK algorithm. SPDY was using deflate; HPACK is slightly better, and when the IETF decided to go with it, there were quite a few tests out there showing a significant improvement with HPACK, especially in header compression.
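The "miss one and the frame is discarded" rule can be sketched as a tiny validator. This is purely my own illustration of the ordering and presence rules described above, not library code; real stacks also enforce more (for example, lowercase names and no connection-specific headers).

```python
# Pseudo-headers every request must carry (plus :authority in practice),
# and all of them must appear before any regular header.
REQUIRED = {":method", ":scheme", ":path"}

def validate_request_headers(headers):
    """Return True if the header list obeys the pseudo-header rules."""
    names = [name for name, _ in headers]
    if not REQUIRED.issubset(names):
        return False  # a missing pseudo-header means the frame is discarded
    seen_regular = False
    for name in names:
        if name.startswith(":"):
            if seen_regular:
                return False  # pseudo-headers may not follow regular ones
        else:
            seen_regular = True
    return True

ok = validate_request_headers([
    (":method", "GET"), (":scheme", "https"),
    (":authority", "example.com"), (":path", "/"),
    ("user-agent", "demo"),
])
print(ok)  # True
```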
Now, let this sink in a bit. We are saying that we are going to compress the headers, because for every request, say a beacon request where the response is just a one-by-one GIF, we sent along a whole pile of header bytes for that GIF, your user agent and whatnot, and the server responded with its own set of headers, which you may be able to control, but to me they're still junk because I don't need them. It's extremely inefficient to send them as clear text in both directions when bandwidth capacity is limited these days and response time is what you care about. By allowing compression, you are reducing the chatter of clear-text headers in the pipe. On the left side is your traditional HTTP response, where the server said: 200 OK, I'm sending you an image with a size of 123 bytes, and here's the image binary. In the H2 format it would be: a HEADERS frame flagged to end the previous stream and to end the headers, the :status pseudo-header carrying the 200 response, then the key-value pairs, which are completely user controlled, and then the DATA frame kicks in and starts sending you the binary data. Now comes the most important bit, which is streams. Imagine a pipe which has further sub-pipes inside it, so that nothing needs to go beyond that initial pipe. Sorry for that analogy; let's put it this way: if you were to talk to someone, you would rather talk to them once and have all your questions answered, rather than keep going back to that person for every question you have. Streams give you bi-directional, concurrent sub-channels within a single TCP connection, and either side can unilaterally close one.
This is extremely efficient when you're talking about an average web page which has, say, 50 or 80 assets in it. If you've ever opened the developer mode of your browser, you'd have seen a bunch of requests going to your target website, and most of those requests are unique TCP connections. That is just plainly inefficient to me. Streams let you get away from that: you've already established one TCP connection, so let's have these concurrent questions asked and answered inside it. And the good thing, as I mentioned, is flow control. Once you have streams, flow control ensures that none of the streams takes precedence over the others in terms of capacity. You don't have one stream coming in with a big window and just starving the other streams; flow control makes sure the streams stay within the window settings you negotiated earlier. Here's a small diagram showing a single connection with two streams, stream 1 and stream 2. They are concurrent and not dependent on one another: the client has made two separate concurrent requests to the server, and each of them is being answered individually. What's unique here is the stream identifier. H2 says any client-initiated stream gets an odd number and any server-initiated stream gets an even number, which we'll look into in the next slide. The other things streams allow are stream dependency, prioritization, and hierarchy. On the left side you can see an obvious dependency: streams B and C depend on A, fair enough. Now you introduce a stream D, so B, D, and C all depend on A, but let's say that for some reason we want to give D a higher priority.
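The per-stream capacity rule can be sketched as simple window accounting: a sender may only emit as much DATA as the stream's window allows, and WINDOW_UPDATE frames replenish it. This is a toy model of the bookkeeping described above, not a real implementation; only the 65,535-octet default initial window comes from the protocol, the rest of the numbers are illustrative.

```python
class StreamWindow:
    """Toy per-stream flow-control window."""

    def __init__(self, initial=65535):  # H2's default initial window size
        self.window = initial

    def can_send(self, n):
        return n <= self.window

    def consume(self, n):
        # Sending n octets of DATA shrinks the window.
        if not self.can_send(n):
            raise RuntimeError("would exceed flow-control window")
        self.window -= n

    def window_update(self, increment):
        # A WINDOW_UPDATE frame from the receiver reopens capacity.
        self.window += increment

s1 = StreamWindow()
s1.consume(65535)            # stream sends a full window of DATA
print(s1.can_send(1))        # False: blocked until a WINDOW_UPDATE arrives
s1.window_update(16384)
print(s1.can_send(16384))    # True again
```

The point is that one greedy stream exhausts only its own window; the connection and the other streams keep their own budgets.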
So you can send a mid-flight PRIORITY frame saying: give stream D three-fourths of the weight and the others the remaining one-fourth. Even though the streams were established with uniform priority, you can change their priorities mid-flight. In the second example, you introduce stream D and all of a sudden you want B and C to depend on D, while D depends on A; so you can have a hierarchical priority, a stream hierarchy, set up here. And if for some reason D vanishes or gets killed, B and C will not be orphaned: they immediately become the children of stream A. Server push makes sense in this example. Traditionally you have a web page which, let's say, requires two things: a CSS file and a JS file. The client asks for the web page on a single stream. The server knows that the next requests coming in will be for the CSS and the JS files. So what the server does is send something called a PUSH_PROMISE. It says: I promise to send you another file on a separate stream, and another file on yet another stream, and here are the details of each file. So essentially the client learns that styles.css and custom.js are things it will be needing, and that the server has promised to send them on separate streams, so once it parses stream 1 it will not ask for these again. This removes the need for the client to ask for those files at all. Two new streams are created on which the server pushes the files without the client having to ask for them. Imagine the amount of latency this removes. Let's say you have 80 assets being loaded on a single page and the server sends a push promise for all those assets on the single request for that page; you cut out a huge chunk of time, essentially.
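The reparenting rule for a killed stream can be sketched in a few lines. This is my own illustration of the behaviour described above (children of a closed stream move up to its parent), not a real priority scheduler:

```python
# Dependency tree as child -> parent; None marks the root.
parents = {"A": None, "D": "A", "B": "D", "C": "D"}

def close_stream(parents, gone):
    """Remove a stream and reparent its children to its own parent."""
    grandparent = parents.pop(gone)
    for child, parent in parents.items():
        if parent == gone:
            parents[child] = grandparent
    return parents

close_stream(parents, "D")
print(parents)  # {'A': None, 'B': 'A', 'C': 'A'}
```

So when D disappears, B and C are not orphaned; they hang off A exactly as the slide shows.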
The latency reduces a lot, especially if you know these are statically included assets which you are going to send to the client anyway. So what's the current state right now? Well, Google said they'll discard SPDY and move completely to H2, so they're on draft 17. Twitter is on draft 17. A bunch of sites are still using SPDY; you can download one of those indicator plugins and figure it out. As far as servers go, nginx and the Apache web server said they'll wait for the protocol to become extremely stable before adopting it. But until then there is one very good web server called nghttp2; I request you to check it out. It's written entirely from scratch and is in strict coherence with H2. Apache Traffic Server is also compliant with H2, and Firefox and Chrome, obviously, now support HTTP/2 along with SPDY. So why would you need this? For lack of a better word, just look at the efficiency. Forget the header compression; there are other ways to get more juice out of it. And apologies that I didn't discuss TLS, because I wouldn't do it justice; it needs an entire talk in itself. But the fact that we are using a single TCP connection to talk to a server and fetch everything we need from it helps you scale a lot, and provisioning, network design, and everything else becomes much simpler and much more sensible. On TLS, for that matter: imagine you have 80 assets fetched over HTTPS; you essentially end up making dozens of TLS connections, be it to Akamai, to some CDN provider, or to your own servers. And those TLS handshakes are expensive.
Now, imagine you concatenate everything onto a single TCP connection: you can, for what it's worth, have an entire site served over TLS and not see any significant increase in latency. So here's something you could actually try right now. Let's say you run a website with a lot of thumbnails and assets, and you don't want to invest in something like nghttp2. You can have your static content served through Apache Traffic Server: let your clients talk to ATS, let ATS be your endpoint, a CDN endpoint, serving via H2, and let ATS talk to your web servers via HTTP/1.1, leaving them at peace. This should significantly change how you serve static content to your customers right now. Another off-the-shelf idea that comes to mind is designs where you have Backbone- or Angular-like client-side MVC frameworks making API calls to the back end: you can essentially just piggyback on the single TCP connection which you made initially. That's it; you don't need to reconnect every time you want to make an API call. And with that, I conclude my talk. It was a lightning talk and I'm sorry I'm not doing justice to this protocol, but I request you to go through these references and have a glance at them; they are really, really interesting. And we are open for questions. We have time for one question. Audience: I have a question about the slide with the push promise; could you please show that slide? Yes, this one. So now the server knows that the client expects a CSS and a JS file. But what happens if it's actually already been cached in the client? That's a good point. If it's already cached in the client, the client will reset that promised stream.
Essentially, it will tell the server: I don't need you to set up that stream. But, see, it already got the push promise; does the client now have to send requests saying "don't send me that"? Not quite. What happens is that when you get a PUSH_PROMISE, a new server-initiated, even-numbered stream is reserved, say stream 2, and the client can immediately send a reset (RST_STREAM with the CANCEL code) on it, so the stream is discarded. But what if the server has already started sending? The promise has to arrive on the original stream before any of the pushed data, so the client gets a chance to cancel first, and the server cannot just bombard a client: the client's settings control whether and how much the server may push at all. So after getting a push promise for stream 2, the client either lets the data arrive on stream 2 or cancels it. Okay, got it. Thanks. And if you remember, in the earlier slide I showed the h2c upgrade header which carries HTTP2-Settings; that's where you control things like the maximum number of concurrent streams, the frame size, and so on and so forth. Any other questions? Yes, that is exactly the reason: the application servers need to be aware of how stream prioritizations are to be handled for the client, and this is one of the biggest reasons web servers like nginx are not jumping onto the bandwagon of adopting this yet. That's also why Apache Traffic Server works as a simple enough CDN: it's not going to give you push promises, but it allows you to utilize a single TCP connection and just use that. Any other questions? All right, cool.