Hello, everyone. So you want to learn about HTTP/2? I know you do. OK, I'm going to get started. I have a lot to tell you today, so I need all the time I can get. I've already lost 20 seconds. My name is Hooman. I'm the VP of Technology at Fastly. We're a CDN. We're here downstairs, come visit us. All of us, except one person, are wearing really, really bright red outfits. You'll see them right here. OK, I'm going to get going. The title of this presentation was really, really valid about six months ago when we came up with it. Some people started talking about some of this stuff, so I had to amend it. Despite that, if you're like me, a lot of what you heard about HTTP/2 sounded like this. It was a panacea. It was a promise of a great internet in the future. There was so much of this, actually, that after a while I felt like it was clickbait. Despite all of this, what I wanted to do is take a step back, look at the protocol somewhat objectively and practically, see how it acts in the real world, and see what the practical considerations for the real world are. I have a bunch of data I want to share with you. And at the end of the day, I want us all to have a better understanding of the good and the bad, and to talk about what happens next. My assumption is that you're familiar with general web performance topics. I'm hoping you've all seen a waterfall from a web page, know general things about web performance, and are somewhat familiar with HTTP as a protocol. What I'm going to do first, though, is go through the basics of H2 and lay the groundwork before I get into the meat of the presentation. So in the beginning of time, 1999 or so, sorry, 1991 or so, there was HTTP/1. HTTP/1 ran over the TCP protocol. And you probably know that HTTP and TCP have traditionally not worked well together. What happens is a client opens a TCP connection to a server, sends a request over that connection, and the server responds. And while that exchange is happening, nothing else can happen over that connection. This is something we call head-of-line blocking: essentially, an outstanding request doesn't allow that connection to be used for anything else. The protocol had something called pipelining provisioned in, which was supposed to address at least half of this problem. The idea was to send a lot of requests one after another over a connection and get the responses back one after another. It never worked; we never actually used it. And one of the problems with it was that when you send a bunch of requests over a connection, the responses all have to come back in the same order, and you can never interleave them. So it was still head-of-line blocking of a sort: if an outstanding response took a long time to come back, it blocked everything behind it. So what we did is we tried to address it in the browser by opening multiple connections. You probably know this. Browsers open six connections per host, and this is their attempt to do some concurrency and alleviate this head-of-line blocking. That still wasn't good enough, so we came up with clever techniques like concatenation, consolidation, sprites, a bunch of inlining. An entire industry called FEO, or Front-End Optimization, was born out of addressing these inefficiencies. But still, something needed to be addressed at the protocol level. So the primary problems identified with H1, HTTP/1, were a bad concurrency or multiplexing model.
You couldn't interleave things, so everything had to come back in order, one after another; it didn't interact with TCP very well; and there seemed to be a general header bloat problem, so there was a companion effort to address that. Enter into the picture HTTP/2, ratified as RFC 7540 just over a year ago. I'm gonna give you basically a five minute crash course on it. At the end, you're all gonna be experts. It's gonna be amazing. It's a binary protocol. That means it's all ones and zeros. What that means on the wire is that it's actually easier to develop for, because everything is in a predetermined, well-defined place. If you've ever developed anything for HTTP, you've had a very hard time with text parsing, or with the \r\n's at the end of the lines, or anything of that nature. All that stuff is gone. Binary protocols are much easier to develop against. But also gone is telnetting to port 80 and typing GET /. That's gone as well. We can't do that anymore. We need tools and mechanisms outside the protocol to help us do that stuff. The communication itself starts with a connection. The client opens a connection with a server. And in H2, everything happens over a single long-lasting TCP connection. So it's a single connection model. And the theory is that by doing everything over a single connection, better congestion management can happen by the server and the client, because multiple connections aren't vying for the same physical resources on the network. Everything happens over TLS. Now, this is not mandated by the protocol; the protocol actually has a clear text version of itself. But all the browsers have said they're not supporting anything other than the TLS version of the protocol. So everything we're gonna see with H2 is basically going to be over TLS. This is done through a TLS extension called ALPN, Application-Layer Protocol Negotiation. If you've never seen it, basically what happens is during the TLS handshake, the client says, I wanna speak H2, and the server says, you can speak H2. And after that, over that connection, they start communicating over H2. There's also a provision in the protocol to coalesce connections, to do connection reuse. So if you have requests going to two different host names that map to the same IP address and have the same certificate, then you can send all those things over a single connection. Again, with the goal of pushing everything over a single connection for better congestion management over the long term. Over that connection are these things called streams. Streams are essentially virtual communication channels between the client and the server. They can concurrently exist over the connection, and either side can initiate them. Each stream has a stream ID. Stream IDs that are odd are the ones started by the client. Stream IDs that are even are the ones started by the server. And stream ID zero is reserved for management purposes. The only basic rules are that each ID generated by each side has to be bigger than the ones that came before it, and you can't reuse stream IDs over a connection. With a 31-bit space, that's roughly one billion streams per endpoint over the lifetime of the connection. Streams are stateful. I'm not going to go over this at all; I was just going to show it to you in case you didn't believe me. It's totally stateful. As a matter of fact, H2 is a very stateful protocol. Gone is the statelessness of H1.
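Since ALPN is easy to see for yourself, here's a minimal sketch using Node's built-in tls module; the host name is just a placeholder, and this assumes any server that speaks H2 over TLS. If the negotiated protocol comes back as h2, everything after the handshake on this connection is HTTP/2.

```ts
// Minimal ALPN sketch using Node's built-in tls module.
// "www.example.com" is a placeholder host.
import * as tls from "tls";

const socket = tls.connect(
  {
    host: "www.example.com",
    port: 443,
    servername: "www.example.com",
    // The client offers its protocols inside the TLS handshake...
    ALPNProtocols: ["h2", "http/1.1"],
  },
  () => {
    // ...and the server picks one. "h2" means both sides now speak HTTP/2.
    console.log("negotiated:", socket.alpnProtocol);
    socket.end();
  }
);
```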
The smallest unit of communication in H2 is this thing called a frame. Everything that happens between a client and a server essentially happens with frames. Everything is framed up. Each frame carries a stream ID, and frames can flow in either direction, concurrently, throughout the lifetime of the connection. The best way to show you this is to show you what it looked like before and after. So this is an HTTP/1 request and response exchange: the client sends request headers, the server sends response headers with the body. In H2, the client doesn't send request headers as text; it sends a headers frame that carries the request headers. And the server sends a headers frame that carries the response headers. And the body is packaged up and sent in data frames. These are two of the many frames that the protocol defines. The protocol defines all of these frames, none of which I'm gonna go through, except a couple later on in the presentation. I am gonna talk about the settings frame really quick. The settings frame is essentially a management frame, and the two sides use it to define the parameters for the connection: things like how many concurrent streams I can have over the connection, how big the frame sizes are, server push (which is a feature), and any other features that need to be enabled or disabled are done through settings frames. There's an acknowledgement mechanism. And actually, the connection starts with both sides exchanging settings frames to set the framework for the connection. And if you send a new settings frame with new settings, they just override the previous settings. So what does the protocol look like? Well, if this was HTTP/1, where you had a connection and you sent a request and you got a response, this is HTTP/2: the client sends a headers frame with the request headers, and the server sends a headers frame with the response headers and a bunch of data frames that carry the body. And where we managed concurrency by opening multiple connections in H1, we manage concurrency in H2 by just talking H2. We can send frames, data and headers frames, in either direction. Headers frames can come one after another without data frames, in both the client-to-server direction and the server-to-client direction, and data frames can be interleaved. So you can send a data frame from stream one, and then a data frame from stream three, and then another data frame from stream one. This is how we get concurrency and interleaving. Somebody thought this was best shown as candy, so this is a delicious analogy of what H2 does. If you've ever looked at waterfalls, and I really hope you have, this is an H1 waterfall. You can see it has the familiar waterfall shape; that's the reason it's called a waterfall. This is an H2 waterfall. Lots of things happening concurrently. You can see that when we talk about concurrency and multiplexing, this is what it looks like. And if you're like me and like looking at packet captures, I'm sorry, that's your disease, like mine, but this is what it looks like over the wire. You can see a whole bunch of things happening back and forth. See that one packet down there that's got a whole bunch of headers frames together? That's the server responding with a bunch of response headers without sending any body. That's not a thing we did in H1. So that's kind of cool. Every browser pretty much supports this now, and all the servers are starting to support it as well. There are modules for Apache and Nginx.
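To make the framing concrete, here's a small sketch that decodes the nine-byte header every H2 frame starts with, per RFC 7540: a 24-bit payload length, an 8-bit type, 8 bits of flags, and a 31-bit stream ID with the top bit reserved.

```ts
// Every HTTP/2 frame starts with the same 9-byte header (RFC 7540, section 4.1).
const FRAME_TYPES: Record<number, string> = {
  0x0: "DATA", 0x1: "HEADERS", 0x2: "PRIORITY", 0x3: "RST_STREAM",
  0x4: "SETTINGS", 0x5: "PUSH_PROMISE", 0x6: "PING", 0x7: "GOAWAY",
  0x8: "WINDOW_UPDATE", 0x9: "CONTINUATION",
};

function parseFrameHeader(buf: Buffer) {
  return {
    length: buf.readUIntBE(0, 3),               // 24-bit payload length
    type: FRAME_TYPES[buf.readUInt8(3)] ?? "UNKNOWN",
    flags: buf.readUInt8(4),                    // e.g. END_STREAM, END_HEADERS bits
    streamId: buf.readUInt32BE(5) & 0x7fffffff, // top bit is reserved
  };
}

// A settings frame on stream 0 carrying one 6-byte setting:
const bytes = Buffer.from([0x00, 0x00, 0x06, 0x04, 0x00, 0x00, 0x00, 0x00, 0x00]);
console.log(parseFrameHeader(bytes));
// -> { length: 6, type: 'SETTINGS', flags: 0, streamId: 0 }
```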
If you've never heard of a server called H2O, I highly recommend you check it out. It's an amazing, amazing server. It supports both H1 and H2. We use it at Fastly. There's a list of these servers on Wikipedia as well. And I hope that all of you are using CDNs, because they're great and you should be using them. And if you're using them, it's your CDN that's going to be terminating those connections, so talk to them if you want to partake in the H2 magic. Okay, you are now all H2 experts. Congratulations, I will have certificates for you after the event. We're gonna talk about performance, because arguably the most talked about feature of the protocol is performance. I am a cynic by nature and I have to see this stuff for myself. So to test it, I constructed basically the perfect page for H2, which is a page with no browser rendering work, no scripts or CSS whatsoever. It's just a page with a hundred images on it. They're the same image, just named differently. It's a very simple page, 100 10K images, to highlight the performance features of H1, sorry, H2. This is what the waterfall looks like before and after. You can see the familiar waterfall form before, and obviously all the concurrency after. If you decide to do this yourself, I highly recommend that you use a different image than the one I chose, because this one is great the first couple of times you look at it, but I can't tell you the nightmares it induces after you've looked at it a hundred times. It is horrible. Please choose a different image. And then what I used is a tool called WebPageTest. Raise your hands if you've heard of this tool. Yes, if you haven't, familiarize yourself with it. It's an amazing tool. It's basically a synthetic web testing tool that lets you put in a URL, and it generates a waterfall for you. The best thing about it is that it lets you emulate real world conditions. So you have bandwidth profiles and latency profiles; you can basically simulate a real user, using real browsers. It's a magnificent tool. There's a public version of it for you to use, or you can install a private version, deploy your own clients, and use it in-house. It's a fantastic tool. I highly recommend you use it. So I used it. I deployed a private version of it. I used a profile of five megs down, one meg up, with 40 milliseconds of latency. I picked 40 milliseconds because that was roughly the median across all CDNs in the U.S., and five and one was the default profile used by WebPageTest's public instance, so I just sort of took their word for it. I used Chrome and compared H1 to H2. And instead of testing three, five, or ten times, I tested 270 times, because I really wanted to see patterns. This is the scatter plot of the results. Orange is H1, blue is H2. Okay, H2 seems to be slightly faster. I was maybe hoping for a little bit more separation; that would have been the expectation, considering this is the perfect page. But we can probably argue from the scatter plot that H2 is faster than H1, certainly over a large dataset. You've probably seen some version of this from other people, where they say H2 is more performant. What I had never seen, though, is anything that includes packet loss. Packet loss is something that happens naturally in the wild, and none of the simulations or data I'd seen for H2 had included any packet loss. Fortunately, WebPageTest lets me simulate packet loss; I can just put it right in. So I did that.
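If you want to run this kind of sweep yourself, WebPageTest exposes an HTTP API for kicking off runs. Here's a rough sketch against a hypothetical private instance; the custom-connectivity parameter names (bwDown, bwUp, latency, plr) are my best understanding of the API, so treat them as assumptions and check your instance's documentation.

```ts
// Hypothetical sweep driver for a private WebPageTest instance.
// WPT_HOST, API_KEY, and the location label are placeholders.
const WPT_HOST = "https://wpt.example.com";
const API_KEY = "YOUR_KEY";

async function runTest(url: string, plr: number, runs = 270): Promise<string> {
  const params = new URLSearchParams({
    url,
    k: API_KEY,
    f: "json",          // JSON response instead of the HTML page
    runs: String(runs),
    location: "MyLab:Chrome.custom", // "location:browser.connectivity"
    bwDown: "5000",     // 5 Mbps down
    bwUp: "1000",       // 1 Mbps up
    latency: "40",      // 40 ms round trip
    plr: String(plr),   // packet loss ratio, in percent
  });
  const res = await fetch(`${WPT_HOST}/runtest.php?${params}`);
  const body = await res.json();
  return body.data.testId; // poll the results endpoint with this ID later
}

// Sweep the test page across the four packet-loss profiles.
for (const plr of [0, 0.5, 1, 2]) {
  runTest("https://test.example.com/100-images.html", plr)
    .then((id) => console.log(`plr=${plr}% -> test ${id}`));
}
```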
Let's take our scatter plot and put it in the corner. Let's introduce packet loss into the picture. Here's half a percent packet loss, one percent packet loss, and two percent packet loss. Certainly a different picture. So you can see that packet loss has an adverse effect on the performance of H2 in this case, and H1 actually performs better, generally, the higher the packet loss. Okay, maybe this is a Chrome thing. Let's be thorough, let's test Firefox. Pretty much the same thing. Actually, it stands out a little better. Okay, I've heard a lot about how H2 is supposed to help mobile connections and slow connections. Let's do that. WebPageTest allows me to do it. Here's a terribly slow connection. I've seen worse, but this is one: a slow 3G, 780 Kbps down, 330 Kbps up, 200 milliseconds of latency, which is a lot of latency. Let's see if things change. Nope, they pretty much stay the same. Maybe there's a bit more separation with zero percent packet loss, but certainly we see H1 stand out with two percent packet loss. You see that little bump in the upper left, at the beginning of all these scatter plots? I thought about taking it out, but what happened there was that my server stopped talking H2. So everything that looked like H2 was actually H1, and you can see that all the blue dots commingle with the orange dots. I thought that was an interesting thing to show: you can see what that gap in H2 is, because the server stopped talking H2. I'm gonna switch gears. Instead of showing scatter plots, I'm gonna show you CDF charts. If you don't know what a CDF chart is, it basically graphs percentile versus a time metric. So in this case, the upper left-hand graph says that at the 50th percentile, H2 performed at roughly 2,000 milliseconds and H1 performed at roughly 2,200 milliseconds. And in the bottom right graph, you can see that there's separation between H1 and H2 in the other direction. Basically, as you look at CDF charts, if you're to the left, you're performing better. The perfect CDF chart would be a vertical line, but there's always variation, and the higher percentiles are always slower. That's why you'll see curves. Let's put both browsers on the same graph. This is what they look like. Blue and orange are Firefox, H2 and H1. Red and green are Chrome, H2 and H1. Again, you see the effect of packet loss on the performance. Now, what I'm looking at is a metric known as document complete time, which is a common metric. We generally take it as the metric that signifies how fast the page is. That's not an entirely accurate thing; I'll get to that in a second. But to show this to you, I wanted to show you everything in a scorecard. So here's kind of what it looked like so far. I started looking at other metrics. I saw a presentation that said DOM content loaded is an important metric to e-commerce sites. Great presentation, by the way, at Velocity Conference by Tammy Everts and Pat Meenan. Check it out if you haven't already. This is what DOM content loaded looked like. A little bummed about H1 doing better with 0% packet loss, but okay. And I wanted to look at a metric called Speed Index, which is a WebPageTest metric. It's an aggregate, calculated metric that takes into account the visual completion of the page as it builds. And we see that H2 does better with 0% packet loss, and H1 does a little bit better with 2% packet loss. Why is this happening? Well, it's happening because this analogy is actually inaccurate and misleading.
This isn't the way to compare H2 to H1. This is actually a more accurate analogy, because remember, in H1 we communicate with a server over six concurrent connections. Look at what happens to a TCP connection when it's subjected to packet loss. Here's a TCP connection with 0% packet loss. The way TCP works, and what I'm gonna tell you is the subject of entire PhD dissertations, I'm gonna give you TCP in 30 seconds, so this isn't the whole story, but these are the basics: this is what TCP does when it's ramping up a connection. The server sends more and more data over the wire until natural loss happens. And then it slows down and increases gradually until it gets to a point where there's less or no loss, with the goal, in the long term, of committing packets into the network at a rate that causes 0% packet loss. So it converges over time. Here's what you see here: a fast ramp up, probably a loss event that slowed things down, and some sort of normalization. This is what happens. The Y-axis is something called cwnd, the congestion window, which is the number of bytes the server thinks it can put on the network without packet loss. On the X-axis you see time. Look at what happens to this connection when there's packet loss. That's a severe degradation in performance. And imagine that what's happened now is you have every communication that happens in H2 subjected to this one thing, because everything happens over a single TCP connection. So what we've done is essentially take head-of-line blocking away from HTTP and put it in TCP. We've exposed TCP's head-of-line blocking. This is a known issue. I'm not the first one to point it out; it's actually mentioned in the RFC. But it was still a risk-reward tradeoff that was worth taking to the authors. If you look at that same 1% packet loss ratio inflicted on six connections, at any one time, if I add up the number of bytes that the server can send across all six connections, you see that it's more than what a single connection can do. And this is the primary reason H1 does better. Okay, up until now, everything I've done is with the fake page and simulated tests. Why don't we do this with real pages? I wanted to see what this looked like with some real pages. So I took eight real pages from eight real sites, most of which I can't tell you about, unfortunately, but I'll give you some hints. I tested 16 different bandwidth profiles, different download and upload parameters, a whole bunch of different latency parameters, and subjected all of them to four different PLR (packet loss ratio) profiles. I tested Firefox and Chrome, TLS only, and collected all the metrics. Now, one of the things to remember as I go through this is that HTTP/1 was clear text, or could be clear text, and HTTP/2 is TLS. All I tested here was the TLS version of the H1 pages, which is inherently slightly slower than the clear text version. So take that into account: the clear text versions of these pages are probably slightly faster than what I'm about to show you. But I didn't want to get into the should-we-go-secure-or-not argument. Let's make everything secure and go from there. And I ran each of these 300 to 400 times. This adds up to about 1.2 million test runs on WebPageTest, which I kept in one place to analyze. This is a fork. It looks a lot like the one I wanted to stab myself with repeatedly as I was going through 1.2 million data rows. It was very difficult to see patterns develop.
I really wanted to come here, not fork myself, and give you lessons about what those patterns are. But I can't. That's a bit of a spoiler. So what I did is divide all the pages into three different buckets: one where most of the requests, 75% or more, moved from H1 to H2; one where about half of the requests moved from H1 to H2; and one where about 25% or fewer, so fewer assets, moved from H1 to H2. I concentrated on two profiles: a broadband one, which was five megs down, one meg up, with the four packet loss profiles, and a slow 3G one like the one we already talked about. And I focused on the same three metrics that we talked about. So site one is a page on the Fastly website. It's our customers page, where we show off our customers. It's got about 130 or so requests, and about 100 of them moved from H1 to H2. This is what the waterfall looks like before and after. I'm gonna go through the CDF charts really, really quick, so brace yourselves. Look for blue as H2 for Firefox and red as H2 for Chrome; orange and green are H1 for Firefox and Chrome. This is doc complete, broadband. I'm just gonna go through them real quick; I have a scorecard at the end. DOM content loaded. Speed index, broadband. Doc complete, slow 3G. DOM content loaded; in this case, you see H2 actually did okay with 2% packet loss. And speed index. Here's the scorecard. Okay, again, not really stoked on H1 coming out ahead with 0% packet loss, but certainly we see packet loss have an effect on performance with 2% packet loss. Here's site two. It's a travel site. It's about 100 requests. Half of them moved from H1 to H2, but notice that that's about 75% of the payload. This is what the waterfall looks like before and after. Because I'm not cruel, I'm not gonna show you all the CDF charts anymore; I'm just gonna show you the scorecards. Here's what doc complete looks like. H2 fares pretty well, but H1 does better with packet loss. Same thing with DOM content loaded. Slightly different story with speed index. Site three had many more requests. It's a media site, so lots of third party content. Only about 25% of the assets moved from H1 to H2, but notice that that's more than half of the payload. This is what the waterfalls look like. You can barely tell which requests went from H1 to H2 because there's so much shit on the page. Here's the scorecard. There's doc complete. Here's DOM content loaded. Hey, that's all green, that's nice. And speed index, not so good. Let's put all of them on one thing. Here's a big grade sheet: broadband only, slow 3G only. Let's just put that up there and take a moment. Certainly we see that packet loss has an effect on performance, but again, I'm not really stoked on those red squares with 0% packet loss. Certainly this is raising more questions than it's answering. There's still more green on this than red, so I'm not saying H2 isn't performant, but it certainly isn't a magic bullet, if you will. What does this all mean? I have no idea. Basically, it means that we shouldn't be sure about this. Certainly packet loss is having an effect, and it looks like metrics later in the page are affected more adversely. Although I'd really wanna be careful with that, because relative to the lifetime of a TCP connection, a single page is so short. What I would have loved to see, and I can't gather it without more work, which I will do, is what happens two, three, four, five pages deep into the navigation.
That would be interesting, because TCP has had a chance to adjust itself. We saw lots of exceptions. We saw places where H2 holds up with packet loss. We certainly saw places where H1 was better without packet loss. And we definitely saw that Firefox and Chrome were not behaving the same. That's not a thing I wanted to dig into; that's probably another two, three hour talk. And like I said, we probably have more questions than answers, but this is data that we probably should take into account. So the natural next question is, what does PLR look like in the real world? Okay, packet loss seems to affect performance; what does it look like in the real world? It's actually very difficult to measure, but we try at Fastly. At Fastly, we have about two and a half to three and a half million requests per second going through our network at any given time. We sample those to figure out what network profiles look like. Here's what it looks like in the U.S. Let me narrate this for you. First of all, the X-axis is round trip time, and it's cut off at the 70th percentile in the U.S. So 70% of the people that went through our caches in the U.S. had 60 milliseconds of round trip time or lower. What we're measuring here is the number of requests that noticed packet loss. The orange band is 0% packet loss. The blue band is requests that saw zero to 1.5% packet loss. And the green band is requests that saw greater than 1.5% packet loss. What I've done here is limit it to requests of 100 packets or more, so there are enough packets on the network to actually measure packet loss. And this is mostly H1; this is just what packet loss profiles look like. So that's what this is. This is what it looks like in Germany. Kind of the same. Basically, what we're seeing here is that roughly 20% of the requests are experiencing some level of packet loss. It could be worse. That orange band is very broad and big, and that's great. But there is definitely packet loss in the world. I am not the first person to talk about H2 not being a performance panacea, or about the effect of packet loss. Here's a bunch of reading. I'll make sure these slides are available to you afterwards so you can follow up. Now what do we do? Well, first of all, some words of caution. I'm not going to stand up here, and I've done this talk now three times in the last five days, at none of those places did I stand up and make grand conclusions. I'm not gonna do any of that, because I don't think this data gives us any huge conclusions. The only things we can conclude are that packet loss seems to matter and that H2 isn't always faster. H2 isn't this magic bullet that many of us maybe thought it was going to be. Now, what I did is all simulated. It was simulated packet loss. Packet loss in the real world happens very differently than injected, simulated packet loss. Packet loss happens because router buffers bloat. Packet loss happens because you turn on your microwave at home and your wifi router decides to drop a couple packets. That's very different than the sort of thing I injected. This was a first step. We need real world data, and your mileage will vary when you do this, if you decide to do this on your own. So the biggest lesson from all of this is don't listen to anyone. And I mean anyone. Don't listen to anyone. Do this for yourselves. If you're interested in this, this is something that you wanna do for yourself.
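If you already collect RUM data, tagging each sample with the protocol it came in over is the easy first step, and that's what makes an H1-versus-H2 comparison possible on your own traffic. Here's a small browser-side sketch using the Navigation Timing API's nextHopProtocol; the /rum endpoint is a placeholder for wherever your beacons go.

```ts
// Tag a RUM beacon with the negotiated protocol so the H1 and H2
// populations can be compared later. Runs in the browser.
const nav = performance.getEntriesByType("navigation")[0] as PerformanceNavigationTiming;

const beacon = {
  protocol: nav.nextHopProtocol, // "http/1.1", "h2", ...
  domContentLoaded: nav.domContentLoadedEventEnd,
  load: nav.loadEventEnd,
};

// "/rum" is a placeholder endpoint for your own collector.
navigator.sendBeacon("/rum", JSON.stringify(beacon));
```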
Patrick from the Financial Times did it. Patrick Hamann gave a presentation a little while ago where they tested H2 versus H1, and you'll see that in his case, the doc complete time for his users showed benefit with H2 at higher latencies and did not show that much benefit at lower latencies. Now, there are two things that are interesting about this. One, we have no idea what percent of the users fall into each of the bands; that's not clear from this. And the other thing that's interesting is that everybody wants CDNs to do H2, and CDNs basically live in the 100-millisecond-or-less band, and it's interesting to me that there is no benefit there. So this is just an interesting data point. He goes on to say that it seemed to help his 95th percentiles, which is in line with what a lot of the pundits are saying: if we're gonna see benefits, we're gonna generally see them in the 95th percentile. But the 95th percentile for Patrick's site is gonna be different than the 95th percentile for your sites, and this is why it's probably a good idea for you to do this yourself. So instead of digging through 1.2 million data points, standing up here, going through them, and killing you with it, I decided a gift was better. So I wrote a script. Now, I can't write code. So when I say I wrote the script, it means I took a crayon, basically wrote on some napkins, and somebody named Mark Tudoro, who's great and who I have the fortune of working with, took it and cleaned it up. Here's the script. It uses WebPageTest, either your private instance or the public one. You put in your bandwidth profile and packet loss, put in your site, and it tests H1 versus H2 and spits out a PDF that gives you a scatter plot of what's going on, sort of like what I have. But this should be just the beginning. Real world data is much, much more valuable. Did you know about this? This is chrome.loadTimes() in Chrome. It tells you what the connection was, whether you were speaking H2 or not. If you use RUM, and I really hope you are, you can take your measurements, mark them as H1 or H2, and come up with some of these measurements for yourself. This is probably a good time to talk about QUIC. QUIC is a new protocol. You've probably heard of it in the context of Google, because they're the only ones that are using it; they came up with it. It's a UDP based protocol, and it's actually on the standards track now. There's a proposed spec for it. What it does is take the transport, which we've always had over TCP for HTTP, and make it UDP. And one of the biggest motivations for it was to take congestion avoidance, which is the mechanism that controls how TCP reacts to packet loss, and move it to user space, so they can code it and iterate on it quickly. That was one of the biggest motivations for it. QUIC is a protocol that we're all super excited about, and I think it's gonna address some of the issues that we've seen. Okay, how are we doing so far? Are we good? Am I bumming you out? Are you okay? Jason, is this okay? All right, it's all right. All right, let's keep going. We have half an hour, that's great. Let's talk about server push. Probably the most talked about feature. Did I say that about performance? Probably the other most talked about feature: server push. The idea of server push is that we're going to push things from a server to a browser before the browser requests them. Actually, before the browser even knows it wants them.
This is what server push is. It's a hop-by-hop mechanism, which means middleboxes like CDNs or caches or anything in the middle can push independently, from themselves to a client, rather than on behalf of the server behind them. Whether you support push or not is negotiated in a settings frame. Remember we talked about the settings frame? Actually, push is on by default, so the only time you're going to send a parameter in the settings frame that says anything about push is when you don't support it. So here's a settings frame saying the client doesn't support push. Everything in push happens with a special frame called a push promise. Here's how it works. The client sends a request for, let's say, index.html. That index.html is going to have a CSS file on it. So what the server does, before sending the HTML, is send a push promise frame. This push promise frame says, I'm about to push you the CSS file. There are two things that are important in this push promise frame. One is a promised stream ID: the stream ID that the pushing is going to happen over. That's going to be an even number, because it's going to be a server-initiated stream. The second is the would-be request headers that the client would send if it were to naturally request this asset itself. So the server is actually sending request headers, in a frame, to the client. The only rule is that the push promise has to come before the thing that references the asset we're pushing. So if this is CSS, the push promise frame needs to come before the HTML headers or data frames that reference that thing being pushed. So in this case, you see the push promise frame go, then I start sending the client some of that HTML with headers and data, and then I start pushing the CSS file, again with headers frames and data frames. And I can interleave the CSS with the HTML, because that's a mechanism HTTP/2 gives me. You can see it in DevTools. Here's what it looks like. And actually, if you hover over those assets, you can see that those first four assets, check it out, are pushed. DevTools tells you this. There are two big questions with push. One, we have no idea what to push. This is outside the scope of the protocol; the protocol just created a mechanism for it. So it's up to you, and up to us, in our applications, to figure out what the right thing to push is. And second, and bigger: it turns out that browsers don't really work well with push. Their browser cache, that is. In fact, the push cache and the object cache are two separate things, and they don't know about each other. Now, the protocol has a mechanism, with a frame called reset stream, that lets you reject a push. So if I'm a client and you're the server, and you're trying to push me something I don't want because I have it in my cache, I can send you a reset stream. Two things. One, I've never seen a reset stream sent, because browsers kind of don't know what's in their cache, or that piece hasn't been wired in yet. And two, even if I did send you a reset stream, it's already too late. You've sent me a push promise, and you've probably started pushing me the content. By the time I reject it, an entire round trip has happened, so a whole bunch of stuff has shown up on the wire and is being pushed to me.
So even if the browser were rejecting pushes, the mechanism is sort of flawed. Look what happens. Here's a waterfall without push. This is my cold cache request. I grab a page; the first four requests are CSS files, everything else is an image. I am a good performance citizen, so I put good cache control headers on all of it, and then I go back to the page with a repeat view, and nothing happens except the HTML, which I chose not to cache in this case. That's great. This is the way we expect our waterfalls to look. Look what happens with push. Here's a waterfall with push. I took those four CSS files and pushed them. That's the first four requests you see there. And then I do a repeat view, and guess what? They get pushed again. This is bad. What's happened is that even if the browser ends up using the CSS files that were in its cache, I've now put more content on the wire, and that's not good for bandwidth usage. So let's put that aside for a second. Let's consider a situation where you know what to push. There are a few use cases that we, the world, have come up with for server push. First is a replacement for inlining. If you've ever inlined before, you take CSS and JavaScript and put it in your HTML. Inlining broke caching, because you couldn't cache those CSS and JavaScript files independently. Now you can push them, and because they're independent objects, you can cache them: the pushed content can have cache control headers and go into the cache. That's one. The second thing is pushing content relevant and essential to this navigation, the page I'm on. This is very similar to link rel preload, if you're familiar with that mechanism, which basically invokes the browser's preloader, so the browser starts fetching those things before it parses the HTML and gets to them. This saves you one round trip time. Look at what happens: here are those CSS files pushed versus not pushed. If I push them, I basically save myself one round trip. Why is it only one round trip? Because H2 lets me do a bunch of things at once. If the browser parsed the page and saw that it needed that CSS, it could send all those requests at once. That's a single round trip. That's what H2 gives us: concurrency. So if you were to push things that are essential for this navigation, you're essentially saving one round trip time. That's okay, that's great. We like to save as many RTTs as possible, but it isn't necessarily a huge win. You could have a big win if you did things during what we call server think time. This is the amount of time it takes for a browser to get the HTML from the server. That server think time could be anything from long network distances, if you're fetching something far away, to, more likely, the server taking time to generate the page. I have an exaggerated waterfall that shows you this. Here's an HTML that took three seconds to come back to the client. During that time, nothing would normally happen. So what I can do is use that time to shove those things down the pipe to the browser and let the browser cache them. Once the HTML arrives, the rendering will be much, much quicker, because all those things are in the browser cache. Now, this is exaggerated. And if you ever talk to me about HTML, I'm gonna do everything in my power to get you to cache that HTML in your CDN and never have to incur this sort of big greenness in waterfalls. But there are times when HTML content is absolutely uncacheable, and in those times, push can give us a good mechanism.
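Here's roughly what push looks like on the server side, sketched with Node's built-in http2 module; the key and cert paths are placeholders, and the CSS is obviously a stand-in. The pushStream call is what emits the push promise frame, and it goes out before the HTML that references the stylesheet, per the ordering rule we just talked about.

```ts
// Minimal server push sketch using Node's built-in http2 module.
// key.pem / cert.pem are placeholder paths to your TLS materials.
import * as http2 from "http2";
import * as fs from "fs";

const server = http2.createSecureServer({
  key: fs.readFileSync("key.pem"),
  cert: fs.readFileSync("cert.pem"),
});

server.on("stream", (stream, headers) => {
  if (headers[":path"] !== "/") {
    stream.respond({ ":status": 404 });
    return stream.end();
  }

  // The push promise for /style.css goes out first, on a server-initiated
  // (even-numbered) stream, before the HTML that references it.
  stream.pushStream({ ":path": "/style.css" }, (err, pushStream) => {
    if (err) return; // e.g. the client disabled push in its settings frame
    pushStream.respond({ ":status": 200, "content-type": "text/css" });
    pushStream.end("body { color: rebeccapurple; }");
  });

  stream.respond({ ":status": 200, "content-type": "text/html" });
  stream.end('<link rel="stylesheet" href="/style.css"><p>hello, H2</p>');
});

server.listen(8443);
```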
Yoav Weiss has a good blog post about this; he goes into it in much more detail. This isn't a trivial thing to do, because it's an asynchronous operation: if you're doing this in a CDN, something has to go to origin while something is simultaneously going to the client. So talk to your CDN to see whether this is doable or not. A lot of people thought push would be great for the next navigation, the next page. It can be used for the next page, but we already have a bunch of mechanisms in place, in the form of resource hints, that allow us to do this. If you've ever heard of link rel prefetch and link rel prerender, those are ways where, as you're reading a page, for example, the browser can be invoked to go fetch things that you're gonna need in a future navigation. Those mechanisms are already in place. You can use push, but we already have mechanisms that are pretty much good enough. We still have two big questions. What do we push? We still don't know. Google published a good paper that did an analysis of this. They have some rules of thumb about what to push and how often to push it. It's a good paper to read if you're interested in server push. And we still don't have the answer to the question of what to do when the thing is in the browser cache already. There are two ways that I know of that this is being addressed, neither of which is fully baked. One: H2O, the server I talked about earlier, has a mechanism called CASPer, where basically a cookie is generated through a bloom filter, a sophisticated mechanism. Once I hear bloom filter, I black out. It's essentially a way for the browser to tell the server what's in its browser cache, a signal so the server won't push things that are already in the cache. It's an experimental feature, and it's being standardized in a mechanism called cache digest, which is in its first couple of iterations. The idea is the same: have a way for the browser to signal to the server what it has in its cache, so the server doesn't push those things. H2O also remembers what it's already pushed on a connection and doesn't push it again, so that's another safety mechanism. But still, that's an unanswered question for us with push. I actually think the most interesting use cases for push are the ones that we haven't thought about yet, because push enables all sorts of shit that we hadn't thought about. So I think what's gonna be interesting is when we get creative and figure out new ways of using server push. Facebook's done a couple of things. They actually just released this video a few days ago. It's a very interesting watch, and they do some creative things with server push for their application, for their use cases. Okay, let's keep going. HPACK. We're still okay? Everybody asleep, or are you with me? Okay, good. I see one smile, that's good, that's all I need. Okay, let's talk about HPACK. HPACK is the companion RFC, 7541. This is header compression. This is supposed to address the header bloat problem. And it's got two primary mechanisms. One is that all headers are Huffman encoded, so each character maps to a shorter sequence of bits on the wire. And the other is that, over time, you're basically gonna build a table where, instead of sending headers, you index them.
The idea is that instead of sending, for example, user-agent, blah, blah, blah, blah, you send index three, and that means the same thing to both sides; you've compressed headers that way. There are two tables. There's a static table, which is defined by the RFC. It's super static; it's never gonna change. The RFC goes out of its way to say this definition is never changing. I don't know what happens when we come up with new headers, but we'll see. And then there's a dynamic table that the two sides build over time, adding entries to it as they communicate. It's a FIFO table, so things will fall out of it. Here's what the static table looks like. This is the one that's defined: 61 indices. And the dynamic table starts at index 62. Now, when I hear the word compression, the first thing I think about is performance. Not just because I live in the performance world, but because that's what compression is, right? Compress things, make them smaller, perform better. So what are the performance benefits? Now I'm gonna show you two things that are kind of obvious after the fact, but they didn't make sense to me until I actually saw them. So when you're compressing headers, you're supposed to send less data, right? Let's look at that. Here is the data sent from a browser to a server with H1 versus H2 using HPACK. You'll see there's a lot less data. That's cool: the browser-to-server direction has a lot less data going across. And that makes sense, because most of the traffic from a browser to a server is request headers. It's headers. So if we've cut down those headers, obviously we're gonna cut down the amount of data. What happens in the other direction? Not much. The reason for that is that most of the content we get from a server is payload; only a small fraction of it is actually headers. So we see a tiny bit of benefit there, but not a huge one. So if you're expecting performance benefits in the server-to-browser direction with HPACK, there's not gonna be much. If you're bandwidth constrained towards the server, you're gonna see some benefits. And in fact, H2 couldn't operate without HPACK, because if you had big H1-style headers, you wouldn't be able to do all those things at the same time. One of the reasons H2 can do all those things at the same time is that it can send very, very small frames. Dropbox had a blog post about this that basically said the same thing. What you see in that blue line is that when they enabled it, their ingress bandwidth, the incoming bandwidth, dropped significantly. And not much happened to their egress, outgoing bandwidth. Some things to know about HPACK. The dynamic table has a default size of 4K, and that's for the whole dynamic table. Browsers can adjust it; there's a settings parameter that says, I'm gonna adjust my table size. But no browser does, so the table size is staying at 4K. This is the Content-Security-Policy header from Twitter. If you go to Twitter right now and look at your DevTools, this is the CSP header. You know how big it is? That big. So that means half the dynamic table gets taken up by one header. And it's a FIFO table, right? So that thing's gonna go into the table and come out of the table, and go into the table and come out of the table. So that table may not be as efficient as you want it to be. That's a thing to take into account.
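To make that FIFO churn concrete, here's a toy sketch of the dynamic table's size accounting. The 32-bytes-per-entry overhead and the 4096-byte default come straight from RFC 7541; everything else is simplified, with no Huffman coding and no static table.

```ts
// Toy model of the HPACK dynamic table (RFC 7541). Each entry costs
// name.length + value.length + 32 bytes, against a default budget of 4096.
class DynamicTable {
  private entries: Array<{ name: string; value: string }> = [];
  private size = 0;

  constructor(private maxSize = 4096) {}

  private cost(name: string, value: string): number {
    return name.length + value.length + 32;
  }

  add(name: string, value: string): void {
    this.entries.unshift({ name, value }); // newest entry gets the lowest index
    this.size += this.cost(name, value);
    // FIFO eviction: the oldest entries fall out until we're under budget.
    while (this.size > this.maxSize) {
      const evicted = this.entries.pop()!;
      this.size -= this.cost(evicted.name, evicted.value);
    }
  }
}

// A ~2KB header value (think Twitter's CSP) eats half the table by itself,
// so it keeps churning in and out as other large headers arrive.
const table = new DynamicTable();
table.add("content-security-policy", "x".repeat(2000));
```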
There is a proposed draft called site-wide headers, which is intended to basically say: headers that you're gonna keep sending over and over again, especially from the server to the client, just send once and be done with. That's a proposed draft right now. The compression context is per connection, so once that connection drops, the dynamic table gets wiped. That means your keep-alive timers are now a bigger deal, because you wanna maintain that context for longer. It is definitely an attack vector, and there's a paper from Imperva that basically attacks it over and over again and exploits all sorts of things. It's a fun paper to read, a little scary. But we're early in the lifetime of this protocol, so we're gonna learn these lessons. You can't turn HPACK off, so you can't experiment. And again, like I said, you couldn't be doing all this multiplexing without it anyway. That's HPACK. The last thing I wanna talk to you about is prioritization. Prioritization is probably the least understood mechanism in H2, and also probably the most powerful. It's a very interesting mechanism. The idea is that with all this concurrency and all these things happening at the same time, we have to give the server a way to resolve contention issues. We're giving a lot to the server at once; priorities are a way for the server to pick what to work on before what else. It comes with two mechanisms. There's a stream weight, which is basically the importance of the stream, and then there's a dependency mechanism, so you build this tree. You dictate priorities through headers frames: when you send a request, a headers frame, you put a priority in it. Or you can adjust priorities with a dedicated frame called a priority frame. And the spec goes out of its way to say that priorities are suggestions. That means it's a suggestion from the browser to the server; the server doesn't need to do anything with it, and the protocol doesn't say, this is the way you're supposed to prioritize. It's just a suggestion. Let's look at it with some examples. Here's a simple one. We're gonna build trees, right? This is a priority and dependency tree. Here's the root node; everything comes off the root node. And you have two streams, A and B, and these are their weights, 12 and 4. Basically, this is a suggestion from the browser to the server to give three quarters of its resources to stream A and one quarter of its resources to stream B. It's the ratio that matters: it could be 12 and 4, or 3 and 1, and it would mean the exact same thing. These examples aren't mine, by the way; they're from a book called High Performance Browser Networking by Ilya Grigorik. It's a fantastic book, and I don't mean to claim I came up with these examples. If you haven't read that book, read it. I'm gonna talk about it again later. Here's a dependency example. You have C here dependent on D. The weights actually don't matter in this case. What this means, essentially, is: get done with D, and after you're done giving all your resources to D, give all your resources to C. That's what this dependency means. If we combine them, it looks like this, and it means: work on D alone; after you're done with D, work on C; and after you're done with C, put three quarters of your resources on A and one quarter on B. And now, in the advanced class, we can look at this. Basically, what this means is you start with everything on D. After you're done with D, you give half your resources to E and half to C; the absolute numbers on those two don't matter, just the ratio between them. And then, after you're done with C, take three quarters of the resources you gave to C, so three quarters of the 50% C had, and put them on A, and give one quarter to B. That's the idea.
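Since the ratio math is easy to get wrong, here's a tiny sketch that computes effective bandwidth shares for that last tree once D has finished and dropped out. It's a deliberate simplification: it assumes every stream has data ready, and it ignores the parent-before-child ordering and exclusive flags, so it only illustrates how weights are relative to siblings.

```ts
// Toy model of H2 priority weights: children split their parent's share
// in proportion to their weights. Dependency ordering is ignored here.
interface PriorityNode {
  name: string;
  weight: number; // 1..256 in the real protocol
  children?: PriorityNode[];
}

function shares(
  node: PriorityNode,
  share = 1.0,
  out: Record<string, number> = {}
): Record<string, number> {
  out[node.name] = share;
  const total = (node.children ?? []).reduce((sum, c) => sum + c.weight, 0);
  for (const child of node.children ?? []) {
    shares(child, share * (child.weight / total), out);
  }
  return out;
}

// The "advanced class" tree after D is done: E and C split the connection
// evenly, and A (12) and B (4) hang off of C.
const tree: PriorityNode = {
  name: "root",
  weight: 0,
  children: [
    { name: "E", weight: 16 },
    {
      name: "C",
      weight: 16,
      children: [
        { name: "A", weight: 12 },
        { name: "B", weight: 4 },
      ],
    },
  ],
};

console.log(shares(tree));
// E: 0.5, C: 0.5, A: 0.375 (three quarters of C's half), B: 0.125
```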
Now, where this gets really cool is if you look at the browser trees, the way the browsers build them. Here's how Firefox does it. Firefox starts with five streams that actually have no requests on them; they just sit there, like the roots of a tree, and all the actual streams that carry data come off of these nodes. This isn't mine; this is work done by a guy named Moto from Yahoo Japan, who's made great contributions to H2. Definitely check out his presentation; I have a couple more references to his work later. As for why Firefox builds the tree this way, I have no idea, but it looks really, really smart, so I'm gonna guess that some serious data scientist sat down and said, this is the best way to build this tree. Each of those branches gets used for different elements: images get one branch, CSS in the head gets a different branch, JavaScript in the head gets a different branch, JavaScript in the body gets a slightly different branch. And that's the way they do it. Chrome used to look like this: weights only, and everything dependent on the root node, so no dependencies. That's changed recently. In Chrome 53, they started doing some dependencies. And here's a dependency and priority tree for Chrome. I know it's really difficult to read, and I'm gonna make it a lot more difficult to read, because this is what it actually looks like. Frederick, one of the guys who works on our H2 team, came up with this tool where you can basically map out what Chrome's priorities look like. That's what this is. Check it out, it's kind of neat. What's interesting about this is that branch that goes straight down, with those exclusive dependencies. What that basically means, if you think about it, is that it breaks interleaving, because it means you can't work on the next response until you're done with this response. The whole idea behind H2 is to be able to do a whole bunch of stuff at the same time, right? That entire branch that goes straight down breaks it. That makes it look a lot more like H1. My guess with Chrome, and I love Chrome, is that they're just iterating right now and trying to figure out what the best thing to do is. Canary actually looks slightly different, so I'm just gonna give them the benefit of the doubt for now. If it keeps going like this for a couple more versions, then I'll pick on them. And Microsoft has no dependencies and no priorities, last we checked. Maybe not surprising. Maybe it's different now, I have no idea. Okay, I've thrown a lot of stuff at you, right? I'm just going at 100 miles an hour. And we have nine minutes, isn't that exciting? I don't have nine minutes of content, thank God. I'm gonna give you some tools and resources. First of all, read this book. If you haven't read this book, read this book. It's an amazing book. It gives you a lot of insight into how browsers work, especially when it comes to networking. And Ilya added an H2 chapter to it, which is a great overview of HTTP/2. It's only about 25 or 30 pages. It's a fantastic primer
if you'd rather read a chewed-up version of the protocol than the spec itself. And this is a very well-chewed-up version. It's a very, very good overview; it's the first thing I read for H2, actually. There's a little extension for both Firefox and Chrome that tells you if the site you're on is using H2 or H1. It puts a little lightning bolt up there. I started using it about a year and a half ago, and I can't not look at it anymore. Every site that I go to in the world, my eye instantly goes to the lightning bolt to see what protocol I'm talking. This really, really hampers casual surfing. Like, if I'm buying something on Amazon, at some point I'm gonna look at DevTools and see how they're using H2. DevTools shows you what's going on. There's a protocol column. It's not on by default; right click on the columns and add it. You can see if you're talking H2 or H1 to a server. And if you wanna get geekier, there's the chrome://net-internals panel, which gives you all sorts of stuff about your HTTP connections. If you look there and click on one stream, you actually see the frames go back and forth, which is really cool. That ends up being the input for that priority graph I showed you earlier. I can't have a talk without mentioning Wireshark, and I know I've done it once, so I'm gonna mention it a second time. Wireshark is an amazing tool that lets you look at packets on the network, and frames on the network, and it has a decoder for HTTP/2. It's not the best decoder, it has a couple of flaws, but it's great because you can actually see what's happening on the wire. Now, because everything is TLS, things have gotten really difficult with Wireshark. That link right there tells you the easiest way to decode it. It's still not super easy, but it does make things slightly easier. curl supports it. We love curl. There's a newer version of curl, actually, it's not new anymore, but it may not be in the standard distribution, so you may need to install it yourself, that supports H2 behind a flag. You get to see some H2 things happening, and it's H2 compliant. There's a curl-like tool called nghttp that's actually amazing. It's similar to curl, but H2 only, and it gives you a lot of debugging info about what's happening inside the connection, with frames and streams and everything. You get to see push promises. You get to see push frames. You see data frames. You see everything. It's great. There are a bunch of other tools out there; most of their names start with h2. Proxies have started adding support. h2c is a good tool for analyzing things with a kind of man-in-the-middle proxy. And the working group itself is putting together a list of tools as a reference; there's a link to that at the bottom. Okay. This, by the way, is slide number 179. I'm proud of you for not falling asleep all at once. I saw some of you here and there, but all of you did not fall asleep at once. 179 slides. H2 is complicated. My intent wasn't to come here and bum you out about it. My intent was just to take a practical look. This is by no means the end of this. We need to do more of it. We need to take real-world data and feed it back into the system. It's okay if H2 isn't perfect. It's a step. We finally iterated on a protocol that was 25 years old, and I guarantee you H3 will not take 25 more years. H2 to H3 is gonna be shorter.
It's up to us to feed back into the community and inform the decisions that are going to be made for the next version. I know the authors and all the people in the working group are hungry for data, and that data is gonna come from us. I'm gonna try to do everything I can, because I'm in a position with a CDN where I can maybe provide some data, but real-world data from you and your applications is gonna be much more valuable than the aggregated things I can provide. What we learned is that it's probably not gonna be as fast as you'd like it to be, but maybe it will be; it's gonna be case by case. And most importantly, all of the sites I tested were sites built for H1. Maybe building sites for H2, to leverage the mechanisms H2 gives us, is going to let us uncover some of its benefits a lot better. So it's time that we understand these features better, and hopefully this was a good primer to get you started building applications with those features in place. We still have a lot of learning to do, but H2 is ultimately the way we're gonna be going forward, and feeding back into the system and sharing data is how we get there. If you do your own tests and find anything that looks like this, anything that's educational, publish it. Everybody wants to see these things, especially now. We're in a very special place: we get to see a protocol be built, and that's not a position a lot of people have the luxury of being in. So let's take advantage of that. I have four minutes and 30 seconds for questions, and I'm gonna be around. I'm gonna be outside right after, if you wanna come up to me afterwards and ask questions. Are there any questions? Yes. So the question is: HPACK can't be disabled and is an attack vector; does that mean I should never use H2? The answer is yes, it's an attack vector, though that doesn't mean all implementations are vulnerable. But no, it doesn't mean you should never use H2. Everything we use is, at some point or another, an attack vector, and it's up to implementers to make sure those points are addressed. Now, if you read that particular paper, it'll tell you what they attacked and what's been done since, because just like any security advisory, things get, man, I did that like 20 times today. Things get exposed and we fix them. What's happened with H2 is that a whole bunch of new mechanisms are in place, and we have a lot of learning to do; those things get exploited, we fix them, and we move forward. So, yes and no: it doesn't mean you shouldn't use H2. Anything else? Yes, that's a good question. I showed a comparison between H1 and H2 of the data sent from the browser to the server and from the server to the browser. The question is: was it gzipped, was it not gzipped, what was the content? Well, the browser-to-server direction was just all requests, so there wasn't any content to gzip there, and what I was showing was H1 versus H2 with HPACK, which is the only version of H2. That's where the biggest gap was. The response, on that particular site, was actually all images. Actually, no, that one was the Fastly site, so it's compressed; everything is compressed. Basically, I took the standard site and showed what it would be like with HPACK and without HPACK. So whatever content was coming across was the same for both H1 and H2, and what you were seeing was the delta caused by HPACK, which was about this big for server-to-browser communication.
Does that make sense? Oh, why not just gzip the headers? That's a good question. I think SPDY tried that and it was a problem. So, one of the design decisions they made, I don't know, it may not have been gzip exactly, but there was some compression algorithm they used for headers. One of the design decisions made when SPDY turned into H2 was to use Huffman encoding. I don't know what the motivations were; I tried to stay out of it and said, give me a paper to read and I'll learn. Anything else? Yes, sorry. Oh, the question is: is SPDY actually finished? Yes. Chrome is pretty much pulling out support, if they haven't already. Firefox is gonna pull out support. SPDY turned into H2. And one of the things that happened, this may not be interesting to you but it's interesting to me, is that when H2 started, there was a call for proposals, basically a where-do-we-start. There were three or four different ones, actually, and SPDY was the one that was chosen, and SPDY turned into what's now H2. And SPDY is also part of what enabled QUIC. QUIC started as a transport for SPDY, and QUIC can be a transport for H2. So over time, QUIC is gonna carry H2 more and more, and the standardized version of QUIC will carry H2 rather than SPDY. So SPDY is basically getting phased out, yes. Anything else? Okay, we have 26 seconds left. Way to go, I'm proud of you. Thank you for your time. I'm gonna be right out here if you have any more questions. Thank you.