All right, so HTTP resource prioritization. I'm going to be honest: it's actually quite a boring topic. So for today I decided I'm going to talk about something completely different, something I think you and I will enjoy much more, which is of course food. I love this stuff, I eat at least once a day, typically with my family, and over time I've noticed that we tend to eat food in slightly different ways. For example, if this is the meal, what I like to do is keep the best for last: I typically eat the broccoli first because I don't really like it, and I like to finish strong with the fries. My girlfriend thinks that's a bit stupid; she thinks that by then the fries are all cold and soggy, so she switches it around. And as you can see, in my household we often have a lot of leftover broccoli. Now, my sister is much more strict: she typically devours something in its entirety before moving on to her next victim. It's easy to see why she decided to become a lawyer. Finally my dad, bless him, he's of old age, he simply doesn't care anymore; he just goes around the plate, picking things up as he goes.

Now, you know the old saying: if you turn your hobby into a job, you'll never have fun again. And that's what I ended up doing, because the way I see it, loading a web page actually looks a lot like eating a meal, at least in the modern protocols. In the old style, HTTP/1.1, you would open multiple parallel TCP connections, which in my analogy would be the same as everyone having six mouths eating at the same time, which is of course insanity. Luckily, with HTTP/2 over TCP and the upcoming HTTP/3 over QUIC, we can move to a single underlying connection, which is much more sane. This also means, of course, that we now have to start multiplexing our resources on this one connection. Sounds simple enough; it's actually quite complex in practice. And I've found four problems with this in practice that I'll be talking about today.

The first problem is: which of the possible options is actually the best? Because, like with the food, we have many different options here. We can send everything back to back, sequentially. We can do some kind of round-robin scheme, switching the bandwidth between resources. We can even combine these things. For the food it didn't really matter, it was mostly personal preference, but when loading a web page this order really matters in terms of how the page renders, and in turn the user experience. So which of these options is best for web performance? Let's try to deduce that using a very simple example.

In this web page, the two top resources are JavaScript and CSS. We can say that they are render-blocking: they have to be downloaded in full and executed before you can actually render the rest of the HTML page. That means it makes sense to fetch them first, sequentially, back to back. The next two resources, as you can see, are progressive JPEGs. What does that mean? You typically have two ways of encoding a JPEG image. The first one is the typical one: it just loads from top to bottom. The second one, the progressive one, is actually much smarter: it encodes the JPEG in different quality layers, meaning that even if you have just a little bit of the JPEG available, you can already render a kind of low-quality placeholder up front. This means that if you know you're going to have progressive JPEGs, it makes sense to round-robin them on the wire, because you can already start rendering all of them in some way or another.
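Whether round-robin or sequential wins thus depends on what kind of resources you have. To make the trade-off concrete for render-blocking files, here is a tiny scheduling sketch (purely illustrative; the resource sizes and the chunk granularity are invented). It shows how fair round-robin pushes back the moment a render-blocking file is fully available, compared to sequential delivery:

```typescript
// Toy comparison of sequential vs. fair round-robin multiplexing on one connection.
// Every chunk of a resource costs one unit of time; we record when each resource
// is fully downloaded. Sizes are arbitrary illustration values.

type Resource = { name: string; size: number };

function finishTimes(resources: Resource[], roundRobin: boolean, chunk = 10): Map<string, number> {
  const remaining = resources.map(r => ({ ...r }));
  const done = new Map<string, number>();
  let time = 0;
  while (remaining.length > 0) {
    // Sequential: keep feeding the first unfinished resource.
    // Round-robin: give every unfinished resource one chunk per round.
    const active = roundRobin ? [...remaining] : [remaining[0]];
    for (const r of active) {
      r.size -= chunk;
      time += chunk;
      if (r.size <= 0) done.set(r.name, time);
    }
    for (let i = remaining.length - 1; i >= 0; i--) {
      if (remaining[i].size <= 0) remaining.splice(i, 1);
    }
  }
  return done;
}

const page: Resource[] = [
  { name: "app.js (render-blocking)", size: 100 },
  { name: "style.css (render-blocking)", size: 50 },
  { name: "hero.jpg", size: 200 },
];

console.log("sequential ", finishTimes(page, false)); // js done at 100, css at 150, jpg at 350
console.log("round-robin", finishTimes(page, true));  // css at 140, js at 240, jpg still at 350
```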
On the other hand, if you don't have progressive JPEGs, you can make an argument that it's better to load them sequentially. You can say something very similar about the HTML as well. If those other things are render-blocking, then why would I send the body down before the CSS? You can't render it anyway. A counter-argument would be that, even though the browser doesn't render it, it already parses the HTML, discovering new resources like the images, which can then already be requested. A counter-counter-argument is that that doesn't really matter, because the images are blocked behind the CSS and JavaScript anyway, and if your HTML ends up being quite large, you will block everything behind it while rendering nothing to the screen. So it's not immediately evident what the best approach is here, and we can make it worse.

Let's add a new resource, an asynchronous JavaScript file, at the bottom. This means it's not render-blocking, but it should still be executed as soon as possible. As-soon-as-possible JavaScript: you might think, okay, high-priority resource, load this before the images. You could also say: I think the developer put this at the bottom of the page for a reason, it's not high priority, I'm actually going to load it after the images. And then the final cherry on top comes when we finally download our first JavaScript file, we execute it, and we find out what it does: it requests a completely new resource that we didn't even know about, a JSON file. Now, if this is one of those fancy new React or Vue client-side rendered apps, this JSON is probably going to contain the main content for your web page, so it's going to be high priority. On the other hand, if it's not, it might just as well contain some random comments on your blog post that nobody's going to see anyway, so you might actually leave it until the end.

And I could keep going, but I think everyone by now understands the gist of this, and that is: the browser simply does not know. It doesn't know how large a resource is going to be, it doesn't know what it's going to end up doing, it doesn't know what kind of sub-resources it's going to require. So the only thing the browser can do is guess from fairly coarse signals: things like the type of file, the position in the document, and things like async, defer and preload, which I'll come back to later on. So what actually happens is that the browser constructs what we call a heuristic, a guess, of what is going to be most important.

Now, we can see here there's a bit of difference between the browsers, but also some agreement. Most of the browsers think HTML is quite important, and so are JavaScript and CSS, but there are some differences of opinion about how important fonts are, and especially for the fetch example there's definitely some disagreement. One very important one is the one for Edge there. That's the old Edge browser, from before they moved to Chromium, which actually failed to specify prioritization at all, at least for HTTP/2. Now that you know this, can you try to predict which of these heuristics is actually going to work best for web pages? Maybe you can; let's see if you were correct with an example. Here we're going to load the same web page in the different browsers. Let's see which one comes in first: it's Chrome. Then we have Firefox close afterwards, then we get Safari, and now we can wait ten more seconds for Edge to complete.
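As a brief aside before the replay: the kind of heuristic a browser builds from those coarse signals can be sketched roughly as follows. This is purely illustrative; every engine uses its own signals and its own priority buckets, and the mapping below is invented for the example.

```typescript
// A rough sketch of a browser-style priority heuristic (illustrative only).
type ResourceKind = "html" | "css" | "script" | "font" | "image" | "fetch";
interface Signals { kind: ResourceKind; async?: boolean; lateInDocument?: boolean }

function guessPriority(s: Signals): "highest" | "high" | "medium" | "low" {
  switch (s.kind) {
    case "html":
    case "css":
      return "highest";                                        // render-blocking by default
    case "script":
      return s.async || s.lateInDocument ? "medium" : "high";  // async / late scripts demoted
    case "font":
      return "high";                                           // browsers disagree on fonts
    case "fetch":
      return "high";                                           // and even more on fetch()
    default:
      return "low";                                            // images and everything else
  }
}

console.log(guessPriority({ kind: "script", async: true })); // "medium"
console.log(guessPriority({ kind: "image" }));               // "low"
```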
I'm going to play it again, and I want you to focus on the differences, especially between Firefox and Chrome, and how they load fonts and images. The difference is going to be quite stark if you focus on the fonts and the images. So it's quite different, and I don't know if you would have been able to predict this; I would have had a tough time with it. That's actually because I was kind of lying, I was being a naughty boy, because I only showed you one half of the equation. It's not just the heuristics of the browsers that are different; it's also how they wish this to be enforced on the wire. For example, Chrome really likes everything to be downloaded in fully sequential order. That's different from, for example, Safari, which does a weighted round-robin scheme: HTML is more important, so it gets more bandwidth, but everything else gets at least a little bit of bandwidth until it's HTML's turn again. Firefox does something a bit more complex: it tries to give more priority to the more important resources, obviously, so the images are left until a bit later. And at the bottom we can see what happens for Edge: it didn't specify anything, so it falls back to the default in HTTP/2, which is fully fair round-robin behaviour. And we saw that that was clearly the worst of all.

But why was that? Well, if you remember, the orange and the green resources were the JavaScript and CSS that were render-blocking. You can see that all the way at the end there, it's still downloading those resources. And as I said, we have to download these in full to be able to execute them and make them usable. If you start round-robining between this kind of resource, you're actually delaying the moment that happens; it comes much earlier if you send them sequentially. This is, of course, very un-nuanced: modern browsers are much smarter than this. Some of them, for example, have streaming script compilers, which can already parse and compile JavaScript as it comes in. So it's not as bad as you might think, but you still need to wait for that final chunk to come in before you can actually use the file.

But then you might wonder: why are we using round-robin at all? Well, there are some cases in which it's actually quite okay. For example, if you have resources with very big differences in file size. In this case, you have a very large resource that is holding up much smaller resources in the sequential scheme. If instead you do this with more of a weighted round-robin, like Safari does, you get those smaller resources downloaded quite a lot faster, so they can be used, and it doesn't really delay the big resource all that much in the grand scheme of things.

So the question was: which of these browsers is best for most web pages? We've now only looked at one example, and from this you can see that maybe there are pages that behave differently in different browsers. So what we really want to do, to do good science, is look at many more different web pages, of course. And the problem is that I know of only about five or six studies that have actually done this at scale, two of which are my own. So let's look at a couple of these. In this graph, each of the lines is a different prioritization mechanism, a different browser if you will. And you see there's actually not that much difference between them; the only one that's really set apart from the rest is indeed what Edge is doing, the completely fair round-robin.
And this was on quite bad cellular networks; if we do this on much faster cable networks, for example, the differences become even smaller. Then we did a second study, looking a bit more at the theoretical side, taking the specific network conditions out of it a little and seeing what would happen in an idealized situation. And at least for our data set, we can see that Chrome is clearly the best, but we can also see that Safari in some cases will be better than Firefox. Okay, that's our data.

Then the third study was from Cloudflare. They're a big CDN, and they've implemented their own prioritization scheme at the server. You could say that they combine what Chrome is doing for the higher-priority stuff with, roughly, what Firefox is doing for the other stuff. It's difficult to really draw conclusions from this, because they don't publish any papers or data sets, but there is one very good quote in their blog post on this, and they say that this is about 50% faster than what Edge and Safari are doing. So that means that the sequential behaviour should be better than round-robin in general. So we start to see a trend: these all say Chrome is quite good.

Of course, the next two studies completely contradict this. These are from Google itself, the people who make Chrome. The first study says: we compared this to a random scheduler, no logic at all, and we only got faster results for about 31% of the web pages. Then in the second study, quite recently, they compared this to the fair round-robin scheme as well, and where we found differences of 50% or more, they find only about a 2.7% difference. And it gets so much weirder, because they also compared this to LIFO, which is last in, first out, so the last requested resource gets sent to the browser first, and even then they only see about a 3.1% benefit.

When I first saw these things, I thought: these guys have made a mistake. This can't be true. They used bad websites, they have bad setups, I don't know what. However, I know some of these people; they're very smart, and they really wouldn't make basic errors like that. So it's very difficult for me to just discard these results as nonsensical and say my results were correct. That means that if you would ask me today which of the browsers is best, my answer would be: I don't know, and I don't think anybody knows. I think many people have opinions, but they don't really have proof. And I agree that's not a very satisfactory answer. What I think is happening is that most of these schemes are indeed quite good for most web pages, but we have a lot of huge outliers: some pages are going to do really well in some of the browsers, but some are also going to do really, really badly if the browser gets the heuristics wrong. So like most things in web performance, it's going to depend on your very specific web page, and you're going to have to tweak for your use case.

Now, how can you do this in practice? The best thing to do is to first verify that you actually have a problem. You can use the WebPageTest tool for this. It will generate this nice waterfall for you, and since last year they started showing, with the opaque bits there, where exactly the HTTP/2 data for that specific resource is coming in.
And then you can also collapse this waterfall into the connection view, as they call it, beneath it, and that gives you a timeline like the visualizations I've been using in this talk, which really give you an overview of how things are coming in. And if these don't match up with what you were expecting, then maybe you have a problem.

There are some ways of dealing with that. You can start switching around the ordering in your web page. There are also some client-side features that you can use. We've already seen async; defer is similar, but also allows the JavaScript execution to be delayed. Those are only for JavaScript, though. You can then use preload, which, as I explained before, means that if a resource would request a sub-resource somewhere down the line, you can already say to the browser: I know I'm going to need this in the future, you can start requesting it now. You don't need to wait for, say, the fetch API call. This also allows you to do nice things like including CSS in a non-blocking way. Very cool; it has some problems, though. There's been a long-standing bug in Chrome where you can actually end up with a priority inversion, where the preloaded resource gets sent before the resource that is actually requesting it. Andy Davies has a very interesting blog post on that. The other problem is that it is currently not supported in Firefox, and it's quite unclear whether it will become available anytime soon.

The final feature, the core feature for this I would say, is what they are calling priority hints. These will allow you to manipulate the heuristics on a per-resource basis, by saying something like importance=high or importance=low, so the browser knows that it might have to tweak its guess for this specific resource; you can also set this using the fetch API. This is fantastic, I'm very excited about it. The problem is that this is only implemented in Chrome, and I think it was implemented about three years ago. They then tested it on several select web pages, but they haven't yet enabled it by default. It's still behind a flag, and it's very difficult to know whether it's actually going to be enabled soon, and also whether any of the other browsers are going to implement it; it's not looking like that at the moment.

So maybe you're helped by all this, I hope you are, but if you're not, there is still a big red panic button that you can push to make everything better, which is using server-side overrides. Because up until now we've been talking about what the browser wants. This is what the browser wants to happen, but it's of course the server that has to send the data to the browser. So if the server thinks "what you are telling me isn't right, I know better than you", it can of course just ignore what the browser is telling it and stream the resources in the order it thinks is best. Sounds fantastic; it's actually very complex in practice, and to understand that we have to move to our second problem of the day, which is: how do we actually communicate what the browser wants to the server? Again, sounds simple, can get complex.

The easiest way of doing this, the original way, which was in the Google SPDY protocol, the precursor to what eventually became HTTP/2, was very simple. It just said: every resource gets one integer, a priority level or a priority bucket, and the server can just go down these buckets in order of priority, serving the resources.
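Sketched in code, that bucket scheme is about as simple as prioritization gets. This is a minimal illustration; the bucket numbers and file names are made up.

```typescript
// SPDY-style priority buckets: one integer per request, lower bucket = served first.
// Within a bucket, requests keep their original order (Array.prototype.sort is stable).

interface Request { url: string; bucket: number }

function sendOrder(requests: Request[]): string[] {
  return [...requests]
    .sort((a, b) => a.bucket - b.bucket)
    .map(r => r.url);
}

console.log(sendOrder([
  { url: "/hero.jpg",   bucket: 3 },
  { url: "/style.css",  bucket: 0 },
  { url: "/app.js",     bucket: 1 },
  { url: "/index.html", bucket: 0 },
]));
// => [ "/style.css", "/index.html", "/app.js", "/hero.jpg" ]
```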
Very simple to set up, and it works quite well in practice, but there are some problems. For example, it's impossible to indicate in this scheme how you would like round-robining to happen, if at all. There's a second problem, having to do with fairness. Now, follow along with me, because this was about eight years ago, when they were very optimistic about everything using a single connection all the time. The use case was: we will have different clients connecting to one CDN node, and the CDN node ends up connecting to the back-end server with just one big, persistent HTTP/2 connection, and on this connection we're not just going to multiplex the resources for each client individually, but across the clients. Everything is sharing one big connection. If everybody plays nicely, that works well, but what happens if one of these clients decides to say: hey, everything that I'm requesting is actually highest priority? If the origin just follows that, you get a very unfair situation, which is great for the misbehaving client but quite bad for the rest. And if you take this to the extreme, why would the other clients not also start saying this? Then you get fairness again, which is good, but also a completely useless prioritization mechanism. You might be thinking this seems like a bit of an esoteric issue that's not really important in practice, but if you look at the history of all these things, which I've included there, you will actually see that this was one of the main reasons why they moved from the very simple scheme to what eventually ended up in HTTP/2, which is quite a bit more complex.

HTTP/2 says: we're going to do everything in a dependency tree. There's no longer just one integer; you have a specific place in a resource tree, and if you're alone at your level, that means you get all the available bandwidth at that time. That's the way you get sequential behaviour. But if you have a sibling, two things at the same level, then you end up round-robining between them, and you can specify a weight for each of the resources to get an unfair round-robin, as you can see here. This is good, and it's also quite easy to communicate to the server: every time you make a request, you just have to say, hey, this is the parent for this request, and maybe this is the weight, and the server can deduce what the browser wants from that. It also very elegantly solves the whole fairness issue: the only thing the CDN has to do is simply put a new root node on top of everything, give everything equal weight, and we have a fair bandwidth share without having to mess with the individual clients' priorities.

This all seems very sensible, right? That's of course the reason it's in the HTTP/2 spec. Now, did it actually pan out to work well in practice? Well, no. As it turns out, I don't know of a single CDN that actually ended up implementing this scheme. None of the CDNs are using this in practice, even though it was one of the main reasons to switch to the dependency tree. And it gets worse, because it's not just the CDNs: if you look at what the browsers are doing, they also don't really use all this flexibility. Chrome just builds one very long sequential list. Safari and Edge just add everything as a sibling of the root; the only difference there is which weights they apply. The only one actually using this to its full extent is Firefox, but as we've seen before, they don't necessarily get better performance out of this in every case.
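As a toy illustration of how a server might walk such a tree to pick the next stream to send (greatly simplified: real servers also have to do the weighted sharing between siblings and handle the exclusive flag, which this sketch ignores; the stream IDs and weights below are made up):

```typescript
// Toy server-side view of an HTTP/2 dependency tree. Each request carries a parent
// and a weight; a stream only gets bandwidth once everything it depends on is done.

interface Stream { id: number; parent: number; weight: number; pending: boolean }

const streams = new Map<number, Stream>();

function priority(id: number, parent: number, weight: number): void {
  streams.set(id, { id, parent, weight, pending: true });
}

function nextToSend(parentId = 0): number | null {
  const children = [...streams.values()]
    .filter(s => s.parent === parentId)
    .sort((a, b) => b.weight - a.weight);  // heavier siblings first
  for (const c of children) {
    if (c.pending) return c.id;            // a pending stream blocks its own subtree
    const deeper = nextToSend(c.id);       // finished? then its children may proceed
    if (deeper !== null) return deeper;
  }
  return null;
}

// A Chrome-style sequential chain: each request depends on the previous one.
priority(3, 0, 256); // style.css
priority(5, 3, 183); // app.js, depends on style.css
priority(7, 5, 110); // hero.jpg, depends on app.js

console.log(nextToSend());       // 3: only the CSS is being sent
streams.get(3)!.pending = false; // the CSS finishes
console.log(nextToSend());       // 5: now the JS gets the whole connection
```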
If you think that is bad, let's look at the server side, because even today there are servers that don't implement this mechanism at all. And of the ones that claim to, Patrick Meenan and Andy Davies have done some good tests, and they find that only a very small subset actually does this properly: only a very few actually listen to what the browser is saying and serve the resources in the correct order. And the final nail in the coffin of this whole thing is that it makes it difficult to do the server-side overrides I mentioned before. Because remember the use case: I have, for example, one image that I want to make higher or lower priority. How exactly am I going to do that? It depends on the browser which tree manipulation I will end up having to do. It's possible, but I need to know the details of how these browsers internally construct their trees, and then manipulate them. This is actually what Cloudflare does. Cloudflare tries to guess which browser is connecting to it based on the priority tree it sees, and it then re-maps the resources onto its own scheme, at least for non-Edge browsers; because Edge isn't sending any priority information, there it has to use the MIME type to try and determine what is happening. So I'm not saying it's not possible; I'm saying you need a Cloudflare-level engineering team to manage this complexity. This is not something a normal developer can do, and it's normal that they have put this behind their commercial offering.

So I think we can conclude that this was the state of HTTP prioritization around the time that we started on HTTP/3. HTTP/3, the new version of the protocol, runs on top of QUIC, which is a new transport protocol next to TCP. And because QUIC is very different, we already had to change quite a lot of things to make HTTP/3. The discussion then was: you know, how about we also change the dependency tree setup, why don't we use something simpler? There was a huge amount of discussion about this, literally months and months, and we ended up deciding: yes, we're going to remove the tree setup and switch back to something simpler. Now, this is only the current proposal, this is not the final spec, this is just the way we're thinking about it right now. The idea is to go back to something that looks a lot like what SPDY did, but with some key adjustments. The first is that, as you can see, a couple of these priority levels are now reserved for the server. This means that it doesn't really matter what the browser ends up doing: the server is always going to be able to do its server-side overrides. The second new feature is that each resource can now get an incremental flag, indicating whether it should be round-robined on the wire or not.

That's the main part of the spec. A second aspect is that we're proposing to communicate this using a normal HTTP header. In HTTP/2, everything was communicated in HTTP/2's binary framing layer, so it really wasn't all that visible: it was difficult to debug, and you didn't really see it in normal dev tools. The idea here is that if we just use a normal header, it's going to be much easier to view, much easier to debug, and maybe we can even expose it to JavaScript and let users manipulate it themselves.
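To give a feel for what such a header-based signal looks like, here is a small sketch. The syntax shown (an urgency number plus an "i" flag for incremental) roughly follows the shape of the current proposal; the exact names and format are still under discussion, so treat them as illustrative.

```typescript
// Serializing and parsing a simple priority header of the form "u=<urgency>, i".

interface PriorityHint { urgency: number; incremental: boolean }

function toHeader(p: PriorityHint): string {
  return `u=${p.urgency}${p.incremental ? ", i" : ""}`;
}

function fromHeader(value: string): PriorityHint {
  const u = /u=(\d)/.exec(value);
  return {
    urgency: u ? Number(u[1]) : 3,                // fall back to a middle urgency
    incremental: /(^|,)\s*i\s*(,|$)/.test(value), // the bare "i" flag
  };
}

console.log(toHeader({ urgency: 1, incremental: true })); // "u=1, i"
console.log(fromHeader("u=5, i"));                        // { urgency: 5, incremental: true }
console.log(fromHeader("u=2"));                           // { urgency: 2, incremental: false }
```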
I personally think this is a very good idea, and I'm very excited about it, but like I said, there are many open questions about this concrete proposal. For example, let's say that we have six different resources, all at the same priority level, but one of them needs to be sent sequentially and the rest incrementally. How do you, as a server, actually put this on the wire? There are many different options. You could say: I'm first going to send the first two, then I'm going to switch to the sequential one, and then the rest. You could say: the sequential one is probably going to be more important (it's probably a blocking resource), so send that one first. You could say: I'm going to send a little bit of all the incremental ones, so that they can start rendering or processing or whatever, then the sequential one, then the rest. It's not really clear what the best solution is. You are also no longer really able to do what Safari wants to do with the weighted round-robin, and the question is whether we really still need that in a simpler scheme or not. And this is just one of the questions; there are many more. There are many people who don't like the idea of using a normal HTTP header for this, and even the ones that do are not keen on the idea of exposing this to the JavaScript fetch API without some extra safeguards, and then there is still the fairness issue to deal with. So this is not literally what is going to be in HTTP/3, but I do think it's going to be a small evolution of this proposal: still something much simpler than we had in HTTP/2, easier to comprehend, easier to implement, and hopefully with fewer bugs, which is great.

So this means we're done, right? End of talk, thank you. However, there's of course more, because we've only been looking at the HTTP layer so far, and that's not the only thing that influences this; we are running on a transport layer as well. For example, there's a very nice issue here that Patrick Meenan talked about in an almost two-hour talk just on this topic. The thing is, if your TCP buffers are too large, you can end up with a priority inversion situation again, where you fill these buffers with low-priority resources and then there is no more room for the high-priority stuff that comes in later, so things get delayed again. This means that even if you have a well-behaving client and a well-implemented back-end server, you can still have problems with prioritization.

There's another problem with TCP, called head-of-line blocking. In HTTP we know that we have different files on the same connection, but TCP doesn't: to TCP, everything is a single opaque byte stream. This means that if we have even a single packet loss in there, TCP has to hold back everything, waiting for that single packet to be retransmitted, before it can deliver packets 3 and 4 to the layer above, even though they are for completely unrelated resources. This is actually the core thing that separates QUIC from TCP, because QUIC does know that there are different streams on the same connection: if you have an HTML and a JavaScript file, it knows these things are completely independent. This means that if QUIC suffers a single packet loss, it can just bubble packets 3 and 4 up to the browser, only waiting for that one packet to be retransmitted. This is why they sometimes say that QUIC removes TCP's head-of-line blocking, and so it's better for performance. And I kind of agree, but there is a lot of nuance there, because if you look at this more closely, this only happens if you are round-robining your resources: you need to have multiple things in flight at the same time.
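Here is a tiny sketch of that head-of-line-blocking difference. It is purely illustrative: the packet and stream names are invented, and real TCP and QUIC are of course vastly more involved.

```typescript
// Packets 1..4 carry data for two independent resources; packet 2 is lost.

interface Packet { seq: number; stream: "css" | "jpg"; lost?: boolean }

const packets: Packet[] = [
  { seq: 1, stream: "css" },
  { seq: 2, stream: "css", lost: true }, // has to be retransmitted later
  { seq: 3, stream: "jpg" },
  { seq: 4, stream: "jpg" },
];

// TCP: one opaque byte stream, so nothing after a gap can be delivered upwards.
function tcpDeliverable(pkts: Packet[]): number[] {
  const out: number[] = [];
  for (const p of pkts) {
    if (p.lost) break; // gap: everything behind it stays stuck in the buffer
    out.push(p.seq);
  }
  return out;
}

// QUIC: it knows which stream each packet belongs to, so only the stream with
// the gap has to wait; packets for other streams are delivered immediately.
function quicDeliverable(pkts: Packet[]): number[] {
  const gapSeen = new Set<string>();
  const out: number[] = [];
  for (const p of pkts) {
    if (p.lost) { gapSeen.add(p.stream); continue; }
    if (!gapSeen.has(p.stream)) out.push(p.seq);
  }
  return out;
}

console.log(tcpDeliverable(packets));  // [ 1 ]       -> packets 3 and 4 are stuck as well
console.log(quicDeliverable(packets)); // [ 1, 3, 4 ] -> only the css stream has to wait
```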
If you're sending things sequentially, this benefit goes away, because of course you can't reorder things within the same file, and you end up with something that looks a lot more like the normal TCP behaviour. So the interesting thing is that we've been saying the sequential behaviour is probably better for web page performance, but on a lot of networks it might make more sense to switch to something at least a little bit more round-robin-like to get around this head-of-line blocking. So QUIC brings a whole lot of new challenges that reflect back up into what we want to happen at the HTTP layer. This is just one example; there are many, many more. As chance would have it, I just finished a new paper on this, so please go and read it, or come talk to me afterwards if you want to know more of the details, because it's about time for me to start wrapping up with the fourth, and probably most important, problem.

And I say here: there is no problem. Why am I saying that? Because, maybe you've noticed, I've actually given you some very contradictory information. On the one hand I've said, you know, Edge is 50% slower than what Chrome is doing, and a lot of the HTTP/2 server deployments are enormously broken, they do prioritization completely wrong, so we have a big problem. On the other hand, you have the studies from Google saying it doesn't really matter, it's only a 3% difference, who cares. And there are other things that I've noticed over the years: I haven't actually seen a lot of developers complaining that, for example, "I've enabled HTTP/2 and suddenly my web page is so much slower on Edge". I haven't seen a lot of those posts, so it seems that people are not actually noticing these things much in practice. So these are two completely opposite viewpoints, and only one of them can be right. If the right one is right, that means that I've wasted two years of my life researching a non-issue. But if the left one is right, if I am right and if Cloudflare is right, that actually means that HTTP/2 has been broken for a long time, that websites have been quite slow and nobody has noticed, and that's a difficult thing to sell to a room full of web performance experts.

So I've thought long and hard about how to reconcile these two views, and I'm still not there yet, but I do have some conjectures, some things that I think are happening. Personally, I think performance matters (well, of course performance matters), and I think prioritization matters, but for most web pages, because not everything is on just the one connection yet and there are a lot of other aspects coming into play, you won't really see it even if the prioritization goes wrong. You'll primarily see it on very complex pages, and only if you test on slow networks, and sadly a lot of us are still not really testing on slow networks, let's be honest. The second thing is that even if you have a problem at the network, it's often not the bottleneck: if you're shipping five megabytes of JavaScript to a mobile device, it's probably going to be stuck on main-thread processing, and you might not even notice a problem at the network, even if it's suboptimal. It could also be that when something breaks in this prioritization stuff, it breaks very hard, it's very obvious, and people fix it quickly without thinking too much about it. Or, and I think this might be more likely, people have seen problems, but they have been unable to match them to the root cause, which is the prioritization mismatch that's happening, because people don't really know enough about how the system works internally.
That's one of the reasons I wanted to do this talk: to hopefully get a little bit more insight into this topic out into the world. But one of the main reasons, I think, is that we have a very unhealthy browser ecosystem at the moment, where so many people are just using Chrome. As we've seen, Chrome tends to do quite well in our tests (I believe it's doing prioritization the best of all the browsers), so it might be that we're just not noticing these problems because everybody is focusing on Chrome. So we're not there yet, and I think we still have some issues to discover with regard to prioritization and what is actually going on at scale. What I do think, and hope, is that QUIC will actually help us with this. I think QUIC will unearth some of these problems and introduce some new issues, just because of the way it works, and that's going to help us make some progress. I also think that what we're going to end up with for HTTP/3 isn't going to solve all our problems; there are still going to be edge cases, of course, but I hope it's going to be simple enough that we can understand it. I also think we'll be able to backport it to HTTP/2, so that we can solve some of the existing issues there. And if that's true, if it all becomes much easier to understand and debug, that means that maybe I get to come back next year and do a new talk on this in just ten minutes, because it's so simple, and I can spend the rest of my talk on something that I actually care about, which is Belgian waffles. Thank you.

I think we have two minutes for questions. Five minutes? Okay, five minutes for questions. No questions? Oh, there it is. I don't see you, shout. Oh, there. So the question is: what changes do you expect to have to make to move from HTTP/2 to HTTP/3? Yeah, so most of the difference is at the QUIC layer. QUIC changes a lot, so you're going to have a lot of work with your DevOps setup and your firewalls and opening up ports and that kind of stuff. But at the HTTP layer, not much actually changes, except of course for the prioritization stuff, and normally you as a developer shouldn't have to care about that too much; it's still going to be the browsers that translate their heuristics onto the new scheme. So the move to HTTP/3 should be fairly simple if you have a good DevOps team in place.

More questions? Okay, so the question is: with HTTP/3, will the browsers end up changing their heuristics? That's kind of what I tried to say with the whole head-of-line blocking issue. Because QUIC allows new things to happen, it could be that the browsers are now saying: we had to work well on TCP, but we find something new that works better on QUIC, and so they might change some things depending on what works well in practice. So I think they're going to stay largely the same, but I hope they're going to diverge at some point, because QUIC allows some really cool internal optimizations for this as well. Anyone else? Thank you all.