Hey everyone, welcome back to the State of the Web. My guests are Addy Osmani and Katie Hempenius. They're engineers on the Chrome team making the web fast. And today we're going to be talking about a web performance technique that takes the guesswork out of resource loading. Let's get started. So Addy, welcome back to the show, and Katie, thanks for being here. Good to be here. Thanks for having us. You've developed a technique called predictive fetching, but let's set aside the predictive part for a second and just talk about fetching. What is that? So fetching is the process of going and getting a resource off the network. It could be your JavaScript, your web fonts, or something else. Now, as web developers, we're usually shipping a mountain of code down to the browser, and the browser struggles with that because it doesn't really know what's actually important. Thankfully, a number of different resource hints have come to the web platform over the last couple of years to help you give the browser a hint that some of these things are more important than others. Things like dns-prefetch and preconnect: dns-prefetch is useful for pre-resolving the DNS names of the different origins you might connect to, and preconnect is useful for warming up connections to those origins as well. We've also got prefetch. Yes, there's prefetch, preload, and prerender, and they all take it a step further and actually load the resource instead of just setting up the connection in advance. And prerender is actually deprecated, so prefetch is the one people should be paying attention to. Preload and prefetch sound very similar. What's the distinction there? Prefetch is intended for resources that you're going to be using on the next page load, whereas preload is intended for resources that you're going to be using on the current page load. Okay.
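To make that distinction concrete, here's a minimal sketch of how these hints appear in a page. The helper function and the URLs are purely illustrative; in practice these are usually just static `<link>` tags written directly into the document `<head>`.

```javascript
// Build a <link> resource-hint tag as a string; `as` is only needed for preload.
function hintTag(rel, href, as) {
  return `<link rel="${rel}" href="${href}"${as ? ` as="${as}"` : ""}>`;
}

// Pre-resolve DNS for an origin we'll talk to later:
hintTag("dns-prefetch", "https://cdn.example.com");
// Warm up the full connection (DNS + TCP + TLS) to that origin:
hintTag("preconnect", "https://cdn.example.com");
// High-priority fetch of a resource needed on the *current* page:
hintTag("preload", "/js/app.js", "script");
// Low-priority fetch of a resource likely needed on the *next* page:
hintTag("prefetch", "/next-page.html");
```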
So I was looking at the HTTP Archive to see how these things are being used, and most people are using the less intensive hints, such as preconnect. Very few people are using prerender, probably because it's deprecated, but I noticed that Google Fonts is actually using preconnect a lot. Can you explain how they're using it and why? Yeah, so one thing Google Fonts does is it's got a CSS file that you include when you're adding those web fonts to your pages, and that CSS in turn references the actual web fonts, the WOFF files, for example. And those are hosted on two different CDN origins. When they tell you to preconnect, you're effectively preconnecting to those two different domains to shave off as much of that connection and warm-up time as possible. That can have a benefit to your page load times, especially if web fonts are pretty critical to the experience. If I prefetch something, is it guaranteed to actually load? It will load eventually, presumably. Part of the reason they're called resource hints is that you're ultimately just hinting to the browser, expressing a preference for when something is loaded. But the browser is going to use its own heuristics to decide the order in which it fetches resources. Okay. So if there's a lot going on in the page and the network is congested, it might not load during that time. The browser will wait until things are quiet. Yeah, and that's why you have to be careful and judicious about how you use these resource hints, because if you say everything's important, nothing's important. So you really want to prioritize the most important resources. Do developers have a way to prioritize what should get loaded in what order? There are upcoming specs that will help with things like that.
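As a concrete example of the Google Fonts case: the page preconnects to both origins, and because the font files are fetched in anonymous CORS mode, the second connection needs the crossorigin attribute. A sketch that builds those two tags as strings:

```javascript
// The pair of preconnect hints a page using Google Fonts would add to
// <head>: fonts.googleapis.com serves the CSS, fonts.gstatic.com serves
// the WOFF/WOFF2 files, which are CORS requests and so need crossorigin.
function googleFontsPreconnects() {
  return [
    '<link rel="preconnect" href="https://fonts.googleapis.com">',
    '<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>',
  ];
}
```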
Things like priority hints, an emerging standards proposal where you as an author can say the importance of this resource is low, or the importance of this resource is high. It's still in an experimental phase in Chrome at the moment, but we're thinking about this. Okay. So that's fetching. Now let's move into the predictive part. What is predictive about it? So when developers are trying to decide what to prefetch or preload, or what to use any of these resource hints for, a lot of the time they don't have much data to base those decisions on. What we observe sites doing is, they'll take a look at their home page and some of the pages they consider popular, and they'll just drop in a few preloads or prefetches because they know that a user going to a category page for groceries is likely to go check out the vegetables page, or something like that. Wouldn't it be neat if we used data to drive some of our web performance decisions a little bit more? So one of the things people can actually do is take a look at their analytics to help with deciding these things. The Google Analytics Reporting API gives you quite a lot of rich data about the different behavioral patterns your users have. It can give you insight into things like: what pages are users coming into? What is the next page they're probably going to exit to? Using that kind of information, you can start to build up a very simple understanding of those types of patterns so you can apply it to the rest of your site. So anybody watching this who's using Google Analytics can take advantage of this today. You've actually got a simple demo they can check out: just select your GA account and take a look at a table with some probabilities. And the future for this is to start pulling in more data so you can make richer predictions.
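The kind of probability table that demo shows can be derived from simple page-to-page navigation counts. A sketch, with made-up data (the exact shape of the analytics records is an assumption):

```javascript
// Estimate P(next page | current page) from (from, to) navigation pairs,
// the kind of data the Google Analytics Reporting API can provide.
function nextPageProbabilities(navigations, currentPage) {
  const counts = new Map();
  let total = 0;
  for (const { from, to } of navigations) {
    if (from !== currentPage) continue;
    counts.set(to, (counts.get(to) || 0) + 1);
    total += 1;
  }
  const probabilities = {};
  for (const [page, count] of counts) probabilities[page] = count / total;
  return probabilities;
}

const navigations = [
  { from: "/groceries", to: "/vegetables" },
  { from: "/groceries", to: "/vegetables" },
  { from: "/groceries", to: "/fruit" },
  { from: "/fruit", to: "/checkout" },
];
// From /groceries: /vegetables has probability 2/3, /fruit has 1/3.
nextPageProbabilities(navigations, "/groceries");
```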
Currently, if you pull down your data from Google Analytics and want to make some predictions, you'll just be looking at: what page do people visit most frequently from this current page? You can think of that as a very, very simple Markov chain, one that only looks at the current page. But if you want to improve the accuracy, you could start incorporating other signals that are available to us. For instance, you can use different browser APIs to detect the locale or the platform the user is on, and see if you can incorporate that into your models to improve the prediction accuracy. Wow. So this is really cool. This is using machine learning to optimize the web performance not just of a page but of the whole site, because a user journey goes through multiple pages, and if you're able to predict where they're going, or for sites that have a well-lit path for users to follow from one page to another, you can kind of clear that path for them and make sure they get there more quickly, right? One of the great things about applying machine learning to web performance is that as the data improves over time, the accuracy of your models is also going to improve. So we're not just looking at, say, five users on your site doing this and that. We're actually looking at the entire corpus of what your users are doing and trying to learn what patterns we see in there. On that note, something we've been trying to do is connect up these two worlds, prefetching and the data that's available to you, and add a little bit of machine learning to help with the prefetching space. So how can developers start using this? Is this open source code now? It is. We have two different paths for you to get started.
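Incorporating an extra signal like platform just means keying the counts on more context. A sketch under that assumption (the field names and the example data are illustrative):

```javascript
// First-order Markov prediction conditioned on (currentPage, platform).
// Segmenting by a coarse signal like platform can sharpen predictions
// when, say, mobile and desktop users navigate differently.
function predictNextPage(navigations, currentPage, platform) {
  const counts = new Map();
  for (const nav of navigations) {
    if (nav.from !== currentPage || nav.platform !== platform) continue;
    counts.set(nav.to, (counts.get(nav.to) || 0) + 1);
  }
  let best = null;
  let bestCount = 0;
  for (const [page, count] of counts) {
    if (count > bestCount) {
      best = page;
      bestCount = count;
    }
  }
  return best; // null when we have no data for this context
}

const navs = [
  { from: "/home", to: "/app-download", platform: "Android" },
  { from: "/home", to: "/app-download", platform: "Android" },
  { from: "/home", to: "/pricing", platform: "Linux" },
];
predictNextPage(navs, "/home", "Android"); // → "/app-download"
predictNextPage(navs, "/home", "Linux");   // → "/pricing"
```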
One is adding a script to your website that will ping a server you set up; the server will send down whatever page it thinks the user is going to go to next, and then the script will append a tag to the page that prefetches that page. And that's great because you don't have to maintain those prefetch tags at all, so you're always making accurate and smart prefetches. It's not something like, oh, I added it to the page and then we redesigned our website and now it's out of date. So that's great. The other path you can try is to use webpack. We have a plugin that allows you to start bundling based on the paths your users are following, so you're making intelligent decisions about how you bundle and ship your code. Okay. So we're calling that data-driven bundling, and the idea is that we can use the information that comes back from analytics, modeled using ML, to decide what to prefetch, but we can also factor it in using webpack to actually chunk up the code in a more efficient way. This entire effort is something we're calling Guess.js. It's open source and available for people to go and check out today if they're interested. We've got some demos and everything. That's amazing. So I worked at YouTube for a few years, and they used to do something, or they still do, where when you're in a playlist, it will start rendering the next video in the playlist as you reach the end. So this is like taking it to the next level, where it's general purpose for any website: they can start preloading the content of the next page, and by doing it at build time as well, you're ensuring that you're only sending down the content the user is actually going to need. Yeah, I think what we would love to see come out of this effort is just more interest and excitement about the idea of exploring how data can be better used to drive our web performance optimizations.
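The first path, the script plus prediction server, can be sketched roughly like this. The `/predict` endpoint and its response shape are hypothetical; the actual Guess.js setup may differ:

```javascript
// Build a prefetch hint for a predicted next page.
function prefetchTag(url) {
  return `<link rel="prefetch" href="${url}">`;
}

// On page load, ask the server which page it predicts the user will
// visit next, then append a prefetch hint for that page to <head>.
async function prefetchPredictedPage() {
  const res = await fetch(`/predict?from=${encodeURIComponent(location.pathname)}`);
  const { nextUrl } = await res.json();
  if (nextUrl) {
    document.head.insertAdjacentHTML("beforeend", prefetchTag(nextUrl));
  }
}
```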
Is machine learning something that can actually help ensure that when we're making these optimizations, we're doing it in the right cases and factoring in the right types of resources, the ones that are actually going to be useful and valuable to end users? Touching on something Katie mentioned earlier, one big problem that arises when you're thinking about prefetching is how to avoid over-fetching things. One of the ways we've tried to factor that into Guess.js is by not prefetching as aggressively if we think you're on a connection that's effectively 2G or 3G, something on the slower, lower end. If you are on a decent connection, we can go all out and prefetch quite a lot more. But I think it's important to be considerate about these types of optimizations. And thanks to prior work by Ilya Grigorik, we've got things like the navigator.connection.effectiveType API, which enables this to be possible. So for users on 2G, they might not see a lot of this predictive fetching? They actually won't see any on Chrome, because Chrome will not do prefetching if you're on 2G; the consequences of making a wrong prediction are so high that we'll only preconnect. And this goes for any resource hint on 2G. Web developers can look at their analytics to understand how much of their user base is on 2G, and that'll give them an idea of how much runway they can get out of this. I'm curious about the page load improvements you've seen so far in your testing. How fast could this get? So it's still early days for these efforts, and I do have to give a nod to folks like Mark Edmondson, who has been exploring the machine learning and predictive fetching space for a number of years. The wins we've seen so far are anywhere up to a 40% improvement in page load times.
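That connection-aware behavior can be sketched as a simple gate on the prediction. The probability thresholds below are illustrative, not the ones Guess.js actually ships:

```javascript
// Decide whether to prefetch a predicted page, given how confident the
// model is and what navigator.connection.effectiveType reports.
function shouldPrefetch(probability, effectiveType) {
  switch (effectiveType) {
    case "slow-2g":
    case "2g":
      return false; // a wrong prediction is too costly on slow networks
    case "3g":
      return probability >= 0.5; // only act on strong predictions
    default:
      return probability >= 0.25; // "4g" or unknown: prefetch more liberally
  }
}

// In the browser (the API isn't available everywhere, hence the guard):
// shouldPrefetch(0.6, navigator.connection && navigator.connection.effectiveType);
```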
I do have to stress that these are all demo applications, not used heavily in production just yet, but we are working with a number of sites to explore what real wins they see at scale. It appears there are some interesting opportunities to improve page load performance there. When things go right, is the experience instantaneous? Like, the user will click a link and the page will just appear? It can be, yeah. That's great. So that's really what we're after, right? Instantaneous page loading, no latency at all. Yeah, ideally. It's the holy grail. Yeah, it's the holy grail. It'd be great if we could use all of this data and all of these techniques so that while you're reading the content of a page, we're already prefetching the things we think you have a high chance of clicking on next, so that they load instantly because they're already in the user's local cache. And for the rest of the experience, as they click through from those pages and get deeper and deeper into the site, those pages also load instantly. So there's some interesting work ahead of us, but... It really is a great user experience when something loads instantly. It's very cool, and it really makes the content much more engaging. For a page that loads quickly, it's not just, yay, the page is fast; the user is actually going to engage more. You can definitely tell the difference. It's very cool. So I have to ask: what can developers do to ensure they're preserving user privacy? You have this machine learning model that says if you're on page A, you're probably going to go to page B. That tells the user something about how other users are behaving on the website, right? I think there are many ways you can implement these techniques where it's primarily done on the server side, so you're exposing less of those behavioral patterns to the client. You can also look at ways of generalizing that information so the model isn't quite as granular.
And it certainly wouldn't expose things like fingerprinting, for example, because it's modeled on the entire corpus of the user base and not on individual users. But it does require some additional care when you're implementing these techniques in general. For instance, be careful about the sample sizes you use. If only five people have visited a page, maybe you shouldn't be prefetching based on that, because with such a small sample size you could almost be exposing individual information. That's also something you probably shouldn't be drawing conclusions from anyway, because five people isn't enough to make generalized assumptions about how people use your site. Is there anything browsers or Chrome could do to support some of these techniques natively? For a very long time, Chrome has explored these ideas of natively and intelligently trying to prefetch pages that a user is likely to need. Those explorations are still ongoing because of some of the things we talked about today, like how we can ensure we're not over-fetching on certain connection types and certain device classes. But to the extent that browsers can explore this stuff, I'd love for us to try getting it done in a slightly more automated way. Until we're there, there are still lots of web platform APIs and techniques folks can use to get this done themselves. In the future, how else do you see machine learning being applied in web development? Besides just web performance, there are so many more applications of it. I think there's a lot people can do to just improve the user experience. For instance, I know recently I was going to install some software, and the site detected that I was running Ubuntu and directed me straight to the Ubuntu instructions. That's cool, and it's actually not that hard to implement. It's just a line of code to detect what platform a user is on.
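A minimum-sample-size guard like the one Katie describes is straightforward to layer on top of the probability table. The threshold of 50 sessions and the record shape are arbitrary illustrations:

```javascript
// Drop candidate prefetch targets whose predictions rest on too few
// observed sessions, both for accuracy and to avoid leaking patterns
// derived from a handful of identifiable users.
function usablePredictions(candidates, minSessions = 50) {
  return candidates.filter((c) => c.sessions >= minSessions);
}

usablePredictions([
  { to: "/vegetables", probability: 0.66, sessions: 1200 },
  { to: "/obscure-page", probability: 0.9, sessions: 5 }, // too few: dropped
]);
// → only the /vegetables entry survives
```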
I think there are a lot of things like that people could implement right away. Customization of the user experience: imagine if you came to a shopping site and, based on your previous behavioral patterns, it was able to reconstruct the UI in some way. That could drive further engagement, get you to see some things you'd like a little bit more, maybe drive a few more purchases. That's another way you could apply these techniques. Addy and Katie, thank you so much for being on the show. This was fascinating. Thanks for having us. If you'd like to find out more about Guess.js, we have links in the description. Check it out and let us know what you think about machine learning on the web. Thanks for watching. We'll see you next time.