Today, we want to talk about optimizing the critical rendering path. This is some of the work that we've been doing and experimenting with at Google. My name is Ilya Grigorik. I work on the Make the Web Fast team, and I also have Dave here with me today. We're going to talk about some of the criteria that we've been experimenting with and thinking about, and also some tools that we've built along the way to hopefully help deliver fast mobile sites, or instant mobile sites. So specifically, this is the frame of the problem and how we're going to tackle it today. First of all, we're going to talk about latency and bandwidth. We're concerned about mobile, and as you heard in the keynote today, mobile is slow: mobile networks and wireless networks have their own constraints, and we need to optimize for those. So we'll touch on that for a bit, and then talk about what the critical rendering path is on mobile, such that we can get the page visible as quickly as possible. Now, I shouldn't need to belabor this here, because I'm preaching to the choir, but we know that performance matters. We know that 85% of mobile users expect sites on mobile to load just as fast as on desktop. And sometimes they expect it to be even faster: you're on the go, you want the answer quickly. That's the expectation. The problem is that when you actually ask them, more than half of these users say, hey, the mobile site didn't work as well as I expected. And the number one problem identified is usually plain and simple: slow loading time of the page. Then almost half of that audience says, well, I just went somewhere else and I didn't return.
So that's the example we heard in the keynote this morning, where you're clicking away to Amazon.com and the users are just not coming back. This is the problem we're trying to address: how do we deliver the same experience or better on mobile and get the pages visible quickly, such that you can deliver a great experience to your users? So speed is a feature. We all know that here, but it's something that needs to be repeated nonetheless. Despite the fact that our lives seem to be moving at a faster and faster pace, there are some pretty good constants. There have been a number of studies done in the early 90s, repeated in the early 2000s, and these numbers still hold today: there are certain constants for how quickly we need to respond to the user for an application, a web page, or any feedback system to feel responsive. Below 100 milliseconds, it feels instant: you click a button, you get feedback immediately. Somewhere around the 300 millisecond barrier, you start to feel like the button is sticky: you click it and something feels wrong. And after one second, you lose task focus. You've been focused on the task, say, sending an email, and a second later nothing happens, and your mind wanders: oh yeah, I owe an email to Bob, and I've got to talk to Sherry, et cetera. Before you know it, you've lost the user. So really, our goal is to break what we call the 1,000 millisecond barrier, the one second barrier, which is to say we want to react to the user in less than one second. Specifically, this is our design constraint for everything that I'm going to talk about here, and we're going to work bottom-up through what it takes to make this happen on a mobile network.
So first, the networking constraints, because those are among the fundamental limitations we have to work with. This is a scary looking diagram of how a 4G network works. Stay with me for a second: you don't need to be an expert in this, but I want to illustrate what's actually happening when we pull out our phone, type in a search query, and hit send. The phone has been idle, I turn it on, I hit send. The first thing that happens is that the phone talks to the tower and says, hey, I would like to send a packet. The tower needs to negotiate with your phone when you're allowed to send that packet. It relays that information back to you, at which point you say, great, I'll wait for my assignment, and then you send the data to the tower. The tower sends it to the serving gateway within the core network, which sends it to the packet gateway within the core network of the carrier, and then we hit the external network. So there are at least three or four hops here before we even hit the external network, and that's not counting the extra time to get routed to your server. If you look at some of the numbers that, for example, AT&T shares for their core network infrastructure, you can see that the numbers are getting better, especially with LTE: we have about 50 milliseconds. That's the latency just for the core network, between your phone and the external packet gateway. So that's 50 milliseconds, and that's very optimistic; that's the latest 4G technology. More likely, your user will be on HSPA+ or HSPA, at which point you're looking at a couple of hundred milliseconds just for a packet to get from your phone to the edge of the mobile carrier network.
And then after that, you have to route it to your destination, whether that's your origin server, your CDN, et cetera. So latency matters. And that was just for a single packet. When we fetch a web page, even a simple HTML file actually takes more than one roundtrip. First we have the DNS lookup; maybe it's cached, in which case, great, you can skip that. Then we need to do the TCP connection handshake, the SYN and SYN-ACK. Then we send the HTTP request. And then we actually need to download the content, which probably doesn't fit in just one roundtrip either. We haven't even looked at HTTPS, which would require another couple of roundtrips to negotiate the secure tunnel. You do the math, and you're looking at four to five roundtrips just to get the HTML file. And of course, our pages today are not just HTML files. We also happen to have about 80 other resources: JavaScript, CSS, images, and all of the other assets. So let's do some simple math. We have our budget of 1,000 milliseconds, and we need to get our content visible on the screen. How does this work? Well, we can look at 3G and 4G, picking an average for each. For 3G, let's assume a 200 millisecond RTT, which is actually fairly optimistic; this is a good, upgraded 3.5G network. And for 4G, we'll go with 100 milliseconds, which is the average that you get today. Now, before anything even happens, remember the first step in the diagram, where you had to negotiate with the tower when you want to send a packet? That is known as the control plane latency. On 3G, this can take up to two and a half seconds. Two and a half seconds before the packet even gets out of your phone to the closest tower. With 4G, that time is much, much less.
It's about 50 to 100 milliseconds. But nonetheless. Then we have DNS, TCP, optionally TLS, and the HTTP request itself. Before you know it, once you account for all of that, on 3G we're left with essentially zero milliseconds: we've spent our entire one second budget just on the latency between your phone and the core network. In the 4G case, great, let's split it down the middle and say we've got half a second left. So immediately, right off the bat, we can write off half of the first second on pure network latency. And there's another problem here. We've opened our TCP connection, but there's this other great feature in TCP known as TCP slow start. And despite its name, it's actually a feature, not a bug. The way this works is that we don't want to overwhelm the network, so we start slow, hence the name: we send just a couple of packets at the beginning of the connection, and once your client successfully acknowledges those packets, we double the amount of data that we send. So we start with, let's say, 4 kilobytes or 10 kilobytes, and then we double that to 20, then to 40, and so on. In that first round trip, we can only send a limited amount of data. That is the key takeaway here. Add this all up and you get a picture like this. Let's say we want to fetch a 40 kilobyte HTML file on a 4G network. We have five megabits of bandwidth, which is great, everything's awesome. We have about a 200 millisecond round trip time, and we're going to spend about 100 milliseconds processing the request on the server, generating dynamic HTML, or whatever that may be. If you go through the sequence diagram, first we need to establish the TCP connection. I'm skipping a couple of things here: there's no DNS, there's no TLS, we're going straight to TCP.
So: one round trip just to establish the TCP connection. Then we send the GET request for the actual file that we want, and the request processing happens. Then the server can send the data, and here I'm assuming that you're using the latest Linux kernel, which initiates the connection with 10 network segments; 10 network segments works out to be roughly 14 kilobytes of data. So the server can send 14 kilobytes of data, and then it has to pause. We want to fetch a 40 kilobyte file, but we can only send 14. We send that to the client, the client ACKs those packets, and only then can we dispatch the rest of the data. So just to fetch 40 kilobytes, we need three round trips, which takes 700 milliseconds. Add to that, once again, DNS and other things, and before you know it, we've blown our budget of one second. The takeaway here is that we have to be very, very careful about accounting for round trips on mobile networks. On 3G, this is a very hard problem. On 4G, we start to get a little more leeway because we're looking at 100 millisecond latencies, but nonetheless. So, a quick summary of what we've seen so far. We have our 1,000 millisecond budget; write off 50% of it immediately as network latency, and that's the optimistic case. You have to account for TCP slow start; that's how TCP works. And our best case scenario for getting something visible on the screen is, of course, to do a one round trip render. What that means is that we can fetch up to 14 kilobytes of data, and with that 14 kilobytes we have to be able to render something useful to the client. That is our ultimate goal, which is of course a very tricky problem, because our pages are not 14 kilobytes. They are about 10x that nowadays. And just a side note: there is no place for redirects in this picture.
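The round-trip arithmetic for that 40 kilobyte fetch can be sketched in a few lines of JavaScript. This is a simplification under the talk's assumptions: an initial congestion window of 10 segments (roughly 14 KB) that doubles on each acknowledged round trip; real congestion control has more moving parts.

```javascript
// How many round trips does it take to deliver `bytes` over a fresh
// TCP connection? One trip for the handshake, then slow start: each
// round trip delivers one congestion window, and the window doubles.
function roundTripsToFetch(bytes, initialWindowBytes = 14 * 1024) {
  let trips = 1; // the TCP handshake costs one round trip up front
  let windowBytes = initialWindowBytes;
  let delivered = 0;
  while (delivered < bytes) {
    delivered += windowBytes; // one round trip delivers one window
    windowBytes *= 2;         // slow start doubles the window
    trips += 1;
  }
  return trips;
}

// The 40 KB example: 3 round trips at 200 ms, plus 100 ms server time.
const totalMs = roundTripsToFetch(40 * 1024) * 200 + 100; // 700 ms
```

Notice that a file that fits in the initial window (14 KB or less) needs only two round trips total, which is exactly why the one-round-trip render budget below is 14 kilobytes.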
The optimal number of redirects on mobile is exactly zero, specifically because it's so expensive to do a round trip just to be told: oh, you've asked me for the index file, but you actually want the m-dot domain for this resource. That is the worst case scenario, because we also have to do a new DNS lookup and a new TCP connection, and repeat this entire cycle. So m-dot redirects are definitely an anti-pattern. So that's the networking part. Now let's look at the critical rendering path in the browser. We have a little bit of data that we can work with, 14 kilobytes, so I'm going to give you a very simple example. Here's a valid HTML5 doc: just a hello world and a link tag, so a style sheet and some HTML. That's about as simple as it gets. The external style sheet contains just two rules. Very, very simple, and in theory we should be able to render this in no time flat. Well, let's see how this works. First, we make the TCP connection and we start fetching data. And let's say, for the sake of example, that the first part of the page that gets shipped is just the top of the document, the first three lines here. You can imagine scaling this up to a bigger file with 300 lines where you can only ship the first 150. The thing about HTML parsing is that it can be done incrementally. The browser looks at the incoming byte stream, tokenizes it, and looks for things like: hey, is there an image tag, a link tag, or another resource that I can start fetching immediately? So far we haven't seen anything like that, but we've already fetched some HTML and we've started constructing the DOM. The screen, of course, is blank. There's nothing to show yet.
Next we get the remainder of the document, and at this point we discover that, hey, you're requesting styles.css, which is an external resource, which means I need to dispatch another HTTP request to fetch this file. Now we have constructed our entire DOM, but we can't actually paint anything to the screen, because your CSS is what determines how the page gets laid out. Otherwise we would get a very nasty page; you've all seen that experience where the CSS fails to fetch and you get a screen of unstyled text that is completely meaningless. So we hold the rendering until we have all of the CSS. And the thing about CSS is that, unlike HTML, we can't do partial evaluation of it. By definition, we have to have the entire file in order to evaluate it. So if we've only fetched the CSS partially, we can't do much; we're still blocked. Then finally we get the CSS and we construct the CSS object model. We take those two things, the document object model and the CSS object model, and put them together into the render tree. The render tree determines things like: hey, should I be showing, for example, the "world" part of the "Hello World"? If you pay attention to the rule here, I'm saying the span should be display: none. So when we create the render tree, we will actually hide that span and we won't even paint those pixels. We get the render tree, we perform the layout, we calculate all the widths and heights of the boxes, we paint the actual pixels on the CPU, we transfer them to the GPU, and finally, finally, we have our page visible. This is the complex machinery required to render something as simple as a Hello World text string in your browser. So, a couple of important things here. HTML is parsed incrementally. What does that mean?
It means that if you can flush your HTML early, that is definitely a good thing. You shouldn't wait to render the entire page and then flush it. Instead, if you can feed the browser partial content, you can say: here's my header, and I'm still working on the body, but that's okay, because the browser can look in the header, discover the resources, and start fetching them early. A good example of this is a trick that we use on Google search: our search header is the same across all the different pages. We have dynamic components, but those are filled in by JavaScript. So the first thing we do when we get a search request on our service is, before we even look at the query, we flush the header immediately back to you and say: here are the packets, start interpreting them; if you don't have the CSS, go fetch it, and all the rest. Only after that do we dispatch the query to our search index, get the results, construct the body, and flush the rest of the content to you. This is an optimization technique that you can certainly use in your own applications as well. So that's HTML: flush it early and flush it often. The other important thing is that we all know CSS is critical, but CSS is really critical on mobile, because rendering is blocked: we can't paint anything to the screen until we have all of the CSS. That means you need to get the CSS down to the user as quickly as possible. Nothing should stand in the way of the CSS. Of course, that was a very simple example, and we forgot something very important, which is our friend JavaScript. We construct a lot of our pages using JavaScript today, and JavaScript is used as an enhancement to provide additional functionality in our pages.
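The early-flush trick just described can be sketched in a few lines. This is a minimal sketch with hypothetical helpers: writeChunk stands in for something like Node's res.write(), and lookupResults for the slow search or database call. The whole point is the ordering: the static header bytes go out before the query is even touched.

```javascript
// The static header, identical across pages, so it can be flushed
// before any per-request work happens. The stylesheet link means the
// browser can start fetching CSS while the server is still thinking.
const HEADER =
  '<!doctype html><html><head>' +
  '<link rel="stylesheet" href="app.css">' +
  '</head><body><header>logo, nav, search box</header>';

function renderPage(writeChunk, lookupResults, query) {
  writeChunk(HEADER); // flushed immediately, before we look at the query
  const results = lookupResults(query); // the expensive part
  writeChunk('<main>' + results + '</main></body></html>');
}
```

With a real server you would flush after the first write; the browser parses the header chunk incrementally and dispatches the CSS request while the body is still being generated.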
And in some cases, JavaScript is actually used to construct the entire page: we just ship a bunch of JavaScript code, which then constructs all the DOM elements on the page. The thing about JavaScript is that it can interact with both the document object model and the CSS object model. In your JavaScript code, you can say: find me this node in the DOM tree and get me its contents. Or you can query the CSS: what is the style of this object that I just fetched? That is very powerful, but it creates an annoying dependency tree. JavaScript can query both the document object model and the CSS object model, which means that if I have JavaScript code that reaches into my DOM, fetches an object, and asks for its style, now we're blocked: if the CSS is still being fetched, we need to wait for it to finish and be evaluated before we can return the answer. So JavaScript can block on CSS, and JavaScript can also block DOM construction. The important thing here is that you effectively want to eliminate all blocking JavaScript. You literally have no room for it if you want to break this one second barrier. For fast DOM construction, we can't afford an extra request to fetch that JavaScript library and do something else. A lot of the time, with things like jQuery and all the rest, what you can do is simply defer the loading of that library until after the first paint happens, because frequently we use it to add capabilities like on-click handlers and additional functionality that is not critical. JavaScript is usually not the thing that's creating or painting the actual page. That's not true of all pages, but it is true for most pages that we see today. So you want to eliminate it from your critical rendering path.
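One common way to do that deferral looks like this. This is a sketch, not the only pattern; "app.js" is a placeholder name, and the doc parameter exists only so the logic can be exercised without a real browser DOM. A script element injected after the load event can never block DOM construction or the first paint.

```javascript
// Inject a script element at runtime. Dynamically inserted scripts are
// loaded asynchronously, so they don't block the HTML parser.
function loadScriptDeferred(src, doc) {
  const script = doc.createElement('script');
  script.src = src;
  doc.body.appendChild(script);
  return script;
}

// In a browser, kick it off only after first paint / onload:
if (typeof window !== 'undefined') {
  window.addEventListener('load', () => loadScriptDeferred('app.js', document));
}
```

The page renders, the user can start interacting, and only then does the non-critical library arrive to wire up its on-click handlers.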
So in short, what do we need to break the barrier? Well, one: because of the network latency, we basically need to be able to do the entire render in one round trip. And not only that, but we don't actually need to render the entire page. Some of our pages are very complex and very long. What we really care about is getting that above-the-fold experience, getting something useful to the user so they can actually start interacting with the page. I don't need to see all the images to start reading the text or interacting with your site; that can be filled in in the background. What we want to avoid is the experience of staring at a blank screen for five seconds and then having the entire thing just appear magically. Instead, we want to render something within, let's say, one second, and then progressively fill in the rest in the background. So: no redirects, and you obviously need to have a very fast server response time. As you saw, we don't have much time in our budget, and we also need to optimize the actual critical rendering path. If the budget we have left over is about 500 milliseconds, you can very quickly figure out that you can't really make additional external requests. If you have external JavaScript or CSS, you're going to go over that one second budget. So if you want to get the above-the-fold content painted, some of that content needs to be inlined into the initial 14 kilobyte payload. Let's look at a very simple example. This is your very simple mobile application: hey, I need a style sheet and I need a JavaScript file in my app. What we're saying here is, first of all, it doesn't really make sense to ship all the CSS. What we want is the above-the-fold, or critical, CSS that gets us the first paint; you can load the rest afterwards. And similarly for the JavaScript functions: can you defer them?
Because this script right here will block all rendering. The DOM construction cannot proceed until we fetch that application.js file and execute it; only then can we continue to read the body, construct it, and paint the rest of the page. So either this file needs to be moved all the way down to the bottom of the page, or it needs to be deferred until after the paint happens. So how does this look? This is effectively what you would see: we've extracted our above-the-fold CSS and inlined it into the actual page. Similarly, we're inlining the critical JavaScript, if there is any. In practice, we're finding that a lot of the time we can simply defer all of the JavaScript, and that doesn't even affect the page functionally. And then at the bottom, we have some additional code to load the remaining JavaScript and CSS after the first paint happens. So the page is visible, the user can interact with it, and we've met our goal. So, long story short, we have a very small window of data that we can send: about 14 kilobytes, as I mentioned, on the latest Linux kernels. By the way, if you haven't upgraded your Linux kernels and you're still running an older kernel, you definitely want to upgrade, because previously the initial congestion window on Linux was three segments. So you could only send about three kilobytes of data, then wait for a round trip, then send six, then 12, and so on and so forth. With the new settings, we can send about 14 kilobytes of data, because the limit has been increased from three segments to 10 network segments. So that's first. Second, you want to defer non-critical assets, and you definitely want to account for every single round trip.
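Putting those pieces together, the transformed page roughly takes this shape. This is a sketch: buildPage is a hypothetical helper, and the file names are placeholders. The critical rules are inlined in the head, and everything else is fetched by a tiny loader that only runs after the load event.

```javascript
// Assemble a page with inlined critical CSS and a post-paint loader
// for the deferred stylesheet and script.
function buildPage({ criticalCss, body, deferredCss, deferredJs }) {
  const loader =
    "<script>window.addEventListener('load', function () {" +
    "var l = document.createElement('link');" +
    "l.rel = 'stylesheet'; l.href = '" + deferredCss + "';" +
    "document.head.appendChild(l);" +
    "var s = document.createElement('script');" +
    "s.src = '" + deferredJs + "';" +
    "document.body.appendChild(s);" +
    '});</script>';
  return (
    '<!doctype html><html><head>' +
    '<style>' + criticalCss + '</style>' + // above-the-fold rules only
    '</head><body>' + body + loader + '</body></html>'
  );
}
```

Everything the browser needs for the first paint arrives in the initial payload; the deferred assets cost round trips only after the user already has a visible page.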
You should be able to look at the head of your document and account for every single round trip, and hopefully there are no extra round trips in there. Simple things like, hey, I'll put my CSS or JavaScript on a sharded domain: that is actually an anti-pattern here, because it involves an extra DNS lookup, a new TCP connection, and all the rest. You actually want the critical assets to be on the same domain, or ideally you want them inlined, with the rest loaded later. So at this point I'm going to hand off to Dave, and he's going to show you a tool that we've been working on to test some of these assumptions. Hi folks, my name is Dave Mankoff. I work on the PageSpeed Insights team at Google. A lot of what Ilya and other folks have been bringing up at this conference so far is that bandwidth and requests are no longer the problem by themselves. It's really about latency. It's about how long it takes to make a round trip to the server. Just alt-tab out of it, there you go. It's really about round trips to the server. And what that means is that a lot of the web development best practices we've had for the past five or ten years actually run contrary to getting the best performance you can on mobile. Things like putting external CSS in the head of your document, which we've all traditionally done, are, as Ilya mentioned, kind of an anti-pattern; it's something you really don't want to do anymore. So PageSpeed Insights, which has been one of the tools that recommended you put your external CSS in your head, has had to really reevaluate and rethink how we score sites and how we provide recommendations. What I want to demo for you today is a prototype of the new version of PageSpeed Insights that we've been working on. And after this talk, I'd like to invite you all to try it out at the Google booth. So Ilya just pulled it up. This is the new PageSpeed interface. It's all beautiful and nice. Very simple.
We're going to give it a try here. The Wi-Fi's been a little spotty, but we're going to see what happens if we type in our favorite procrastination site and press Enter. This is where we have a drum roll and hope, and maybe I have it preloaded over here. You're going to see that we're finally presenting both mobile and desktop at the same time. Because mobile is an increasing presence on the web, we're actually highlighting mobile first and scoring both at the same time. You're going to see our four primary rules, the ones Ilya pointed out before: avoid redirects, reduce the server response time, eliminate render blocking content, and prioritize visible content. Now the first, avoid landing page redirects, you can test yourself; you just go to the web page. Pretty simple, we can all figure out how to do that. Reduce server response time to under 200 milliseconds, ideally 100 milliseconds. The DevOps people should already be living and breathing this; if you're not thinking about server response time, there are a bunch of people in the exhibition hall who would like to give you their business card. The render blocking CSS, again, is something that's kind of new to me. If you'd asked me a year ago about blocking CSS, I would have given you a skeptical look. But again, we're finding it's really important. And so now the PageSpeed Insights tool highlights for you the JavaScript and the CSS that occurs high in the page and causes the page to stay blank until it's loaded. Finally, prioritize visible content. This is the idea that you want to get the first render of your page into 14 kilobytes. If you're not doing this, your page remains blank until another congestion window, or two, or three congestion windows have passed. Now in what you see here, one of them is red; that means it's really bad.
One of them is yellow; that means they probably got it in two round trips or something to that effect. But let's take a look at a few other sites. Favorite tabloid, TMZ. If you look at the top, you're going to see they're getting a red marker overall. That's because they have 16 blocking scripts and two blocking CSS files. Everything else, they're doing great. And on desktop, we're actually giving them a pretty good score. Now, PageSpeed Insights is actually changing its internal scoring model to use round trips. Traditionally, it's been a checklist of things focusing on reducing bandwidth and reducing the number of requests made. So, you know, turn on gzip, you save 100 kilobytes, great, that's a few points. Reduce the number of files by combining them together, great, a few more points. Now we're actually modeling internally the number of round trips it's going to take to download this content. And what we're discovering is that for some sites, that makes a huge difference on mobile, but maybe not such a big difference on desktop. So it's really allowing us to improve our scoring mechanism and let you know the severity of your problem. Just to show a few other sites: SFGate, not doing very well. They have the redirect problem that you see at many sites. If you go to sfgate.com, it goes to www.sfgate.com before going to the mobile site. And then the mobile site says, hey, you actually want to go to this index page over here. Just put it on sfgate.com. If you have to have one redirect to your mobile site, fine, but there's no reason to go through that www intermediary. Server response time, again, 0.65 seconds. It's not terrible; we'll give you a yellow on it, but it's not great either. You should really be aiming for that 100 to 200 millisecond server response time. Now, the bad news is that most sites right now don't pass these criteria. These suggestions are just suggestions. But the good news is it can be done.
We have one site here that actually gets this right out of the gate: Silverton Casino. They have their CSS inlined properly, and it renders quickly, nice and snappy. If you run it through WebPageTest, I think you'll find it's on the order of maybe less than a second, most likely. So that's PageSpeed Insights. This is an early prototype; it isn't available outside of this conference today. But we're going to be at the Google booth just after this. There are some schedules floating around the conference. We're really looking to solicit feedback: see what you think of this tool, see what you think of these suggestions. And I hope you like the tool. Awesome. Thanks, Dave. So, I'm sure some of you have already managed to copy down the URL. Unfortunately, it won't work, so you'll have to come out and talk to Dave at the Google booth. But we're really excited about this tool and some of the other work that we've been doing. So the next question is: great, this all sounds great. I should start counting round trips and thinking about the critical rendering path. How do I get there? What are the tools that can help me? How do I even find what my critical rendering path is? This is still definitely a place where we can make a lot of improvements. There are tools to be built; there is infrastructure to be built to make this whole process easier. But here are a few tips that may help you. For example, in Chrome DevTools you can go into Audits and run a web page performance audit. One of the things it will show you is the CSS that is being used, and more importantly, the CSS that's not being used. So here you're looking at this file, and I believe this is actually guardian.co.uk. You can see that about 54 kilobytes of CSS was downloaded, but the majority of it, 61%, is not being used by the current page. So we have best practices today, things like: concatenate all your files.
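The core of that audit boils down to a very small check, sketched here. matchesAny is a stand-in; in a browser you would pass something like sel => document.querySelector(sel) !== null to test each stylesheet selector against the live page.

```javascript
// What fraction of a stylesheet's selectors match nothing on the page?
function unusedCssRatio(selectors, matchesAny) {
  const unused = selectors.filter(sel => !matchesAny(sel));
  return unused.length / selectors.length;
}
```

Run something like this against your concatenated style.css on a typical page and you get numbers like the 61% above, which is the argument for splitting out a critical subset.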
So you take all of your CSS and you put it into a style.css, and then you find that your average page only uses 15% or 20% of that CSS. And of course, each 50 kilobytes, as you saw, will take three or four round trips to fetch. What you really want is to slim down that file, get the header and above-the-fold content visible, and then load the rest later. Maybe that's a base CSS file which you share across all of your pages, or maybe that's an inline snippet inside of your page, and then you load the other things later. So now that we've gotten all of you to concatenate all your files: yes, please undo that. Another thing that we've been experimenting with is: can we automate this? For deferring JavaScript, we've had modules and code that can do this automatically; we can inspect the page, and the server can do some of that work for you. So I'm going to try to show you a live demo, and this is where, once again, we pray to the Wi-Fi gods. Google has a service called PageSpeed Service, which is basically an optimizing proxy service. I have my site hosted on PageSpeed Service, and what we're looking at here is the actual HTML markup of this existing page. If you look at the header, I actually have two different style sheets right now. They're being rewritten and served through the Google CDN, but that's all we're doing at this point: just rewriting the original files from my origin server onto the Google CDN. If I go into my console, we have a whole bunch of flags for things like optimizing images and all the rest. We're looking at optimizing CSS, so we have a couple of flags: combine CSS, minify, and move CSS to the head. I can apply this to the site, and then if I go back and reload the page, you can see that I've reloaded the page, and what it has done is actually look at my HTML.
It detected that I'm using multiple style sheets, it combined them into one file, and now it's serving a dynamic file which is a combination of those two. And if I change any one of those files, it'll get rewritten; it'll just generate a new file for me. All of this happens without me having to modify my site, which is great. So that's step one. But we're talking about identifying critical CSS and how we inline it. This is all of my CSS for the entire page, and what I've just been talking about is identifying just the critical pieces. So we have this new filter which we've been experimenting with, which is inline critical CSS. Let's once again apply that to the live site. What's going to happen here is, I reload this page, and you can see that there's an inline style block all of a sudden within the head of my page. And this is not all of my CSS; this is a small fraction, about 20% of the actual CSS used by the page. And then at the bottom, we add additional JavaScript code. You can't read much of this, but somewhere in here there is an extra network call to fetch the rest of the CSS after the first paint happens. All of this is done by the tool, and the inlined CSS is what allows us to avoid the extra blocking resource, which makes the page render much, much faster on mobile. So how does this magic work? In the case of PageSpeed Service, let me flip back here: we actually have a headless renderer running on the server, which renders the page, looks at the content in the viewport, walks the DOM of that content, finds the critical CSS, and then we inline it. This is a pretty heavyweight process, and it happens in parallel to users requesting pages. This background processing is, of course, completely invisible to the user, and to me, but implementing it is non-trivial. Now, that's PageSpeed Service.
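To make the end state concrete, here is a minimal sketch of what a page transformed this way ends up looking like. The class names and the "rest.css" file name are hypothetical, not the demo site's actual markup:

```html
<head>
  <style>
    /* roughly 20% of the CSS: only the rules needed above the fold */
    .masthead { background: #333; color: #fff; }
    .hero { font-size: 2em; }
  </style>
</head>
<body>
  <!-- page content ... -->
  <script>
    /* injected at the bottom by the tool: after the first paint, an extra
       network call fetches the remaining CSS (e.g. "rest.css") */
  </script>
</body>
```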
You're welcome to try it; it's a product that we have in beta. But you can also use our open source tools. We actually have the prioritize critical CSS filter in both our Apache and Nginx modules, which are both open source, and they will do effectively the same thing. They use a slightly different mechanism: in your Apache or Nginx module, we don't run a headless renderer to figure out what the critical CSS is. We actually defer that work to the client, to your visitors. What happens is we add a little bit of JavaScript code to the page. That JavaScript code gathers the above-the-fold content: it walks the visible content and beacons that data back to the server. Then ngx_pagespeed or mod_pagespeed gathers that data, figures out what the actual critical CSS is, and using that, it inlines the critical CSS into your page. So we have this feedback mechanism: we let the clients render the page, we gather the data, and then we inline it. This is an experimental filter; you're definitely welcome to try it, and it's available in both our Apache and Nginx modules today. And I should also mention, this is for CSS; we also have filters for deferring JavaScript, and for images as well, as Josh is mentioning here. So how does this actually look? This is an example taken from modpagespeed.com. If you go to modpagespeed.com, we list all the filters, and you can find the prioritize critical CSS example in there. What happens here is we have an original page, the modpagespeed.com critical CSS example, which is an HTML file that fetches a PNG file and a bunch of CSS files. This is the typical way you would have constructed the page originally. What we're doing here is, when we enable, sorry, I'm looking at the wrong one, so CSS right here. So we're fetching all the CSS files.
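For reference, enabling these in the open source modules is a one-line configuration change. Roughly (filter names as documented for mod_pagespeed and ngx_pagespeed; check the docs for your version, since these filters were experimental at the time):

```
# Apache (mod_pagespeed):
ModPagespeedEnableFilters prioritize_critical_css
ModPagespeedEnableFilters defer_javascript

# Nginx (ngx_pagespeed):
pagespeed EnableFilters prioritize_critical_css;
pagespeed EnableFilters defer_javascript;
```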
You can see that this green line, which marks the time when the first render happens, is about four seconds in; this is on a mobile connection. The reason for that is we have to fetch the HTML to discover the URLs for the CSS, then we have to go fetch the CSS, and we're blocked on all of the CSS. If one of those files happens to be slow for whatever reason, we have to wait for all of it to complete. Finally, about four seconds in, we have all of the CSS and we can paint something to the page. If you enable prioritized critical CSS, with the critical rules inlined, then shortly after we have the HTML we can construct the DOM and the CSSOM, and we can paint. Then we load the remaining CSS afterwards. The difference here is huge: it's the difference between 1.3 seconds and 4.1 seconds. We're still over budget; we haven't beaten our one second target. But I hope you agree that 1.3 seconds versus a four second first paint time is a huge, huge difference. And this is a visual comparison from WebPageTest, which shows the visual progress of that same page. As you can see, we didn't actually change how the page gets painted incrementally; we just shifted the entire line about three seconds to the left, which is exactly what you want. So with that, just to summarize some of the points: you have one second. 50% of that or more is going to be your network latency. You have to count the round trips; you really have to start thinking about optimizing pages with respect to round trips. And don't forget TCP slow start: you have to be very careful about what you put in those first 10 or 15 kilobytes of your data. The best case scenario is, of course, a one round trip render. Sometimes you can't do that; that may mean you need to incur several round trips, as Dave mentioned. That's OK. But hopefully, once you start thinking about your render time in terms of round trips, you can optimize for that.
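The slow start arithmetic behind that 10-to-15-kilobyte figure can be sketched as follows. This is a back-of-the-envelope model, assuming an initial congestion window of 10 segments (IW10, per RFC 6928) and a typical ~1460-byte segment, with the window doubling every round trip:

```javascript
// Cumulative bytes TCP slow start can deliver after each round trip.
// Assumptions: initial window of 10 segments, ~1460-byte segments,
// window doubling each RTT (no loss, no receive-window limit).
function bytesPerRoundTrip(rtts, initSegments = 10, mss = 1460) {
  let cwnd = initSegments; // congestion window, in segments
  let total = 0;
  const cumulative = [];
  for (let i = 0; i < rtts; i++) {
    total += cwnd * mss;   // bytes delivered during this round trip
    cumulative.push(total);
    cwnd *= 2;             // slow start doubles the window every RTT
  }
  return cumulative;
}
```

Under these assumptions, `bytesPerRoundTrip(3)` gives about 14.6 KB after the first round trip, 43.8 KB after two, and 102 KB after three, which is why the first ~14 KB of your response is so valuable.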
You can then critically examine all the resources on your page and ask: do I need this? Should this be on the critical rendering path? Does this analytics beacon need to be on the critical rendering path? The answer is almost always no, it shouldn't be. You can inline CSS and JavaScript; we saw some tools for that. And you should defer all the non-critical assets. So with that, we'll take some questions. If you want, the slides are available online, and as Dave mentioned, we'll be demoing the tool; you're welcome to drop by the Google booth to ask us questions, or just to run your site through it. Thank you. Question in the back? Sure. Yeah, so, Dave? Just so you know, I can tell you how the PageSpeed tool does it. We asked ourselves the same question: how do we differentiate between just any paint and a real-looking paint? The PageSpeed tool actually renders the page on Google servers and compares screenshots. It asks: what does the screenshot look like at the first congestion window? What does it look like at the second congestion window? What does it look like after onload? We don't require 100% accuracy there, but we basically do an image diff, and we ask: does the first paint look a lot like what the page looks like after it has finished loading? Maybe a little more generally: you can definitely game your first paint time. You could paint a blank screen; that's not very useful, but you'd get a very nice score. So in WebPageTest, we have the Speed Index, which looks at how visually complete the page is in terms of actual pixels. When we go back to this graph here, it's not saying just "the first paint happened here." What it actually says is that over 65% of the pixels of the final page have been painted by 1.2 seconds in.
So the question is: how do we determine what's actually above the fold, given that there's a variety of different screen sizes and all the rest? The answer is that there is no one definitive above-the-fold, because there are tablets, there are different screen sizes, and so on. But what we do, for example, on PageSpeed Service, when we do the headless render, is pick a very large above-the-fold definition: we take a big desktop monitor and use that as our setting. For something like mod_pagespeed, when we beacon the data back, we gather data from a variety of devices and pick based on a sensible tradeoff between all of those. So, that is a good question. I don't think this is actually related to mod_pagespeed; it's much more a question of why Chrome decided to defer the PNG download in this case. Oh, so you're referring to this one here. Do you want to take it? The reasoning there is that there's no API on the web right now for knowing when the paint has actually occurred; that's not exposed by the browser. And similarly, if that CSS gets injected into the page before the paint has occurred, even if the browser could paint, it'll wait: oh, there's more CSS. So if you took that CSS and, say, put it in the footer, and the parser got to the footer before the browser tried to paint, it would say, oh, I found some CSS, I'll wait a little longer, essentially. Another thing I think we should mention is that today there is no good utility or API in the browser for saying: load this CSS file asynchronously and don't block rendering. Today, any time you add a CSS file, the browser says: OK, stop the world, this may change how the page looks, I'm going to block. So we're still trying to figure out what the right API is. We're using some tricks to get around the problem today, basically deferring everything until requestAnimationFrame, which the first time it fires is well after the first paint.
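That requestAnimationFrame trick can be sketched roughly like this. This is a minimal illustration, not the actual mod_pagespeed code, and the "rest.css" file name is hypothetical:

```javascript
// Minimal sketch: load non-critical CSS without blocking the first paint.
function loadDeferredCSS(doc, href) {
  // A <link> injected from script doesn't block the parser the way a
  // <link> in the original markup does.
  const link = doc.createElement('link');
  link.rel = 'stylesheet';
  link.href = href;
  doc.head.appendChild(link);
  return link;
}

// In the browser: requestAnimationFrame first fires well after the first
// paint, so scheduling the fetch there keeps it off the critical path.
// requestAnimationFrame(() => loadDeferredCSS(document, 'rest.css'));
```

The `doc` parameter is only there so the helper is easy to exercise outside a browser; in a page you would simply pass `document`.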
But I think this is an area where we can improve the waterfall in the future by getting better APIs from the browser.