Today, I want to talk to you about what it takes to build a super fast mobile web experience. And to do that, well, it's going to require us to plan for our performance. Here's my pitch to you. You've all been in that meeting room when your manager or boss has asked you, well, does this site work on mobile? And you think, well, sure, it's worked on a small screen size on my desktop, and I wrote some CSS, so things scale. So definitely this works on mobile. But that mentality has a flaw. And the flaw is that the mobile web is no longer just this subset of the web that you think about at the end of your development. The mobile web is simply the web.

And we see this in the numbers. Mobile traffic actually surpassed desktop traffic back in 2014, and it's only continued to grow since then. So whether or not you like it, the users that are visiting your web app or website are on a mobile phone. They are not on a desktop computer. They are not on the laptop that you're developing on in your office.

But you as a developer, you're like, OK, that's fine, Sam, because I have a mobile phone in my pocket. Yeah, just right here. I'm just going to test my website on this, and it's going to be great. And I will have accomplished everything that you told me to do. But that's not the complete picture. That's not the right approach, because that phone that you have in your pocket is a $600 phone. It's the brand new phone, because you're a developer and you want the best thing. So you have the best thing. But the phones that your users have, well, they look like this. As Alex touched on, these are the low-end phones. These are the phones that you get for free when you sign up for a cellular plan. These phones have a gigabyte of RAM. These phones have a single core. And we see the numbers that back this up: the average Android device that checks in has a gigabyte or less of RAM. So not great.

And this has some real performance implications for mobile websites. Believe it or not, the average load time for mobile sites is 19 seconds. So from entering the URL in your browser to that page actually loading on your phone and being interactive, that's 19 seconds. 19 seconds that your users have to wait. Now, you've seen this graphic before, I'm sure, but it's really important to think about, because the web that we're shipping today has a 19-second average load time, and users, when you ask them casually, expect a page to load in two seconds. And most users, if a site takes longer than three seconds, will just leave the page. So that web app that you're delivering right now, if it takes longer than three seconds to load, you're just throwing away half your traffic.

OK, we have 19-second load times. We are shipping experiences that don't match what users want. So how do we align the web that we're building today with what users want? To do that, I think it's important that we take a look at what it takes to load a web page, from when you issue the request to when that page loads in your browser and you can interact with it. What exactly happens between when you hit Go and it shows up? So I've taken the network stack and everything that happens in between, and because it's a lot to talk about, I've simplified it down into three phases. And the first phase here is requesting the page, the server receiving that request, and then shipping back an HTML document to you. As I've so beautifully illustrated here for you all, this is what it looks like.
You go to the website on your phone, it hits the server. The server then generates a document, and it has to do some processing work to figure out what exactly we're going to send to you. And then it ships that document back to the browser. And don't forget, that request has to go across cellular towers and then find its way through the internet to the computer that's responding. And then when that computer has the response, it has to send it back. So that round trip cost on mobile phones, it's real.

All right. So we have the document. The server has responded. So the browser has to do some work now. We have to parse the HTML, and we have to figure out what exactly is required on this page for us to actually show it. So we have to find these critical assets. And once we've found these critical assets, the browser says, okay, I need this JavaScript file, I need this style sheet, I need these two images, and when I have those, I'll be able to draw something that you actually care about. But we have to issue requests for those. And those requests go over the wire, through the cellular towers, onto the server. The server figures out what exactly we're sending to you, and then the server responds back. And all of this takes some time.

So, all right, we have our high-priority resources. We've made that initial connection to get the document. And here we are in my favorite phase, the parse and execute phase. We have the assets. The browser isn't showing you anything yet, because it actually has to compute what all of these individual pieces look like when they're combined together. So we have our HTML, our JavaScript, our CSS. The JavaScript gets parsed and compiled so it can actually be evaluated by V8 in Chrome. It runs some JavaScript, maybe it adds some DOM onto the page. The browser then tries to render: it takes your styles, calculates the positions of all your elements, lays them out, and then composites any layers. And then you end up with a beautiful page. But this is just the loading phase. This isn't interactive yet. This is just getting something on the screen.

All right, given all that, I think it's important that we take a look at this network phase. If we're gonna optimize one layer to start, the network seems like a good opportunity, because we're issuing a request to the server, the server has to respond, then we issue more requests, and there's a real cost here. So what I did was I took the Polymer Shop site that you've seen today, I cloned it down locally, and I started making a bunch of changes to it. I started out with a pretty simple baseline. I didn't bundle any of my assets, so I have individual assets for everything, not one big blob. I then set up an HTTP/2 server, and I simulated a 3G network. And by simulate, I mean I actually had a 3G network that I tested this on, so we got some real-world metrics.

So let's take a look in DevTools at how we can get a feel for what this is like. All right, so here's shop on the left, and here's a recording that I did earlier and saved off. You'll notice I have this pane up here which you might not be familiar with, and this is the network view. The network view is usually hidden away until you check this box. And then you see this beautiful waterfall of all the things that are getting downloaded. And you might be saying, what is this gray line? I've never seen that before. Well, of course you've never seen it before, because the patch landed yesterday and I have it right here.
So what this gray part means is we're just waiting for content. We haven't gotten anything. The request went out, but we haven't received any response yet. So you can see that in this case it takes 1,500 milliseconds until we actually get a response from the server, and then we're downloading the content. Okay, and then it looks like we issue a request for this other file, this shop-app, which for this site is sort of their app wrapper, which then has a bunch of dependencies which all need to download, which then have a bunch of dependencies which all need to download, which then have a bunch of... okay, you get the idea. And so all of this results in a paint that takes around 5,500 milliseconds. Okay, so five seconds, 5.5 seconds for this on a 3G network. It's not terrible. It's nowhere near that 19-second average. So we're starting off from somewhere that's not horrible. However, as a developer we can see clearly there's some room here for improvement.

So again, I'm just holding down shift here in the timeline as I drag, so I can actually measure how long things take. So right, roughly 5,500 milliseconds between when we got our first byte from our server and when we rendered a screen. Now, shop paints something pretty quickly. We have this frame up here showing the shop, but really that's not useful to the user until we actually have content. So this is the first contentful frame, or the first meaningful paint, and this is what I like to measure to. So we see we get these network resources, the browser then executes some JavaScript here for just a tiny bit, and then we decide to paint. All right, so HTTP/2, no bundling, 3G: 5,500 milliseconds until first paint. Okay, let's look at some techniques that we can apply here to drive that number down.

Now, I don't know if you've heard of this really cool feature that's in Chrome called link rel preload, but what this does is it allows you as a developer to basically say, I know exactly the things that my page is going to need before it's displayed and loaded, so I'm going to tell the browser what those are. And the browser's going to see this and say, okay, I'm going to download these. I know you haven't requested them yet, but you as the developer know that something is going to request them, so the browser will download these, they will sit there, and when your browser finally issues a request for them, it's like, oh hey, I already have this, so you don't have to hit the server again. So I took the site, working off of our base with no bundling going on, I looked at all of the assets that were required before we actually issued a paint, and I did link rel preload for them, put it all in the head tag, and let's see what the effect of this is.

Oh, so this looks quite different. It's now very flat. We no longer have this sort of stepping effect of dependent resources on dependent resources that eventually resolve into a painted page. Instead we just have a flat line of things. It's like, hey network, just grab all these, thank you. And then when we have them, we end up with a paint. So from our first byte here to when our first contentful frame was, we're down to 3,300 milliseconds. So we've cut out roughly two seconds, just by avoiding that network work of having to download, discover, download, discover, download, discover, and flattening that out.
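For reference, a preload hint is just a link tag in the document head. Here's a minimal sketch of what that might look like; the file names are illustrative, not the shop site's actual assets.

```html
<head>
  <!-- Ask the browser to start fetching these now, even though nothing on
       the page has referenced them yet. The `as` attribute lets the browser
       assign the right priority and request headers. -->
  <link rel="preload" href="/js/app.js" as="script">
  <link rel="preload" href="/css/app.css" as="style">
  <link rel="preload" href="/images/hero.jpg" as="image">
</head>
```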
Now you might be thinking, okay, but why would I ever have a site that was daisy-chained like that, with all these dependent resources? This is totally contrived. Fair point. But on your site, I guarantee that you're downloading JavaScript. If you're taking advantage of some of the new module-splitting features of build tools, you probably have lazily loaded JavaScript files. And those are all late-discovered resources. So what you can do is use link rel preload to say, hey, start downloading these, because I'm going to need them. Which can be a real win, and in this case, a massive win.

All right, we're down to 3.3 seconds on a mobile network, which is good. We're getting closer to that two-second time that people want, but we aren't quite there. And if we look here, we still have this cost that's kind of crazy. We have this initial connection cost, which is five seconds, which is insane. But then once we finally get a byte, we're spending essentially 1.6 seconds just downloading this document. And we're not taking advantage of the network during this time. We're saying, give me this HTML document, please give it to me. Thank you, I'm receiving it. And then once I have it, I download more stuff. It seems like a wasted opportunity. It'd be nice if we could jam some requests in here as well.

Turns out there is a solution for this. That solution is called H2 server push. How does this work? What is this? Well, the basic idea is that when your browser requests your page, it hits the server and opens this connection. And while the server is responding with the HTML, the server says, I know you're gonna need these assets, so I'm gonna go ahead and push them to you. You didn't ask for them. The browser has no idea that they're coming, but I as the server know the world. So here, just take them. And when you finally need them on the client side, well, they're already there. Compare this to the approach we were using before, which took two round trips: get the document, scan the document, request all of the critical assets, and then we can paint. In this case, the server is pushing the assets to us. It changes who's responsible for what.

Now, you're probably like, I read a blog post about H2 push, and that blog post told me lots of things that were not so great about H2 push. And I probably read that same blog post. So let's talk about them, because they're important too. First, H2 server push is not cache aware. What does this mean? Well, it means that the server doesn't have any idea about the state of your client. So if you've visited the page before and you have these assets cached, the server can just say, hey, I'm gonna push main.js again, because I don't know if you have it. And by pushing it, well, the client's unable to cancel that right now. And so it causes network contention and can impose a real cost on your users, because the server can basically push everything it wants, and the browser is just going to take it. So not great, and it can actually slow down your page quite a bit. The second is that H2 push doesn't have resource prioritization. All right, what does this mean? Well, it means that the browser natively is pretty smart about what things it requests, because it's able to determine, this is critical, this is not critical, I can defer this work, and so it tries to optimize its network activity. H2 server push is like a bully. It just pushes it to you. It's like, here, just take all of it. And the browser's like, okay, I'll take it. So it's not so great.
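The talk doesn't show the server side, but as a rough sketch of the mechanics, here is roughly what pushing an asset alongside the HTML looks like with Node's built-in http2 module. The choice of Node and the file names are assumptions for illustration, not what the shop demo actually ran on.

```js
const http2 = require('http2');
const fs = require('fs');

const server = http2.createSecureServer({
  key: fs.readFileSync('server.key'),
  cert: fs.readFileSync('server.crt'),
});

server.on('stream', (stream, headers) => {
  if (headers[':path'] === '/') {
    // Push a critical asset the client hasn't asked for yet...
    stream.pushStream({ ':path': '/js/app.js' }, (err, pushStream) => {
      if (err) throw err;
      pushStream.respondWithFile('js/app.js', {
        'content-type': 'application/javascript',
      });
    });
    // ...while responding with the HTML document itself.
    stream.respondWithFile('index.html', { 'content-type': 'text/html' });
  }
});

server.listen(8443);
```

Note that this pushes unconditionally: the server has no idea whether the client already has app.js cached, which is exactly the cache-awareness problem described above.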
But there's a solution for this, which is actually the first P in PRPL. It all revolves around this one idea: H2 server push plus a service worker is like this nirvana, it's a magical thing, because it avoids the big downfall, in my opinion, of H2 server push. So let's step through this so we can understand it. On our first load, the browser makes a request to our server. The server then pushes the critical assets and at the same time delivers our document. Our document is then scanned by the browser. The browser says, okay, I need these files, go out and make the requests. But oh, you already pushed them to me, so I don't have to hit the network, great. Now on your next load, you have installed your service worker, as represented by this beautiful green square with the letters SW. You go to your page, but it doesn't even hit the server. It hits your service worker. The service worker intercepts this request and it says, hey, I have index.html, so here you are. And oh, you want these assets too? Well, I have them cached as well. And what this avoids is that you're not even opening a connection to your server. So because you have not opened a connection to your server, the server can't push you things that you don't need. So by combining H2 server push plus a service worker, you can get all of the benefits of H2 server push on that first load and avoid the majority of the downfall of getting pushed assets that you already have cached. So this is really good.

So let's take a look at what shop looks like if we actually use H2 push. And this is the default thing that the shop site actually uses, which is really awesome. So here we are, we have the site. This timeline looks really different, like really, really different. It looks like we're still paying that really high cost here to get our initial page, and we're still spending, you know, 800 milliseconds or more inside of that initial request. And then we see this very interesting sort of representation of what's going on. And what you're seeing here is actually the files being pushed to you from the server. Now these, in reality, are probably being pushed out here, but right now they're shown just right here, which is fine. So we look, we see we get all these assets that are sort of sticking around, and they're all marked as lowest priority, all right. Then we scroll down and here's that wrapper element, that root element, the shop-app HTML. We see the moment that this finishes downloading, right here, and we then fire off all these requests. But you'll notice that these requests are really fast: 63 milliseconds, 78 milliseconds, 62 milliseconds. How are we doing that? Well, it just so happens that these are those pushed assets right here. And when we request them, we go into the network stack of Chrome and say, hey, do I have this? And the network stack of Chrome says, why yes you do, here they are in my cache, please take them. Which ends up making for a super, super fast experience. And we can see from first byte here to when our page actually renders, we are under two seconds, which is amazing. So we went from 5.5 seconds to 1.7 seconds on a 3G network for our initial load, with no service worker installed. So that's pretty impressive.
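Here's a minimal sketch of the service worker half of that pattern: push on the first load, then answer from the service worker's cache on repeat visits, so the connection (and the redundant push) never happens. The cache name and asset list are illustrative, not the shop's actual files.

```js
// sw.js
const CACHE = 'app-shell-v1';
const ASSETS = ['/', '/js/app.js', '/css/app.css'];

self.addEventListener('install', (event) => {
  // Cache the app shell, which the first load already received via H2 push.
  event.waitUntil(caches.open(CACHE).then((cache) => cache.addAll(ASSETS)));
});

self.addEventListener('fetch', (event) => {
  // Serve from the cache when we can. If the request never leaves the
  // service worker, the server never gets a chance to push assets we
  // already have.
  event.respondWith(
    caches.match(event.request).then((cached) => cached || fetch(event.request))
  );
});
```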
All right, what did we learn? Network takeaways. Link rel preload: what is it good for, and when should you use it? Well, it's good for moving the start of downloads earlier during the page parse, so the browser can kick off those requests before it even knows that it really needs them as dependencies. H2 push: H2 push, when you boil it down, is good for cutting out one full round trip to your server. So the most optimal use of H2 push is going to save you, at most, one full round trip. And that's important, because when you're measuring your site and figuring out where to optimize, you want to optimize your biggest cost. And if the latency of a round trip is your biggest cost, well, congratulations, because you are in the minority, and that is amazing. And H2 push is probably your solution.

All right, we've talked about the network, cool. We know how to get our assets to the browser quickly and how to get that first paint, that first meaningful paint, pretty quick. But painting that first frame is only part of the story. The really interesting bit, in my opinion, is getting to interactive. And the cost for modern websites to get to interactive, well, it's our friend, JavaScript. Just sitting there, like, hey, you need to parse me and execute me. And on our phones, that tends to be kind of slow.

To understand just how slow this is, we need to look at what JavaScript payloads have been like. And these are gzipped sizes. In November 2010, we were shipping on average 100 kilobytes of JavaScript, pretty good. Today, we're shipping four times that: on average, 408 kilobytes of JavaScript on every page load. Okay, so what? Well, there's a cost there. There's a cost to parsing all of the code that you ship. If we look at some older devices, like an iPhone 4 or a Nexus One, our parse time for 300 kilobytes of JavaScript is anywhere from 300 milliseconds to 500 milliseconds, half a second. Remember, users want that page to load in two seconds. But you're looking at this and saying, Sam, these are ancient phones. This is a terrible graph, not applicable to me. Well, remember, the phones that your users are actually on are those older devices, which are very, very similar to the phones listed here. But it's fine, I understand your concerns, because I have a chart for new phones. All right, four megabytes of JavaScript, because you're shipping all the frameworks all at once. On a Nexus 5, a pretty new phone, I like that phone, we're at about 1,800 milliseconds of parse time. On an iPhone 6 with the latest operating system, we're at a full second of parse time. On an S7, about a second. All right, there's a cost here. Users want it in two seconds. You're shipping a megabyte of JavaScript. All that time is gone. Sorry, you lose. Try again.

All right, so let's measure this, because I can talk about numbers and we can talk about goals, but I find it really helpful to take a look at something that represents the real world. So I found a cool little app, it's a budget app that uses some technologies that you're probably familiar with: Webpack, React, Redux. Awesome, sounds fun. So let's load it up here. All right, here's our app. Let's clear this out. This is, you know, a really good budget app. I'm gonna say, what do I wanna buy? Cat food, great. Let's spend 100... no, I spent $100 on cat food. I got the fancy kind, so great. Okay, I'm really struggling with my guitar and the Trader Joe's food. All right, so let's record a profile here. And as you've heard today, we can enable this CPU throttling, which is not a real device, but it helps approximate something slower. So this is on a MacBook Air, yeah, sure. And we're just gonna do a 5x slowdown, which should make things slower, though not quite a mobile phone.
And we're gonna look at the JavaScript costs, just for this simple site. So we'll do a reload, command-shift-R, and record the timeline. It comes in, and here we are, my friend, all of our friends really, just this chart. Okay, well, what does this mean? Well, right here we see that we have this bulk of yellow, it's a goldenrod color. And that's a great sign, well, actually it's a bad sign: it's JavaScript executing. And in this frame we're spending 421 milliseconds just handling this. So what's our cost here? Why is this taking so long? So I click on this frame here, which gives me a bottom-up view of the JavaScript. And we zoom in, and we can see that 140 of those milliseconds are spent in t. Obviously it's t, of course. So I'm not really sure what that is, but we can find out what t is. And looking at this minified code really isn't useful, so I can pretty-print it. Okay, great, oh, there's t, hello t. I don't really know what this is. But actually, I kind of know what it is: it's the module loader that Webpack uses. But what's strange is that this doesn't look like a very expensive piece of code, it's not doing anything that's explicitly slow. So it feels like the numbers that are reported to me in DevTools are lying to me, or not completely accurate. This cost is being attributed here, but I don't think it's actually here. I really think that it's probably parse time, the cost of parsing that JavaScript, just from some experience that I've had.

So we start here, we have this t function, okay. But I want to see the parse cost, and I know that I could step down into about:tracing and click those checkboxes and reload it and then find it, but it's hard and it always confuses me. So wouldn't it be nice if there was a way that we could see those V8 metrics inside of DevTools? I agree. Well, I'm happy to say that in Canary, we have an experiment: V8 internal metrics are now in the timeline, which is exciting. Screenshots are great, but demos are better. So here we are, let's have a demo, great. Enable our CPU throttling again, awesome. Let's just reload the page. Command-shift-R, wait our three seconds. One, two, three, okay, here we go, four, great. Ooh, whoa, good, 739 milliseconds that time. All right, where's our cost? Oh, that's different: parse function. That's what I thought. Parse function, compile code lazy, compile full code, compile script, function callback. These sound like native V8 runtime things, excellent. And we could go into exactly why this cost is there, but there are a lot of blog posts that describe how to optimize it, and this at least gives you a starting point, so you can understand the real parse cost of your website.

Okay, but parse time, that's only part of it. You have parse time because you're shipping JavaScript, and when you're shipping JavaScript, it's really easy with npm to just require everything and then say, okay, great, it works. So it's never been more important to actually understand what you ship. Webpack recently had a plugin released, it's called the Webpack Bundle Analyzer, and this actually shows you what's in your Webpack bundle. You can explore this treemap and see the gzip size, the bundle size, everything about it. You can find out why you have three versions of jQuery on the page, and then improve things. And then for those of you using Browserify, there's a similar tool called disc, which has a sunburst chart, which is beautiful. I'm always so happy when I see this. I'm able to find the sizes, et cetera, the same information.
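If it helps, wiring the analyzer into an existing Webpack config looks roughly like this; it assumes webpack-bundle-analyzer is installed from npm, and the entry and output here are placeholders for whatever your project already has.

```js
// webpack.config.js
const { BundleAnalyzerPlugin } = require('webpack-bundle-analyzer');

module.exports = {
  entry: './src/index.js',
  output: { filename: 'bundle.js' },
  plugins: [
    // Opens an interactive treemap of every module that ended up in the
    // bundle, with stat, parsed, and gzipped sizes for each one.
    new BundleAnalyzerPlugin(),
  ],
};
```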
So, okay, you understand what you're shipping, and you're shipping less code, ideally, but you still have that parse cost. And you have to think, well, the only way to shrink parse time is to parse less code. Okay, well, how do I parse less code when I'm shipping JavaScript and I want it to be parsed eventually? That's a good question. Well, there are two techniques that I like, and they're pretty simple to do. You know the script tag: it always has that type attribute on it when you set a source. Well, you can change that type to be whatever, and if it's invalid and the browser doesn't know how to handle it, it will download the script file for you, but it won't do anything with it, it'll just sit there. So you can put type inert, type poo emoji, it all works, you'll get that script tag. And then what you can do as a developer is say, oh, I have some idle time here, let me go ahead and evaluate this JavaScript: take the contents, put them in a real script tag. This gives you control over when the parse time happens, so ideally not during that initial boot-up experience. Now, you can also have another script tag where you wrap your code in comments. So you ship down JavaScript that's commented out, but then when you wanna evaluate it, you just strip those comments out, and then the code gets evaluated. So those are some techniques for manually shipping inert code, which can help you avoid that parse cost, which is non-trivial.

All right, that seems pretty manual. It is. Wouldn't it be nice if there were ways to do this automatically in my framework of choice? And there are. The community has been listening and solving these problems for you, so it's much easier to do these things. Angular 2 actually ships with lazy module loading by default in the ahead-of-time compilation, so you can do per-route JavaScript bundle downloads, which is pretty remarkable. Polymer CLI, which built the shop site, has per-route fragment sharding, so it's aware of your page and all of your routes, and it's able to say, I know that this bundle is gonna be for this route and this one is for this other route, and so here are these independent bundles that you can download when you feel like you need to. The other way is a little more manual, but still really amazing. Webpack has a plugin called the aggressive splitting plugin, and it redefines the semantics around require to add require.ensure, and what this does is it essentially builds a dependency graph of your requires so that Webpack can say, okay, I'm going to split this JavaScript bundle, and this one, and this one, and this one, and you get a lot of JavaScript files, but they're great, because they're really small and you can manually control when you download them, which is so awesome.
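As a rough sketch of what one of those split points looks like in application code, here's a require.ensure call; the module names are made up for illustration, and the exact output depends on your Webpack configuration.

```js
// Only download and parse the checkout code when the user actually heads
// to checkout; Webpack emits it as a separate chunk that loads on demand.
function goToCheckout() {
  require.ensure(['./checkout-view'], function (require) {
    const checkoutView = require('./checkout-view');
    checkoutView.render();
  });
}
```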
All right, so we looked at the network, we looked at JavaScript, we looked at DevTools and visualizing and measuring our costs. So what do I want to leave you with? Well, as you've heard time and time again today, it is a mobile world. It has never been more important to test on real devices, not the one in your pocket, but the ones that your users are experiencing your website with. And it's not just testing on a mobile phone, it's also testing on a mobile network, not the Wi-Fi in your office. You need to go get on a 3G network and say, okay, here I am on my phone, on a 3G network, what does my web app feel like? Is it slow? We should probably fix that.

Now, given what your users visit your site on, it's critical that you optimize for network utilization by using techniques like service worker, link rel preload, and HTTP/2 push. These things can help you get super fast first loads in under two seconds. And on reload, it's instant, because you have that service worker just sitting there. And finally, my favorite: JavaScript parse cost. The JavaScript that you ship has a cost, because it has to get parsed and evaluated. So when you ship more JavaScript, parse time goes up. There's no way you can really avoid this. So the trick is to ship less JavaScript, to ship less code that has to be parsed up front. The end result of all three of these is that your users get a blindingly fast web experience, and everyone wins. Thank you very much.