Hello, everyone. My name is Sam Saccone. You may know me a little better like this. Or this might seem more familiar. But here I am, and it's a pleasure to see everyone. Hopefully it's all right seeing me. So today we're going to talk about the future of performance on the web, and take a look at performance patterns that will, hopefully, be coming to the platform soon. I've never been so excited to be a web developer. In 2017, we have all these amazing primitives at our fingertips, from WebGL, to service workers for offline, to WebUSB for interacting with physical devices that we plug into our computers. We have so many ways to express ourselves on the web like never before. But as Eric put so well, Chrome ships a lot of features: in the past four years, it has shipped over 1,000 of them. And if you're like me, I get a little overwhelmed. All these new things coming at me, all these blog posts about new things I can do, and I lose track of what's coming up because I can barely keep track of what's already landed. So before we go into talking about performance, I want to make sure we have a common language to use when discussing it. Let's zoom out and think about performance in three primary pillars: network-bound operations, parse- and execution-bound operations, and finally render-bound operations. You can think of network-bound like this: when you go to a web page, you have to download some files, and how those files get to you is important, because the faster they reach your user, the faster you can actually run them. When we finally get those files and run them, we enter a phase where the files are evaluated and executed, and how they are run impacts the user experience and how fast we can get things onto the screen. And finally, when our files have run, we drop into a render phase, where the page takes all the side effects of our operations and paints them onto the screen.
So to start, I want to talk about bundling performance. Bundling performance is bundling your files together and getting them to your user over the network. And I want to take a look at some future patterns which are going to question our common best practices. Apps today tend to look like this. If you've shipped an app and looked at the network profiler when you've loaded it, you might see this sort of shape: you have an application bundle, a vendor bundle, and a styles.css, and these are the combination of all your individual files compiled together. If you've read any performance blog posts or books, they will always say: you need to bundle. You need to be combining your assets together before you ship them, because that gets the assets to your users faster; they don't have to round-trip to the server over and over again. However, this approach has some performance downsides that not too many people talk about. Let's consider the following situation. Our application bundle has page-one.js, page-two.js, and a common.js file that's required in both our vendor bundle and our application bundle for those bundles to be valid. Think of it like string utilities, something required in both files to even run. And in our styles file, we have our individual CSS files rolled up into one single bundle. But we have this common.js file that's shared. So this means that when we change a single bundled file, such as common.js, all dependent bundles are invalidated. Invalidated is a fancy way of saying that the browser needs to redownload them: when I load the page for the second time, the browser will say the app bundle has changed and the vendor bundle has changed, so it needs to refetch them from the network, redownload them, reparse them, and finally re-execute them.
So a small change, like an update to the string utilities in common.js, might cause all of your files to be invalidated and thus redownloaded, which causes a negative perf impact for your end users. Ideally, that's something we want to avoid. So I postulate: what if our applications looked like this? Some of you may already be doing a form of this, shipping assets based on route, using something like React Router and webpack to do route-based splitting. But I want to push it a step further and say: what if we took our individual routes and also our individual libraries and started shipping them independently of each other? So page-one, page-two, our modal library, our framework, and all of our CSS individually as well. Now when you make a bug fix to page-two.js, instead of your entire bundle having to redownload, you only need to redownload one single file, not those entire bundles. That means when your user reloads the page, they already have the majority of the assets in their cache and do not have to round-trip for them. Now, when we ship granular assets, there are some other interesting wins that come along with that. So let's walk through what it looks like to load a single JavaScript file. When we load a JavaScript file, we first have to download it from the network. Once we have the JavaScript file, it enters a parse phase, where the JavaScript engine has to take this textual representation of your code and turn it into a shape the engine can understand. Once the parse phase is done, it drops into a compile phase, where it takes that parsed structure and compiles it into a format that can then be run in V8 or in your JavaScript engine of choice. So say we start shipping granular assets. This means our initial download gets saved on repeat loads, because we have the asset locally and cached.
We then still have to do our parse, compile, and execute phases, but we already have significant savings. Now, wouldn't it be nice if there were a way to get around this parse and compile phase, which is non-trivial for large scripts? Well, it turns out that V8 is smart and does some fancy optimizations. When you reload a script enough times, V8 will say: hey, I've seen this page-one.js multiple times now, so I'm going to take the parse and compile work that I've done before and reapply it to this file. That cuts down the amount of time your JavaScript engine spends in parse and compile, because it reuses the work it has already done. Now, on average, this yields about a 40% reduction in parse and compile time, and that's across all sites. You can think of this number as an implementation detail in V8. It's only going to rise, because I know the team is actively working on driving parse and compile time down. By shipping granular assets, we opt into these savings: whereas before we were invalidating entire bundles, we now invalidate single files, which means the engine can reuse the parse and compile work it has already done on the same files. So with all these optimizations, we end up saving a non-trivial amount of time when re-delivering assets to the user on the second, third, and fourth reload. The good news is that today it is possible, with script type=module, to load completely unbundled code via ES2015 imports. Script type=module, and you're good to go. Chrome engineers recently put this to the test. They asked the question: what would it look like if I went all in and loaded all my code completely unbundled? So they took some popular libraries, like Moment.js and Three.js, and tried to load them without any compile step. And what they found is that when you unbundle everything, it gets slower, unfortunately. In the case of Moment.js, which this graph is from, the perf gap was over 100 milliseconds of loading.
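As a concrete sketch of what "script type=module and you're good to go" means (the file name here is hypothetical):

```html
<!-- The browser fetches page-one.js, then any files it imports in turn,
     with no bundler involved. -->
<script type="module">
  import { run } from './page-one.js';
  run();
</script>
```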
Moment.js is hundreds of files, and loading them via ES2015 imports, as compared to bundling (not minifying in this case, just bundling the files together), produced that gap. So this is kind of disappointing as a developer right now. And it leads into the fact that delivering granular assets while ensuring good performance is tricky, if not impossible, to do today. So why are we talking about granular asset loading? Well, it's because I want to talk about where we're going with some of the performance patterns that are going to be unlocked on the platform soon. In the examples we've talked about, we've looked at loading all the individual files upfront and paying that cost, and in that case, bundling makes a lot of sense. However, wouldn't it be nice if, in the browser, we could have some information about the current state of the client before we send off those requests, so we could intelligently choose what to request and when to request it? Well, this brings in a feature that has landed in Chrome: dynamic imports. Dynamic imports are JS code loading on demand in the browser. It looks something like this. We have an import here that uses the familiar import syntax to load a file. It returns a promise, and when that promise resolves, we have the module loaded in the browser. It sits inside of an if condition here, which is the interesting part. And then we execute run, which is an exported function from that module. Now, dynamic imports are cool, but that demo didn't really show their full potential. So I'm going to lean on another new API: the Network Information API, which you've heard a little bit about already. The Network Information API allows you, inside of the client, to determine the runtime conditions so you can choose how you're going to fetch your assets. We can look at things like type, downlink, round-trip-time estimates, downlinkMax, and effectiveType to figure out the correct solution for this specific client.
So using that primitive plus dynamic imports, we can come up with a dynamic loader, a bit of code that can determine: should I be downloading a bundle for this client, or should I be downloading granular assets and thus getting that repeat-page-load performance gain that granular assets give me? Here we're saying: if the round-trip time is over 500 milliseconds, go ahead and download the bundle, because we don't want our users to pay the cost of many round trips in that case. Otherwise, drop into granular assets, because we want that repeat-page-load performance. So with client conditions and dynamic imports, we can start delivering optimal performance across a multitude of client conditions, and one solution won't have to fit every one of your clients. So we've talked about loading. Now I want to talk about the next phase your browser drops into once you have these assets: getting to interactive faster, painting what's important for your users on the screen as fast as possible. I think we are at a sort of loading inflection point on the web. We're moving from a world where the client initiates fetches to a world where the server pushes us assets. We're moving from a world where the browser discovered all of your assets to one where you as a developer are able to declare your assets upfront, without the browser having to read all the files. And finally, we're moving from a world where assets have inferred priorities about what's important and what's not to one with explicit priorities, where you as a developer can say what's more important than something else. The first part of this is cache digests with HTTP/2 push. Now, I've been a vocal critic of HTTP/2 push, saying that it is a footgun that can get you into trouble. However, this proposal course-corrects and gives us a new primitive which will let us get around one of the major problems: over-pushing. So consider the following situation.
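A minimal sketch of such a loader, assuming the Network Information API's navigator.connection object. The 500 ms threshold comes from the talk; the function and file names are hypothetical.

```javascript
// Decide between one bundle and granular files from client conditions.
// `connection` is navigator.connection in supporting browsers.
function shouldLoadBundle(connection) {
  // If we can't inspect the connection, play it safe with the bundle.
  if (!connection || typeof connection.rtt !== 'number') return true;
  // Over 500 ms round-trip time, many small requests cost too much.
  return connection.rtt > 500;
}

async function loadApp(connection) {
  if (shouldLoadBundle(connection)) {
    return import('./app.bundle.js'); // one request for everything
  }
  // Granular assets: each file caches (and invalidates) independently,
  // giving us the repeat-page-load win.
  return Promise.all([import('./page-one.js'), import('./framework.js')]);
}
```

In the browser you would kick this off with `loadApp(navigator.connection)`.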
Imagine your client is repeating a page load, and for that page, the server says: you're loading index.html, so you're going to need assets A through F. I know you are, because I'm the server; I know exactly what I'm going to be sending you. But the client already has assets A through D locally. So the server pushing assets A through D would result in an over-push: wasting network resources to send assets the client doesn't actually need. That over-push is unnecessary. Cache digests would enable a client to tell the server exactly what it has in its cache. As soon as that H2 connection is opened, it would be able to say: hello server, it's nice to see you; here's what I have in my cache; whatever you do, don't send these files to me, because I already have them. And the server would say: okay, cool, I see that you have A through D, here are E and F, go on and have a great day. So in this case, we are no longer over-pushing, and the user has a faster loading experience. And this is one of several solutions. It's just a proposal, but there are a lot of people working on it, which I'm excited about. So this brings us to the next piece. When it comes to delivering assets, not all assets should be treated with the same priority. What does this mean? Well, it would be great if I as a developer were able to hint to my browser exactly what was important and what was not. And we can do this via a proposal for priority hints: explicit priorities versus inferred priorities. Think of your browser as a gigantic inference engine. It loads an index.html file, scans through the page, looks at the position of your script tags, your images, your CSS, and says: this is critical to the page, this is not critical, this seems medium priority. But because it's an inference engine, it doesn't always do the right thing, even when you as a developer know exactly what you want it to do.
So priority hints give you as a developer explicit control over what you load and how you load it. Before, we had to do some silly things to hack around priorities: we would use a link rel=preload tag to force an async script to load as high priority, or do something even crazier, like use an image tag to load a JavaScript file that we would then dynamically inject into the page on error. Kind of strange stuff, but hacks, more or less. Priority hints give us a path to remove these hacks and opt into an explicit performance path. So let's take one of my favorite sites, my favorite hat site. This site has three resources: two images and a JavaScript file. I want my users to always see my favorite hat first. So I'm going to use resource priority hints, with a group syntax, to say: download my favorite hat first, and once that's downloaded, go ahead and download coolhats.gif in the background and hatsstore.js in the background. Because the most important thing for users when they load this page is to see my favorite hat; I care less about immediately showing the cool hats in my store. Now, this proposal goes a step further: it's not just for image tags and script tags, but also for fetches. This means you can set the priority of a fetch so that low-priority fetches don't cause contention in your network stack. So say you had a long-poll operation that pinged the server every n seconds; it wasn't really important and didn't need to happen right away. You could set the priority for that fetch to low and offload that work from the critical network path. Next, async images. Async images is an intent-to-ship proposal from Chrome that unlocks explicit control over image decoding and allows you as a developer to move images outside of your critical path. What are async images, and why would you want this? Okay, well, let's look at this example.
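The exact syntax was still being discussed at the time; one proposed shape used an importance attribute on resources plus an option to fetch(). A sketch for the hat site, using the file names from the demo above:

```html
<!-- One proposed shape of priority hints; the syntax was in flux. -->
<img src="favorite-hat.jpg" importance="high">
<img src="coolhats.gif" importance="low">
<script src="hatsstore.js" importance="low" async></script>

<script>
  // The proposal also covers fetch(), so a background long-poll
  // doesn't contend with critical requests:
  fetch('/ping', { importance: 'low' });
</script>
```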
We have a very large image and then a JavaScript file. What can happen in this case is that our image starts downloading on top, our script starts downloading on the bottom, and our image download finishes before our script. The image then goes into a decode phase, and decoding an image ties up the main thread. That means it actually blocks the script from dropping into parse, compile, and eval; it pushes out when that script runs. And I'm guessing that on your website, an image isn't as important as delivering your framework to actually show the page. So this seems undesirable. With the async proposal, we'll be able to work around this. We'll be able to free the main thread from the image-decoding overhead. We would mark the image as async, and now our waterfall looks like this: we download our script file, we download our image file. The image says, hey, I have an async attribute, so I'm going to defer my decoding; it still happens on the main thread, but it's off the critical path. The script says, I'm ready, and drops into parse, compile, and eval. And once the critical main-thread work has completed, the image goes into its decoding phase, which no longer blocks the rest of the interaction on your page. Okay, so we've talked about bundling and delivering assets, and making sure those assets reach our users when we want them to, with things executing in the order we want. But we still have the entire runtime phase of our web app. Our assets are running and our page is now interactive, but we want to optimize the actual runtime of what's going on under the hood. So I want to talk about how to move work out of the critical path and how to do more work without impacting user experience. This brings in my friend, web workers. Now, web workers are not service workers.
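A sketch of the markup: the hint discussed here is an async attribute on the image (this control ultimately shipped in browsers as the decoding attribute), and the file names are hypothetical.

```html
<!-- The decode of this image is deferred off the critical path, so the
     script below can parse, compile, and execute without waiting on it. -->
<img src="very-large-photo.jpg" decoding="async">
<script src="framework.js"></script>
```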
Web workers are a utility you can think of as friends: you give them JavaScript tasks, and they run those tasks and keep your main thread responsive, because your main thread is concerned with doing other things. So imagine our browser here in the middle, with all these tasks coming down: react to user input, download this script file, calculate pi to 10 digits. Okay, doing them, doing them, great. But then our page starts to get overloaded. We ask the browser to do more than it can keep up with, and this results in jank. We've all had the experience where we load a page and try to interact with it and nothing happens; we're tapping and nothing's loading, and all of a sudden everything snaps in. That's because the browser was overloaded and trying to catch up. But with web workers, we can take these tasks and hand them off to our web worker friends, and the web workers will execute the tasks for us and pass the results back to the main thread when they're done. This leaves the main thread open and available to react to user input and take care of the critical tasks the browser needs to do to make sure your user experience is great. So I made a demo to illustrate exactly what's happening here. All this demo does is move a rectangle across the screen in canvas. I'm doing it on the main thread on the top and in a worker thread on the bottom, passing just an array buffer for the canvas back to the main thread to draw. We can see that the worker thread goes quite a bit faster than the main thread. And why is that? Well, it's because the main thread has a lot of stuff to do. The main thread has to render all of this; it has to react to user input; it has to keep track of a whole bunch of things. But my worker thread, all it's concerned with is: move the box by one pixel, move the box by one pixel.
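As a sketch of the hand-off pattern (the task names and shapes here are hypothetical): a worker is just a script with an onmessage handler that does the heavy lifting and posts the result back.

```javascript
// A task handler that a web worker could run. In the browser, this file
// would be started with `new Worker('tasks.js')` and driven by postMessage.
function handleTask(task) {
  switch (task.type) {
    case 'calculate-pi': {
      // Leibniz series: deliberately slow, exactly the kind of work we
      // want off the main thread.
      let pi = 0;
      for (let i = 0; i < task.iterations; i++) {
        pi += (i % 2 === 0 ? 4 : -4) / (2 * i + 1);
      }
      return pi;
    }
    case 'sum':
      return task.values.reduce((a, b) => a + b, 0);
    default:
      throw new Error(`unknown task: ${task.type}`);
  }
}

// Worker wiring (browser only): reply to the main thread when done.
if (typeof self !== 'undefined' && typeof self.postMessage === 'function') {
  self.onmessage = (e) => self.postMessage(handleTask(e.data));
}
```

On the main thread, `worker.postMessage({ type: 'calculate-pi', iterations: 1e6 })` plus an onmessage listener replaces a blocking call, leaving the main thread free to react to input.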
And because it's focused on just one specific task, it's able to do it a lot faster. But as web developers, we find ourselves working with the DOM often, basically always working with the DOM. And if you've looked at web workers at all, you may have noticed that web workers don't have access to the DOM, which is painful as a web developer and makes it seem like they don't quite fit. But I'm excited, because there are some proposals in the works to lift this restriction, to make it possible to do DOM manipulation in a web worker. DOM changelist is one of these proposals. It enables the construction of DOM operations in a web worker, backed by an array buffer, so you can use that array buffer to transport the DOM operations between the web worker and your main thread. So what does this look like? Well, here's the API. It looks familiar but new. We construct a DOM changelist, we batch up a bunch of mutative operations, and then we apply those changes from the DOM changelist. You can think of it like a transaction on a database, which can be accepted or rejected. Now, this is really interesting in a web worker because, as I said, this DOM changelist can be backed by an array buffer, and array buffers are very inexpensive to send back and forth between the main thread and the worker thread. So imagine you had a library that had to do a lot of DOM diffing. That DOM diffing could actually take place in a web worker, keeping your main thread responsive while calculating what it needed to update, and then finally post-messaging that new mutation set out and applying it to the document on the main thread. Now, sharing messages between a worker and the main thread, as I alluded to earlier, is a little different than you might be used to. You have to use message passing, or postMessage. But postMessage has a slight overhead.
So for applications that have chatty needs, meaning you have to go back and forth a lot, postMessage can actually get in the way. But there is a proposal that has shipped in Chrome and several other browsers that unlocks this limitation: shared array buffers. And this is a scary thing to say in a room of developers: it's mutable shared memory, meaning multiple threads and multiple processes can mess with this memory at the exact same time and stomp all over each other. It's scary; let's just avoid that for now and talk about what it unlocks. Well, here's what it looks like if we were to use it. We can allocate a SharedArrayBuffer, postMessage that SharedArrayBuffer into our web worker, and start acting on it. But unlike regular array buffers, the SharedArrayBuffer in this case refers to the same thing on both sides. It doesn't pass the whole array buffer into the worker and then pass it back out; we're creating a direct link between our worker and our main thread. What does this let you do? Well, I made a silly demo where I count up from zero to 10,000. On the top, I'm using postMessage to post every single time I mutate my counter. So I go one, two, three: postMessage, postMessage, postMessage. In the SharedArrayBuffer case, I don't have to use postMessage, because the data is shared in both places, so I'm able to simply read from that SharedArrayBuffer. Think of it like this: my SharedArrayBuffer is a model, and my view in this case just reads from that model whenever it wants to. So I'm able to work around the postMessage overhead, which roughly translates to about 10 milliseconds per message. Normally postMessage is great and works fine, but in applications where you need to be super fast and need data to go back and forth really quickly, shared array buffers might be your answer.
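A minimal sketch of that pattern (the single-counter layout is hypothetical): in the browser you would postMessage the SharedArrayBuffer to a worker exactly once, and afterwards both sides touch the same memory. Here both "sides" are shown in one script.

```javascript
// One SharedArrayBuffer, shared by reference rather than copied.
const sab = new SharedArrayBuffer(4); // room for one 32-bit counter
const counter = new Int32Array(sab);

// Worker side (after receiving `sab` via a single postMessage):
// increment the shared counter without messaging anyone.
for (let i = 0; i < 10000; i++) {
  Atomics.add(counter, 0, 1);
}

// Main-thread side: the view just reads from the shared model whenever
// it wants; Atomics keeps cross-thread reads and writes coherent.
const value = Atomics.load(counter, 0);
```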
Now, you may be thinking: web workers seem really great and super powerful, but I don't want to have to write a framework that utilizes web workers, because it seems tricky and difficult. And I would agree with you: shared array buffers and web workers are concepts that are very hard to grasp and hard to translate into normal web-development use cases. But luckily for us, frameworks have already started taking these low-level primitives and applying them at a higher level. Preact is one example; Preact actually has a demo where it's able to do most of its work inside of a web worker. This works today, and it's pretty amazing; I recommend that you check it out. A library from the Google Chrome team called Comlink has a pretty advanced idea: it allows you to export a class from a worker to your main thread. So we can write our class, in this case App, which is running in our worker, and then expose it via Comlink. On the left side, we Comlink-proxy the worker, and then we're able to call methods on the App class from our main thread, even though it's running in a web worker. This is one of those building-block primitives that I think has a lot of potential, that people can really latch on to and build some interesting demos with. Finally, Angular. Angular actually ships today with the ability to run fully in a web worker. You have to opt into this mode, but it works, and it works great. It's surprising; I didn't think it would be this easy, but I asked someone on the Angular team, they sent me this demo, and it just worked. So Angular is able to do the majority of its work in a web worker, which means your main thread stays super responsive. This is amazing, because Angular is a very high-level framework, and the fact that it works like this is kind of jaw-dropping. All right, so where do we go from here?
Well, I would encourage you to take a look at the APIs we've talked about today and experiment: push the limits, see what breaks, see what's missing, share your demos with us, tweet at us, file bugs, and in general, provide feedback. Ideally, with your feedback and the things we're pushing on, we'll be able to push overall web performance forward, so the web is a fast experience for end users regardless of what kind of device they're on and regardless of what network conditions they find themselves in. Thank you very much. Thank you.