I met Shubhie years ago when she came to speak at a conference that I founded called CSSConf. We bonded over production loading performance then, because we both deeply cared about it, and it turns out we both still care about it now. So we thought we would do this talk about where loading performance has been and where we see it 10 years later, hopefully a little bit wiser.

Thanks, Nicole. So yeah, Nicole and I are once again finding ourselves working on loading performance, and this time it's the performance of open-source-powered ecosystem apps. Six or seven years ago I was working on solving these same problems for Google-internal frameworks. Since then there have been big strides in technology. For instance, we have HTTP/2 and now HTTP/3. We have great primitives like preload and preconnect. We've had a lot of amazing browser optimizations, and formats like WebP. But what I found is that we still have basically the same problems today, and it's been humbling that this is still such a hard problem.

We're still juggling the relative priority of fonts, CSS, images, and JavaScript, and how to effectively deliver these things at the right time by utilizing the network and the bandwidth. Loading more than 10 or 20 resources in parallel still has a performance cost. And efficiently, incrementally loading JavaScript is still largely unsolved in the wider ecosystem.

As an org, Chrome and Google, and browsers in general, have done a lot of good work. We've had great advances in browser engines, in standards, in dev tools and documentation. Still, a lot of the web is not meeting loading performance metrics. Chrome recently announced the Web Vitals metrics, a set of three metrics that we have high confidence impact user experience during page loads. We want a lot more of the web meeting these metrics.

Rick Byers, my manager on the web platform, tweeted this interesting bit of data.
Chrome's usage is about evenly divided between sites that are in the head, in the torso, and in the tail. That breaks down into these numbers: a small number of head sites, about 100; a medium number of torso sites, about 10,000; and then the web has a very long tail of 3 million plus sites. So this might give us a clue into the UX problems: are we reaching each of these different audiences with our work? Best practices tend to make it into well-resourced head sites, which might even have teams of dedicated performance engineers. But we are missing out on reaching the torso and tail sites, and that is the bulk of the web.

So to reach more of the web, we realized that we should be looking at what developers are actually doing. These sites are often built using client-side frameworks, static generators, CMSs, or website builders. Looking further down the stack, we see that a lot of the web is powered by open-source ecosystem tools like Webpack, Babel, NPM, and TypeScript. So now we have a new surface, the full stack, to consider when thinking about UX and loading performance.

But how? And what's the problem with loading anyway? To answer this, my team has spent a lot of time over the last six months looking at the performance of real production apps, all of them powered by open-source ecosystem tools like Webpack, Babel, NPM, and React. At a high level, we have the same problem that I saw six or seven years ago: it's still not easy. It's a scheduling problem of proper ordering, prioritization, pipelining, and optimizing for metrics in the correct order: first FCP, then LCP, then FID. Assets have to be staggered so that each one is ready by the time its metric should fire, based on user expectations. This means figuring out which assets are most relevant to which metrics and ordering them appropriately: first get the fonts and CSS in time for First Contentful Paint, then get the hero image ready for Largest Contentful Paint.
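The ordering idea described above can be sketched as a toy scheduler: tag each asset with the metric it primarily unlocks, then sort so that FCP-critical resources come before LCP-critical ones, which come before FID-critical ones. This is purely illustrative; the asset names and priority map are invented, not a real browser algorithm.

```javascript
// Toy sketch of metric-driven asset ordering (illustrative only).
// Each asset is tagged with the Web Vital it primarily unlocks;
// FCP-critical assets sort ahead of LCP, which sort ahead of FID.
const METRIC_PRIORITY = { FCP: 0, LCP: 1, FID: 2 };

function orderAssets(assets) {
  // Sort is stable in modern engines, so declaration order is
  // preserved within the same metric bucket.
  return [...assets].sort(
    (a, b) => METRIC_PRIORITY[a.metric] - METRIC_PRIORITY[b.metric]
  );
}

const ordered = orderAssets([
  { url: '/app.js', metric: 'FID' },
  { url: '/hero.jpg', metric: 'LCP' },
  { url: '/styles.css', metric: 'FCP' },
  { url: '/font.woff2', metric: 'FCP' },
]);

console.log(ordered.map((a) => a.url));
// ['/styles.css', '/font.woff2', '/hero.jpg', '/app.js']
```

In a real pipeline the "priority" also has to account for discovery order and network/CPU overlap, which is what makes the scheduling problem hard.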
I often see dead time where the CPU is waiting on the network and vice versa, so pipelining is important to keep the network and CPU fully utilized. There are also specific problems for every metric; here are a couple of examples, and I'm not going to dive too deep into this. For First Contentful Paint, fonts and CSS are critical and often problematic. They might be loaded from a different domain or not be self-hosted. They might not have preconnect. Fonts might be discovered late, while the CSS is parsing, and CSS can be too large or split across too many files. And synchronous scripts, often third-party synchronous scripts, block FCP by holding up CSS and fonts. Another metric example is First Input Delay. The most common culprit here is large JavaScript, or non-critical third-party JavaScript sneaking in ahead of the primary JavaScript: getting ahead on the network or on the main thread and causing additional style and layout work.

I've definitely learned a lot of lessons along the way, and the key theme is grounding the work in real-world apps and proving things out in production. The first lesson is using production apps to find problems and opportunities, and getting to the ground truth. Early on in the project, we didn't spend enough time looking at production apps. We would ask developers what their problems were instead of looking at their apps and finding the problems ourselves. And we've definitely been guilty of looking at simplistic apps versus full-featured production apps.

The second lesson was validating designs and solutions early. Lab metrics and a lab methodology are really important here, and we also need to track movement in real-world metrics as features land. With some early features, we had trouble with early validation. For example, when we implemented serving modern JavaScript to modern browsers using module/nomodule, we didn't see the expected size improvements in real apps.
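For reference, the module/nomodule pattern mentioned above is just a pair of script tags; it's held in a string here so the shape is visible, and the bundle paths are hypothetical.

```javascript
// The module/nomodule ("differential serving") pattern, as a sketch.
// Browsers that understand ES modules fetch the modern bundle and
// ignore the nomodule script; legacy browsers do the reverse, because
// they skip type="module" and honor the nomodule fallback.
const differentialServing = `
  <script type="module" src="/main.modern.mjs"></script>
  <script nomodule src="/main.legacy.js"></script>
`;

console.log(differentialServing.includes('nomodule')); // true
```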
It was too complicated. There were too many third-party NPM packages. There were overcompilation issues with Babel. Sometimes the codebase didn't have a lot of modern JavaScript to leverage. With another feature, granular chunks, we reduced JavaScript bundles by 30%, but it didn't move the Web Vitals metrics. We didn't have a good setup for real-world measurement or A/B testing, and perhaps JavaScript size wasn't the best proxy. So the overall lesson is that early validation with real websites is really critical. Now over to Nicole.

You could solve all of these problems yourself, but I think, like us, you'd find that it's two steps forward and one step back. What we've realized is that the web has a lot of really great raw materials, but there's some assembly required. And if we're being honest, it's massive amounts of assembly required. At its worst, it's massive amounts of assembly required by every dev on every project. Some devs absolutely love infrastructure; I'm not one of them. Others love building features. So what if we shared the infrastructure and the tools? New devs find this horrifying, and I think that we could make it much better for them. We think we can get to a place where much less assembly is required, where the web just works for all different kinds of developers and supports the sort of user outcomes that we want to have as well.

SDKs are absolutely key to making this happen. Next.js, Nuxt, and Gatsby wrap their respective frameworks and provide a lot of things out of the box that the framework doesn't necessarily come with. They provide client-server integration, which is absolutely essential for good perf outcomes, and many already improve performance over using a framework alone. What's better, there's room to make them even stronger performance-wise. So what are SDKs in the end? We think of them as a commons where we can collaborate as an entire developer community to get better outcomes.
What sorts of things go into a web SDK? It might be backend server logic, unit and integration testing, translations, encapsulation, security, image handling, and the list goes on and on. Essentially, we're trying to help a dev be as productive as they possibly can be. They're building a production app; it has so many moving parts, and it can be really difficult to reason about how to string those together and how all the tiny decisions you make add up to performance outcomes. So if you're thinking, "this would bog me down, I want minimum opinions and maximum flexibility," I'd love for you to turn that around, imagine your entire team with minimum opinions and maximum flexibility, and think about how that would turn out. Often the sort of guardrails that keep people from making mistakes end up helping teams work better together too. We believe that most developers want to, and should, focus on building features.

That's why Google and Chrome are investing in Next.js and React. I want to be super clear: we want all of the frameworks and all of the SDKs to be successful. But we realized that we're a small team, and we needed to invest in one stack of tools in order to have the biggest impact that we could. That said, we're very careful to look at the whole stack, because every piece we can push lower down in the stack means the changes we've made will be easier for other stacks to pick up. If we make a change in Webpack, then everybody that uses Webpack can leverage it, as opposed to building it into Next.js itself. So we work with our partners to develop opinions, to test those opinions in production, to bake in good defaults for the SDKs, and then to lock in those defaults with guardrails.

We've had some initial perf wins that we're pretty happy about. Improved chunking reduced bundles by 25 to 75%, improving caching and making navigations faster. We were particularly happy to see that ship in Gatsby 2.
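The improved-chunking work mentioned above can be sketched as a webpack `splitChunks` fragment. The values here are illustrative, not the exact configuration Next.js or Gatsby ship; in a real project this object would sit under `optimization` in webpack.config.js.

```javascript
// Sketch of a "granular chunks" splitChunks fragment (values
// illustrative; Next.js and Gatsby tune these differently).
const splitChunks = {
  chunks: 'all',
  // Allow more, smaller initial chunks: cheap over HTTP/2 and much
  // better for caching than a few monolithic bundles.
  maxInitialRequests: 25,
  cacheGroups: {
    // Long-lived framework code gets its own highly cacheable chunk,
    // so app changes don't invalidate it.
    framework: {
      test: /[\\/]node_modules[\\/](react|react-dom)[\\/]/,
      name: 'framework',
      priority: 40,
    },
    // Modules shared by two or more pages land in a commons chunk,
    // so a navigation reuses code the browser already has.
    commons: {
      minChunks: 2,
      name: 'commons',
      priority: 20,
    },
  },
};

console.log(Object.keys(splitChunks.cacheGroups)); // ['framework', 'commons']
```

Because chunks are split along cache-stability lines, a deploy that only touches app code leaves the framework chunk byte-identical, which is where the caching and navigation wins come from.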
That's the sort of ecosystem benefit where, when we all work together in a commons, we see greater outcomes than any of us could get alone. We also shipped a polyfill chunk in version 9.3; it shaved about 16K off the baseline of a Next.js Hello World app. And we shipped CSS prioritized over JavaScript in 9.2, which reduced FCP by up to 40% in our partner apps. We're really excited to see that filter out to the rest of the Next.js ecosystem so that we can find out whether it has similar effects on other websites.

We also added out-of-the-box perf metrics. Shubhie talked a lot about the Web Vitals metrics that were recently released; we added them to Next.js out of the box. So an application that wants to measure its performance doesn't have to wire that up individually. It's baked in, and developers just need to turn it on and point the metrics where they'd like to collect them. The other piece is that we added a couple of new metrics: time to hydration, which measures something like interaction readiness, and time to route render, which is sort of an SPA navigation metric. Those are two areas where we hope the Web Vitals metrics can grow in the future, so we're testing them out ahead of time in Next.js apps.

We also tested React's concurrent mode with a partner app, Vercel, and we found a 50 to 75% improvement in interaction readiness for the first interaction. We found similar improvements in Total Blocking Time, and also saw improvements in Lighthouse scores and metrics. But we realized as well that the upgrade path is non-trivial for apps. Our hope is that we can make the upgrade smoother by baking some defaults into the SDKs so that other developers benefit from our experience upgrading to concurrent mode.

There's a trade-off between developer experience and user experience in a lot of cases. It's hard to fix loading performance wholesale without a significant change to developer experience.
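The out-of-the-box metrics hook described above works by exporting a `reportWebVitals` function from the app (in Next.js, from pages/_app.js), which the framework then calls once per metric. The sketch below simulates that flow with a local array as a stand-in for a real analytics sink; the sample metric values are invented.

```javascript
// Sketch of consuming out-of-the-box metrics à la Next.js's
// reportWebVitals hook. In a real app this function is exported from
// pages/_app.js and the framework invokes it as metrics arrive; here
// we collect into an array so the flow is visible.
const collected = [];

function reportWebVitals(metric) {
  // In production you would beacon this to an analytics endpoint;
  // metric objects carry at least a name and a value.
  collected.push({ name: metric.name, value: Math.round(metric.value) });
}

// Simulated calls, shaped like the objects the framework passes in
// (one standard Web Vital plus a framework-specific hydration metric).
reportWebVitals({ name: 'FCP', value: 1203.4 });
reportWebVitals({ name: 'Next.js-hydration', value: 321.9 });

console.log(collected.map((m) => m.name)); // ['FCP', 'Next.js-hydration']
```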
When Shubhie made these changes at Google, her team was able to make the head completely declarative and configuration-driven. That means any change to the head was explicitly declared in a manifest file, and the configuration had defaults, for example for fonts and CSS. It also had an ordering and loading strategy that was controlled by the framework. Developers like writing code, and they really don't like updating config files. So this was a trade-off that could be made at Google but might not stand up in the ecosystem, where an idiomatic developer experience is paramount for developer mindshare. The lesson we take from that is that it's important to maintain developer experience while at the same time improving user experience.

Developer experience is gold for the web. It means that developers will choose to work on the web and will continue working on the web once they've gotten started, and it gets new devs up and running and building great features faster. It's also important for content diversity. So clearly, compromising developer experience is a risk to the web as a platform, and it has to be very thoughtfully considered. For one, we need to absolutely stop asking developers to abandon their tools and frameworks.

So what's next in 2020 for our project? We're going to look at critical inlining of CSS. We're going to look at font optimizations and third-party script scheduling. We're also looking at streaming, and at more development-time guardrails built into Webpack, ESLint, and probably other bits of the stack as well. We're looking at images: sizing, scaling, and image placeholders. Also polyfill delivery, data fetching, and React's concurrent mode, which we'll be digging into further. Will all of that work? Almost certainly not, but we're excited to see some of it turn into performance improvements through the end of the year.
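The declarative, configuration-driven head described earlier might look something like the hypothetical manifest below: the developer declares what the page needs, and the framework decides ordering and loading strategy. All field names and strategy values here are invented for illustration; this is not a real framework's format.

```javascript
// Hypothetical manifest for a declarative, configuration-driven head.
// The framework, not the developer, turns each entry into the actual
// link/script tags and chooses ordering (fonts/CSS before hero image
// before non-critical scripts). Field names and values are invented.
const headManifest = {
  fonts: [
    { family: 'Inter', url: '/fonts/inter.woff2', strategy: 'preload' },
  ],
  css: [
    { url: '/styles/critical.css', strategy: 'inline' },
    { url: '/styles/main.css', strategy: 'async' },
  ],
  scripts: [
    { url: '/vendor/analytics.js', strategy: 'defer-after-interactive' },
  ],
};

console.log(Object.keys(headManifest)); // ['fonts', 'css', 'scripts']
```

The design point is that guardrails live in the schema: a developer can't accidentally put a third-party script ahead of critical CSS, because ordering isn't something they write by hand.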
So when I go back to this, we really think we can get to a place where much less assembly is required for developers on the web, and all different kinds of developers can produce great user experiences. Thanks so much for having us here today. These are our Twitter handles, and we'd love to talk to you more.