Hello, everyone. My name is Dylan. And my name is Michael. We're software engineers on eBay's web platform team and maintainers of Marko, a UI library that is a project under the OpenJS Foundation and is also heavily used by eBay for our e-commerce platform. Thanks, everyone, for taking the time to join us today. We're going to be talking about something Michael and I have spent a lot of time thinking about. We hope that with this talk we can get the conversation moving around some optimizations we believe have been left on the table. The optimizations we're about to dig into do not necessarily apply to all applications. Our primary focus for these optimizations is multi-page applications, but that's not to say that there aren't learnings to be had either way. Specifically, we're going to talk a lot about hydration. If you haven't heard the term before or just need a quick refresher, hydration is a term for the process by which modern frameworks allow you to mount your application on top of a page that was initially rendered on the server. This means the framework will handle rendering a complete HTML page from the server and pick up seamlessly in the browser, initializing application state and registering event handlers. So hydration has been around for quite a while now and it's been popularized by modern frameworks such as React. It exists primarily because we've shifted from building websites with a sprinkling of JavaScript on top of a mostly server-rendered application to our modern frameworks, which allow us to specify the view as a function of our application state. We've grown accustomed to the declarative and expressive templates, and it also means we get to shift the work of attaching to and updating the DOM to our framework of choice. The promise is that the framework will handle these things so that we can focus on managing the state and business logic of our applications.
And for the most part, our modern frameworks deliver client-side performance which is perfectly acceptable for many common use cases. However, when it comes to the initial load, performance falls short. When it comes to performance, ultimately what matters is how it affects our users' experience. Google recently announced a set of metrics called Web Vitals, which focus on three aspects of delivering a great user experience: loading, interactivity, and visual stability. Visual stability is largely determined by the application logic, but loading and interactivity are heavily impacted by your framework of choice. So these are the metrics we'll focus on today. The primary metric associated with loading is Largest Contentful Paint, or LCP. LCP measures the render time of the largest content element visible within the viewport. This is notably different from similar metrics like First Contentful Paint, which just measures the first time anything is visible on the screen. The goal of LCP is to identify when the main content of a page is visible to the user. The primary metric for interactivity is First Input Delay, or FID. FID measures the time from when a user first interacts with a page to the time when the browser is actually able to respond to that interaction. We care a lot about the first input from the user because the biggest interactivity issues we see on the web today occur during page load, largely due to modern hydration. FID is what's known as a field metric. It measures real user interactions, as this is what ultimately matters. But when comparing approaches from a framework perspective, synthetic metrics are typically more helpful. We'll take a look at Total Blocking Time (TBT) and Time to Interactive (TTI), which correlate closely with FID. Looking first at LCP and how quickly we can get content visible to the user, HTML is the absolute fastest way to do this on the web. HTML is able to be parsed and rendered as the browser receives it.
However, one common problem with modern frameworks is that they have a subpar solution when it comes to rendering on the server. In many cases, this unoptimized rendering means that a server can easily get overloaded, causing a significant amount of time between the request coming into the server and the server responding with any HTML. This can end up negating all performance benefits, making it actually faster to just serve an empty page and let the browser do the rendering, which is really unfortunate. Our other two metrics, TBT and TTI, don't fare much better in the current web dev world. Modern hydration approaches require an insane amount of JavaScript, and unfortunately, JavaScript's not cheap. Besides downloading the JS, we also have to parse it and execute it. And unlike HTML, execution can't happen as it's being downloaded. This is especially troublesome for lower-end devices, but it's actually not that uncommon to find sites that are taxing even our most maxed-out developer machines. The main issue from the user's perspective is that even if we can address LCP and get content to display quickly, while the JS downloads, parses, and finally executes, the user is unable to interact with what is visible, and is probably angrily tapping on their screen waiting for hydration to finally kick in. I believe this poor first load experience is what has caused single-page applications, or SPAs, to become so popular. The idea is, even if the first load is terrible, at least subsequent loads will be fast, because the code has already been loaded and can be displayed without another round trip to the server. SPAs do have some benefits, but they actually further increase the amount of JS needed in the browser and come with a new set of challenges, which are outside the scope of this talk. So this sounds bad, and it is. Modern frameworks create modern problems. But there are actually fast sites out there that use modern frameworks. It's just that it's currently a very manual process.
Some meta-frameworks out there, such as Next.js, are beginning to put in some of this work for you, but they still inherit a lot of the problems of the underlying framework. Because of this, there's a large number of sites out there that don't perform well at all. Optimization through static analysis is really appealing, because it's something that can happen automatically from the application developer's perspective, using information that's already present within the code that they're writing. The compiler can put in the extra work at build time to make things faster and even eliminate code altogether. Template-based frameworks like Marko, Svelte, and Vue are a bit better poised to take advantage of these types of compile-time optimizations. But other frameworks may be able to leverage some of these ideas as well. It's also worth pointing out that we're going to primarily be looking at these optimizations from the perspective of a multi-page app, or what you might call a traditional website. That said, many of these ideas can help improve the first load experience of any website. When it comes to LCP, the question is: how can we optimize server rendering so we can consistently deliver contentful HTML to the user's browser in a timely manner? Some frameworks out there recommend using a headless browser on the server. This is obviously expensive, but even frameworks that create a VDOM on the server aren't ideal. On the server, the end goal is always going to be building an HTML string. These additional abstractions don't help us here at all and just end up consuming additional memory and CPU cycles. The templates associated with these frameworks already closely mirror the HTML we need to generate. We shouldn't need to spin up a browser or create a bunch of objects representing the DOM just to generate some HTML. Instead, we can compile to basic string concatenation, which will always be the fastest implementation on the server.
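To make the string concatenation idea concrete, here is a minimal sketch of what a server-compiled template could look like. It is not the actual output of Marko or any other compiler: the `renderGreeting` function, the `GreetingInput` type, and the `escapeHtml` helper are all hypothetical names, and the template it stands for (a paragraph greeting a `name` value) is assumed for illustration.

```typescript
// Hypothetical server output for a template like: <p>Hello ${name}!</p>
// No DOM objects or virtual nodes are created; the result is just a string.

const ESCAPES: Record<string, string> = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;",
};

// Assumed helper: escapes characters that are unsafe in HTML text.
function escapeHtml(value: unknown): string {
  return String(value).replace(/[&<>"']/g, (ch) => ESCAPES[ch] ?? ch);
}

interface GreetingInput {
  name: string;
}

function renderGreeting(input: GreetingInput): string {
  // The static parts are plain string literals known at build time, so they
  // need no escaping at runtime. Only the dynamic value is escaped.
  return "<p>Hello " + escapeHtml(input.name) + "!</p>";
}
```

Calling `renderGreeting({ name: "World" })` returns `<p>Hello World!</p>`, and markup characters in `name` come out as entities rather than as tags.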
There are actually a few frameworks out there that are already compiling server-optimized versions of their templates, including Marko and Svelte. And in Marko's case, we even compile these to write to a stream. This allows us to flush out partial HTML while certain parts of the page may be waiting on data from an async service. Compiling to strings has the benefit that certain work, like escaping content, happens at compile time for static values. When HTML generation happens at runtime from an object tree, there's typically no differentiator between static values and potentially unsafe dynamic values, so everything needs to be escaped, which can end up getting quite expensive. To see how Marko does this, let's take a quick look at an example template and its compiled output. It's pretty simple: defining a value called name and interpolating that into some paragraph content. When we compile for the server, we can see that we've got string concatenation, and some utilities are pulled in to handle escaping dynamic content. In the browser, compilation looks a little different. Marko creates a VDOM in the browser, similar to other frameworks, to do lightweight comparisons across renders. That's not to say that Marko is outputting the absolute best performance in either environment, but it is pretty close on the server, and Marko is not artificially bottlenecked in either environment. The nice thing here is that Marko is able to compile your template to JavaScript which is optimized to perform some pretty radically different tasks. This has a huge impact on rendering performance on the server. Still, there's room for improvement. For most components written in modern frameworks, there exists code that's only ever going to run in a single environment. Here are some examples. Event handlers never need to be called on the server, so creating those functions there is wasteful.
There usually exist lifecycle events as well that only ever happen in the browser, or effects if you're using hooks. These should also be stripped from the server-side code. It's worth noting that serving a static site is an option for sites like blogs, where the content changes infrequently. Now that we've addressed how we can get the content to display quickly for our users, let's turn to hydration to see what we can do to allow our users to more quickly interact with the page. The primary issue with hydration is that there's an incredible amount of duplication. The exact process for hydration varies from framework to framework, but at a high level, all modern frameworks require three main elements for hydration. First, the HTML markup, which is rendered from the server or served statically. Second, the JavaScript necessary to build the initial application state and to mount on top of this HTML. Finally, any data that's needed to create the initial application state. This is often serialized as JSON. Looking at some code will probably help here and make it more obvious where the duplication exists. In our example, you can see that a lot of static content exists in at least two of the elements of hydration we were just talking about. For the most part, static content such as the DOM structure is represented in both the HTML and the JavaScript. Also, the dynamic content, typically the initial input to the view, is represented in both the HTML and the data. Maybe in our example, "world" may only ever render on the server. In fact, for many apps, a large chunk of this rendering only needs to happen once and is never updated. For the remainder of the talk, we'll use the term immutable to refer to content that, once rendered, can never be updated. So how can we start to optimize this process for real applications? Let's quickly take a look at a very simple hypothetical template.
In it, we can see that there is some stateful value, denoted by the dollar sign, which is used to update some text below. First, we'll look at how much static content exists here. You can see that all of the DOM structure for the page is actually static, as well as some text content and attributes. Taking this a step further, though, we can see that the footer element contains some text which, although dynamic, is not tied to any state. That content also cannot be updated by the browser, and we consider it immutable. Most multi-page applications have a bunch of content on the page that, once initially rendered, is immutable. For this content, it should be possible to completely skip the hydration process, meaning that rendering logic, data, et cetera, is not duplicated. One of the best ways to improve time to interactive is to skip hydrating unnecessarily. Okay, so we can skip hydrating the immutable parts of the page, but how can we optimize the dynamic parts? Looking at the simplest case, you might have some dynamic text and attributes, or even just an event handler that needs to be attached. There would be nothing too fancy about it if you were hooking up the code yourself. You'd probably write a solution that would use orders of magnitude less JavaScript than any modern hydrated framework would. For this simple case, the key is that the layout of the template is still immutable once rendered from the server, meaning no new elements are added, moved, or removed. Since the layout is immutable, we should be able to avoid sending any layout to the browser. The server has already done that work. Text content, attributes, et cetera, may still be updated, but to do that, you do not necessarily need to render a full section of layout in the browser just to hydrate it. You might be thinking, well, what about events? Well, hydration as it stands is far from the simplest approach to attaching events to the DOM.
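As a rough illustration of hooking this up by hand, the sketch below attaches one event handler and updates one piece of text, without rendering any layout in the browser. The function name `wireLiveText`, the `InputLike` and `TextLike` types, and the `data-hydrate` marker attribute are all hypothetical; the structural types exist only so the sketch does not depend on DOM typings.

```typescript
// Minimal structural types covering only what this sketch touches.
interface InputLike {
  value: string;
  addEventListener(type: "input", listener: () => void): void;
}

interface TextLike {
  textContent: string | null;
}

// Attach a single listener that mirrors the input's value into existing text.
// The server already rendered both elements; nothing is re-rendered here.
function wireLiveText(input: InputLike, output: TextLike): void {
  input.addEventListener("input", () => {
    output.textContent = input.value;
  });
}

// In a browser this might be called as follows, assuming the server marked
// the two elements with a (hypothetical) data-hydrate attribute:
//
// wireLiveText(
//   document.querySelector<HTMLInputElement>("[data-hydrate='search-input']")!,
//   document.querySelector<HTMLElement>("[data-hydrate='search-text']")!,
// );
```

The browser-side cost here is a couple of element lookups and one listener, which is the kind of output a framework could aim to generate when it knows the layout is immutable.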
A much simpler approach for both of these would be to simply mark our elements with some kind of identifier. Then we can easily grab a reference to that element, add any event handlers, and also update its attributes and text content. We can actually represent the required JavaScript for the example template on screen pretty easily in just a few lines of code. We grab a reference to the DOM nodes that we need: first, the search input, then the dynamic search text. Finally, we add an event listener and update the text content as the input value changes. Another way we can reduce the impact of hydration, especially during page load, is by deferring as much of the hydration work as possible. Ideally, elements would be hydrated in batches, reducing the total blocking time and allowing the user to interact with the page during the hydration process. A few frameworks are already doing this. Also, content could be hydrated, and even downloaded, lazily. Frameworks can take on part of this work for us automatically, although in many cases it will be a manual process. Either way, it's important for you to keep in mind. You can also lazily hydrate content based off various user interactions, or even just by detecting if the element is in the viewport. Everything we've mentioned so far can be manually optimized, but it probably already sounds like a lot of work. Ideally, our frameworks need to start picking up the slack here. So what are our frameworks doing today? One project that has taken a stab at this issue is next-super-performance. This module is a community plugin for Next.js that uses Preact and exposes some APIs that optimize the hydration process. Preact itself offers respectable performance on the server side and the client side while having a significantly smaller bundle size. This plugin uses an approach they call partial hydration. In essence, it allows developers to explicitly mark a subset of components as hydratable.
It then takes care of serializing the initial data for these components and allows immutable top-level components to be rendered only on the server. Another interesting project in this space comes from the Vue community. It is called vue-lazy-hydration. And as you can probably guess just based off the name, it provides a nice component-based API to explicitly configure when a component should be hydrated. It also lets you skip hydration for a section of code, but more interestingly, it allows you to explicitly define when that code should hydrate. You can opt into hydrating when a component is interacted with, when it's visible, or even when the browser is idle. In Marko, the framework we maintain, we've put a lot of thought into the hydration process. eBay itself is a multi-page application, and the performance of our pages, while not perfect, is extremely important to us. Marko is the only framework that implements some of these hydration-specific optimizations out of the box. It ships with support for partial hydration, but it goes one step further and automates the process. Marko actually takes the entire page's componentry into account when bundling and rendering. It uses this information to determine where the top-level stateful components exist. If a component does not contain state while walking the components on the page, it can safely be omitted, since it will always render an immutable HTML structure. Because this is all handled by the framework, it has some unique advantages over other solutions. If two hydrated components share serialized data, it de-dupes it. Marko's APIs are designed to avoid creating input which is not serializable. For example, passing a function as input, which is typically done for event handlers, would not be properly serialized by other solutions. Marko also takes care of tracking what all the browser dependencies are for an entire page.
It will analyze the componentry and ensure that only hydrated components and their children end up in the browser bundle. Marko currently automates a fair amount of what is being done by add-ons in other communities, thanks in part to its compiler. We believe that without this automation, maintaining which components need to be hydrated becomes a headache. Since developers typically like to avoid headaches, it usually means that they avoid doing these manual optimizations in the first place. Now let's take a step back from the solutions in the wild and think more about what room we have to continue improving hydration. One issue is that all of these solutions treat components as the source of truth for hydration. Although in some cases that's going to get you quite far, for most applications the top-level immutable content doesn't stop at component boundaries. Let's think more about where the truly immutable content is for a page. Ignoring component boundaries, ultimately it boils down to this: can this section of the template be re-rendered on the client side? Ignoring hydration, the only way content will need to be rendered from scratch in the browser is if it's conditionally displayed. In practice, this happens in one of two ways. Is the content under an if statement, or some other kind of conditional, that can be toggled in the browser? Or is the content within a loop that can be appended to in the browser? Ultimately, if the content is outside stateful control flow, then it can't be re-rendered from scratch in a hydrated application. That is a much more granular boundary than relying on components, since with components, even if a single piece of the template is dynamic, the whole thing becomes dynamic. So now that we know where the hard stop is for things that need to be able to render on the client side, how can we determine exactly where stateful control flow exists? Static analysis. This is easier done in some templating languages than others.
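A toy version of that analysis can be sketched as a walk over a template tree that reports which `if` and `for` blocks read stateful values. Everything here is hypothetical: the `TemplateNode` shape, the `uses` lists (assumed to be filled in by a parser), and the function name `findStatefulControlFlow`. It only looks at direct reads; a real analysis would also need to follow values derived from state.

```typescript
// Hypothetical, simplified template tree. "uses" lists the variable names
// that a node's expression reads.
type TemplateNode =
  | { kind: "element"; tag: string; children: TemplateNode[] }
  | { kind: "text"; uses: string[] }
  | { kind: "if" | "for"; uses: string[]; children: TemplateNode[] };

// Returns the control-flow nodes whose condition or iterable reads state.
// Only the content under these nodes can ever be rendered from scratch in
// the browser, so they act as the boundaries for full hydration.
function findStatefulControlFlow(
  root: TemplateNode,
  state: ReadonlySet<string>,
): TemplateNode[] {
  const found: TemplateNode[] = [];
  const visit = (node: TemplateNode): void => {
    if (node.kind === "text") return; // text never introduces new layout
    if (node.kind !== "element" && node.uses.some((name) => state.has(name))) {
      found.push(node);
      return; // everything beneath must be renderable in the browser anyway
    }
    node.children.forEach(visit);
  };
  visit(root);
  return found;
}
```

In this sketch a loop over non-stateful data is not reported, which matches the idea that content outside stateful control flow never needs to be rendered from scratch on the client.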
Essentially, it means tracking where the stateful data within the template is used and, ultimately, whether it ends up either directly or indirectly as part of some computed data used by control flow. In practice, when analyzing a template, you can tell which parts of the template will be updated, as they'll be tied to some state. By tracking whether a state is used to update text content or an attribute, we can have our framework inline code to perform targeted updates just for that value, much like you'd write yourself. By also tracking state that makes its way into control flow, that is, the if statements or the for loops, we can use this as our new full hydration boundary. No framework is currently doing this level of granularity for hydration, but we do have plans to incorporate it into future versions of Marko. With proper tracking of state flowing down the page and component tree, we should be able to statically determine significantly more dead code, allowing us to skip even more of the hydration process. Another technique yet to be explored would be to use static analysis, potentially across templates, to determine which input or props are actually used, essentially running dead code removal on the data that is serialized for each component. In most frameworks, client-side logic is clearly separate from server-side logic, and it's possible to determine which parts of the templates can be updated and which input data is needed on the client, that is, data used by events or lifecycle events. Previously, we mentioned that we can use this to our advantage simply to remove code that will never run in our target environment and to optimize hydration for immutable content. However, we can also use this to track which input actually needs to be serialized for hydrated components. For each of these components, it's possible to check which inputs are used in events, lifecycle methods, or effects.
Every other input is actually not needed on the client side, which, if tracked correctly, allows us to prune the serialized data even further. Hopefully, all of this has helped show just how much dead code and extra work our modern frameworks currently do when it comes to hydrating. These optimizations are just a sample. One of the best features of the web platform is its ability to provide an instant experience to our users. And we think that it's time our modern frameworks once again start competing on this fundamental feature of the web. It's time to ask our frameworks to produce optimized applications across the stack, and it's time we finally outperform that old WordPress blog. Thanks, everyone, for taking the time to hear us out. We hope we've left you with some ideas on how to optimize your own app, and we look forward to continuing this discussion with all of you.