My name is Alex Moshchuk. I work on the Site Isolation team, which is part of the wider Chrome Security team. Before we begin, I just want to give credit to Charlie Reis, Camille Lamy, and Nasko Oskov, who gave this talk on previous occasions and made the original version of these slides. Also note that this talk is about 15 minutes shorter than the previous version, so I'll go faster and try to stay at a higher level. If you need more details, I encourage you to check out the past versions as well.
In this talk, we'll cover how navigation works. We will start with the simple case of what happens when you type something into the omnibox. How does that generate a network request? How do we choose a renderer process for it? What does it mean to commit a navigation? How does that differ from load stop, when you load resources like images and so forth? We'll start with that, and at the end, if we have time, we'll get into some more advanced cases as well.
Now, this talk describes the current state of the world, where the M76 milestone is the current stable release and M78 is tip of tree. There are a few notable recent changes. First, the navigation code is in flux in terms of moving from classic IPC to Mojo. At this point, most of the important navigation messages use Mojo, but you will still see classic IPCs here and there. Second, Site Isolation, which you might have heard about, is now the default behavior on desktop platforms after many years of development. This affects how we choose a renderer process, and I'll get into that a little later. Finally, the network service is on by default, and this is how the navigation code interacts with the network stack.
Now, navigation involves coordination among many processes. John covered this in his talk already, but basically the browser process is the privileged one: it interacts with the user, controls privileged resources like disk, and manages all the other processes.
In terms of navigation, this is where all the important decisions get made. Renderer processes take HTML and JavaScript and produce the rendered pages that users see. They run in a sandbox because we don't trust content coming from the web. For navigation, one important thing to realize is that when you navigate from one page to another, you can actually swap processes under the hood and go from one renderer process to another.
So now, let's go through what happens when you type something into the omnibox. First of all, you could type either a URL or a search query. A search query we first need to convert into a URL that queries the default search provider. Once we have the URL, we give it to the current WebContents to actually start the navigation. WebContents is the class that basically represents a tab in most cases. There are some other uses of it, such as extensions, pop-ups, Chrome apps, and so forth, but you'll be fine if you just think of it as the current tab, specifically the web content area, not including the tab title and not including the omnibox itself. As you navigate in the same tab from one page to another, the WebContents object stays the same.
Each WebContents has another object called NavigationController. When you do an omnibox navigation, it goes through NavigationController's LoadURL function. NavigationController is basically your session history: it keeps track of where you've been in a particular tab so that you can use your back and forward buttons to navigate back and forth. Note that this is different from browsing history, which is where you've been across all of your tabs.
Next, each navigation targets a particular frame, usually the main frame. To represent frames, we use a browser-side object called RenderFrameHost and a renderer-side object called RenderFrame, and they talk over, as I mentioned, classic IPC or Mojo.
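The session history role described above for NavigationController can be illustrated with a toy Python model. This is only a sketch, not Chromium code: the class and method names are hypothetical, and the rule that a new navigation discards forward entries is common browser behavior assumed here rather than something stated in the talk.

```python
class SessionHistory:
    """Toy per-tab session history (hypothetical, not NavigationController)."""

    def __init__(self):
        self.entries = []   # URLs visited in this tab, oldest first
        self.current = -1   # index of the current entry, -1 when empty

    def load_url(self, url):
        # Assumption: a new navigation drops any forward entries first.
        del self.entries[self.current + 1:]
        self.entries.append(url)
        self.current = len(self.entries) - 1

    def can_go_back(self):
        return self.current > 0

    def can_go_forward(self):
        return self.current < len(self.entries) - 1

    def go_back(self):
        if not self.can_go_back():
            raise IndexError("no earlier entry in this tab")
        self.current -= 1
        return self.entries[self.current]

    def go_forward(self):
        if not self.can_go_forward():
            raise IndexError("no later entry in this tab")
        self.current += 1
        return self.entries[self.current]
```

Each tab would own one such object, which is why going back in one tab never affects another tab.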
This Foo and FooHost naming is very common in Chrome: you have a FooHost object on the browser side and a Foo object on the renderer side, and the two talk to each other.
Okay. So now I'm going to show you a timeline of roughly what happens in this navigation. It starts in the browser process on the UI thread. This is where we render UI, but it also happens to be where the bulk of the navigation code lives. This is where WebContents, RenderFrameHosts, and all of these objects live. In blue we have the network service, which in most cases today runs in a separate process. In green we have our renderer process.
The first step, as I mentioned, is to convert your search query to a URL if necessary. Once we have the URL to navigate to, we call a function called BeginNavigation. The next step is to hop over to the network service, which will actually start the URL request. Once we hear back the response, we jump back to the browser process UI thread and choose the renderer process that will show the page. After that, we tell that renderer process to commit the new navigation. The renderer process will create a new HTML document, replace the old one with it, and then send an ack back to the browser process saying, yes, I did commit this new navigation.
At this point, the navigation flow is basically finished, but nothing is shown in the page yet. The throbber is still spinning, and the next step is the loading phase. This is where the renderer reads the response body, parses it, and renders it into a web page. Once that's done, it lets the browser know with another message saying, okay, load stop: this page has now finished loading. So this is the brief overview, and I'm going to dive into all of these stages in more detail as the talk goes on.
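As a compact recap of the timeline just described, the stages and where each one runs can be written down as plain data. This is an illustrative Python sketch: the stage labels paraphrase the talk, and the `Where` enum and `process_hops` helper are hypothetical names.

```python
from enum import Enum


class Where(Enum):
    BROWSER_UI = "browser process (UI thread)"
    NETWORK = "network service"
    RENDERER = "renderer process"


# Stages of a simple omnibox navigation, in order, as described in the talk.
TIMELINE = [
    ("convert input to URL", Where.BROWSER_UI),
    ("BeginNavigation", Where.BROWSER_UI),
    ("URL request", Where.NETWORK),
    ("choose renderer process", Where.BROWSER_UI),
    ("commit navigation", Where.RENDERER),
    ("receive commit ack", Where.BROWSER_UI),
    ("load page", Where.RENDERER),
    ("receive load stop", Where.BROWSER_UI),
]


def process_hops(timeline):
    """Count how often consecutive stages run in different places."""
    return sum(1 for (_, a), (_, b) in zip(timeline, timeline[1:]) if a != b)
```

Counting the hops makes the point of the timeline concrete: even the simplest navigation bounces between the browser, the network service, and the renderer several times.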
But before doing that, just a brief mention of how you would actually write code that interacts with navigations. Most of the time, you'll be interested in doing one of two things: either you will want to observe navigations, or you will want to control or modify them somehow.
For observing navigations, we have a class called WebContentsObserver. This is an interface that you can implement, and it will give you read-only notifications about the various stages of the timeline that I've just shown you. The first four events are navigation events, and they provide you an object called a NavigationHandle that you can query for details like what the URL is, which frame is navigating, and all kinds of other details. The last two events, DidStartLoading and DidStopLoading, are about loading. They basically bracket when the throbber starts and stops spinning.
Now, these are supposed to be read-only events. You're not supposed to cancel a navigation and start a new one from them. If you want to do something like that, we have another mechanism called a NavigationThrottle. Throttles are objects that live on the UI thread, and they allow you to defer (meaning pause) or block navigations at various stages. They also give you a NavigationHandle that you can use to query for context. The way you use them is to subclass the public NavigationThrottle class and override some of its methods. Before the navigation enters a particular stage, you get a callback, and from there you return a value that indicates whether the navigation should proceed, whether it should be canceled, maybe with a specific error code or even some error page HTML, or whether you want to defer the navigation and answer later. Deferring is useful if you want to do some async work before you make the final decision, for example. This is a pretty powerful mechanism.
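The proceed, cancel, or defer decision can be sketched with a small Python model. This is a toy under stated assumptions, not the real NavigationThrottle API: the class names, the `will_start_request` signature, and the "first non-proceed answer wins" rule are all hypothetical, and resuming a deferred navigation is not modeled.

```python
from enum import Enum
from urllib.parse import urlsplit


class ThrottleResult(Enum):
    PROCEED = "proceed"
    CANCEL = "cancel"
    DEFER = "defer"


class Throttle:
    """Hypothetical base class: lets every navigation proceed by default."""

    def will_start_request(self, url):
        return ThrottleResult.PROCEED


class BlocklistThrottle(Throttle):
    """Cancels navigations to hosts on a blocklist."""

    def __init__(self, blocked_hosts):
        self.blocked_hosts = set(blocked_hosts)

    def will_start_request(self, url):
        host = urlsplit(url).hostname
        if host in self.blocked_hosts:
            return ThrottleResult.CANCEL
        return ThrottleResult.PROCEED


class AsyncCheckThrottle(Throttle):
    """Defers every request, as a throttle waiting on async work would."""

    def will_start_request(self, url):
        return ThrottleResult.DEFER


def run_throttles(throttles, url):
    """Ask each throttle in order; the first non-PROCEED answer wins."""
    for throttle in throttles:
        result = throttle.will_start_request(url)
        if result is not ThrottleResult.PROCEED:
            return result
    return ThrottleResult.PROCEED
```

For example, `run_throttles([BlocklistThrottle(["evil.example"])], "https://evil.example/")` returns `ThrottleResult.CANCEL`, while a URL on any other host proceeds.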
Throttles are used for many things. One example is security checks: if we want to verify that the URL you're going to is actually safe to visit, we might defer the navigation until we have verified that it is, and then resume or block it.
Okay. So now let's go through some of these steps in a little more detail. BeginNavigation, as I mentioned, is what kicks off the navigation flow. This is where we create the NavigationHandle object, which will track all the state of the navigation until the navigation commits. The first thing we do afterward is ask all the throttles that implement WillStartRequest whether they want to allow this navigation to start. If all of them say yes, we create a NavigationURLLoader object that will manage the network request and interact with the network service to make it. We also dispatch some observer events, DidStartLoading and DidStartNavigation, which you can use if you're interested in observing this stage.
The next step is for us to hop to the network service to actually make the request. I'm not going to go into much detail here; there's a networking 101 talk, I think, that you can check out for that. But basically, this does things like looking up the IP address via DNS. If it's an HTTPS URL, we'll want to establish the TLS connection. We construct the request, attach cookies based on where it's going, and send it out.
Now, the server may respond by saying, actually, this is not what you're looking for, go to this other URL instead. This happens with status code 302 or another 300-something code. We call this a server redirect, and if this happens, we hop back to the UI thread.
We ask all the throttles that implement WillRedirectRequest whether they want to allow the redirect to proceed, and if all of them say yes, we hop back to the network service and reissue the request to the new URL. We also dispatch the DidRedirectNavigation WebContentsObserver event. Note that this can happen multiple times, so you could have a chain of these redirects. Also note that this kind of redirect is different from a so-called client redirect, which is a completely different thing. A client redirect is when a document actually commits, but the first thing it does is instruct us to navigate somewhere else, for example by reassigning window.location. In terms of how server redirects look on the timeline, they just add another step: this hopping back and forth before we proceed with the next step.
Once we are past all the redirects and have the final URL, we are probably starting to receive data for the new document, except that this is not always the case. In particular, it is not the case if you are navigating to something like a zip file or an unknown file type, or if the server responds and says this is an attachment that you should not render but download instead. In all of these cases, this becomes a download: we hand the response over to the download manager, no navigation actually happens, and we leave the old document in place. The same thing happens if you get a 204, which is a little-known HTTP response code that basically says no content. The server says, I got you, but I'm not going to give you anything new to show.
But once we're done with all of that, we do expect the new document, after some additional security checks, to commit and replace the previous one. So at this point, we hop back to the UI thread.
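The redirect chain and the "commit, download, or no content" decision described above can be sketched in Python. This is a simplified illustration, not the network service's logic: `fetch` is a hypothetical callback returning `(status, headers)`, the redirect cap and the set of renderable MIME types are assumptions for the sketch, and header names are matched exactly as written.

```python
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}
MAX_REDIRECTS = 20  # assumed cap for this sketch
RENDERABLE = {"text/html", "text/plain", "image/png"}  # illustrative subset


class RedirectBlocked(Exception):
    pass


def follow_redirects(url, fetch, allow_redirect=lambda old, new: True):
    """Follow server redirects; return (chain_of_urls, status, headers)."""
    chain = [url]
    while True:
        status, headers = fetch(url)
        location = headers.get("Location")
        if status not in REDIRECT_CODES or location is None:
            return chain, status, headers
        new_url = urljoin(url, location)  # Location may be relative
        # allow_redirect stands in for the throttles' redirect check.
        if len(chain) > MAX_REDIRECTS or not allow_redirect(url, new_url):
            raise RedirectBlocked(new_url)
        url = new_url
        chain.append(url)


def classify_response(status, headers):
    """Decide what the final response turns into."""
    if status == 204:
        return "no_content"  # the old document stays in place
    mime = headers.get("Content-Type", "").split(";")[0].strip().lower()
    disposition = headers.get("Content-Disposition", "").lower()
    if disposition.startswith("attachment") or mime not in RENDERABLE:
        return "download"    # handed to the download manager
    return "commit"
```

Only the `"commit"` outcome continues to the process selection step; the other two leave the current document untouched.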
The next step is for us to choose the renderer process that will be used for this navigation. This is a critical step, and understanding it is essential to understanding the whole Chrome security model. There are a bunch of policies that govern how this is done, and I will just go over some of the most important ones.
Some pages must share a renderer process, in particular same-site pages with window references to each other. So if I have an a.com page that has a.com iframes, or if it uses window.open to open another a.com tab, the web platform allows all of these frames to synchronously access each other's DOM, which we cannot support if we put them in separate processes.
Other pages should never share a renderer process. For example, if you navigate to evil.com and then to an internal settings page, we don't want those two to share a process. If they did, an attacker who took over the renderer process would get to control your settings page, manipulate it, make use of its privileges, and basically take over the browser at that point. So we never share a process between privileged pages like that and untrusted web content. Site Isolation takes this a step further and says that different sites also should never share a process. That means, for example, that evil.com and bank.com should never end up in the same process.
For the remaining cases, we have a little bit of flexibility, and in general our strategy is to swap processes when possible to get a clean slate. For example, if I open 10 brand-new tabs and navigate them all to some google.com pages, we try to use multiple processes for those rather than putting them all into one bloated process. In the early days of Chrome we did have that policy, and it did not work out so well. But we also have a process limit that depends on how much memory your machine has.
Once we cross that process limit, we do try to reuse existing processes when possible. However, this is subject to security: we will only reuse a process if its security properties are compatible with the new page. So crossing the limit will not allow an evil.com page to share a process with bank.com. In that sense, the process limit is actually soft. We will go over it when needed for security.
To make some of these process selection decisions, we use a class called SiteInstance. A SiteInstance basically represents a group of documents that share a process. As I mentioned, same-origin documents that can script each other always end up in the same SiteInstance. So in this example, if a.com window.opens another a.com popup, they will end up in the same blue SiteInstance. With Site Isolation, each site gets its own SiteInstance, and different sites never share a process. So if b.com window.opens c.com, even though they can talk to each other, we put them in separate processes and separate SiteInstances. Because they're cross-origin, they can only communicate using a very small async API, like postMessage, and in this case we actually route the postMessage from one process to the other and back.
Without Site Isolation, a SiteInstance may actually group multiple sites together. This used to be a lot more common before Site Isolation. It's much less so on desktop now, but you might still see this on Android, where we don't have Site Isolation deployed yet, although that is hopefully about to change very soon.
The last thing I'll mention here is that you can have multiple instances of the same site. If I press Ctrl+T to open a new unrelated tab and navigate that to a.com, it will end up in a separate green SiteInstance, which is different from the blue one even though both are for a.com.
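The SiteInstance rules above (same site within a group of related windows shares an instance, different sites do not, and an unrelated tab gets its own instance) can be modeled with a short Python sketch. Everything here is a simplification: `RelatedWindowGroup` is a hypothetical name for "windows that hold references to each other", `site_of` crudely takes the last two host labels instead of using a real public suffix list, and every instance gets its own process, so process reuse under the process limit is not modeled.

```python
import itertools
from urllib.parse import urlsplit

_process_ids = itertools.count(1)


def site_of(url):
    """Crude site computation: scheme plus the last two host labels."""
    parts = urlsplit(url)
    labels = (parts.hostname or "").split(".")
    return f"{parts.scheme}://{'.'.join(labels[-2:])}"


class SiteInstance:
    """Toy stand-in: documents of one site that share one process."""

    def __init__(self, site):
        self.site = site
        self.process_id = next(_process_ids)  # one process per instance here


class RelatedWindowGroup:
    """Hypothetical group of windows that can reference each other."""

    def __init__(self):
        self._by_site = {}

    def site_instance_for(self, url):
        site = site_of(url)
        if site not in self._by_site:
            self._by_site[site] = SiteInstance(site)
        return self._by_site[site]
```

In this model, an a.com page and the a.com popup it opens ask the same group and get the same instance, while a fresh tab creates a new group and therefore a different a.com instance.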
Okay, so once we've picked the SiteInstance and the process, the navigation may either commit in the same process that you were in, or it may need to swap to a new one. If you're staying in the same process, the navigation will probably commit in the same RenderFrameHost. So if you go from a.com/1 to a.com/2, that will keep committing in the same RenderFrameHost that you were on. This is often confusing to people. Longer term, we want to change this and swap RenderFrameHosts on every navigation that goes to a different document, so that it essentially becomes a RenderDocumentHost, but this is not the case yet.
If the navigation is cross-process, then we must swap to a new RenderFrameHost, since a RenderFrameHost is associated with a particular process. In this case, the old RenderFrameHost stays visible until the new one commits, and at that point the new one becomes the current RenderFrameHost.
We also have an interesting optimization here. Because the network request and the process startup can both be slow, we try to do them in parallel. When the navigation initially starts, we speculatively start the process that we think we will need. Then, when we hear back the response, hopefully the process is ready to go for the commit. This guess may turn out to be wrong if we have redirects, because redirects may change the URL and thus the process we're going to. For that reason, you should never rely on the speculative RenderFrameHost, because it can change before commit. The other thing to note is that the first point at which we have finalized the RenderFrameHost, the process, and the SiteInstance for a navigation is called ready to commit. If you want to observe that, you would use ReadyToCommitNavigation from WebContentsObserver.
Just to show how the speculative startup looks on the timeline: at BeginNavigation, the first thing we do is the speculative startup, and then we proceed with the network request. Hopefully, by the time we commit, the process is ready to go.
Okay, so now we are at the step where we have the renderer process ready and we need to tell it to commit the navigation. That's pretty simple: we just send a commit IPC with a whole bunch of information about what to commit. Along with that information, we send a Mojo data pipe, which is what the renderer is going to use to read the response body. We write the response body to one end of the pipe and pass the other end to the renderer with the commit message, and the renderer reads the data directly from there.
At that point, the navigation needs to commit in Blink, the rendering engine, and Blink will manage that transition. It creates a new HTML document and replaces the old one with it. The way this happens is currently way more complicated than it should be. It involves classes like FrameLoader and DocumentLoader, but this whole thing is still based on an older navigation architecture that we no longer use, and it's currently being refactored and simplified. So I'm not going to go into too much detail here, but hopefully this will become much simpler soon.
Once all of this is done, the renderer sends a DidCommitProvisionalLoad message back to the browser process. The moment the browser process receives that message is the key moment in the navigation. This is what finalizes everything. If we did a cross-process navigation, this is where the new process becomes the current one. This is where we update the new RenderFrameHost to be the current one. This is where we update the frame's origin, URL, session history, security state, and all kinds of other things.
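The interplay between the speculative host and the commit can be sketched as a tiny state model in Python. This is a toy, not Chromium's frame code: hosts are represented only by their site string, the `Frame` class and its method names are hypothetical, and a same-site navigation is assumed to keep the current host.

```python
class Frame:
    """Toy frame tracking a current and a speculative host by site."""

    def __init__(self, current_site=None):
        self.current = current_site   # committed host, None for a fresh tab
        self.speculative = None       # host started alongside the request
        self.pending_deletion = []    # old hosts still finishing up

    def _speculate(self, site):
        # Only a navigation that leaves the current site needs a new host.
        self.speculative = site if site != self.current else None

    def begin_navigation(self, site):
        # Guess the destination from the initial URL and start early.
        self._speculate(site)

    def on_redirect(self, site):
        # A redirect can invalidate the earlier guess.
        self._speculate(site)

    def did_commit(self):
        # The commit ack is the point where the swap becomes final.
        if self.speculative is not None:
            if self.current is not None:
                self.pending_deletion.append(self.current)
            self.current, self.speculative = self.speculative, None
```

The model shows why code should not hold on to the speculative host: between `begin_navigation` and `did_commit`, a redirect may replace it.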
To observe the commit, you would use DidFinishNavigation from WebContentsObserver. But at this point, you're still not seeing any content, so the next step is for us to actually load the page. At this point, we start streaming the response body into the renderer. The renderer is going to do a whole bunch of work. The Life of a Pixel talk went into all kinds of detail about what happens here, but basically we need to construct the DOM tree, do layout, apply style sheets, and run scripts. The renderer might need to request additional subresources like images, JavaScript files, CSS, and so on. These go directly to the network service without involving the browser process or any of the navigation code at all. The one exception is if the page adds a subframe, such as an iframe tag, in which case we turn that into a new navigation in that new subframe. After all of this is done, the renderer notifies the browser using a load stop IPC. At this point, the throbber stops spinning and we consider the page fully loaded. Of course, with today's dynamic pages, there's no point at which you're completely done. There can still be activity after this; for example, the page can still request additional resources in response to user input. But at this point, all the load event handlers have run in the page, and you can observe this using DidStopLoading on WebContentsObserver.
Okay, so we've gone through the basic omnibox flow. I'll now cover some of the more advanced cases in navigation. Leaving a page turns out to be more difficult than you might imagine, due to two events exposed to web pages. Before a navigation proceeds, web pages can run a beforeunload event handler that allows them to show a dialog to the user saying, basically, hey, you're about to lose data, do you really want to leave? The user can then either proceed or cancel at that point.
Then, once you have actually decided to leave, pages can run unload handlers to do cleanup work. An interesting thing about unload is that for cross-process cases, we can actually run it in the background after we commit the navigation.
In terms of how this looks on the timeline for a cross-process navigation: once we have the URL to navigate to, we first need to go to the old renderer process and ask it to run beforeunload. Only once that's done can we go to BeginNavigation and proceed with the rest of the flow. Of course, if there is no old renderer process, we can proceed directly to BeginNavigation. Once we're ready to commit in the new process, we ask the old one to run unload while the new one starts loading the page. These two happen in parallel, so you actually have two RenderFrameHosts active at the same time: one that is loading the page, and another in a pending deletion state that is processing the unload handler.
Okay, so an omnibox navigation is what we call a browser-initiated navigation: it starts from browser UI. We also have renderer-initiated navigations, which start from the renderer process. This is when you, say, click a link or submit a form, or when the page runs a script that assigns window.location to something. We try to treat these as less trustworthy than browser-initiated navigations, because web pages can try to send you to bad places. The only difference in the navigation flow is that it starts in the renderer. We can run the beforeunload handler right away, and then we make an IPC to the browser process to actually begin the navigation. After that point, everything continues as before.
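The ordering of beforeunload, BeginNavigation, and unload for browser-initiated and renderer-initiated navigations can be summarized as a small Python function that returns the steps in order. This is an illustrative sketch for the cross-process case only; the function name, parameters, and step labels are hypothetical and paraphrase the description above.

```python
def plan_cross_process_navigation(initiator, has_old_page=True,
                                  user_confirms_leave=True):
    """Return the ordered steps of a cross-process navigation."""
    if initiator not in ("browser", "renderer"):
        raise ValueError("initiator must be 'browser' or 'renderer'")
    if initiator == "renderer" and not has_old_page:
        raise ValueError("a renderer-initiated navigation starts from a page")

    steps = []
    if has_old_page:
        if initiator == "renderer":
            steps.append("old renderer: run beforeunload right away")
        else:
            steps.append("browser asks old renderer to run beforeunload")
        if not user_confirms_leave:
            steps.append("navigation canceled")
            return steps
    if initiator == "renderer":
        steps.append("renderer sends IPC to browser")
    steps += [
        "browser: BeginNavigation",
        "network service: URL request",
        "browser: choose renderer process",
        "new renderer: commit",
    ]
    if has_old_page:
        # unload runs in the background while the new page loads
        steps.append("old renderer: unload, in parallel with new renderer: load")
    else:
        steps.append("new renderer: load")
    steps.append("load stop")
    return steps
```

Comparing `plan_cross_process_navigation("browser")` with `plan_cross_process_navigation("renderer")` shows that the two flows differ only in their first steps.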
Now, up to this point I've been talking mostly about navigating the main frame, but frames can embed other frames, as I've hinted before. You can have an iframe tag, and any of those frames can navigate, including the subframes. When subframes navigate cross-process, we call them out-of-process iframes. So we actually have a tree of frames, maintained both in the browser process, as the master copy, and in all the renderer processes, and any of these frames can navigate anywhere else.
The last thing I'll talk about is failures. So far I've covered a successful navigation, but navigations can and do fail for a whole bunch of reasons at pretty much every stage up to the commit point. By the way, when things fail, we still dispatch DidFinishNavigation, and you can query the NavigationHandle to find out whether there was an error or not. Just to show you a few places where we can fail: you can cancel a navigation from a beforeunload handler. A throttle can block the navigation before it starts to make requests. The URL request might result in a network error and fail your navigation that way. A redirect might be blocked, or the whole thing might turn out to be a 204 or a download. So yeah, this is a pretty complicated flow of things.