Hi, my name is François. I work in the Montreal office on the Chrome Catan team, which is responsible for optimizing Chrome's resource usage. In this talk, we're going to look at Chrome's execution from the moment you click the Chrome icon until you click the X to close the last Chrome window. More specifically, we're going to look at the startup, steady-state, and shutdown phases of Chrome's execution. As you have probably learned in the anatomy of a browser talk from Jan, Chrome is a multiprocess application. In this talk, we're going to focus on the browser process, which is the first one that gets created, and also the one with the most privileges. That being said, a lot, but not everything, of what I explain today is also true for child processes.

So let's dive into startup. This is a trace that shows everything that happens from the moment the Chrome browser process launches until the first non-empty paint of a tab. It can be divided into two main parts. The first part is running the synchronous initialization in Chrome's main function, which calls into many subcomponents to prepare everything that is needed to run Chrome in a steady state. The second part is running the main thread loop: the main thread of Chrome runs a loop in which it takes tasks from a task queue and runs them.

Let's zoom into the synchronous initialization part of Chrome's main function. The first thing we do is initialize the state needed by the base::CommandLine API. There are two reasons for doing that early. First, almost everything in Chrome depends directly or indirectly on command-line state. Second, the base::CommandLine API can be used from any thread, so we need to initialize its state before we start launching threads that could access it concurrently; otherwise, we get data races. Next, we create the task executor and the thread pool.
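Conceptually, the main thread loop mentioned above looks something like this minimal, hypothetical sketch in standard C++. Chrome's real loop also interleaves native OS messages (input events, etc.), which this omits; the class and method names here are illustrative, not Chrome's actual API.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <string>
#include <utility>

// Hypothetical sketch of the main-thread loop: take the next task from a
// queue and run it, repeatedly.
class TaskQueue {
 public:
  void PostTask(std::function<void()> task) {
    tasks_.push_back(std::move(task));
  }

  // Runs queued tasks in FIFO order until the queue is empty or Quit() is
  // called from within a task.
  void Run() {
    running_ = true;
    while (running_ && !tasks_.empty()) {
      std::function<void()> task = std::move(tasks_.front());
      tasks_.pop_front();
      task();
    }
  }

  void Quit() { running_ = false; }

 private:
  std::deque<std::function<void()>> tasks_;
  bool running_ = false;
};
```

Posting two tasks and calling Run() executes them in order, one at a time, on the calling thread, which is exactly the property the rest of this talk relies on.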
This allows code to start posting tasks using the base::PostTask and base::ThreadTaskRunnerHandle APIs that we'll look at later in this talk. However, the tasks will not start running immediately. After that, we run a function called PostEarlyInitialization. In this function, we read the Local State from disk. Local State is a set of preferences that are not profile-specific. We also set up field trials. You heard about field trials in the previous talk: they control base::Features, that is, everything that Finch can enable dynamically. It's interesting to know that they are initialized so early, because it means you can run an A/B experiment or enable or disable a feature very early in Chrome startup. For example, that allows you to test whether a change in the initialization of a component affects startup time by running an A/B experiment.

Next, we start the thread pool. The thread pool that we created before now starts having threads, which means tasks can start running on other threads. If tasks are running on other threads, we need to be careful about potential data races from that point on. We also start the IO thread. Now the process can decide to run in service manager mode or continue full browser startup. What does that mean? Well, Chrome can run in a lightweight mode in which only some services are started. For example, we can launch only the download manager to resume a download from a previous session. In this mode, Chrome uses fewer resources than in full browser mode. For this presentation, we will continue to look at full browser startup. The next step is to pause the thread pool that we just created and run a function called PreCreateThreads, which performs some initialization that cannot be done while tasks are running on other threads.
The reason some initialization cannot be done while tasks are running on other threads is that the initialization sets state that can be accessed from other threads. If we did that while tasks are running on other threads, we could run into data races. After running this function, we resume the thread pool and run the PostCreateThreads function, which performs some initialization that can be done while tasks are running on other threads.

Next, we run the PreMainMessageLoopRun function, which is very often the longest one that we run in Chrome's synchronous initialization. The first important thing that this function does is create the user profile and load keyed services. Have you heard of keyed services? Okay, I can give you examples: the history, the new tab page feed, autocomplete, bookmarks, content permissions, translate. All these things that depend on the user profile are called keyed services. Once the profile is ready, we load the extensions. We need to load the extensions very early because an extension can affect how a navigation is performed, so before we start the first navigation, we need to have all the extensions ready. Then the last important part of PreMainMessageLoopRun is to figure out which tabs we need to open in the first browser window and start the navigations in these tabs. This is not trivial to do, and it is done by StartupBrowserCreator.

Finally, the last thing we do in Chrome's synchronous initialization is to disallow anything slow on the main thread. Up to that point, code could just read a file from disk synchronously on the main thread. But I'll show you later that if we do that during Chrome's steady-state execution, we will cause jank, which we do not want. So from that point on, no more disk accesses or waits are allowed on the main thread. Startup is driven by code that lives in the content directory.
And the content directory can only contain the multiprocess browser architecture; it cannot depend on any feature such as autofill or translate. So how do you integrate your component with startup? Well, at different levels in the codebase, there are classes and interfaces that you can implement to hook into the different startup phases that I just described, without needing to add code to big functions that perform a lot of unrelated work.

Now it's time to look at the steady-state execution of Chrome. This is a trace that shows the threads that exist in the browser process of Chrome when it runs in a steady state. Perhaps the most important one is the main thread, which we also call the UI thread. Any modification to the UI needs to be performed on that thread. Other things that need to happen on that thread are any access to the Profile class, reading user preferences, starting navigations, and accessing methods of the WebContents class, which is the interface that we have to control a tab. All of that needs to happen on the main thread because there are no locks protecting that state. So if we want to avoid data races, we need to do everything from the same thread.

Next, we have the IO thread, which is very poorly named because it is not the place where we perform disk operations. Instead, it is the thread where IPCs are received and sometimes sent. That thread performs very important work, but most Chrome developers don't know about its existence; it just works. After that, we have the compositor thread, which was described in the Life of a Pixel talk; I have nothing more to add to that description. And we have the thread pool threads, which are general-purpose threads that can be used for any task that doesn't have to run on one of the other threads that I just described. Finally, there are still a few legacy threads that we are migrating to the thread pool. Let's zoom in on the IO and UI threads.
The IO thread runs an infinite loop in which it alternates between taking a task from its task queue and running it, and reading IPCs from message pipes. Similarly, the UI thread alternates between getting a task from its task queue and running it, and getting messages from the operating system. An example of a message that the UI thread can get from the operating system is an input event. What happens if a long task, let's say a 100-millisecond task, runs on one of these threads? We get what we call jank, because either the IO thread will not be able to receive IPCs from other processes, or the UI thread will not be able to receive new input events, and the user will find that Chrome is unresponsive. So we never want to have any slow task on the IO or UI thread. One very important goal in Chrome is to keep it responsive, and that starts by not having any slow task on the UI or IO thread.

We do not use locks in Chrome outside of low-level constructs, because they can cause priority inversions or, worse, deadlocks. For the same reasons, we do not use synchronization primitives such as WaitableEvent or ConditionVariable. I said before that we ban disk accesses and complex computation on the UI and IO threads because these operations can take time. If they take time, IPCs or input events are delayed and Chrome is unresponsive. So, how is that enforced? Good question. If you use the base APIs to make disk accesses and you do that on the UI or IO thread, we have checks that will just make the process crash, so you will not be able to check in that code. If you use native APIs: just don't do that.

Okay, so how do we write code that complies with the rules that I just described? Let's say that you are on the UI thread and you receive an input event that tells you that the user is typing in the omnibox. You want to read the history from disk to be able to find suggestions to present in the omnibox.
You cannot just write synchronous code like that on the UI thread, because the GetHistoryItemsFromDisk function will read from disk, which can take time. And if it takes time, Chrome will be janky. So what do you do? Any expensive operation needs to be performed asynchronously. This code posts a task to the thread pool to run the GetHistoryItemsFromDisk function there. Then the return value of that function is forwarded to the thread that called PostTaskAndReplyWithResult, where the reply, AddHistoryItemsToOmniboxDropdown, is called with the return value of the first task as argument. Since all UI changes need to be performed on the UI thread, it's important to call that on the UI thread. And since we cannot do anything expensive on the UI thread, we need to read from disk in the thread pool. This is how we connect everything. What's nice about this is that while GetHistoryItemsFromDisk is running, which can take time, the UI thread can keep receiving input events and updating the UI, which keeps Chrome very responsive.

Now, what if GetHistoryItemsFromDisk is not thread-safe? What I mean by that is: what if it accesses some state that is not protected by any locks? When we receive an input event, we will schedule a first call to GetHistoryItemsFromDisk. And when we receive another input event, we may schedule another call in the thread pool, and the two calls may run concurrently. If there's no lock to protect the state accessed by this function, we can get a data race. So what are our solutions? We could use locks, but no, I said that we don't use locks in Chrome. What do we do? Instead, we use a virtual thread. All tasks that are posted to the same virtual thread are guaranteed to run in order, one at a time. So if the two GetHistoryItemsFromDisk tasks are posted to the same virtual thread, they will not run concurrently and there will be no data races. What's a virtual thread?
Tasks posted to the same virtual thread are not guaranteed to run on the same physical thread. For example, if you get the thread ID from two different tasks running on the same virtual thread, you are not guaranteed to get the same value. But for most cases that doesn't matter. What matters is that the tasks don't run at the same time. Let's look at the code that's involved in using a virtual thread. The first thing we do is create a SequencedTaskRunner that we store in a variable. Then, when you want to post a task that runs on the virtual thread, you pass that task runner as argument. Now the two tasks here and here will run on the same virtual thread. That means not concurrently, and we don't have any data races.

This is a class that requires all its methods to be called on the same virtual thread. It encodes that requirement by using a sequence checker. In all methods, you can see that the DCHECK_CALLED_ON_VALID_SEQUENCE macro is used. What that does is that, at runtime, if one of the methods is called on a different virtual thread than previous calls, Chrome will crash and you will know that you've done something incorrectly. So there are no data races for the accesses to the members of this class.

Now, what if the reply consists of calling a method on an object, and that object is deleted before the reply gets to run? You get a use-after-free. So how do you solve that? We could use RefCounted to ensure that the target of the reply is not destroyed before the reply runs. But we do not like to use RefCounted in Chrome because it makes the lifetime of objects unclear: it's hard to know what code is retaining references to the object, and it's hard to know when the object gets deleted. The preferred solution is to use WeakPtr. So when you bind the reply callback, instead of using a raw pointer, you use a weak pointer.
Under the hood, Chrome will automatically cancel the reply if the target gets deleted before the reply gets to run. One thing that's very important to know about weak pointers in Chrome is that all the weak pointers issued by the same factory need to be dereferenced and invalidated on the same virtual thread. If we allowed the invalidation to happen concurrently with dereferencing the pointer, we would get racy behavior, and that could lead to a use-after-free, which we don't want.

Now, we will look at some C++ code that shows all the concepts that I just explained. In this code, we are posting a task that calls the task function, and the return value of the task is forwarded to the thread that called PostTaskAndReplyWithResult, where the reply is called with the return value of the task as argument. Now this snippet is just a little bit different: we added the BEST_EFFORT trait. What that does is make the task run at best-effort priority. Best-effort priority means that the task is queued behind any task of higher priority. Also, it may be preempted during important phases such as startup, and it may run on a background thread, so the OS will know that the task is less important. In this code snippet, we change the thread pool traits to use BrowserThread::UI instead. What that means is that the task will not run in the thread pool; instead, it will run on the UI thread. Here, we change the callbacks: instead of just calling simple functions, we are calling methods on objects, and we are using raw pointers to specify the targets of the task and the reply. Here, instead of using a raw pointer, we are using a weak pointer to fix the potential use-after-free that I presented to you before. Finally, when you post a task, you don't always need a reply. You can just post a task and run it without doing anything after, in which case we use PostTask instead of PostTaskAndReplyWithResult.
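To make the pattern concrete, here is a hypothetical, single-threaded sketch in standard C++ of post-task-and-reply together with weak-pointer cancellation. Plain queues stand in for the thread pool and the UI thread, and std::weak_ptr stands in for base::WeakPtr; all names are illustrative, not Chrome's real API.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

using Task = std::function<void()>;

struct Queues {
  std::deque<Task> worker;  // stands in for the thread pool
  std::deque<Task> ui;      // stands in for the UI thread's task queue
};

// Queues the task on the worker; when it runs, its result is bound into
// the reply, which is posted back to the UI queue.
void PostTaskAndReplyWithResult(Queues& queues,
                                std::function<std::string()> task,
                                std::function<void(std::string)> reply) {
  queues.worker.push_back(
      [&queues, task = std::move(task), reply = std::move(reply)] {
        std::string result = task();
        queues.ui.push_back([reply, result] { reply(result); });
      });
}

class OmniboxView {
 public:
  void AddItem(const std::string& item) {
    ++total_adds;
    items.push_back(item);
  }
  std::vector<std::string> items;
  inline static int total_adds = 0;  // only here so the demo can observe drops
};

// Binds a method call through a weak pointer: if the view is destroyed
// before the reply runs, the reply is dropped instead of dereferencing a
// dangling pointer. This is what base::WeakPtr does for you automatically.
std::function<void(std::string)> BindWeak(std::weak_ptr<OmniboxView> view) {
  return [view](std::string item) {
    if (auto strong = view.lock())  // canceled if the target is gone
      strong->AddItem(item);
  };
}

// Runs everything currently queued: first the "thread pool", then the "UI
// thread".
void DrainAll(Queues& queues) {
  while (!queues.worker.empty() || !queues.ui.empty()) {
    std::deque<Task>& q = queues.worker.empty() ? queues.ui : queues.worker;
    Task task = std::move(q.front());
    q.pop_front();
    task();
  }
}
```

If the OmniboxView is destroyed after the task is posted but before the reply runs, the weak pointer fails to lock and the reply is silently skipped, which is the cancellation behavior described above.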
base::Bind, which we use in Chrome, is very similar to std::bind. If you know how to use std::bind, you probably know everything you need to know to use base::Bind. But we keep using base::Bind in Chrome for a few reasons. First, the automatic cancellation of a callback when its target is deleted, using weak pointers, is implemented by base::Bind. Also, base::Bind protects us against some use-after-frees by disallowing capturing lambdas. And when you use a raw pointer in a Bind call, you need to use an Unretained annotation, to make sure that you know that what you're doing is unsafe. Finally, with base::BindOnce you can create move-only callbacks, and that makes it easier to control when the arguments are deleted.

The last phase of Chrome execution is shutdown. So when you click the X, what happens? The main thread loop exits. What that means is that tasks that are posted to BrowserThread::UI can no longer run, and input events are no longer received. Then Chrome enters the main runner's shutdown function. The first thing we do in that function is join the IO thread; that means no more communication with child processes. After that, we perform thread pool shutdown. And finally, after thread pool shutdown, we perform some synchronous teardown on the main thread. When thread pool shutdown completes, Chrome is almost single-threaded, so it's fine to delete things on the main thread because they should not be accessed by other threads.

I said almost single-threaded. This is mostly because of shutdown behaviors. When you post a task to the thread pool, you can optionally specify one of three shutdown behaviors. The first one is BLOCK_SHUTDOWN. What that means is that the task needs to run before thread pool shutdown completes. Then you have SKIP_SHUTDOWN. What that means is that if the task did not start running by the time thread pool shutdown starts, it is canceled.
But if it already started running, we will allow it to finish its execution before we complete thread pool shutdown. And finally, CONTINUE_ON_SHUTDOWN. What that does is that if the task did not start running before thread pool shutdown, it is canceled, but thread pool shutdown can complete while the task is still running. That means that CONTINUE_ON_SHUTDOWN tasks can run at the same time as the synchronous shutdown on the main thread, so usually they should not access any global state, because that state may be deleted. BLOCK_SHUTDOWN is good for anything that persists state to disk, because the tasks are guaranteed to run.

Chrome is not supposed to lose user data if it crashes before shutdown. So what are we doing during shutdown, if we already save the state as Chrome is running? Well, we are performing some tasks to speed up the next startup. For example, when Chrome is running, we write the actions the user is doing to disk, to be able to perform a session restore on the next startup. On shutdown, we compress that state to speed up the next startup. We are also freeing a few objects, and that is not really necessary because the process will be terminated soon after; this is an area for future improvement.

The last part of this presentation is about testing. We saw that you can post tasks in Chrome, and it's important to test that code. To do that, you just need to put a TaskEnvironment member in the test fixture to allow use of the base post-task APIs. I encourage you to look at the constructor arguments of that class, because they can be very useful to control when the tasks run in the test. That's useful if you want to test different conditions at different steps of execution. In particular, you can fast-forward time, which is useful if you have code that posts a task with a one-hour delay and you don't want your test to take one hour.
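As a rough illustration of what mock time buys you, here is a hypothetical sketch in standard C++ of a task runner with a virtual clock. TaskEnvironment's real mock time works along these lines but is far more featureful; the names here are illustrative, not Chrome's actual API.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>

// Hypothetical sketch of mock time: delayed tasks are keyed by their due
// time on a virtual clock, and FastForwardBy() advances the clock and runs
// everything that comes due. A "one hour" delay therefore completes
// instantly in a test.
class MockTimeTaskRunner {
 public:
  void PostDelayedTask(std::function<void()> task, int delay_ms) {
    tasks_.emplace(now_ms_ + delay_ms, std::move(task));
  }

  // Advances the virtual clock by delta_ms, running due tasks in order.
  void FastForwardBy(int delta_ms) {
    const int deadline_ms = now_ms_ + delta_ms;
    while (!tasks_.empty() && tasks_.begin()->first <= deadline_ms) {
      auto it = tasks_.begin();
      now_ms_ = it->first;
      std::function<void()> task = std::move(it->second);
      tasks_.erase(it);
      task();  // may post further delayed tasks; they are picked up too
    }
    now_ms_ = deadline_ms;
  }

 private:
  int now_ms_ = 0;
  std::multimap<int, std::function<void()>> tasks_;
};
```

A test posts its one-hour task, calls FastForwardBy with one hour of virtual time, and observes the task's effects immediately, with no real waiting.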