Hi, I'm Joyee. Thank you for coming. In this talk, we are going to take a look at something that I've been doing a lot of refactoring on, which is the bootstrap of Node.js core.

So allow me to introduce myself. I'm Joyee. I work on the compilers team at Igalia. I am on the Node.js Technical Steering Committee, and I'm a V8 committer. In the past year, I've been working on the startup performance initiative in Node.js, which involves a lot of work related to the bootstrap of Node.js core. You can find me on GitHub or Twitter via the handle joyeecheung.

So enough about me, let's talk about Node.js. This is the process model of Node.js, and also the goal of the bootstrap process. In a Node.js process, you usually have one main Node.js instance running on the main thread, which includes an inspector agent, a V8 context, a V8 isolate, a libuv event loop, and a Node.js Environment. Don't worry about all this lingo — if you're not familiar with these terms, we'll get to them later. There is also a SIGUSR1 watchdog thread for handling the signal that users send to Node.js to make the inspector start listening on its port. There is also one thread pool for V8 and its task scheduler, and one thread pool for libuv to serve asynchronous file system operations.

Starting from Node.js 10, you can spawn worker instances in addition to the main instance in one Node.js process. This is what spawning a worker looks like: we basically just create a new thread with the worker instance inside, sharing everything else in the process. Compared to that, spawning a child process is more expensive, because we need to set up more stuff.

So here is an overview of the Node.js bootstrap process. We'll talk about these steps one by one. First, we need to do a few setups that should only be done once per process.
This includes setting up the signal handlers, parsing the command line arguments from strings into structured options, and initializing dependencies such as ICU for internationalization support and OpenSSL for crypto. Then we need to initialize the V8 platform, which includes a task scheduler and a thread pool that V8 can use to compile JavaScript or run garbage collection.

When initializing the main instance, first we initialize the libuv event loop on the main thread, but we'll only add handles to it later, when we initialize the Environment. The main instance simply uses the default libuv event loop.

After the process and the default event loop have been initialized, next we move on to set up the V8 isolate for the main instance. V8 isolates are instances of the V8 JavaScript engine. An isolate encapsulates, for example, a JavaScript heap, a microtask queue for the promises, pending exceptions, and so on. To set up the V8 isolate, we first configure the resource constraints, including how much memory this V8 engine instance can use. We also create an array buffer allocator that is in charge of allocating external memory for buffers and other typed arrays. Then V8 deserializes the isolate from an isolate snapshot — we'll cover snapshots later. Then we set up several per-isolate callbacks in C++, though most of them are not ready to be called at this point, because they have to work with JavaScript callbacks that are initialized later. These include the garbage collection callbacks, uncaught exception listeners, promise rejection callbacks, et cetera.

After the V8 isolate is initialized, Node.js starts to initialize the V8 context. A V8 context is a sandboxed execution context that encapsulates the JavaScript built-ins, aka primordials, including globalThis, Array, Object, and others. So when you call vm.createContext() from userland later, after Node.js has been bootstrapped, this is what the returned result includes.
It's a new context with a different set of built-ins. So when a user creates a vm context some time after the bootstrap, we only add one more context to the instance, without doing any further setups.

So what does Node.js do to initialize this context? It creates an immutable copy of the primordials, so that the internals can use them and won't be affected when users monkey-patch these built-ins. It also initializes the DOMException class for web APIs to use. It's somewhat funny that Node.js has DOM exceptions when there is no DOM, but this is done to be spec-compliant.

Each Node.js instance has a main context where most of the JavaScript is executed, but it can also contain contexts created with the vm module. The main context contains a pointer to its associated Node.js Environment; this is assigned later, when the Environment is created. vm contexts, on the other hand, do not have this pointer, and they are not bootstrapped further beyond this point.

So contexts can be deserialized from the startup snapshot. Note that this is different from the heap snapshots that you use for memory debugging. Before we integrated the startup snapshot into Node.js, we had to run a few per-context scripts in order to initialize the primordials and the DOMException class. Now we run those scripts at build time and serialize the context after the initialization is done. We serialize the context into a blob that gets embedded into the Node.js executable. Then at runtime, instead of executing the scripts, we directly deserialize the result of the previous execution into a context, which speeds up the bootstrap.

After the Node.js context is initialized, we move on to initialize the Node.js Environment. A Node.js Environment encapsulates a Node.js instance. It is associated with one V8 inspector agent, one main V8 context, one V8 isolate, and one libuv event loop. To initialize the Environment, we first initialize the components that are independent of runtime state.
This includes the internal JavaScript module loader and the C++ binding loader, the process object and other globals, and the JavaScript callbacks that various internal hooks invoke, which will be in charge of invoking user-provided callbacks later. These objects and functions may depend on runtime state when used by the users, but the creation of them is runtime-independent, and that's why we do this at this point.

To bootstrap itself with JavaScript, Node.js needs to create an internal loader system to load C++ bindings and internal JavaScript modules. The C++ bindings are looked up from a linked list with linear search, while the internal JavaScript modules, provided through the internal version of require, are looked up from a map.

Node.js recently integrated the V8 code cache to speed up the compilation of the internal JavaScript modules. Before the integration, we needed to parse and compile the source code of these modules at runtime before executing them to create the internal modules. Now we compile them at build time and serialize the compilation data, which gets embedded into the executable. At runtime, we can directly deserialize the compilation data and execute the JavaScript to create these internal modules. This speeds up the bootstrap by about 40% to 60%, since parsing and compilation used to take up a lot of time during bootstrap.

After the internal loader system is set up, Node.js can start initializing the globals, which are implemented in internal JavaScript with access to internal C++ bindings. These globals are then attached to the global object or the process object. In Node.js, `global` is now a legacy alias of globalThis from the ECMAScript globalThis proposal. This is a very simplified version of the code executed to set up the globals: we usually use the internal require to load the implementation of something internal, run some setup to get a JavaScript function or object, and assign it to either the process object or the global object.
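The "very simplified" setup pattern mentioned above looks roughly like this. Everything here is an illustrative stand-in: `internalRequire` and the module IDs mimic Node's internal loader but are not real, public API:

```javascript
// Sketch of how globals are wired up during bootstrap: load an
// internal module via the internal require, run its setup, and
// attach the result to the global or process object.
// internalRequire and the module IDs are hypothetical stand-ins.
function setupGlobals(internalRequire, globalObject, processObject) {
  const { setupTimers } = internalRequire('internal/timers');
  const timers = setupTimers();
  globalObject.setTimeout = timers.setTimeout;
  globalObject.setInterval = timers.setInterval;

  const { emitWarning } = internalRequire('internal/process/warning');
  processObject.emitWarning = emitWarning;
}

// Demo with a stub registry standing in for the internal loader:
const registry = {
  'internal/timers': {
    setupTimers: () => ({ setTimeout, setInterval }),
  },
  'internal/process/warning': {
    emitWarning: (msg) => { /* would construct and emit a warning */ },
  },
};
const fakeGlobal = {};
const fakeProcess = {};
setupGlobals((id) => registry[id], fakeGlobal, fakeProcess);

console.log(typeof fakeGlobal.setTimeout);    // 'function'
console.log(typeof fakeProcess.emitWarning);  // 'function'
```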
Other than the globals, we also need to initialize several hooks when setting up the Environment. This includes, for example, process.nextTick, which needs to invoke the queued callbacks when the current operation is done. So during bootstrap, we need to create the tick queue and the tick callback, and store the tick callback in the Environment so that it can be called later. We are only initializing the runtime-independent state here: no ticks are added to the tick queue yet; we are just initializing the machinery that processes the tick queue.

After the runtime-independent components are initialized, we then move on to set up the event loop. At this point, we need to initialize a few handles. Some are activated immediately; some are activated on demand later. We only initialize a fixed number of handles during bootstrap — more handles, specifically the ones for I/O, are added on demand later. The libuv event loop has many different phases in each iteration, and this is roughly what it looks like. At bootstrap, we initialize one timer handle for setTimeout and setInterval, one idle handle for setImmediate, one prepare handle and one check handle for the idle notifications used by --prof, and another check handle for setImmediate to pair with the idle handle.

After the event loop is fully initialized, we then initialize the V8 inspector, which is used for JavaScript debugging. This includes initializing the inspector agent, which is done even when the inspector is not active. We also spawn a SIGUSR1 watchdog thread that wakes up and asks the main thread to start listening on the inspector port when the user sends a SIGUSR1 to the process. This is only done for main Node.js instances, not for workers. If the user passes, for example, --inspect-brk, --cpu-prof, or --heap-prof when launching the instance, we'll also immediately create more threads, either for listening on the inspector port or for profiling.
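The tick-queue machinery described above can be sketched as a plain queue plus a drain callback that is stored for later. This is a simplified model for illustration, not the actual implementation (which spans C++ and internal JavaScript, and also handles errors and async context):

```javascript
// Simplified model of the process.nextTick machinery created at
// bootstrap: a queue, an enqueue function, and a drain callback
// stored so it can be invoked after the current operation is done.
// At bootstrap the queue is empty; only the machinery exists.
const tickQueue = [];

function nextTick(callback, ...args) {
  tickQueue.push({ callback, args });
}

// In real Node.js, a callback like this is stored in the Environment
// and invoked from C++ when the current operation completes.
function runNextTicks() {
  while (tickQueue.length > 0) {
    const { callback, args } = tickQueue.shift();
    callback(...args);
  }
}

// Usage:
const order = [];
nextTick((x) => order.push(x), 'a');
nextTick((x) => order.push(x), 'b');
runNextTicks();
console.log(order); // ['a', 'b']
```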
After the inspector is initialized, we then continue initializing the components that depend on runtime state. At this point, we need to handle various runtime configurations, including command line flags and environment variables. This phase is also called pre-execution. For example, this is what the initialization for --no-warnings looks like: if the user asks Node.js not to write warnings to stderr, we don't install the warning listener that does this; if the user does not configure Node.js to stop doing this, we install the listener.

Other initializations done during the pre-execution phase include setting up the IPC channels for cluster and child processes, and initializing the userland module loaders, including both the CJS module loader and the ESM loader. Only after this point can any userland module be executed. Right after that, we'll also load any preloaded modules specified with --require.

After the pre-execution is done, the Node.js Environment is ready, and we can start the execution. At this point, we choose a main script according to the command line flags and run it to start the execution. These scripts are all located under lib/internal/main in the Node.js repository and compiled into the binary at build time. For example, if the user passes a file name to Node.js from the command line as the entry point, we load a main script that detects the type of the entry point and loads it with either the CJS or the ESM loader. If the instance launched is a worker instance, we load a different main script that sets up various listeners on the message port and starts listening. When the worker thread receives the script from the port, it compiles it and starts the execution from there.

After the main script is run, we kick off the event loop and run it until nothing keeps it open. The libuv thread pool will be created if any asynchronous file system operation is used.

So, a quick summary of what we covered in this talk.
To initialize a Node.js process from scratch, we first run the per-process setups, and then set up a V8 isolate, a V8 context, and the Node.js Environment. The majority of the work is done in the Environment setup, which includes initializing the runtime-independent state, the event loop, and the V8 inspector, and handling the runtime-dependent configurations. Then we select and execute the main script and start the event loop.

As mentioned earlier, we have now integrated the V8 startup snapshot into Node.js, but at the moment the snapshot we have only includes the context setups. We're currently working on including the runtime-independent part of the Environment bootstrap in the startup snapshot to speed up the bootstrap further. This is currently the focus of the startup performance initiative. You can check out the link if you want to know more. Thank you.