OK, let's get started. This talk is about improving the boot-up speed of AOSP, which is a project that we are currently working on at Lenovo. I have some patches here, and I'm going to show some ideas that we have already tested. I'm also going to talk about a couple of extra things that we are still about to look into, which may or may not help as much as we are hoping they will.

There are two basic approaches to improving boot-up times. One is keeping the regular boot-up process, which is what we've always been doing: going through the initramfs, switching to the normal root file system, launching init, going through all the init.rc stuff, launching things. The drawback of that is, obviously, that there's lots of initialization that needs to happen at every boot-up, and those things take time. We have to figure out where we can save time without breaking things, and what still has to be done.

The other possible approach is, essentially, looking at solutions that are based on suspend to disk. If we use regular suspend to disk, there is currently no reliable implementation on AOSP, but we are working on that. How do we handle power loss without having a chance to save the current state first? If there's an immediate drop in power, we just don't have time to write out a suspend image. How do we recover from a crash? The worst thing that can happen is that the system totally freezes, that state gets suspended to disk, and on the next boot the kernel tries to restore it, which is obviously what we should be avoiding. One idea was to just save the session after the first successful boot-up, keep that forever, and essentially keep restoring into that session until a system update is installed.

So first, let's look at the suspend-to-disk approaches. What we've already done is port the regular Linux suspend scripts to AOSP. That is essentially working.
We're having some issues with our beloved non-upstream, SoC-specific code, like drivers that just will not resume properly. On one board, everything is kind of working, but as soon as we suspend and resume, sound goes away because of a bug in the sound driver. There are similar issues on some other boards that still need to be fixed.

Boot time on the RB7HL test hardware, from the time the kernel starts to boot to the point where the launcher is up, got to 12 seconds with resume from suspend, which is actually longer than I thought it would take. But that's related to the I/O on that board being really slow, and it has to load more than a gigabyte of data to restore the session, so that's where that time is coming from. That's compared to 16 seconds with regular boot-up. So it's not saving all that much, but it's a good step. We're still looking at the TuxOnIce patch set, which might make suspend and resume better, but I don't have any details on that yet. The other idea I mentioned before, essentially restoring an image that is only suspended once, after the first successful boot-up, is also still being worked on.

There are a couple of possible problems with this approach even before it's implemented. One is storage space. Obviously, if we suspend that session and need to boot into it over and over again, the image has to stay on the disk. Even while the system is running, that space can't be used for swap or anything else. So this is not something you want to do on a board that has low storage. There's also a potential security issue, because if everything is suspended, that includes the seeds for random number generators, unless there's a hardware random number generator. But that can be fixed, obviously, by editing the resume scripts to reset those seeds. And of course, when you resume, you're not going to get a boot animation.
You can only display a static boot image, which is typically what the same people who say it needs to boot in one second hate most.

So the other approach is looking at where we spend too much time in the regular boot-up process. A couple of useful tools come with AOSP. One is bootchart. This is actually preloaded on just about any Android phone you can find. If you just create an empty file, /data/bootchart/enabled, the device will record whatever it does on boot-up. Then you can use the grab-bootchart script that is part of the AOSP source tree to get the files that were generated, and you get a nice graph that shows where it's spending time, what has been started, what's running in parallel, and so on. Another useful tool that comes with AOSP is strace, which lets you essentially look at where it's spending time in system calls. And obviously, it has the good old tools, dmesg and logcat, that will give more information.

One of the first things we looked at is which services AOSP starts all the time that don't really need to be up before we can display the UI. We came up with this list. The vibrator service is not really useful before the UI is up. The consumer IR service is not useful unless an application is running. For some Wi-Fi services, it is sufficient if they are started after boot-up, after the user can already see the system is up. For a couple of other network services, there's just not much of a point in being connected to the network before the launcher is up; the user can't use it anyway. AudioFlinger would be another candidate: unless you want to play a chime while you're booting, you don't need sound before the launcher is up. Then there's ADB and other development tools, which of course makes our life harder, because we like to be able to adb shell and just see what's going on even during boot-up. But on a production device it can make sense to start them after the launcher is up as well.
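The deferred-start idea behind that list can be pictured as a simple two-phase plan. The following is only a minimal sketch: the `Service` type, the `plan_startup` helper and the service labels are illustrative assumptions, not real init service names or AOSP code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Service:
    name: str               # illustrative label, not a real init service name
    needed_before_ui: bool  # True if the UI cannot come up without it


def plan_startup(services):
    """Split services into an early and a deferred phase, preserving order."""
    early = [s.name for s in services if s.needed_before_ui]
    deferred = [s.name for s in services if not s.needed_before_ui]
    return early, deferred


# Hypothetical classification following the list given in the talk.
SERVICES = [
    Service("surfaceflinger", True),
    Service("vibrator", False),
    Service("consumer_ir", False),
    Service("wifi", False),
    Service("launcher", True),
    Service("audioflinger", False),
    Service("adbd", False),
]

EARLY, DEFERRED = plan_startup(SERVICES)
```

Everything in `EARLY` would be started on the way to the launcher, and everything in `DEFERRED` only once the launcher is visible.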
We also found that, especially on first boot, a lot of time is being spent in the package manager, and there are some things we can do to fix that. Let's look at the package manager first. What happens there is that the package manager scans all the APKs that are installed. It uses an XML parser and ZIP archive utilities, so it takes some time. Then it reads all the attributes from each APK to check signatures, and stores a list with all the relevant information on the installed APKs in the data partition. On the second and any subsequent boot, it checks whether a package has been updated. If anything has been updated, all the APKs are scanned again. If not, it uses the information from the lists that were generated on first boot.

So one thing we can do to make first boot much faster (this saved us around four seconds out of 24 seconds total boot time) is to build those package list and package XML files, which are normally generated by the package manager on first boot, on the build system, and include them in the userdata image that gets flashed to the device. Essentially, the first boot will then act much like the second and subsequent boots. On subsequent boots, obviously, that doesn't change a lot, because those files would have been generated on first boot anyway.

The next thing we looked at is init and the init.rc files, to see if there's a lot we can do there to speed things up. At first, we thought: there are some constructs that occur quite frequently, like creating several directories in a row, or changing the ownership and mode of various files in procfs and sysfs. Coming from the regular Linux world, we thought, oh, there's some overhead involved in starting external processes all the time. If we just patch init to provide a command that runs several of those at the same time, it should be much faster.
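Going back to the package manager for a moment, the caching behavior described above can be modeled roughly as follows. This is a simplified sketch and not the actual package manager code: `load_package_info`, the dictionary layout and the use of a modification time as the "has it been updated" check are all assumptions made for illustration.

```python
def load_package_info(installed, cache, scan):
    """Return package records, rescanning only when the cache is unusable.

    installed: dict mapping APK path -> modification time (assumed update check)
    cache:     records from an earlier boot or from the build system, or None
    scan:      callable doing the expensive parse of one APK
    """
    stale = (
        cache is None
        or set(cache) != set(installed)
        or any(cache[path]["mtime"] != mtime for path, mtime in installed.items())
    )
    if stale:
        # As described in the talk: if anything changed, scan everything again.
        return {
            path: {"mtime": mtime, "info": scan(path)}
            for path, mtime in installed.items()
        }
    return cache
```

In this model, shipping pre-generated files in the userdata image corresponds to passing a valid `cache` on the very first boot, so `scan` is never called.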
As it turned out, though, the AOSP init has internal implementations of mkdir, chmod and chown, so the overhead from starting an external process for those init.rc commands isn't there in the first place, and we only got to save a couple of milliseconds by batching them. So that is not really worth doing. We still have the patch, so if someone really wants to save a couple of milliseconds, that might still be something to look at. One thing that still might make sense in that context is using a couple of threads. Especially when mounting several file systems, mounting them all at the same time on different CPUs might still speed things up. But that's something we didn't get to play with yet.

The next place where we found we are spending a lot of time in regular boot-up is preloading classes. When AOSP starts up, it preloads a list of Java classes that are listed in the file /system/etc/preloaded-classes. If you look at that file, it contains a lot of things, including the entire WebView factory, most of the classes in the android namespace, and pretty much the entire Java language core libraries: all of java.anything, javax.anything, and libcore. Some of this stuff is not really needed until the launcher is up. So the first approach was just getting rid of all the preloading and seeing what that does. It actually improved boot-up time a little. But that's not the smartest thing to do, because obviously some things, like the launcher, are going to access those classes. It makes more sense to split that preloaded-classes file into two: one doing what's already being done, preloading on boot-up, and another one that also preloads classes, but at a later time when things are already up, so the user can already start working without having those classes preloaded. I think it makes sense to keep this in configurable list files, because some people will have very different needs in what they want to preload.
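The two-list idea can be sketched as a small helper that splits an existing preload list by package prefix. This is an illustration only: `split_preload_list` and the choice of prefixes to defer are hypothetical, and it assumes a list with one class name per line and `#` comment lines.

```python
def split_preload_list(lines, late_prefixes):
    """Split class names into (early, late) lists, preserving order.

    lines:         iterable of lines from a preload list
    late_prefixes: tuple of package prefixes to preload after the UI is up
    """
    early, late = [], []
    for raw in lines:
        name = raw.strip()
        if not name or name.startswith("#"):
            continue  # skip blank lines and comments
        if name.startswith(late_prefixes):
            late.append(name)
        else:
            early.append(name)
    return early, late


# Hypothetical choice of what to defer; each product would tune this.
LATE_PREFIXES = ("android.webkit.", "javax.")
```

A phone, an IVI system and a smartwatch would each pass a different `late_prefixes` tuple, which is the reason for keeping the lists configurable.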
A normal phone launcher, for example, might use different classes than an IVI system, a smartwatch, or something else that is stripped down or less general-purpose, and such a device might even want to preload extra classes to talk to interfaces that we aren't even thinking of.

One of the next things is power management. Usually we are trying to reduce power use as much as possible, but if we are after boot time, it might make sense to only start all the power management tricks we usually use once other things are already in place. Usually, especially on an ASMP system, where we have some small CPU cores and some big CPU cores, a lot of background tasks get pinned to the low-power CPUs. But if you want to speed up booting, it makes sense to have the high-power CPU cores available to do that work, because they can do it faster, and only pin those processes to the small cores at a later point, when the launcher is already up and the user can start doing things.

Next thing: pretty much all of boot-up is heavy on I/O. So one thing that can save a couple of seconds is just figuring out the best settings for kernel I/O. That means using the right scheduler: CFQ is good on some boards, the noop scheduler is better on some other boards. It also means optimizing read-ahead. By default, the AOSP kernel has a read-ahead of 128 KB. On some boards, reducing that makes a lot of sense because the I/O is really slow. On some other boards, increasing it makes sense because it reduces the total number of calls. But there's no general advice that we can give on what the settings should be, because it really depends on what hardware you're running on. This is something that has to be determined anew for every target board. But there are a couple of seconds to be gained, so it's really worth doing when bringing up a new device.

Next thing is some kernel features.
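Before moving on, the per-board I/O tuning described above amounts to trying a few combinations and keeping the fastest one. The sketch below assumes the standard block-queue sysfs files for the scheduler and read-ahead; the `measure` callback, which would reboot the board with a given setting and return the boot time in seconds, is hypothetical, as is the device name in the example.

```python
from itertools import product


def queue_settings(device, scheduler, read_ahead_kb):
    """Map sysfs files to the values to write for one candidate setting."""
    base = f"/sys/block/{device}/queue"
    return {
        f"{base}/scheduler": scheduler,
        f"{base}/read_ahead_kb": str(read_ahead_kb),
    }


def best_config(measure, schedulers, read_ahead_values):
    """Return the (scheduler, read_ahead_kb) pair with the lowest boot time."""
    candidates = product(schedulers, read_ahead_values)
    return min(candidates, key=lambda cfg: measure(*cfg))


# Example of the values that would be written for one candidate
# ("mmcblk0" is just an example device name).
EXAMPLE = queue_settings("mmcblk0", "noop", 512)
```

Because the result differs from board to board, the measurement has to be repeated for each target rather than copied from another device.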
On the kernel side, we first looked at file systems and found that SquashFS tends to result in better boot-up time than ext4, probably because SquashFS is read-only and doesn't need to do a lot of fsck work and the like while scanning the file system, and also because SquashFS is compressed by default and therefore reduces the number of bytes that have to be read from actual storage, which tends to be slow.

Then, the kernel provides a couple of different compression options. On most SoCs, the kernel defaults to being compressed with zlib. Switching that to LZ4 increases boot-up speed, at least on all of the boards we have tried. There are probably some other boards, with really slow CPUs and fast storage, where you might see the opposite effect, so it's another thing that needs to be determined for each board individually. But it added around 2.6 megabytes to the memory requirement, so that's a trade-off that you may or may not be willing to make.

Given that none of those compression algorithms really did everything we wanted, we looked at a couple of other compression algorithms and found zstd. That's a relatively new compression algorithm that is supposed to be able to compress to smaller sizes than LZ4 in around the same time. So that may provide good results for kernel compression, and also for SquashFS compression, once porting is done. We have a kernel patch that adds it and that compiles, but currently there's some problem that prevents it from booting, so we don't have any numbers on it yet. But this may be interesting in the future.

Next thing is making the kernel modular. Up until now, the AOSP guys have been saying you shouldn't use kernel modules and should just use a monolithic kernel that has everything built in, but this seems to be changing. In current AOSP master, the modprobe utility that we know from the regular Linux world is actually being built, and it looks like configurations are moving toward actually making use of it.
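Going back to the kernel compression trade-off for a moment, a very rough way to reason about it is to add the time to read the compressed image to the time to decompress it. The model below is a simplification that assumes the two steps do not overlap, and every number in it is made up for illustration; none of them are measurements from the talk.

```python
def load_time_s(compressed_mb, uncompressed_mb, storage_mb_s, decompress_mb_s):
    """Estimate seconds to read a compressed image and then decompress it."""
    read_time = compressed_mb / storage_mb_s
    decompress_time = uncompressed_mb / decompress_mb_s
    return read_time + decompress_time


# Hypothetical board: 40 MB/s storage, 20 MB uncompressed kernel image.
# "dense" stands for a codec with smaller output but slower decompression,
# "fast" for one with larger output but much faster decompression.
DENSE = load_time_s(8.0, 20.0, storage_mb_s=40.0, decompress_mb_s=50.0)
FAST = load_time_s(10.6, 20.0, storage_mb_s=40.0, decompress_mb_s=400.0)
```

With these made-up numbers the faster codec comes out ahead even though its image is larger. Plugging in a board's own storage and decompression speeds shows which side of the trade-off that board lands on.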
In terms of boot-up speed, modules are going to help a lot. Some modules do a lot of initialization: sound drivers, for example, typically take at least a couple hundred milliseconds before they return from initialization. But even when a module doesn't do a lot of initialization, it saves time, because the kernel is smaller and that code doesn't have to be loaded and decompressed up front. So even if there's no initialization involved at all, just reducing the kernel size and loading those modules as they are needed is going to help with boot-up time.

Next thing: we identified a couple of libc functions that are used quite heavily during boot-up: memcpy, memcmp, strcpy, strcmp. There are a lot of implementations of those functions available. Even if you look only at standard C libraries, like Bionic, glibc, musl and uClibc, you'll find at least 10 different implementations of those functions. One thing to make sure of is that we really have the fastest implementations of all of them. It looks like Bionic has already imported a lot of those, but recently newlib has made a couple of modifications to the AArch64 implementations of some of those functions. Those seem to be faster, so one of the next things we'll be doing is to grab those functions, port them to Bionic, and make sure we don't face any delays there. Another thing to do is just to make sure you're using current compilers to generate the best possible code for those functions.

Yeah, that's pretty much what we are up to so far. Are there any questions?

So the question was: we are already trying to delay some startups, and systemd is already doing that with the systemd socket activation bits, so are we thinking about porting that functionality to Android? And the answer is yes, that's an interesting thing and should really make sense in at least some situations, especially when we're also talking about loading drivers on demand.
And it should also save a bit of memory from not having unused background processes. So that's definitely another thing to look at.

Anything else? Looks like there are no other questions. If any other questions come up after all, feel free to find me or email me; my address is there. Then I think we can end this session.