So, my name is Hans de Goede. I work for Red Hat on desktop hardware enablement, and today I would like to talk to you about flicker-free boot. I actually plan to keep it short, because I always like to have a bit of discussion at the end, questions slash discussion, so I'll go through the slides a bit quickly. These are the basic things I would like to discuss, basically in boot order: we start before the kernel, then we have the kernel, then we get user space, and hopefully eventually we get to the login screen.
So, flicker-free boot. This is something which has been discussed among Linux desktop people for a long time; I know Ubuntu, or Canonical, has worked on it in the past, for a decade I guess, or more. The purpose is both simple and complex: it is to boot the machine without any visually jarring transitions. Especially in the past, when lots of people still had external VGA monitors, if you did a mode set, so if you reprogrammed the graphics hardware, often to end up with essentially the same resolution the firmware had just initialized it to, the screen lost sync and it took one or two seconds to get a picture back. That's where the original flicker-free term comes from.
Since it took us a decade to get there, I've gone slightly further. I not only want to never do a mode set, so never have the screen lose sync, I also want everything to be visually pleasant. I mean, you can keep sync and still throw a black screen on there, right? You usually get the vendor logo when you turn on the machine, and then a black screen or a grub menu. We are already at the point where some of these transitions don't cause a mode set, they don't cause the monitor to lose its signal, but they are still ugly, so we also want to get rid of those. So it's not only about not losing signal to the monitor, it's also about having all the graphical transitions be nice and non-jarring.
An interesting thing which I've noticed on the internet is that technical users often don't care much about this, and they sometimes even complain that I'm hiding all the useful informational messages, like, where did all my beautiful scrolling text go? I swept it under the rug, at least according to Phoronix; some Phoronix user posted a picture of someone sweeping stuff under the rug. So sure, I'm sweeping all the technical messages under the rug. But this matters for regular users, and also for OEMs, with whom we have some contact, and also for embedded hardware. If you look at your phone, your phone essentially already does something which resembles flicker-free boot, right? It tries to show a splash and keep that going and moving while it's booting. So this is something which is definitely important for vendors, and I think also for regular users. I know for a fact, from talking to Red Hat's internal IT, right, they administer all our laptops, also the laptops for the salespeople, and salespeople get a laptop with a flavor of Red Hat Enterprise Linux on there, that our internal IT people were really happy that I was getting rid of the grub menu. By default in Fedora you got a menu like, do you want to boot the latest kernel, or one version older, or another version older, and that just confuses the hell out of regular users. So we're working on cleaning that all up. So that's a bit about the what and the why.
Now let's go through all the phases of the boot process. When I started working on this, my goal was mainly to keep the vendor logo on the screen. My first version just kept the vendor logo on the screen all the way into GDM. That involves something like shim. Do you all know what shim is? Shim is a little thingy which loads before grub, and basically it is a transition between two signing keys, two keychains.
Shim is signed by Microsoft for us, so it allows us to have Linux working, Fedora working, Ubuntu working, on a machine which has Secure Boot enabled. It carries a different public certificate, from our own keychain, and it checks that grub, and then the kernel etc., is signed with Fedora's or Ubuntu's key. Shim normally is completely quiet; it will only show you a text message if something is seriously wrong. But still, the first line in its main function was to ask UEFI to put the frame buffer into text mode, which makes the screen turn black. It was pretty much the same story for grub. So what I did for both is patch them to make the call to put the frame buffer in text mode, so to turn it into a text console, on demand: now they only do this when they print the first character of a message. That pretty much solves shim and grub.
For all the changes I've done, I also have a line on my slides which discusses the upstream state. Fortunately the shim changes are upstream, so that's fine. Unfortunately, for grub I've been unable to get this upstream. I haven't even submitted upstream the patches which delay the switch-to-text-mode call until the first character is printed, because before I can do that I need to get another patch set upstream, which I like to call the shut-the-fuck-up patch set, because grub by itself is pretty verbose. It prints a version header, for example, which is really useful, because grub hasn't seen a release in like forever, so all the distros have a heavily patched grub, and the version header of all the distros says 2.02, so you still have no clue which grub flavor you're actually running. But they insist that the version header is there, and I've been unable to convince them to take a reasonable patch set. They had some weird idea, which was like two weeks of work, to make the printing of the version header runtime configurable, and I said no. So yeah, long story short, the grub changes are unfortunately not upstream.
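The "only switch to text mode when the first character is printed" trick can be pictured roughly as follows. This is a minimal Python sketch of the idea only, not the actual shim or grub code (which is C); the class name and the two callbacks are hypothetical stand-ins for the real firmware calls.

```python
class DeferredTextConsole:
    """Console that leaves the firmware graphics alone until text is printed."""

    def __init__(self, switch_to_text_mode, write_char):
        # Both callbacks are hypothetical stand-ins for the firmware calls.
        self._switch_to_text_mode = switch_to_text_mode
        self._write_char = write_char
        self._in_text_mode = False

    def write(self, text):
        for ch in text:
            if not self._in_text_mode:
                # First character: only now pay the visual cost of the switch.
                self._switch_to_text_mode()
                self._in_text_mode = True
            self._write_char(ch)


events = []
console = DeferredTextConsole(
    switch_to_text_mode=lambda: events.append("switch"),
    write_char=lambda ch: events.append(ch),
)
console.write("")             # a quiet boot prints nothing, so the logo stays
quiet_boot_events = list(events)
console.write("err")          # the first real message triggers the switch, once
```

On a quiet boot nothing is ever written, so the switch never happens and the vendor logo stays on screen; the moment an error message is printed, the console behaves as before.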
So the next step is the kernel booting. On a normal boot on a UEFI system, and this is all about EFI systems, classic BIOS boot is not covered, the kernel actually uses two graphics drivers. First, during early boot, the EFI frame buffer driver loads. It just reuses the frame buffer which it gets from the EFI firmware; it doesn't touch the graphics hardware at all. It just takes that frame buffer and gives other applications, or internal users, the option to draw on the screen. There is an internal user of the frame buffer driver called fbcon, the frame buffer console, which gives you your text console. It used to immediately say: oh, a frame buffer has become available, I'm going to claim it, I'm going to make it black and draw a blinking cursor. Given that I wanted no jarring graphical transitions, that did not make me happy. So, same trick as with shim and grub: I added a patch to the fbcon driver to not take over the frame buffer until the first character needs to be printed. Pretty much the same story, basically.
Another issue: I said before that most machines start by drawing the vendor logo in firmware. Some machines have the vendor logo in firmware, there's an ACPI extension for that called the BGRT, the Boot Graphics Resource Table, but for reasons of their own they don't draw it themselves. They rely on the Windows bootloader to draw it as soon as the bootloader loads. Usually these machines boot pretty quickly, because if they had a long boot the vendor would probably have gone through the trouble of drawing the logo in firmware. So we need to draw the logo as early as possible.
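To give an idea of what the BGRT actually carries, here is a small Python sketch that decodes the table fields. The field layout follows the ACPI specification as I understand it (a 36-byte standard table header, then version, status, image type, image address and the X/Y offsets of the logo); the function names are made up for illustration, the table bytes are synthetic, and this is not the kernel's code.

```python
ACPI_HEADER_SIZE = 36  # standard ACPI table header precedes the BGRT fields


def read_le(table, offset, size):
    """Read an unsigned little-endian integer from the table bytes."""
    return int.from_bytes(table[offset:offset + size], "little")


def parse_bgrt(table):
    if table[:4] != b"BGRT":
        raise ValueError("not a BGRT table")
    base = ACPI_HEADER_SIZE
    status = read_le(table, base + 2, 1)
    return {
        "version": read_le(table, base, 2),
        "displayed": bool(status & 1),       # bit 0: firmware already drew the logo
        "is_bitmap": read_le(table, base + 3, 1) == 0,
        "image_address": read_le(table, base + 4, 8),
        "x_offset": read_le(table, base + 12, 4),
        "y_offset": read_le(table, base + 16, 4),
    }


# Synthetic table for illustration only; all values are made up.
header = b"BGRT" + (56).to_bytes(4, "little") + bytes(28)
body = (
    (1).to_bytes(2, "little")           # version
    + bytes([1, 0])                     # status (displayed), image type (bitmap)
    + (0x1000).to_bytes(8, "little")    # physical address of the image
    + (760).to_bytes(4, "little")       # x offset of the logo's top-left corner
    + (400).to_bytes(4, "little")       # y offset of the logo's top-left corner
)
info = parse_bgrt(header + body)
```

The offsets matter because the logo is not full-screen: whoever redraws it has to put it at exactly the position the firmware would use. On a running Linux system the same fields are exposed under /sys/firmware/acpi/bgrt/.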
So I've chosen to do that in the kernel, in the EFI frame buffer code. It is capable of using the ACPI BGRT extension, using code which was already there, and it draws the logo. It does this unconditionally: it draws it over the logo which is already there, so unless I get the positioning wrong you won't see any transition, because it's just replacing the same pixels with the same pixels, so no tearing or whatever. This is also useful in case you did get a grub menu, because you have multi-boot or you pressed a key during boot, because now it will restore the logo over the grub menu. All patches for this are upstream. If you want to disable this, for technical users who don't like these things, you can use fbcon=nodefer on the kernel command line to immediately get the blinking cursor again, or video=efifb:nobgrt to disable the drawing of the logo, if you'd like to look at whatever was there before.
So that's the kernel. The next step is that your initial RAM disk usually loads, and on an Intel GPU system, and I'm currently only supporting this on Intel GPUs, sorry, it will load the i915 driver. In this case I'm not interested in the 3D engine part of the integrated Intel GPU, so really not the GPU part, but in the display pipeline driver, the kernel mode setting part. That actually has support for reading back the hardware state, and has had it for a long, long time. When Plymouth loads, it does the first mode set: Plymouth will tell the i915 driver to set the display to this resolution, which usually is the same resolution the firmware already brought it up in. Microsoft actually recommends that the firmware brings the panel up in its native resolution, and we also try to put it in its native resolution, so it should be the same mode.
What the i915 driver does is create a software copy of the register state, well, two copies actually. One it fills in with how it's going to program the hardware; for the other it reads back from the hardware and fills in all the values which it read back. Then it compares them, and if the two are identical it can optionally skip the mode set. So it says: oh, it's already programmed with exactly the register values which I was planning to program into it, so why bother going through the whole shebang? Because the whole shebang is: first turn everything off, and then turn everything on again. This is called fastboot support in the i915 driver, and it has been available for quite a long time. It's even useful on some systems where mode setting is a bit broken and the panel doesn't come up properly if you don't use fastboot; if you do use fastboot, since the mode set gets skipped, it stays working. But so far this has been disabled by default, because certain types of hardware were not working. I've been working together with Maarten Lankhorst from Intel to fix the last few known bugs, and very recently we've thrown the switch, at least for Skylake and newer. So it's in linux-next at the moment, and in the 5.1 kernel, when it's released, we will have this enabled by default on Skylake and newer, as well as on Cherry Trail and Bay Trail hardware, which pretty much removes the last mode set from the boot. So with this we are flicker free. It's currently enabled in Rawhide, so if you're running Fedora Rawhide and your display starts acting up after you apply updates tomorrow or whatever, because this was pushed out to Rawhide yesterday, then let me know and I'm not to blame.
So, user space. This one was actually interesting. Plymouth is the graphical boot splash thingy which shows some animation and, not entirely unimportant, shows your disk unlock screen if you have an encrypted disk. There was quite a bit of work to do here.
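Going back for a moment to the fastboot check in the i915 driver: the compare-and-skip idea can be sketched like this. It is a heavily simplified Python illustration under my own assumptions; the real driver compares a large amount of register state in C, and the `PipeState` fields and the `commit` function here are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipeState:
    # Hypothetical, heavily simplified stand-in for the real register state.
    width: int
    height: int
    refresh_hz: int
    pixel_clock_khz: int


def commit(requested, read_back_from_hw, full_modeset):
    """Return True if a full mode set was done, False if it was skipped."""
    if requested == read_back_from_hw:
        # Hardware is already programmed exactly as requested: skip the
        # "turn everything off, then on again" sequence.
        return False
    full_modeset(requested)
    return True


modeset_log = []
firmware_state = PipeState(1920, 1080, 60, 148500)

# Plymouth asks for the mode the firmware already set up: no mode set needed.
skipped = not commit(PipeState(1920, 1080, 60, 148500), firmware_state,
                     modeset_log.append)

# A different mode is requested: the full mode set has to happen.
did_modeset = commit(PipeState(1280, 720, 60, 74250), firmware_state,
                     modeset_log.append)
```

The first request matches what was read back, so nothing is reprogrammed and the screen never loses sync; only the second, different request triggers the full sequence.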
Until recently Plymouth actually secretly relied on fbcon to do a bunch of stuff. What was happening is that fbcon, the frame buffer console driver, when it was taking over the frame buffer, would go over all the connected monitors and give them all a mode. Well, the fbdev emulation layer in the kernel was doing that, but because fbcon was using it. And Plymouth was relying on all the connectors already having a CRTC assigned, that's the scan-out engine basically, and having a mode set, and it just took over whatever it found. But since we were now deferring the fbcon takeover, we were not triggering the fbdev emulation layer, and in effect only the LCD screen on a lot of laptops was working. External monitors weren't, because they didn't have a CRTC assigned and they didn't have a mode set, and fbcon wasn't doing this anymore. So I have pretty much rewritten the Plymouth DRM plugin, the kernel mode setting plugin for Plymouth, to pick modes and pick CRTCs.
The second round of rewriting was to also add hotplug support. That was actually an existing problem, where fbcon didn't help before I started working on this: Plymouth used to scan all outputs once and then just take what it would find. The problem is that nowadays most docks for laptops use something called DisplayPort MST, DisplayPort Multi-Stream Transport, which means that you get a single DisplayPort link to your docking station, which then gets split up, sort of like a hub or switch does in a network, into multiple display outputs, and enumerating that takes a while. So what was happening was that the initrd loaded the i915 driver and then immediately started Plymouth, and Plymouth scanned the connectors while the monitors connected to the dock were still being enumerated. So it didn't see them, because when it was scanning they were not yet seen as being in the connected state. That is fixed by adding hotplug support. It's also useful for something else: you may have noticed, if you use disk encryption, that if you boot your laptop and
you have your disk unlock screen already standing there and you plug in an external monitor, the external monitor wouldn't show the disk unlock screen. That's also fixed by the hotplug support.
The last step in Plymouth is adding support for it to read the firmware background from the ACPI extension, so that it can use that as a background. So when Plymouth now starts drawing an animation, well, you saw it already in the videos, right, you keep the logo and there was a Fedora logo beneath it, it looks like it's drawing over the firmware background. In essence it is, but it's not reading the firmware background out of the frame buffer. Why not? Well, because external monitors are not lit up, so they don't have a frame buffer yet, and I still want to put something there. There could also be total garbage in the frame buffer; I don't know what's there. So yeah, it was better this way, even though it was some work to get this right. I had a lot of work figuring out exactly where to put the logo, because the logo is actually not screen-filling, so I need to put it in the exact same place as where the firmware puts it. It was a bit tricky, but I figured it out, and I won't go into details because we don't have time for that. All patches for this are upstream.
The last user-space-related item I would like to discuss is user space frame buffer handover. This is the thing where Plymouth first installs its frame buffer, and then GDM starts. If Plymouth were to exit, what happens in the i915 driver is this: if the process which owns the frame buffer exits, the driver still needs something to send, right? Video is continuously being scanned out; all the pixels from the top of the screen down are continuously being drawn, so it still needs to send something to the video outputs. So it has something which I call the fallback frame buffer, I don't know what it's called internally by the Intel guys, and that's usually the frame buffer which they
inherit from the video BIOS or from the Graphics Output Protocol of EFI. So what would happen is that when Plymouth exited you would fall back to the logo, which actually isn't that bad, because we were already drawing on top of the logo, but it does look bad if you have a different Plymouth theme which doesn't work on top of the logo.
So Plymouth already had support for doing a handover where you don't go to the fallback frame buffer. What happens is: first Plymouth is told to stop, I think by systemd. When it's told to stop it doesn't exit, and it also doesn't free the frame buffer, but it drops its DRM master rights, so someone else can start controlling the mode setting. Then GDM gets started by systemd. GDM sets up its own frame buffer and tells the i915 driver to start scanning out its frame buffer, which it can do now because Plymouth dropped the master rights, and after that it tells Plymouth it's okay to quit, and then Plymouth exits. This works nicely between Plymouth and GDM, because Plymouth has a special IPC mechanism where you can tell it things like stop and exit, and it does them.
This doesn't work so well for starting the user session. We have the same issue there, because nowadays with Wayland we want to run the compositor as the user, and nowadays we also support running Xorg as the user instead of as root, so we can no longer use the same display server for the login manager and for the user session. So we start a new display server, which needs to take over the frame buffer. We do the same trick, but without the user session helping us: we drop the DRM master rights, we start the user session, and then we sleep ten seconds and we hope that within those ten seconds the user session has installed its own frame buffer.
Now, why is this interesting? This is interesting because I would like to make some kernel changes, and I hope there are some kernel devs here in the room, I think there are, to make this suck less. So actually, about a year ago a colleague of mine, Rob Clark, completely
independently of me and of this effort, posted an ioctl, a new kernel mode setting ioctl which he thought would be useful. It basically says: I am handing over this frame buffer to you, but please don't destroy it, just keep scanning out of it, even if I exit, until someone else installs a different frame buffer. Currently the rule is that frame buffer lifetime is bound to the process which owns the frame buffer, but there's no reason for that. The frame buffer is an internal object; the kernel could keep it alive longer than the process. It could keep a reference count on it and just drop it when it no longer needs it because a new frame buffer has been installed. So a new ioctl for this is a very good idea. It would make the whole dance with Plymouth unnecessary, it would make the whole dance between GDM and the user session unnecessary, and it would also fix some ugliness which we still have on shutdown, because on shutdown we get similar problems, since then we go from the user session to Plymouth.
A problem is that this has some privacy concerns. If you go from a user session and you log out or shut down, there could be private, sensitive stuff on the screen, and the next process which comes after you could theoretically, if the frame buffer sticks around, scrape the frame buffer. Or if the machine hangs during shutdown and Plymouth never starts and never takes over, a user could come walking up and see whatever was last on your screen. So yeah, that's a problem. My proposal here is that this means we shouldn't blindly always use the ioctl. In Mutter running on top of Wayland, so GNOME 3 on top of Wayland, well, really in this case Mutter is the Wayland compositor, so it's not on top of anything, we could fix this by simply closing all the windows and doing a redraw, so that we're just drawing the desktop background and the panel, and then make the ioctl call to keep the frame buffer around until someone else takes over. That means that in your worst-case scenario your background will leak; if you have a not
suitable-for-work background or whatever, yeah, tough luck. On the other hand, this also means that if you're still running a desktop environment on top of Xorg, Xorg cannot know when it will be a clean exit, basically, so probably not. We could make Xorg just blank the screen, make it turn black or something, but then it will again be visually jarring, so we're not really winning anything. So this ioctl is probably not useful for X, only for Wayland compositors, which can do something smart and just draw the background before exiting.
So, as promised, I kept it short-ish. Actually longer than I planned, but I had five minutes extra, so that's fine. So, time for discussion or questions. Go ahead.
So that's definitely an option, but I don't see anyone doing the work for that. But if someone volunteers, sure.
That's a good one, actually. Well, fastboot should help if you're using the i915 driver, actually, because the hardware readback thing is not only done on boot, it's also done after resume. So at least on resume it should help with flickers. I don't know if you see any flickers on suspend; that sounds like your video BIOS or firmware just being nasty and turning the screen black. Which desktop environment are you using? Okay, weird. Yeah, I don't have an answer.
Oh, sure. So first let me repeat the question, which I forgot for the last two. Oh, the mics are good enough? Great. Anyway, the question was: why an ioctl for telling the frame buffer to stick around, why not a property? If I remember correctly, Rob Clark decided on an ioctl. A property could work, I guess, but it doesn't feel right. I mean, RMFB, which says I'm done with this frame buffer, I'm unreferencing it, but which currently means unreference and delete in one go, basically, is also an ioctl, and adding an FB to a CRTC is also an ioctl, so it sort of feels like it belongs in that row of operations. And as a property it feels weird, because you could set the stick-around property and then decide to unset it again, or make it false, and then what? So that's
also, it's sort of a one-time thing, right, and a property you can change again later.
I have a question: did you also test flicker-free boot in a virtual environment, say if you put the system in VirtualBox? Because in my experience things like Plymouth are not as stable in a virtual environment as they are on real hardware.
So that's a good question. I do intend to test this in VirtualBox, because VirtualBox has support for booting in UEFI mode. Again, this is only for UEFI, because BIOS boot is just broken in so many ways when it comes to flicker-free that I haven't bothered. I've actually been talking to the VirtualBox guys about making UEFI boot the default when you create a new Fedora, or well, VM. As you may know, I've also been working on mainlining a bunch of the VirtualBox guest drivers, putting them in the mainline kernel, so I know the VirtualBox guys well. So yeah, I do need to test this, but actually there shouldn't be any difference.
Okay, last row. You don't appear to be that happy about grub, so why not use systemd-boot?
That's a very good question, and the answer is mostly uniformity between all the different platforms we support. We use grub almost everywhere, and systemd-boot only works on UEFI. Another thing is classic BIOS boot: I don't support classic BIOS boot with this, but in Fedora we do still support classic BIOS boot, if you accept that it will flicker a couple of times, and we will probably do so for a long, long time. Because, again, a lot of technical users who like to see all the scrolling messages also, for some reason, like to use classic BIOS boot, probably because a while ago UEFI support in Linux was pretty bad, or not that good, so everyone flipped the BIOS default to use classic boot, and a lot of technical users are still doing that. I think it actually was Canonical who published statistics about this, because they recently published a bunch of the statistics they gathered, and I think 50% of all the machines were
BIOS boot. That's probably skewed, because it probably contains VMs, and for some reason VMs all still use classic BIOS boot, even though there is really no good reason for this.
But it's broken in different ways, trust me. I've done enough with firmware, at pretty much all levels of firmware; I've touched a whole lot of the stack. In general firmware is just broken somewhere, some of the time, well, maybe even most of the time, and we just get to work around it, so it doesn't really matter. Yes, UEFI contains more functionality and as such is maybe more complex, but actually, from the bootloader point of view and from the operating system point of view, UEFI is easier to deal with than classic BIOS boot. That's the whole reason why it was invented.
I think in many cases that's also one of the first switches people flick when they have any installation problem.
Yeah, yeah. Well, I think a lot of more experienced Linux users flick it on a new machine even before installing, just because they've been doing that for the last five installs, or five generations of hardware, or whatever. But yeah, classic BIOS boot is something which we need to support, so that's the reason why we stick with grub. I think we also use grub on ARM now. Actually, on ARM what we're doing now is we're loading U-Boot, and then on top of U-Boot we get the open source UEFI implementation, and then we load grub, so that it looks a lot like x86, because we want to unify our early boot as much as possible, because of support load and stuff. So yeah, we're working on making ARM look like x86.
Yes? If by other display manager you mean login manager, then you shouldn't do that, I'm sorry. The only other display manager which is still seeing some level of upstream maintenance is LightDM. Back before Wayland was even a thing, so like five years ago, I worked on making the X server run as a regular user instead of root, because the X server is known as a huge attack surface, so running that as root is not a good idea. This needs
cooperation from the display manager. I filed a bug against LightDM five years ago: please spawn the X session as a regular user, this is what you need to do to make this happen, you need to set up the TTY beforehand because X cannot do that if it doesn't have root, yada yada. They still haven't fixed that. I'm sorry, if they haven't implemented such an important security feature as not running X as root, and they're still running it as root if you use LightDM, the only advice I can give you is: just use GDM. Even if you're using it to start KDE, you will still get your KDE running as a regular user, with an X as a regular user instead of as root, and it supports things like fingerprint readers infinitely better.
And storing the session across different computers, GDM 2.2 is for that, and it's kind of annoying when you're in a huge department. That could be.
So what? I just checked, the last comment was on the 31st of December of last year, so it's basically like a month ago. The last comment on LightDM? You mean SDDM? Oh, it's still seeing some commits. Okay, and when was the last release? Okay.
So, last question? Oh, it's over time. Okay, sorry, we can continue in the hallway.