Good afternoon everybody. I'll be talking today about trading fbdev for DRM. My name is Geert Uytterhoeven. I'm a freelance embedded Linux kernel hacker; I'm self-employed. I started with Linux as a hobbyist a very, very long time ago, and I've worked on various platforms and subsystems. What's most relevant for this presentation is that I used to be a maintainer of the Linux frame buffer device subsystem, or fbdev. Also a long time ago, I worked on the XFree86 X server's frame buffer device support, and outside of Linux I worked on graphics subsystems when I was employed by Sony. Big fat warning: I'm no DRM expert, so please don't throw any of these objects at me during or after the presentation. So what will this be about? The Linux frame buffer device subsystem was officially deprecated in 2015. That means no new drivers for the subsystem are accepted, and all new graphics drivers must use the newer DRM/KMS subsystem. We have plenty of old fbdev drivers left, so what do we do with them? There are also out-of-tree drivers in various vendor BSPs. And of course you want to know: why was this decided? What can we do about it? And what's the path to the future? I'll start with a short history of Linux, graphics and fbdev. Probably all of you are familiar with this message, which was sent by Linus Torvalds in 1991 when he started working on Linux. At that time the only thing Linux supported was the VGA text console, for the kernel messages and for users logging in on the console. Later, support was added to run X servers like X386 (there were various other, commercial ones at that time), and those managed graphics in user space. They handled mode setting directly, and they usually had optional support for hardware acceleration, if your graphics hardware supported that, of course.
When Linux grew beyond x86 it gained support for many other machines, and several of them had no VGA graphics cards and no VGA text mode. How to solve that? What happened was a proliferation of platform-specific console implementations: basically every port of Linux to a different platform with a graphical console implemented its own console, which was of course not a good thing. Finally, in 1995, we got the frame buffer device subsystem, which was the first platform-independent framework for graphical consoles on Linux. It was started by Martin Schaller on the Atari. Shortly after that he disappeared and I've never heard from Martin again; I have no idea what happened. I wrote an initial frame buffer device driver for the Amiga, and later I moved to PowerPC and started working there as well, on the ATI Mach64. But before we could make frame buffer device support more generic and available on all platforms, we first had to refactor the console drivers, so we could factor out all the commonalities instead of each of them implementing everything. So I made the console driver more modular, and then we were ready to have frame buffer device support on other platforms: first on PowerPC, then later on SPARC and others. And finally, in 1998, Gerd Knorr created vesafb, an implementation of a frame buffer driver using the VESA linear frame buffer, meaning the BIOS on the PC would program a graphics mode and Linux would just use that. And then people could finally see the penguin during boot-up on x86 as well. So let's first have a look at simple graphics hardware. Usually you have a CPU which can write to a piece of memory, the frame buffer, and that memory contains a representation of a graphical image. It's read out by the CRT controller hardware and then displayed on a CRT.
The name CRT stands for cathode ray tube. Not many displays still use that, but the name CRTC is still going strong. As an example of very simple graphics hardware, I have the Sun 3/50 workstation. I don't know how many people in this room have used one; looking at some of them, I guess they have. Sun-3 was a very interesting platform because it was released about one month before Intel released the 386 processor, so the Sun-3s are more or less the oldest hardware in existence that you can run Linux on. On the Sun 3/50 you had a monochrome display of about one million pixels. That was really monochrome, black and white: each pixel had one bit. I also have a picture of the motherboard here. Here you see the four megabytes of system RAM. This is lots of glue logic, and I think probably also the RAM chips for the 128K frame buffer. And it implements the CRTC: very simple, just some TTL counters, shift registers and gates, and a clock generator. That's very simple graphics hardware that can show you a high-resolution monochrome image. Now, how do you get more colors in graphics? A way that was used a lot in the beginning was the concept of bit planes. Instead of having just one bit per pixel, you add another region of memory that contains a second bit per pixel, or even a third bit per pixel. With three bits you can have simple RGB color: eight colors. Sounds nice. Another way is to put all the bits of a pixel together inside the same byte or the same word. This is what's mostly done these days. A downside is that you usually have to fit a power-of-two number of bits inside a byte, so this was used for two, four or eight, and later 16, 24 or 32 bits per pixel. While in the monochrome case each bit represents either a black or a white pixel, if you go to grayscale the bits represent some grayscale value.
With color, you could have a color lookup table, and later, once you have a sufficient number of bits per pixel, like 16 or 24, you can just put the RGB value in the pixel directly. A third way is an interleaved organization, where everything looks contiguous in memory but really isn't. In this example, the first bit of the first eight pixels is stored in the first byte, the second bit of the first eight pixels in the second byte, and so on. So you have memory organizations like that as well. Now, what was the frame buffer device API? It was actually very simple. You could mmap the /dev/fbX device and get access to the graphics memory directly. An application can write to it and it's shown on the screen immediately. It's a very flexible way, but it does mean that the application needs to know whether you are using bit planes, or interleaved bit planes, or packed pixels with everything packed together. So your application needed to support exactly what the hardware could do. Besides mapping the frame buffer, there were also ioctls to query the frame buffer mode, to change the mode, and to manipulate the color map. fbdev also has optional support for acceleration. Inside the kernel you can have simple acceleration to get a performant text console, because you need much more memory bandwidth for a graphical console than for a VGA text console. User space can also implement acceleration, by mmapping the /dev/fbX device so it can write to the graphics hardware's registers directly; there was support for some of those in the old XFree86. For graphics hardware where the frame buffer itself is not directly mapped into the CPU's address space, a mechanism was later introduced called deferred I/O.
Basically you have an additional buffer of memory, and you set up the page tables such that you can monitor when a page is written to; then, at regular intervals, you send the updates to the real graphics card. Now, about the text console. fbdev also implemented a text console on top of the graphics, named fbcon. On the left you see a sliding window, a rectangle, moving over a buffer of memory. That buffer contains text and attribute bytes, identical to what was used in VGA text mode, so all that code could be reused. If you scroll the text console, this rectangle basically moves over the memory; the window moves across memory, just like in VGA text mode. With the frame buffer console, whenever something is written to the console, or any other change is made, like scrolling, the representation in the frame buffer has to be updated as well. To make this performant, some tricks are used. As I said before, with fbdev you can have some optional hardware acceleration inside the kernel. The first one is for actually rendering the characters. When you render a character like an 'A', you go from a bitmap font, which has one bit per pixel, and you have to write to graphics memory, expanding it to the background and foreground colors you want for your text. Some graphics hardware has hardware acceleration for that, which is very useful if the memory bandwidth from the CPU to the graphics memory is very slow. Other graphics hardware has support for copying or clearing rectangles, which is also useful for scrolling. The third possible optimization is panning the screen, which I show here, basically similar to what was used on the VGA text console but now on the graphics buffer. You have the full graphics buffer here, and this forms a window into that memory; when you scroll one line, you just have to draw one additional line of text here and move the window down.
A second option, if the hardware supports it, is to use wrapping, which means that instead of a window moving over a rectangle, you basically have something that wraps around. Here's the split between the top of the memory and the bottom, and for scrolling you just have to change some hardware registers and the hardware takes care of it. If these optional hardware acceleration features are available, fbcon can and will use different scrolling strategies depending on what's there: it can use panning or wrapping if available, and if those are not available it can fall back to using the blitter to copy areas. The third thing is a smart redraw, which basically means that if you scroll the screen and the same character ends up in the same place, it does not have to be redrawn, because it's already present. That's also a small optimization that can be useful. So in those days the graphics stack looked like this. Without frame buffer devices, the traditional setup mostly used on x86, you had the X server running in user space, just mapping /dev/mem, and perhaps doing I/O port access as well. The X server was responsible for all of mode setting, drawing to the frame buffer, and implementing hardware acceleration by writing to the graphics card's registers. When fbdev entered the picture and was used, the X server would open /dev/fbX instead of /dev/mem, mode setting would be handled by the frame buffer driver inside the kernel, and the X server would only be responsible for the actual drawing and, optionally, hardware acceleration. Then evolution continued, graphics hardware gained many more features, and complexity increased. We got Z-buffers for simple 3D acceleration, real 3D acceleration with 3D graphics engines, and overlays for video, typically using YCbCr instead of RGB formats.
The hardware could have multiple planes and multiple displays; you could configure which planes were shown on which display, and move planes and overlays independently. All of this became basically impossible to handle with fbdev, so something new had to be found. I'm also showing that the memory no longer contains a simple frame buffer, a direct mapping of what's shown on the screen: it has memory regions containing the display planes, texture buffers for the 3D graphics, command queues, geometry data, lots of stuff that was very difficult to handle with fbdev. So in 1999 we got the first implementation of the Direct Rendering Infrastructure with the Direct Rendering Manager: DRI and DRM. That was demoed by Precision Insight at Linux Expo 1999; I've been there, I've seen the demo. DRM implemented 3D acceleration, and in DRM the actual driver is split in two parts: a small driver inside the kernel, which just provides a way to send commands to the hardware, and most of the actual 3D graphics programming in user space, which prepares commands and sends them to the kernel. In those days desktop graphics already had more colors than before. There was some support for 256 colors in DRM, but most graphics hardware for desktop applications started using RGB with 24 or 32 bits per pixel, and also video formats. DRM supports multiple planes and overlays, and there's memory management for handling all the different planes, the command queues and so on. It's much more complex than fbdev, and it has a lot of helpers to make driver programming easier, but it's sometimes very difficult to know which helper you should use. After the initial hardware acceleration, DRM also gained a feature called kernel mode setting. I always thought that was a funny name, because that's what fbdev does too, you know; but basically it's about the mode setting.
With DRM KMS implemented, the graphics stack moved on to what's shown in the third column. User space now opens /dev/dri/cardX. DRM takes care of all of mode setting, buffer management and hardware acceleration, meaning sending the commands to the hardware, and user space just draws into the frame buffer, using that hardware acceleration. To provide a console on top of DRM, DRM relies on the fbdev emulation. So DRM still uses fbdev, and it exposes a frame buffer device for the console, which can also be used from user space. User space applications that use the fbdev API still work, provided they do not want to change the video mode, because that can only be done through DRM directly; but all fbdev applications can still open the device and draw, and it works. At the bottom I'm showing the new picture of how the text console works. You still have the text and attribute bytes, like in VGA text mode, but those are rendered to a shadow frame buffer in main memory. This is what's exposed to user space through /dev/fbX. The deferred I/O I talked about is also still used, to track when user space writes to this memory, so that the real frame buffer handled by DRM is updated. That means that when something is drawn on the console, it's first rendered here, then copied, possibly converted to a different format, and written to the real frame buffer. Now, what's the problem with fbdev? Already in 2012 our dear friend Laurent Pinchart suggested that we should get rid of frame buffer devices, and finally, a bit later in 2015, the then-maintainer of the frame buffer device subsystem, Tomi Valkeinen, said that there would be a moratorium on new fbdev drivers. Why? The main reason was that fbdev was not suitable for contemporary graphics hardware. Also, the fuzzers running on various Linux systems started finding lots of issues, and some of them were traced back to fbdev, rightfully or not.
That's something to be left out of the picture, or maybe not. Another issue was that fbdev was essentially unmaintained. After I stopped maintaining fbdev, many other people stepped up, and I definitely want to thank them. But basically, as of 2020, there was no real fbdev maintainer anymore. We only had dri-devel, which was supposed to be CC'd on all patches there, because they were interested in it: DRM uses the fbdev emulation and fbcon for the console. At the beginning of this year, Helge Deller took over as the new fbdev maintainer. He immediately made some decisions that turned out to be a bit controversial with some people; more about that later. But I definitely want to thank all the former fbdev maintainers, and the current one as well, because they're worth thanking. During the last years, many of the fbdev features were slowly removed or made less usable. One issue there was that, because there was no fbdev maintainer, or sometimes one who was not that responsive, fbdev was de facto maintained by the DRM maintainers, which I consider a little bit of a conflict of interest, because they were not interested in really keeping fbdev alive; they just had to use fbdev because of the console. There have been talks about other ways to have a console, like KMScon, perhaps even from user space, because that could solve the limitations of the current console: for example, that you have 8-bit characters, so you cannot use full Unicode, things like that. The DRM maintainers also usually work mostly on very performant hardware, and they focused mostly on scrolling by redraw, which means that all the panning, wrapping and copying features were not really used by DRM; for example, on Intel hardware it's much faster to just let the CPU redraw the whole screen than to use those other methods. In kernel 5.9, the scrollback feature, the Shift-PageUp thing on the console, was removed.
Later it was also removed from the VGA console, mostly because of security issues, and a lot of the hardware acceleration was removed in version 5.11. The claim was that it was not used. Yes, it was indeed not used by the DRM drivers, but there were tens of fbdev drivers that were still using it, and despite my complaints it was removed. Since then, the decision has been reverted by the new fbdev maintainer, which was quite controversial at the time. Another thing is that I recently discovered that several of the security issues were actually caused by missing range checking in the fbdev emulation in DRM, so they were not real bugs in the fbdev core. Now that we know, they're being fixed, which is good of course. So the official policy is that we should migrate from fbdev to DRM. I worked a bit on that, and here are my experiences. Since 2015, no new fbdev drivers. Since then, there have been lots of discussions about what we should do. "Oh, but it's very easy to write a very simple DRM driver. Just look at the existing tiny drivers." But I just looked at the current code, which is after lots of cleanups, and the smallest tiny DRM driver is still 50% larger than the simplest fbdev driver; it used to be even worse. The DRM people have been pointing to the simpledrm idea since at least 2015. I always said: can you show me a very simple driver, as an example to convert to? But not much happened. Then I discovered simpledrm was already mentioned in 2013, and it was finally merged upstream in 2021, for driving firmware frame buffers like EFI and old Open Firmware ones. Those are small examples. And then of course the question is: what do we do with the existing fbdev drivers that do not have a DRM counterpart? There are still about 100, which surprises many people. And what do we do with the out-of-tree drivers? Should we convert all the fbdev drivers to DRM drivers? Why not? The typical reasons for not tackling that task are listed here. There's no time to do it.
It's mostly old hardware; I don't have access to it. DRM is complex, I don't know how it works, and I don't have time to look into it. Are there missing features? I thought there were, but it was very difficult to show that without code; more about that later. In 2019, Thomas Zimmermann wrote a patch series with conversion helpers, which would make it easier to just take an fbdev driver, add some glue code calling the helpers, then clean up the result and get it upstream. Most of that did not end up upstream, except for some helpers. And did I mention there are zillions of helpers to choose from, and it's difficult to know which one? So for years I've been thinking about showing how good or bad it is to convert an fbdev driver to DRM. I picked the Atari frame buffer driver. I don't have Atari hardware, but the nice thing about the Atari is that you can run it in the ARAnyM emulator (ARAnyM stands for Atari Running on Any Machine), which is much more convenient than the real hardware. I still have an Amiga which I could use, but I wanted to start with the Atari first because that's easier. The Atari hardware supports various video modes: monochrome, which is actually two colors, because it's not just black and white, you can pick any two colors; four, 16 and 256 colors using interleaved bit planes, which means the bits are spread out a bit; and it also supports 16-bit RGB, in big-endian mode. Very important. So after thinking about it for many years, I finally started in 2020 to get something working, but it failed and I gave up. But when we got a new fbdev maintainer, and lots of discussions about fbdev and DRM early this year, I thought it was time to start working on getting it ready for real. I have something that works. Still not ready for submission, but it's a start. So what were the issues I had? DRM supports only color-indexed frame buffers with 256 colors, which basically means one byte per pixel. So first I had to add support for the new formats.
DRM itself does not render anything to the screen; it just needs to know about the formats and some of their properties. Rendering to the screen is done either by user space or by the fbdev emulation, for the console. I also added support for the new formats to the modetest binary, the test utility from libdrm, so you could show the test images there. As the C1, C2 and C4 formats have only a very limited number of colors, I implemented dithering there: dithering to monochrome and dithering with simple RGB. Adding the formats was not sufficient, of course: then it crashed in various places, because there were many assumptions in DRM that a pixel is at least one byte. As soon as a pixel is smaller than one byte, you get divisions by zero and other things like that. I also added definitions for the formats R1, R2, R4 and R8, and D1, D2, D4 and D8, which are meant for monochrome displays; the R ones are for grayscale. Typically with RGB you have red, green and blue, and the convention apparently is that if you have a single channel you call it the red channel, even if it's not red but grayscale or purple or whatever. The D ones are meant for hardware where there's an inverse relationship between channel value and brightness, which means that zero is the lightest and, for example, 255 is the darkest. I have no immediate use for those, but it's there. In DRM, the color lookup table is always 256 entries for now; that's probably overkill and we should fix it. The main reason is that it also doubles as a gamma table. To make use of all these new formats, you probably have to update your user space, if you're interested in using them. Most of these things are now finally queued for v6.1 and showed up in linux-next, but the modetest fixes I submitted for libdrm are still not applied. A second issue I had was endianness.
So DRM formats are defined to be little-endian, unless the format identifier has bit 31 set. That's nice. Warning: old drivers (on PowerPC, I assume, because that was the other big-endian system that was using DRM) may use native endianness. That's considered a bug; new drivers should set the flag when they really mean big-endian. And the fbdev emulation for the console translates: if you request one of the formats, it's translated to the other one. Good. So what does this mean for endianness? If you have a 32-bit pixel with four bytes, three of them red, green and blue, then on big-endian the remapping of the modes just gets you from one mode to another, both with one byte per color value. With 16 bits per pixel that doesn't work like that, because you have this annoying split here in the middle, so the green ends up in both bytes. So that was interesting when I finally tried it: it turned out that those formats were not recognized by DRM. There was some support for it, but not everywhere. DRM has lots of conversion helpers to convert RGB formats; for example, you can convert from 32-bit RGB to a lesser format, and those conversion helpers handle most things, like the endianness handling as well. I fixed modetest on big-endian too. I was a bit surprised: since I thought DRM was used on PowerPC, I expected simple test utilities like modetest to support big-endian as well, but apparently not. I'm going to skip some parts because we're slowly running out of time, but the conversion helpers are fixed now; they are queued for v6.1. The DRM support is missing. Another thing we have with old graphics hardware is that it was typically used with analog video modes. Maxime Ripard is also working on analog TV support, which is slightly related, but I'll skip that.
If you remember how old analog TV worked: it was basically an electron beam, steered by magnetic fields, moving over the screen and lighting up phosphors. What's fixed in the video mode is actually the number of lines; how many pixels you have per line depends on the analog bandwidth. On a digital computer, of course, you have a clock and a fixed number of pixels, but this could be quite variable. Also, with old CRTs you have the overscan region here, parts of which may or may not be visible, and you have no way to find out what the capabilities of the screen are. With digital displays it's much easier these days: they have a fixed number of pixels, and you usually don't want to use any other video mode, because otherwise you get blurry output. What's interesting about digital displays is that the actual interface, on HDMI for example, is basically a digitized version of the old analog timings. So some parts of that are still there, and there are displays that use different interfaces, but for the analog case, all drivers that used to work with analog screens could need various video modes. One notable thing about DRM is that it expects that mostly everything works with the format called XR24, which is basically a 32-bit-per-pixel RGB mode. Even drivers for monochrome displays implement this; they don't implement a monochrome format, because it didn't exist until the low-color patches. So it works with everything. It's very suitable for desktop graphics, because nobody wants less than RGB color, but it may be overkill for lesser systems: you have to copy and convert, things like that. So traditionally, the DRM drivers for monochrome displays take the 32-bit-per-pixel RGB data, convert it to 8-bit grayscale, and then convert that to monochrome.
And yeah, modern Intel graphics hardware claims 10 gigabytes per second of bandwidth, but not all systems have that, especially older or embedded ones. There are other issues too, because if applications are not aware of how many colors there really are, they cannot optimize for the screen. On the left you see the normal 256-color penguin. If the system thinks you have RGB capabilities, you get the left penguin, and after conversion to black and white you get something like that, which is really bad. Also, if you have gray text on a gray background, or blue text on a red background, or something like that, you may end up with something that's not readable at all. So what's the status of the Atari DRM driver? I got all the pixel formats working, the text console is working, modetest is working: conversion from RGB to 256 colors works, conversion to 32-bit RGB works, conversion to the big-endian 16-bit RGB works. I can do video mode programming, so it supports almost all the same video modes as the old frame buffer device driver, and fbtest works using the fbdev emulation. What can be improved? My major headache now is the video mode programming code, because it's very complicated, and despite all the helpers I still don't know which helpers I should really use and whether I can make it simpler. There should be a way to allocate the shadow frame buffer in the native 16-bit big-endian RGB565 format, to avoid having to convert from big to little endian or vice versa. I have tested on ARAnyM only, so I cannot provide you with benchmarks on real hardware. What about performance? I don't have benchmark data from real hardware, but I can tell you about the kernel size. Kernel size increased by almost 300K, because DRM includes things I don't need on my Atari, like I2C, HDMI and IRQ domain code.
Due to shadow frame buffers, it consumes much more RAM as well. So yeah, that's what we lose. On the ARAnyM emulator the text console became 10 times slower: it now draws into a 16-color, 4-bit-per-pixel packed shadow frame buffer, and when something changes there, it has to be converted to the interleaved bit planes. We should probably find a way to improve that. Is all of this still relevant? All the legacy hardware is obsolete now and your smartphone can do 3D graphics fine, but we still have low-end and embedded platforms: small displays, limited amounts of RAM, the typical things. One example here is a one-megapixel E Ink display, which is a monochrome display, about similar to what we had on the Sun 3/50. So you need just the same amount of video RAM as on the Sun-3. If you need a shadow buffer with 32 bits per pixel, you consume all the rest of the Sun-3's RAM just for the shadow frame buffer. If you have to convert to grayscale in between, you need a second Sun-3, just because you need more memory for that. Fortunately, these days such a one-megapixel E Ink display ends up in a modern e-reader, and a modern e-reader easily has hundreds of times the amount of RAM of the old Sun-3. But if you would show the same kind of images as you used to show on a Sun-3, you'd need a stack of Sun-3s going up to the roof. Another thing: this interesting display here is a seven-color E Ink display that uses four bits per pixel. How do you model that? Perhaps with C4, which assumes you have 16 colors, but you need a fixed palette, and DRM does not have support for fixed palettes yet; fbdev didn't really have that either. So, conclusion: I think we can convert fbdev drivers to DRM drivers. I think I've identified most of the missing functionality, and I've sent patches to handle that. What do we gain?
The most important gain is the common user space: we don't have to care about bit planes versus packed pixels or whatever strange format your hardware may use. You can just have one simple packed-pixel format. We could finally implement support for the old Amiga HAM (Hold-And-Modify) modes. And perhaps we can get rid of fbdev: one less subsystem to maintain. What we give up is mostly the low memory consumption and performance. So yeah, we're running out of time, but I think I've finished. I'd like to thank a few people here, and perhaps we can take one question. No? Stop? Okay. Thank you very much.