Spring and summer. That's one of the cool things.

About two years ago, we started analyzing what we wanted for the X driver architecture and what we wanted for graphics at Intel. I started attending the kernel summits, trying to figure out where kernel interfaces were going for graphics, and what we wanted to do in terms of integrating the window system and the graphics devices in the kernel context. There were a lot of clear directions we needed to go. Obviously, the architecture we have today, where user mode manipulates PCI configuration registers, is not sustainable going forward. The architecture we have now, where user mode directly accesses I/O ports, is not sustainable. So we needed to make some fairly dramatic changes, and we had some fairly interesting opportunities, in terms of architectural changes coming from outside, that we wanted to explore.

So the question we had two years ago was: what do we want? What's going to make us happy? What's going to satisfy the needs of our users, our developers, and even our OS vendors? What's going to make Red Hat happy? What's going to make an Ubuntu developer happy? What's going to make Debian happy? Well, I use Debian, so I don't really care about that. Are they happy? Of course. And Gentoo? Eh, they're just going to turn on new compiler flags and break my driver again. OK. Sorry. Thank you.

So the big question is: what's wrong with the current driver architecture? Where are the fundamental problems? What are the limitations of the architecture we have? What things can't we do that we want to be able to do? And how do we get from where we are today, in a relatively clean and incremental process, to where we want to be? The important thing here is that we don't want to just discard all of our existing code. We don't want to break everybody's operating system for three years while we go off and rewrite the world; we don't have that opportunity. We want to make small refinements in relatively limited areas, so that people can see continuous progress, and so that we can detect problems as we go forward. That way we don't just dump a giant new architecture on the ground, notice that it's completely broken, and take three years to figure out how to fix it.

OK. So what do we want on the desktop? The number one thing we want is the bling experience that we're all accustomed to now: a composited desktop where all the objects in your environment can be manipulated by a compositing manager and put up onto the screen. Now, is that compositing manager purely a 2D abstraction, using 2D APIs to paint, or does it have some kind of integrated geometry mechanism where you can have SVG objects embedded in your compositing manager? I don't know. But I know what I want. I want it clean, I want it tear-free, and I want it so that you never get partial updates on the screen. When people talk about Mac OS X and how clean and fast it looks: the actual performance of the Mac OS X desktop is dramatically lower than Windows or Linux. It looks better because there are no partial updates. And we can get there. We've seen it in a lot of demos, but we haven't seen it for the entire Linux desktop. So we want to clean that up.
We want to take all of our disparate APIs and integrate them, in terms of their ability to manipulate the same objects. Right now, when you paint something with a 3D API, with the GL APIs, you can't reliably touch those pixels with 2D. We have some extensions now that let you kind of grope at them with rubber gloves on, but it isn't integrated, it isn't clean, and it isn't high performance.

We want to get to the point where, when you boot your computer, you see a boot dialog with pretty flowers on it, and it goes all the way through OS initialization to the point where you're logging in, and you never see another screen flash. Your screen flashes once when you power on the computer, and after that you're in graphics mode, all the way up to the UI and login, and it never flashes again. We want to control the user's experience all the way from power-on to desktop.

We want multiple users to be able to share the same computer. Any desktop environment today is, by definition, a shared desktop, because desktop computers aren't personal computers anymore. They're something you have in a library, in your office, in your home, in a kiosk in a mall. We want to get to the point where you can have multiple sessions on that machine and smoothly switch between them, and give everybody the same accelerated experience. Right now, one user logs in and gets accelerated 3D. The next user who comes along and uses that computer at the same time? Not so much. It's just not a good experience.

Obviously, we want to be able to plug things in dynamically. We want to be able to plug in USB video cards, monitors, mice, tablets. Especially in a mobile laptop environment, your user interface experience is very dynamic. You get on the airplane: you have no devices, your Bluetooth mouse is disabled. You come into a conference room: you have a projector, and a tablet, and a mouse. So you want to be able to dynamically change your environment.

A lot of the focus we have these days is on reducing power consumption, not only to save power and reduce the carbon footprint of your laptop, but also just to make the batteries last longer on the airplane. This battery here lasts about nine and a half hours, and that's still not enough to get me to Australia. And I want to go to Australia again. And of course, does anybody today fantasize that their computer is fast enough? No? Oh, somebody does. Is your computer fast enough? "It's much faster than..." Is it fast enough? "It's a little bit faster." There, that's the correct answer. Thank you. So we're trying to improve performance.

Another big part of our effort here is to reduce the amount of code running as a privileged user. Right now, your entire X server runs as root. I don't know about you, but that's a whole lot of code running with a lot of privilege, especially something that talks so closely with the user through a fairly wide API, the entire X API and the entire GLX API. That's an awful lot of code running as a privileged user, especially in a process that has your entire memory system mapped into it. It's not even hard to break things in the X server.

OK, so where are we now? We have a composited desktop, right? Everybody's seen Compiz. 2D works great.
Textured video works pretty well, too. Overlay video? Not so much. If you have a machine where you want to take advantage of the video overlay, you can't paint the overlay video on the side of your 3D cube. That's very sad.

The other problem today is that 3D is not composited. If you run a direct rendering application, it punches a hole right through the middle of your screen and paints itself wherever it thinks it needs to go. Yes, I said direct rendered: direct rendered applications do this. You can actually see the window, the gears window, over here; I don't know why it's here. And then you can see where the direct rendering application actually paints, where it thinks the window should be, right into the frame buffer. Not so good. We do have to fix this, and the fix is not months away, it's weeks away. But it's not working today.

Right now, if you want to synchronize your application with the vertical retrace interval, the only way you can do that is to write a direct rendered 3D application. The only API we have that synchronizes is 3D. Your 2D applications just take their chances. We want to get to the point where all of our applications can synchronize to the vblank interval as necessary. In a composited environment, there's only one application that needs to sync, since everything else is being composited. But in an uncomposited environment, or a full-screen environment, your 2D applications may want to sync as well, especially textured video. Right now, I've got probably a dozen bug reports on the Intel driver for the 965, which only uses textured video, from people who don't like the fact that the video tears: you get a line through the middle of your video in fast-moving scenes. Not pretty, not acceptable. Why is that? Because I can't sync the painting of that textured video to the vertical retrace interval.

API integration: this is where I have three different APIs. I have a video API, XV; I have a 2D API called Render; I have a 3D API called GL. They all talk about different objects. Video can't draw to a pixmap. So if you try to have a video application painting into a pixmap, so that your application can take that pixmap and paint it somewhere else, that doesn't work today. Now, the cool thing is that textured video can be redirected to something that looks a lot like a pixmap, but not to actual pixmaps, and video still sometimes uses overlays. So we've got a lot of backfilling to do in drivers, so that video can paint to windows and pixmaps as actual pixels instead of sometimes going through overlays. 3D can't talk about 2D pixmaps at all. You have a 2D pixmap and you want to do some 3D rendering to it? Well, sorry, you can't do that today. And the 2D engine, the X Render API, doesn't know about GL textures. So if you want to take a 2D application's rendered output and let a 3D application paint from it, right now you paint that 2D output into a pixmap and then call the texture-from-pixmap function, which literally sucks the pixels out of the pixmap and blows them into a texture with the CPU, over the PCI bus, at about four pixels per second. They're both sitting in the frame buffer; how far apart can they be?
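For reference, this is roughly what that texture-from-pixmap path looks like from the client side. It's a minimal sketch assuming GLX_EXT_texture_from_pixmap is advertised; FBConfig selection and error handling are omitted, and whether the pixels get bound in place or copied through the CPU, as described above, is entirely up to the driver.

```c
/* Minimal sketch of GLX_EXT_texture_from_pixmap: wrap an X pixmap in a
 * GLXPixmap and bind it as a GL texture.  A real client must pick an
 * FBConfig with GLX_BIND_TO_TEXTURE_RGBA_EXT set and verify that the
 * extension is advertised. */
#include <X11/Xlib.h>
#include <GL/gl.h>
#include <GL/glx.h>
#include <GL/glxext.h>

static GLuint texture_from_pixmap(Display *dpy, GLXFBConfig config,
                                  Pixmap pixmap)
{
    static const int attrs[] = {
        GLX_TEXTURE_TARGET_EXT, GLX_TEXTURE_2D_EXT,
        GLX_TEXTURE_FORMAT_EXT, GLX_TEXTURE_FORMAT_RGBA_EXT,
        None
    };
    GLXPixmap glxpixmap = glXCreatePixmap(dpy, config, pixmap, attrs);

    /* The EXT entry points are fetched at run time. */
    PFNGLXBINDTEXIMAGEEXTPROC bind_tex_image =
        (PFNGLXBINDTEXIMAGEEXTPROC)
            glXGetProcAddress((const GLubyte *) "glXBindTexImageEXT");

    GLuint tex;
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    /* With luck this binds the pixmap's pixels in place; today many
     * drivers implement it as a CPU copy over the bus. */
    bind_tex_image(dpy, glxpixmap, GLX_FRONT_LEFT_EXT, NULL);
    return tex;
}
```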
OK, the other thing we want to get to is no more flashing at boot time, no more blinking of the screen, no more tearing at the vertical retrace, where you see the retrace bar in the middle of your monitor. This is fixable. Obviously, in an x86 PC environment, you're still going to get the BIOS coming up in text mode on most machines; not much you can do about that. But as soon as we get to the boot screen, I want to be in graphics mode and go from there all the way through the boot process without flashing again. We'll do as well as we can. In an embedded environment, we can do better than that.

"Do you still have the option to use a small font for high resolution on the console?" The question was, are we going to be able to configure what fonts we have on that screen? I don't know. "I'd really like to have my 32 lines and..." So you want a fancy console. I have a really fancy console for you: it's called Konsole, with a K. So yes, the question is, when we move to an FB-based environment, will you still be able to have all kinds of fancy console options? We have some thoughts about perhaps creating a user-mode console that runs on top of the raw frame buffer and doesn't use X. Maybe somebody will do that; not interesting to me. Obviously, we have a console built into the kernel, and it runs on the FB interface already, so we'll be able to continue to expose that. And of course, if you want to be in text mode, have a party. You're still going to get flashing. But, yeah, not my problem.

OK, so how many flickers do we have today? We get the hardware logo screen. Then we get GRUB. We get a bunch of kernel messages. Then we get the kernel resetting the font. Anybody enjoy that? It's like, whoa, what happened? Then the backlight goes off. Then it flashes to a solid color, oftentimes the root weave, which is one of our favorite patterns. And then it flashes to the GDM logo. All this time you're getting screen flashes, the backlight's going on and off, pixels are flying all over the screen, and users are going: what's happening? My computer's exploding.

OK, we want fast user switching. Right now you can do VT switching. Ooh, VT switching. Let's see if it even works. Well, that was fun. Oh, look, it's not on there anymore. Now I can try to switch back, and we hope it comes back. Right? You're lucky when it works. The main problem we have with VT switching is that it's really only functional for 2D applications. If the only problem were that it's ugly, it would be somewhat tolerable. The problem right now is that DRM is limited to one VT, so you only get 3D acceleration on one VT. If I want to move my 2D driver to relying on DRM for all of its acceleration, that's going to be painful if only one VT can use any acceleration. So we need to fix that, obviously.

Hot plug everywhere. Finally, we have video output switching working in most drivers; all the open source drivers now have RandR 1.2. But we have a problem: we can't resize the frame buffer. Which is to say, when you plug in your external monitor and you want twin-view mode or extended-desktop mode or whatever you want to call it today, if you haven't pre-configured your X server, before you started it, to allow for the extra width, it doesn't work. That's not acceptable. We want to get rid of the need to pre-configure, and be able to change the frame buffer size dynamically.
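To make the hot plug story concrete, here is a small client-side sketch using the RandR 1.2 calls that the open source drivers now implement: it probes every output and then asks the server to grow the screen. The sizes passed to XRRSetScreenSize are placeholders; real code would compute them from the connected outputs' preferred modes, and today the call only works if the frame buffer was pre-configured large enough, which is exactly the problem.

```c
/* Probe RandR 1.2 outputs and grow the screen for an extended desktop.
 * The sizes below are placeholders; a real client derives them from
 * the preferred modes of the connected outputs. */
#include <stdio.h>
#include <X11/Xlib.h>
#include <X11/extensions/Xrandr.h>

int main(void)
{
    Display *dpy = XOpenDisplay(NULL);
    if (!dpy)
        return 1;
    Window root = DefaultRootWindow(dpy);
    XRRScreenResources *res = XRRGetScreenResources(dpy, root);
    if (!res)
        return 1;

    for (int i = 0; i < res->noutput; i++) {
        XRROutputInfo *out = XRRGetOutputInfo(dpy, res, res->outputs[i]);
        printf("%s: %s\n", out->name,
               out->connection == RR_Connected ? "connected"
                                               : "disconnected");
        XRRFreeOutputInfo(out);
    }

    /* This is the call that needs a resizable frame buffer underneath:
     * 2560x1024 is two 1280x1024 monitors side by side. */
    XRRSetScreenSize(dpy, root, 2560, 1024, 677, 271);

    XRRFreeScreenResources(res);
    XCloseDisplay(dpy);
    return 0;
}
```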
Right now we have a problem that we can only draw to one frame buffer per GPU. That means each graphics card in your machine can only draw to a single frame buffer. The limitation there is that a GPU has a limited width it can draw to: the 915 and 945 chips can only draw to objects that are 2,048 pixels wide. Two 1280 by 1024 monitors is not an unreasonable expectation even for people on a modest budget, but you can't do it, because that's 2,560 pixels wide. If they're separate monitors, we should be able to create multiple frame buffers.

We want to reduce our power consumption. A lot of the power consumption right now is actually caused by the fact that the CPU is never idle while you're drawing. You send a bunch of commands to the GPU with the 2D engine, and then you want to send more commands, but the GPU is busy. What you'd like to say is: GPU, tell me when you're idle and wake me up. What we actually do is: are you busy? Are you busy? Are you busy? If you run some x11perf measurements filling solid rectangles, which is really boring for the X server, the CPU is melting the whole time, just asking: still busy? One moment, please. Still busy. If we can keep the CPU idle a lot more of the time, we can save a huge amount of power. So we're going to try to do that; there's a little sketch of the two approaches at the end of this passage. We also want to use techniques like Gallium, which will increase the efficiency of the code running on the GPU by using better compiler technology on our shaders to make them run faster, which makes them run with lower power.

We want to make things faster, obviously. Right now, for most GPUs, the Render extension is not accelerated well enough. The 965 and 915 actually have credible acceleration for all the basic compositing stuff, but all the other stuff, trapezoid filling and text painting and all that, is pretty bad. Architectural commonality between the 3D and 2D sides would be really nice. I don't know where that is; I've been looking at Gallium to see if it's going to make a nice integrated 2D/3D API. But things can change, and 2D really is different from 3D in some pretty fundamental ways.

X has some pretty serious security issues. The entire X server runs as root. The X server maps every I/O port in your machine right into its address space, so if you make a mistake, you can reformat your disk. The X server maps the entire graphics card, so if the X server crashes after mis-programming the video card, you power cycle. That was a good plan. One recent good change is that it no longer reconfigures the PCI devices in your machine when it starts up. That's kind of a nice change. It used to actually make sure there was address space available: if your BIOS had misconfigured your video card, your BIOS might stick your PCI controller right on top of memory the video card wanted to use, and when the X server started, it might say, oh, that PCI card running your disk needs to move, we'll just remap it so the X server has enough space. And it would all work out. "It never happened. It never touched anything." Do you have a system where it happens? It can try. It's not supposed to, but the code is there.
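Going back to the idle-waiting problem above, the difference between what the 2D drivers do today and what we want looks roughly like this sketch. Both the status bit and the ioctl number here are invented names for illustration, not the real interface; the real register bits and requests differ per driver.

```c
/* Two ways to wait for the GPU, per the discussion above.  GPU_RING_BUSY
 * and DRM_IOCTL_WAIT_IDLE are invented for illustration. */
#include <stdint.h>
#include <sys/ioctl.h>

#define GPU_RING_BUSY       (1u << 0)  /* hypothetical status bit */
#define DRM_IOCTL_WAIT_IDLE 0x6440     /* hypothetical ioctl number */

/* Today: the X server polls a memory-mapped status register, so the
 * CPU never idles while the GPU is drawing. */
static void wait_idle_spinning(volatile uint32_t *status)
{
    while (*status & GPU_RING_BUSY)
        ;   /* "Are you busy? Are you busy?" -- CPU at 100% */
}

/* Goal: sleep in the kernel until the GPU's idle interrupt fires. */
static void wait_idle_blocking(int drm_fd)
{
    ioctl(drm_fd, DRM_IOCTL_WAIT_IDLE, NULL);
}
```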
OK, so in any case, how are we going to get to where we're going? For composited 3D applications, this is not hard, and it's the place everybody wants to be. Right now, every 3D application shares a common back buffer that's the size of the screen, and that's why they all have to know how to clip to it. The back buffer is clipped exactly the same as the front buffer, so all your 3D applications actually render the scene multiple times, once to each clip rectangle on the screen. Drag another window over your 3D application, and it's going to do a pass for every rectangle on the screen; move that window around, and the clip rects have to be pushed to the direct rendered application so that it can paint them correctly. It's really tricky. Obviously, what we want to do is move to per-window depth and back buffers. Then we can take those private back buffers and swap them into the redirected window buffer instead of the frame buffer. This is going to take some changes, but we already have a demo of this working, so this functionality is actually landing fairly soon. Yay! Finally.

Synchronizing 2D applications to the retrace: right now, the easiest way to get tear-free video is to make it so that 2D applications can tell the compositing manager when they've finished a frame. The compositing manager can then use the 3D API to synchronize with the vertical retrace. Obviously, we also want full-screen applications, especially video players, to be able to sync to the vertical retrace without being redirected; that's a pretty good performance improvement. So we want to make sure that XV, the video API, can use DRM for buffer swaps, using the DRM blocking mechanism for swapping buffers. AIGLX, again, needs to be able to block in DRM for buffer swaps. There isn't a lot of action going on in this part of the API; it would be nice to see somebody actually take a stab at it.

Integrating our drawing APIs: as I said, we have three APIs, video, 2D, and 3D. It's fairly simple to integrate video. The code is already there, and we're doing textured video on lots of hardware. In fact, the only thing keeping XV from being able to paint to pixmaps is an "if the drawable is a pixmap, return failure" check in the XV code right now. Why are those two lines of code there? It's impossible to write a driver in the X server that can do XV to pixmaps today, because the X server returns failure for pixmaps. So I think we can fix that pretty easily.

For 3D and GLX: as soon as we get to common objects in the graphics card, where we're using TTM for all of our graphics memory management and we have objects that we can name, all of a sudden those object names can be passed between the 3D and 2D worlds. I can take an object from the 2D world, hand its name to the 3D system, and say: hey, here's a new texture for you. And one hopes the 3D engine will be able to take that object and use it as a texture. There are some graphics cards that are going to make this pretty difficult. I know that on Intel it's going to be easy, but I live in kind of a privileged world; my graphics card is so simple. UMA: it's a slow world, but it's a simple one. We have already demonstrated the use of this new memory manager, TTM, for 2D drawing: we're creating objects for pixmaps and rendering to them. There are some performance issues, but we think we can fix them.
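The object handoff just described might look something like the following sketch. The ioctl numbers and structures are invented for illustration, since the real TTM user-space interface is still being settled, but the flow is the point: the 2D side creates and names a buffer, and the 3D side opens the same buffer by name and binds it as a texture.

```c
/* Sketch of sharing one buffer object between the 2D and 3D drivers
 * via a global name.  All ioctls and structs here are invented. */
#include <sys/ioctl.h>

struct fake_bo_create { unsigned long size; unsigned handle; };
struct fake_bo_name   { unsigned handle; unsigned name; };
struct fake_bo_open   { unsigned name; unsigned handle; };

#define FAKE_IOCTL_BO_CREATE 0x6401   /* invented */
#define FAKE_IOCTL_BO_NAME   0x6402   /* invented */
#define FAKE_IOCTL_BO_OPEN   0x6403   /* invented */

/* 2D side: back a pixmap with a buffer object and publish its name. */
static unsigned export_pixmap_bo(int drm_fd, unsigned long size)
{
    struct fake_bo_create create = { .size = size };
    ioctl(drm_fd, FAKE_IOCTL_BO_CREATE, &create);

    struct fake_bo_name name = { .handle = create.handle };
    ioctl(drm_fd, FAKE_IOCTL_BO_NAME, &name);
    return name.name;          /* handed to the 3D driver out of band */
}

/* 3D side: open the same buffer and treat it as a texture. */
static unsigned import_texture_bo(int drm_fd, unsigned name)
{
    struct fake_bo_open open = { .name = name };
    ioctl(drm_fd, FAKE_IOCTL_BO_OPEN, &open);
    return open.handle;        /* bind this as the texture's backing store */
}
```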
Let's see: flicker-free boot. This is being handled by putting all of the mode setting into the kernel. Jerome talked about doing this in the Radeon driver. It's fairly simple: you take the mode setting code that's sitting in the X server, you put it in the kernel, and then you create an API between the kernel and user space that exports that functionality. One of the interesting things about the RandR 1.2 development was that the API became really clear once we figured out the commonalities between existing graphics cards: how they do mode setting, and the layers of output manipulation we needed. We finally had a common language for talking about video card configuration, and that language has now become this proposed kernel interface. It's very similar to the RandR 1.2 protocol. We talk about GPUs, CRTCs, and outputs, and you plug them together, with limitations on what plugging can happen, and then you have an additional bag of properties for additional mode characteristics. For TV, you want to know what video format you're using: is it PAL, is it NTSC, is it SECAM? What video mode do you have? For LVDS, you want to know whether you're stretching to fill the screen or centering in the middle of the screen for a non-native mode. Simple stuff like that. And the RandR 1.2 work really showed us how to get there. The DRM mode setting code is currently on a branch. There's a rough sketch of this object model at the end of this section.

Do you have a question? "Have you figured out the interface for EDID quirks?" So the question is, have we figured out the interface for EDID quirks, and I don't think there's any kernel interface that we know of yet to do that. Our plan currently, and I believe this is the way the code works today, is that the EDID quirks are still handled in user space. Note the third bullet here on the slide: we don't really know how to get the kernel to come up in the right mode at startup time. Our plan is to have all that data stuck in the initrd. When the system comes up, your screen is blank until user space starts and loads a frame buffer configuration utility, which reads data out of the initrd and configures an appropriate mode. So you configure it once, save it to the initrd, and the next boot it comes up in that mode. I don't think it's too complicated.

OK, fast user switching. Multiple DRM masters: DRI2 already provides this functionality for us, once it gets integrated and we're ready to use it. And the frame buffer object manipulation from TTM gives us the ability to create multiple frame buffers. Now we just need some protocol for saying: oh, use this frame buffer now; or use this one; or let me create a third frame buffer and merge the contents of these two together. Kind of a super compositing manager and user switcher thingy. We'll have some progress on that pretty soon, I hope. The code for this part is actually in Fedora right now; we're doing some experiments with it, and we expect to push it upstream fairly soon. So the move to DRI2 is happening fairly rapidly, and that should give us a whole bunch of this functionality.

Resizing the frame buffer is what's going to let us fix our hot plug monitor problem. Right now, we can only hot plug monitors as long as you pre-configured your server to let you hot plug monitors. Not an ideal situation. You have to throw XAA out the door: XAA has a fixed-size frame buffer that you can't change, with no provisions for resizing. EXA, the new acceleration architecture, really doesn't care.
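As promised, here is a rough sketch of the object model behind the proposed kernel mode-setting interface described above. All of these declarations are invented for illustration; the real code lives on a DRM branch and is still changing. The shape follows RandR 1.2: CRTCs scan out frame buffers, outputs connect CRTCs to monitors, and properties carry the extras like TV format and LVDS scaling.

```c
/* Illustrative object model for the proposed kernel mode-setting API.
 * All names are invented; only the RandR 1.2-like shape matters. */
struct kms_mode {
    char     name[32];           /* e.g. "1280x1024" */
    unsigned clock;              /* pixel clock in kHz */
    unsigned hdisplay, htotal;
    unsigned vdisplay, vtotal;
};

struct kms_property {            /* e.g. TV format: PAL/NTSC/SECAM,
                                  * or LVDS scaling: stretch/center  */
    char     name[32];
    unsigned value;
};

struct kms_output {
    unsigned             id;
    int                  connected;
    struct kms_mode     *modes;     /* from EDID; quirks applied in
                                     * user space for now            */
    int                  num_modes;
    struct kms_property *props;
    int                  num_props;
    unsigned             possible_crtcs;  /* which CRTCs can drive it */
};

struct kms_crtc {
    unsigned        id;
    unsigned        fb_id;       /* frame buffer being scanned out */
    struct kms_mode mode;        /* currently programmed mode */
    int             x, y;        /* position within the frame buffer */
};
```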
There are some 3D driver issues; moving to TTM will help resolve those. Right now, there isn't any effective way for the 3D drivers to respond when the front buffer or back buffer has moved. Not so good. Well, there is notification; the 3D drivers just don't act on it. There is some code in there to make this work, and I think it actually worked for a while on the 915 when we were doing the rotation stuff. The mechanism is there, and it's not very complicated. It's mostly a matter of the 2D drivers, which are clearly the problem right now, and then we need to fix the 3D drivers to check where the frame buffer is. Actually, a co-worker of mine said there are some problems in the DRI interface that make this difficult to do; the rotation stuff didn't change the size of the frame buffer, just the position, and that happens to work. I don't know.

The Shatter stuff, which Adam Jackson is working on, uses multiple frame buffers. This is similar to, but almost entirely different from, the old Xinerama code. The old Xinerama code would use multiple GPUs to drive multiple screens and integrate them into one big screen. It worked at a very high level in the X server and actually duplicated all rendering across all screens, and duplicated all of your rendering objects on every screen. If you had a pixmap, it would instantiate that pixmap on both GPUs; if you had a window, it would instantiate it on both GPUs. So every time you drew anything to any object in the system, it would draw on every GPU. We clearly don't want to do that when we have a single GPU. So what Shatter does is move down a level in the server: it only muxes out the drawing to the individual pixmaps themselves, and it expects the underlying acceleration functions to handle the multiple pixmaps. So we should get no performance penalty and no memory usage penalty with this particular design (there's a rough sketch of the idea at the end of this section). That's going to give us the ability to have extended desktops that are wider than the maximum rendering size supported by the drawing engine. We have some questions about how this is going to work with DRI; there's a lot of hand waving going on. One of the nice things about private back buffers is that as long as the windows are not too wide, everything will continue to work just fine. But once you want to draw to a single object that's wider than the GPU can handle, you have to do multiple renderings. It seems like the existing clipping code should make this relatively easy, because the 3D code already handles re-executing operations for different clips; here it's just a different object for each piece. I think it will be an adventure, but a manageable one.

Obviously, we want to reduce power consumption and get rid of the spinning in the 2D driver, because the DRM can wait: when you use TTM to queue a bunch of requests to the card and the card is busy, it blocks waiting for an interrupt. It's a novel concept.

We need to extend XvMC. Right now, the motion compensation extensions to the X video protocol only work with MPEG, and as we know, MPEG is not the be-all and end-all of video formats: we now have three or four new formats for Blu-ray, China is coming out with a new format, and Microsoft has their own format.
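And here is the Shatter sketch promised above: one protocol-level pixmap backed by a private pixmap per GPU, with each drawing request replayed against every backing piece after translating into that piece's coordinate space. All of the types and helpers are invented for illustration; the real work is happening in Adam Jackson's server branch.

```c
/* Sketch of Shatter-style muxing: fan a single drawing operation out
 * to per-GPU backing pixmaps.  Only the structure of the idea matters. */
#include <stdio.h>

struct backing_pixmap {
    int gpu;             /* which GPU owns this piece */
    int x_off, y_off;    /* its position within the logical pixmap */
    int width, height;
};

struct shattered_pixmap {
    struct backing_pixmap *pieces;
    int                    num_pieces;
};

/* Stub standing in for a real per-GPU acceleration hook: draw a
 * rectangle in the backing pixmap's own coordinate space. */
static void gpu_fill_rect(int gpu, int x, int y, int w, int h)
{
    printf("GPU %d: fill %dx%d at (%d,%d)\n", gpu, w, h, x, y);
}

/* Replay one fill against every backing piece, translated and clipped
 * to that piece.  Unlike old Xinerama, only the pixmap is duplicated
 * per GPU, not every rendering object in the server. */
static void shatter_fill_rect(struct shattered_pixmap *pix,
                              int x, int y, int w, int h)
{
    for (int i = 0; i < pix->num_pieces; i++) {
        struct backing_pixmap *p = &pix->pieces[i];
        int x0 = x - p->x_off, y0 = y - p->y_off;

        /* Clip the request to this piece. */
        int x1 = x0 < 0 ? 0 : x0;
        int y1 = y0 < 0 ? 0 : y0;
        int x2 = x0 + w > p->width  ? p->width  : x0 + w;
        int y2 = y0 + h > p->height ? p->height : y0 + h;

        if (x2 > x1 && y2 > y1)
            gpu_fill_rect(p->gpu, x1, y1, x2 - x1, y2 - y1);
    }
}
```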