Welcome, everyone. This is Chrome OS Graphics 101. The goal is not to go into every detail, but to introduce the basics of the Chrome OS graphics stack. My name is Stefan, I work on Chrome OS graphics, so it's a good fit I guess, and I've been here for a while. In this presentation we're going to focus on some use cases and some patterns that are useful as a basis for understanding how the Chrome OS graphics stack functions.

First of all, here's a quick overview of the Chrome OS desktop. We have multiple windows, as you can see. On top, we have a web-based, HTML-based window. Underneath it, we have other components: an ARC++ window with the Play Store, and underneath that, a native Linux application, gedit, running. All of this is running on Chrome OS. As you can imagine, those things are pretty different, and the goal here is to figure out how they function together, how we make sure they work, et cetera.

So let's start with a one-sentence, super high-level description of what's happening. The Chrome OS graphics stack really has two halves. One is user space: Chrome, running in user space, is the big application. It sits on a kernel API called DRM, which provides services for display and graphics. So the Linux kernel provides the basic graphics services, and Chrome uses these to build the interface we saw earlier.

If we look at it as a bit of a schematic, those three main windows we were looking at earlier each have their own stack. One of them is a renderer, the web-based renderer, so it sits here. One of them was the ARC++ Play Store, so that's this stack. And one of them is the Linux application, which is what we call Crostini, basically our Linux virtual machines on top of Chrome OS.

To display all these applications together, we have a few tricks up our sleeve. For native applications it's pretty simple: it works basically the same way as normal Chrome tabs. We have renderer processes that talk to Chrome; we'll go into more detail on exactly how that's done. For external applications, like ARC++ or Crostini applications, we use a mechanism called Exosphere, which takes foreign buffers from those applications and puts them onto the screen alongside all the other windows. Everything that is produced this way ends up in a single place. There's a hint of that in this picture: you see things coming together in the middle. That's what's happening: we gather all the images together and produce the final content.

Before we go into too much detail on how exactly all of this works, I'm going to give a quick introduction to all of these layers. We're going to start from the bottom, from the hardware, and I'll say: look, this can do this, this does that. Then, once we have all the pieces, we'll look at some use cases and scenarios and see how they line up.

So let's start with the first piece, the hardware. What's in that box? Graphics hardware. Actually, there are many things in this box. It's not just a GPU 3D engine, it's not just a display engine; it's many things. The first piece that's important to think about is what we call the 3D engine.
This is what you probably all know as DirectX, DirectX 12, OpenGL, Vulkan, all that good stuff: the 3D. That's done there. Then we have another piece called the display engine. The first piece takes commands, those high-level rendering commands, and produces buffers of pixels out of them. The display engine takes buffers of pixels and sends them down the wire to get displayed on the screen. So you have two engines that work together: one produces images, and the other shows them on the screen. The last engine, which we're not going to talk about today because it's only half an hour, is the video engine. It's similar to the 3D engine in that it produces buffers, but instead of producing buffers based on a stream of rendering commands, it produces buffers based on an encoded video. Think about VP8 or VP9 video: when you play YouTube, there's a little piece of hardware that knows how to decode that YouTube stream into a picture, and then that picture can go to the display engine and get shown on the screen. So those are the basics.

What's above the hardware layer? Above the hardware layer we have, like I said, the kernel side: DRM, not to be confused with digital rights management. It's the Direct Rendering Manager, the name of the kernel module in the mix. It sits on top of the hardware, talks to it, programs it, and through it we can control the GPU and the display and say, do this, now do that.

There are two interesting bits of functionality in there. One is that, through atomic updates, we enable what we call overlays, a mechanism by which multiple layers in the display engine can be displayed at the same time. This is useful if, for example, you have multiple buffers coming in and you don't want to copy all of them into a single buffer; instead, we display all of these buffers directly using the display hardware's ability to say, this one goes here, that one goes there. Think about one window and another window: those are two buffers, and that's essentially what this can do. So we have a functionality called atomic display which says, I want to set, in an atomic way, this set of overlays, this set of buffers, onto the screen.

The other very powerful piece of functionality we have is buffer sharing. This is about taking a buffer from, let's say, the 3D engine that produces 3D content, and sharing it with another piece of the driver, say the display part, or sharing it between one process and another. We have this capability of taking a buffer and, without making a copy along the way, telling another process: here's my buffer, you can use it, or you can produce into it and I will reuse it later. This is called dma-buf, and it's basically a file descriptor. We can pass that little file descriptor around, over a socket for example, and tell the other process: hey, you can have this buffer, I'm sending it to you, it's yours. That way we get zero-copy buffer sharing between basically all the pieces of the stack that we saw earlier. These are the two big mechanisms that are fundamental to the whole Chrome OS graphics stack.
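To make that buffer-sharing idea concrete, here is a minimal sketch of the fd-passing part: sending a dma-buf fd to another process over a Unix domain socket. The socket and the dma-buf fd are assumed to exist already (they are hypothetical here, not Chrome code), and error handling is trimmed.

```cpp
// Minimal sketch: sending a dma-buf file descriptor to another process over
// a Unix domain socket with SCM_RIGHTS. The socket `sock` and the dma-buf fd
// `dmabuf_fd` are assumed to have been created elsewhere (the fd could come,
// for example, from a GBM allocation as shown later).
#include <sys/socket.h>
#include <sys/uio.h>
#include <cstring>

bool SendDmabufFd(int sock, int dmabuf_fd) {
  char byte = 0;                       // Must send at least one byte of data.
  iovec iov = {&byte, sizeof(byte)};

  char cmsg_buf[CMSG_SPACE(sizeof(int))] = {};
  msghdr msg = {};
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  msg.msg_control = cmsg_buf;
  msg.msg_controllen = sizeof(cmsg_buf);

  cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
  cmsg->cmsg_level = SOL_SOCKET;
  cmsg->cmsg_type = SCM_RIGHTS;        // "Here is a file descriptor for you."
  cmsg->cmsg_len = CMSG_LEN(sizeof(int));
  std::memcpy(CMSG_DATA(cmsg), &dmabuf_fd, sizeof(int));

  // The receiver does the mirror-image recvmsg() and gets its own fd that
  // refers to the same buffer: no pixel data is ever copied.
  return sendmsg(sock, &msg, 0) >= 0;
}
```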
One piece that's also interesting is our GPU memory allocator, minigbm. It's basically a way to solve the buffer allocation and format problem. It's a library that takes a request, for example: I want to allocate a buffer with these properties, I want this buffer to be displayable, I want this buffer to be renderable by my GPU, and I want it to have dimensions of 1024 by 768. We give all those constraints to minigbm and say, give me back a buffer, and it does: okay, here's your allocation. Why is it abstracted? Because different GPUs have different requirements, for example in terms of pitch, in terms of internal formats, even in terms of what they can or cannot do; sometimes an allocation will simply fail. So we have all these things to take into account, and we abstract them away. And remember, I said in the previous slide that we can exchange those buffers using dma-bufs. One of the things we can do once we've allocated something is say, give me a dma-buf file descriptor out of this, and I'm going to share it with another process. That's what we mean here by supporting buffer sharing. The buffer also carries a bit of metadata, modifiers for tiling and so on, but for most intents and purposes this can be hidden; the library hides it for us.

Then on top of that, or rather off to the side, we have all the 3D drivers. Here we're talking about GLES and Vulkan: the stuff that takes those high-level rendering commands, draws triangles out of them, and produces the rendered buffers we know. One interesting thing on Chrome OS is that we have a special process called the GPU process, now renamed the Viz process, which acts as a marshaller, a proxy for all GPU access in Chrome. By this I mean that any other process, like a renderer, that wants to use the GPU does not do it directly; it goes through that GPU, or Viz, process. I'm going to use both names, GPU and Viz, because we're in the middle of the transition, but it's the same thing. What's interesting is that we can actually recover from failures. If there's a driver-side crash, for example, this process is designed to be restarted. The whole stack is resilient: we can say, oh, it crashed, you'll probably see the screen freeze or blink for a second, and then it restarts. Usually that means there was a GPU process crash. If you want to play with it, you can go into the task manager, find the GPU process, and kill it; you'll see it come back, but you'll get a blink. So that's one thing that's kind of fun. In terms of tests, these drivers have to pass the GPU integration tests on the Chrome side for Chrome OS.

Now, remember I said we have three stacks: the web stack, the Android stack, and the Linux virtual machine stack. For the Android stack, things function differently: Android applications do have direct GPU access, just like the Chrome GPU, or Viz, process does. To satisfy the Android requirements, we have to pass dEQP and CTS, the Android conformance tests, to be qualified as an Android implementation. That also ensures the quality of our code here.
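As an illustration of that allocation path, here is a rough sketch using the standard GBM API that minigbm implements. The device node path is an assumption (it varies per board), and error handling is omitted.

```cpp
// Minimal sketch of allocating a displayable, renderable buffer through the
// GBM API (which minigbm implements on Chrome OS) and exporting it as a
// dma-buf fd that can be shared with another process or another driver.
#include <fcntl.h>
#include <gbm.h>
#include <unistd.h>
#include <cstdio>

int main() {
  int drm_fd = open("/dev/dri/card0", O_RDWR | O_CLOEXEC);  // Path is an assumption.
  gbm_device* gbm = gbm_create_device(drm_fd);

  // "Give me a 1024x768 XRGB buffer that the display engine can scan out and
  // the 3D engine can render into." The library picks the right pitch,
  // tiling, and so on for this particular GPU.
  gbm_bo* bo = gbm_bo_create(gbm, 1024, 768, GBM_FORMAT_XRGB8888,
                             GBM_BO_USE_SCANOUT | GBM_BO_USE_RENDERING);

  std::printf("pitch: %u bytes\n", gbm_bo_get_stride(bo));

  // Export the allocation as a dma-buf fd, ready to be passed around.
  int dmabuf_fd = gbm_bo_get_fd(bo);
  std::printf("dma-buf fd: %d\n", dmabuf_fd);

  close(dmabuf_fd);
  gbm_bo_destroy(bo);
  gbm_device_destroy(gbm);
  close(drm_fd);
  return 0;
}
```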
Finally, the third and last type of application we run is the Linux virtual machine apps. This one is full-on virtualization, so we have an extra layer that actually virtualizes the graphics and runs on top of the host implementation of the GPU driver. Again, we'll go into more detail on exactly how that functions in a minute.

So, let's go one layer higher in the stack, where we start to be inside of Chrome, with a layer called Ozone. Ozone is an abstraction layer in Chrome, and its purpose is simply to abstract the platform: all the specifics of how you create windows, how you allocate graphics buffers, how you display things on the screen, all of that is abstracted by this layer called Ozone. Of course, because it's an abstraction, it supports many possible backends. It's not just Chrome OS; other backends include plain Linux, and there's a Chromecast backend. If you run Chromecast, there's a special Ozone Chromecast backend that runs only on Chromecast, and that's how Chrome runs on that platform. So Ozone hides all of these details from the higher levels and implements the primitives we talked about. And like I said earlier, we have those hardware overlays, those layers we can use to show multiple windows together at the display level; Ozone exposes that to the higher layers too.

What's on top of that? We have the Chrome compositor, or Chrome display compositor. Initially this was written to support one tab: you would have multiple elements inside a web page, and we would have to layer them on top of each other. The initial vision for this compositor was to say, all these elements need to be squashed together to produce that one picture, and that's what it does: the compositor basically takes a set of pictures and squashes them together. Over time it has grown. That was the initial design, to support that use case, but over time it grew, on Chrome OS in particular. It now supports UI compositing, which means our user interface, window bars, window titles, menus, the shelf, et cetera; all of this also sits on top of that same display compositor. And when we added Play Store Android support to Chrome OS, we added a third client on top of that same compositor, Exosphere, which lets us import buffers out of those native applications and display them as part of the normal compositing flow.

Quickly: the Chrome compositor does have support for a bunch of effects. When you see a blur effect used at the UI level, it is actually implemented in the compositor. Same thing for rounded corners, color transformations, et cetera; it supports all those basic effects. But it doesn't know, for example, what a window bar is or what the shelf is at all. It only knows, okay, I can give you rounded corners around that rectangle; that's what I can do. The thing that does understand those semantics, like "this is a window bar," is the Chrome OS UI. The Chrome OS UI uses the functionality we just talked about, the blur, the rounded corners, all that layer support, and builds the UI on top of it. This UI is the Aura Shell, which we call Ash, and it's basically in charge of giving your window a frame, giving your window a shadow, displaying a launcher at the bottom. It does all that on top of the functionality we talked about.
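To picture the layering, here is a deliberately simplified, hypothetical C++ interface in the spirit of what Ozone abstracts. These are not Chrome's actual class or method names, just an illustration of the kind of operations a platform backend has to provide.

```cpp
// Hypothetical, heavily simplified sketch of the kind of interface a
// platform-abstraction layer like Ozone exposes. Names are illustrative
// only; they are not the real Chrome classes or methods.
#include <memory>

struct NativeBuffer {};          // Stand-in for a GPU/dma-buf backed buffer.
struct PlatformWindow {};        // Stand-in for a native window.

struct OverlayPlaneConfig {      // One hardware overlay: which buffer, where.
  NativeBuffer* buffer;
  int x, y, width, height;
};

// Everything the higher layers need from "the platform": windows, buffers,
// and a way to present overlays. A Chrome OS/DRM backend, a plain Linux
// backend, or a Chromecast backend would each implement this differently.
class PlatformBackend {
 public:
  virtual ~PlatformBackend() = default;

  virtual std::unique_ptr<PlatformWindow> CreateWindow(int width,
                                                       int height) = 0;
  virtual std::unique_ptr<NativeBuffer> AllocateBuffer(int width,
                                                       int height) = 0;

  // Atomically present a set of overlay planes for the next vsync, mirroring
  // the kernel's atomic display API.
  virtual bool CommitOverlays(const OverlayPlaneConfig* planes,
                              int plane_count) = 0;
};
```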
Now, if we go off to the side after the UI, there's Exosphere, which is the capability to import those foreign application windows into our compositor. All these external clients can send their buffers over to us, and we can integrate them into the display. How does that work? We use a protocol called Wayland, which is basically the Linux standard for display. Exosphere is essentially a Wayland server, and the clients are just Wayland clients; they send their buffers over. So this is pretty standard. We did have to extend the protocol a little bit to support some functionality we cared about that wasn't there, for example adding window frames, doing some scaling, et cetera, but nothing too big. And as we saw a few slides ago, Exosphere is a client of the display compositor, just like all the other components. Again, we'll see in more detail later how Exosphere is used to display an Android app or a Linux application.

So, ARC++, a.k.a. Play Store support, a.k.a. Android support for Chrome OS: it lets us run those Android applications we saw at the very beginning on top of Chrome OS with native performance. One thing to know is that this actually runs a full Android system, and we had to re-implement all the platform-specific components in a Chrome OS-specific way to make ARC++ function. That includes graphics. For graphics, what does that mean? Well, there's a buffer allocator called gralloc, which plays a similar role to minigbm, but because it's a different platform we implemented it on top of minigbm. In terms of drivers, we're able to use almost the same drivers, minus some small integration work, so that's actually quite straightforward. In terms of how you display windows, like we said earlier, we have to hook all our machinery up to Exo. The HWC, which means hardware composer, is that hook: it's a platform-specific component that, on a normal Android platform, would take those windows from Android, put them together, be the main compositor basically, and show them on the screen. On Chrome OS we cannot do that, because of course we have to pass them to Chrome. So our hardware composer implementation just passes things down to Chrome for further display.

The last one we have is Crostini, like I said, our Linux system. This one uses a full-on virtual machine, so there's actually a guest kernel running, with a whole Linux system running inside the image. Because of that, we do not have direct access to the GPU, or to any graphics for that matter; we have indirect access. What does that mean? It means we have to have some kind of broker in the middle that marshals commands in the guest and demarshals them on the host to actually execute them. That piece is virgl, Virgil 3D. And then we have a daemon called Sommelier, which basically delegates the presentation to Chrome, in a very similar way to what our hardware composer does on the Android side of things.
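Because Exo speaks (mostly) standard Wayland, clients such as Sommelier connect to it the same way any Wayland client would. Here is a minimal sketch of that initial handshake with libwayland-client, just up to enumerating the globals the server advertises; buffer submission would follow from here.

```cpp
// Minimal sketch of the client side of an Exo/Wayland connection: connect to
// the compositor's socket and list the globals it advertises (wl_compositor,
// wl_shm, and so on).
#include <wayland-client.h>
#include <cstdio>

static void OnGlobal(void*, wl_registry*, uint32_t name,
                     const char* interface, uint32_t version) {
  std::printf("global %u: %s (v%u)\n", name, interface, version);
}
static void OnGlobalRemove(void*, wl_registry*, uint32_t) {}

int main() {
  // nullptr means "use the WAYLAND_DISPLAY / XDG_RUNTIME_DIR defaults".
  wl_display* display = wl_display_connect(nullptr);
  if (!display) {
    std::fprintf(stderr, "no Wayland compositor found\n");
    return 1;
  }

  wl_registry* registry = wl_display_get_registry(display);
  static const wl_registry_listener listener = {OnGlobal, OnGlobalRemove};
  wl_registry_add_listener(registry, &listener, nullptr);

  // Round-trip so the server has flushed all the initial globals to us.
  wl_display_roundtrip(display);

  wl_registry_destroy(registry);
  wl_display_disconnect(display);
  return 0;
}
```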
So those were all the components. Now let's put them together and see: when I have a web page and something happens on that page, how does it get to my screen? We've seen all the components, but how do things flow through them? Here's a hopefully simple diagram of how it lines up. This is an HTML page. Each tab, as you might know, runs in its own process in Chrome, the renderer process, so we have HTML and JavaScript running inside this tab, and we have Blink, which is our web engine. Blink parses the DOM and produces all those GPU rasterization commands. We've talked about how the GPU can turn high-level drawing commands into buffers of pixels; that's what we're doing here. The renderer produces those commands but doesn't execute them; it just produces them and passes them to the GPU, or Viz, process, which actually runs them on the GPU, on real graphics hardware. That goes through the host OpenGL ES stack to the kernel, to the hardware, and back. When that's done, we have a new frame, and we can send that frame to the compositor, which takes it, merges it with all the other windows we have, squashes them together, and calls the Ozone layer (that's the play on words right here), calls the Ozone layer to ask the display to show it on the screen, which goes to the kernel and takes effect.

In linear terms, let's put this in a slightly more linear way. We start with our web document, which triggers a paint in Blink. That triggers rasterization, which produces the list of commands. Then the rasterization happens, and that triggers a new frame at the display compositor level; at that point we're in the GPU, or Viz, process. Then the display compositor squashes all those images together to produce a scanout buffer, and then we decide, now we can display that buffer, because it's ready. So we talk to Ozone: please start displaying. The kernel driver takes that command and programs the hardware to start displaying it, the pixels show up on the screen, everybody's happy, we see our image.

How does that look for an Android application? For an Android application it's a little different. We have the Android stack running here. The Android application, same thing, produces its own private buffer, but when it's done it passes it to Android's compositor, which is SurfaceFlinger. When SurfaceFlinger receives the frame, it hands it to the hardware composer, and the hardware composer we have just passes it across, outside the Android system, to Chrome itself, and from there we basically use the same compositing stack as in the previous slide.

So what does that look like? We scroll in the Play Store application, and a new frame gets produced because we scrolled, using OpenGL for example, or the CPU; both can work. SurfaceFlinger tells the hardware composer: hey, look, this is the new frame, can you do something with it? The hardware composer passes the frame over to Chrome, and then Exosphere takes it and generates a new display compositor frame. The display compositing in this case hits an overlay: we see that maybe the other buffers haven't changed, so we can reuse that overlay and not have to do full compositing. This is very cheap, so we're happy. So we go to Ozone DRM and say, please change that overlay and show this application's new buffer, and then the kernel programs the hardware. That's magical.
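That "change the overlay" step is, underneath, a DRM atomic commit: stage the new framebuffer on a plane's properties and commit everything at once. Here is a rough sketch with libdrm, assuming the plane, CRTC, and property IDs were looked up earlier via the usual discovery calls; the helper names here are illustrative and error handling is trimmed.

```cpp
// Minimal sketch of the atomic "flip this overlay to a new buffer" step.
// Property IDs are assumed to have been discovered beforehand (for example
// via drmModeGetPlaneResources() / drmModeObjectGetProperties()).
#include <xf86drm.h>
#include <xf86drmMode.h>
#include <cstdint>

struct PlaneProps {           // Property IDs for one overlay plane.
  uint32_t fb_id, crtc_id, crtc_x, crtc_y, crtc_w, crtc_h;
};

int FlipOverlay(int drm_fd, uint32_t plane_id, const PlaneProps& p,
                uint32_t crtc, uint32_t new_fb, int x, int y, int w, int h) {
  drmModeAtomicReq* req = drmModeAtomicAlloc();

  // Stage the new state: this plane now shows `new_fb` at (x, y, w, h).
  drmModeAtomicAddProperty(req, plane_id, p.fb_id, new_fb);
  drmModeAtomicAddProperty(req, plane_id, p.crtc_id, crtc);
  drmModeAtomicAddProperty(req, plane_id, p.crtc_x, x);
  drmModeAtomicAddProperty(req, plane_id, p.crtc_y, y);
  drmModeAtomicAddProperty(req, plane_id, p.crtc_w, w);
  drmModeAtomicAddProperty(req, plane_id, p.crtc_h, h);

  // Commit everything in one shot; all staged planes change together on the
  // next vblank, or not at all.
  int ret = drmModeAtomicCommit(drm_fd, req,
                                DRM_MODE_ATOMIC_NONBLOCK, nullptr);
  drmModeAtomicFree(req);
  return ret;
}
```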
The Linux-application-to-screen flow is super similar to the Android flow, except that we have a bigger box here that also has its own private kernel. But the same mechanism happens: the Linux application goes through our virtualized OpenGL ES stack, which executes the rendering commands by plumbing them all the way down, outside the VM, and we come back with a buffer. The buffer is passed over Wayland through Sommelier and goes to Exo, and from that point on it's exactly the same as an ARC++ buffer.

In linear terms again, what does that look like? Well, we type some characters in gedit. GTK3 triggers some rasterization and produces a new frame. If we're doing CPU-based rendering, we copy that content into a dma-buf, which then gets passed to Exo. At that point, again, it's exactly like an Android application: it's a buffer that comes into Exo, and you already know the drill. We produce a display frame, we find that we can use an overlay, we go to Ozone DRM, the kernel driver starts scanning out the buffer, it shows up on the screen, and again we're happy.

So that's the display case. I have one more interesting case before we wrap up, which is display configuration; I work on both graphics and display, so I can talk about both. This one is kind of fun. A monitor gets plugged in, the hardware generates an event that goes up to the kernel, Ozone picks up that event, and we go through a piece of code called the display configurator, which has all the logic to decide what to do with this monitor. At that point nothing has happened yet; the monitor is still black. The configurator asks: what is this monitor type, what video mode should I pick, what refresh rate should I pick, should I be in mirror mode or extended mode? These are the brains that make that decision. When it has made the decision, it puts it into effect: it says, okay, Ozone DRM, please program the monitor with 1024 by 768 at, I don't know, 60 hertz, some basic resolution. That goes to the kernel, which actually programs the hardware. There's a little extra arrow here because we also tell the Chrome OS UI: look, we have a whole new display that just came in, please take care of it, put the UI on it, et cetera. It's a small arrow, but there's a lot of work that happens down the line; the gist of it is that we go through the display configurator first.

So again, in linear terms, what does that mean? We plug in the monitor, there's a hardware event, the kernel picks up that event, we consume it in the Ozone layer, the configurator decides, oh, I want this mode for my display, Ozone asks the kernel, hey, can you please set this mode that the configurator decided, and then the kernel configures it, and we see pixels on the screen. Everybody's happy.
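To give a flavor of what the configurator works with, here is a rough libdrm sketch that enumerates connectors and reports a candidate mode for each connected display. A real configurator of course does much more (mirror versus extended layout, scaling, rotation, and so on), and the device path is an assumption.

```cpp
// Rough sketch of the "a monitor appeared, which mode do I pick?" step:
// enumerate DRM connectors, find the connected ones, and report a candidate
// mode (the kernel usually lists the display's preferred mode first).
#include <fcntl.h>
#include <unistd.h>
#include <xf86drm.h>
#include <xf86drmMode.h>
#include <cstdio>

int main() {
  int fd = open("/dev/dri/card0", O_RDWR | O_CLOEXEC);  // Path is an assumption.
  drmModeRes* res = drmModeGetResources(fd);
  if (!res) return 1;

  for (int i = 0; i < res->count_connectors; ++i) {
    drmModeConnector* conn = drmModeGetConnector(fd, res->connectors[i]);
    if (conn && conn->connection == DRM_MODE_CONNECTED && conn->count_modes) {
      const drmModeModeInfo& mode = conn->modes[0];
      std::printf("connector %u: would pick %ux%u @ %u Hz\n",
                  conn->connector_id, (unsigned)mode.hdisplay,
                  (unsigned)mode.vdisplay, mode.vrefresh);
    }
    if (conn) drmModeFreeConnector(conn);
  }

  drmModeFreeResources(res);
  close(fd);
  return 0;
}
```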