All right, I think we're ready to go. First, thanks to everyone for coming to see my presentation today. My name is Lukas Rusak. I work on Kodi in my spare time, and I have a few colleagues here with me who do as well. This is actually my first time at FOSDEM, and my first time presenting at FOSDEM, so it's a really exciting opportunity for me. There are a few parts to this presentation, and we'll see what we've got here. So, next slide. There we go.

Just a quick overview of what I'm going to cover: the problems that we're facing and what those problems actually are, how we solved them, and then where we're going from here. Everything I'm presenting here is already in Kodi master and will be in the version 18 release. If you saw Martijn's talk earlier, he mentioned the new DRM backend; that's a big part of this. I was actually planning on doing a demo, but I don't think I will, because the board I have only has HDMI out and I don't have an adapter, and I honestly don't even know if it would work at this resolution anyway. Maybe we can find something later to play with it.

Just a bit of terminology I'd like to go over before getting into it, because there are so many abbreviations in Linux kernel work. Stuff you'll see throughout: DRM/KMS is the Direct Rendering Manager and Kernel Mode Setting; I assume most of you in this graphics dev room know what that is, but we'll touch on it a bit later. SBC is single board computer, devices like the Raspberry Pi; I brought a DragonBoard here, but there's plenty of other stuff out there. SoC, system on a chip, is a similar idea. V4L2 is the Video for Linux 2 subsystem in the Linux kernel. BSP is board support package, which is something vendors supply with their single board computers; it's usually a static kernel release, and that's what you have to use. And RKMPP is the Rockchip Media Process Platform.

So what is the actual problem we see today? Many new single board computers come out all the time. There are now eight or nine top SoC vendors, all with varying levels of support in the Linux kernel, and people buy these new boards on a whim and hope they work for their applications. It's quite a big problem. It creates a lot of code maintenance, especially when there's platform-specific code in your application. Often you're stuck with the vendor's board support package, which in a lot of cases means a really old kernel: Amlogic's kernel is based on 3.14, which is severely outdated by today's standards, and they really haven't done much to update it, but that's their thing. A lot of these solutions also use proprietary methods for all the windowing. The vendor gives you a blob and some documentation on how to use it, you write your code around it, it does some magic inside, and you hope it works. But these are all platform-specific, so they all require platform-specific wrappers around certain things, and that becomes a huge maintenance burden. I'll show you in a second what I actually mean by that; it's quite large.

Currently in Kodi, as of the version 17 release, the version that's already out, Raspberry Pi, Amlogic, and i.MX6 support existed, with varying degrees of success across all of them. Raspberry Pi works pretty well, but it uses a vendor-supplied kernel with blobs.
Amlogic is, in my opinion, a bit of a hack, but we're working on it. And i.MX6 just lost people's interest, so as of now, i.MX6 support has actually been ripped out of Kodi, thanks to me. We've also rejected some PRs for platform-specific code in Kodi, just because we didn't want to take on that maintenance burden anymore. There are already not enough maintainers in Kodi for a lot of the work we do, and in most areas there's only one maintainer for a given thing; if that maintainer leaves, as happened with i.MX6, the code dies and no one's there to save it.

This is actually a bit of an old photo now; I've accumulated quite a few more boards since then. Supporting each single board is very difficult. Cross-compiling for each board, burning the image to an SD card, and running it on each board takes a lot of time. So what we want to do is implement a single code path, basically, for use on all the different boards.

This was the platform-specific code for i.MX6, the code that was ripped out. Bear in mind these are just simple metrics, and they include the header files too, but 3,700 lines of platform-specific code for a board that maybe 100 people ran Kodi on wasn't great, in our opinion. We really needed to look into other methods. Raspberry Pi is even worse than this, but there's nothing we can really do about that at the moment; I'll talk about it later. This is just a rough example, so keep it in mind. I have some other numbers coming up as well for where we're going.

So, an overview of what I'll be covering: how we solved this using DRM; a brief look at FFmpeg and how the decoder support there helps us a lot; and then this new thing called DRM PRIME, which is the new rendering method we use across some of these new boards. It's all pretty exciting, and it's all really new, and I actually hope this helps some other applications figure out how to implement this stuff, because with brand new technology it's sometimes really difficult to figure it out on your own. I have quite a crew of guys working on this with me in various areas, but some people might not.

So, DRM and KMS. DRM is the display subsystem in the Linux kernel. It's still relatively new, but most major graphics systems have an upstream DRM driver, which is fantastic. So now, basically any board that has a DRM driver can run Kodi, which is awesome. Most people think of the DRM subsystem on the big graphics platforms like Intel, NVIDIA, and AMD, but there are so many more. This DragonBoard runs on the MSM kernel driver; it has an Adreno chipset in it, and it's really fantastic. Raspberry Pi recently gained a DRM driver, though we currently can't use it, for reasons I'll get into. i.MX6 has a DRM driver, and hopefully the upcoming i.MX8 will soon too. Allwinner has varying levels of support; once the H3 DRM driver emerges, that will be really fantastic. And Rockchip, I believe the driver is upstream, but there's still a lot lacking in the upstream Rockchip support; Rockchip's board support package is based on 4.4, though, so it's not really that old yet. I'm mainly focusing on the embedded platforms, but I've tested this on all of these, including AMD and Intel, and it can run. I mean, I'm really relatively new to this scene anyway; I've only been working on this stuff for less than a year now.
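To make that "any board with a DRM driver" point concrete, here's a minimal sketch, not Kodi's actual code, of how an application can ask a device which kernel DRM driver backs it and whether it supports the atomic modesetting that the windowing code I'll describe relies on. The device path and the bare-bones error handling are my own assumptions for illustration.

```cpp
// Hypothetical probe: open the first DRM device, print which kernel driver
// backs it, and check for atomic modesetting support.
// build: g++ probe.cpp $(pkg-config --cflags --libs libdrm)
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>
#include <xf86drm.h>

int main() {
  int fd = open("/dev/dri/card0", O_RDWR | O_CLOEXEC);
  if (fd < 0) { perror("open /dev/dri/card0"); return 1; }

  // Identify the driver: "msm" on the DragonBoard, "vc4" on a Raspberry Pi
  // with the new driver, "i915" on Intel, and so on.
  drmVersionPtr v = drmGetVersion(fd);
  if (v) {
    printf("DRM driver: %s (%s)\n", v->name, v->desc);
    drmFreeVersion(v);
  }

  // Atomic modesetting requires opting in to universal planes first.
  drmSetClientCap(fd, DRM_CLIENT_CAP_UNIVERSAL_PLANES, 1);
  if (drmSetClientCap(fd, DRM_CLIENT_CAP_ATOMIC, 1) == 0)
    printf("Atomic modesetting supported\n");
  else
    printf("No atomic support, legacy modesetting only\n");

  close(fd);
  return 0;
}
```

The point is that the application code stays identical from board to board; only the driver name printed changes.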
I attended the Embedded Linux Conference last year in Portland, where I met Mark James, who gave a presentation earlier today. He basically asked me why Kodi doesn't run on the framebuffer or on DRM, and I really didn't have an answer for him at the time, so I went home, looked into it, and wrote the DRM framework for Kodi, and that's where we are. If you're interested in learning more about DRM in general, there's tons of material out there; this linked presentation by Boris Brezillon is really fantastic, and that's where this photo is from as well.

The DRM/KMS implementation is about 2,500 lines. That's just the windowing in general; it doesn't include decoding or rendering. But 2,500 lines to run on all these different platforms is pretty great. There's no platform-specific code in there; it's designed to run on all the platforms. There are some quirks between platforms, which I'll talk about a bit later, but really it's general enough that it just runs, which is really exciting.

Then we get into video decoding and rendering, which is a huge issue. Again, in the past, all the video decoding on these chipsets went through some binary blob that does some magic and then hands us a decoded frame that we have to do something with. It's very board-specific and not very nice to work with. So what can we do about that? There are a couple of things, actually. The main one I'll be talking about is the Video for Linux 2 subsystem in the Linux kernel, which is designed for exactly this. It gives access to the hardware video decoders on the board if a V4L2 driver is available for it, and there actually are quite a lot: the i.MX6 driver is there, the Qualcomm Venus driver is already upstream, MediaTek has one, I think HiSilicon has one, and Allwinner's is in the works, though there's some flux in the V4L2 subsystem right now.

So the main issue is: on all these embedded low-power platforms, how can we take a decoded frame and send it to scan-out without copying, to get maximum performance? That's the main thing we're looking for on these low-power systems. On your i7 at home it really doesn't matter, but on a little single-core Raspberry Pi it's a huge deal. The DRM subsystem actually lets us do this in a really interesting way.

This is the decoder and renderer we wrote in Kodi, with help from Jonas in the crowd there. It's 750 lines of code, total. I don't really expect it to get much bigger than that until the HDR stuff calms down a bit in the future. But that's all we need. The nice thing about it is that it's platform-agnostic: all the work is offloaded to FFmpeg, which does some of the heavy lifting for us and deals with the little bits of platform-specific handling we need.

There are a few places here where I have code on the screen; it's not really a big deal, it's really just filler, because it's hard to take code out of context and understand what it means. But this is an important part of FFmpeg. It has been in FFmpeg master for, I think, only about four or five months now, and it's a big part of what we use. What happens is, as I show here, we take a frame and send it to FFmpeg, the specific decoder in FFmpeg decodes that frame using either V4L2 or, in the case of Rockchip, RKMPP, and then we request a certain pixel format back from FFmpeg.
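To illustrate that hand-off, here's a hedged sketch of how an application can ask FFmpeg for DRM PRIME frames from the V4L2 decoder. It's not Kodi's actual code; the decoder name and the simplified error handling are assumptions, but AV_PIX_FMT_DRM_PRIME and the get_format callback are the real FFmpeg mechanism.

```cpp
// Sketch: open FFmpeg's V4L2 H.264 decoder and request DRM PRIME output,
// so decoded frames stay in dma-bufs instead of being copied to system RAM.
extern "C" {
#include <libavcodec/avcodec.h>
#include <libavutil/hwcontext_drm.h>
}

static enum AVPixelFormat GetFormat(AVCodecContext* ctx,
                                    const enum AVPixelFormat* fmts)
{
  // Pick AV_PIX_FMT_DRM_PRIME if the decoder offers it.
  for (const enum AVPixelFormat* p = fmts; *p != AV_PIX_FMT_NONE; p++)
    if (*p == AV_PIX_FMT_DRM_PRIME)
      return *p;
  return fmts[0]; // otherwise fall back to the decoder's preference
}

AVCodecContext* OpenDecoder()
{
  const AVCodec* codec = avcodec_find_decoder_by_name("h264_v4l2m2m");
  if (!codec)
    return nullptr;

  AVCodecContext* ctx = avcodec_alloc_context3(codec);
  // Depending on the decoder, either get_format or pix_fmt selects the
  // output format; both are set here for illustration.
  ctx->pix_fmt = AV_PIX_FMT_DRM_PRIME;
  ctx->get_format = GetFormat;

  if (avcodec_open2(ctx, codec, nullptr) < 0) {
    avcodec_free_context(&ctx);
    return nullptr;
  }
  return ctx;
}
```

From there it's the usual avcodec_send_packet/avcodec_receive_frame loop: a frame decoded this way has frame->format set to AV_PIX_FMT_DRM_PRIME, and frame->data[0] points at the descriptor we look at next.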
And that AVDRMFrameDescriptor is what we get back. FFmpeg puts the decoded frame in a memory buffer, and it stays in that buffer for good. What we actually get back from FFmpeg is this AVDRMFrameDescriptor, which contains a file descriptor for the memory location of the decoded frame, plus some other information about the frame, like the size and the pixel format. So we get those two things back, and what we do with them is really interesting: we can unpack this container that is the AVDRMFrameDescriptor and use it directly.

Typically we get back a frame from FFmpeg in the NV12 pixel format. V4L2 does support some other formats, but NV12 is what we're using now. What's unique about all these embedded platforms is that they're designed to work with these pixel formats, so you can send the frame directly to a plane and have it shown there; all these embedded platforms support NV12 on their output planes. So we get a handle, basically, from the file descriptor for the memory buffer, and then we add a framebuffer for it. This is kind of the same thing you would do with regular OpenGL, but we're accessing the frame from its actual region in memory. Once we add the framebuffer, we get a framebuffer ID, and then we can let the DRM subsystem do all the work for us: we pass the framebuffer ID into an atomic mode set, along with some properties like the width, height, and position where the frame should be displayed, and then we just do the atomic commit. I'll show a rough sketch of this whole path in a moment. It's actually a very simple process; as you saw, it's 750 lines of code to do this, and it makes our life really easy.

This is what I'm talking about with planes. This is sample output from the DragonBoard using the modetest utility from the libdrm test suite, and you can see there are a lot of different pixel formats the planes support. The next thing we come across is how to support multiple planes and do the compositing, but the DRM subsystem does that for us; we just have to set up the planes ourselves. In this example, the top plane is type 1, which means it's a primary plane, and the bottom one is type 0, which is an overlay plane. There's also a type 2, which is a hardware cursor, but we don't use that at all. Both of these planes support NV12, as well as a ton of other formats, so we use the two planes to our advantage. I made these graphics myself; it took me a bit of time, and I don't really do much in GIMP anymore, but I was really happy with them.

So what we actually do in Kodi now: for video playback, we pass the frame to the primary plane, because video playback really wants to be in the primary plane. And since the overlay plane is made to sit in front of the primary plane, we put the Kodi GUI into the overlay plane. The DRM subsystem and the hardware then composite them and output them together. In both of these diagrams there's one plane with a black frame; that's just the hardware cursor plane, which we don't use, it's just there for completeness. The first example, with video playing, shows the Kodi GUI shifted into the overlay plane while the video plays back in the primary plane. This is part of what allows us to have such great performance: when the Kodi GUI isn't shown at all, we can completely disable the overlay plane.
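Putting those steps together, here's a hedged sketch of the whole fd-to-plane path: import the dma-buf from the AVDRMFrameDescriptor, wrap it in a DRM framebuffer, and attach it to a plane in one atomic commit. This is not Kodi's actual renderer; the plane, CRTC, and property IDs are assumed to have been looked up beforehand with drmModeObjectGetProperties, and it assumes no plane scaling (source size equals on-screen size), matching the i.MX6 limitation discussed below.

```cpp
// Sketch: zero-copy presentation of a DRM PRIME frame on a DRM plane.
extern "C" {
#include <libavutil/hwcontext_drm.h>
}
#include <cstdint>
#include <xf86drm.h>
#include <xf86drmMode.h>

// Property IDs for a plane, assumed to have been resolved once at startup
// from the property names ("FB_ID", "CRTC_ID", "SRC_X", ...).
struct PlaneProps {
  uint32_t fb_id, crtc_id;
  uint32_t src_x, src_y, src_w, src_h;     // 16.16 fixed point
  uint32_t crtc_x, crtc_y, crtc_w, crtc_h; // pixels
};

bool PresentFrame(int drmFd, uint32_t planeId, uint32_t crtcId,
                  const PlaneProps& p, const AVDRMFrameDescriptor* desc,
                  uint32_t width, uint32_t height)
{
  // One GEM handle per dma-buf object; NV12 from V4L2 usually arrives as a
  // single object with the Y and UV planes at different offsets inside it.
  uint32_t objectHandles[AV_DRM_MAX_PLANES] = {};
  for (int i = 0; i < desc->nb_objects; i++)
    if (drmPrimeFDToHandle(drmFd, desc->objects[i].fd, &objectHandles[i]))
      return false;

  // Flatten the first layer's description into what drmModeAddFB2 expects.
  const AVDRMLayerDescriptor* layer = &desc->layers[0];
  uint32_t handles[4] = {}, pitches[4] = {}, offsets[4] = {};
  for (int i = 0; i < layer->nb_planes; i++) {
    handles[i] = objectHandles[layer->planes[i].object_index];
    pitches[i] = layer->planes[i].pitch;
    offsets[i] = layer->planes[i].offset;
  }

  uint32_t fbId;
  if (drmModeAddFB2(drmFd, width, height, layer->format,
                    handles, pitches, offsets, &fbId, 0))
    return false;

  // Single atomic commit: attach the framebuffer to the video plane and
  // scan out the full frame at the top-left of the CRTC, unscaled.
  drmModeAtomicReq* req = drmModeAtomicAlloc();
  drmModeAtomicAddProperty(req, planeId, p.fb_id, fbId);
  drmModeAtomicAddProperty(req, planeId, p.crtc_id, crtcId);
  drmModeAtomicAddProperty(req, planeId, p.src_x, 0);
  drmModeAtomicAddProperty(req, planeId, p.src_y, 0);
  drmModeAtomicAddProperty(req, planeId, p.src_w, (uint64_t)width << 16);
  drmModeAtomicAddProperty(req, planeId, p.src_h, (uint64_t)height << 16);
  drmModeAtomicAddProperty(req, planeId, p.crtc_x, 0);
  drmModeAtomicAddProperty(req, planeId, p.crtc_y, 0);
  drmModeAtomicAddProperty(req, planeId, p.crtc_w, width);
  drmModeAtomicAddProperty(req, planeId, p.crtc_h, height);

  int ret = drmModeAtomicCommit(drmFd, req, DRM_MODE_ATOMIC_NONBLOCK, nullptr);
  drmModeAtomicFree(req);
  return ret == 0;
}
```

A real renderer would also keep the previous framebuffer alive until the commit completes and then remove it with drmModeRmFB, otherwise it would leak one framebuffer per frame.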
When the overlay plane is rendering nothing, we might as well just disable it completely. And once the video stops, we actually shift the Kodi GUI from the overlay back to the primary plane. We need to do that because there are certain transparency issues that happen when you leave it in the overlay plane. This all happens in a single atomic commit, which is what's really convenient about doing it this way. There is actually an atomic property called zpos that tells the hardware compositor how to order the planes, so you could move them around that way instead, but it's not widely supported in the mainline kernel, and we don't really want to depend on something that isn't there for every platform yet. So we do it this way, and it works fantastically.

There are some other issues with this approach, of course, that I've run into across the various hardware I have. On the i.MX6 in particular, the hardware has a scaler, but it's not plumbed into the DRM pipeline. So unfortunately there is no plane scaling, which means you cannot play a 720p video on a 1080p screen on the i.MX6: it will just report a size mismatch and fail. On Rockchip and the DragonBoard it works just fine. There are things we could do about that, and I've looked into ways to solve it, but they're all somewhat complicated. It would be nice if the scaler were just wired up, but people higher up tell me that even if we could access it, it would probably have to be done through FFmpeg instead, so not really that great. The other thing is that when putting a frame into a plane directly, we can't use GL shaders, which are a big part of Kodi, for things like scaling, color correction, and deinterlacing. I think V4L2 and RKMPP both support deinterlacing, but that's about it.

So we've had to come up with some alternative solutions, really only spurred by that i.MX6 board that doesn't support plane scaling. The alternatives all involve using OpenGL, in this case OpenGL ES, which also means writing another renderer for another code path. It's not such a bad thing, since a lot of that code can be reused; it's not ideal, but we have to work with what we get. What happens in this case is that we take the file descriptor we got from FFmpeg and import it into GL, and from there we can do image manipulation on it. It works; I have it working on the DragonBoard, and the performance is still great. The main problem is that it requires GLES3 because of some of the texture support that's needed for it, and most of these small ARM boards only support GLES2, which is really unfortunate, but that's, again, life.

There are some things we can do with GLES2, though. There's the GL_TEXTURE_EXTERNAL_OES texture target that we can use. I've worked on it a little but never actually got it fully working; I believe it's working in mpv. But again, it depends on the hardware and whether the GLES driver actually supports importing the NV12 format or not, which the i.MX6 driver doesn't. I know i.MX6 does support YUYV, both as V4L2 output and on the planes and in GL, but I just haven't gotten around to testing that yet. It's one of those things where whether it's a problem depends on the platform. A rough sketch of this external-texture import follows.
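Here's a hedged sketch of what that import looks like with EGL_EXT_image_dma_buf_import. It's illustrative rather than Kodi's actual code: it assumes a single dma-buf carrying both NV12 planes, with the UV plane's offset and pitch taken from the AVDRMFrameDescriptor, and it only works where the driver accepts NV12 here, which is exactly the platform-to-platform problem just described.

```cpp
// Sketch: wrap an NV12 dma-buf in an EGLImage and bind it as an
// external texture for GLES2.
#include <EGL/egl.h>
#include <EGL/eglext.h>
#include <GLES2/gl2.h>
#include <GLES2/gl2ext.h>
#include <drm_fourcc.h>

GLuint ImportNV12(EGLDisplay dpy, int dmabufFd, int width, int height,
                  int pitch, int uvOffset)
{
  // Extension entry points have to be fetched at runtime.
  auto createImage = (PFNEGLCREATEIMAGEKHRPROC)
      eglGetProcAddress("eglCreateImageKHR");
  auto targetTexture = (PFNGLEGLIMAGETARGETTEXTURE2DOESPROC)
      eglGetProcAddress("glEGLImageTargetTexture2DOES");

  const EGLint attribs[] = {
    EGL_WIDTH, width,
    EGL_HEIGHT, height,
    EGL_LINUX_DRM_FOURCC_EXT, DRM_FORMAT_NV12,
    EGL_DMA_BUF_PLANE0_FD_EXT, dmabufFd,      // Y plane
    EGL_DMA_BUF_PLANE0_OFFSET_EXT, 0,
    EGL_DMA_BUF_PLANE0_PITCH_EXT, pitch,
    EGL_DMA_BUF_PLANE1_FD_EXT, dmabufFd,      // interleaved UV plane
    EGL_DMA_BUF_PLANE1_OFFSET_EXT, uvOffset,
    EGL_DMA_BUF_PLANE1_PITCH_EXT, pitch,
    EGL_NONE
  };

  EGLImageKHR image = createImage(dpy, EGL_NO_CONTEXT,
                                  EGL_LINUX_DMA_BUF_EXT, nullptr, attribs);
  if (image == EGL_NO_IMAGE_KHR)
    return 0; // the driver refused the NV12 import

  GLuint tex;
  glGenTextures(1, &tex);
  glBindTexture(GL_TEXTURE_EXTERNAL_OES, tex);
  targetTexture(GL_TEXTURE_EXTERNAL_OES, image);
  // A real renderer would eglDestroyImageKHR(dpy, image) once bound;
  // the texture keeps its own reference.
  return tex;
}
```

Sampling this texture then needs a fragment shader that enables the OES_EGL_image_external extension and declares a samplerExternalOES, which is where this GLES2 path differs from the GLES3 per-plane approach described next.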
In case people don't really know, regular OpenGL can't import NV12 or certain other planar formats. The reason GLES3 is required for the one method is the texture support we have to use: we basically break the NV12 image into its separate planes in GL and then use shaders to sample from them. I have some code that covers it, but it's not really important, because again it's out of context. Here we import an R8 texture for the Y plane, and for the UV plane we import a GR88 texture. And those two are the problem, because converting from the DRM format to a GL format requires a certain level of OpenGL, which is just unfortunate. This method was actually more or less copied from other code we already have in Kodi, because it's the way VAAPI works there: with VAAPI we get an NV12 frame and import it into GL the same way. So it's very similar code. It does work, but there are some quirks, so we don't really know exactly how we want to move forward here, whether this is something we want to implement or not; if we do, it would be a fallback for the direct-to-plane method we're using currently. Something to talk about in the future, but that's just how it is. Then we just take the imported images and turn them into textures. The other method is the OES one, which is specific to GLES; there we can import NV12 directly, but again there are some issues with that, so it's a little difficult.

There were a couple of platforms I wanted to go over. Raspberry Pi is unfortunately not really going down this path as of yet. Upstream there is the VC4 driver, which is really great and works fantastically, but when using VC4 you get no video decoding support, because you have to go through their magic blob. Hopefully one day that will change, but as of now it hasn't, so we still have to keep the platform-specific code for Raspberry Pi. There are no real signs of that changing, but that's how it is.

Amlogic is kind of the same way, and it's really too bad, because there is so much Amlogic uptake in people buying devices. They're so cheap, you can buy one for $20 or $30 and have Kodi working on it, which is fantastic, but you're stuck with all the problems that come along with their code and the nightmare that is amcodec. There has been an effort to create a DRM driver for Amlogic, which is fantastic; it was done by BayLibre, and we talk with them quite a lot. The problem after that becomes, again, video decoding. We've been talking with Amlogic, and apparently there's some work they're doing on either a V4L2 implementation or an FFmpeg implementation, but I have yet to actually see anything, so I don't really know. And even then, Amlogic doesn't have a GBM-compatible libMali, so there are so many little things that are problems with these platforms.

Allwinner is almost there, which is great. Allwinner has had kind of an interesting past, but I have Kodi running on an Orange Pi; there's just no hardware decoding support for it yet. But as soon as that driver is in; I mean, there was the Kickstarter the other day to fund the Allwinner V4L2 driver, and apparently it exists, it just needs reworking for the V4L2 rework that's kind of happening right now.

Rockchip is working; it's all there.
There's a lot of work going into the Rockchip kernel to support various things, and the nice thing about some of these boards is that we'll be able to support 4K at 60 fps, which is really awesome, and they do H.265 as well, so they're really up to date. Unfortunately, it's not quite all in the mainline kernel, but a lot of it is being shifted there; it's just some of the GPU-specific bits and whatnot that are stuck in the Rockchip kernel.

And this Qualcomm board is absolutely fantastic, apart from some things; the software support is amazing. This board runs on all open source software now, completely upstream, on a mainline kernel. There are still a couple of things missing, but nothing we really need for our use case. If they would just give it a proper board layout and power plug and whatever else, and the Android partition scheme is really bad, but that's life. So this works; I would really like to show it, but yeah.

i.MX6 is fully mainline as well, which is really great. There's been kind of mixed success with it, because the i.MX6 platform is so old it only really supports H.264 anyway. I really look forward to i.MX8, which has already been announced; hopefully the first boards start showing up in the next few months. i.MX8 will have H.265 and VP9 and all the good stuff.

So, future work. Unfortunately, the V4L2 DRM PRIME work is not quite mainline in FFmpeg. I've already submitted patches for it, but they still need some work because of how invasive they are; it's actually a minimal patch, I think it's like 30 lines or something, but it's still something that needs to go in. The Rockchip support is in FFmpeg already, so that's totally fine. The only other thing missing from Kodi is going to be the HDR stuff, but that needs to be solved in the mainline kernel first anyway; they still have to decide how they want to do it. What's nice is that everything else is already done in Kodi. So whenever new boards come out and drivers are implemented for them, it'll just work; we won't have to backport stuff into Kodi version 18 after it's released to support these other platforms.

So, not going to do a demo; I don't have the adapter here, but that's fine. Any questions about anything, really? Go ahead.

Yeah, so it's kind of just a comment about, I think it's made by the mpv folks? Yeah, mpv. They have a new library called libplacebo. It's meant to do a bunch of color correction and all sorts of other stuff as a single framework. It's possibly something we might look into, but we have certain maintainers who really like to avoid external libraries, so we'll see. There is some early support in Kodi for doing HDR; at least some of the pipelining is there. Getting it all the way to the screen is where we're at right now. So yeah, anyone else? Go ahead.

I don't actually have any benchmarks, other than the fact that without the V4L2 work, Kodi just won't run on platforms like the i.MX6. Using the proprietary method works, and it's basically doing the same thing. You could play standard-definition content using software decoding, but here you can actually play full 1080p at 60 frames per second. So yeah, anyone else? Any other questions, just in general, about Kodi or anything? Yeah.

Unfortunately, there's quite a bit of work still to do for i.MX8 as far as I know.
The hardware video decoder is a completely different chipset, so the V4L2 driver needs to be implemented from scratch. Apparently there's great documentation for the i.MX8 chip, though, so hopefully it'll be done in a relatively short time. There have already been Mesa patches for i.MX8 that are on their way to being merged, and then it has to be hooked up in the i.MX DRM driver. So, close, but we'll see. It also depends on when the boards come out. The only one I'm really aware of for consumer purchase is a board called the Wandboard, which is going to be based on i.MX8, but it's available for pre-order only right now and I think comes out in a few months.

That's a really tough question. I'm actually not only part of Kodi; I'm also part of a project called LibreELEC, a Linux distribution built from scratch just to run Kodi. Currently our biggest user base is still Raspberry Pi. The Raspberry Pi is still an awesome platform for the money, and it's been continually tweaked since it came out; we have guys from the Raspberry Pi Foundation working with us to improve it all the time. So I think the Raspberry Pi is great. Unfortunately it's a little behind now, because so many other boards are coming out, and I have really strong opinions about some of those boards, but I don't really need to cloud this with that. All right, anyone else?

Is it HDMI to VGA? I mean, we can give it a go. I've never run it at that low a resolution and never used an adapter before, so it could work, it could not. I think there was one before, but I don't see it now. So you can... aim for a 16:9 ratio if you can. I can't change anything until it's running, but yeah, do your best. And that's if it runs at all, so no idea. It's running, but there's... I don't know if it's the adapter or what. Yeah. So anyway, thank you guys for coming.