So, I've got good news and bad news for you. The bad news is that this presentation is not that technical at all. In fact, everything described here works on 12-RELEASE or even 11. The good news is that it's not technical at all, so everyone can try it at home. My name is Michael. I'm originally from Hong Kong, but right now I'm studying pure math in San Jose. I've been a FreeBSD user for maybe four or five years, and along the way I've ended up programming a lot of different stuff: web, iOS, and sometimes OS work.

There are a lot of reasons why you might want a GPU-accelerated guest in bhyve. You might have some application that just wants fancy graphics, and that breaks down into a few categories. You might only want to accelerate one application, or you might want to actually render something but not care what shows up on screen. The last kind of acceleration is when you want full access to a graphical interface at all times. For application-level stuff, that's quite easy: you don't need to do anything special, you can just use VirtualGL or CUDA. In that case you don't need to worry about hardware-level things like GPU passthrough or even vGPU, which is an interesting topic on its own.

So a question about bhyve: is GPU passthrough even possible? If you look at the bhyve wiki today, it will tell you that VGA passthrough is not supported. That's true, but in reality a GPU is just like any other PCI device with some extra features, so it is actually possible to pass a GPU through to a bhyve guest. For example, this is taken from my Twitter, from when I accidentally discovered this fact. As you can see, it's running an RTX 2070, and it's running in a bhyve guest. The story behind this is that I have a FreeBSD machine at home running CURRENT, and the latest NVIDIA GPUs and drivers do not support CURRENT; they only support up to 12, and if you try to compile the driver against CURRENT, it will crash or panic your kernel. So I was just trying to make a VirtualGL server out of it. I never thought I'd actually get graphics out of it and have it work on a monitor.

So GPU passthrough is possible, but it's not perfect. For example, the GPU you try to pass through to the VM must not have been used before: if the BIOS ever runs on the GPU, or if your console has ever run on the GPU, you cannot pass that GPU through. What you need to do is use a separate GPU for basic console stuff, and keep another GPU that is used only for passing through to the virtual machine. And after you pass it to the virtual machine, you can only use it once. If you want to use it again, the only thing you can do is reboot the host, because now the GPU is... we'll talk about it later, but I did try to tweak the ppt driver as well, and it does not really help. It's an interesting discussion we'll come back to. It also does not work with every guest OS; for example, I was never able to boot Windows, it crashed instantly. Yeah, it's Windows, come on.

So, some prerequisites for passing a GPU through to a guest. First of all, of course, it's PCI passthrough, so all the things you would normally do to pass a device through to a bhyve guest, you need to do here (a rough sketch follows below). And again, the GPU must not have been initialized yet.
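As a sketch of that standard setup (not from the slides; the PCI address 2/0/0, the zvol path, the slot numbers, and the VM name are all placeholders I made up), reserving the spare GPU for ppt and attaching it to a UEFI guest looks roughly like this:

```sh
# Host /boot/loader.conf: load vmm and reserve the spare GPU for passthrough
# before any host driver touches it. "2/0/0" is a made-up bus/slot/function;
# use whatever pciconf -lv reports for your card.
#   vmm_load="YES"
#   pptdevs="2/0/0"

# Boot the guest with the GPU attached. -S wires guest memory, which bhyve
# requires when a passthrough device is present; slot 7 carries the GPU and
# the bootrom comes from the uefi-edk2-bhyve / bhyve-firmware package.
bhyve -c 4 -m 16G -S -H -w \
  -s 0,hostbridge \
  -s 4,virtio-blk,/dev/zvol/tank/gpu-guest \
  -s 7,passthru,2/0/0 \
  -s 31,lpc -l com1,stdio \
  -l bootrom,/usr/local/share/uefi-firmware/BHYVE_UEFI.fd \
  gpu-guest
```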
Also, the GPU must be a real, discrete GPU. For example, if you have a laptop GPU, like in a ThinkPad where the BIOS sets up hybrid graphics, you cannot pass it through, because when the integrated graphics is initialized it somehow touches the NVIDIA card as well, so that won't work. The third prerequisite is a good one, although you don't really have to do it, which is to donate to the BSD projects.

First of all, I tried a lot of things to work around the initialization problem of the GPU. The GPU I use is actually an RTX 2070 (I spelled it wrong on the slide), and it lies about itself: it claims to support function-level reset, but even when ppt tries to reset it with FLR, it doesn't work, so when you try to use it again for another VM, it just won't come up. What I did was modify the driver a little bit and force it to use a PCI power reset, which drops the power state to D3 and brings it back up again, and that doesn't work either. So that was pretty much hopeless, until about 13 minutes ago, when I figured something out.

For the real experiments, of course, I tried this, but not on every kind of machine. I only have one desktop and it's an AMD desktop, so maybe VT-d behaves differently; this is what AMD offers. I use a Gigabyte motherboard, and this is not a commercial, but the reason I use it is that it has five PCIe slots, which makes doing this kind of thing really convenient. In the first slot, for the host, I put a 1050 Ti, because it's compatible with the old NVIDIA drivers in ports. And I tried two GPUs for passing through to guests. The first one is the RTX 2070, which is the whole reason I even tried this project. The other one is an AMD card, a 550, because I was curious whether drm-kmod would work. All the guests are installed on an SSD, so bhyve just reads from the disk.

As I mentioned before, the Windows 10 guest does not work at all. So I checked online to try to figure out why, and it turns out, according to a Debian VGA passthrough wiki, that sometimes you cannot assign the GPU to the root bus, because it confuses the driver. So naturally I tried to assign it to a different bus. To do that in bhyve I needed to pass the -Y flag, but somehow that crashed bhyve with an assertion failure, so I could never really try it with Windows. Maybe with an Intel motherboard things would be a little different, who knows.

So we can have FreeBSD guests, because that's always the safest choice. For the first attempt I used the RTX 2070 passed through to a FreeBSD guest. I just downloaded the latest official NVIDIA driver, and it worked. The trick is that if you want a FreeBSD guest, you must use the UEFI loader. The reason is that you want to enable the vt console so you can do startx; otherwise X will complain and you cannot do anything. And of course it's X, so the console will not show up on the screen, so you don't get a login prompt or anything like that. Therefore you need to explicitly add the BusID to xorg.conf and then run startx.

But if you use a FreeBSD guest with ZFS on the guest, there's another trick: you cannot have nvidia and nvidia-modeset load from loader.conf. The reason is that nvidia.ko and nvidia-modeset.ko are so big that if you load them from loader.conf, there is not enough memory left to load zfs.ko, so you end up with something that you cannot boot. What you can do is set a tunable at the loader prompt to blacklist the NVIDIA modules so you get ZFS back, then go in, remove them from loader.conf, and add them to rc.conf instead. That's the only way I could get it to work.
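A minimal sketch of that workaround in the guest, assuming the stock module names shipped by the NVIDIA driver port (not taken verbatim from the talk):

```sh
# Guest /boot/loader.conf: keep only what the loader must preload.
# The big nvidia modules stay out so zfs.ko still fits in loader memory.
zfs_load="YES"

# Guest /etc/rc.conf: load the NVIDIA modules after the kernel is up instead.
# nvidia-modeset.ko pulls in nvidia.ko as a dependency.
kld_list="nvidia-modeset"
```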
There's a bonus with the NVIDIA graphics card, which is that the USB-C port on the card actually works as a USB-C port. Why is that a bonus? Because I tried to pass two different USB controllers through to my bhyve guest and it did not work: when I plugged a USB thumb drive into the passed-through controller, the FreeBSD guest kept printing USB detection messages over and over, and I'm not entirely sure whether that's a bhyve problem or a FreeBSD driver problem.

The next one, obviously, is the AMD card, which uses drm-kmod. I was curious about this because originally I thought there was some magic in the NVIDIA driver that somehow knows how to initialize the graphics card without a VGA BIOS. It turns out drm-kmod just works too. You don't get a console at first, but at the point where it loads amdgpu.ko, the console actually shows up on the screen. Everything just worked; you don't even need a BusID. Yep. I haven't tried that yet, because I really only did this for the experiment; if you have a 2070 and an AMD 580, you want to use the NVIDIA one because it's faster. This is actually great news, because it means that if anything goes wrong with GPU passthrough, it's less likely to be a driver issue: DRM is an open source driver, so you can look around and see what's missing. You won't find some secret sauce from NVIDIA in there, I mean, you know what I mean. If NVIDIA had some secret sauce, drm-kmod should not work; the fact that drm-kmod works means that problems coming from the driver being proprietary are less likely.

Then we can look at the performance figures. I was too lazy, so I only ran the glmark2 benchmark. What you see here is the performance of the RTX 2070 when it's passed through to a VM, and the blue one (well, they're both blue) is the RTX 2070 running on bare metal. This result may not be totally accurate, because I cannot pass through all my cores and RAM, and OpenGL sometimes has the problem that the CPU throttles the GPU: the GPU ends up waiting for the CPU to enqueue more commands, things like that.

The other thing that's very interesting is Intel GVT, a technology that allows you to create a virtual GPU from an Intel integrated GPU, which means you can pass it to virtual machines. The cool thing about it is that most of the code is already available in the i915 DRM driver, so it might only need very little tweaking for us to get it working. And once it works, it probably just appears as some PCI device that you can pass through to the virtual machine, and the fact that it's designed to run as a vGPU might help us avoid a lot of issues.
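Going back to the NVIDIA FreeBSD guest for a moment, the "add the BusID to xorg.conf and run startx" step described above would look roughly like this. The file path follows the usual FreeBSD xorg.conf.d layout, and the BusID value is a placeholder; use whatever address pciconf -lv reports for the passed-through card inside the guest.

```sh
# Inside the guest: pin the nvidia driver to the passed-through GPU's PCI
# address (placeholder below), then start X manually.
cat > /usr/local/etc/X11/xorg.conf.d/nvidia.conf <<'EOF'
Section "Device"
    Identifier "NVIDIA Card"
    Driver     "nvidia"
    BusID      "PCI:0:7:0"
EndSection
EOF
startx
```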
I may be going through this a little too quickly, but the reason I want to get to the future work and work in progress is, again, that initialization issue we talked about, which brings us to this.

When I investigated what causes the GPU to not be initializable again, there are a few obvious factors to consider. The first factor is the ppt driver. The second is the GPU itself, because if the GPU has some limitation imposed by the vendor, then of course the vendor will try their best to stop you from doing passthrough. And the third one is the guest OS. Originally I thought it was the GPU's problem, because of the NVIDIA thing, but later, when I saw that drm-kmod works, it made me rethink that, because drm-kmod is an open source driver. If the vendor really did something special to prevent the GPU from being initialized again and again, it should not affect drm-kmod, because open source developers are much less likely to impose that kind of limitation. After investigating more, I moved on to the guest OS, and I realized it's possible that when the FreeBSD guest shuts down, it does not really tear down the graphics card correctly. If that is the case, then the graphics card is left in an already-initialized state, and if the ppt driver does not do anything about it, the GPU stays stuck in that state and hence cannot be initialized again.

And the first possibility is actually the ppt driver itself, because of something I only found about 30 minutes ago. I found this very interesting thing on GitHub while investigating GPU passthrough on Linux. I know that on Linux, with KVM, it is possible to reuse the GPU again and again, so it's sensible to look at what they do to enable that. You can see they have this start and stop script here; let me try to make it a bit bigger. When it starts, in order to pass through the GPU, it stops X, of course, but then it also unbinds the VT consoles. After unbinding the VT consoles, it unbinds the EFI framebuffer, and it detaches the PCI device using their equivalent of the ppt mechanism. After that, they load the VFIO kernel module, pass the device through to the KVM guest, and it works. And when the guest stops, they do the same thing again, but this time they reattach the drivers. That made me wonder whether our ppt driver is missing something: maybe it's not detaching or reattaching those devices correctly, which causes the GPU to not be de-initialized properly.
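For reference, the Linux-side start sequence described above boils down to roughly the following. This is a sketch based on the common single-GPU-passthrough scripts, with 0000:01:00.0 as a placeholder PCI address; the stop script does the same steps in reverse.

```sh
# Release the VT consoles and the EFI framebuffer so nothing on the host is
# still sitting on the GPU (X / the display manager is stopped beforehand).
echo 0 > /sys/class/vtconsole/vtcon0/bind
echo 0 > /sys/class/vtconsole/vtcon1/bind
echo efi-framebuffer.0 > /sys/bus/platform/drivers/efi-framebuffer/unbind

# Detach the GPU from its native driver and hand it to vfio-pci
# (the Linux equivalent of ppt); 0000:01:00.0 is a placeholder.
modprobe vfio-pci
echo 0000:01:00.0 > /sys/bus/pci/devices/0000:01:00.0/driver/unbind
echo vfio-pci > /sys/bus/pci/devices/0000:01:00.0/driver_override
echo 0000:01:00.0 > /sys/bus/pci/drivers/vfio-pci/bind

# ...start the KVM guest here. On shutdown, unbind from vfio-pci, clear
# driver_override, rebind the GPU driver, and rebind the framebuffer and
# VT consoles -- exactly the teardown the talk suspects bhyve's ppt path
# may be skipping.
```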
I think I went a little bit too fast, but that's pretty much it; this is basically just a report on what's going on. So, are there any questions?

I'm wondering how you could use the same GPU on the host and in multiple guests at the same time. With Intel, you said there's a virtual GPU. On other operating systems, is it possible, and what mechanism are they using for NVIDIA and AMD?

I think NVIDIA has their own... I mean, they have a series of GPUs that are meant for virtual machines, but those are really expensive, and I don't have one, so I don't have the details. But for the Intel one, what it does is basically allow you to create vGPUs out of the integrated GPU, and you can actually create multiple of them; the only limitation is that each guest can only use one of them. From my understanding, the guest can use those vGPUs with the Intel i915 driver, so the driver side probably won't be a problem, as long as they can use i915. As for NVIDIA, I'm not sure, but they might have their own proprietary technology to do that. Actually, I have a friend who works at NVIDIA, and he tried to get me one of those graphics cards, but unfortunately over the last few months I was not able to do a lot of research because of, as you know, what happened in Hong Kong, so I was never able to get my hands on those GPUs.

Before running your OpenGL benchmarks inside the VM, did you recreate the CPU topology inside bhyve so that the CPUs were pinned and matched?

Not really. What I did was basically just run it from scratch with the same bhyve script, and I didn't use that flag. But I did notice something strange: if I pass too many CPUs or too much RAM to bhyve, it actually crashes with "unable to set memory", though I'm not sure if that's an AMD problem. So the best I could do was give bhyve 4 CPUs and 16 gigabytes of RAM, while on bare metal it's a huge difference, because there I get 32 threads and 64 gigabytes of RAM. But I don't think the CPU should throttle the GPU performance that much, though.

You mentioned that when loading from loader.conf, with the nvidia-modeset kernel module and the ZFS kernel module both loaded, there wasn't enough memory for the two. On my laptop, that's exactly what I'm doing and it works fine. So is this something you found to be a problem only in guests, or also on the host?

Mostly in guests. On a host I once loaded tons of modules from loader.conf and it worked fine, but in guests it never worked. Although on my ThinkPad, sometimes when I load aesni, zfs, and the NVIDIA modules, I do get the same issue as in the guest.

So the limitation is in FreeBSD's UEFI loader; it has to do with the boot loader. The way EFI booting works, we have to make a temporary buffer that we copy the kernel modules into, and after we exit EFI we copy that into the right place; the BIOS loader doesn't have that. Because the guests are booting EFI, I suspect you overflow the size of that buffer. There is a tunable; by default I think it's 64 megabytes on amd64, and you can make it bigger if you need to.

Oh, that's awesome.

You have to rebuild the loader to do it. But that's, I think, the limitation you're hitting. It's not specific to guests, it's specific to booting EFI instead of BIOS.

So it's actually quite interesting to discover this stuff. Some of you might wonder where Linux is. The reason Linux guests are not here is that I had such a hard time configuring Linux to work correctly. For example, with an Ubuntu guest, when I tried to make it use the GPU in the slot I assigned, it just crashed without even booting, so I was never able to test it properly. And in addition to my slides, I should mention one more thing: I once tried passing through to Linux, and it worked, but without outputting graphics directly to the monitor. I suspect the GPU itself is still functional, but it won't run CUDA. As far as I know, sometimes even on KVM, CUDA just won't work, so running CUDA in a VM is probably not really possible. But other things, like running a VirtualGL server, should work, and that's exactly what I do with my 2070 graphics card: I run a VirtualGL server in the VM and then have my host use it, because my host is running CURRENT and cannot run the latest NVIDIA driver.
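For completeness, the VirtualGL arrangement mentioned here looks roughly like this (a sketch; the package name matches the FreeBSD port, and the host name gpu-vm and the user are placeholders):

```sh
# In the VM that owns the GPU: install VirtualGL, let it configure the X
# server for VGL access, and leave X running on the NVIDIA card.
pkg install virtualgl
vglserver_config

# On the CURRENT host (the client): open a VGL-enabled SSH session to the
# VM and run OpenGL programs there; rendering happens on the VM's GPU and
# the rendered images are shipped back over the connection.
vglconnect -s user@gpu-vm
vglrun glxgears
```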
Any other questions?

If I remember correctly, NVIDIA broke CUDA inside virtualized GPUs on purpose, starting with Pascal.

That may be, yeah.

To force you to buy Quadro cards.

Yeah, and those are a lot more expensive.

As a side note, regarding CUDA: I have tested the passthrough with VMware as well, and I was able to pass the GPU through and also run CUDA code in there.

Oh, wow. Was it with a K40? Come on, it's a K. Yes. Yeah, and that is not a consumer card, that's cheating.

Any other questions? Thank you very much. Thank you.