Welcome to the Home Lab show. You know, it's funny because I have a special guest. I didn't even look at one important thing: the episode number, 67. But I have a special guest, Jeff, from Craft Computing. I'm positive a lot of you know him. How you doing, Jeff? I'm doing well. How about yourself, Tom? Wonderful. It is morning. And normally, Jeff would have a beer review on Craft Computing. Do you have a coffee beer recommendation? Oh, hold on. Hold on. I got audio feeding back. Hold on. OK. We're good. All right. No problem. Do you have a coffee beer recommendation, Jeff? We're going to start it out a little weird here. You know, actually, I did grab a canned coffee. I've been really digging these lately. So this is a La Colombe, something like that. You get them at Costco. They're about a dollar a can. You buy them in a 12 pack. They're a nitro-infused coffee. And so they're slightly carbonated, slightly creamy, but actually super good. So that is an unexpected answer. I was hoping to catch you off guard. I did not prepare you for that question. Let's see, what did Jeff recommend on the fly? I've always got something in front of me. Yeah. I actually want to try that now. So I'm genuinely interested. Yeah. No, you pick them up at Costco. Like I said, they're about $12 to $14 for a 12 pack of them. And they're really delicious. All right. It's usually how I start my morning if I'm not making coffee. Yeah. That is great. Well, they are not the sponsor of the show. So I will quickly think who is the sponsor. And they're actually a friend of Jeff's channel as well. And that is Linode. Many of the projects we talk about here on the HomeLab show are easily run in Linode. Now, not the one we're talking about today, though, because today Jeff's going to join us for some GPU passthrough. But for all the other fun things that you don't want to run in your cloud, or you want to just run it in their cloud and do all your testing, check out Linode.
We have the offer of the HomeLab show to get you started on there. They've been a great sponsor of the show. We thank them for their support. And by the way, I've mentioned this many times: if you've downloaded this podcast, you've downloaded it from a Linode server, because that's where we host the HomeLab show and everything else too. Because we're still doing it and hosting all of it ourselves. And once we found a couple of bugs that we completely created ourselves, it's actually been quite reliable for the hosting. It makes a great server. So check them out, use our offer code. But now we're going to talk about building your game server. Yeah. Now, this is fun. Now, I've already included a playlist on here for Jeff. But we're going to talk a lot about the theory on it, what works, what doesn't work. Oh, it's been a whole journey. Jeff, yes, I've been following this journey because people ask me about passthrough and I'm always referring them to Jeff's videos. I've watched them. It's just not as easy as people think it is. That's probably the first thing. The second thing is not just any video card works. Matter of fact, where do we want to start with this topic? Which video cards have you tried before? Oh boy. Yeah. I have stacks upon stacks of enterprise and consumer GPUs that I have gone through testing on and done either performance testing or just, do they work in general? Now, I think we'll clarify one thing right out of the gate. There are 100 different ways to skin this cat. If you want to just put a GPU into a server and pass it through, that is actually a fairly easy thing to do nowadays. NVIDIA released a driver update and removed the limitation on being inside of a virtual machine. So now you can run an NVIDIA GPU inside of a VM. You can also do it with AMD. The one thing you'll need to do for that, if you want to log into it remotely, is get one of the little dummy display dongles to fake out a monitor.
I think if you look for HDMI dummy on Amazon, you'll find them. They're pretty straightforward to find, and they make the card think a screen is attached. Oddly, we've used them not for GPU passthrough, but if servers are headless, some of our remote control software commercially doesn't work if it doesn't have a monitor plugged in. We've plugged those into some headless servers to kind of solve a stupid problem where either, A, resolution is locked too low, or B, some of the remote tools just go, now with no monitor, I won't give you a display to remote into the server, which is a weird problem. Exactly, yeah. And so yeah, it's the same exact thing. You have to have the video card actually rendering an image in order to remote into that video card. And so PCI Express passthrough is super simple. It's supported by basically every hypervisor that's out there. Now, I started down this journey with a set of NVIDIA GRID K2 cards. These are Kepler-based GPUs, and NVIDIA actually allowed virtualization of those cards for free. If you own the cards, you can download the drivers, you could make them work. They only worked in VMware 6.5. That's it. End of discussion. Yeah. If you have something newer, if you have something older, if you have Xen, if you have Proxmox, no. But rather than just passing through a single GPU, you're able to partition or para-virtualize your GPU. Similar to SR-IOV, but it's a closed source protocol. And so you have to have drivers and support for both your host and your guest, direct from NVIDIA and direct from your hypervisor. And I eventually got those working. Performance was kind of lackluster, and I kind of figured that would be the end of the road. As it turns out, I don't know if I sparked interest in a bunch of people, but NVIDIA has an enterprise solution that they call GRID, or vGPU. And it's been available. Kepler was the first generation, but it's kind of its own animal. When Maxwell came out, NVIDIA released their version two of GRID.
And it's a complete enterprise solution. It requires a Tesla-based GPU, and very specific Tesla-based GPUs. And it is a licensed system where you can again split up a GPU. And the new system is supported on Maxwell, Pascal, Volta, or Turing-based cards. What some industrious modders managed to do was unlock that feature in consumer or non-supported Tesla GPUs. So essentially you can take a GTX 980 Ti or a GTX 1070 or a Tesla M40 or M60 and para-virtualize them in your server. This allows you to have one video card and multiple virtual machines sharing the resources of that graphics card. I think you've kind of become the source of documentation on that, because you didn't find this in one place. This was a search of the internet to consolidate all these dispersed places of information to come up with how this actually comes together and works. Right, exactly. And so, like I said, stealing from one is plagiarism, stealing from many is research. There's always that whole adage. And so I did pull from multiple sources, but I've cited every single one of my sources along the way. And those are all available in the documentation that I put together. You can download it off my Google Drive and follow the tutorials that I put up on YouTube. So vGPU is the main method that I look at for virtualizing graphics cards. And probably the easiest one, if you're looking at getting into this, is the Tesla M40. The reason that's a fantastic GPU is it's a Maxwell-based card. It has the same exact GPU die as the NVIDIA Titan X Maxwell. So it's a super fast, very reliable card, very powerful, 3,000 CUDA cores. Oh, and either 12 or 24 gigabytes of video memory. You can pick them up on eBay for $100 right now. That's a good price for that. They're pretty neat. What I was looking at was the cooling on them, because they're designed to go into a server.
Yeah, the cards themselves, they are passively cooled, but they are pretty much the best bang for the buck that you can get in an enterprise GPU. So the Tesla M40 is one that I often go back to and demo a lot of these things on. But again, if you have a supported GPU, they will work with this pass-through method. Getting vGPU working is only half of the equation, though, because once you split up the GPU, you have to be able to connect to it remotely. Yes. And that's where things have gotten a little bit complicated as well. Now I did figure out that using this vGPU method, you can connect via Parsec, which is a fantastic free remote desktop solution that offers low latency and is designed for gaming. But you get dumped essentially into a remote desktop session with full video acceleration. And so if you wanted to, I don't know, edit video remotely or use your desktop remotely, you get full access to your desktop in about the lowest latency possible from wherever you are. If you want to do it locally on your own network, and that's where most of my vGPU stuff happens, there's also an open source solution. There is a server called Sunshine. It's an executable that you run on a Windows or Linux client with a vGPU instance. And it is basically an open source re-implementation of NVIDIA's game streaming. So it's the same thing that's built into NVIDIA GeForce Experience. And it's just a small service, runs in the background, and it allows you to connect with an NVIDIA GameStream client. There's also an open source re-implementation of the GameStream client known as Moonlight. And that also runs on just about any device. So whatever your host operating system is and whatever your guest operating system is, you can connect to them with Sunshine and Moonlight together. And it's a fantastic solution. Now, it looks pretty inexpensive. I mean, not free. The Parsec one is at $10 a month. How does it compare to the other ones out there? Does Parsec have an edge on them?
Parsec definitely has an edge as far as overall connectivity, latency, speed, resolution, features. Parsec is free, and they do have a paid tier. The free tier allows you to have a single remote monitor. It allows you to have up to 60 frames per second. Although I believe they recently upped that to 120 frames per second. Oh, wow. So for free, you get 120 hertz connections. Like, it's pretty wild. Upgrading to the paid version, they're pushing that as more of an enterprise or VDI, virtualized desktop infrastructure, solution, where if you have a remote video editor or CAD editors or things like that, the unlocked version gives you multiple monitor support. They also give you things like 10-bit 4:2:2 color instead of just 8-bit 4:2:0. So again, for things like professional use cases, video editing, that is an absolutely fantastic solution, and there's really no one that touches it. Yeah. Once you get into the high color space needs that YouTubers might have with a remote editor, it sounds like Parsec might be the good solution. Yeah. Anyone else record in ProRes 10-bit 4:2:2? Yeah. Not the usual homelab conversation, but it's good to know for people looking for a high-resolution solution, Parsec's out there. I mean, I'm always for the open source one, but sometimes it just doesn't fit the bill. It's not the right thing. That's why I wanted to bring up Parsec. I knew you used it for some of these more advanced features that it had on there. Yes. Yeah. Yeah, we use the Parsec paid tier for remote video editing here in my studio. So for those who don't know, I have a video editor. His name's Rhett. He works in studio probably about three to four days per week, but he also does a lot of video editing from home. And my projects, again, at ProRes 10-bit 4:2:2, the average video shoots about 500 gigabytes.
And so rather than, like, handing him an SSD and going, here, go edit this and then bring me back the project when you're done, or slinging 500 gigs of data over to a server that he doesn't have at home, we just use Parsec. And so he logs into a remote instance on one of my servers. He gets direct access to my NAS. From his perspective, he's in my local LAN, as if he's VPNing in to a private desktop interface. Yeah, and this can be helpful for really anyone that wanted to use this. Yeah. You know, whatever your applications are, sometimes it's tricky getting applications to work over VPN, but not doing it this way. We're going to get back to the gaming topic here, but just to expand out on the idea: whatever you may need, whether it's a remote video editor or just access to CAD applications or any design software that you may use, and you go, I'd like to work remotely, and remote for me is just not my house. And traveling, being able to get back in there is quite handy. Yeah. Dunkel says, I'm not a huge fan of opening ports. The nice thing about Parsec is also, it's a client server infrastructure. And so you download a Parsec client both for your host and your client application. They are the same exact application. And Parsec offers, essentially like Google Remote Desktop or any number of other services, pcAnywhere, remember them? Yeah. They offer essentially linking from their client to their server and back. And so you don't have to open any ports. It all runs over 443 HTTPS. So every connection is encrypted and secure. And there's nothing that you have to do to actually make the connection through your firewall. OK. I'm assuming, I've seen them suggest that they do NAT hole punching. It's pretty likely that they're using that. Yeah. Yeah. So awesome. So good on security. And of course, if you're remoting in, VPN is an option for people that want to do things like that. Oh, totally. Yeah. It's an option on there.
Generally, you'd want to eliminate anything that causes extra latency with this because, back to the gaming topic, we can't have latency. Exactly. Yeah. And there is still latency when you're talking about gaming. And if I could touch just one more second on Sunshine and Moonlight, the open source implementation: those do not have a server client infrastructure. There's no cloud hosting for redirection or anything like that. You do need to open ports in your firewall or VPN directly in. Yeah. Or be on your local network, because you have one fancy server with some Teslas in it. That's exactly right. Yeah. I think that's most of the implementations you've done. So you and some friends and the kids and the family can all play on the gaming server together. We've actually had six players, five of them remote and then myself, playing Crysis. In fact, one of them was as far away as Norway. And I'm in Oregon. So literally like opposite sides of the planet. He was playing on my cloud gaming server. It was absolutely incredible. That's great. So it's all becoming more viable as the internet gets better and faster. Yeah. Anyway, so where do we want to go from here? We're going to go from here. So back to some of the modern video cards. You said this is now supported with some of the modern video cards as well. Yes. So like I said, the M40 is the most affordable that you can get. It's about $100 and gets the most bang for the buck that you'll find on eBay. You can also use more modern video cards. My cloud gaming server right now runs three Titan X Pascal cards. And I've wired them in so the fans just run at 100%, because they're in a server and they're close together. And who knew that heat would be a problem with three cards that generate like 350 watts of power when they're fully loaded. But yeah. Maxwell, Pascal, Volta and Turing. Now, with the brand new NVIDIA 3000 series, Ampere-based cards, NVIDIA changed the method again in which they handle para-virtualization.
And so it uses a new system. It's based on SR-IOV, but it's also got some closed source secret sauce that modders have not figured out how to break yet. But they are working on it, from what I hear. Yeah. And I think the important thing to think about here, I mean, GPU passthrough in general: I want to pass through a GPU as a whole. You're describing the PCI device, and whatever your hypervisor is figures out the methodology to pass it through. Take this PCI device, make it available not to the host, but to one of the guests inside. What Jeff and I have been discussing so far is para-virtualization, to further slice it up. You know, and I really love this in concept because it's so easy to do. Virtualization is well understood, been around for a long time. We take a CPU, we take an 8-core CPU, but I still am able to over-provision CPUs to all my guests because they're not all asking for the same thing at once. You do something a little bit similar with para-virtualization with the GPUs. How is the, because I remember your documentation on the one for, I think it was Hyper-V. How do you allocate and balance that when you're allocating? Can you over-provision GPUs? Yes, and also no. So vGPU, which I've installed on KVM, and you can do it on a couple other hypervisors, but the vGPU unlock has pretty limited compatibility with what hypervisors it can use. Proxmox 7.2 has been a fantastic one. I've actually been using Proxmox for vGPU since Proxmox 5.0. That's how long I've been on this journey. But with vGPU unlock, what you do is you allocate profiles of the GPU based on the amount of memory that you're going to split, and you have to evenly split the GPU. So if you have a 12 gigabyte card, you can do three four-gigabyte instances or four three-gigabyte instances. Now, if only one of those instances is running, it will be allocated three gigabytes of video memory, but will have access to up to 100% of the GPU CUDA resources.
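The even-split rule described here can be put into a quick sketch. This is just illustrative arithmetic, not NVIDIA's actual profile list (real profiles have fixed names and sizes per architecture):

```shell
#!/bin/sh
# List the splits that divide a card's framebuffer evenly -- the constraint
# vGPU profiles impose. The 12 GB figure matches the example in the episode.
valid_splits() {
  total_gb=$1
  size=1
  while [ "$size" -le "$total_gb" ]; do
    if [ $((total_gb % size)) -eq 0 ]; then
      echo "${size}GB x $((total_gb / size)) instances"
    fi
    size=$((size + 1))
  done
}
valid_splits 12   # includes 3GB x 4 and 4GB x 3, as discussed
```

Any split that does not divide the card's memory evenly is simply not a valid profile, which is why a 12 GB card offers three 4 GB or four 3 GB instances but not, say, five.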
And so for GPU performance, as long as no one else is asking for it, it can have 100% of the GPU, but only its partitioned amount of memory. If you fire up a second instance with another three gigabytes, and one of them is still not doing anything, the other can go all the way up to 100%. If they both start rendering games, they'll level out about 50-50. You can't over-partition the memory amount in vGPU, and you can't use, obviously, more than 100% of your GPU resources. Now, moving on to Hyper-V, because that is a completely different animal entirely. The reason that NVIDIA unlocked GPU pass-through and eliminated the Code 43 error if it detected a VM is that Microsoft wanted to implement para-virtualization into Hyper-V, as a replacement for RemoteFX, I believe it was called, which is the old GPU para-virtualization system in Hyper-V 2012, 2016, something like that. Essentially, Microsoft now fully supports, inside of Hyper-V, firing up a Windows guest and passing through your video card resources. This para-virtualization is a little bit different because you're not allocating X amount of memory, and you can over-allocate; you can over-partition in Hyper-V's para-virtualization. But essentially it runs with a single PowerShell line. It's a very long PowerShell line, but it's basically a copy and paste operation where you say, I want to use this GPU for para-virtualization, and I want to attach one instance onto X virtual machine. And when you do that, Hyper-V will split up the GPU, and then there's a little bit that you have to do on the back end as far as setting up a folder structure to share the drivers between your host and your guest. Now this comes with some complications, because if you're running Hyper-V on your Windows host, you have to be running the same exact kernel version of Windows on your Windows guest.
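For reference, the "single PowerShell line" mentioned here looks roughly like the sketch below, based on Microsoft's GPU partitioning (GPU-P) cmdlets. The VM name and VRAM numbers are placeholders, and the exact set of `Set-VMGpuPartitionAdapter` parameters varies by Windows build, so treat this as a starting point rather than a recipe:

```powershell
# Hedged sketch of Hyper-V GPU partitioning. "GameVM" is a placeholder VM name.
Add-VMGpuPartitionAdapter -VMName "GameVM"
Set-VMGpuPartitionAdapter -VMName "GameVM" `
    -MinPartitionVRAM 80000000 -MaxPartitionVRAM 100000000 -OptimalPartitionVRAM 100000000
# MMIO settings commonly paired with GPU-P guides (values illustrative):
Set-VM -VMName "GameVM" -GuestControlledCacheTypes $true `
    -LowMemoryMappedIoSpace 1Gb -HighMemoryMappedIoSpace 32Gb
```

The folder-structure step described next is copying the host's GPU driver files from its driver store into a matching `HostDriverStore` folder inside the guest, which is why host and guest must stay on the same driver version.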
You also have to be running the same exact driver version, and the driver is actually shared via a folder structure between the two. And so if Windows decides to upgrade from 21H1 to 21H2 on your host, your guest GPU will stop working. If NVIDIA updates your drivers on one of those systems, your GPU will stop working. And so you have to keep them on the same version at all times, both kernel and driver version. That's the complication with this one. But it is completely free. It is completely usable, and you can actually do it with Windows 10 or Windows 11 as a host. You don't have to have Windows Server as a host. So this is probably the method that I would push most people to if you were looking at, I have one desktop PC, I want to add a Tesla GPU, or even split the GPU that I already have in there. The other major advantage is this works with both NVIDIA and AMD GPUs, and there's no limit on what type of GPU or what architecture of GPU you can use. It's any video acceleration. This could be the first time I'll say this: I think I found a use case for Hyper-V. I'm not a big Hyper-V fan. Did it just get cold in here? Yeah, I know. It feels weird. I just promoted Microsoft. But it's clever that they have support for that, other than the intricacies of it. But I mean, generally speaking, you want to symmetrically update your systems. Correct. If there's an update, just update all of them. Just go through it. Deal with whatever drama comes with the latest version of Windows and work your way through it like the rest of us do. There's an update. That's life. Exactly. All right. So it sounds like, though, generally speaking, Proxmox is your go-to. That's where most of your gaming stuff runs on? Yes. Yeah. Yeah. Proxmox is my go-to because it gives me the most flexibility and the most reliability when it comes to just firing up an instance and playing. Now did you ever get this working with XenServer at all? I have not gotten it working with Xen.
And as far as I know, there's no method to do this with VMware either. So it has to be essentially KVM as a base. So obviously Proxmox, but any Linux-based KVM: you can run it on Red Hat, you can run it on CentOS, you can run it on whatever you want. Yeah. And I think that's, I wanted to bring it up because everyone knows I'm a bigger XCP-ng fan, but in the commercial space that I use it in, we help companies deploy this in data centers and things like that. And so when people ask me about passthrough, I'm like, this is not a request we often get from the person that's running 2,100 Linux VMs to host a phone application. It turns out it's not their use case. So it comes down to use case. That's why I said Hyper-V is an interesting one, because of the lack of limitations there, and Proxmox because of the underlying hypervisor Proxmox uses. And I believe Wendell's covered this before, specifically KVM and parceling out things. He's got a couple of write-ups in the Level1Techs forums all related to this, which I'm sure you probably referenced at least a couple of the things in there. Yes, definitely. Yeah, there's a couple of threads in particular that I've referenced back to. Yeah, Wendell's just like, when he dives deep into something, there's a lot in there on getting that set up. So as long as you can get that hypervisor running, it's not that Proxmox is the requirement, it's more that KVM is the requirement. Well, there's documentation around it and there's support and there's a lot of people that contribute a lot of code to making that work. Yeah, so I've done multiple different tutorials on setting all of this up. Like I said, Proxmox is usually my go-to, but I've also got a tutorial up for Hyper-V as well. I've also got tutorials up for just passing through a GPU or sharing resources between your Windows client on a client OS. Now, the KVM one I've done multiple different tutorials on. So you want to make sure you're watching the most recent one.
And the playlist that Tom has linked in the description here is the most up-to-date Proxmox tutorial. It uses the vGPU Unlock script as well as vgpu_unlock-rs. And that has been, number one, the most reliable, the easiest to install. It also allows you to customize the GPU profiles. Now, vGPU works on basically a preset number of profiles that are predefined by NVIDIA based on the architecture that you're running. And so Pascal will have certain profiles for certain cards. Maxwell will have certain profiles for certain cards. And again, it kind of comes down to the memory split, but there's also a couple other features that are turned on or off. Using the Rust unlock script, you can actually customize the profiles and turn on or off CUDA acceleration, or disable your frame limiter, or all these other nifty tools that are quite handy for gaming. And you can also change your memory allocation based on the profile that you're running. And so the problem with Maxwell cards is the largest Maxwell profile was only eight gigabytes of video memory. Or sorry, the largest Maxwell card to support vGPU only had eight gigabytes of memory, and that was the Tesla M60. Well, what if I have a card with 24 gigs of memory? Well, the profile says I can only run one eight-gig profile. Oh. You can't use the full allocation of your GPU because it's based on a hard limit set by the profile itself. Using the profile unlock, I can modify the two-gigabyte profile and say, you know what, go ahead and use eight gigabytes instead. Got it. And again, I've got full documentation for how to configure all of that with a Tesla M40 or a Pascal-based GPU or whatever you have. So yeah, I've been using the Tesla M40s with eight gigabytes of GPU memory each and three profiles split out from it. That gives you roughly 1,000 CUDA cores, which gets you on par with about a GTX 1050 Ti, times three, for essentially 100 bucks. So about $30 per 1050 Ti instance. Which is really cool.
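The profile override described here lives in a small config file when you use vgpu_unlock-rs. A hedged sketch follows; the profile number `nvidia-18` and the framebuffer value are placeholders, and while the field names follow the vgpu_unlock-rs examples, you should check that project's documentation for your card and driver:

```toml
# /etc/vgpu_unlock/profile_override.toml
[profile.nvidia-18]        # hypothetical profile ID -- yours will differ
num_displays = 1
display_width = 1920
display_height = 1080
max_pixels = 2073600       # 1920 * 1080
cuda_enabled = 1           # turn CUDA on for this profile
frl_enabled = 0            # disable the frame rate limiter
framebuffer = 0x1A0000000  # illustrative override of the profile's default VRAM
```

This is the mechanism behind "modify the two-gigabyte profile and tell it to use eight gigabytes instead": the override replaces the hard limit baked into NVIDIA's preset profile.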
And I've already seen a couple people comment that the prices went up a little bit. I'm going to believe that it has something to do with Jeff making videos about it. It always does. It's supply and demand. There's a finite supply of these on eBay and the price is determined based on, you know, the popularity of Jeff's trending video. Yeah, I think the 24 gig ones have gone up from 100 to 130 lately, although the Tesla M60 has recently dropped down to $200, which is two GPUs. They're two GM204 GPUs. So the same GPU that's in the GTX 980. It's 2,000 CUDA cores times two and eight gigabytes of video memory times two. And that's actually the card that I run in my main server, and that's where my video editor sits. And so one of the GPU dies I gave essentially directly to him. He's got the full eight gigabytes of memory. And then the second GPU in that I split out four ways with two gigabytes each for my daughter and her neighborhood friends to all play Minecraft, because two gigabytes of video memory and 512 CUDA cores: perfect Minecraft client. Yeah. And so they all get on their school Chromebooks and play Minecraft on my gaming server. Now that's a really cool thing, because you're concentrating all the work into that one system, and I love that the client is a Chromebook. My question though is audio. How is the audio? Is it all well synced? Does it work well? It is very well synced no matter what client I have used. Both Moonlight and Parsec keep the audio synced extremely well. I've had more issues with video decode than I have with audio sync. Nice. That's kind of surprising. I thought there would be some audio issues, like some latency, but it sounds like they got that. Well, I guess if they got the video figured out, the video is certainly the more complicated of the two. Right. Yeah. I guess it's the concern of, did they show any love for the audio engineering of it, or were they just trying to get the video from over here to over there.
I will say it's not the best audio you've ever heard. It's definitely slightly compressed from the main source, but it's totally usable. Yeah. Especially for, you know, like I said, playing Minecraft or playing six-player Crysis on a single video card. Well, and I like this from an allocation standpoint, because for example, you know, Plex is a popular one. You want to set up your Plex server, and being able to parse that out and say, hey, Plex, when it needs to do this, which is not all the time, wants to do some rendering, and GPU passthrough to render things. You can do this while simultaneously still using it on your other one to play your games. The really cool thing about Plex and NVENC in particular is NVENC is dedicated hardware on the GPU. It's not using the CUDA cores or the gaming parts of the GPU. And so, as the meme says, it's essentially free real estate. If you partition out your GPU and you give one instance over to Plex, Plex will use the NVENC encoder, and that doesn't act as a detriment to your gaming performance because it's different hardware. It's different hardware physically on the GPU. Now the downside with that is you have to allocate your GPU evenly. And so whatever video memory you give your gaming PC, you also have to give your Plex encoder. And so if you have a Tesla M40, you might end up with like a 12 gigabyte Plex instance or something like that. But you know, you're still able to split it out and use all the resources. Yeah, so that's interesting. That is one detriment to it, I guess. Well, you could just go buy Plex some other video cards. Pop a few video cards in, get a board with a lot of PCIe slots and a big power supply. Yeah, honestly, what a lot of the Plex community has done before also is the Quadro P400 or, even better, the new T400, the Turing-based card with the new NVIDIA encoder in it.
It's a single low profile card, runs at like 60 watts or something like that, and there's also an unlock script, so you can have it encode more than two streams at the same time as well. So that's a very inexpensive solution. You can pick them up for about 100, 125 bucks. I think one of the other fun things, because I did not click on this, so I don't know how well it works yet, but it piqued my interest: so many of you have all seen the DALL-E things, the DALL-E Mini. Oh yeah. So they have an open source one, so you can run it on your own GPUs. Oh, I hadn't seen that. I've seen it in the news on Reddit and I'm like, oh, this is going to be neat. I have things I have to finish before I click on this, because I want to know. I want to start rendering and doing prompts and everything else. And this seems like, I wanted to play with it this morning, but I knew I would not get the things that I'm supposed to get done if I would have started. I was not aware of that. Like I said, most of the time, my cloud gaming server is just kind of sitting there. So I've got three Titan X Pascals with a combined 36 gigs of video memory. That could be fun. Oh yeah. Jeff is going to start re-rendering all the thumbnails with it. That's going to be a lot of fun. I think there's a big future in a lot of that type of analytics coming out. Yeah. I mean, it all starts as we all play with it because it's fun. You know, doing things like the DALL-E stuff, playing games, but the other side of it is you can leverage it, and, you know, I'm working on a review of the new Synology units that essentially have a GPU built in, where you can do more analytics with GPUs, because they're good at things like figuring out object recognition and face recognition. So I think there's all kinds of fun lab projects when you parse it out. You're just better utilizing that hardware to have it doing something all the time.
I run, you know, I run my games at night, but during the day when I have to do work, I let it render stupid pictures of Elon Musk. Or do face recognition or do some type of object recognition, run some data sets. Yeah. Yeah. It's a really interesting field. And again, being able to even split out your GPU and play games on half of it and be able to do AI stuff on the other. My brother-in-law graduated college a couple of years ago and basically has a master's degree in AI programming. He's one of those people that's way smarter than you could ever fathom being. Yeah. But yeah, he's been doing a lot of crazy stuff and recently got a new employer who I can't mention, but I'm really jealous. Yeah. Like, I know them. That's cool. That's awesome. All right. So we've covered all the how to GPU, how to game, and the tools to use, which is pretty simple. And we've got the playlist in there to get you into the nitty-gritty details of it. Let's pivot a little bit and talk about the gaming servers, and someone brought up a good question: how do we store this? Because I know you've actually covered this, and I've done this before, where I've used iSCSI to store my Steam games across on like a TrueNAS server. What are some of the ways you deal with storage when you're building the gaming servers? Because games are huge. Oh boy. Oh boy. So, if you're running one or two or three instances, a standard solid-state drive is all you need. Yeah. For my cloud gaming server, when you start scaling up, if you want an entire server dedicated to gaming instances and you've got 12 or 16 or even more virtual desktop instances running, storage, and in particular storage latency, becomes one of your biggest bottlenecks in the server. So my main cloud gaming server is running an EPYC 7742, 64 cores. It's got 256 gigs of DDR4 and three Titan X Pascals. Even running with a pair of NVMe drives, the problem is the latency.
Imagine you're trying to do queries to a SQL database. You know when you run something like CrystalDiskMark, that bottom set of numbers, the random 4K? That's the tough one. That's the number that matters. Yeah. For that kind of fast burst IO. Right. So even your NVMe drives, some of the best ones are like 60 megabytes per second there. Imagine loading Spider-Man or Cyberpunk or Crysis at 60 megabytes per second, times 12. It doesn't work. And so storage latency and random IO speed become your biggest bottleneck. What I found out was: throw even more drives at it. My basic solution was I bought a 12-port SATA interface on a single PCI Express x16 card, and I have those routed to 12 Intel 1.2 terabyte enterprise SSDs. Completely overkill solution, but because they're running in RAID-Z2, it bumps up that random IO to something like 350 megabytes per second, and at least makes it usable. Honestly, the best solution would be something like Intel Optane memory. Which, rest in peace. Yes. I recently did a video on Optane memory and I was super excited to start implementing it, and then it's like, oh, it's dead. Yeah, which is sad. By the way, check out Jeff's video on that. It's really clever. I don't know why it's dead, because it's such a clever way it works, the way you partition memory in memory slots to be functional logical drives at really high speed. I love this. This is beautiful. Right. Yeah. I was so, so disappointed when I saw that news because, like I said, I was super excited. I finally got mine working, and for those who haven't seen it, I have 1.3 terabytes of memory in my Storinator AV15. Yeah. And I have the same motherboard as you. So I watched that video going, hey, I have an idea. That'll work. I have the motherboard, and I don't even have to upgrade the chip. I already have a chip that supports it. Oh, nice. So I'm like, I only think I need to buy memory. 
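As a rough sketch of the pool layout Jeff describes, twelve SATA SSDs in a single RAID-Z2 vdev, this is roughly what the ZFS side could look like (pool name and device paths are hypothetical, not his actual config):

```shell
# Hypothetical sketch: one RAID-Z2 vdev built from twelve SATA SSDs.
# Device names and the pool name "games" are assumptions for illustration.
zpool create -o ashift=12 games \
  raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg \
         /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm

# Reasonable defaults for a game-library workload; adjust to taste.
zfs set compression=lz4 games
zfs set atime=off games

# Confirm the vdev layout came out as intended.
zpool status games
```

The win here is aggregate random IO across many devices, not sequential throughput, which is why twelve modest SATA SSDs can beat a pair of NVMe drives for this workload.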
And by the way, I have slots. Well, one of the other potential ways, and maybe you did this or looked into it, is you could set up each one of the VMs to talk iSCSI over to TrueNAS. But yes, you would then probably have to go at least 25 gig to TrueNAS in order not to run into the next bottleneck. Here's the secret sauce: virtualize your TrueNAS inside of your cloud gaming server and do iSCSI over VirtIO interfaces, because they run at the speed of your CPU. Yes. And for those who I just lost: if you're running your storage on a separate server, you're probably going to need at least 10 gig, if not 25 gig, interfaces to handle the amount of throughput of all that game data being loaded by 12 clients simultaneously. However, if you virtualize your TrueNAS server on the same metal that you're running your game server on... So in my case, I just passed through that SATA interface card, and all 12 drives show up inside of TrueNAS. I configure them in RAID-Z2. I then set up iSCSI instances for each of those VMs, so each VM has its own iSCSI interface. Tom has a fantastic tutorial on this, and I've also done a tutorial on setting up your Steam library on one. But set up iSCSI for each of your VMs. Using Windows as a client, you'll need to install the Red Hat VirtIO driver package to get the VirtIO network interface running inside of your Windows client. But rather than running at one gig or 10 gig, VirtIO runs at however fast your CPU can run, so it virtualizes essentially a 100-gigabit link. It runs standard networking protocols, but it kind of ignores the limits. TrueNAS supports this natively, so you don't have to do any configuration. On Windows, you will need to download the VirtIO driver package and install the driver for it. But that is the way to get the lowest-latency, fastest access to an iSCSI drive on your cloud gaming server. 
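A minimal sketch of the "one iSCSI extent per VM" idea: each gaming VM gets its own zvol (a fixed-size block volume) carved out of the pool, which TrueNAS then exports as an iSCSI extent, normally configured through the web UI. Pool, volume names, and sizes below are hypothetical:

```shell
# Hypothetical sketch: one block-device zvol per gaming VM; each is later
# attached to its own iSCSI target/extent in the TrueNAS UI.
for vm in vm1 vm2 vm3; do
  # -V creates a fixed-size volume (zvol) rather than a filesystem dataset;
  # -s makes it sparse, so space is only consumed as games get installed.
  zfs create -s -V 500G "games/steam-${vm}"
done

# List the volumes that will back the per-VM iSCSI extents.
zfs list -t volume
```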
I was hoping you were going to go for this, because it's a complex process, but boy is it a fun learning experience. And you don't hear Tom often talk about this. I say that because on a production system, one not for gaming, I wouldn't recommend virtualizing TrueNAS, because it's a potential problem. But for a use case of trying to get the most efficient use and lowest latency, running it all on one physical server, so let's say Proxmox, which I'm assuming is where Jeff has it: virtualizing your TrueNAS and doing not VirtIO disks but a full passthrough of the PCI card into TrueNAS, so it has direct access to the drives, which is a requirement, does open up a really fun opportunity to build an incredibly fast server with the lowest amount of storage latency possible. Now, this is where a lot of people get confused, because this even applies with XCP-ng. You'll take two Windows servers, and sometimes the link speed displayed is kind of a legacy leftover, because they're ignoring any of those limits. I have people say, "but it only says I'm connected at..." and I'm like, don't worry about that. If both machines are Windows guests running on XCP-ng, Proxmox, or many other hypervisors, and they are on the same wire, never leaving the same physical metal, they talk at whatever speed the backplane can essentially facilitate. 
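On Proxmox, passing the whole SATA controller into the TrueNAS VM comes down to one `qm set` call (the PCI address and VM ID below are hypothetical; IOMMU support must already be enabled in the BIOS and on the kernel command line):

```shell
# Find the PCI address of the SATA/HBA controller to hand over.
lspci | grep -i -e sata -e raid

# Hypothetical sketch: give the whole controller at 0000:41:00.0 to VM 100,
# so TrueNAS sees the raw disks directly (which ZFS needs to manage them).
qm set 100 --hostpci0 0000:41:00.0

# Restart the VM; all drives on that controller should now appear in TrueNAS.
qm reboot 100
```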
Right, as long as you're using that VirtIO network driver. If you're virtualizing something like the Intel E1000, a common virtualized network card, that will still be limited to the one gig or 10 gig link speed inside of Proxmox KVM. But yeah, the VirtIO driver essentially connects them via bare metal, so it depends on your CPU cycles, not on your network card, your outward-facing network, your switch bandwidth, whatever else. Yeah, I'm kind of surprised Microsoft didn't just include that already, because so much other stuff is. Is that still true for Hyper-V? Do they still need that extra driver, or have you not tested it? I think as of 2012 they kind of did away with that inside of Hyper-V, at least as far as Windows clients go. If you're running a Linux client, kind of the inverse is true: on KVM a Linux client is natively supported. Right, exactly. But if you're running a Windows client on KVM, you need to go through some secret drivers to make it all work. It's the inverse with Hyper-V: Hyper-V, I believe, ignores the network limit when you're running a Windows client, but Linux clients are still running at 1 gig or 10 gig. And I bring that up, too, because, by the way, do this in Proxmox, because you're going to have a much harder time fighting with TrueNAS to get it to load the proper drivers to make this idea work with Hyper-V. It's more than just drivers. It's easier and undoubtedly more stable to do this in Proxmox. So even though I did sing a little bit of praise for Hyper-V for certain GPU sharing, that's where it ended. It was downhill after that. When you want to, say, virtualize TrueNAS on Hyper-V, that's a big no. Yeah, just to bring it back: Proxmox is where Jeff has all of this set up, functioning, and working. Yeah. This is a lot to think about putting it all together. 
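For contrast, the VirtIO-versus-E1000 choice is a one-line difference per VM in Proxmox (VM ID and bridge name are hypothetical; a Windows guest then needs the Red Hat virtio-win drivers before it can see the VirtIO adapter):

```shell
# Hypothetical sketch: attach a paravirtual VirtIO NIC to VM 101 on bridge
# vmbr0. VirtIO skips full NIC emulation, so guest-to-guest traffic on the
# same host runs at memory/CPU speed rather than an emulated link rate.
qm set 101 --net0 virtio,bridge=vmbr0

# For comparison, an emulated Intel E1000 (works without extra drivers in
# Windows, but behaves like a legacy 1-gigabit card):
# qm set 101 --net0 e1000,bridge=vmbr0
```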
You asked if I had a playlist, and I have a much longer playlist on this entire journey that I've taken, from the original Kepler cards through all the trials I've gone through with different remote clients and different streaming options, and testing different thin clients, like a Raspberry Pi as a thin client, or a Chromebook, or a Windows desktop, or whatever else. And there's a lot of failure in those videos. People ask me how many videos I've done, because I had them as part of a numbered series; I stopped counting, I think, at 11. Yeah. So if you're wondering, and you run across some older stuff and ask the question "can it be done," Jeff probably did a video on the problem with doing it. That's why the most recent playlist is the most effective way, we'll say, to get this done and accomplished, because the other ways are undoubtedly more buggy. The other really frustrating thing about documenting my journey on social media with this, Tom: anytime you start googling specific error messages for the esoterica that you're trying to get running on your server, you come up as a reference. Yes, because I have my forums, and it has me coming up as a reference too. It is such a first-world problem. It is so aggravating. I'll be going down some random forum and someone will say, oh, I came up with this error message and I'm trying to figure this out, and you're reading through the thread like, okay, I've tried that... ooh, that's a good idea, I don't think it'll work, but I'll try it... and then you get to the bottom of the page and it goes, "oh, Jeff has a great tutorial on how to do..." NOOOOO. Yeah, back, back, back, keep scrolling, keep scrolling. But anytime I search for anything that's GRID, vGPU, Proxmox, I come up in the documentation. Yes. Ah, well, I think this is fun. This was hopefully a fun learning lesson to get people started. Jeff's got the videos linked down below, and if you google the error messages you'll also come to Jeff, 
apparently. So we'll leave links to all of this, and of course check out and subscribe to Craft Computing; there's more than just virtualization videos. Jeff's been a friend of mine for a little while. And oh, he's got some little ones wandering in. Apparently it's breakfast time. Yes, it is breakfast time. Anything else you'd like to add, Jeff? Can you say hi? You can say hi, he can hear you. Yep. Oh, you can't hear him. Okay, there we go. Hi. That's where we get the mini Craft Computing people. Yep, this is a little bit... All right, don't call me "a little bit." Okay, okay, head on out, I'll be out in just a minute. Okay. Yep. All right. Anything else to add, besides your comment from "a little bit"? Yes: AMD GPUs. I have tried so many times. I have tried AMD GPUs. Where do I start, where do I finish? Yes, I will be up in just a second. Okay, head on out. I will be up in just a second and we'll talk then. Okay, thank you. I'm sorry. AMD GPUs. Hey, when are you going to do AMD GPUs? You can do AMD GPU paravirtualization with Hyper-V. As far as any other method with KVM: sure, you can pass them directly through the same way that you would with Nvidia-based cards, the same driver installation process, all that. As far as virtualizing an AMD GPU: I know some of their enterprise GPUs "support" SR-IOV, and I put that in the most sarcastic parentheses that I possibly can. Yes, they support SR-IOV. But AMD themselves do not make any driver packages publicly available to you. So even though their system is free and license-free, and they advertise that to the moon and back, there are no downloads you can get for MxGPU installs on a client VM. It's only supported by VMware as far as SR-IOV goes, and I couldn't even... I bought two MI25s trying to get SR-IOV to work, because people always ask me about it. Whenever I enabled it in VMware, VMware would actually crash on me. I tried enabling SR-IOV in Proxmox; it didn't recognize it as an SR-IOV-capable device. 
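For anyone retracing that experiment, checking whether the host even sees a card as SR-IOV-capable is straightforward on Linux (the PCI address below is hypothetical):

```shell
# Hypothetical sketch: inspect a GPU at 0000:c1:00.0 for an SR-IOV capability.
# If this prints nothing, the kernel sees no SR-IOV on the device, which
# matches what Jeff describes hitting in Proxmox with the MI25.
lspci -vvv -s c1:00.0 | grep -i "SR-IOV"

# Where the capability does exist, virtual functions are created via sysfs,
# e.g. four VFs:
# echo 4 > /sys/bus/pci/devices/0000:c1:00.0/sriov_numvfs
```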
So there's some secret sauce that AMD is not releasing, even though they advertise it as license-free and open and "you can use whatever you want." So far it hasn't materialized. So if you do figure out how to make this work, let Jeff know. I will document it and share it with the world, make a video on it, do some documentation on this. So pretty much this is Nvidia: stick with the Tesla cards, the M40s and M60s. Those are your go-to cards to make this happen. Yep. And I'm waiting for the Volta GPUs to drop, so the Tesla V100 or even the Titan V, because apparently Volta is insanely fast when it comes to virtualization. I've been really excited to get my hands on some of those. They're still trending at $800 and up; I'm waiting for them to drop. Once they go below 500, I'll probably snag at least one, or, knowing me, three. At least. I mean, can't just have one. We're YouTube people, we've got to do something excessive to get the views, right? Exactly. Make a stupid face, put too many GPUs in a server. On a couple of my vGPU tutorials, if you had bought the hardware I was playing with at retail, it would have been like $25,000 for one tower, and so a couple of my thumbnails are "$25,000 for..." We are self-aware YouTubers, by the way, so we can talk about this. That's right. Well, this is definitely fun. As I already reference Jeff, now I have a podcast in my collection as well that I'll be referencing when people ask how to get started with this. It's a lot. It's a big project, a good undertaking. But you know, I think it's really good for homelab people especially, because gaming is often what leads you into that homelab world of: well, I got a gaming server, I built it myself, that was fun, can I build something more? And yes, you can. Or: now I need a service to support it. Or: hey, how about starting up a Plex server? Or all these other things. I believe basically any of us in the industry who have worked professionally in IT, we all start as enthusiasts. We all start 
as loving technology and playing games or whatever else, and that's often the gateway drug that gets you in. And it's been a lot of fun to integrate gaming into another industry that I absolutely love and have a passion for, and that's enterprise IT hardware. It's a really weird thing to have a passion for, but yeah. No, and you know, I don't know if you're a big listener to Darknet Diaries, but I love how many people started with playing games and then got into hacking. Like, the two are intertwined. How do you start? Well, I was 14 and someone kicked and banned me, and I was like, how'd they find my IP? Oh, they phished me, they did this. It's a pivotal place to get your fingers on the keyboard and everything else. So I love all this. All right, links are down below for all the fun stuff we talked about, and please check out the whole playlist on this. And if you want the history, well, at least 11 of them are labeled as a journey, so you know. They're all in one much longer playlist, and I think there's 17 or 18 videos in there now. Yeah, and I started on this almost exactly three years ago, in 2019. Boy, how naive I was, thinking I was just going to buy a couple of cards, slap them in, and make them work. Yeah, I remember you soldering things and everything else. I was following along with the journey going, wow, he's really into this. It was having to solder on 0402 resistors to change the device ID of an Nvidia card to make it recognized as GRID-capable. Yep. Oh yeah, we went into the weeds. Yep. So if you're wondering if that'll work, watch the video and find out. Exactly, yep. All right, thanks everyone, take care. Thanks for having me on, Tom. All right, appreciate it.