So, I thought we could do this in a format where, over here, I've pulled up the bare metal cluster installation instructions. This is a full document that goes over basically everything you could ever possibly think of to get an OKD cluster going from scratch on bare metal, bare metal UPI I should specify. And over here is my explanation of my setup and the supporting infrastructure that I have. This is in that repo that Mike McCune is maintaining, so it will be available on his repo and then eventually, I think, in the main OpenShift OKD repo once everything gets straightened out. I thought we could just go through everything here and I'll show you what I did in my home lab to fulfill the requirements. Network connectivity is a really big one; a bunch of us were talking about that during the stage session earlier. Plus stuff about CSRs and creating the infrastructure, and I'll show you the Terraform scripts and the bash scripts and the stuff that I have set up to get all of that going. As we go along, if you have any questions, pop them in the chat, and Craig, feel free to interrupt me literally at any time and just ask the questions and I'll answer them. Whenever I get through that, we can talk about other home lab setups, or if Vadim wants to come on (I think he's in here), he can talk about his home lab setup; I don't know if we're going to cover that in the single-node stuff as well, Vadim. So without further ado: so long as I'm talking, I'll go first, and then you can talk about your setup afterwards, Vadim. Mine's not much better, it's just more completely documented, I think. So without any further ado, let's get going.

These are the resources I'm using to get this setup going. The bulk of it is these three relatively big hyperconverged-type hypervisors. They're all identical boxes that I built up over the course of maybe a year or so, as I decided to get more and more overkill with my setup. They each have a Ryzen 5 3600, so that's 12 virtual cores right there, and 64 gigs of RAM. Each one comes with three 4-terabyte hard drives and two 500-gig SSDs that I RAID 1 together for redundancy. And then the boot disk in each hypervisor is just some random little budget NVMe M.2 drive that I stuck in there, because that's not the important part. All of my supporting infrastructure mostly runs off of this one NUC that I had lying around gathering dust, a small little Intel Core i3 with 16 gigs and a little 500-gig SSD slapped in there. So the bulk of the expense of this is all in here, because this is way overkill for any workload any individual person could ever have. But you know what? That's what makes it fun.

My hypervisors each host an identical workload. The way I planned it out, the size of the cluster would be three control plane nodes, one on each hypervisor, and then nine worker nodes split three, three, three across the hypervisors. The control plane nodes I gave four vCPUs, 10 gigs of RAM, and a very small 50-gig root disk; they don't really need much more than that for what I use them for. And the worker nodes get eight vCPUs, 16 gigs, and 50 gigs of root disk.
And then I also pass one of the 4-terabyte hard drives into each one. That gets used later to set up the Rook plus Ceph cluster for distributed storage for all of the container workloads. And then the bootstrap node, which is very temporary, is just another VM that gets spun up with four vCPUs, 8 gigs, and 120 gigs of root disk. It only stands up for maybe half an hour when you first set things up.

So that kind of takes care of the required machines requirement over here. I read this and thought, okay, let's see how far I can push it. Wait, let me zoom in on this so people can see it. On the control plane, you'll note that I'm really not going as hard on storage as the recommendations say. That's okay, because log rotation is a thing, and so far I haven't run out of disk space yet. The compute nodes I'm over-provisioning, and the control planes I'm kind of under-provisioning. That's also okay, mainly because, as you'll see (I have all my nodes up over here, or are they right here, is that showing up at all? hopefully), they don't run out of memory very much as it is. So it all works out in the end for something like this, which is totally overkill anyway.

The main important thing that took me ages to get going was the networking stuff. Oh, dear Lord, the networking stuff. Where are the network connectivity requirements, somewhere in here... there they are: networking requirements for user-provisioned infrastructure. This section of the docs took me maybe a week of just reading and experimenting to straighten out what is necessary and what isn't. It goes into a lot of detail on which ports need to be reachable from which subnets, and I suspect that's so that people who have actual real network topologies can set up their routing rules correctly. Whereas I'm just on a flat home network where everything can talk to everything else, so a lot of this you can actually just straight up ignore, which is really great if you're in a home lab setup that isn't too complicated or doesn't have too much weird VLAN stuff going on. And then the really important thing that these docs don't actually mention for whatever reason, and hopefully after this somebody (maybe it'll be me) will remember to make a PR up to the docs repo, is that the nodes during their initial bootstrap need PTR records set up so they can figure out their hostname from DHCP and DNS. If you don't have that, they all come up with the same hostname, and then the cluster doesn't come up at all. And as Vadim says, the docs take a whole bunch of proxies and mirrors and all that sort of stuff into account, so they look really complicated, even though I think for most deployments, at least at home lab scale and probably beyond that, you don't really need to worry about most of it.

And then the important thing is just to have a load balancer. The docs talk about all of this, the ports and the stuff you need, and they also say you can have a separate ingress load balancer. Most people, I think, run the API load balancer and the ingress load balancer as the same VM, or at least I do.
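A quick, hedged sanity check of that load balancer might look roughly like this; the host names are placeholders, assuming a cluster named okd under example.com, so substitute whatever cluster name and base domain you actually use:

```bash
# Check that the four UPI-facing ports answer on the load balancer:
# api/api-int on 6443 and 22623, ingress on 80 and 443. Names are placeholders.
for target in api.okd.example.com:6443 \
              api-int.okd.example.com:22623 \
              test.apps.okd.example.com:80 \
              test.apps.okd.example.com:443; do
  nc -zv -w 3 "${target%:*}" "${target#*:}"
done
```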
For that load balancer, I just have a very tiny little VM that only runs HAProxy: two vCPUs, 256 megs of RAM, as little disk as I could get away with giving it for the install, and a very, very straightforward HAProxy config that I will show you later. I adapted mine from the config file that's generated by the ocp4-helper-node Ansible playbook. A few people made references to it during the main stage, but this is what it is: a big old Ansible playbook that sets up an all-in-one node with all of the supporting infrastructure you need to run a full OpenShift 4 cluster. The DNS, the load balancer, a web server that serves as a bastion, DHCP, PXE for bootstrapping the Fedora CoreOS or Red Hat CoreOS machines, NFS, TFTP. You just point this at a VM and it'll set it up to run all of it; really helpful. But for a lot of people's home labs, I don't know that it is. If you can figure out a way to fit it all into your environment, it will work really, really well, but if you already run your own DHCP or your own DNS, then parts of it become less helpful. In my case, as it turned out, I really only needed the HAProxy parts, the API and ingress load balancer, so I just spun that out as its own tiny little VM.

And then the other half of it is the port allocations and all the various DNS things that you need. I don't know where in the docs it is... there it is: user-provisioned DNS requirements. All of these DNS records you have to set up before you even try to start deploying a cluster, and the good thing is that after you set them up once, you don't have to touch them ever again. So that's what I did: I just set them up once as I experimented with getting the actual cluster up and running. These are all one-time things you do and can then set aside in a corner. As I said, because I am incredibly overkill, my DHCP and DNS are both managed by Active Directory. I have a full AD domain running in my home lab environment and I use it for DHCP, DNS, and also authentication and LDAP and such. So I have all my DHCP reservations set up. I know it's really small, you'll just have to take my word for it because I don't know if I can make it bigger without making it weird. You can see all my DHCP reservations here, and scrolling further down, all the various records: these two, api and api-int, are pointing at my load balancer, and all the DHCP static reservations get put here. I just set this up once, maybe ages and ages ago, and it all just works. And then especially the etcd records: the A records and also, crucially, the SRV records. This is too big. There. Crucially, the SRV records are actually some of the more important parts of this. Which I think are in here... yes, the SRV records for the etcd-server-ssl stuff. I don't know why they're necessary, but the documentation assures me that they are, or at least they were when I first set all this up back in the OKD 4.3, 4.4 days. Also very helpful, and this actually didn't used to be there when I set up these records, is that they now have an example zone database.
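Before touching the installer, you can sanity-check those records with dig. This is a rough sketch assuming a cluster named okd under example.com with placeholder IPs; note the etcd A and SRV records were only required on older OKD 4.x releases:

```bash
dig +short api.okd.example.com               # -> load balancer IP
dig +short api-int.okd.example.com           # -> load balancer IP
dig +short anything.apps.okd.example.com     # wildcard *.apps -> load balancer IP
dig +short etcd-0.okd.example.com            # one A record per control plane host (older releases)
dig +short SRV _etcd-server-ssl._tcp.okd.example.com   # the SRV records (older releases)
dig +short -x 192.168.1.21                   # PTR for each node IP -> its hostname
```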
The zone database helps serve as an instructive example, and see here, the PTR records, which did not used to be called out by name, but now they are. It's very helpful. Oh yeah, and DNS PTR records. Look at that.

So once you have all of the records and your load balancer and all that stuff in place, you can actually get around to deploying the machines. I believe the actual openshift-install program bundles Terraform and the Terraform libvirt provider to do this for the libvirt IPI-based deploys, so I just broke that out and I use that. I have a whole bunch of Terraform modules, one for each of my hypervisors and one for the bootstrap, and they all take care of it. I have a module in here that sets up the bootstrap, the master, and the worker nodes, and a module just for making sure I can download and push the FCOS base image to the hypervisors for the VMs to boot off of. Each one of these takes care of setting up the appropriate number of masters and the appropriate number of workers based on variables I pass in, so it's all very pluggable, and I have this tfvars file that I can use to set the SSH keys and everything. I'm not running anything very fancy on the hypervisors themselves, just bare libvirt. I did seriously consider running OpenStack, but that would have been too much even for me; this was already overkill enough as it was. The bootstrap, of course, gets its own separate module so I can bring it up and tear it down separately from the rest of the infrastructure. I'm sort of running through it; after I go and Vadim goes, there will be time for people watching to ask for more details on all of this.

That takes care of basically getting everything into place, especially the section about creating the FCOS machines. The documentation itself talks about doing a PXE install or an ISO install. If you're deploying to VMs, a PXE install is probably a little bit too much if you don't already have a PXE environment set up that you can just reuse for this. I think just using a bare qcow2, or an LVM volume with FCOS dd'd onto it, and then booting it with Ignition is the way to go, because QEMU can take the Ignition config directly and Ignition works with that just fine. That's honestly my preferred approach.

What else needs to happen? The bulk of my orchestration is actually done with a script here called do-the-thing.sh. It's a fantastic script. It's very specialized for just my environment, but it gives you an example of everything that needs to happen. I download the latest OKD release, I download the latest CoreOS release, and then I create the manifests right here. I have an install-config.yaml; let me pull that up very quickly. Here's my install-config.yaml, really, really simple. My base domain; for UPI you always set the worker replicas to zero; masters I set to three because that's how many I have; I give it a name; set the cluster network, service network, and network type; my public key so I can SSH to the nodes if I need to; and a fake pull secret, which I don't think is necessary anymore, but it used to be, and I've just been too lazy to get rid of it. And so from here I create my initial configs. Terraform has been configured to point at the ignition configs that are generated by the install.
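The config-generation half of that script boils down to something like this sketch; the directory name is arbitrary, and the installer consumes install-config.yaml, so keep a pristine copy outside the directory:

```bash
mkdir -p install-dir
cp install-config.yaml install-dir/
openshift-install create manifests --dir=install-dir          # lay out the cluster manifests
openshift-install create ignition-configs --dir=install-dir   # bootstrap.ign, master.ign, worker.ign
# Terraform's ignition variables are then pointed at install-dir/*.ign
```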
I terraform apply, and then I wait for the bootstrap to complete. And this is what we were talking about earlier: in stark contrast to 3.11 and its giant pile of Ansible playbooks (I don't know if any of you are familiar with that, but it took a long time, could fail at any part of it, and you always had no idea why it failed or what you could do about it), this is way easier, because it's kind of binary: it either worked or it didn't. If this doesn't work, there's really nothing to worry about: take the VMs down, try again, and if it doesn't work three times in a row, ask for help. It's great. As somebody just trying to use it and get it going, there's so much less that is environment-specific that could go wrong with this setup, and I think that's part of the whole reason they did OpenShift 4 the way they did coming from 3.

Just to chime in: at one of the meetings I had attended, they said they had something like 11,000 support tickets generated just off of the different setups from folks using different OpenShift 3.11 configurations. So now that this is a little bit more immutable, that's the whole reason it's a whole lot simpler. All right, that's all, I'll stop chiming in. Yeah, no, for sure, for sure.

And then after the bootstrap is done, I take down the bootstrap. I sleep for 20 seconds, which is actually too much, just to give HAProxy time to realize that the bootstrap is out of rotation, because I am incredibly lazy. So here's my HAProxy config in all its glorious detail. I just leave the bootstrap in the HAProxy backend and I use the TCP check, and it just doesn't route anything to it. It's great, I don't have to think about it; this is again more static configuration that I get to set up once and leave forever. It also takes care of figuring out where my ingress replicas are, and the machine config server stuff. HAProxy is truly a beautiful piece of software. So I sleep 20 seconds to let the load balancer take the bootstrap out of rotation, and then I sleep for another 10 minutes, because something is happening here and I don't know what it is. This is kind of the downside of having a very opaque, it-either-works-or-it-doesn't setup. I really have no clue why I have to wait 10 minutes here, but I know that if I don't, the API server will sometimes refuse to work: I'll make an oc call and the API server will just say, no, I'm sorry, I don't know who you are, go away. I don't know why, but if I wait 10 minutes, it doesn't happen. So in the interest of having a run-it, walk-away, come-back-an-hour-later-and-your-cluster-is-up kind of script, I just sleep 10 minutes. Whatever.

And then I do this specifically to annoy Vadim: I sit in a loop and just approve all the initial worker CSRs, because I trust the VMs that I spun up 10 minutes ago. Vadim has repeatedly told me, and it is good advice, do not trust infrastructure that you just spun up in the cloud to be from yourself. He's right. I don't know how that could happen really, but he's right, it is a possibility. It's just not a possibility in my setup behind my TV here. So I just spin in a loop until all of my workers are approved, roughly like the sketch below.
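That approve-everything loop is roughly this; fine for a closed home lab, bad practice anywhere else, and the node count of 12 is an assumption specific to my three masters plus nine workers:

```bash
# Keep approving pending CSRs until every expected node reports Ready.
while [ "$(oc get nodes --no-headers 2>/dev/null | grep -c ' Ready')" -lt 12 ]; do
  oc get csr -o name | xargs -r oc adm certificate approve
  sleep 30
done
```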
And once that's done, I label some stuff for the Rook/Ceph deployment. That's the other big thing that I do. The 4-terabyte disks that I use: I just pass the raw device into each worker VM, and each worker is actually running a Ceph OSD on it. So I have basically one worker per Ceph OSD, which means nine disks in the Ceph cluster. This labeling is all chassis and zone information for Ceph's topology handling so that it spreads placement properly. And actually, right about here I did want to point out: after this step, after the worker CSRs are approved and they all report Ready into the cluster, as of right here, technically, the OKD setup part is done. That's the fun part. Everything after this point, about halfway through my script, is just post-deployment configuration, or day-two setup, I think, as the docs call it somewhere near "post-installation configuration". Right after that, the cluster is up and it is technically usable. It's not very useful at this point, because there's no container storage, the registry is not deployed, nothing like that, but technically it is up and could run workloads. And that's very cool to think about.

So after that, I do some housekeeping kinds of things. I patch the ingress controller for the wildcard cert that I use for internal stuff, and then I have to wait for the ingress to restart itself, and then the MCD reboots all the nodes for some reason. I don't know why; I think it's probably to get the CA certificate stuff in there. So I just wait for that to finish; I hacked up a little for loop to wait for it. Once that's done, I set up my LDAP authentication. As I said, I'm using Active Directory, so I just have a little LDAP YAML that sets that up. And then after that, I set up Rook. And Rook, because of some fantastic work done by literally everyone on the Rook side of things, is almost easier to set up than OKD itself, which is incredible, because I've also set up Ceph clusters by hand and with the ceph-ansible playbook, and what a battle that was. So having this, where I can literally oc apply a couple of YAMLs straight from GitHub and a Ceph cluster will come up, is incredible. It's amazing. I cannot recommend it enough. If you have gotten to the point where you can stand up an OKD cluster, or really a Kubernetes cluster in general, and you just have some spare disks, give them to Kubernetes, put Rook on it. Life is so much better; it all just works. So I set up my storage classes and my pools and stuff, and that's kind of a tangential thing, I can go into more detail about it if anybody's interested. And then I just wait for the Ceph cluster to come up. After the Ceph cluster is up, I tell the registry to go use it, and it goes and just makes a PVC for itself. I wait for that to bind, and I patch the registry so it exposes the external default route.
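Those storage-related day-two bits look roughly like this; the Rook file names come from its upstream example manifests and move around between releases, so treat the paths as placeholders:

```bash
# Rook operator and Ceph cluster from the upstream examples (file names vary per release).
oc apply -f common.yaml -f operator-openshift.yaml
oc apply -f cluster.yaml -f storageclass.yaml

# Point the internal registry at that storage and expose its default route.
oc patch configs.imageregistry.operator.openshift.io/cluster --type=merge \
  -p '{"spec":{"managementState":"Managed","storage":{"pvc":{"claim":""}}}}'
oc patch configs.imageregistry.operator.openshift.io/cluster --type=merge \
  -p '{"spec":{"defaultRoute":true}}'
```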
Then I configure MetalLB, which is the other half of the magic here that allows home labs to be super, super overkill and cool, because, as far as I understand with my admittedly incomplete knowledge of the Kubernetes ecosystem in general, if you have something that can't be routed through your ingress controller, non-HTTP traffic or something like that, then basically the only way you can get it out is a load balancer or a NodePort. I didn't really want to do NodePorts, but I was using them as my only option for a while until I discovered MetalLB, which basically makes it feel like you're running in a real data center: it will just use ARP to advertise an IP and redirect traffic to it, and it works really, really well. I would recommend everybody deploy MetalLB anyway, just so you have access to LoadBalancer-type services.

And then I configure my monitoring. OKD comes with all of this monitoring, so I just have a little helper program that I will pull up. It's a tiny little thing written in Rust; I actually have it linked from my OKD deployment configuration guide. Here it is: just a small program I wrote, a little web server that waits for alerts from Alertmanager and posts them to Discord. So I have basically my own ad hoc single-person monitoring and alerting setup, all thanks to OKD. I shudder to think how much work it would be to set up the Kubernetes mixins and do all of the Prometheus stuff manually on a vanilla Kubernetes cluster. So this is honestly a huge value-add for OKD in my book, that I get such comprehensive alerting for free. I've gotten alerts for everything from "hey, the NTP service isn't running and your clocks are out of sync" to "your etcd is slow" to "hey, you've got a PDB up". During updates, Rook sets up PDBs for the Ceph cluster to make sure the rebalancing has settled. It's really, really comprehensive, and it all just works. It's amazing.

And the final thing: I disable the samples operator, because I'll never use it. And then after that I'm done. This process, all of these steps end to end, takes maybe an hour and a half to two hours on my infrastructure, and it's totally repeatable. I could take the cluster down right now and spin it up again, and two hours later we'd be right back where we started. So at the end of it I have a 12-node, completely overkill home lab cluster in which I run basically everything: a whole bunch of stuff that I used to run in bespoke random VMs I now just run here. I have cron jobs for backups and for running all kinds of weird things. As an experiment I set up an authenticated SMB share, so I'm running a domain-joined Samba as a pod, a StatefulSet, inside this cluster. That's really fun. It took a while to get here, because I started with a single-machine, I think three-node, OpenShift Origin 3.11 cluster, and then when OKD 4 came along I was very eager to hop on the train and make the home lab as big as I wanted it to be. But now, after a lot of overkill, I'm in a really cool place. So that's kind of a quick overview of my totally, totally unnecessarily overkill home lab setup. Here's some of the software I run in it, completely not worth the amount of resources I've thrown at it, but who cares, that's not the point. And with that, I think Daniel is asking for a link to the repo that I've been showing. It's, well, it's also private.
Unfortunately, yeah, it is private, because there are secrets all over it. Right. Like, this is half of it; the other half is the services repo, which is where I deploy all my workloads with Ansible playbooks. I have a role for each namespace that I run stuff in, so these are all the services I'm running, and there are secrets all over here too. So I unfortunately can't make it public. I can pull the scripts out that I'm using; there's nothing too weird in there. I will definitely think about pulling the scripts out and adding them to the deployment configuration guides as an example. That's a good thought. But I can't show you the scripts as they are now, because, again, secrets all over the place. It's a private repo just so I didn't have to worry about it. Yeah. Link to the deployment configuration guides, awesome. Now, that's my fork of it, and it will all be merged into the main one by elmiko, Mike McCune, and then hopefully afterwards it'll all end up in the main OKD repo. Yep. So that's my stuff. How do I... sure, there we go. So that's my completely and utterly overkill home lab setup, and I think Vadim's also got a much more tame, normal type of setup.

Tame? Yes. Normal? No. Do we wait for questions or should I just jump in? Um, I don't know. Honestly, whichever way you want to do it. Maybe we can go through both of our setups and then take questions. Yeah. Okay. So I'll share my screen and we'll look into my setup. So first of all (am I muted? No?), first of all, do not repeat this at all. It runs a single master, meaning you probably won't be able to update. Second of all, it's very, very slim on resources, unlike the previous setup. I have a machine with 20 gigs of RAM, and the other one is a default laptop with 8 gigs of RAM. You won't be able to run much there, but as an example of how low you can go, it kind of works. The core part of my stuff is my router, which is a standard EdgeRouter from Ubiquiti. Here's the picture; you can enjoy my insane cabling skills. It provides me with DHCP, and I have all the hosts pinned to their particular MAC addresses, and somehow it also manages DHCP in a way that the hosts get their hostnames automatically. I don't know networking that much; I didn't use any PTR records and such, but somehow it just works. Another part is that the router has dnsmasq embedded, but the UI is terrible. This is why I've set up an AdGuard Home here, because I can set up DNS over TLS, and that stops pushing my ISP to its limits, because for some reason it hates UDP. And I can also define my own custom hosts here; all of them are pointing at my load balancer machine, effectively. So that's the router. Next comes my storage box. Here it is. It has NFS, it has a single-node Ceph cluster. Also, don't ever do this. But I need it because NFS is so bad that SQLite basically doesn't work with it at all, so I had to host simple block storage so that I could use CSI and run some software which uses SQLite. This host also runs a typical HAProxy, also copied from the helper node, which has a very, very standard HAProxy template. Next come the actual hosts. I have just two of them. I initially provisioned a laptop, which is now my compute node; I started it as the bootstrap node. It has 8 gigs of RAM, which is barely close to what the bootstrap needs.
If you have a chance to get 16 gigs of RAM, that will save you a lot of time. I also upgraded the default SSD to an M.2 drive, also pretty small, because etcd was otherwise happy but was showing quite a lot of latency during upgrades. That's the time when you pull a lot of images and start new containers, and etcd was very unhappy about that. Yeah, pictures: it's just a box. That's my Ceph dashboard. Also, don't do that. But since I'm using 20 gigs of it for the storage, and I don't experiment on this cluster, just actual production, that's more than enough for me. And that's the view of my OKD stuff. I run quite a few projects there. Most notably, the most helpful operator is probably Pipelines, because what I can do is change things in the console, unlike the GitOps approach. I change things in the console, and periodically, I think every couple of hours, a script runs and saves all the manifests using oc adm inspect (roughly sketched below). It saves the cluster version, the operators, node status, all the OLM operators I'm using, and inspects all the projects I have access to, because I'm not much interested in the default openshift projects. It removes some nonsense from them; I don't care about particular parts, I don't care about events, and I strip the boring stuff like generations and so on. Finally, it gets committed, saved, and pushed into my internal Gitea instance. So every couple of hours a new commit is created, and I can see what has changed in the cluster. It helps me track back what has broken and kind of restore the state of my cluster to whatever I had, especially if a particular application breaks down. Another useful operator is SnapScheduler, which helps me create snapshots of my PVCs here. Do I have access to it? Yeah. So I use Ceph as a CSI volume... no, no snapshots here, maybe this user doesn't have access, no. And every couple of hours or days it creates a new snapshot, so if something breaks in the application itself, I can easily roll the PVC back and replace it. Next comes a wonderful piece of software called Loki, plus Grafana, which stores all the container logs almost effortlessly. I think it uses 150 megs, and each Promtail agent uses 70 megs, which is nothing, but I can do searching by logs, for instance. Okay.

And the biggest downside of this setup is that, since it's a single node, it's incredibly hard to update. All the operators work fine until it stumbles on machine config, because machine config has a setting that no more than one master node can be down at any time, and I only have one. So what I have to do, to make it reprovision itself back to the original, is fetch the master's machine config, annotate the node with the desired config as if it had already upgraded, and tell the machine config daemon to upgrade it, to reprovision the whole thing. It doesn't work out of the box, because it also tries to install the necessary OS extensions (most importantly the NetworkManager OVS bits), so I have to cancel it in the middle. And if the node doesn't come back, I have a small heart attack, because I would have to fix it. This has bitten me quite a few times, so use it at your own risk. Installation is covered; upgrading, this is very dangerous, but the whole issue is supposed to be fixed in 4.8, so I'm really waiting for that to land in stable.
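Going back to that periodic dump-and-commit job for a second, a loose sketch of it might look like this; the resource list, the paths, and the Git remote are all placeholders, and the real script also strips noisy fields such as status and generation before committing:

```bash
#!/bin/bash
set -euo pipefail
cd /srv/cluster-state
oc get clusterversion,clusteroperators,nodes -o yaml > cluster.yaml
for ns in $(oc get projects -o name | sed 's|.*/||' | grep -v '^openshift'); do
  oc get all,configmaps,pvc -n "$ns" -o yaml > "${ns}.yaml"
done
git add -A
git commit -m "cluster state $(date -Is)" || true   # no-op when nothing changed
git push origin main
```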
Useful software: yep, this Grafana is provided by the Grafana operator, which takes care of all the data sources and the dashboards, and can upgrade Grafana from one version to the other just by changing one setting in the operator. SnapScheduler I covered; Tekton also covered. Other useful software: Gitea as a Git server; Home Assistant, great stuff for controlling smart home appliances or just collecting all their information in one single place; Bitwarden to keep passwords; MinIO, an S3-like storage, which I don't think I've mounted anywhere in my apps, but it's certainly possible with various CSI stuff; Nextcloud, a terrible PHP application, but it does its job since it syncs files across multiple devices; Navidrome, a great music server which follows the Subsonic protocol; Miniflux, an RSS reader; a lot of Fediverse stuff like Matrix (Synapse) and Pleroma; and Wallabag, a great application for keeping pages to read later. And I think that's probably all I've got.

Two very different extremes there. I like it. Yeah, that's awesome. Hey, Vadim, did you have a link to that page you were just looking at? Yeah. That would be a good time, I guess, to ask if there are any questions for either home lab. Or we can just sit here and nerd out about how cool all this is, because this is very cool.

Looking at your script, I see that you're patching stuff in the middle of the deploy. Yes. I think that can be worked around; rather, the proper way to implement this is to pass manifests to the installer. It has a special folder where you put stuff and tell it to keep applying it. Like, yeah, this one. So when you change the wildcard cert by patching it afterwards, that certainly makes the MCD lay down a new file, so instead of getting one consistent config at the beginning of the boot, it gets it in the middle, and that triggers the reboot. Doing it as a manifest should save some time. That definitely will save me some time. Yeah. To be fair, I have not really looked at all of the YAMLs in there and tried to figure out what they're doing. All of the cluster configs, all of that stuff, gets laid out in that folder first. Cool. Rather, it gets mixed in with what the openshift-installer generates. Right. And there is a create manifests command to lay them out. I think I run them both, because I actually do have to set mastersSchedulable from true to false. Not that it breaks anything anymore, but I'm unreasonably paranoid about running random workloads on my masters. I think I played with it once and it almost broke everything, so I put it back. They all get merged into one in the end. So there are two ways to approach it: either generate and edit, or create your own... wait, no, maybe it won't work with the scheduler config, there's only one. Or maybe it would. So the other option is you lay out just the masters-schedulable change in its own YAML file, and the openshift-installer merges it as if it had been changed from the beginning and generates the ignition files. From that point it looks as if it had been there forever. So I could make a file that is just a patched subset of this YAML, put it in that folder, and, I guess, as long as I named it appropriately, it would merge it. Yeah. Again, it's only possible if you have it statically; there's no templating there. Yeah, yeah.
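For what it's worth, that mastersSchedulable tweak done as a manifest edit (rather than a post-install patch) is roughly this sketch; the directory name is arbitrary, and you could equally drop a pre-patched copy of the file into the manifests folder before generating ignition:

```bash
openshift-install create manifests --dir=install-dir
sed -i 's/mastersSchedulable: true/mastersSchedulable: false/' \
  install-dir/manifests/cluster-scheduler-02-config.yml
openshift-install create ignition-configs --dir=install-dir
```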
But everything I'm doing is totally, totally static. And doing the wildcard cert as a manifest should certainly save me a lot of time. Yeah. Most of the time here is just... and this also answers the question Dan just asked about which provider I'm using: the libvirt Terraform provider, that's what it's called, this guy. This, I also believe, is one of the providers that gets built into openshift-install itself for the libvirt IPI deploys. This provider is extremely handy. It handles the ignition, it handles disk configuration, network configuration, the whole lot; it basically does almost everything, and they also provide an XSLT escape hatch to configure bits of the libvirt XML that it doesn't quite know how to do yet. So it's an extremely flexible tool, and that's what I run against. Inside Terraform here, libvirt1, libvirt3, and libvirt4 are the names of the three hypervisors (libvirt2 is the NUC, which gives you an idea of the order in which all this stuff got set up). Each of them has a main, with host = var.host and so on, so they all just get set up, and each will deploy a master and three workers with all of the config, using the modules in here. I'll just look at the master module as an example. This is all using the resources provided by that provider: libvirt_ignition, libvirt_volume, libvirt_domain. These files get passed in from up top, which is kind of confusing to see all in one place, so let me go up here and show you. For example, this content field is where the ignition file comes in, and that's var.ign_file, which I actually specify up here in the per-hypervisor config, pointing at config/master.ign, which in turn is where the openshift-install binary spits out its ignition files. So once I've run all of that, I just pass it here; the provider ships it to the hypervisor, sets QEMU up to use it, and from there it all just proceeds as normal. Super, super convenient, and it saved me having to set up a whole PXE server for the temporary purpose of deploying these VMs, like the docs recommend. And I think that's probably something we could tell people: if they have an entirely virtualized environment, use a provider like this. I don't know how helpful it would be in a generic sense, because UPI is literally everything and this is kind of provider-specific. But I'm very nervous about recommending to people what to do, because they should know what they want, and all we can give them is a bunch of options. This is why we're collecting all the guides, and I think focusing on some particular way, like "use Terraform", is not the right way to do it. Yeah, I get that. But it's just an example of what you can do with Terraform and patience. I think the longest part of this, unfortunately, is waiting for the Fedora CoreOS download, and then I actually have an LVM-backed setup to carve volumes out of, so I have to turn the qcow2 that comes down from the FCOS download into a raw image, and then take that raw image and dd it four times, for one master and three workers, onto separate LVM volumes. That actually is what takes the longest, unfortunately, and I don't think there's a way around it. Oh well. But actually setting everything up is fairly quick.
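That image-prep step is roughly the following; the volume group and volume names are mine, and the exact FCOS file name obviously changes per release:

```bash
# One FCOS download, converted once, then dd'd onto an LVM volume per VM.
xz -d fedora-coreos-*-qemu.x86_64.qcow2.xz
qemu-img convert -f qcow2 -O raw fedora-coreos-*-qemu.x86_64.qcow2 fcos.raw
for lv in okd-master0 okd-worker0 okd-worker1 okd-worker2; do
  dd if=fcos.raw of="/dev/vg_vms/$lv" bs=4M status=progress conv=fsync
done
```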
Just a gut measurement. I wonder if you have access to the cluster API stuff. Like, when you set it up this way, you don't get any MachineSets, right? No, no MachineSets or anything, because I would need the libvirt machine-API provider. That would be another approach, but I don't know how flexible the libvirt machine operator is going to be. That's something to look at. It's interesting; I've seen the project and I'm keeping an eye on it, but for now deploying it statically is the way to go, especially because it's a very static configuration. It's only ever going to be these 12 VMs, and they're going to be set up exactly just so. The machine health checks would be the thing that would be super helpful to have from a MachineSet; I don't really need the autoscaler, I have no use for it, but the machine health checks would be cool. The health checks were basically implemented for spot instances; it doesn't make much sense in the virt world. But yeah, if you don't have an autoscaler, then having the whole MachineSet cluster API thing is probably an even bigger overkill, which is a challenge. Yes. I'd love to figure it out. And so, like, these are all just auto-generated, but they're all here. I don't know what to do with them, but it shows me them. Yep.

The main part is the nodes. It's funny that all of this, and all of my RAM, legitimately all of it, goes straight to... how do I do it? Yeah, nodes, or I guess I can go to projects and sort again. No, legitimately, all of my RAM goes to Ceph. I have no idea what it does with it, to be honest. I should; I don't. But yeah, 45 gigs, basically 44 gigs, straight to Ceph. It's half the reason I had to go this wide in the first place, because when I tried to set up smaller Ceph clusters, they just ended up eating all the RAM on the host itself and I had no room left to run actual workloads. Does it go to the actual OSDs or something else? Yeah, it all goes to the OSDs. Oh, all right, it's happening. This view always takes a while in the per-project view; I wish I knew why. That's also probably because my etcd cluster is slower than it should be, but... I think I've seen a bug related to the console for that. Oh, there it is; this is certainly a console bug. I'll go straight to the pod view then, that will work. So all of the OSDs are at the top, and they're all very uniform: they all go up to around four to six gigs and then kind of sit there. I have no idea why; I suppose Ceph is doing something with the memory. Everything else is kind of tolerable, the mons and the MDSes for CephFS. But in general, because Rook is upstream of, I think, OCS these days, they have special OpenShift support in the upstream Rook project. So I get the PDB support, which is very helpful, especially during rolling cluster upgrades, which are possible and work pretty much flawlessly with this multi-master setup I have. So kudos to everybody on the OKD team for that, and the OpenShift team, because that can't have been easy. The PDBs are really, really cool, because the Rook operator will set a PDB as the cluster reboots worker nodes, or rather nodes with OSDs on them: it'll reboot one of them, wait for it to come back up, and then wait for the Ceph cluster itself to settle and stop rebalancing before it reboots the next one.
So my cluster upgrades take longer than they should, because it's not just a quick VM update, but on the other hand my data availability doesn't get interrupted at all, which is very, very cool. Yeah, consistency and avoiding disruption are the guiding principles of the upgrades. The time is not; rather, we care about control plane upgrades, that is super important, and about workload disruption. If it takes you a year to upgrade, sad, but you can move on; you can still keep upgrading and the worker nodes will catch up. That's the tradeoff we have to make. No, I totally get it, and I just want to point out that it works extraordinarily well, and the PDBs all fit in with it nicely. It's little things like that that have me thinking: how would I set this up with vanilla Kubernetes? Especially these views here with the monitoring. Where is the world's best dashboard... here it is, world's best dashboard.

Neil, are my scripts public? Sadly, no. There are secrets all over them, or all over this repo in general, actual secrets like my private key and stuff that I use everywhere, so I can't make this repo public. I can try to pull out the scripts, but I don't think they'd be helpful for anybody who isn't me, because they are very, very specific to my hardware topology, as it were. But I can definitely show them as an example, annotated or something, in the configuration guide that I have here.

Have I experienced etcd slowness? Dan's asking whether I believe that's a config problem or inherent to the hardware I'm using. I'm pretty sure it's my hardware. Honestly, this is the weirdest issue I think I've ever seen, because two of my nodes will, every now and again, go "hey, etcd is running slow" and then it just resolves itself a few seconds later, and the third one is just rock solid. That's what the world's worst dashboard likes to show me a lot: in the disk sync durations here, you'd see a couple of them way too high, so one of them is in the four-to-six millisecond range and the other two are at around 90 to 100 milliseconds. It's identical hardware; I have no idea. They're all almost exactly the same spec, I think the only difference is the motherboard. Very, very weird to see. And now that I'm using it, these two spike up and the third one is just fine. I have no clue. Yeah, well, the etcd leader would have a worse sync duration, of course, but two of them, that's a bit odd. It is weird; I haven't figured out any rhyme or reason for it. I've seen some other things too: back before I backed off a bit, there was a time when oc get clusterversion consistently took 60 seconds, and I was like, that's weird. So there are some oddities about etcd and what it's running on, and it doesn't seem inherently related to disk usage, because they're nice SSDs, and if I go look at them with iostat or something, they're fine. But that's troubleshooting for another time.

Neil, could I split out the secrets into a bash file to source or something? Perhaps. I am kind of thinking about figuring out a way to pull the secrets in from a separate private repo and then have all this stuff public so that people can reference it.
I might put some more effort into doing that. It's just not been on my radar, but if people are interested, I can definitely make the effort over a couple of weekends to do so. Yep, Dan's also chiming in with "please do." I certainly will make the effort. Yep, anybody else? Anybody have a prospective idea for setting up their own home lab? It doesn't have to be overkill. What if, since you're using three hypervisors, you set up OKD on bare metal there, with schedulable masters, and then on top of them you set up KubeVirt and run OKD clusters within it? I think 4.7 or 4.8 has a native KubeVirt integration, nested OKDs. Yeah, I think you just have to enable it; there was a section in the docs about it, if memory serves, somewhere down here. Maybe, I don't know, it was in the installation docs somewhere. Yeah. But the problem is... it would be very interesting, but it would tie all your machines into effectively one hypervisor, so it might not be that reliable. But it's one of those things to play with. Yes, definitely something very interesting to play with. But my primary goal here was to see how highly available I could get something that wasn't running in somebody's cloud. The reason I'm using Active Directory, which is a complete tangent, is that Active Directory is basically the only thing I could find that does transparent DHCP failover back and forth. After the first time my DHCP server decided to crap out for some reason, and I woke up one morning and nothing could get a DHCP address, I was like, this sucks. And AD was the only thing I found that can transparently fail DHCP management over to another computer and then fail it back to the first one when it comes back. So that's actually a lot of the reason why I'm using AD, even though it is in itself incredibly overkill. Interesting, I haven't heard a lot of DHCP failure stories, but... yeah, the VM itself crapped out, unfortunately, and so it took DHCP with it. But yeah.

Okay, folks, I have to drop. I think I'll rejoin in a couple of hours to see if some questions need answering, but if something is left unanswered, let's keep it in Slack and we'll get back to it. Okay. Have a great evening. Bye-bye. Yep, later. We can continue chatting here, or if anybody else has any questions... Neil, Dan, I know you guys, and I know you've heard my stories about this before as we were setting it up. Greg, I don't know, got anything else? I got nothing. Good stuff. Pretty awesome home lab there, I'm impressed for sure. We spent a lot of time on it. Too much time. I think Neil and I actually first started with a single-machine three-node cluster, not OKD, OpenShift Origin 3.11. So I first started down this path in, when was it, Neil, 2018, 2019, somewhere around there? Yeah, 2018. So this is a three-years-in-the-making home lab, and I think I finally got it to somewhere where I can use it as a base to play with, because everything before was just getting it going, and as soon as you got it going, it would break, and then you'd have to do it again. That's just time and money, Neil. Yeah. I've been telling them that for ages, but of course, now is not such a good time to be buying consumer hardware. Sadly, it is incredibly dumb now. Why not buy all that stuff used?
I think that might be the only way to do it, but even the used market now is starting to get bad, because people are starting to understand that prices are inflated, so even on eBay stuff is going up now. The other part of it is that I actually built my boxes from scratch, because if they're sitting just a few feet away from me, I can't have big rack-mount or Dell PowerEdge type things being too loud. And unfortunately, no one is really selling their old gaming PC chassis, and if they are, they make me pay too much for the graphics card. Yep. Yeah. If you come across a 3070, you let me know; I'm still rocking my 1080. I'm running an R9 390 from five years ago. This was the year I was supposed to upgrade, man. Yeah, you picked a bad year. Thought, hey, five years, that's enough use out of one graphics card? Nope. But yeah.

These scripts... Neil has been badgering me for ages to polish them up and bring them out publicly, or to turn this repo public, but especially in here, I think it's worth looking into just for the flexibility of the Ansible playbooks. I know the folks who are maintaining the Ansible collections, the k8s module and then the community.okd collection on top; they dropped by the working group a few meetings ago to talk about their work there. I've actually been using the Ansible stuff really heavily since way before they showed up, simply because I'm more familiar with Ansible than I am with Helm templating. All of the config maps and stuff, you'll note, just look like a normal Kubernetes YAML file, but I think the Jinja templating is more powerful and flexible than Helm templates, which are just text interpolation; that's a little bit too iffy for my purposes. I just prefer Ansible because I know how to use it, because I think the templating is more powerful, and because Ansible focuses on being able to reapply the same thing and incrementally make progress: if you mess something up, it stops right there, you fix it, and it keeps going. I know Kubernetes itself has the capability to do that, but I can also mix other types of configuration into one of these playbooks if I need to. I don't have to worry about a bash script that has to run Helm and then maybe a little playbook for something else that I need to set up. It's really nice to be able to standardize on one thing, and Ansible has the most flexibility. So I deploy everything through Ansible. And one of the neat tricks I actually found, which I don't recommend anybody ever do for a real production workload unless they really know what they're doing, but which is super helpful in a home lab environment where you don't have to care... hold on, let me go find a good example. I think, yeah, the media stuff is full of it, right here. So OKD's image streams have the capability to monitor an upstream tag for changes, and when the tag changes, it'll pull it in, and that's super helpful.
So what I do is set up a build config that triggers on an image change: I set up a tag here that's just "upstream", which looks at wherever the upstream Docker image is, and whenever that changes, it triggers the build config to do a new build, and then the deployment config... and that build config pushes to a tag that's only used internally. Oh shoot, I'm not... you're right, Dan, I'm sorry. You were on such a roll, I didn't want to interrupt; I was following you. No, no, please interrupt me. So, as I was saying, this is a very neat trick for home lab deploys, to keep the stuff running inside up to date without any manual intervention whatsoever, and the key is that OpenShift image streams and image stream tags can automatically poll an upstream Docker image repository for changes to a tag. So here I have this image stream tag called, internally, jackett:upstream. This is a tag in the internal registry; jackett:upstream is monitoring the GitHub container registry tag for Jackett, latest, up in GitHub's infrastructure, and it polls that for changes. Any time it changes, it pulls it down locally and triggers an image change update, which gets picked up by the build config, which can then rebuild: I say, okay, do a build based on that upstream image and push it to an image stream tag called jackett:latest, which is just an internal one. And then down here, my deployment config is watching the jackett:latest tag, and it triggers a redeploy every time a build succeeds. So once you set all that up, so long as there is an upstream tag that sees changes, whether it's a latest tag or a version tag or something like that, every time it changes, without any thinking or manual management on my part, it will literally just push and redeploy, and it's wonderful. So if I go to builds here for media, you can see a whole bunch of stuff. You'd see this one's up to 33, and it just does it whenever it sees an update, whatever time of day. I believe by default they poll every 15 minutes, which is aggressive, but I can't change it, so we're stuck with that. And the deployment configs, or rather the pods down here, you can see all of the various deploy pods and all the various versions; it just happens, I don't have to think about it. And you'll note these two here didn't work for whatever reason, so it just stayed on five and I have to go figure out why that didn't work, but whatever. Honestly, that's really, really powerful, and to get that in a vanilla Kubernetes distribution, I can't even really think of a good way to do it out of the box. You'd have to cobble together two or three different things, maybe an Argo CD, or there's a thing I've heard people use in smaller K3s setups called Watchtower that can do something similar, though maybe that one only works by monitoring the local Docker daemon. It would be a pain to get that working, but with OpenShift and OKD it's just basically done for you. Really, really handy.
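Wired up from the CLI instead of the console, that chain looks roughly like this; Jackett is just the stand-in example, and the ghcr.io path and object names are placeholders for whatever upstream image you actually track:

```bash
# 1. Track the upstream tag and let OpenShift re-import it on a schedule.
oc tag --source=docker ghcr.io/example/jackett:latest jackett:upstream --scheduled=true
# 2. Rebuild whenever the upstream tag moves; the build config pushes to jackett:latest.
oc set triggers bc/jackett --from-image=jackett:upstream
# 3. Redeploy whenever a new build lands on jackett:latest.
oc set triggers dc/jackett --from-image=jackett:latest -c jackett
```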
And it's just one of those things: to me, the biggest wins are the ones that take away the toil around maintaining a big distributed system like this, whether it's for production or for personal use. Automating the little toil tasks, getting monitoring, getting alerting, figuring out what to monitor, figuring out what to alert on, setting up automated image deploys for the things you know you can set it up for, all of that day-to-day blah stuff that's not really interesting or fun but has to get done anyway, OKD just does for you. I have really comprehensive monitoring, and I get that for free. They're using, as far as I can tell, the upstream Kubernetes mixin monitoring stuff, and that all just gets deployed and it works, and it monitors the worker nodes themselves for disk space, NTP, anything I could think of; I've seen alerts for all of it just from playing around. Super helpful. It makes me feel like a badass, and it frees up my time to worry about actually doing more complicated things with my workloads. It's a good springboard to jump off of. Otherwise, I think most of my time would have been spent setting up all of that supporting infrastructure, which is just more stuff I'd have to set up to get to a level of confidence in what I'm running. And a lot of the time people just don't set any of that up, and they get surprised when their stuff fails; while that's okay in a home lab environment, it's not okay in production. So that's spot on. Those are my two neat tricks. You know: run Rook, run Ceph. It works. It is by far, I think, the easiest way to get a Ceph cluster deployed of any approach, really. And yes, it assumes you have a Kubernetes cluster somewhere, but once you do, Ceph is laughably easy. I don't know how they did it; I suspect black magic.

Daniel has an interesting question: when people ask you why in the world you would run Kubernetes in your home lab, what's your favorite example of a workload that Kubernetes makes easier for you? I think, honestly, the monitoring, just because it's really difficult to set up proper monitoring and metrics management for internal stuff, to the point where a lot of people don't do it. Though that is kind of an OpenShift thing, so I don't know if I can count it. If someone asked me why they would want to run OpenShift in their home lab, I'm not so sure I could give an answer that says it's worth running OpenShift in your home lab for one application or a set of applications. I would run it in the home lab, and I see other folks running it in their home labs, to help understand OpenShift. Maybe I'm answering this question wrong, but I use it more as a learning tool in the home lab as opposed to running actual work. Yeah, I think the people who would want to run Kubernetes in their home lab are the sort of people who are already messing around with Docker Swarm, or manually setting up Docker Compose files for everything they're running, their whole mishmash of internal little home-labby type programs. Those people, I think, would be well served by moving to Kubernetes, simply because it gives you a lot of stuff for free that you either didn't know you were missing or knew you were missing but didn't really feel like setting up.
Things like log rotation, things like readiness checks, things like failover if one of your random little NUC boxes dies: did you just lose everything running on that box? It's a platform as much as it is a tool, more than it is a tool. Kubernetes makes my workloads easier to run, but the nice thing about it is that any workload you were already managing yourself with Docker, you now don't really have to manage. You just put it in there, it'll run, it'll be there, and you can get to it. And that's really powerful, because it makes it super easy to just throw new things in; all of these home-lab-type projects give people the option to run them as a Docker container these days, so it's really easy to take what they give you, chuck it into the cluster, and it's just there. At this point I'm able to deploy new stuff in minutes to hours, which is no better or worse than it would have been with Compose, except I get so much more for free, stuff that ordinarily would have taken me more hours or maybe even days to set up for each individual application. And Neil's following up: yeah, OKD on top of Kubernetes is the best. Because if I were to make a crude comparison to Linux distributions, vanilla Kubernetes is like Arch or Gentoo. It's a base. It moves quickly, and you are meant to bring your own opinions and workflows and put them on top of it. Arch boots you up to a console login screen; you're supposed to install your programs, your workloads, whether you want GNOME or KDE or something else. OKD is trying to be more like Ubuntu: they already bring their opinions about stuff, but they make it fit together very well. The Ubuntu or Fedora of the Kubernetes world. And for people who just want everything out of the box to get stuff done, I think OKD is a fantastic option if you're already committed to looking into Kubernetes, especially once the single node cluster stuff comes out, so people don't have to set up multiple machines to make it all go nicely but still get the advantage of updates. And that's the other thing: I don't have to update any part of this setup manually. It'll just say, hey, your cluster is going to be updated, and it'll take care of updating the containers it's running and the underlying base OS. That's so big. I don't have to remember to SSH into everything once a month and run apt-get update or something; it just happens for me, and it's all tested. I wonder how many people are using RHEL as their base OS on their worker nodes in OpenShift 4. And I also wonder if they realize how much they're missing out. I know, right? Because I figure RHCOS, right, Red Hat CoreOS, probably does something much like what we've got with Fedora CoreOS here for OCP. I don't have enough money to pay Red Hat to find out. Yeah, no, so you could use RHCOS, and that's going to be the immutable deal, right? But you can also use straight-up RHEL, and then you won't have that over-the-air update type of thing. Oh, right. I wonder how many people are just going to run their worker nodes on RHEL and not use CoreOS. Man, I wonder if they know what they're missing out on. I really hope... I suppose ignorance is bliss in that case.
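To put the "readiness checks and failover" bit in concrete terms, this is roughly the handful of lines of spec that buys you both once a workload is in the cluster. It's only a sketch: the app name, image, port, and the /healthz endpoint are made up for illustration.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: some-homelab-app            # hypothetical app
spec:
  replicas: 2                       # a second replica means losing one node doesn't take the app down
  selector:
    matchLabels:
      app: some-homelab-app
  template:
    metadata:
      labels:
        app: some-homelab-app
    spec:
      containers:
        - name: app
          image: ghcr.io/example/some-homelab-app:latest   # placeholder image
          ports:
            - containerPort: 8080
          readinessProbe:           # don't route traffic to the pod until it answers
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 10
          livenessProbe:            # restart the container if it wedges
            httpGet:
              path: /healthz
              port: 8080
            periodSeconds: 30
```

If the node running one of those replicas dies, the scheduler just recreates the pod somewhere else, which is exactly the failover you'd otherwise be scripting by hand around Docker Compose.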
Maybe they're happier not knowing, because they might be kicking themselves if they realized how much they would get out of the box if they just let the community or upstream handle that for them, because it is honestly amazing. Like, I'm thinking about it: as soon as FCOS puts out ARM builds, I'm switching all my Raspberry Pis over to Fedora CoreOS and containers, because the management is just so hands-off with it. You only need to go into it if you notice something is broken; otherwise it'll just keep chugging away. You don't have to do so much toil. I think that's the word I keep using; I don't know if anyone else says "toil", but it's just the sort of maintenance stuff, the chores, the boring stuff I don't want to do, and it's gotten so much easier. That's the thing: yeah, it took me a while to get all this set up, but I got this basically locked in around January or February of last year, somewhere around there, back in the OKD 4.4 or 4.5-ish days. And ever since then, I've been able to upgrade and redeploy this cluster over and over and over and over, and I've been able to bring it back exactly as it is every time. And that's amazing. Just being able to work with infrastructure I don't have to worry about: what happens if I tear it down? What am I going to lose? Do I have all my configuration? This kind of forces you to do everything properly, which is helpful in and of itself. Like, this is all I need. It will set up the exact same workload that I had, it'll bring everything in, it'll restore from backups, and it all just works. And once you have your pattern figured out, you can just do it forever. So, well, I honestly wouldn't recommend running your own hand-rolled Kubernetes in a home lab, even on RPis or something like that. There's just too much extra stuff you need to worry about to make it actually usable, and there are a lot of hidden things you don't really think about, because Kubernetes is big and complex, and there are a lot of little things that can break that you'll never know about, because Kubernetes is really good at keeping going even when half of it is actually broken. That's the one thing I've noticed. It's very good at that, gotta give them props, but you'll never know that anything's broken until everything falls over. And that's happened to me before when I was testing out K3s, because that's where I did a sort of side foray into seeing if vanilla K3s was a good option for me. In some ways it was quicker to stand up and it used fewer resources, but I never had the confidence that I could put stuff on it and it would stay running two weeks, three weeks, four weeks later. And a lot of times it didn't. This OKD cluster has been up for a couple of months now, through various version upgrades, and it works fine; I'm pretty sure I'd be able to keep it up forever at this point. Yeah, Neil mentions that his vanilla K8s cluster keeps falling over, and I think that's just because it doesn't come with anything other than the bare minimum. You need to put your own monitoring stack into it; you need to put the Kubernetes monitoring mixins in and collect them, and that will definitely tell you what you're doing wrong, Neil, but you have to know to do that. You have to put them in there, something like the sketch below.
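On vanilla Kubernetes, "putting them in and collecting them" usually means running something like the prometheus-operator or kube-prometheus stack and then wiring up each app explicitly. A minimal sketch of that wiring might look like this; the names, labels, and namespaces are all hypothetical.

```yaml
# Tells a prometheus-operator-managed Prometheus to scrape this app's metrics.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: some-homelab-app
  namespace: monitoring
  labels:
    release: kube-prometheus     # must match whatever label your Prometheus selects ServiceMonitors by
spec:
  selector:
    matchLabels:
      app: some-homelab-app      # must match the labels on the app's Service
  namespaceSelector:
    matchNames:
      - homelab-apps             # namespace where that Service lives
  endpoints:
    - port: metrics              # named Service port that exposes /metrics
      interval: 30s
```

And that's one of these per app, on top of installing the operator, Prometheus itself, Alertmanager, the dashboards, and the mixin alert rules, which is roughly the pile of things the OKD monitoring stack ships pre-wired.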
You have to set up something to collect them, and something to look at what Prometheus has collected. It's a whole deal. I just get all of that, and it just works: the cluster is up, it's stable, it's fine. Very, very helpful. And my programs just update themselves; I don't even notice, because of the whole deployment setup. Wait until the single node clusters come out, Neil. Then you'll have somewhere to run it, because it'll just be any old VM that you want. I don't know, that's the one thing: I kind of want to know how they're going to do rolling updates on the single nodes. Because this thing, as Vadim was mentioning earlier, will update a master, but the etcd cluster needs quorum, so it knocks one out, updates it, knocks the next one out, updates it, knocks the third one out, updates it. With a single node cluster, I don't know how they're going to do it. I'm sure it'll be some measure of hackery. Do you know? I saw Vadim in his demo; he showed some code and I didn't quite understand it, but it looked like it might let him do it anyway. Yeah, his thing, because he knows how everything fits together: he's hacking something together in the YAML to trick the machine config daemon into thinking it already did the upgrade, by just overwriting a YAML file, and then he tells it to pivot and reboot. Because normally, and this is just my understanding of it, the machine config daemon is the thing that gets its orders from the machine config operator, and the operator is keeping track of who's rebooted and when and what state everybody's in. As that works through, it instructs individual machine config daemons to do the rpm-ostree pivot-and-reboot process. So on a single node, the machine config operator is like, I've only got one master, so I can't reboot it no matter how much it wants to, and it'll just sit there. So he's tricking the machine config operator into thinking the machine config daemon has already rebooted, and then he just manually kicks the reboot. And as he points out, that almost certainly won't always work out correctly. So yeah, Neil mentions maybe it'll require user-initiated reboots. I think that probably makes sense. It'll probably be like, hey, I did as much of the upgrade as I could, click this button to reboot. That would be the best-case scenario. I don't know if we'll have that for 4.8, but I'm sure that's probably where they want to get to. I wonder if there's a way to just... Hey there, how are you guys doing? What's happening here? Just wrapping all your demos up? Awesome. So, did you get any feedback on what we should be updating in the documentation? And yeah, so Shree, I've been jumping in and out of the different sessions, and I heard that you had lots of secrets in your scripts and that it was very custom to what you were doing. So I knew that. Of course that's the part you heard. I did, you know; that would be the part I jumped in on. That's awesome. So yeah. Yeah, no, I definitely think it's possible for me to split all of that up, to pull out the stuff that's specific to my setup, or at least have that scripted. I want to have it eventually in the OKD repo, just as a here's-an-example-of-what-you-can-do, and I'll probably fold the stuff I'm writing up for Mike McCune's repo into that. And it's probably killing Neil right now that he's not on stage.
Cause I'm not sure if we can add him in to do that. I'm not quite sure if we can invite him. But yeah, maybe later today; Neil's in three different sessions right now. That's what I like about you all. So what I would love to see you do today is at least make a pull request for a stub for your documentation into elmiko's repo. You know, just say this is a holding pattern, with a link out, and maybe just put in the additional resources that you've shared with people; I think I saw a few things get shared. Get that in there as a holding pattern for this approach. So yeah. And Craig, if that background behind your chair is real, I love it. It is real. These are panels from Amazon, seriously. I'm in a basement and it looks really bad, so I ended up framing it up and putting these up to make it look a little bit more professional. You look just like a rocker hipster, awesome. And I'm in my partner's basement with the art supplies, because as we all know, I have no internet at my house. That's what I pay for fiber optic for, so it goes out on days like this. Do you have soundproofing with those panels? No, I think it makes the acoustics worse. Oh, helpful. So I know you did a blog post, Craig, and I haven't looked at elmiko's repo lately, but I'm wondering if you could make a stub for the stuff that you've done to link out there so that we have access to it. The most recent one I've got is for 4.5, and I'm in the process of writing a 4.7 one. But also, while I've got you here: I've been using Medium to post my guides just because it's a little bit easier, but I was curious if you had any other suggestions of a better way to do it that's maybe as easy. Well, today I saw there's the OKD.io blog option. Yeah, we do have that, and it's pretty good. If you're fine with markdown, which I think you are, that's an easy way to put it there. I'm not sure you're going to get the same traffic as Medium, and I kind of like Medium too. What I'm not a proponent of is documentation by blogging, though I think for some of this stuff, you know, it is a one-off kind of thing. So if you did it on Medium, we could put a stub post on the OKD.io blog that links back out to your Medium, that's what I would suggest, so that you drive some traffic both back and forth. But also think about, as you're writing this blog, what about it we can put into the deployment guide for home labs. And I don't have a real opinion about it; it's just that I would like those deployment guides to be updatable and maintainable, so that when we go to 4.8 and that sort of stuff, someone could pick up an issue and do it. So I think a lot of it is that, you know, the other guides you're talking about are complete in the sense that they show the steps, like you need to do this, this, and this, but they don't actually take the time to show you how to do that, because you're really kind of supposed to have already known how. The only difference in my guides is that I go through and show that step, which is kind of a pain, honestly. But it helps folks who are learning, and I think a lot of folks come to OKD and then go to OCP eventually, which is why I did it that way. But I'll definitely flesh all of that out. That would be great.
Even today, what I'm trying to do is just get people to take that first step and make a pull request to put the stub in, a sort of halfway commitment to getting it there. So if you can take the time and do that today. I'm looking to see who's here: there are six people and three of them are us. So I'm wondering about the other folks who are here, Karim and Daniel, and Neil who's floating back and forth. So maybe just Daniel and Karim: what is the major difference between your home lab, or your fantasy-island home lab, and Craig's and Shree's? How different are you guys from that, and does that deserve yet another home lab blog post with instructions? One thing that came out today was watching Vadim show his home lab, because Vadim is on this whole other level, and it was interesting to see his home lab and interesting to see Shree's home lab, just getting to go through and look at how everybody has something different set up. It's like, okay, well, I like that part; oh, maybe I can use that. It would be kind of a cool demo just to have everyone show off their home lab at one time or another. I don't know. What do you think, Shree? Is that too geeky? Yeah, no, I think that would be, well, I think it's the perfect amount of geeking out, because there's so little documentation like that out there in general for Kubernetes. You see a bunch of people going, hey, I got Kubernetes running on my Raspberry Pis, but there's no one who's like, here's the wacky stuff, right? And both Vadim and I are on different ends of the wacky spectrum: he's going wacky in one direction, I'm going wacky in another. I think it'd be very, very interesting to see how wacky everyone can push their OKD deployments. I think a lot of us probably have something that's weird in one way or another. So it would be super geeky, but it would be very, very cool. Yeah, and I think you hit the nail on the head that the home lab approach is where people learn and where they experiment. Daniel's got his; he's not even running OKD yet, so he's got a single machine going. So Daniel, when you get yours going, I expect a blog post at least out of you, and come to the OKD working group and we'll do that. And then if yours is significantly different from Craig's or Shree's or Vadim's, I'm really interested in having all four, multiple home lab configurations, linked into the deployment guide, because I think that's one of the many value propositions for OKD: helping people get their home lab set up, learning and getting some experience with this. And I think the issue for me has always been that DIY Kubernetes was great, but then I could never do anything after that. I could get it installed, it was running, yippee, but I really couldn't get an app running on it. And I wasn't even trying it on a Raspberry Pi, so that was, yeah. And then I ran out of time, because that's kind of the whole thing. We were actually just talking about that before you came in: running Kubernetes by yourself, you get it up, and it's almost nothing. You still have to put everything into it, nobody will tell you what that is, and then your cluster falls over because something went wrong, or you didn't configure something quite right, or some piece of hardware fell over, and you have to throw it away and do it again. It's a pain, a huge pain.
It's pain, and I don't want Kubernetes to be painful for anybody, even if they're not a Red Hat customer. I want everybody to have a good experience, because when people have bad experiences with stuff, they just say, okay, well, I'm gonna go over and do something totally different. And then we don't get their feedback, and we lose them from the communities that we're supporting. So that's kind of it. And I'm old school; I've been on the OpenShift team for a very long time, I just hit my eight years working on OpenShift. I came on board because I love the promise of platform as a service. I was a Heroku addict; I even worked on Cloud Foundry at ActiveState for a little while, and then came over to OpenShift land. And for me, the nirvana is that wonderful dev experience that we were promised, and then going to Kubernetes was kind of, yeah. Poor Cloud Foundry, I know. And I appreciate what Cloud Foundry has done and where they're going and all of that, so this is not really me slamming them. What I'm more interested in is getting back to that early-days promise of platform as a service. And I think what we are doing with OKD is getting us close to that. It's still the case that what you guys had to do today is much more complicated than I wanted. But the nice thing about it is that when I was going over my script earlier, I was able to point at a spot maybe about halfway through and say, at this point the cluster is technically set up; everything after here is customization for my particular environment. So I think it's a lot better than my old Kubernetes scripts, which were based on K3s, I think, and this was maybe two or three years ago. Those were just not pretty. It was not pretty what I had to do to get K3s going. I think K3s is better now, of course, but still. OKD is by far a nicer experience. So, I know we're all here chatting, but I don't wanna keep you guys away from your weekend longer than you need to be. So I'm thinking we have come to an imminent close of this conversation. Thank you to the speakers, and if anyone, Daniel or anyone else, wants to go over to another session, maybe go hang out with the single node cluster people, who are still probably bootstrapping something, and just slowly merge into whatever the final session is that people are still yabbering on in. We'll do that now. And I will say thank you, and you should find an invite to join us at KubeCon EU in your inbox sometime next week if you haven't already. Hey, thanks for putting this all together, Diane. Yeah, thank you, Diane. All right, take care. Thanks guys, jump to another session, everybody. All right, see you.