Hello and thank you for joining us. My name is Eddie Zaneski and I serve as a developer advocate at Amazon Web Services. I am also a co-chair of Kubernetes SIG CLI, where I help maintain kubectl. I'm coming to you from Denver, Colorado in the United States. Hi, my name is Jeff Billimek. I'm a principal software engineer at The Home Depot. I'm in Atlanta, Georgia, and I'm super happy to be here with you, Eddie. We have one goal today, and that's to inspire you to start a home lab. When a lot of folks want to learn Kubernetes or the cloud native technologies that are out there, they don't really have a personal project to get started with. It's much easier to throw yourself in and really learn all the things when you're actually managing something. And we obviously don't want you to run and play in your production environment if you can help it. That's where the software we're going to show today really shines. Why don't we start off by talking about what a home lab is. So home labs can be very addicting, just so you know. Before long, you might find yourself looking for used servers on eBay late at night. But in all seriousness, home labs are a really safe way to experiment with new technologies, either for your own personal use or maybe something related to your job. As for the why: besides having fun outside of work, it's a great place for you to improve your craft. You can also get a whole bunch of practical application and experience, as you'll see with home automation. The real question, though, is why would you want to run this inside of Kubernetes? For starters, it's a really great way to learn Kubernetes. Kubernetes also provides really great features for resiliency and redundancy. For example, just the other day this week, my power went out for about three hours. The UPSes all failed and so everything turned off.
And when the power finally came back on and everything powered back up, Kubernetes basically ensured that everything was running without incident. Everything just came back automatically, thanks to the Kubernetes scheduler. And Kubernetes just gives you a place to put things. Maybe that one-off cron job you have running that alerts you when there's a 90s dance party at the venue near your apartment, like I used to. Once your cluster gets full, you can always just grab another Raspberry Pi, for example, toss it in, and grow it out. One of the things we both run is Home Assistant. Home Assistant is a really popular open source home automation platform. It's very active and very well supported, with releases every month. And it does one thing, and it does that one thing really well, which is tying together various products and services that normally don't integrate well together. Yeah, like for example, having your Ring doorbell play nicely with your Nest camera. You can glue together different ecosystems that normally would never cooperate and play together, and it gives you a common and unified way to define what that looks like. When it comes to home automation, there are different protocols for the devices you may run. For example, there's Wi-Fi, but Wi-Fi devices typically tend to be proprietary and cloud-based. However, there are more self-hosted kinds of protocols like Z-Wave or Zigbee, which are low-powered mesh radio protocols. Typically when you run those, you're going to run some sort of radio controller within your home, and that will be the interface between the software you run and the devices you have around your house. Yeah, let's take a quick tour of what Home Assistant looks like. So here you can see my Home Assistant dashboard. You can break things out by rooms and different areas.
So you can see my front door right here, with my front porch light and my door lock that I can lock and unlock remotely. Also my garage door. Here are my thermostats that I can control, and a couple of security sensors, like whether the windows are opened or closed. There's the camera feed from my Ring doorbell. I can trigger off nice things, like it knows my wife and I are home because our phones are connected to the Wi-Fi. Here you can take a look at some of the integrations that I have installed. Here are my Chromecasts, my robot vacuum, thermostats, doorbells, all different bits that plug together, along with a huge list of other ones that I can install. What ties all these things together are automations. So here's a quick, simple one. This one turns the lights on in my kitchen after the sun goes down. You can see it's got a trigger for the sun, 20 minutes before sunset; it checks if the lights are currently off, and then it turns them on. And this is of course represented via YAML, which we all love. In this automation, I'm going to show how the motion sensor in my alarm system triggers the lights in the hallway to automatically turn on when it detects motion. Everyone in my family loves this automation because the lights just magically turn on and off when they're walking through the hallway. It's really great. They don't have to lift a finger. One of my favorite home automations is these smart Z-Wave light switches. They work like a regular light switch. They're dimmable; I can hold it up or down. But one of my favorite bits is that depending on how many times I push the button, it fires off a different event. So for example, we have this nightlight here, and if you tap twice, it actually controls the nightlight instead of the regular lights.
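That sunset automation from the demo is only a few lines of the YAML we all love; a rough sketch (the entity name is made up for illustration):

```yaml
# Hypothetical Home Assistant automation: turn on the kitchen lights
# 20 minutes before sunset, but only if they are currently off.
alias: Kitchen lights at sunset
trigger:
  - platform: sun
    event: sunset
    offset: "-00:20:00"
condition:
  - condition: state
    entity_id: light.kitchen
    state: "off"
action:
  - service: light.turn_on
    entity_id: light.kitchen
```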
I've set this up to demonstrate how the different components running in Kubernetes work together, in particular the object detection feature of one of the workloads that I'm running, called Frigate, detecting a person. It's installed as a Helm chart. That's the thing that's using the Coral USB device, an Edge TPU (tensor processing unit). Basically, its whole job is to offload a small video stream to that device to apply a machine learning model, in this particular case for object detection. And so what we're about to see is my assistant, my daughter. She's going to walk by, and we'll see that Frigate detects her as a person. Here we go. It's going to draw a box around her, and when the box turns blue, that means it detected a person. And if you look below, we see an event fired. It shows it detected a person, and so now anything subscribing to that can act appropriately. For example, I have automations set up to do certain things when someone's detected and, say, we're not home, or it's between a certain time of day, that kind of thing. So let's start by talking about hardware. The best hardware that you can possibly run in your home lab is what you already have. Maybe that old PC, maybe you bought a Raspberry Pi three years ago and it's still sitting in the box because you don't know what to do with it. I'm running a four Raspberry Pi cluster. I'm running a collection of PC and Raspberry Pi devices, all different types of compute. I have an old laptop, an old desktop, a couple of things I got off of eBay, just a bunch of different little things, plus three Raspberry Pis. One of them is connected via USB to my alarm panel device so I can arm and disarm the alarm and get alerts when windows or doors are open, or when there's motion, things like that.
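Going back to Frigate for a second, a minimal sketch of a configuration for that kind of person detection might look like this (the camera name and stream URL are placeholders, a real config needs more than this, and the exact schema depends on your Frigate version):

```yaml
# Hypothetical Frigate config fragment: offload detection to a USB
# Coral Edge TPU and only track "person" objects on one camera.
detectors:
  coral:
    type: edgetpu
    device: usb
cameras:
  hallway:
    ffmpeg:
      inputs:
        - path: rtsp://camera.local:554/stream   # placeholder stream URL
          roles:
            - detect
    objects:
      track:
        - person
```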
And if you don't currently have any hardware laying around, we'd recommend just grabbing a Raspberry Pi or two. You really don't need much more than that to get started. You can obviously start with one, but when it comes to Kubernetes, having multiple nodes gives you a much better experience. And if you're running Raspberry Pis, the processors they run aren't typical Intel processors; they're based on ARM, which is gaining a lot more popularity now. So when you run ARM compute devices in your Kubernetes cluster, it's going to be important to look for container images that support ARM; typically, things that have multi-arch support will fall in that category. The great thing is that every day there are more and more solutions providing ARM support in addition to Intel support. Yeah, ARM's really come a long way, hasn't it? Back in the day, we were running Docker Swarm on some Pis, and I just remember trying to install Redis for the first time and there was no ARM image for Redis yet. And I was like, well, I guess I'm returning all this stuff. Next, let's talk about software. We'll start with K3s, which is going to be the backbone of your home lab Kubernetes cluster. K3s is a CNCF project originally started by the folks over at Rancher. It is a lightweight Kubernetes distribution, which basically means they've stripped out a bunch of cloud-provider-specific things that you're really not going to need. It's packaged as a single binary that holds all of the things, including the containerd container runtime and an Ingress controller; basically everything you need to get started is right in that single binary. It runs great on a single node or many. It has a really awesome and easy install and upgrade experience; basically, you're just replacing that single binary. The control plane uses either SQLite by default, or etcd if you want to switch to high availability.
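That single-binary install really is as simple as it sounds; the documented quick start boils down to something like this (the server IP is a placeholder, and the token path is where K3s writes it by default):

```shell
# On the server node: download and run the K3s installer.
curl -sfL https://get.k3s.io | sh -

# Grab the join token the server generated.
sudo cat /var/lib/rancher/k3s/server/node-token

# On each agent node: point at the server and pass the token
# (IP and token are placeholders for your own values).
curl -sfL https://get.k3s.io | K3S_URL=https://192.168.1.10:6443 K3S_TOKEN=<node-token> sh -
```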
As far as installing K3s, we recommend checking out k3sup (pronounced "ketchup") by Alex Ellis. Quick shout-out to Alex: he's got tons of content out there about Kubernetes, K3s, and Raspberry Pi, so be sure to check him out. k3sup is a very simple installer for K3s, and it can also add a bunch of other add-ons and Helm charts that you might need. As far as configuring your K3s cluster, just a couple of quick recommendations. Swap out the built-in load balancer for MetalLB. MetalLB will let you use your home network IP space for load balancers, and it allows you to map multiple services to the same IP address as long as they use different ports, so you can really use one single IP if you'd like. Traefik comes installed on the cluster by default as well, and it's a great ingress controller out of the box; definitely learn some of the dials that you can turn. And as far as persistent storage goes for your volumes, I would check out either Longhorn, also by the Rancher folks, or Rook. Figuring out what you want to run and how to run it in Kubernetes can sometimes be overwhelming. Fortunately, there's a package manager for Kubernetes called Helm, which provides these different Kubernetes workloads packaged up in things called Helm charts, and the Helm CLI is the utility for installing and maintaining those charts. Helm is quite useful because it provides an easy way to install a workload and set some values that are pertinent to that workload without dealing with a ton of YAML. It allows for easy upgrades and rollbacks, and there's a huge community and ecosystem of Helm charts that have already been made that you can reference. Pretty much anything that you think you want to run in Kubernetes has probably already been packaged up as a Helm chart. When using Helm, there are lots of easy ways to find and discover Helm charts that you want to install. You can do a Google search for the thing you're interested in plus "Helm", or you can search on GitHub.
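For the MetalLB setup mentioned a moment ago, the layer 2 configuration is just a small ConfigMap carving out a slice of your home network (the address range here is an example; newer MetalLB releases use CRDs instead of this ConfigMap):

```yaml
# Hypothetical MetalLB layer 2 config: hand out LoadBalancer IPs
# from a reserved range on the home LAN.
apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.1.240-192.168.1.250
```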
There's also a really great site called Artifact Hub, which has thousands of Helm charts referenced. And specifically for home use, there's a really great community called k8s-at-home that has over 100 Helm charts in their catalog, ranging from media-type things to home automation to many other things. It's a really active community, and I highly recommend you take a look. That's k8s-at-home and Artifact Hub. Installing packages with Helm is pretty easy. Here I'm going to show an example of installing the Home Assistant Helm chart. I've got a terminal window here, with the top tab basically doing a watch on any pods that have "home-assistant-helm" in the name, of which there are none running right now. The bottom tab has the command to install that Helm chart. Using the Helm CLI, I'm saying helm install and giving it a name; I'm calling it home-assistant-helm. Then I'm setting some basic values to bootstrap it: I'm turning on Ingress, setting a hostname for it, setting the Ingress path default, and turning on persistence, so the settings that I make will live beyond restarts of the pod. And then I'm referencing the chart name; in this case, it's the Home Assistant chart from the k8s-at-home Helm repository. So when I run this, it installs the chart. It installed really quickly. You can see the pod is there now and is booting up; in just a few seconds, it should be ready to go. It's doing whatever startup or readiness probes are necessary for it to be there. And now it's running and ready. So I'm going to switch to a browser and navigate to that Ingress URL that I just set, and I'm at the Home Assistant setup page, ready to go. Returning to those smart home protocols we mentioned earlier: we have Z-Wave and Zigbee.
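For reference, the install command from that demo looks roughly like this (the hostname and exact value keys are illustrative; check the chart's documented values for the real names in your chart version):

```shell
# Add the k8s-at-home chart repository, then install Home Assistant
# with Ingress and persistence turned on (value keys are illustrative).
helm repo add k8s-at-home https://k8s-at-home.com/charts/
helm install home-assistant-helm k8s-at-home/home-assistant \
  --set ingress.enabled=true \
  --set 'ingress.hosts[0]=home-assistant.local' \
  --set persistence.enabled=true

# Watch the pod come up.
kubectl get pods -w -l app.kubernetes.io/name=home-assistant
```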
Both of these depend on a USB device to drive a mesh radio network throughout your home that connects all of these smart devices. Each of these is going to have its own software that drives the USB device outside of Home Assistant. So there's tons of in-cluster communication going on between all of your different services, where everything gets wired up and eventually talks back to Home Assistant. If you take a look here, this is my Z-Wave to MQTT (zwave2mqtt) application. It's running in my Home Assistant namespace, communicating with the Home Assistant server over either a WebSocket, or you could use MQTT. MQTT is just the message broker that we use for most of the IoT things. zwave2mqtt really just keeps track of all of my Z-Wave devices. This is where I would come to add a new device to my network; it's a process called inclusion. Once I start this up, it would pair using that USB device to connect it. There's also the same thing for Zigbee. You can see here, these are my office lights. I have four different overhead light bulbs that are all connected through Zigbee instead of Z-Wave, and I can just start flipping some of these switches on, and you should see them light up behind me. There are some differences between the two protocols, but at the end of the day, it's really going to depend on which devices you want to use and what they're compatible with. Most folks run a mix of both protocols. And here you can actually see an MQTT debug tool that I can connect to the MQTT server running inside of my cluster. It's an Eclipse project by the name of Mosquitto. You can see here a bunch of different topics that are being subscribed and published to, and this is really the heart of how that smart home communication flows between the different applications. So again, your cluster is continuously passing messages amongst itself.
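If you want to poke at that message traffic yourself, the Mosquitto client tools make it easy; for example (the broker hostname and topic prefix here are placeholders for whatever your cluster uses):

```shell
# Subscribe to every Z-Wave topic on the in-cluster broker and print
# each message along with its topic (-v). Hostname and topic are placeholders.
mosquitto_sub -h mosquitto.home.svc.cluster.local -t 'zwave/#' -v
```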
It's great real-world experience to have different components depend and rely on each other. Even though you may have redundancy with your compute devices and your storage, in some cases, especially for home automation, there are still times where you are tied to a single physical device. Basically, if the node that your USB device is connected to goes down, that's a problem. One thing that I've done to solve that is to make use of a Kubernetes SIG project called Node Feature Discovery, as well as Descheduler. I have Node Feature Discovery configured to dynamically label the node that a USB device is connected to with a certain label, saying this device is on this node. And then I schedule the pods, or the workloads, that require those devices accordingly. For example, zwave2mqtt needs to be able to talk to the Z-Wave radio, so it will only schedule onto nodes that have this specific label, so it can actually talk to the radio and work. But if I unplug the device and plug it in somewhere else, Node Feature Discovery will remove the label from the node that it's no longer connected to and apply the label to the new node that it's connected to. Descheduler will then evict the pods if they're currently running on a node that does not have the required label, and the Kubernetes scheduler will of course schedule them back onto the proper node. So this gives me a Kubernetes-native scheduling way to handle movement of devices. If I have a problem with, say, this laptop, or any node that I have a physical device connected to that I still need to be able to use, I simply plug the device into a new node and Kubernetes will handle the rest for me automatically, which is really handy. This is what happens behind the scenes in Kubernetes when a device is moved from one node to the other.
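In manifest terms, that scheduling constraint is just a nodeSelector keyed on the label that Node Feature Discovery manages; a sketch (the label key shown is illustrative — NFD derives USB labels from the device's class, vendor, and product IDs):

```yaml
# Fragment of a hypothetical zwave2mqtt Deployment: only schedule onto
# whichever node NFD has labeled as having the Z-Wave USB stick attached.
spec:
  template:
    spec:
      nodeSelector:
        feature.node.kubernetes.io/usb-ff_0658_0200.present: "true"
```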
In this particular terminal window, there are three different tmux panes. The top shows a watch on any Kubernetes node that has the custom Node Feature Discovery label for the Z-Wave controller; right now that's on the node k3s-d. The middle pane is a tail of the Descheduler logs for any events related to the zwave2mqtt pod. And the bottom pane shows a watch on the zwave2mqtt pods, or pod in this case, showing which node it's currently running on; in this case, it's k3s-d. So when I unplug the device and move it to another node, the label will get removed from k3s-d, which just happened. And in just a minute, when it's plugged in to the new node, it should show up. In this case, I plugged it into the node called k3s-e, so I fully expect that node to get the label. And there we go, it just showed up. Now it's on k3s-e. When the Descheduler loop runs, it detects that the pod doesn't fit on its current node anymore, and so it evicts it, and then it gets rescheduled by Kubernetes onto the appropriate node that has the label, which in this case is k3s-e. In the bottom pane, we can see that it's terminating the pod on k3s-d and then creating the container on k3s-e. So that shows, behind the scenes, what happens when the device is moved from one node to the other. As we said earlier, if you're looking for a personal project to learn cloud-native technologies, Home Assistant really has you covered on all fronts. Right here, you can actually see a Grafana dashboard of some of the Prometheus metrics that Home Assistant can export for you automatically; you pretty much just add one line to your config. Here I can see my TV usage, broken out by the different TVs in my house, so we can keep track of how much time we're watching TV. It's a pretty straightforward Prometheus query, if you're not familiar with what they look like.
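That "one line" is Home Assistant's prometheus integration, which exposes state as metrics that Prometheus can scrape; a sketch (the metric name in the example query is illustrative — check what your instance actually exports):

```yaml
# In Home Assistant's configuration.yaml: expose entity state as
# Prometheus metrics on the /api/prometheus endpoint.
prometheus:

# An illustrative PromQL query for a Grafana panel, e.g. current
# temperature readings broken out by entity:
#   homeassistant_sensor_temperature_celsius
```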
All right, same thing with our lights: we can map how long we leave the lights on. I can punch this out to look at the past 24 hours. We can see the temperature was warming up today. So if you want to learn Prometheus metrics and maybe build some Grafana dashboards, this is again a great place for you to explore and experiment within your home lab. Here's an overview of the GitOps approach to running Kubernetes. In this particular case, this is my GitOps repo for running my cluster, and I'm just going to talk through what that looks like in practice. For example, in the Home Assistant folder here, I have a Home Assistant YAML file, and what it's doing is referencing a Flux HelmRelease object. It's set up to use the Home Assistant Helm chart at a version that automatically gets updated through Renovate when there are new updates to the chart, plus all the normal Helm values. So I reference the image, which also gets updated when there's a new version. I'm setting things like environment variables, Ingress settings, service settings, and storage for persisting the configuration. I'm annotating the pods for doing backups with Velero, for example. And then I also have this add-on called code-server, which is VS Code, but server-based. It's actually a sidecar running alongside Home Assistant, so I can edit the Home Assistant configuration files directly through VS Code, persisting to GitHub. When I make changes to the repo, if I were, for example, to change the version of the Home Assistant image that's running, Flux will pick it up: it'll detect a change in the repo, notice a change to the Helm chart values, and apply it, and then the workload will get restarted with the new value. So I don't do anything to the cluster directly; I do everything through the Git repo. As part of my day job, I do a lot of automation of systems.
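A trimmed-down sketch of what such a HelmRelease might look like (the API version, chart version, and value keys are illustrative and depend on your Flux and chart releases):

```yaml
# Hypothetical Flux HelmRelease: install the k8s-at-home Home Assistant
# chart and let Renovate bump the version field via pull requests.
apiVersion: helm.toolkit.fluxcd.io/v2beta1
kind: HelmRelease
metadata:
  name: home-assistant
  namespace: home-assistant
spec:
  interval: 5m
  chart:
    spec:
      chart: home-assistant
      version: 11.0.0          # Renovate keeps this current
      sourceRef:
        kind: HelmRepository
        name: k8s-at-home
        namespace: flux-system
  values:
    ingress:
      enabled: true
    persistence:
      enabled: true
```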
And so I'm very much a fan of automating as much as possible. In my Kubernetes home lab, I'm running Ubuntu Server on all the nodes, and out of the box, Ubuntu Server is configured to automatically download and install security updates. If a particular update or CVE fix that was installed requires a reboot, Ubuntu will signal that by creating a file on the file system. So I run a Kubernetes workload called Kured, the Kubernetes Reboot Daemon. What it does is look at each node to see whether that node needs to be rebooted, and then, in a serial fashion, it will cordon and drain the node, reboot it, and then uncordon it after the reboot occurs. That gives me automatic security patching without having to lift a finger, which I really like. Another is for upgrades of Kubernetes itself. I run K3s, and Rancher has a tool called system-upgrade-controller. What it does is monitor for a new version of Kubernetes, or more specifically a new version of K3s, and when it detects a version that meets the criteria to be installed, it will cordon and drain the nodes, upgrade the version of Kubernetes on each node, and then uncordon them, one at a time as needed. For both the Kured solution and the system-upgrade-controller solution, I have them configured to only run at night, between something like 1 a.m. and 5 a.m. So when I wake up the next morning, I might see in my monitoring that some of the nodes rebooted, but I'm none the wiser because everything just magically works, and I never have to worry about keeping my stuff up to date. Lastly, for Helm chart and Docker image versions, I make use of a tool called Renovate. It's super, super awesome.
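A Renovate setup like that is driven by a small config file in the repo; here's a sketch of the kind of rule described (the package name and schedule are illustrative, and the exact rule keys depend on your Renovate version):

```json
{
  "extends": ["config:base"],
  "packageRules": [
    {
      "description": "Illustrative rule: auto-merge Plex image bumps overnight",
      "matchPackageNames": ["plexinc/pms-docker"],
      "automerge": true,
      "schedule": ["after 10pm and before 5am"]
    }
  ]
}
```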
Basically, for all of my Helm charts, if there's a new version of the chart available, Renovate will open a pull request on my GitOps repo letting me know, and then I can merge the pull request to automatically install the new version of the Helm chart. It's similar for new versions of the Docker containers that compose the workloads I'm running, if maybe the Helm chart didn't have that update. And there are lots of configurations and rules you can use to govern how Renovate works. For example, I run Plex, and I have it set up so that during the evening, if there's a new version of Plex, it's automatically going to install it, because they rarely have breaking changes. Okay, so for lessons learned, here are some things that we've discovered as we've been going through this journey. One is that if you're going to run Helm charts or applications that have custom resource definitions, or CRDs, it's important to note that while Helm can install CRDs initially, there's currently an issue with how Helm deals with upgrades to Helm charts and changes to their CRDs: it basically does not upgrade the CRDs after they're initially installed. So you'll most likely want to handle your CRD installation out of band of your Helm chart. That way, you'll be able to maintain the lifecycle of the CRDs much more easily and avoid all those problems. Yeah. And another big one is to plan for your persistent storage early on. In my case, I have four Raspberry Pis that are just running with some big, beefy 128-gig SD cards. K3s has its own local storage provisioner built in, so it can provision local disks. The problem with that is it's not distributed or highly available, so the pod actually gets locked to only run on that node, because that's where the data is, if you want to persist that data. So we recommend using something like Rook or Longhorn.
The process to migrate after the fact is basically that you have to spin up a blank pod, attach the volume, do a kubectl cp to your local directory, and so on. It's a giant mess, so plan for your storage early on. Just check out Longhorn; it can basically take advantage of the rest of that SD card I mentioned and replicate it across a number of nodes. And to that note: you will most likely experience some sort of failure or setback, which is perfectly fine. It's your opportunity to learn, but it's good to have a plan in place to anticipate that failure and a good recovery idea. You may need to tear down your cluster and recreate it, so keep that in mind as you go through this. I had a power loss once and my database got corrupted somehow, and I basically was like, oh, I guess I'm spending my Saturday rebuilding my cluster from scratch. So it will happen. Even though ARM support is getting much better, there are still a number of workloads that don't have ARM support. So if you're choosing to run a cluster that's mixed architecture, for example a mixture of Intel- and ARM-based hosts, one way to handle that mix is to taint your ARM nodes so that workloads won't schedule there by default, and then add tolerations as needed for the workloads that you know support ARM. That'll save you a lot of pain in the long run with things that can only run on Intel. So in summation: go out there, grab some hardware if you don't have any already, install K3s on it, have fun controlling your light switches and tying things together. And really, just don't be afraid to break things. Your home lab is your safe space to experiment and learn. And it's a lot of fun. It is a lot of fun. It's highly addictive. Cool. So with that, we have some time for questions.