You are live — we are live! The Homelab Show, episode number eight. Whoo, exciting. Almost there to ten, almost there to ten. This is Tom Lawrence, and we have with us Jay LaCroix and Alex Kretzschmar. Hello! All right, we're gonna talk about something really complicated — to make your life easier. Let's go. Not a bad way to put it, actually, especially considering one of the first points we're bringing up is not to over-architect things. But we all kind of do that, don't we? Isn't home lab kind of over-architecting in and of itself, in a way? We wouldn't do that. Just a little, just a little. I mean, it's the best job security, isn't it? Yeah, I think it is. Are we live on YouTube? Yes. Okay, interesting. There it is — it just took a second, it's catching up. So yeah, the whole idea of automation is a really big part of what we wanted to talk about today. It's almost like disaster recovery, but I'm kind of hesitant to call it that, because that's more of an enterprise thing. Although you could argue that if your significant other is upset that something is down, you might want to get it up and running pretty quickly. Or maybe you're practicing for stuff you want to roll out at an actual company, in which case you can use whatever term fits. But automation is the main thing on our topic list today.
There are different technologies you can use, and different use cases for why you might want to automate things — or why it might just be a bother and you might not even want to. So that's basically what we're going to talk about today, and that'll help us get to future episodes where we can cover any one thing more in-depth, because we're getting to the point where we're running out of the high-level stuff. And I think an important high-level concept of automation is this: your data is really important, but the servers are essentially ephemeral. You can destroy them, you can rebuild them — they don't matter if something goes wrong. Let's say you're a pipeline company and there's a major attack and you need to rebuild: you should have an automation script that just kicks it all back off and rebuilds everything. Over here I had to queue for 20 minutes to get gas the other day. Yeah, I heard it's spread down there — I see people throwing that in the comments, because it's topical here in May of 2021. But it's still the same concept. These are what you do when something goes wrong — if a server is infected, if a server has a problem because of an attack, as an example, not just a hardware failure. You should always have backups of your data, but you should also be able to just kick off a deploy script. Jay's dove deep into this before with the way he uses ansible-pull to do deployments: you pull it, and with just a few parameters passed to it you can rebuild that particular system, then copy your data back or remount the shares where the data is located, and the server is up and running again. Absolutely. One question to ask yourself if you run a home lab: you have some cool apps running on there, and something goes down — I don't wish that on anyone, obviously — but how hard are you willing to work to get it back up and running?
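To make the ansible-pull idea Jay uses concrete, here's a sketch of what that bootstrap could look like. The repo URL, playbook name, and variable are placeholders, not his actual setup:

```shell
# Sketch of an ansible-pull rebuild (repo URL and playbook are hypothetical).
# On a freshly installed machine, install Ansible, then pull and apply
# the playbook straight from a git repo:
sudo apt install -y ansible git

# -U points at the git repo holding your playbooks; local.yml is the
# conventional playbook name ansible-pull looks for by default.
# Extra parameters can be passed with -e, as Jay describes.
sudo ansible-pull -U https://github.com/example/homelab-config.git \
    -e "role=webserver" local.yml
```

The appeal of the pull model here is exactly the disaster-recovery story above: the machine rebuilds itself from the repo, with no central control node required.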
I mean, if you don't have any automation at all, or backups — and you should definitely have backups — but if you spent, like, four hours setting up that server, well, you're going to have to spend four hours setting it up again if it completely breaks. And not everyone has four hours, right? A lot of us are really busy. We could probably spare 15 or 20 minutes a day when it's super busy, and then eventually that server goes down and we don't even have the time to bring it back up. So automation, like Tom mentioned, is hopefully automatic — it just self-heals. That's the ultimate goal. Yeah, and therein lies the challenge, though. I would love it if we just pressed a button and it was auto-healing, but you've got to architect that kind of thing, and architecting things is what we do in home lab. We're either doing this because we enjoy it, or maybe we want that certification, so we want to practice with whatever a company is using for this kind of thing. So just ask yourself: how much time are you willing to spend rebuilding? Automation might be the key, because you can automate essentially everything about your servers, to where ultimately you could delete them on purpose. I'm sure you guys have heard of Chaos Monkey, but for those that haven't: I believe it was Netflix. They had Chaos Monkey — I don't know if it's a script or a server, I'm sketchy on that part — that would just randomly delete stuff and kill services, just to see how well everything self-heals. Now, that's going to be way beyond the scope of anything we can talk about today. But ultimately, if you can get to the point where you could delete something impulsively and nothing goes wrong — something spins right back up again, or worst-case scenario you just have to run a script and it comes back up —
I'm still fine. That's one of the goals that, when it comes to home lab, we might want to consider trying to reach. All right, so here's how I think about automation — and it's not disaster recovery, as you seem to be suggesting. That's one use case, right, but honestly it's probably a rarity. The way I see automation is that it's capturing knowledge — it's capturing operational knowledge, if that's even a word. It's capturing the architect's knowledge of what they think the system looks like at any given point in time. But it's also a way of declaring: on this day in history, this is the way things were. This is how we designed things, and when we deployed version X of our application, it looked like this. Without putting things into automation, you just don't have the ability to wind things back and figure out what change might have introduced extra latency at some step in your application, and stuff like that. So for me, automating things is very much non-optional now. I mean, you could say: if I do this task once every nine months and it takes me five minutes manually, there isn't a case for automating it. But if I leave the organization, or I get hit by a bus, or I just plain forget what I was doing nine months ago — which is highly likely, let's be fair — if it's in Ansible or some other automation system, I don't really have to think about it. I just run the playbook and I'm good to go, and because it's been working for the last nine months, I know it's probably going to just work again. Yeah, I completely agree. I think that's a really good synopsis of a very good use case — probably the majority reason why we want to do things like this. Now, disaster recovery — I agree,
it's not that, but it's kind of an element in some ways. Sometimes when you talk about automation, by association you talk about things that are part of a disaster recovery plan. And we're not going to go over disaster recovery in general, because that's obviously an enterprise topic, though some of these things bleed over into the enterprise. I think it's a personal thing too, because it depends on the individual. If it's a single person living by themselves, with one server, and they don't mind rebuilding it once in a while — if it goes down, they really don't care; they like the hobby, but it's not a critical piece of infrastructure — they might make an argument for why, if they're not interested in configuration management, they don't want to bother. I would argue they should, just like you said, Alex: it's self-documenting. It's a point in time. Yeah. It makes it easier the next time, because especially when you have something running for six months and then it breaks, you have no documentation, you have no automation — you have to re-google everything you did the very first day. But if you don't mind that, it depends on what you're into. Then again, perhaps you do want things to be available. I think that's the goal everyone should strive for, and I completely agree you should do it if you can. Then there's the challenge of over-architecting something — I think that's what you alluded to. If you have a server that takes four hours to stand up, and you automate it so it takes five minutes, but it took you weeks and months to build that automation, you could argue you're spending much more time on the automation solution than it would take to manually build the server on that one occasion.
Yes, but I think in the long run it will pay for itself many times over, because it gives you the freedom to make changes a bit more gung-ho, without worrying about breaking stuff. Or, say you want to move from VMware to Proxmox, or go onto bare metal, or whatever — you don't have to worry about "Hmm, what did I install for this particular version of the hypervisor I was using? What drivers were needed?" It's all just there, in the git repo, in the history from that commit a few months ago. And here's another benefit: even if you do feel like you're spending a lot of time on this, that's not a problem in and of itself. Think of it this way — at my day job, and I think this is kind of the dream for a lot of people in home lab: if you already work in IT, your boss comes to you and says, "Hey, can you work on Ansible, or whatever it is, tell us about it, and maybe help implement it?" And that's something you've already been doing in your home lab for a long time. That's like, "Yes, I can absolutely help you with that." That's what happened to me — I was messing about with Docker and Ansible at home, and I landed a DevOps job, and that became my career. That feeling is amazing. It validates everything you've been studying, that they just happened to go that direction. And I've had that happen twice, at different companies. I'd been using Ansible for a long time, and both just so happened to move toward Ansible — and we'll talk about what Ansible is later on — but I think that's what we hope happens if we work in IT. Or even if we don't work in IT, maybe we want to, and these are things we can add to our resume: we've legitimately been working with Ansible. So there are all kinds of benefits beyond the time saved, especially if you're pursuing a certification, or depending on whether it maps to your career goals. So, that being said,
I think my main point, on my side at least, is that it's a personal thing. I agree with everything Alex has mentioned. You strive to do the right thing: to document, to automate, to make it easier on yourself, to not have to remember how you did the thing. You switch to DigitalOcean, or from DigitalOcean to Linode, or vice versa, or whatever you're doing — or you're going from Proxmox to VMware — and you don't have to worry about "How did I do the thing? Hmm, I don't remember, that was six months ago." That can be really frustrating. So, I actually open-source all of my automation code. I'm a huge fan of Ansible in particular; I use Terraform a little bit as well. For those that don't know — it's hard to know in advance quite what the audience does and doesn't know — Ansible is my personal favorite config management tool. I use it because it runs over SSH: I connect to every server remotely, in a push model. What that means is I have a central system — it could be my laptop, it could be a VPS, any system running Linux, or a MacBook, anything with a Unix-type terminal that runs SSH — and from there I can connect to a remote system and basically push my configuration out over SSH. Now, there are some other tools in this space. Puppet is another notable one, and Salt as well. Puppet works in a slightly different way — it uses a pull model. You have a bunch of remote servers that run an agent, and every 15 minutes by default, I think, they connect in to a central Puppet master — got to love the name of that; yeah, yep — and say, "Hey, has anything changed in the last 15 minutes?" If it has, they apply the delta and bring themselves back up to the desired state. That's the primary difference between Ansible and Puppet. Now, I mentioned Terraform as well. That one is an infrastructure tool.
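Before moving on to Terraform, here's a sketch of the push model Alex just described — a hypothetical minimal playbook (the inventory group and package are made up for illustration, not from anyone's actual repo) that a control machine pushes out over SSH:

```yaml
# site.yml - a hypothetical minimal playbook, pushed from the central
# system with:  ansible-playbook -i inventory.ini site.yml
#
# inventory.ini might contain a group like:
#   [webservers]
#   web01.lan
#   web02.lan
- hosts: webservers
  become: true
  tasks:
    - name: Ensure nginx is installed
      ansible.builtin.apt:
        name: nginx
        state: present

    - name: Ensure nginx is running and enabled at boot
      ansible.builtin.service:
        name: nginx
        state: started
        enabled: true
```

Each task declares a desired state ("installed", "started") rather than a sequence of commands, which is why re-running the playbook is safe: Ansible only changes what's out of spec.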
So I use Terraform primarily to actually create VMs or VPS instances. Even on-prem — I use it to automate stuff against Proxmox and VMware in my house — but I also use it against Linode and DigitalOcean and many other cloud providers, AWS and all the big guys as well. And these two tools, Terraform and Ansible, work incredibly well together — yes, they do — because you end up with a logical delineation. There's the creation of the infrastructure, which can include stuff like firewalls, the VMs themselves, any networking around those systems, any disks you want to provision for them. And then, once the VMs and all the other infrastructure have actually been created, I switch to Ansible and say, "Right, go and bring these into the desired state I'm looking for." That's exactly right. That's how I use Terraform, and how I recommend others use Terraform and Ansible in combination as well. They do work very well together. I've had someone ask me, "Where does Terraform end and Ansible begin?" and I said pretty much the same thing you just said: use Terraform to make things exist, and then Ansible to take things that do exist and bring them up to spec. And if we want to make changes globally — rather than hitting 500 servers one at a time, which takes a long time — we can bring them all to a desired state, and change that state, at any one time. Personally, I've used Puppet, Chef, and Ansible, and when I settled on Ansible, for me it was like nothing else existed at that point. I just felt like: this resonates with me.
This is how I think it should be, and this is great. It's easy to learn. I also felt — I don't know if you've had this experience, Alex — that Ansible keeps up better with what's happening in the industry. If there's a change in apt, for example, the modules are pretty quick to accommodate it. When a new distribution is forked from another distro, I've literally seen Ansible include it pretty much right from the get-go. And the one time it didn't — I can't remember which distro it was — I put the pull request in myself, for a distribution that had just hit the scene. I was expecting to have to go through some rigid set of requirements, and they were just like, "Okay, it's merged." I'm like, oh wow, cool. So yeah, I could sing Ansible's praises all day. I think it's an amazing thing, and just like Alex said, that's exactly what it is: it connects via SSH, which means you don't have to run an agent. Just as an example, at my day job, before I started with Ansible, they used Chef, and I had a client call me up very upset: "My server's hitting a hundred percent CPU every hour." And I'm like, "Oh, that's just Chef running." Now, you could argue that Chef implementation wasn't done very well, but there's an agent running, using CPU cycles to check through everything, and Ansible is just more lightweight by comparison. Some environments do need that for compliance, though. Yeah — well, you can do the same with Ansible. The way I use Ansible is a bit different; maybe we can talk about that later, either in a dedicated episode or in this one. I use a kind of hybrid approach with my Ansible. That's one thing about Ansible:
there's a generally agreed-upon way to use it, but there's nothing stopping you from bending a few things here and there to make it work more flexibly for you, and I really love that flexibility. Obviously there are some best practices you should follow, and if you go against them you'll generally get corrected in the forums — politely; they've generally been pretty good. So generally speaking, whether you're using Puppet, Chef, or Ansible, like Alex said, it's about getting everything to a certain point: you define the end result, and the tool gets your server to that end result. You manage a lot of servers, Tom — what do you do for this kind of stuff? I'm partial to Ansible. It's the simplicity of it: agentless, and sometimes I don't even need anything much. Like, I just need some stats from a server — so queue it up: I just need the information from these servers, give it all back to me as something parsable. I've also used it when I do load testing. If I've got to load test a server for IOPS, I can spin up several VMs and use Ansible to kick off simultaneous loads from that series of VMs, focused on one particular storage target, so I can get the overall answer: can this handle, when it hits production, the kind of IOPS I need out of it? Ansible has always been that easy tool: no agents, no nothing. Just throw up a couple of VMs, throw some SSH keys at them, and away you go. It's just the simplicity of it, I'd say, that I like. You do need Python on the remote system. Yeah, yeah, that's easy enough. And another element of Ansible is Ansible Tower, aka AWX — AWX being the open-source version; you can pay for Ansible Tower. I don't use AWX a lot, but I do have it.
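The "just give me information back" use Tom describes is Ansible's ad-hoc mode: one-off commands against a group of hosts, no playbook required. The group name and inventory file here are placeholders:

```shell
# Ad-hoc examples of the "just give me stats back" use case.
# "storage" is a hypothetical inventory group - adjust to your own hosts.

# Gather facts (OS, memory, disks, IPs) from every host in the group,
# returned as parsable JSON:
ansible storage -i inventory.ini -m setup

# Run a one-off command on all of them and collect the output:
ansible storage -i inventory.ini -a "df -h /"
```

Because the output comes back per-host in a structured form, this is a quick way to sweep an environment for information without installing anything on the targets beyond SSH access and Python.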
I do use it. I have various playbooks, as they're called in Ansible, on my AWX server, and sometimes it's cool just to have a template, as they're called, run that template, and get the results right in your browser. Completely optional, not required — it's just one of those things that's fun to set up if someone wants a project to work on. I personally like it. I don't use it every day; it's not a big part of my workflow. But it's something to look at if you're the type of person who likes a visual component for Ansible — which, again, isn't required. I'm laughing because I prefer to just do things on the command line. However, when you're on a team at work and you've got a bunch of other people doing stuff, something like Tower is well worth it, because you can do role-based access control — so this person can run this specific task. It also bridges the gap between Ansible and Puppet, because you can have scheduled tasks as part of Tower: say, "Run this task every 15 minutes to make sure these packages are at this specific version," or that some config files match what they should be. And the other thing is that AWX has a relationship to Tower a bit like OKD has a relationship to OpenShift: it's the upstream open-source version of Tower. Oh, I see Morgan's just said that in the chat as well. Yeah, yep. So I definitely agree, and I've even seen it at my day job — it's pretty much the same. We have some individuals at work who are just starting out. They want to help, maybe take a ticket off the queue, and it's something simple — maybe there's an approval to just update packages on a server. Just go to AWX, log in, run the template. But then more advanced people — I love the command line, that's what I prefer — I can do the same thing from the command line. I can run the same playbook. I don't even need to open a web browser;
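For anyone wondering what "the same playbook" looks like either way: AWX runs it as a template from the browser, while on the command line it's a single invocation. This is a hypothetical example — the playbook matches the "just update packages" ticket scenario above, and the group name is made up:

```yaml
# update.yml - a hypothetical package-update playbook; "webservers" is
# a placeholder inventory group. From the command line, the same thing
# AWX would run becomes:
#   ansible-playbook -i inventory.ini update.yml
- hosts: webservers
  become: true
  tasks:
    - name: Refresh the apt cache and upgrade all packages
      ansible.builtin.apt:
        update_cache: true
        upgrade: dist
```

Same playbook, two front ends: AWX gives the junior admin a button with role-based access control, and the CLI gives everyone else the one-liner.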
I can just fire it up and it works. So it's definitely something to at least know exists; whether it fits your use case depends on what you're looking for. But there are so many interesting things about Ansible. One of my favorites — and I really love how well it fits with Terraform, and with other utilities — I think it was last week: I created a cloud-init config, so that as soon as I restored a Raspberry Pi SD card image with that cloud-init config on it, the Pi automatically provisioned itself via Ansible. I didn't do anything but restore the SD card and power it on — done, walk away. And I got a notice on my phone, because you can even hook Ansible into notification services, so you can get a notification on your phone if you really want to go that far. I do, because why not? So I got a notification: the Raspberry Pi provisioning is all set. I logged into the server, and sure enough, everything was set up and ready to go. So it's easy to learn, but nearly endless in the possibilities you can build into it. If you can think of it, it's probably possible — I haven't yet run into anything I wasn't able to do in Ansible. Highly recommended. And you can start really simple: just have it return something to you, or kick off something simple — some install, or an apt-get update. One of the easiest starting points: you have a bunch of servers you want to update manually, instead of just using unattended upgrades. You could put a playbook together, put your servers in there, and say, all right, this is the update command — and see what it returns after it kicks off.
You know, apt-get update, or yum upgrade — start looking through what comes back. Those are some of the most basic commands you can send through it: see what they return, and keep building from there. And a nice thing is, like Alex said, so many people put their playbooks — all the things they do — out there on GitHub. You can often find them, start reverse-engineering how they work, and cut and snip and piece them together, so you can build the tool or script or playbook that does what you want. It's a pretty well-documented ecosystem — that's probably another plus for Ansible. I've found lots of answers on Stack Exchange and GitHub. It does help. The other part of Ansible that absolutely rocks is a thing called Ansible Galaxy. This is a way to import work that other people have done, essentially. So if you want to do something like set up a Samba server, or set up a bunch of users, or something fairly standard like that, you can pretty much guarantee there's going to be a role available on Ansible Galaxy. What you do is put in your playbook, "execute this role against this group of servers," and then provide a bunch of variables that override the defaults in that Galaxy role — to customize, say, the username you're using, or the name of your Samba share, or its mount point, or something like that. It just means you can take these reusable building blocks of code. You know, Jeff Geerling is probably the ninja master of Ansible on the internet, and he's written hundreds of these damn things on Ansible Galaxy. We actually had him on the Self-Hosted podcast a couple of episodes ago, and he's turning into the Raspberry Pi ninja now as well. He literally submits pull requests — as I understand it, I'm pretty sure I've seen his name in the source code; when you go into the git repository for Ansible, you'll see his name,
talking to the developers there. He's extremely active in this space. Mm-hmm. Definitely a lot of respect for him, and he does really awesome YouTube videos — I highly recommend everyone check them out if they haven't already. So, any other thoughts about Ansible? Because I think we covered most of it. I feel like I could be forgetting something — it's easy to learn, a lifetime to master. Yes, for sure. Now, Terraform. We could even have started with this one, actually: making things exist. Yes, you can also use it to make changes later — I won't get into should-you-or-shouldn't-you — but it is great for making things exist. And it works with different providers, which is how Terraform operates. There's a provider for Linode, DigitalOcean, AWS, Google Cloud, Proxmox — your local hypervisor technologies are a capability of it too. You tell it which provider you want to use — whether you want to build on AWS or on VMware — and that provider knows how to work with that platform. You build the config files, basically the scripts, and then there's a plan and an apply. The plan tells you what it would do if you actually ran it, and the apply tells it to actually do the thing. So you can run a plan and it'll tell you what it wants to do — but watch, because if it's going to delete your instance, maybe you don't want to hit that apply and let it. Maybe you wrote something wrong. But you get that visibility. Maybe you should be architecting your systems in such a way that you don't care if you accidentally delete something. I agree, that's the ultimate goal, for sure. Absolutely. But it just gives you that visibility: it tells you what it's going to do, and then you, as the owner of the infrastructure, make the decision.
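The plan/apply workflow just described boils down to three commands, run inside a directory of `.tf` config files:

```shell
# The Terraform workflow described above:
terraform init      # download the providers the config declares
terraform plan      # show what WOULD be created, changed, or destroyed
terraform apply     # actually make the changes (prompts for confirmation)

# Review the plan output carefully: a resource marked for destruction
# that you care about is your cue not to confirm the apply.
```

This is exactly the safety net being discussed: `plan` is read-only, so you always get a chance to catch a mistake before `apply` touches real infrastructure.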
Yes, I agree. And if you did type something wrong, or there's a typo, it'll tell you — it doesn't catch all errors, mostly syntax errors, and every now and then it can fail for other reasons. But you define the desired state, it creates the instances for you, and you can even tell it to launch Ansible the first time, so that everything comes up off of running one Terraform script. Yeah, that's nice — though that can get messy when you're trying to do it. So, I wrote a blog post for openshift.com a few months ago on deploying an OpenShift Kubernetes cluster pretty much entirely with nothing but Terraform. It's a very, very powerful tool, and you can do all sorts of constructs similar to what you can in Ansible: you can template things, you can iterate over things — it's basically a programming language. So if you have a bunch of IP addresses, for example, that you want to inject into some static IP configurations, you can provide them as a dictionary, then have the code iterate over that dictionary and insert those IP addresses one by one into various VMs, and stuff like that. Very, very powerful tool. And so, when I think about automation, I think about the more mundane tasks, like bootstrapping a server, for my home lab. But I also think about how that's going to translate onto massively distributed systems, like Kubernetes clusters. There's no way I'm going to go around and update the packages on 20 VMs one at a time — that's just madness and a waste of time these days, right? I just want to run one automation command and have it do everything. Yeah, I think that works great — until we're in here going, "It worked last week. What did I do wrong?"
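Here's a sketch of the "iterate over a dictionary of IPs" idea Alex describes. Everything in it — the Proxmox provider resource, addresses, and template name — is a placeholder chosen for illustration, assuming the community Telmate Proxmox provider:

```hcl
# Sketch: one VM per entry in a map of static IPs (all values hypothetical).
variable "static_ips" {
  type = map(string)
  default = {
    "web01" = "192.168.1.10"
    "web02" = "192.168.1.11"
    "db01"  = "192.168.1.20"
  }
}

# for_each iterates over the map: each.key is the VM name,
# each.value is its address, injected into the cloud-init config.
resource "proxmox_vm_qemu" "vm" {
  for_each = var.static_ips

  name        = each.key
  target_node = "pve"
  clone       = "ubuntu-template"
  ipconfig0   = "ip=${each.value}/24,gw=192.168.1.1"
}
```

Adding a fourth VM then becomes a one-line change to the map, and `terraform plan` shows exactly the one new resource it would create.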
But like you mentioned earlier in the podcast: because you wrote it, it's self-documenting, and all this state-defining you do just makes it that much easier if you have to redo something later. And anyone looking at it — if you did it right — will be able to understand it, not only from the documentation you write (and I hope everyone writes documentation) but at least from the code. However — okay, so, documentation, right? Yeah, we've got to talk about that. The code is the documentation, isn't it? Why do I need to write more, right? If you're an infrastructure engineer, you should be able to read my code and go, "Okay, that's what that does. Cool. Move on." Right? Yeah, well, I think there's truth to that, but I think there are still use cases for documentation. There could be the basics — here's how you consume the repo, here's how you set it up. Sure, okay, fine. But the more complicated stuff is in the code; I don't need to explain it twice, do I? I mean, if I write it down too, I've then got to update it in two places. Yeah, you definitely don't want to write books, and you don't want to be redundant. You don't want to write "install these packages, create this database and these permissions" if the Terraform or the Ansible or whatever you're using does all that for you. Then yeah, you definitely don't want to write "install MySQL and create database app_name" or whatever. But I think there's a use case for both, and it depends on the individual as well. I'm a big fan of documentation; there are just some things I document — and maybe you could argue this isn't even homelab-related.
Sometimes when I learn certain commands, there's certain syntax, or certain arguments I like, that I just want to retain. I'll make a bash alias when I can; sometimes I'll write myself a little note. I know on Self-Hosted I think you guys talked about HedgeDoc — unless you were talking about a different solution. It seems like there's a lot of competition in that space for documentation. I like just using dumb text files, personally, but I understand that's probably not what the cool kids are doing. I wouldn't know what the cool kids are doing, because I'm not one of them. Well, I think we're all cool in our own way. But yeah, ultimately — I agree — if you're documenting things that are already in the code, that doesn't make a whole lot of sense. So definitely draw a line between what you're going to document and what's redundant. Yeah. You know, I do get excited when I see documentation in the code — when it's well commented and things like that. I've worked with — and I think you may have too — Phil, who works over at the Linux Foundation. He's always been doing it right; I've always seen nice comments in the code he puts together. It's so appreciated — a poster child for doing it right. Twice now I've contacted him with a personal problem on my own servers, and it's resulted in him writing a pull request in that same session, that same meeting. It's great. It's like a superpower.
It's not, but it's definitely great if you can do it. So, there are some other concepts around automation and configuration management that we could probably argue we want to minimize if we can. One example is VM templates, which can be a good thing, but are probably unnecessary if you're using Terraform the correct way, along with configuration management. They exist on multiple platforms — lots of platforms let you create them. Now, sometimes there might be a good use case for having one thing that lets a VM bootstrap, or something that lets it hook into your system better. But I would argue that if you're relying on VM templates too much, that kind of means you're not doing configuration management right. What would you say, Alex? I like a VM template to have a few basic things in it: for example, maybe a convenient SSH key pre-baked in, and some basic packages — Python, for example, so that Ansible will just work without me having to faff about every time. Right. With Red Hat, maybe it could be pre-subscribed if it's a home lab, but in an enterprise situation I'm going to want to do subscription based on the role of the server anyway. Have you come across a tool called Packer? I love it. Yeah? Okay, cool — it's a HashiCorp thing. Oh, thank you so much for bringing that up.
I'm so glad we didn't forget that; I would have hated myself if we didn't talk about that today. Yeah, the most important thing really is that you build templates regularly. Something like Packer, for those that don't know: essentially you provide a recipe to this thing and say go build me a template, and it goes away and builds it, with a bunch of different rules and things like that. Before Docker was really prevalent on the scene, which we talked about last episode, Packer seemed to be the answer, coupled with Vagrant and stuff like that. You were able to build development environments that were fully encapsulated, and deploy these things from a template in no time at all, and that just seemed like the obvious thing to do. But containers reduced the use case for that kind of thing a little bit, in my opinion. It has, yeah. I think you hit the nail on the head when you mentioned SSH keys, and since Ansible uses SSH, it's so much easier for Ansible to hit the new VM if the SSH key it's using is already there. Then, if you want to create a different user for Ansible to run under, you can, or the key, or whatever it is you do to make your configuration management solution work. That's just one less thing to do.
I mean, I think the issue comes when you see people building a template that has everything: the database with the schema, and the application, and the application's config files, and the packages, and then they update the template every week. To do that is effectively doing the same thing you would use Terraform and configuration management for. So I agree, if it's a good starting point, that's fine, but I've seen people run into the trap of making the template everything, and that can be a waste of your time, especially when you can have version control with your configuration management, your Terraform, and even Packer in the chain. And Packer, for those that haven't heard of it, lets you create those templates and images. I've used it with Proxmox and AWS personally. With AWS, you would actually see the AMI, the Amazon Machine Image as they call it, show up there after it runs, so now you can create VMs with that image. Same with Proxmox: you'll see a template for your VMs there. You write the script to define how you want that image or template to look, and then from that point forward you can use that template as the starting point for future VMs that you create. And personally speaking, it's worked with everything.
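To make the "write a recipe, get a template" idea concrete, here's a rough sketch of what a minimal Packer template for Proxmox can look like. This is illustrative only: the URL, credentials, node name, ISO path, and template name are placeholders, and the exact field names come from the Packer Proxmox plugin, so check its documentation before relying on any of them.

```hcl
# Sketch only: all values below are placeholders, and field names follow the
# Packer Proxmox plugin from memory; verify against the plugin docs.
source "proxmox-iso" "ubuntu" {
  proxmox_url   = "https://pve.example.com:8006/api2/json"
  username      = "packer@pve!mytoken"
  token         = "REDACTED"
  node          = "pve1"
  iso_file      = "local:iso/ubuntu-22.04-live-server-amd64.iso"
  ssh_username  = "ubuntu"
  template_name = "ubuntu-2204-base"
}

build {
  sources = ["source.proxmox-iso.ubuntu"]

  # Pre-bake just enough for configuration management to take over later:
  # an interpreter for Ansible and fresh package metadata, nothing more.
  provisioner "shell" {
    inline = [
      "sudo apt-get update",
      "sudo apt-get install -y python3",
    ]
  }
}
```

The point of keeping the provisioner this thin is exactly what was said above: the template is a starting point, and everything application-specific stays in Terraform and configuration management where it's version controlled.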
Everything I've tried, or thought I might be interested in: Google Cloud, AWS as I mentioned already, Proxmox, many more. I'm pretty sure XCP-ng too, but don't quote me on that. I was looking up the various platforms it supports, and it knows how to build templates on those platforms. So you learn one tool, Packer, to create your templates; with Terraform, the different providers like we mentioned earlier allow it to work with different platforms; and Ansible works with pretty much everything. So you have one tool for each purpose, and you can use them with whatever platforms you're on. I wanted to address, and I'm going to butcher the name, I'm sorry, Parithosh, I hope I got that right, who asks: couldn't you just use cloud-init to set up the SSH keys? Yes, you definitely could, so long as the OS you're provisioning supports it, which is not all of them. Which ones don't? Because it just so happens that the one or two I tried worked just fine, but obviously that's not a very good sample. Do you have any in mind? It depends if you're installing from an ISO.
So, you know, a bunch of desktop distros, for example, probably would support it, but you'd need to pass it in the correct way, and if you're building from an ISO directly, that's not easily done. It really depends on how you're provisioning these systems. If we're doing it through automation, it's very easy, but if you're doing it through a click-click-click UI, you've got to paste things in the correct field and all that kind of stuff. It's just not a standard thing across distros, so Ubuntu does it differently to Fedora, to Arch, to Red Hat, you know. I felt, because I actually worked a lot with it a week ago, I kind of felt like the documentation just was nowhere near as good as it is with Terraform, Ansible, and the others. I mean, there is documentation, it's just not as much. So at least when I was working with it, you're more likely to run into a situation where you're googling how to do a particular thing, and you may or may not find a good lead on how to do that thing. In my case, I was just playing around with it. The Ubuntu Server Raspberry Pi image has cloud-init built in, so I figured I'd customize the cloud-init config, and I looked up in the documentation how to change the default user from ubuntu to my name. That was the proof of concept I wanted to do, and I found it: I found the syntax for it, I was able to change the default username, and it totally locks the image. You can't log in. I fought with it for probably over an hour, and I know I did it right. The user was created; I could pop the SD card in and see that /etc/passwd was updated with the hash I provided. I never really figured out why I couldn't use it, but I did eventually find a workaround.
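One plausible explanation for that lockout, offered here as a guess rather than a diagnosis: in cloud-config, defining any entry under `users:` replaces the default user list entirely, and newly defined users get a locked password by default, so without an SSH key (or an explicitly unlocked password) there may be no way in. A rough sketch of user data that swaps the default user more safely; the username and key below are placeholders:

```yaml
#cloud-config
# Sketch: "jay" and the key are placeholders, not a prescription.
# NOTE: any entry under "users:" replaces the stock default user; add
# "- default" as well if you want to keep the distro's "ubuntu" user.
users:
  - name: jay
    groups: [sudo]
    shell: /bin/bash
    sudo: "ALL=(ALL) NOPASSWD:ALL"
    # Without this key (or lock_passwd: false plus a password hash) the
    # new account can end up unreachable over SSH.
    ssh_authorized_keys:
      - ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAA... jay@laptop
```

The `cloud-init schema --config-file user-data.yaml` validator, where available, is worth running before baking a config like this into an image.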
It was a little challenging. I think depending on your distro, like Alex mentioned, you might have an easy time or a bad time. With me, though, I've noticed that yes, you can do it with the ISO, but from what I've seen so far, even if the distro doesn't include cloud-init by default, you can install the package for it. Usually the distro maintainers, if they provide that package, will have a default cloud-init config for their distro, and you can grab that and customize it. Then you essentially convert that installation into a template, and the next time it boots, it applies that config. But again, don't over-architect cloud-init. There are some things you can do with it, but you run into a challenge if you try to make it everything to everything, and it might just start working against you. So SSH keys, absolutely a good use case for it, but like Alex mentioned, check your distro's capabilities, make sure the documentation is there for the things you want to use it for, and then choose accordingly. All right, so any thoughts on the whole template thing, or even cloud-init for that matter? And actually, Tom, how do you use templates for your use case? I meant to ask you that. Yeah, so I did look up packer-builder-xenserver; the code's a little old. I imagine it maybe still works, because XenServer's API has worked the same since forever ago; they carried everything forward and added features. A lot of my stuff is still VM templates, because it's just an easier way to go. I usually just keep a couple of maintained running ones, and because it's so easy to fork them, you just create clones, and it instantly deploys as many as I want, because you can do this inside of XCP-ng through Xen Orchestra. It's nice; it's just so easy from a web management standpoint. If I need five Ubuntus, I can just go grab the Ubuntu template.
It's already up to date, and I duplicate five of them. And kind of like Alex said, it saves me the trouble of loading Python and all the things, if I wanted to use Ansible against them; it has everything in it already. So I'm using it a little bit differently for my lab. It's probably not something I'd want to use for customers; the automation is a better way to do that. For my lab it's for simplicity, and the fact that I don't use automation enough in my day-to-day use cases. This is why I usually still do it that way. That's also why, you know, I'm still learning from you guys about better ways to do this, just like everybody else here. So, something you said kind of reminded me of a reason why a lot of people use cloud-init. And there are other ways of doing this; you don't need cloud-init for anything, let alone what I'm about to mention. But one issue that comes up, especially with beginners, is the SSH host key issue. Let's say you create ten VMs off of one template, and that template contains the same SSH host key.
Yes, then every single time you SSH into a machine, it's basically going to give you errors, because your client tracks known hosts and the hash associated with each one. They're all going to have the same key, and it's going to really confuse your SSH client. Now, cloud-init will wipe your SSH host keys and regenerate them on new instances. But that's not the only challenge. This one is specific to Ubuntu, and I think other distros might do it too; I don't know which ones do and which don't, and this really drove me nuts. Basically, what happened was I had the SSH host key thing figured out completely, every machine got a different one, but each machine was fighting over the same IP address, even though the MAC address on the NIC was different on each. Generally how it works is: if a VM is exposed to a DHCP server, it presents its MAC address, says it needs an IP, and gets one; then another server with a different MAC address (and I made sure of this) asks for an IP and gets a different one. But despite having different MAC addresses, they all got the same IP, all the time. What I realized was that with Ubuntu there's an /etc/machine-id file (machine-hyphen-id) that has, I don't know if it's a hash, but it has content in there, and that's what gets presented to the DHCP server instead of the MAC address. So if, in your image, all the servers have the same machine-id, and I think even Debian might be going this direction if they haven't already, then if you don't fix that, you're going to have an issue where the servers fight over the same IP address. What it looks like is: you SSH into one, you start working on it, and it drops. What happened? It says connection reset, because now it's trying to hit a different VM. Cloud-init will actually fix those things, and you can fix it yourself by simply truncating the machine-id file, zeroing it out, just emptying it. And it matters whether the file is absent versus empty.
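The two fixes just described boil down to a couple of commands: empty (but keep) the machine-id, and remove the baked-in host keys so each clone regenerates its own. The sketch below runs against a scratch directory so it's safe to try anywhere; on a real cloned VM you'd operate on / as root, and on Debian/Ubuntu you'd regenerate host keys with dpkg-reconfigure openssh-server.

```shell
# Safe demo against a scratch dir; on a real clone ROOT would be / (as root).
ROOT="$(mktemp -d)"
mkdir -p "$ROOT/etc/ssh"

# Simulate state cloned from a template: a shared machine-id and host keys.
echo "3f2c0a7e9d1b4c6e8a0f2d4b6c8e0a1f" > "$ROOT/etc/machine-id"
touch "$ROOT/etc/ssh/ssh_host_ed25519_key" "$ROOT/etc/ssh/ssh_host_ed25519_key.pub"

# 1) Truncate the machine-id but KEEP the file; as noted above, absent and
#    empty behave differently, and an empty file gets a fresh ID on boot.
: > "$ROOT/etc/machine-id"

# 2) Delete the baked-in SSH host keys so each clone regenerates its own
#    (Debian/Ubuntu: dpkg-reconfigure openssh-server handles the regen).
rm -f "$ROOT/etc/ssh/"ssh_host_*key*

ls -l "$ROOT/etc"
```

Doing this once in the template, just before converting the VM into a template, means every clone comes up with a unique machine-id and unique host keys.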
I found that in my testing; believe me, a lot of non-politically-correct words came out of my mouth when I was fighting this issue. I eventually figured it out and wrote a blog post about it. So there are going to be things like that in an image or template that you want to make sure are accounted for, and cloud-init, even if you only use it for the SSH host keys and the machine-id, and that's all you do, that's totally fine. What's interesting is that Raspberry Pi OS doesn't actually use cloud-init. How they get around the host key problem is kind of funny: when you look at the systemd unit for it, it literally looks like someone hated cloud-init and wanted to avoid it. The systemd unit literally rm's all the host keys out of /etc/ssh, runs the dpkg-reconfigure openssh-server command to regenerate them, and then disables itself. So you enable the service, the next time it boots it does those few things, and it disables itself. They went out of their way to avoid cloud-init, but you can do things like that if you want to get around the problem. Yeah, it's interesting that they do it that way. We were talking about that when you were putting the show notes together. We were like, oh really? That sounds like something someone very basic would do, and we don't think of Raspberry Pi as built by amateurs. But they have their reasons, and part of the reason could have been documentation; maybe when they wrote it initially there wasn't documentation for what they were trying to do. They should just be able to use the Debian upstream cloud-init config to do that; they really shouldn't have to re-architect anything. But yeah, you're right.
It is what it is. I chuckled when I saw it. For anyone running Raspberry Pi OS: if you go into /etc/systemd/system you should see a unit file there called something like reset-host-keys.service, or something similar to that, and if you cat out the contents you'll see exactly what I'm talking about. But those are some things you might want to use cloud-init for, if that's something you experience with your VMs. I just want to get that out there: if you run into an issue where all of your VMs are fighting over the same IP, and you have no idea why because they have different MAC addresses, it's /etc/machine-id you should take a look at; it's probably the same on each one. So you'll run into issues like that when you create templates every now and then. Which is why I should switch to automation instead of just cloning all my VMs. One thing about automation, depending on the platform: I don't think there are going to be very many people using AWS in our listening audience, maybe I'm wrong, but we actually got it to a point where there didn't need to be any template or image at all, because AWS has something called user data. With a launch configuration right there in AWS, you can just put something in the user data, add the SSH key via the built-in SSH key service, and you can pretty much just have a vanilla Ubuntu, Debian, or whatever instance you're using. No template, no AMI, nothing, and then the user data launches the configuration management to do pretty much everything. It is possible, but probably at too high a level for the average homelab person, because they might not be using a platform that exposes a capability like that.
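For anyone curious what that "no template at all" user data can look like, here is a rough sketch in cloud-config form: a vanilla image installs Git and Ansible on first boot and hands off to ansible-pull, the approach mentioned at the top of the show. The repository URL and playbook name are placeholders.

```yaml
#cloud-config
# Sketch: EC2 user data that bootstraps config management with no custom AMI.
# The repo URL and playbook name below are made up for illustration.
package_update: true
packages:
  - git
  - ansible
runcmd:
  # ansible-pull clones the repo and applies the playbook to this instance.
  - ansible-pull -U https://github.com/example/infra.git local.yml
```

From there, everything the instance becomes is defined in version control rather than baked into an image.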
They'll probably have to do a template. Now, another thing about this that we wanted to talk about in our notes here: the idea that the server is basically completely unimportant. The data obviously is what matters; the app is what matters. If you're in a situation where you're thinking, man, if this server goes down that'd be horrible, I'd have to do all these different things, then imagine a world where you could delete the server right now, spin up a new one, and nothing's different. For example, you could have a database that's outside of the VM. Maybe it's mounted via autofs or some other service, maybe some central storage. The VM, the container (well, containers are pretty much stateless anyway), you don't really care about, because everything of importance is in a central place. And if you're like Tom and I, using TrueNAS, you have snapshots, and you can easily roll those back, especially if you have the configuration in snapshots, basically just being mounted. That's an option. There's a lot of interest, especially on my forums; I've been asked how to create this, basically having central storage with everything important there. Obviously it becomes a single point of failure if that one storage system goes down, so you really have to think about the single point of failure that can create. But it's definitely something to talk about. I know, Tom, you use TrueNAS probably way more than I do, because you make videos about it, but I think you and I probably have the same use style. Yeah, it's another spot where you can create some levels of automation, and that's expanding rapidly. One of the things TrueNAS has done is allow API access to control the system, and this is going to expand how you can handle your storage and how you can tie it into some of your other tools, like Kubernetes.
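One way this "disposable server, central data" idea shows up in practice is a container whose entire state lives on a bind mount to central storage, so the container itself can be destroyed and recreated freely. A sketch in Compose form, using a LinuxServer.io image; the host path, port, and tag are examples, not a prescription:

```yaml
# Sketch: all state lives under /config on central, snapshotted storage.
services:
  thelounge:
    image: lscr.io/linuxserver/thelounge:latest
    ports:
      - "9000:9000"
    volumes:
      # e.g. an NFS mount backed by a TrueNAS dataset with snapshots;
      # delete and respawn the container and nothing is lost.
      - /mnt/tank/appdata/thelounge:/config
```

Rolling back the application then becomes a storage-side operation (restore the dataset snapshot, restart the container) rather than anything done inside the container.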
I believe you can tie that into it so you can roll back snapshots from there. I don't know if you've covered that in some of the Kubernetes things you're working on or not. Yeah, I'm thinking about it; I haven't, but I've tested some of the expandability they're getting in there. It's really nice to see. And even Terraform working with XCP-ng: all these different tools are building out so you can build your VMs, build out your storage servers, and have it all tied together very tightly, so you don't just replicate a VM like Tom's been doing. Yeah. So you mentioned Kubernetes and containers. One of the many millions of things I love about LinuxServer.io is that it's pretty logical: /config. That's according to the documentation, and it's been that way every time I've used it. When you get an image from LinuxServer.io and create a container with it, your data is in /config. So if you mount /config to a central resource, and I tried this with TrueNAS: I was using The Lounge, an IRC bouncer that you can basically run in a browser, and I had the /config folder mounted to a TrueNAS data store with snapshots. It was empty; I spun up the container, and sure enough, all the config files just ended up there on TrueNAS. Then I deleted the container, respawned it, and it was exactly like nothing had happened, because all the config files are not in the container; they're in that central place. And Alex, I know you have a lot of Docker Compose kung fu going on, so I'm assuming you have a better solution, or maybe some kind of interesting solution, around Docker Compose with your services. Yes. So, in my infrastructure repo.
So, github.com/ironicbadger/infra. All of my stuff is open source if you want to go and dig around a bit more. In there I writ, that's not a word, I wrote a role for Ansible which takes in a Python dictionary of parameters and spits out a Docker Compose file at the other end. So I basically plug in the Traefik labels that I want, the host paths that I want to bind my volumes to, ports, all that kind of stuff, and out the other end comes a fully functional Docker Compose file. That's really useful for a few reasons, the most important of which is secrets. I have a bunch of keys that I don't want the internet to see, and yet my infrastructure repo is open to the world. So how do I handle that? Well, I put a bunch of stuff in something called Ansible Vault. Ansible Vault will encrypt a YAML file using AES-256 encryption, and at runtime the Ansible controller, which is usually my laptop, will decrypt that file in real time, extract the Cloudflare API key, for example, or database password, or whatever it might be, and inject it into the clear text file that lives on disk on the host. That is a downside: the Docker Compose file is clear text on the server. I don't really know a good way around that, except running something like HashiCorp Vault, which I'm not going to do at home. So yeah, that's how I manage my Docker Compose files. But the data itself just lives on ZFS, on a, what do you call it, volume? Yep, dataset. Dataset, thank you, that's the word my brain wanted. Yeah, I said data store, which is incorrect terminology: dataset. So I have a dataset per container, and I use Jim Salter's sanoid tool, which runs every hour, I think, by default. So I have a snapshot of the app data for each container. So app data is separate from data data, if that makes sense.
So the configuration of the application is separate from, say, my movies and TV shows. The movies and TV shows live off ZFS; they live on a MergerFS array, for want of a better word; it's just a bunch of disks fused together in software. perfectmediaserver.com if you want to know more about that. But yeah, ZFS is how I do snapshots for application configuration. That's a brilliant way to use it. I also use Ansible Vault, and I kind of ran into the same thought process, because you mentioned you don't know a better way to solve the clear text thing. It's similar for me: I use Ansible Vault, and there's the --vault-password-file option, I believe, in Ansible, where you point it at the text file that contains your key. But you absolutely do not want that key in your version control where everyone can view it, because that totally destroys the whole purpose; now everyone can decrypt everything. You can tell Ansible where to find it in the Ansible config file; actually, I just use a command line argument. And you have to be root or a very specific user to actually view the file on the system, and I could make the argument that if you got that far into my server, it's already blown wide open, because the file is not in reach of any web server, it's not exposed to the internet or anything like that. It's readable only by the users that need to read it. And if you're able to get that far and tell me what my own password is, well, you know what, pat on the back, you won the prize, you got in there. The problem was before that: you shouldn't have been able to. Well, let's be fair, people are just going to go after pipelines and not Jay's home server, right?
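As a concrete version of the vault-password-file setup just described, hypothetical paths and all: the password file lives outside the repository with tight permissions, and Ansible is pointed at it either per-run or from the config file.

```ini
# ansible.cfg (sketch): point Ansible at a vault password file that is kept
# OUT of the repository (add it to .gitignore) and locked down, e.g.
#   chmod 600 ~/.config/ansible/vault-pass.txt
[defaults]
vault_password_file = ~/.config/ansible/vault-pass.txt
```

The per-run equivalent is passing `--vault-password-file ~/.config/ansible/vault-pass.txt` to ansible-playbook; either way the encrypted YAML can sit safely in a public repo while the key never leaves the controller.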
Well, I mean, now I'm sure there's probably a bunch of people trying. And if I get owned, I guess I only have myself to blame. But Ansible Vault is pretty good for those kinds of things. I haven't used HashiCorp Vault; I remember looking at it and just staring at it cross-eyed, like, what, how does this work? Could have been my ADD, but I plan on looking at it again. I deployed it at a previous job and it's a beast. It's very, very complicated to deploy, or certainly was four or five years ago, but it does work very, very well. It's incredibly expensive at the enterprise level, though. Yeah, I think it was like two years ago when I looked at it, and I had the same opinion, so unless something has drastically changed recently, it's probably still the same today. So yeah, I know we didn't really talk about, I mean, we did talk about the differences between Ansible and Puppet, and Chef is pretty much the same story there. The fact that all of us are using Ansible says something. I could talk a little more about the differences, but I think we pretty much covered that too, because it's agentless, like was mentioned earlier, and Chef has pretty much the same pros and cons as Puppet for the most part. I have to personally make the recommendation of Ansible for anyone looking, and I don't think anyone will disagree with me on that. But then again, if your company is sold on Puppet or Chef and that's already the direction they're going, it is what it is. And I haven't used Salt personally, so I can't really speak to that. Jeff, I would say that probably, I don't know, this AC unit controller has better security than the pipeline people. Yeah, we haven't heard anything good come out of that yet. Were they running Windows XP?
I'm trying to sort fact from fiction, because there's a lot going around about whether they were running XP. If they were, they deserve it. Yeah. That would not be a good day. I mean, it wouldn't be very hard to compromise a Windows XP machine if we knew of one, but yeah, that's not good at all. So, are there any other thoughts about configuration management, automation, and the like? Because unless I'm missing something obvious, I think we may have covered everything we set out to talk about. Yeah, I think we got most of it out there, and it's one of those things where we can't possibly cover all of it. A lot of it comes down to, you know, what do you use? On the SaltStack one, I've seen a couple of people mention that we didn't dive into it. I know they've had a lot of security problems, and it's had a lot of security updates. I'm not saying it's bad as a product; it may be that because of its popularity, people found problems with it. The bigger problem, most of the time, was the fact that it was all publicly exposed, and people never thought to tunnel any of that type of traffic that does your configuration management. Because even SSH, I mean, SSH is secure enough for the internet, but if you can secure any of the transport layers of your infrastructure, do that.
It's just better that way. Hide it behind a VPN. Make them work for every inch they try to get into your network. Don't make it easy; don't be running those XP boxes. And that's what it really comes down to with all these automation tools. The one thing to really keep in mind, given how we were talking about decrypting that YAML file on the fly and things like that, is that with all this power comes the power to massively scale out destruction. It's not unheard of for someone's server to get popped, and then they find out it's the base of the automation, the server from which an attacker can massively disrupt things. The problems are scalable, very, very scalable. That always needs to be taken into consideration with these tools. I try to separate things when I can, too. For example, I have a completely different Ansible Vault key for my RetroPies. Yes, I do use Ansible on my RetroPies; I literally go all in on Ansible. But I didn't want the same vault key on everything, because then if they get that vault key, they could go to my other Ansible repositories and decrypt things there too. So you do want to maintain at least some separation. Because, yeah, you have multiple layers, but if you have one key to all the things, the person wouldn't really have to walk through very many layers if they get that key; they expose pretty much everything from that level up. And especially, like I mentioned earlier, don't put your vault keys, passwords, SSH private keys, or anything people shouldn't see in a public git repository. There's a funny anecdote about my infrastructure repo: "99.9% less leaked credentials." Yes, that's what happened. I didn't have the right thing in my .gitignore file, and oops. And I think I have done the same thing. In a panic, I literally
I literally Deleted the repository and you know because I have all the code on my systems And then I just created a brand new one without that file Um, yeah, there's ways you can actually purge something from the history, but I didn't even want to wait I'm like I'm like nuke it all start over not start over I just did a initial git commit to the new repository with all the same stuff and went from there Obviously, I lost all my history. It's the only way to do it because git is a database and it contains Even if you delete the file, it's still got the history there. That's the kind of the point, right? So Yep. Um, well the reason why I say it that way because a friend of ours, uh, you actually gave me a solution There was there's a tool that allegedly will go through and totally purge the history of a git repository of that thing But I wasn't of the opinion where like I wanted to trust it 100 percent because if there's like an edge case where Exactly, it did not delete it in one of the commits. I didn't want to risk it So I didn't let's face it every github repose being scanned by bots every second. Anyway, so once he's in there It's up there. 
You've got to change it as fast as you can, because it's public knowledge at that point. Yeah, there are a lot of automation tools built for this. There are a couple of public ones, I can't remember the names off the top of my head, but I think they've been shut down now because GitHub did not appreciate them. They scanned repositories for people's private keys and then listed them on a website, on a scrolling page; you could just go there and grab the latest keys as they were found. Entire companies have been hacked, and I hate to use the term hack in this regard, when an AWS key to a company's entire infrastructure was up there in clear text for anyone. The attackers get all the rights that the IAM role, or the key attached to it, has, and then they just own everything, because someone thought it would be okay, or maybe just didn't think about it, and put it right there in a public repository. So just keep that in mind. I know for most people listening to this it's pretty obvious, but believe me, it's not obvious for quite a few people out there. So yeah, keep that in mind. We'll leave you with that, because I think we've reached the end of it. We've reminded everyone to secure everything. So hydra your passwords, hydra your keys; they're hacking everybody around here. Yes, something like that. All right, I think that was our show for today. Yes. All the previous episodes you can find at thehomelab.show. This one will be published within 24 hours, so we'll have it on there if you want to listen to us again, and we'll have show notes for some of the things we talked about in case you missed it in chat. That'll be posted over on thehomelab.show. Thanks. Tom Lawrence here, with Jay LaCroix and Alex Kretzschmar. All right, take care, everyone.