Here we go! Thank you, SCaLE, for giving me slides. Hi. Thanks for coming to my group therapy session. I saw the schedule and there are no talks after me, so I think I have the room for four hours. I'll try not to go over. I'm Justin. I work at Disney Animation. I'm not an artist. What I am is a tool hoarder. I kind of just collect these things, keep them around, and try them out. So let's go over some of that stuff. The talk is inspired by Bryan Lunduke, who has a "Linux Sucks" talk. If you haven't seen that, I recommend going to see it. He actually gave an update to it last night. Yeah, let's go ahead and just figure this out as we go. That's completely fine.
So, one disclaimer before we get started. I'm going to read this because both people who are going to watch the video on YouTube might not be able to see the slides, so let's go through this together. The thoughts and opinions expressed in this presentation are not my own. They are the collective outcry of every user who has ever used one of these tools and run into one of the following limitations. While there are many ways to manage configs, this presentation will only focus on a subset of tools that are advertised as a way to make your management of configs easier in some way. You can see them at the back table. This presentation cannot address every issue in every... somewhere I lost my place... in every environment, but I'll attempt to make limitations more obvious. This presentation should not deter you from using a config management tool. In general, they are awesome. I mean it. You would be a fool to manage infrastructure of any size without one. If you care for your life, your job, your loved ones, or preventing Skynet, please do not write your own. There is one caveat to writing your own. If you are going to write your own tool, please give it a better theme than the current tools. If I'm called a puppet master one more time, I might cry.
One more disclaimer. I have not used every version of every tool in every size of environment, nor have I spoken to every user of every tool. I've spoken to a lot. If a claimed limitation does not affect you or cause you to cry into your pillow at night, please let me know on Google+, so your comments can be ignored. The claims in this presentation are strictly for educational purposes and should be used as such. If I missed a limitation that you find particularly egregious, please let me know on Twitter, and I won't ignore your comments: @rothgar. The first step to fix the problems outlined in this presentation is to acknowledge they exist. The second step is to storm the respective GitHub issues and let the maintainers know which limitations to prioritize. The third step is to pay those companies, so they will listen to your outcry and assign someone to ignore your request. The final step is to not, I repeat, do not ignore the issues and move everything to Docker in a config-management-free utopia, where you will incorrectly assume no issues will ever come up again.
Now that that's out of the way, let's look broadly. Everything kind of sucks a little bit, and the tools all share some similarities. One of those similarities I'm very sorry about, but I cannot help you with. That's more of an "it's not me, it's you" thing. I'm sorry if you're in a Windows environment, because they all kind of suck there. Second place is documentation. You always want some example of something, and you're not going to find that documentation, because there are full sections of tools that are undocumented, or you have a weird environment, or the thing you need is just not going to be there. And that's okay, documentation sucks everywhere. Some a little more than others, but it all kind of sucks. Testing is hard.
Carlos gave a talk earlier about test-driven infrastructure and explained that testing this stuff is hard. You make a change here, and a server in Oregon or somewhere is going to do something. What is actually going to happen? Is the network the same as your local environment? Probably not. So you're either maintaining duplicate infrastructure or you have crazy Vagrant boxes that try to replicate everything, and it's just hard. It's not going to happen. There's a lot to do there.
Templating. Everyone loves some ERB and Jinja2, right? They're great, you know? We're all here together. And why isn't there a way (I know there are some tools that kind of do this) where I have a template and I want to do an Ansible run and spit out the template, and just show me what that template would be here? There's a no-op, but then it's going to try to put it there. I just want to name the server I want it to run on, have someone figure out the environment variables, and then make that template for me, right? It should be easy, but sometimes it's not.
And then in those templates you have secrets to put, right? That's what templates are for. You need to put things in those files, and you only tell your friends your secrets. Why would you tell some strangers? And YAML is not your friend. This stuff is hard. With every config management tool, the answer is: oh well, let's rub a little crypto on them and we're good, let's stick that in a text file and no one can read it, right? And it's true, because I can't read that. What does that say? I gave it my password and it lied to me. It told me this was my password. I'm like, no, that's not what I said, right?
I don't know. You read that, but then someone made another commit, and now that's the password, that's the new password. Oh cool. Well, I don't know which one's which. OK, let me go to the box and find out what it is and decrypt it somehow. One of these is "password" and the other one is, you know, /dev/random. Can you read it? I'll give you the PEM file. Can you tell me which it is? Spoiler: that one's "password".
So you decide you're going to do this. They all suck a little bit, but let's just start, let's go with it. Let's pick a tool that spends a lot of marketing dollars, so it must be good, right? They're really mature, and man, my Twitter stream is sweet, so let's go with it, because it must be good, because I see it all the time. So let's go to the documentation. What do I need to get this set up? There's got to be some simple diagram, like a box or two, right? Oh well, kind of. A little more than I expected. I mean, this is large scale, this is 7,000 nodes. I know people in this room that have that. I do. And that's kind of a lot to manage for this. And the lines... I'm color blind, so thanks, but no thanks. And who's going to set all this up, and what's managing this? How did I get that there? I need config management to put that in place. It's a true story, a very true story: the very first Ansible playbook I wrote was deploying Puppet to all my 7,000 nodes. That was what I did. And it's been all downhill for Puppet since then. It was just kind of, hey, that worked, and that was cool. But you know what, let's go on ahead, because Puppet is still mature. So spell this out for me. What do I need? Those are the 13 boxes I need to manage this. That's okay, 13.
I can do that. Amazon is cheap, right? I can convince management, they'll do this. So it's, you know, 30 cores. That's not terrible. I think the Raspberry Pi 3 is going to have that, so we should be okay a little bit down the road. We'll be fine. And, oh man, 100 gigs of RAM. That's a bit more of a bite. Okay, well, let's keep going. Like you know, it's just like a fried hard drive. Who cares? So yeah, management, they're on board for that, right? And we're good.
But then, this is a new environment. This is hipster Puppet. This is C and Haskell. This is all the hotness of Puppet Server. Puppet 3, anyone? Puppet 2? You're on Puppet Master, right? Let's see, what are the requirements there? Oh, that's a little bit worse. It kind of upped it for everything. And then there's that other little thing: oh yeah, with Puppet Master, this is only 2,000 nodes. It significantly dropped what you could do. So if you have, let's say, 4,000, just double it all, right? Management's on board.
Got our infrastructure set up, you know? So Puppet has Puppet code. There are a lot of good things about Puppet code. It's a DSL, a DSL just for config management, so it's got to be good, right? It's like JSON and Ruby had a baby, and that's Puppet code. So it's got to be easy to write, right? So let's go for it, because this is, you know... oh, sorry. It's got to be mature, because it's Puppet code and it's been around forever. Does anyone know this bug? It has been around forever, so it's a feature, a bug, whatever. Ten years old, from Puppet 0.24: I need to make a directory tree. Let me do that in Puppet code. Well, kind of. You can kind of do it.
Really, I mean, I have 13 boxes set up in my server farm and I want to run this on all of them, and I can't. I can't easily. I mean, this bug's 10 years old. I just imagine Luke, he's still in there, banging away on his computer. He's like, "I've got to fix this bug." "Hey, Luke, you want to come to lunch?" "No, no, I've got to fix this, you know?" "Okay, we'll bring you back something." But does he know we don't use that bug tracker anymore? We moved. Did you see the banner? Hey, we moved. Go somewhere else. We wiped the slate clean.
All right, so let's just dig into the Puppet code. We can write this because it's easy, right? Let's make our directory tree. Because this is it, right? This is what you want. It's not, because it's bad. And why are there commas everywhere? Take those off there. Nothing does that. Why are there commas on the end? I don't know. But let's make this a class, and let's parameterize that class. This is what you need, because you're doing it wrong. So okay, put the parentheses around it. All right, so the class has parameters. So now we're good, right? No, you need a params class that has your variables. Then you read that in and you inherit it. So okay, now we're good, because this is good Puppet code, right? It's like, well, store your variables in Hiera. Set that up, and you've got to put it in Hiera first, and then it inherits that. Okay, because you're still doing it wrong. But then it's like, well, we're good, right? This is Puppet code. This is easy. Well, it's not a role or a profile. Rebuild that. Put it in as a role and a profile. I mean, come on. This isn't hard, right? And rub a little r10k on there and you're set. Really, this is what ends up in my Puppet code. That's it. All of that. I'm like, all right, screw it. Just go, I'm in.
And then you've got to figure out: I need that directory there first, right? Puppet figures it out for you. You know, they compile it. It doesn't go in the order that you wrote it, right? So you have to use these ordering keywords. I always have to look them up. Subscribe, go back. No. You notify something and then... well, okay, let's take a step back. Let's put classes together, so the classes will go in order, right? And you've got these little arrows. Well, the squiggly one is subscribe, because that starts with an S, right? And then the reverse is unsubscribe. Then you don't get email. I don't think that's right. I don't even... you know, I need all my directories there before I put files down. And this is the example I get, and I'm like, what is that? I don't know. Pipes? Why did you put those in there? I don't know.
And so, for the Puppet people, I have something for Puppet 5.0. I know the syntax is going to change, because it changes every time. So here's my feature request, all right? New syntax for before and after. Are we still good? Follow me? All right? And there's the "I don't care, Puppet, you decide, you can do whatever you want." And then the "you know what you did." Those are going all over my Puppet code now. That's what I want next.
There have got to be good parts, right? I mean, Puppet has good parts, right? MCollective. You guys are laughing. That's the punchline. I'll narrow it down a little bit. So we have MCollective, which kind of broadcasts stuff to everything, and you can talk to everything at once, right? So that's cool. What's the best plug-in for it, right? Ping. I kind of had that. I kind of had a tool that did ping, that pinged lots of things at once. And, you know, I could run Puppet. Oh, well, they already told me to start the Puppet service. So Puppet's running, right?
So you kind of have a service plug-in to disable the Puppet service, right? And then you can go back and you can run Puppet with that. So you're okay. But then for anything else you need to do, like I need to check something else in my infrastructure, RPC, that's your answer. Just give me a shell on all the boxes at once, and we're good, because this is going to broadcast everywhere. But you know what? You know what I really like? I think that's kind of like this.
Speaking of SSH in a loop. I mean, Ansible, right? That's what it does, right? I mean, Luke has said in the past that it's not a solution. But you know who's a really smart guy? Michael DeHaan's a really smart guy. He's like, you know what? I'm not going to do that. I'm going to do it a little bit different, because I'm going to just use Python, and Paramiko is not SSH, right? Let's kind of abstract that a little bit. And I mean, Michael's smart, so he must have figured this out. He used to work at Red Hat, and then he worked at Puppet, and he started Ansible. And then he saw the writing on the wall, and he got out of there before Red Hat bought him again. He didn't want to go back.
But let's take a look. What have we got here? Tools existed that did this. I mean, we could get SSH on a bunch of boxes. There are a bunch of tools that do it. I could call up any of these, give it a list of servers, and I have SSH on all of them. And there's more. I mean, it just keeps going. Every other week, it's like, oh, another parallel SSH thing. So what does Ansible give us? There are a couple of different things, right? It gives us abstraction, and it gives us cowsay. I mean, none of those other tools gave me cowsay, and that's good. And there are two things that everyone does when they run Ansible for the first time.
One is they turn on infinite scrollback in their shell, because cowsay has just taken up like 40 lines of shell, and you're like, oh, where'd that go? Fuck. And the second thing is they're uninstalling cowsay. It's like, oh, I didn't actually want that. It was cute, but man.
But then, I mean, Ansible is really close to just a straight shell. There's not a lot of abstraction there. So let's go through an example here. Let's install a package. Let's install some software. So Chef and Puppet: hey, that's cool, "package", that makes sense to me, right? And Salt, I see what they did there, you know, because sysadmins are lazy. And then it's like, okay, Ansible, you can do this too, right? Oh, shoot. What's this box again? Do I use apt? I don't remember. Let's look at it. Well, okay, for anyone who wants to give me comments: yes, Ansible 2.0 has a package module. Cool. I have no idea which ones it includes or how it works. Just saying.
So this is my playbook for Ansible, because I need to do stuff. So I have common, and I'm going to put some stuff in there. It can, for instance, make a directory. And we're going to install Apache, right? Oh, well, is it httpd? Or apache2? I can't just put a variable there, because the modules are different, right? So then what do you have to do? You have to do this abstraction yourself. So you have all this if-branching. I'm like, well, Fedora uses yum, right? And Debian uses apt. And then it's like, well, which version of Fedora is that? Is that 23? That's DNF. There's another module. So you have another branch down there. It just branches all the way down. I just want a little more abstraction in there, so it's a little less of just a straight shell. A little bit more. And then you're troubleshooting this. I don't know why, but for some reason, for me, Ansible seems hard for that.
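The per-distribution branching described above can be sketched outside of any config management tool. This is a minimal illustration in Python, not how Ansible or any other tool implements it. The lookup tables and the function names `pick_manager` and `install_command` are hypothetical, and the Fedora cutoff assumes DNF became the default in Fedora 22.

```python
# Hypothetical lookup tables for illustration only; a real tool would
# detect these facts on the target host instead of hard-coding them.
DEFAULT_MANAGER = {
    "Debian": "apt",
    "Ubuntu": "apt",
    "CentOS": "yum",
    "Fedora": "dnf",
}

# The same software has a different package name per package manager.
APACHE_PACKAGE = {
    "apt": "apache2",
    "yum": "httpd",
    "dnf": "httpd",
}


def pick_manager(distro: str, major_version: int) -> str:
    """Return the package manager name for a distro and release."""
    # Assumption: Fedora releases before 22 still default to yum.
    if distro == "Fedora" and major_version < 22:
        return "yum"
    return DEFAULT_MANAGER[distro]


def install_command(distro: str, major_version: int) -> list[str]:
    """Build the command line that installs Apache on the given distro."""
    manager = pick_manager(distro, major_version)
    return [manager, "install", "-y", APACHE_PACKAGE[manager]]


if __name__ == "__main__":
    for distro, version in [("Fedora", 21), ("Fedora", 23), ("Debian", 8)]:
        print(distro, version, " ".join(install_command(distro, version)))
```

The point of the sketch is that the caller asks for "Apache" once and the branching lives in one place, which is the extra layer of abstraction being asked for here.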
I don't get it, because you have to run something, register a variable out of it, and then I need to do something with that. But I never know what I actually need out of that variable. So I just register a variable and then debug a bunch of times. I just go through my shell. I'm like, uh, I think this is the one I need. I'm going to do a little JSON output of the var and then grep that, because I have no clue which one I actually needed to do something with.
Then you want to use that variable for something. Ansible uses Jinja, and it's directly in the playbook. But it's kind of Jinja sometimes. Sometimes it's more of a, well, that's not really enclosed in Jinja brackets. So is that Jinja? It just piped it to "skipped". Is that the variable being skipped? No. The task was skipped. And so you have to make sure you get that in your head. But then it's like, well, let me cast that variable. My output was one, but that's actually a string. So I always have to cast it to an int and then test if it was the right thing. And sometimes it's just, is it Jinja? Is it not? Is it YAML? And if the YAML starts breaking down, you probably should get something a little more specific as a language. Like Puppet, I don't know. It's like this, where it's like, oh, I need to do this thing and put this file in place. And this is a lovely example from the documentation on getting some escaping right there. I don't think that's better.
Ansible is just... sometimes. I mean, if you have a few boxes, cool, get going, and you're up and running, and there are no agents, right? SSH is kind of an agent. But what's up with five forks? Five servers at a time. I mean, let's say I have a thousand. That's going to take forever.
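A rough sketch of the arithmetic behind the five-forks complaint, in Python. It models a run as fixed waves of `forks` hosts, which is a simplification and not how Ansible actually schedules work. The function names and the three seconds per host are assumptions made up for illustration.

```python
import math


def batches_needed(hosts: int, forks: int) -> int:
    """Number of waves needed if only `forks` hosts run at the same time."""
    return math.ceil(hosts / forks)


def estimated_minutes(hosts: int, forks: int, seconds_per_host: float) -> float:
    """Wall-clock estimate for one task, assuming every host takes the same time."""
    return batches_needed(hosts, forks) * seconds_per_host / 60


if __name__ == "__main__":
    # Assumption: a single task takes about 3 seconds on each host.
    print(batches_needed(1000, 5))         # 200 waves with 5 forks
    print(estimated_minutes(1000, 5, 3))   # 10.0 minutes for one task
    print(estimated_minutes(1000, 50, 3))  # 1.0 minute with 50 forks
```

Under those assumptions a single task across a thousand hosts takes about ten minutes at five forks, and a playbook is many tasks, which is why the setting gets raised quickly as the host count grows.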
And yes, you can change it a little bit, but no one starts at this low of a bar and then just jumps all the way. And then they added things, like accelerated mode. It's like, oh, accelerated mode, that's faster. It's like, well, sort of. But it's not. Unless you've got EL6. Everyone has EL6. But then I was really sad when they came out with accelerated mode and fireball mode was gone. Fireball mode was awesome just because they named it fireball. I'm going to run all this in fireball mode, because that's just better. Okay, so for Ansible 3, I have a name suggestion. Maybe we can't use fireball because it's going to confuse people. But Hadouken is a really good name. We can just go, Hadouken mode, and it's faster, and I think that would work pretty well.
And then there are other things you've got to turn on to get things to work better, and as soon as I get to 100 servers, managing SSH config for 100 boxes is just not what it's designed to do. I can't import an ssh.d folder of every server, and it gets really confusing which box is which, and how do I connect to each one, and then you might have different keys for different people, and it gets to be kind of a mess managing all that SSH. It's good one-on-one.
And so, you know, Salt. Well, it's easier to write, right? There's no SSH. Well, sort of. But they must have solved these problems, right? So how do they solve that complex YAML stuff? Right? They've got YAML. They do similar things to Ansible, but they inherently say, no, no, no, it's Jinja2 and then YAML, so we're going to filter that in to render one thing and then render another thing and then make it a thing. But then it's like, well, you can solve that because you can just write Python. You can just write Python right in there, and that's great. But then there are multiple Pythons. It's like, well, which one did I want? And it's like, well, write everything in the first one.
When you realize it's wrong, write everything in the next one. And just keep going from there. Maybe I'll do one of the other ones. It's like, what? Maybe not. And they have other YAML. Oh, yamlex, that sounds cool. That's a Salt YAML. It's like one step below Puppet code, but one step above, I don't know, like, HD. And then there's really obscure stuff. If you have config and Cheetah, I'm very sorry, you must be managing Windows boxes or something.
But just in case, for Salt: hey, if config management doesn't work out for you guys, I have a pivot you could do. You could change from doing config management stuff and focus on these renderers. I think you've got something going on there, the way you can pipe them together. How many people wouldn't love Markdown to Confluence? I mean, just take Markdown files right into Confluence documentation. That's great, you know? And then for anyone else, there's one other step. Email, right to Excel, and then give me a state out of it that I can just deploy somewhere. That is business right there. You can just render that: search my email for any time server A was mentioned, stick it in this Excel document, and give me a state file. That'd be cool. And then they go one step further in their documentation, where it's like, you could write HTML or Puppet files. I want to write my Puppet manifest and then have Salt put that out somewhere. But they have that as an example and then it doesn't exist. So it's actually a little sad. It's like, hey, you could do this, and they don't.
But, you know, let's look at Salt a little more. So Chef's pretty mature, right? Less than 400 bugs, you know, issues on GitHub, so that's pretty good. Ansible's a little less mature. Yeah, a little bit more. Salt's a little less mature than both of those.
Let's just say, I mean, a little less mature. Three, maybe. All right, that's a lot of bugs. I might run into one of those bugs when I'm deploying stuff, let's just say. And then I feel like with Salt's documentation, they took all this cool documentation and then they gave it to Puppet to order it. You just do something, and they just deployed it, and it's, okay, that works. No, it didn't. And then this is everywhere: hey, examples are in the code. You mean that code with 3,200 bugs? Let me just go read all those bugs, and let me triage those as I'm going, and I'll set this up and we're good. No, I don't want to just read the code every time. So then you're like, okay, well, let's find some common states a little further on, because, examples for states, Salt, I'm sorry: a Git repo is not a good place for just a bunch of states that you wrote. Chef actually does have this bit, and Puppet has a ton in the Forge. Great.
So let's go right to Chef. I'm sorry, Chris, I might. It's delightful. So, Supermarket, right? It's a good place. But why is nothing namespaced? I'm on Supermarket, I'm looking for a recipe, and I have one Apache module, but you can't name yours something else, because there's only one that can be Apache. Nothing else can be called Apache. People name them different things, and, oh well, there are other Apache things. So you end up just searching GitHub again, and you hope someone wrote "recipe" or "Chef" somewhere in their GitHub repo, because that's the only way you're going to find these examples. And Chef does a profile thing, you know, they figure out a little bit of... I mean, ordering's top to bottom, right? On the page, as you write it. That's good. Except stuff like this doesn't work. And it's like, well, why not?
It's like, well, you compile, and when you compiled, that file didn't exist. So the execute never ran. It put the file there, but then it didn't run it. So you have to trick it. And you're like, oh, always execute this. It's like, psych, only if that's there. Don't worry. I always want you to put that there, but then don't always run it. There are still these weird oddities, because, I mean, Chef is straight Ruby. So I hope you like Ruby, because it's good. And then Chef wants you to do things with Chef Server, obviously. I've got to skip some other stuff because of time. Chef Solo, really, everyone just does it. It's better. These disadvantages, I'm being sarcastic. These aren't necessarily disadvantages in a lot of cases, but it just works better with Chef Solo. Writing your own resources and providers. A little bit here.
So, here's a surprise, the theme for DevOps Day right here. I'll wrap it all back up together. You should still use one of these, because seriously, you'd be stupid not to. That is the surprise. Yes, sometimes they suck, but look, this is what you had before, isn't it? You had an Excel document with all your servers in it, and then hardware, and, oh hey, I changed this server, ran this on this box, so email it out to the team and hopefully they'll save it and then edit that one. It's like, no, that sucks. That was so bad. Well, maybe I can just do something like direct shell scripts, because I don't need all this abstraction stuff. It's like, well, people sort of did. This exists. It's Waffles: config management in bash. No, like that. And I mean, usually SCaLE has kids around, so I censored some of this stuff, but seriously, this is another one.
It's still written in Ruby, but it's just shell scripts all the way down, and I already have a folder of shell scripts. I don't need to just deploy all those everywhere. That was stuff.
And so here's the real reason why you need to use these tools. Why is this better than Waffles and doing your own thing and how you were doing it before? Seriously, meatspace people do not scale. I cannot do this on my own. I can grow my team, and then we run out of space, and no, that still didn't work, because no one knows what the other people are doing, and it just doesn't work. So you need this stuff to scale. If you're going to go anywhere with even 50 boxes, use one of these tools.
And seriously, they're always consistent. The broken bits are always broken, and it's great, because you can rely on it. You can just say, oh, I know what you're doing now, and you can work around some of those things, because it works, and it always works the same way until you've made your version upgrade. Um, yeah. Then it's very different.
And then it is seriously just reliable. When was the last time you told the DB admin, hey, I need you to migrate this thing? It's like, "Oh, I'll get to it, you know, sometime." It's like, well, I need it before lunch. Well, some of these things give you a 30-minute window. You don't know exactly when it's going to run. But you don't know when the DBA is going to do it, either. At least with the tool, you can narrow down some of that scope of variability, and it's really reliable. And it just works, you know, it works better when it's consistent that way. Man, there are more things.
And then the number one thing is still this: you should use one of these tools because you are going to be an angry sysadmin when you're old if you don't, because this whole room of people, this community, is the reason for these types of tools.
And it is great to have people, compared with just your bash script in your basement, because everyone's in the basement. But you have people that you can reach out to, and you can work together with people on solving these issues, not just for yourself, but so the next person that comes along next year can benefit from what you did. And that's great, because that is what we all really need when we're just kind of supporting the system. So, that's my talk. Thank you.