So it's pretty exciting to be here. This is a fun topic for me to talk about, and one I feel pretty passionate about, because I've seen it done badly so often. And I've done it badly as well. Some of you may know me, some of you may not. I have a blog, and I podcast as part of the Ship Show podcast, which covers release engineering and all kinds of other stuff as well. You can find me on Twitter at sascha underscore d. I also recently started working for Opscode, which I like to disclose at these conferences, because I was accepted to speak before I was an Opscode employee. So, good to know. I've always been pretty suspicious of vendors, so I like to let people know that I am now actually a vendor myself, which I find hilarious. OK, so, credentials: what qualifies me to tell you anything at all? I've been doing configuration management since 2010, when I met Chef. I had no idea tools like this existed at all, and I was like, where have you been all my life? It's pretty much been configuration management ever since. I also have a long background in systems, mostly retail web ops, and I have seen a lot of inconsistent environments over the years. So I have a lot of good stories and anecdotes for you, and I think that will be fun today. I want to talk about why we want consistent environments, some of the things that can go wrong when you don't have them, and then a few ways to make them homogeneous across the board. So we have a problem: our environments run amok. A lot of times what happens is we build an app, and as it promotes through environments, we refine those environments. But a lot of times the refinements don't make it backwards. Production, everybody cares about production.
But people care less about other environments. As we make things better and better in production, those improvements often don't get migrated back. Or different teams are responsible for different environments, and they don't communicate, so you don't really know what's happening in production or how things work up there. You just kind of figure it out back in dev and stage. And this doesn't fly if you want to do continuous integration, or continuous delivery, or continuous anything, because nothing is going to be continuous if your stuff is different everywhere. You're constantly going to be stopping to figure out why things are different, and things are never going to work reliably. So: environments run amok. One of the ways we can cope with this is working with configuration management tools. They help us homogenize across environments and make our process more consistent. Because what we end up with is this: the larger your organization, a lot of times, the less communication you do, the less you know about other environments, and the more things drift out of sync over time. Next, just quickly: you may think I'm up here to talk to you about Chef, and I'm not, really. Everybody knew, even before I started working for Opscode, that Chef was my preferred tool and that I like it a lot. But I have a real passion for configuration management in general, and I think you should be using it. It doesn't matter to me as much which tool you're using as the fact that you really need to be using one. So I'm not here to talk about Chef. A couple of my examples are going to be in Chef, but you should be able to do anything I'm talking about with any of the major configuration management tools. So don't worry too much about that.
It's really about what you do with the tool. Don't worry about which one you have, or which one I'm talking about here, or what the examples look like. So, environments: who cares? Environments are where your app lives. If you're a dev, you should care about more than just what you wrote your code in. The environment is what you put your app in, and if it's not the same everywhere, you're going to have problems. So you have to care about other things too: not only operating system consistency, but integrations, connections, databases, all sorts of things. It's important to realize that when you have a dev environment that is different from test, which is different from production, you can run into problems. So what kind of problems can you have? Does anybody here not believe me, by the way, that you need the same environment? Because I have had these discussions with devs who are like, well, we don't care what the deployment looks like in production. Or, we're just going to build our JBoss JVMs like this; we don't care that they have nothing to do with production. I have had these discussions for real. OK, good. I'm glad everybody believes me, at least, and that we all agree. In the meantime, let's talk about a few fun things that can happen when you have out-of-sync environments and process. Eventually something gets promoted to, say, stage or production, and somebody else is doing your deployment. All of a sudden they're like, well, I tried to do this thing, but it didn't work. And it turns out there were actually three other steps that never got into the enormous spreadsheet that's being passed around. Or you might have deployed something by hand and forgotten about it, and then done it in the next environment and forgotten about it. And now it's prod, and it's 3 a.m., and we've forgotten to do it again.
So that happens. Or, and I have seen this several times, the wrong database connection information is deployed to production, or a bad password, like the dev password for something, ends up in production instead. A 3 a.m. outage is no fun for anybody. Somebody fixed an SMTP server setting on a server by hand and then forgot about it. So when we build another server a couple of weeks later, that server also has the wrong SMTP setting, until somebody comes in and fixes it by hand again. And we keep forgetting to do this over and over, and it becomes one of those things that never gets done. This is one of my favorites. It was really obscure: we were migrating a giant website, and one of the apps had really special, weird SSL configs, where the Apache server had to talk to the Java app with special settings, and nobody knew how it was set up. Nobody knew how to make it work. It just kind of worked in test, and it had for years. It took five people three weeks to get that app working when we rebuilt it. It was just amazing. Something like this, once you have the configs in your configuration management, is so much easier. Nobody has to guess at stuff. And then the last one you see a lot is people pushing SSH keys by hand, whether that's for batch jobs or general administration or who even knows what. This happens all the time. You end up with a giant list of SSH keys that you have to maintain but that are never maintained anywhere, except in some directory on an admin server somewhere, right? Not even someplace safe. So what happened? Applications are complex and organic, things grow, and they can get out of control. And people don't give as much love to non-production environments. This is really a problem, and you really do need to care about it. So I really feel like we have two failures here.
One I'm not going to talk about too much, and the other I'm going to talk about a lot. The first is a failure of communication, because people don't talk. The devs don't talk to the people running production, especially. Sometimes it's possible that you're also managing your own production, but it's pretty rare, and the bigger your company, the less likely that is, and the more layers there are between you and the people managing your production infrastructure. People don't talk. They don't know what's happening. They don't want to know. They don't want to talk. Half the time, they don't even like each other. That's a problem. I actually thought that was going to be funny, because it's just life in a big company, right? Ownership is the other thing. Nobody wants to own dev, because it's gross. It's a mess, right? So those are the two things: communication and ownership. First of all, people have got to talk, or none of this is going to work. You've got to talk to the people you don't like, and you may actually find that they're not that bad. And I'm not even talking about DevOps here. This is not DevOps; this is basic human communication. You've got to talk to people, or nothing is ever going to work. That's really all I have to say about that, because there have probably already been a lot of really great sessions today on organizational stuff and communication. So I'm going to move on to what I feel is more important, which is technical ownership of non-production environments. Dev and test are really kind of the Cinderellas, right? Nobody wants to own them. They get out of control really fast. They become a mess, nobody really understands them, they kind of work and they kind of don't, people don't want to touch them, and they get more brittle over time, because it's just a mess. Because they're complex.
As you get more apps and more people working on stuff, you just never know. Manual deployments happen in dev all the time, and when things don't work, people just kind of copy files around and figure stuff out. They're unloved. Nobody loves dev. Oftentimes dev is somebody's responsibility among 20 other things they're working on, right? Or the dev team is responsible for dev, and they just want to write code, so if it kind of works, that's cool. And over time they become unlovable, because of that brittleness: they get worse and worse, the ownership becomes less and less, and at the end of the day we have nothing, because nobody wants to own this stuff, and so hopefully it just kind of works. But unfortunately, they're really necessary for us. Lower environments. Who in here is ops? I'm just curious. Devs? Business? All right, so this is the thing, especially for ops. We have to remember this: dev is production for dev. These people are using it all day long. Sometimes they're using it all night long. Lower environments are production for QA, for developers, for user acceptance, and a lot of other people. Letting them languish in a state of barely usable existence is really not going to work out long term, because it causes development teams to lose hours and days trying to figure out why things don't work. And this is a memo to dev too, right? Somebody has to own this stuff, and you have to care about it. You can't just keep making the release engineer figure out why stuff isn't working, because eventually the release engineer is going to quit. And what happens is that everybody using these environments is miserable. They have to use them all day long, and eventually what you have are miserable people, because all the people who want to be happy are going to leave.
So there are so many things in your environments that are not code. A lot of times what we find is that people are putting integration information and other stuff right into the code base, and we want to avoid that. None of these things belong in your build. Packages and versions, obviously not, although once in a while some things do get packaged up. Also mail server information, multiple data center information, database users and passwords, connection strings, integration URLs, batch jobs, all of these things. These should never be in your code base. They should never be in your app. You shouldn't have to rebuild your app because somebody changed the database password, or somebody is changing a connection string for whatever reason, or you added a third data center with HA stuff. You should never have to do this. Get that stuff outside of your app. This information is always changing, and it's really hard to know what the definitive information is, especially if you're in dev, because devs are always the last to know what they're supposed to be connected to. So what are you going to do with all this information that people have been configuring by hand and keeping around as tribal knowledge? There's probably a config set on the admin server with all the integration information, or a wiki page that hasn't been updated in six months, or a spreadsheet from last year's project with integration lists and ports. Configuration management offers you a way that is reliable, consistent, and visible to people. One of the really great things about configuration management tools is that we now all agree this information should be in version control. Everybody can see it. I can see how the servers are built. I can see how things are configured.
I can see how people are constructing database strings. They're not in the code base, and if I need to make something new, I can actually go look and see how things are configured. That's really empowering, regardless of what group you're in. So, some of the things we can do with configuration management: we can eliminate mistakes. A lot of the things I was just talking about were mistakes. People do things by hand, they make typos, they make a mistake during the build, and all of a sudden you have the dev password in the production WAR file, something like that. We can ensure consistency across environments, across servers, across data centers. We can automate the complexity. The complexity is never going to go away entirely, but we can get our fingers out of it. The fewer things we do by hand, the more likely it is that life is going to be good for all of us and things are going to make it to production in one piece and work. And we can talk about separation of duties. Is anybody from compliance here? We can talk about separation of duties and why configuration management makes it easier. So: we can eliminate mistakes with configuration management, because every manual step in your environment is an invitation to human error. It's not malicious. People don't make mistakes on purpose; people don't make trouble because they're angry. You just make mistakes. My life is typos. I spend more time fixing typos in my code than I do writing code. Really, that's just the way it is on a lot of days. Tools exist for a reason. Get the manual things out of your pipeline. Don't have people running things by hand. Jenkins exists for a reason. Configuration management tools exist for a reason: Puppet, Chef, CFEngine, whatever.
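To make the "in version control, visible to everybody" idea concrete, here's a sketch of what environment-specific data could look like as a Chef environment file. This is not from the talk; all the attribute names, hostnames, and values are illustrative:

```ruby
# environments/production.rb -- environment-specific data kept in
# version control instead of baked into the app's code base.
# Every name and value here is illustrative.
name 'production'
description 'Production environment'

default_attributes(
  'myapp' => {
    'db_host'         => 'db01.prod.example.com',
    'smtp_server'     => 'smtp.prod.example.com',
    'integration_url' => 'https://partner.example.com/api'
  }
)
```

A matching `environments/dev.rb` would carry the dev values, so anyone can diff the two files to see exactly how environments differ, instead of digging through a stale wiki page or a spreadsheet.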
And the fewer things we touch with our hands, the less likely it is that we're going to break something. So it's really important to think about stuff like this. What about that Apache config I was talking about? If somebody had written that into Chef or Puppet ahead of time, people would know about it. Even if it wasn't still accurate, it would at least give people a clue as to how things were configured, as opposed to the 30 backup files in the config directory where nobody knew which one was the actual right one. So get your people out of the pipeline. The fewer fingers you have in there, the better things are going to be. And honestly, it's more fun to write pipeline code than it is to run pipeline scripts, really. The other thing we can do is ensure consistency, because a lot of times what happens is the ops team is building production one way, and who knows how the dev team is getting their kickstarts or their server builds. They're getting them however they get them. They might get a server delivered one day with one version of the OS on it, and three weeks later they might get another server with something else. You just never know. So one of the things we can do with configuration management is stop building our servers by hand. I don't actually think most of us are doing this anymore; I think we've mostly evolved beyond that. But stop making people guess how things should be configured. If you are building things in production one way, migrate that back; make that information freely available and easy to use. Don't just point people at a kickstart server. Actually, that often won't even happen, because it's production and people can't touch production kickstart servers, right?
So there should be a way for people to bootstrap servers in dev that mimics production without actually giving them access to production. There are ways to do that. A lot of times what I find is that there's a dev kickstart server that gets no love. It was configured two years ago, and people are just re-kickstarting things with the two-year-old kickstart and then fixing stuff by hand. Because that never happens, right? So: we can ensure consistency. We can make things easy, fast, reliable. One OS to rule them all. Make sure your OS is the same everywhere. It's such a simple thing to do, but it's so powerful to know that your operating system is the same from one environment to the next. Otherwise packages run amok all the time. Versions of things run amok. Even kernel versions are sometimes different. And trying to troubleshoot that stuff in production is the worst, because it's so low-level that a lot of times you're not going to notice until you get to production and you have a performance problem and you don't know why. And then all of a sudden someone's writing scripts to compare package versions across 30 servers. Has anybody ever done that? I've worked with people who've done it. I haven't had to do it myself, but somebody on my team has. So: ensure consistency. Packages. I have a real passion for package management because I hate it, right? I hate package management. I hate that I'm still stuck at this level of the operating system, because I would much rather be doing stuff at the application level and making beautiful deployments. So, yum install Tomcat with your keyboard: don't do that. Don't download some random Tomcat and put it on a download server somewhere locally. Just don't do it anymore.
Find the package you want, and use your version control and your configuration management to ensure that you have the one you want and that it's the same everywhere. This is a Chef snippet; I think Puppet looks almost identical for the most part. Puppet uses ensure, but it's pretty similar. Or you can even define your version outside of your code, so you can have production versions and also the version you're bumping in dev while you're testing, and you can ensure that you're testing separately from what's happening in production. But you still know what environment is on what version at all times, because that information is defined in your tool and it's in version control. This is a big thing for me: locate your environment-specific configurations outside the code base. I've said this a few times now, but I keep wanting to say it, because people keep putting this stuff in the code base, and then we have to rebuild the WAR file every time something changes, and I'm like, why do you want to do that? Why? Why do you want to manage any of those things? Get all of it out of your code base. This is one of the biggest things you can do to make everybody's life happier. Abstract all of your data out of the code base. Put it in Chef, put it in Puppet, put it in a JAR file in Artifactory. I've seen all of these done; you don't even have to have a config management tool for this, though I prefer one myself. But get it out of the code. Do not rebuild your WAR file when connection data changes, when an integration URL changes, when you add a new one. You don't want to do that. Anybody want to do that? Just checking. All right: we can automate our complexity. We can use configuration management to automate our deployments, to a certain extent.
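The Chef snippet mentioned above isn't reproduced in this transcript, but pinning a package version in Chef looks roughly like this. The package name, version string, and attribute name are illustrative:

```ruby
# Pin an exact, known package version rather than taking whatever
# the mirror happens to have today.
package 'tomcat6' do
  version '6.0.24-78.el6'   # illustrative version/release string
  action :install
end

# Better still: drive the version from an attribute defined outside
# the recipe, so each environment can pin its own, e.g.
#   production -> node['myapp']['tomcat_version'] = '6.0.24-78.el6'
#   dev        -> node['myapp']['tomcat_version'] = '6.0.35-1.el6'
package 'tomcat6-attr-driven' do
  package_name 'tomcat6'
  version node['myapp']['tomcat_version']
  action :install
end
```

With the attribute-driven form, bumping the version in dev while production stays pinned is just an edit to the dev environment's attributes, all visible in version control.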
We are never going to get rid of complexity. Deployments are complex creatures. There are a lot of moving parts: app servers, web servers, load balancers, cache servers, database schemas, all of them versioned, and they often require some orchestration to go together. But you can do quite a bit to make your app and your infrastructure self-aware around deployments, especially on the simpler side of things, instead of running around to servers the way you do now: you've got this big long checklist, you've got to turn off the app servers, you've got to make sure nobody's taking traffic, you've got to make sure everything is ready to go, that the database is at the right schema number, and that everything is pointing to the right thing, before you actually run the deployment scripts that may or may not work depending on the moon phase or whatever it is this week. What this will do is make your app just a little bit more stable with a little less hands-on work. So, app versions: one way you can do this is to have an app version attribute in your tool. You have app-whatever 1.2.2, and everybody knows that's the version running in production. And if you bump that version, that should trigger several things: the infrastructure starts looking for the new instance of your app version, and it understands how to stop taking traffic so it can actually do the deploy and restart itself, depending on the complexity and on how much you can communicate between teams. Because this does require a fair amount of trust, but for continuous delivery and things like that, this kind of stuff is key.
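The app-version-attribute pattern just described might be sketched like this in Chef. The paths, artifact URL, and attribute names are all assumptions for illustration, not anything from the talk:

```ruby
# A sketch of attribute-driven deployment: bumping the version attribute
# (for example, in an environment file) makes the next Chef run fetch
# and deploy that artifact. All names and URLs here are illustrative.
app_version = node['myapp']['version']   # e.g. '1.2.2' in production

# Fetch the versioned artifact only if it isn't already in place.
remote_file "/opt/myapp/myapp-#{app_version}.war" do
  source "https://artifacts.example.com/myapp/myapp-#{app_version}.war"
  notifies :restart, 'service[tomcat]'   # restart only when a new version lands
end

service 'tomcat' do
  action :nothing   # restarted only when notified by a new artifact
end
```

In a real setup you'd also drain traffic (take the node out of the load balancer) before the restart; that orchestration is exactly the trust-and-communication part the talk mentions.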
The ability to make your app infrastructure self-aware and fault-tolerant is key to a lot of what you want to do with continuous delivery. You can't do it without trust and without some automation. You really cannot. So finally, anybody here in compliance or security? No? Well, this isn't nearly as funny then. But what are the big things today? PCI: devs can't have access to production database passwords. It's a big deal, and it means there's all kinds of voodoo dancing around the fact that we have to be really careful with database passwords for production stuff that touches credit card data. Also Sarbanes-Oxley, and all kinds of healthcare stuff. Everybody has stuff that nobody can touch, right? Except for the chosen few. A lot of what this allows you to do, again, when you abstract all of your configuration data out of the app, is let devs manage the dev password, and let whoever's managing production, whether that's ops or somebody else, manage the production password in a way that keeps people from knowing it unless they're the person who has to know it. But it doesn't stop the rest of you from managing your passwords in other environments. That's a really cool thing, especially in this day and age of having so many compliance initiatives to adhere to. The more of this we can automate, the less it's going to slow us down, and the less you have to call somebody at 3 a.m. to get a password out of the password vault: a literal safe, right? I've done this with Oracle databases and PCI stuff, where you have to get somebody at 3 a.m. to go to the actual safe with somebody else and get passwords out to reset them when they don't work. So it's a big deal.
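One way to get the per-environment password separation just described, if you're in Chef, is encrypted data bags: devs hold the decryption key for the dev item, ops holds the key for production, and no recipe or WAR file ever contains a literal password. This is a hedged sketch; the data bag, item, and field names are illustrative:

```ruby
# A sketch of per-environment secrets with Chef encrypted data bags.
# Assumes a 'passwords' data bag with one encrypted item per environment
# (e.g. items named 'dev', 'stage', 'production') -- all illustrative.
secrets = Chef::EncryptedDataBagItem.load('passwords', node.chef_environment)

# Render the app's DB config from the decrypted values; the template
# source 'database.properties.erb' lives in the cookbook, in version
# control, with no secrets in it.
template '/opt/myapp/database.properties' do
  source 'database.properties.erb'
  variables(
    db_user:     secrets['db_user'],
    db_password: secrets['db_password']
  )
  sensitive true   # keep the rendered secrets out of Chef's logs
end
```

Because the item is looked up by `node.chef_environment`, the same recipe works everywhere, and access is controlled by who holds which encryption key rather than by who can read the code.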
Being able to separate these out is a big win for a lot of people, as far as speeding things up and not slowing people down in dev. So anyway, what you really get with all of this is confidence, and the ability to move a lot faster knowing that everything you're working with is solid beneath you, instead of having to wonder all the time whether you're going to have to stop and figure out if maybe you have a package version mismatch, or if iptables is all of a sudden somehow on on some of these servers because somebody didn't disable it during kickstart or whatever. The big things you get out of this are stability, confidence, and the ability to focus on what's fun, because really none of this is fun. OK, time for a few questions. I would love to take questions. Any questions? Hi, I've got a question. So, I agree with taking configuration out of the code: you shouldn't have to rebuild your code when you change your configuration. But how far are you taking it out? You mentioned that yum install Tomcat is no good, because you have no idea where it's coming from or what version you're getting: be very explicit. But that explicit example, where you were defining the package version and release number: where does that live? Do you support central storage of that, or does it live as a kind of manifest or recipe with the code base, in the branch or stream you're working in? Well, I believe that devs shouldn't have to care about versions of Tomcat. They should be able to just provision themselves a VM with the current sanctioned version of Tomcat in the organization, really. And I think that's cheap. But you might have to think about it if you happen to make changes in your code as a result of a newer version.
Then, in the branch you're working in, you're going to need to make sure that when you deploy, you're deploying with the newer version of Tomcat versus the old one that's still in production. So to that extent, you do need to be aware. So that's, I guess, my question: where do you support storing that level of infrastructure configuration, the kind that's required for the app and may be specific to application code changes? In your tool, absolutely. What happens is that in all of the major tools we use, you set a base attribute, and there are ways to override it at application-specific levels. So you can have your organizationally sanctioned version, and you may have a later version, or even an earlier one, for a specific app, and you can override that at different levels in the tools: at an application level, at an environment level. So really, what we're talking about is that you have OS-level and package-level configs that you want to deploy to your servers, and then you have, say, an application config that you would also deploy, which might carry that application's integration information and database information. And you can override a Tomcat version in those areas as well. You really do want to separate operating-system-level, infrastructure-level configs from applications, because most of the time your OS and infrastructure are going to span dev teams and organizations. But you still want app teams to be able to have specific configuration data for their servers, for sure. Thanks. I think we have time for one more. Hi. So, the problem with our dev tier is not that it's not in Puppet. It's in Puppet; it's all configured, it's all nice.
But it's a dev tier, so devs have access to it, and they're doing experiments on it. So it's completely unstable, not because it's not in configuration management, but because engineers are constantly tinkering with it for their projects and breaking other people's projects. How do you deal with that? I have done this, right? The first year I worked with Chef, I was constantly getting, "Sascha, I tried to create a VM, but it doesn't work." And I'd be like, oh yeah, because I was updating the network cookbook, and obviously it worked for me, but I missed a config something somewhere. I was constantly breaking that. So it's important to remember, when we're doing this stuff, that again, dev is production for somebody. What you need to do, and I don't actually know how Puppet handles this specifically, but in Chef it's to version aggressively. We version our cookbooks and everything, and we pin stuff to versions: we make sure we pin environments to versions. Environments in Chef have cookbooks listed, and we pin versions of cookbooks to those environments. What that allows us to do is mindfully promote, especially to production, which is good. But it also allows us to play around in dev, while letting you pin your own little bit of dev to a known working configuration set, so whoever's doing the infrastructure experimenting doesn't break you on a daily basis. And then when they're ready, they can say, look, I have a stable version of this; you can use it or not, it's up to you. And honestly, I don't know how Puppet does that, but I would be surprised if it weren't possible.
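The environment pinning described in that answer looks roughly like this as a Chef environment file. The environment name, cookbook names, and version numbers are all illustrative:

```ruby
# environments/dev-stable.rb -- a sketch of pinning one slice of dev
# to known-good cookbook versions, so infrastructure experiments in
# other environments can't break it. Names/versions are illustrative.
name 'dev-stable'
description 'Dev pinned to a known working cookbook set'

cookbook_versions(
  'network' => '= 1.4.2',   # the experimenters can cut 1.5.x elsewhere
  'tomcat'  => '= 0.16.0',
  'myapp'   => '= 2.0.1'
)
```

Nodes in `dev-stable` only ever run those exact cookbook versions; when the experimenters declare a new version stable, promoting it is a one-line, version-controlled change to this file.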