 Oh, fuck yeah. But that kind of figured out, really, what was the next thing to train. And it sounds like you guys are going to say, why don't you chef this really? How do we enable you to do that? Yes, no, absolutely. That was very good. Talking of the real issues are, you know, what does it take to go to the next level? How do we get trained? Fantastic. No, I'm really sorry I couldn't make it. We had one of our guys who was actually leaving, so I had to keep for his going away. Yeah, no, fuck yeah. Thank you for taking me out. No problem. So thank you for catching up, and we will definitely honor that right now. I will promise you this. If the only thing that happens is that my happy ass gets in my car and drives down, down and gets together in the car, I will do that. As long as everybody knows, I use chef, I don't administer the chef. And it's always a funny conversation. There's administering the chef server, which someone has to do. Yeah. That's not me. Yeah. But I write lots and lots of chef shows all day long. I think part of it is just, honestly, taking advantage of y'all because we need to advance a lot. Yeah. Yeah. Yeah. Yeah. Yeah. Yeah. We need to find them. I kind of love these conferences, but a hat like that is completely and totally acceptable. Chef, let me put a feed of pile in there. But, you know, we've got some fire. It's like, you know what's going to happen with the meat. And you want to make that pile, or have it, and you can put it all together and generate a lot more. All right. All right. All right. All right. All right. Yeah. Yeah. Yeah. Yeah. Yeah. Yeah. Yeah. Yeah. I mean, you got a little micro. I mean, you think about that. I know, there's a man. I'm not sure how people... I think I have to go inside as well. Can you document it? Do you guys work together? Notification? Yeah. I don't know how we were. We've got release now. But I don't know if it's... they're emailing you. Where did he go? I'm sure they're going to... 
Yeah, I can't think of anything. That's fine. I've got more on the line. Oh, OK. That's a good one. Yeah, yeah. All right. All right. That really means everything. Oh, my god. Is that the end of something else? Good. So, I'm perfect. I'm here. I'm here. My chef trained in gigabyte. Yeah. Put a picture of you up there. I think that was from Rally of Life, either this year or last year. It's irony of it all. I'm going to see you go away. Yeah. All right. We're going to get started in about 10 minutes. Just a heads up to everybody. Well, it's time to see. So, I've got you guys plotted to do your one minute pitch just after Anderson. OK. Got a minute. Yeah. Go right for it. Yep. Hey. Hi, everybody. Is this a fine spot for us to do it? I mean, you can do that one, too. Like, just wherever you want. OK. I literally rolled it in this morning. We just found out that we're supposed to meet here as well. We're all hanging out in the room and chilling. We're like, oh, where are we? We will come in. Yeah. OK. OK. Good. Awesome. All right. Is that all you spent? 20 minutes. I think you're doing it. Yeah, you're doing it. Hey, Sam. One minute. OK. White papers to go. Test, test, test, test. You can hear me? Thank you very much. Oh, that's already live. OK. I'm going to turn this down until you are ready to talk. OK. Put that in the back pocket. Oh, yeah. Leave it on. When you talk, when you go up there. You'll do that on the board. Oh, yeah. In the back of the court behind. This way you're not next to that. Most of the time. How are you? You all right? OK. It'll either be a debacle or a really fun. Not the same thing. Yeah. Not like always. Happy to do it, John. What are you guys doing? I'm going to give him a tour of wherever. Oh, nice. They're the worst places to work. Yeah, that's what I'm thinking. You're all right. I thought you guys stopped that. Have you seen any videos of the rocket crashing on the thing, completely? I can easily shut down like five minutes earlier. 
I have a longer session anyway. Is that 45 minutes or something? Yeah. That's not what it says on the schedule. The schedule I have says half an hour, if you look on the website right now. And then it's like half an hour after that. And I've got to talk to him. How does this all work? How? Why don't you sync up with the other? Because I'm a bad person. It's your fault? Yeah, it's my fault. Everything is very much until I take responsibility for it. Well, we better get started so we can get... It'll be fun. The talk is being recorded. Whose phone is this? Decent phone.
All right, ladies and gentlemen, if you can come in and grab your seats, that would be awesome. Controlling the screen too? Or is it? It seems like you've got to display it. Oh, fancy. Why is it flickering? Let's try to see if we can do this real quick. Yeah, down the display. Well, let me change it. Yeah.
All right. How's everybody doing today? Ready to kick off the DevOps day? Yeah. I mean, is there a way to fix it? All right. All right, just by show of hands, who's been here before? How many people are here for the first time? How many people didn't raise their hands? If that helps. All right, so we're going to get started in just a moment or so here. A couple of things to be aware of. We've got coffee out behind us. Thank you to our sponsors. We're going to have a little one-minute sponsor presentation in between presentations from speakers. Also, how many people have heard of the theory of constraints? Four to go. For those of you that have, this is going to be a great study in why you need buffers, because I may have forgotten to add those. So this is going to be an awesome day. We're going to change things after lunch. We've got one presentation, and then we're going to jump into actual open spaces, which will be our first time here in LA doing open spaces. So we'll explain that when it comes. With that said, once Andrew's good to go video-wise, it is his.
Yeah, it's just not going to cooperate. I don't think it will, I don't think it needs after we'll change it. I think they're refreshing. I don't think it's bad, though. Does anyone know how computers work? It's not going to be after. We can just go. I was actually going to go without slides anyway, so I just have like a few silly ones. I'll put them up there, but they're not important. I mean, basically, it just shows my Twitter handle. Do we have to do that? We'll try one more time. What do you think? Displays? This says 60 on the sheet. Is it going through a splitter? It's going through their computer, and then it's going out the arm. So maybe there's something in there. All right, and then we'll call again, and we'll just go. It's got nothing. All right, screw it. How do I close this? Let's go back to that.
Since we're already a little bit over time, and Chris did not plan with proper breaks anyway, we should probably get started. How are you doing? Make DevOps great again. How many of you are wondering what I'm going to talk about? It would be amusing to do a bunch of things about, you know, the parody that is our political process today. But instead, what I want to do is, I'm going to say a few things, but then I want to have more of a dialogue, because I feel like, and I'll go to this slide, I feel like every DevOps talk has already been given. There's not really new stuff. There are some new tools, there are some new things we can play with, but the actual underpinnings, the principles, they're not changing. They're not new.
But at the same time, just to put it in perspective, and I know this isn't going to be as amazing as it would be if we weren't having a flicker, a lot of this starts to sound like pandas vomiting rainbows, which is to say that it's somewhat inactionable. People kind of do this pendulum that you see in the talks, where on one side people give a really nice tool talk and everyone complains, and they're like, well, there are too many tool talks, there's not enough talk about culture and people. And then on the other side, you have people who give culture and people talks, and then people on the other side are like, wow, there's not enough about the tech. You've got to talk about the tech, right? This pendulum, who's ever experienced this? Who's ever said any of those things, either of those things?
Every talk on DevOps was already given, basically in 2009, and mostly by me. Who's ever seen any of my other talks before? Who's ever seen my deployment talk from Velocity 2010? Who's seen "There Is No Talent Shortage"? So someone told me once that I give talks where it feels like it's going somewhere, and then they have no idea where we're going and they're really, really confused, and then at the end it all kind of made sense somehow. Has anyone ever had that ride? So I don't have many slides. I want to have a dialogue, I want to have a town hall, but I want to put up a few slides and quotes to frame it.
So this is the most important slide, obviously. This is the easiest way to find me and get my attention. Also, just to make sure: next time you see me, I might look different. And then I could look really different. So this is all pictures from the last five years. Just so you know. And then, very, very important, obviously: engage with my brand. So this is Charles Bukowski. Anyone a Bukowski fan? Yeah.
So this is a quote: the problem with the world is that the intelligent people are full of doubts, while the stupid ones are full of confidence. And there are a lot of tricks. A lot of tricks have to do with how you say things, how you communicate things. I could come up here, I could say some words, I could move my hands, I'll raise the pitch, dramatic pause. Sounds like I know what I'm talking about. I project confidence. You'll believe me. That's our democratic process, by the way, too. Right? This is old-school demagoguery, back to the Greeks.
So who's familiar with two phenomena? One is sort of well known and somewhat discussed, and that is the Dunning-Kruger effect. The Dunning-Kruger effect, for those of you who don't know, is a tendency for human beings, if you're an expert, to underestimate your relative expertise, and if you're not an expert, to overestimate your relative expertise. Right? There's also another phenomenon, and I think this is hardwired in our brains and kind of goes back to this demagoguery that I was talking about a moment ago. It is provably true that human beings are hardwired to prefer confidence to expertise. So you take the Dunning-Kruger effect, where the people who have the least experience or the least expertise have the most confidence, and then the natural human tendency to prefer confidence to expertise, and you can explain a lot of problems that you guys probably see every day.
So this is another one, and I changed the words. Here's one of the things I'd like to point out, and that is, DevOps actually, fuck it, DevOps sucks. It can twist your body, twist your mind. People kill themselves. So this is the guy who basically burned Atlanta. After all the reporters, everyone, were glorifying war, he's saying no one who's ever fired a shot wants this thing, right? I'll read it, dramatically. I am tired and sick of DevOps.
Its glory is all moonshine. It is only those who have neither fired a shot nor heard the shrieks and groans of the wounded who cry for blood, for vengeance, for desolation. DevOps is hell.
So there's tons of work that goes on in organizations, even organizations that are fairly high performing, that is not so awesome, right? I've been inside a lot of organizations, and I've helped a lot of organizations do what I feel like is improve. But at the end of the day, even people that I've seen, maybe especially people I've seen, give talks about how they're doing such a great job, I know for a fact it's basically a shit show, right? And that's one thing that I think I've come to realize: everything looks like a shit show from the inside. There are some things we can do better, there are things we can incrementally do better, but let's recognize what's actually happening here.
To me, what DevOps represents sits inside this larger theme of software, which is optimizing all of human experience, all of human performance, human decision making. Inside of this, we are using software to actually improve the work that we do, right? So that software cycle of improving human performance, of improving human experience, is part of this DevOps cycle. It's a small thing, and that brings me to another point I want to make, which is that it's not one thing, right? There are all these buzzwords going around where people are talking about continuous delivery. It actually pains my soul when someone wants to talk to me and says, we'd like to have a DevOps initiative. You're like, okay, cool. And then, after that, we're going to have a continuous delivery initiative. And you're like, oh god. Because it's all one thing, right? We can't really talk about a lot of this stuff in isolation. And I think it's sort of bad from my perspective, because my original goal was never to bring developers and operations together.
My actual goal as a human, as a person involved in the product that I was involved in, was to optimize systems. If you think about what it takes to optimize a system, to build architecture, there are a lot of people in this room who are probably pretty good at thinking about how this is going to communicate with that. We don't put the same kind of impetus into designing our organizations. My original ideas, which kind of led to a lot of these conversations, were really around how you build systems that are human systems. It doesn't just start and stop at DevOps, right? There's all this other stuff that has to happen to build a healthy system. You need product management, probably the hardest job in the building. You need to have some way to make money, hopefully. I mean, as long as we're going to play capitalism, we might as well play to win, right? We don't get to choose all the rules of the game.
But you have this system that you're embedded in, and it's sort of sad, at least to me, that you have people who are like, oh yeah, DevOps, developers and operations are working together, and it's like, fuck the sales guys. Like, fuck that. Man, those guys aren't even humans, right? Marketing, whatever. Everything outside of my little tribe, my little sphere, they're not humans. They're some other tribe. And if I get them, I'll kill them.
So here's the last quote, really, and then I kind of want to have a conversation. But this is the conversation that I want to frame. And that is another Bukowski quote. And I actually saw this on the park about driving in. Yeah, Bukowski, and I already put it the other way around. Knowledge without follow-through is worse than no knowledge at all. And I'm actually a little bit sad that people aren't as excited about the organizational learning stuff as I am. It's really awesome.
I believe this as authentically as I can believe anything: the difference in the organizations that I've seen be successful in trying to transform themselves is, by definition, a function of their capacity to learn as an organization. And there's a body of research around organizational learning, and a body of research around systems thinking, that I think is just sort of lying dormant. I try to introduce some of those conversations, but I feel like no one's having those conversations back with me yet, so maybe someone can help me out here, because it's all about me having conversations.
But this idea of learning means that you didn't actually learn anything until you've changed behaviors. Right? If you get information and it doesn't change your behavior, you didn't learn. Right? Is anyone an expert in anything in here? Anyone play a musical instrument? Does anyone play chess? So if I read something about chess, or I read something about music, but I can't play the game better or I can't play the music better, did I actually learn anything?
And so for the rest of the time, what I want to have is a dialogue with people in this room about what you've actually changed in the last five years. What behaviors have you actually changed? Because it's glorious that I can pay my mortgage by talking to people, but it's actually very, very frustrating too, because you end up having these conversations, and maybe you guys have these same conversations in your organizations. I also strongly believe that whenever anyone comes in from outside in a consultant role, there's someone inside the building who said the exact same words they're going to say the week before. What's that? Well, they did actually get paid for it.
That's the funny thing about a lot of these organizations. They're like, we need talent, we need intelligence, we're going to innovate, we're going to get the best people we can. And then, in the building, you're like, don't do anything except what we tell you. Right? It's awesome. So in the last five years, in these organizations, over and over and over, and it's always a little bit of a variation, they say: we have this intractable problem that there's no way to solve, because, you know, we made all these bad decisions over here, but we're not really willing to revisit them. So what we're going to do is give you this Gordian knot of stuff that you're supposed to somehow unravel, even though there are only one or two threads you can pull on. Right? Well, usually what ends up being the last conversation is: what we'd really like to know is, how can we get the DevOps without actually changing anything? Right?
And so that's why I want to come back to this dialogue about what you actually changed. So I'm going to ask people. I saw there's a mic, so I have a mic here. If anyone wants to take the mic, they can, and if no one takes the mic, I'll call on people, because I know some of your names. There are a couple of hands that went up. Yeah, there had to be a better algorithm for that. So basically, just, what have we changed in the past five years? What behaviors have you personally changed or organizationally changed?
Well, basically, you know, being a business owner, a lot of times I have to reevaluate how I do things. Over the past year I kind of got exposed to DevOps, and it was actually at the last SCaLE where I got exposed to DevOps. I recently started using Ansible to automate a lot of the stuff that I was doing manually, and I'm trying to improve my workflow by automating things that I understand. But at the same time, a lot of times all these technologies can be overwhelming, so I'm really trying to get a grasp of everything.
That's good. I have this conversation a few times too, where people are saying all this stuff, like, you go through and you automate stuff and you didn't actually get that much, because the thoughtfulness wasn't there. This is the problem most of us have here, or a lot of us have here, and I've probably helped contribute to it: it is often an anti-pattern to try to automate what already exists. You have an opportunity, when you're going to go to automation, to think about things. When you have something that's really complicated to automate, that's a smell, just like if you're a developer and there are things that are hard to test. Whenever the technology is resisting you, maybe you want to rethink it.
I worked for small company X. They got bought by big company Y about five years ago. At small company X we were DevOps-y and small, and we changed the behaviors we used to deploy code by exposing metrics. We changed the definition of done by getting developers to write metrics and their own deployment code. At company Y, when we were purchased, there was a huge silo between development and operations, insofar as people weren't even allowed to go between the buildings. So I'm here after a couple of years, I'm from Canada originally, and at company Y we changed the behavior between the development teams and the operations teams. Everybody can
free-flow, and now we're exposing metrics. It's just slowly taking a little bit longer to change. Metrics changed my life.
Metrics changed my life, right. So I do think there are some pillars, and this isn't something I just came up with, but I think everyone sort of in the inner circle of the DevOps crowd recognizes the CAMS or CALMS acronyms, right? So: culture, automation, I like adding lean because then you can say CALMS instead of CAMS, measurement, and then finally sharing. I think sharing is the biggest one for me. I think sharing happens on two dimensions that are very, very important, and in some places they only do one and not the other. The first one is the sharing that we do inside of the organization, and that is important. I wouldn't say it's more important, but it's definitely important. If you're trying to optimize the system and this side can't make decisions because it doesn't have information, that seems suboptimal. That doesn't seem like a big debate.
But then there's this global community, and this is the thing that I think is most exciting about DevOps as a movement. It's not that there are tools. Like, whatever, who cares, I can make a computer do something. I swear I could make computers do these things better with bash. Configuration management stuff we're seeing, like, okay, fine. But there's this global community of practice who are all sort of solving these problems, and they come to DevOps Days, they come to Velocity conferences, and there are a number of other conferences where they kind of gather. I think SCaLE in some sense is kind of like a proto-DevOps conference, along with a lot of Linux conferences. That ability to connect with people who have experience, who have solved some problems, or maybe they're just farther along on the problems that you're about to have, is a force multiplier that I think is totally underutilized and underestimated. I'm relatively intelligent, but the power that I have to solve problems is 10X more than it would be if I didn't have a list of people on my IM, my
Skype, my phone, my email who will answer my questions, who have already done that stuff. So how's the mic?
Howdy, folks. Howdy. I'm from Ticketmaster. One of the things that we've done, in particular last year, is we've actually spun up an initiative that we refer to internally as DevOps University. When you're coming from, as people have pointed out, a very heavily siloed environment, operations versus development versus whatnot, each of those becomes a stumbling block for the other. Operations is the stumbling block for things getting into production. Dev is a stumbling block for operations, because shit's falling over and there's not a complete feedback loop. So what we've been very aggressively doing is taking our operations and dev folks and putting them together. We do a week-long training course, spun up by this gentleman, who is probably embarrassed, but I'm pointing him out. We try to teach them all the basics of how to reach across the aisle, the best ways to self-service, and the best ways to drill into, you know, beginning development tricks for the operations folks or beginning operations tricks for the development folks. It's been a significant force multiplier for us over the last year. As a matter of fact, we've had events where, if we hadn't had this initiative in the company, we simply wouldn't have managed to pull them off. From that perspective it's been fantastic.
That's what I'm talking about. That's deliberate organizational learning. They're setting up explicitly to learn, to teach, to learn, to share. Who else does that? How many people solved the DevOps problem of having a silo of operations and a silo of development by putting a new silo called DevOps in the middle? And where the hell are the, like, engineers? Where do they go? Anyone with this story? This guy has a... okay, go, go.
I have started treating my infrastructure like code as often as humanly possible, and I've found that it leads to my team using the same toolset and techniques as the developers. I find that that leads to us understanding their pain points a lot more and getting resolution to the problems faster.
So the blog post that I wrote in 2010, what DevOps means to me, had three main points. The first one is that developers and sysadmins could and should work together, which is revolutionary, I know, in 2010. But in so many organizations, especially when you're coming out of places that had not really traditionally been delivering services, there's this transformation that has to happen when you go from "IT keeps the mail server going" to "IT is actually the value chain," and that really enables a lot of these conversations. So having those people work together matters. They were in some weird ITIL thing, frozen in the past. And then the other thing, which is to the point: everything is becoming API-driven. This was obvious in 2010 and it's even more obvious now. If I can provision machines with APIs, if I can configure machines with APIs, if I can take things in and out of Mario with APIs, if I can do all these things that used to be done some other way with an API, then writing to APIs looks suspiciously like software development. I hate to break it to you, but it looks suspiciously like software development, which means that you can leverage lots of tools, lots of process, lots of things that software developers have been doing for a while, and that you can start
to, as he pointed out, have some boundary objects, things you can share and have conversations about, because you start to have a common tongue. You don't see them as the other. It's like, oh yeah, there's this deep stack that no one of us can actually understand fully, and we need all of us to do it.
Hi, my name is Joel, and the main thing that we've done at Holt looked in the last two to three years... I think we've gone through the whole sort of thing, from full isolation to, oh, now we're working together. We had that middle silo you were talking about, then we sort of expanded the middle silo. Like: wrong, wrong, wrong, wrong, incrementally better, incrementally wrong, better. And so we got to the point today where my only mandate for my team is no crap work, and that's a big, big point to make, because writing software can be crap work, and automating things can be crap work. Crap work is loosely defined as things that you really don't feel like you should be doing, where there might be a better way. So every time something starts to smell like crap work, like, we have to automate this new technology again, you know, we have to do this again for containers, that's crap work, right? So really thinking, and being mindful of why we are doing something and what the, you know, alleged business value is, that's how we've been able to eliminate almost all the crap work. No one has to deal with Nagios, no one has to deal with a mail server, you know, no one has to stand up KVM boxes anymore. We've eliminated almost all the crap work, and we can focus on actually solving bigger, nastier problems.
I like it. I'm just going to add this one while I run. I started caring about family, and making sure that I was taking vacations, and that when it was time to go to dance practice with my daughter, I went. So that's what I've changed. And I saw someone else.
So I work for a school, and when I say school, I'm in
kindergarten through ninth grade, so generally there is no such thing as dev in this universe. There is only ops. The problem is that when you only have ops, you're at the mercy of the products that are available to you in the market, and they don't always solve your problem. So embracing DevOps for us meant creating a community of learning, so that we can bring dev into the house with the staff we already have. Every operations person is now also dev, because we're learning how to do it and how to make our own tools.
Awesome. There are a couple of conversations I've had, some of them with large government, three-letter, whatever, and they've got problems that are very similar to yours. They're actually trying to run software that they're getting from a third party. So lots of the tricks for coordinating and aligning people inside an organization get that much harder when those people are, by definition, in some other transactional relationship outside of the organization.
And with the things that were just said, I actually kind of want to go back in time. This is before Puppet was even called Puppet. I was a developer and Luke was a sysadmin, and we had a bunch of conversations. I was reading all the agile stuff, and there's this community of agile developers, and you know who some of them are. You're reading the blog posts of Michael Feathers and, you know, Raganwald or whatever. Maybe some of you guys know them and maybe some of you don't. But we very explicitly wanted to do two things. One is that there needs to be this community that elevates operations, right? In the dev world at that time, this is going back to like 2004, you kind of knew these people who had done these things. They were blogging about them, they were talking about them, and you could point to them, and they were giving conference talks. But there wasn't really, there was no Velocity
conference, there was no DevOps conference. So, like, who are these people that are doing these things? We know they exist, because we see these services. We know they exist, because they have to. So why isn't there this community? And I think that has happened, in a way that's like an existence proof: this room is full, basically. We have this community, and we can leverage that.
And then the other thing that we explicitly had as a goal is that you want people to be at the pub by four, right? Now, for Chris, that could be the dance recital by four, right? This kind of goes back to the point I was trying to make earlier, and I hope all of you guys understand this. It's not just about the job that you do. It's about improving human experience, improving human performance, right? I know we have a long way to go, and I know that lots of people go to work and they have mud and blood on their hands and the pager's going off till 3 a.m., whatever, but incrementally we're getting better. There are things that are better, and allowing you to go spend time with your family, allowing you to go to the pub, whatever you want to do with your time, that to me is an improvement in human experience. Do you have a question or comment? Here's one, in the front or back. Get them.
When we started doing DevOps seriously, all of the developers working on the team were quite busy doing, let's say, quote unquote, bigger work, and there were some smaller things that actually make the developers' lives easy that we didn't have time to do. Like incorporating some ChatOps, or writing some guides on how to deploy certain things, creating custom-made developer environments, et cetera. So, being an educator at heart, the solution that we found is actually hiring interns. We went to local universities and hired interns. And here's the thing: no seniors or juniors. No seniors or juniors. Hire green interns who did not start even
developing early, and you can actually shape them to make those tools for you. That's how we have ChatOps on our teams. That's how we have automation driven from code check-ins, et cetera. So, just a thought.
So I always love when people work in such a way that it helps you get better at what you're doing. I feel like, if you're really good at whatever it is, the skill acquisition of chess or music or whatever, then by playing the song, by playing the game, you should get better. So we should all be getting better. So bringing green people in, to me, that's an example of road musician learning.
I actually want to bring it back to something beneath it, because I was cringing a little bit when he said crap work, because a lot of the stuff that you're talking about is super necessary. If you're serving the larger purpose of building the greatest possible cathedral, the biggest possible organization you can, sometimes you do things where in the moment you know, this is not fun for me, this is not what I want to do, but you do it because that's what the organization needs from you at that time. Or maybe you get a bunch of interns to do it for you.
So you mentioned organizational learning a couple of times, and I heard some of those things about organizational behavior. In terms of making DevOps great again, can you speak to what has been effective for organizational learning?
So there's a talk I gave in 2013 at Velocity which has a paper with the seven dimensions of organizational learning, from Watkins and Marsick. I feel like if you take that questionnaire and reverse the questions and make them statements, that gives you very actionable ways to improve your organizational learning. That's another talk. Have you seen that talk? Has anyone seen that talk?
John Kiss. So basically that framework, it's from the 90s. So this is the thing that's crazy: I stumbled on this paper that someone wrote for his dissertation, where he measured the organizational learning that correlated with job satisfaction in this R&D department in Taiwan. It's the kind of stuff you do for a PhD or whatever. And then I unraveled all the references back to this model that these two women professors had made in the 90s, and I felt like I had found the Holy Grail. Because the thing I got frustrated with, trying to help people along the way, is that two different organizations might look exactly the same for all intents and purposes, they're relatively the same size, relatively the same kind of business, and you give them the same set of tools, the same set of instructions, and one will just basically face-plant and another one will totally blow you away with how good the results are. And the thing that I came to realize, or at least what I believe, is that the difference is literally the ability to learn as an organization. What are the impedances inside of your organization right now that actually keep you from learning? And there's a long list of them. Is everyone able to question things regardless of their rank in the organization? Is everyone reflecting on their work? Are you deliberately doing things to learn? Are you focusing on not just the work you do but actually how well you do the work? So there's this questionnaire. Each one of those, I think, has the potential to be inverted as a statement and then be made actionable. Make those statements true in your organization and you'll be a learning organization. John Willis, give him the mic after this guy.

I had just a few cultural comments, more from a management perspective, a few things we tried to do at YP. One is, if you're in leadership, we try to sponsor departmental meetings in which there is no leadership other than lead engineers.
We're not there unless we're invited. And there's sort of a gap you have to overcome if you do that, which is: will they go away and come back with things I don't want to hear, or will they riot? You get a bunch of angry lead engineers and engineers together and they come to your office with torches and pitchforks? They don't, although at the beginning they'll ask you to come, and come, and come, and you say no, I'm not coming, figure it out, come back to me with what you think is the right thing to do. And then second, if people want to do results-oriented management, we always talk about that: we only care about the delivery of the work, not face time, not line of sight, not all this other bullshit. As I say, wildflowers, not bonsai. So provide environments for great things to grow and happen; don't pinch and wire and try to control everything. The other part is you have to do the hard work of de-assholing your department. Everybody has those people that you've worked with, and what I look for are the people that are different with different people. Find the bullies in your department that are really polite to some people but a complete dick to these other people, so they're just under the radar. Because if you want to let go of the reins and let things evolve and run the way they should, which is a bottom-up approach, those people will get in your way. And you have to have the courage to often let decisions be made without you being there, or you're still managing things. That's something we've tried to do, to some success.

So while you bring that over here, I'll make a couple of comments. There's a quote by Ben Horowitz that says that the level of communication required in an organization is inversely proportional to the level of trust. So if you have high trust you need less and less communication to do work. And then all of us probably know the ideas of the CAP theorem and this notion of consistency, availability and partition tolerance.
There's nothing about the CAP theorem that actually talks about computers. So if you think about the human systems involved, when you have a meeting, what you're actually doing with that human system is a write lock. So when you have a company-wide meeting, that's a global write lock of your organization. And so when you set up structures where the edges can independently make decisions and take actions, you're going to have much higher throughput and activity inside the organization, as long as you get to the level of trust that everything's going to be right. John Willis?

I was going to save this one for last. We need an internal 6-parallels. That's not real, that's not real. I can't even say my comment that I wanted to say. On organizational learning, there's an incredible amount of knowledge from people who have studied the Toyota Production System and lean, like Mike Rother, Steven Spear, Jim Womack. And if you look at what Toyota did, that was probably the greatest learning organization in modern times in terms of what they accomplished. If you look at one other thing, if there's any dispute: from 1970 to 2010 they decimated the automotive market, and they did it because they were probably the greatest learning organization on the planet. That's worth studying, how they did that. So I would definitely look at Mike Rother, Steven Spear, Jim Womack, and Deming. Yeah, Deming.

So that just brought up a funny part. I have these conversations where at the end of it they're like, oh yeah, we really want the DevOps, but we don't want to change anything, we'll just use Docker or whatever the tool is. Everyone wants to take the blue pill. It's like, how do we get the blue pill so we can just go back to whatever we're doing and it doesn't actually matter? So I think I'll give you guys a break, I'll give you a couple of minutes back, not too many. But just to wrap up, I wanted to say that I think DevOps is whatever you make it. You can put energy into something and you're going to get energy out of it inside of the
organization. Don't worry about it. We saw this same kind of adoption with agile, where at some point there was a transition from a focus on actually trying to deliver software, which is what the Agile Manifesto says, the only measure of progress is code, the only measure of progress is this one thing over here, and they started focusing on the detail of the methodology and the process. It's like, oh, we didn't do this one thing exactly the way the process works, so we're not agile. And that's not actually what the Agile Manifesto said. It says we're learning how to do software, we're learning how to do software by helping others do it and by doing it ourselves. That's the spirit that I think DevOps should understand. All of this is that we're learning how to do this by doing it. I can do what I do, you can do what you do, but we do it and we help each other do it, and we're going to get better and better. And it's not a thing that sticks where we're like, oh yeah, that one methodology, you didn't use Jenkins the right way, you didn't use this one new container tool, you're not on unikernels. Get past all the fashion and tribalism that dominates these things, get to the principles that really underpin what we're trying to accomplish, and help each other as human beings evolve to optimize the experience and the performance that each of us have together. That's what I think will make DevOps great again. And one last slide, even though they look like shit. Thank you. Thank you.

All right, while we get set up for the next talk, I thought we'd have Puppet Labs do a quick pitch. For everybody that doesn't know, our sponsors make this possible, and that is super, super important. So if you could tweet at them and let them know that you're here and that you're thankful for them, that would be awesome. And for those of you that are walking out the door, we're going to get started super, super quick, so be back fast.

Well, hey everyone, my name is Spencer Sebald, I'm the regional engineer with Puppet Labs. Yeah, cool, let's start that again. Hey everybody,
my name is Spencer Sebald, I'm the regional engineer with Puppet Labs. In the back corner we've got Amy McQuestion, and if you see somebody wearing slacks and a button-up shirt, he's our regional sales guy. He stands out like a sore thumb here at SCALE. We're here representing Puppet, happy to answer any questions. We're here with complete swag and a booth in the exhibitor hall, so come by, introduce yourself, say hi. We're happy to answer any questions or anything you might have.

Hey guys, thanks for coming. I'm Isaac with Sumo Logic. We're a rapper going fast for log management in AWS. I'd encourage you to come by our booth and say hello. We scale instantly, we deploy instantly, scale drastically. For a free trial you can also go to SumoLogic.com to see a lot of white papers and everything.

Hey everyone, thanks for coming. Come by and check us out, check out those URLs.

Two minutes. If you could take your seats, that would be awesome. If you could take your seats please, we're going to get started.

Hello, welcome. We're running a little behind, so I'm just going to get right to it. This is Getting Started with Test-Driven Infrastructure. I'm a sysadmin that's just trying to do things better. I work in a mixed environment, and this journey started because I wanted to manage my Windows boxes. This is just my journey into what I've learned. This is still a work in progress; I'm still trying to get this workflow adopted by my shop, but here are some of the concepts that I'm needing to teach my teammates about.

So first, what is it, and why is test-driven infrastructure worth the effort? Basically, test-driven infrastructure applies agile development principles and practices to infrastructure, bringing along with it similar benefits: minimizing risk, building confidence in the code and process, and keeping efforts on task and outcome-focused, with verifiable behavior where we are checking our results against our requirements. With this we have an increase in quality, making it easier to improve upon, giving us
confidence. It leads us to safe refactoring, because we are applying good processes which allow us to speed things up and enable us to be more adaptable, letting us improve upon our design. Of course, these new efficiencies also reduce IT costs.

Current problems with system deployment and maintenance: sometimes it's just a lack of process. Every engineer is implementing solutions however they see fit, whatever they think is best. This can be chaotic and difficult to repeat, you can't verify that what you need is actually being achieved, and it's just not scalable. Or there are manual processes, where things are being documented and there is some structure, but this is still prone to errors and inconsistency, and of course documentation becomes outdated, thus introducing risks and deteriorating trust in your process. Or just poor code: you have started automating, but the automation is not well implemented. You can have a bunch of scripts that are difficult to understand and become unmaintainable.

Configuration management can help address these issues. Configuration management is translating your infrastructure into code. Configuration management ensures the systems are in a known, good and trusted state. This allows visibility into your config, and changes are trackable as you use version control for your configuration. It also provides reliability, as your configuration management will ensure the state of your systems, making your environment more predictable. This increases productivity, resulting in fewer outages and less firefighting. With configuration management there's also now a more efficient way to do change management, giving more confidence to deploy and letting you scale faster.

An example of a configuration management workflow: this is Chef, but the idea is similar with other configuration management systems. At the top we have a repo where we're checking in our configs. In the blue below that we have our workstation, where we're implementing and making our changes to our
configurations and related attributes. Below that is our configuration management server, which is responsible for deploying these configurations and enforcing them with policies, and to the right is the node that it's responsible for.

Some of the popular options you may have heard of are Chef and Puppet; the newer guys include Salt and Ansible. At the moment I've been starting with Chef because, well, I work in a mixed environment and they've had a good history working with Windows, they have a large community, a lot of out-of-the-box tools, and they've been continuing to innovate on their workflows.

An example of what a configuration management file may look like: here it's very simple, just an action to make sure a certain package is removed. Below that we've added an attribute to be more specific about the version of that package we need to guarantee is removed.

Configuration management has a major shortcoming: it does not prevent infrastructure code that is poorly created and maintained. Incomplete code, including things like leaving comments with to-dos, or sloppy quick fixes, can be unclear, leading to code that can have unintended side effects and could be catastrophic, reducing confidence in your process and code again. So configuration management on its own is not a complete solution.

So we have test-driven and behavior-driven development to help us out. Test-driven development is a framework for managing and evolving requirements that facilitates the creation of highly reliable, maintainable code. It helps us maintain scope because we have tests that are written first against specific requirements, and we code against those tests. These tests provide us with faster feedback and help us reduce risk and build trust in our process. Behavior-driven development is an extension of test-driven development. It facilitates collaboration between stakeholders and developers by describing requirements as system behaviors, so business values are prioritized because the
requirements are correlated to business outcomes. Tests are written to describe the system's expected outcome, and communication is improved as documentation becomes more readable, up to date, and readable for experts across different domains that can be outside of the team. An example of such a test is here, in Serverspec. This allows us a more readable test that validates that our server is in its correct state. At the top we have port 80 should be listening, and below that we are ensuring that the right release distribution is actually deployed.

A mnemonic for remembering test-driven development is red, green, refactor. First red: we're starting with a test in a failed state. This test is also the beginning of our documentation. Then green: we're writing just enough code to satisfy the test. From there we can refactor, cleaning up the code and changing how we are implementing the solution, but we cannot change the behavior, as we need to always pass that test. And we rinse and repeat as necessary.

Continuous integration is a practice from extreme programming, a type of agile development. To help speed up the process, we're basically going to automate test-driven development. So continuous integration is a practice of short development cycles with automated testing and automated code integration. This allows us to detect issues sooner, with more immediate feedback. Because of the automated testing we are able to fail fast, and then changes become even faster to implement because they are incremental and verified along the way.

Two notable types of testing as they relate to test-driven infrastructure: unit testing, to verify the individual components that you're working on, independent of changes being made to the greater system by other members; and then you have your integration tests, where you're testing your changes as they play with the rest of the infrastructure.

So a simple example of a continuous integration workflow: you've made some changes, you commit
those changes, the continuous integration server fetches those changes and begins a build process to test and verify whether or not the changes satisfy the tests. From there the continuous integration server can send out feedback to the team to let us know the results.

Popular options for continuous integration software include Jenkins; it's probably the most popular software out there, and that's what I'm using at the moment. I started with Travis CI, mainly because they do free hosting for open source projects and integrate with GitHub. And there's Go, which is a tool by ThoughtWorks that looks pretty interesting to me and is probably next on my to-do list to investigate.

But there's code review. Just because we're moving towards automating all the things doesn't mean code review should be overlooked. And as you see here, we have one character who's asking for his family to check his code and asking him to highlight anything that looks stupid, and he proceeds to highlight his face. Code review is just a peer review of the source code. This helps improve software quality: if other people can read your code, it helps ensure that it's readable, which means it can be maintainable. It also helps with knowledge sharing of the source code, and new team members can observe different approaches and solutions to problems. Tool-assisted options include Gerrit, Review Board, and Phabricator. My examples are just to give you an idea of what's out there; there are tons of available tools. As for code review tools, they act as gatekeepers for your commits, they can assist with comments and discussions and tracking, and help visualize this.

So, to actual test-driven infrastructure. With test-driven infrastructure we have documentation; this is the first step. Usually we start with documenting what our requirements are, including the components, versions, and how to actually install, and then we move to writing tests, which will describe these requirements and ultimately become the
documentation. And then we script towards these tests, and we're not trying to do anything extra, we just need to make these tests pass. Another important component is being able to audit and track the code, which is where version control comes into play. And the process is continuous, where we're automating testing changes, and if they pass we automatically integrate them.

So, a workflow the way I'm currently using it right now. There's no exact way to do this; some people don't even do code review, so there's no 100% way to do it, but this is what I'm experimenting with at the moment. I have my configurations in my Git. I do a fork (going the wrong direction), do a fork, and code just enough. The coding includes writing the test, so I write the test and I write the code to pass that test. I commit, and my commit will trigger a Git hook that will call out to my continuous integration server, Jenkins in this case. It will push the job out to the appropriate place with the appropriate tools to run the tests and spin them up in a test environment, which can be a container, a VM, an Amazon instance; you can pretty much talk to anything you want for your test. It does not have to be your actual production-like environment, it's just to do your initial test of your change. And then we get notified whether or not it passed. That can go to IRC, email, carrier pigeon, whatever your team is using at that time. And if it passes, I have Jenkins issue a pull request, and from there it is ready for code review, and if it passes code review it can be integrated.

So, continuous delivery. Continuous delivery gets your code actually to your staging environment. Continuous delivery is the practice of delivering every change to a production-like environment. This increases the ability to adapt, as we are gaining more feedback faster. This increases reliability, as every incremental change can be observed, and lowers risk as a result, and is a
path towards getting to production faster. So what that would look like: it passes code review and the pull request is being committed. Then we call out to our continuous integration server again, which pushes out our tests, but this time it makes sure it spins up our test now integrated with the rest of the master branch and everything else that we have that's been deployed, and spins it up in a production-like environment. From there, again, we get our feedback, and if it passes this time we actually push out to the configuration management server that is responsible for the staging environment.

Continuous deployment is taking this to the next step. This is actually automating the process all the way through your staging environments to production. This provides us with a quicker return on investment, because we are getting to production more immediately and we have faster client feedback. So if you break something you should know pretty fast. The continuous deployment workflow is just like I said, going through from our dev environment to the different staging environments, QA, whatever is necessary for your team, and then it automates it out to production, which is the key difference. For clarity: continuous delivery can automate you all the way up to just before getting to production; continuous deployment is automating that final step.

In my research I found few people that are actually able to achieve a 100% automated workflow like this, but options include plugins for Jenkins like workflows and pipelines, Chef having a native tool called delivery truck, and Go has this idea built in from the very beginning, hence why I kind of want to investigate them. Pipelines are just a series of stages to provide you visibility and feedback in the workflow.

So, challenges. This all sounds great and I want to implement it with my team, but why won't this work? It's a cultural shift: I have to get everybody on board to change their workflow. That's also a time investment that does cost money, and it
requires full adoption. So if people are not writing tests, or they're writing too many, or writing tests without the proper follow-through, it's not going to work. The same goes even if they're trying to but they're not writing good tests: either they don't know what to test for, or don't understand what they're testing for, or are overlooking what needs to be tested, or are just writing things that will pass anyway because they're testing trivial code or their tests are trivial.

So in conclusion: test-driven infrastructure improves reliability, because we have a test-first process, and those tests help produce current and readable documentation. Because of automation we have an increase in speed, and we have an increase in productivity because our tests are helping us to maintain scope. We are more adaptable, we're doing less firefighting, and we're able to get feedback faster, and all this helps reduce the cost.

For more sources and information I've provided several links. As I mentioned, the slides will be available; I've got a link at the very end for my slide deck. And thank you. I just want to do a shout-out to Nat Harris of Chef.io, because I picked his brain a lot for getting started. I had a lot of trouble getting some of these concepts and terminologies defined. So yeah, I'll tweet out the link for my slides in a little bit. Also the QR code will take you to a link to my page, where I'll be putting up more information about this talk. There's already a link to the slide deck there as well. I know this was a little fast, I got a little short on time, so I'm more than willing to take questions later on out in the hallway or whenever, but I think I'm about out of time.

So, the joy of operations. We're going to do what we can to resolve these technical difficulties. So here's what's going to happen: we're going to take a 15-minute break, which means that the two talks that come next are going to be pushed back by 15 minutes each. So go get some coffee, provided by our awesome sponsors. Those
include Sumo Logic, Puppet Labs, Verizon, Ticketmaster, close.com, Chef, and I know there's one more and I'm sorry that I'm missing you. But yes, go get coffee, we'll see you all in 15 minutes.

And they definitely let us know in the app feedback and on our member care. Also, you have stakeholders that are extremely unhappy. You have someone in marketing who's been working two months to get Diane von Furstenberg back on our site after the last time it sucked up, and all of us pride in your work, and you know most of what you do this site is. I have a nice graphic: web transaction time is hovering around 200 milliseconds, and that's a significant drop, of course. We're talking about within a matter of about 1 to 2 minutes. And remember, this is all Symfony, this is all PHP. So the obvious question here: how did we get to these 5-cell numbers? 1 minute, 2.3 million Nginx log events per hour, orders up to 250 orders per minute. So if I do some very complicated arithmetic... hanging fruit, lots and lots of low-hanging fruit. And as long as we didn't try and all the trials, and what I think you guys want to know, which is: what were the practices, what were the learnings, what do we do now that we didn't do then?

To harken back to our first... and so I brainstorm with what we do and why. ...not the biggest bottleneck, you are wasting your time, because what will happen as you debug this application is that you'll be spot-fixing things over here that have basically no impact, because you're all bottlenecking in one place. So it doesn't matter how long it takes you to get to that bottleneck, that bottleneck is still the slowest. So imagine with me a structure diagram where you have users on the far left, you have your backend API, and then on the right you have your e-commerce database, or any database really, and you imagine the flow of data through the system. You can actually bottleneck anywhere along the line, but in my experience, and other people's experience, most of your issues are going to happen over
here. They're going to happen on the data side, they're going to happen on the API side. Your web stuff, yes, you can have slow JavaScript, but those things are not as endemic; they're easier to fix than having to go back and redo hundreds of millions of records' worth of historical data to fit a new schema, stuff like that. That's very, very challenging to do. And so as you're debugging these bottlenecks and you're tackling the biggest one first, another one's going to crop up. So after a while you're going to start to feel like Hercules slaying the Hydra: every time you chop off one head you're going to see three more grow. How do you deal with this sort of thing? I suggest you do what Hercules did, which is burn the stump: make sure that nothing else can crop up there again. And what that really means is get down to the bottom of why you have a performance problem, get to the root cause and fix the root cause. That might not always be possible, and so you will have to make some trade-offs for the root causes.

Second principle: choose and use your tools, or make your own tools, that you're going to use to inspect your application, understand what it's doing, and make sure that it's doing what you expect. You must use an application performance monitoring tool. I know that here we're very FOSS-friendly, we don't like saying that sometimes there is a better paid solution, but pay for the software, please. Use New Relic, use AppDynamics, use anything. Whatever instrumentation you've made, you need something that someone else has spent all of their time working on to make good. Alongside that, you must profile your applications. What the APM will do for you is help you identify where there might be an issue with your code, and what the profiler will do is help you drill into that performance problem and understand why it's there and potentially get a fix. And once you have found an issue, you've debugged it, you think you fixed it, you must try it with load tests. Absolutely be confident the
biggest things I'm going to share with you today. Next principle: understand your trade-offs. Every time we make a decision in engineering we must understand that there's a trade-off. Sometimes the trade-offs are insignificant; sometimes they're very, very significant and you regret it later and you don't want anyone to find out what you did. So what I'm endorsing here is to spot-fix performance problems and actually treat them as exceptions, not the rule. This is very important. In other words, if you find a particular piece of your application that is extremely performance-sensitive, fix that spot. Whatever things you do there to make it work, to get it under a number, let's say you must get it under 50 milliseconds, do disgusting, nasty things, but keep them there and keep them contained. Make sure that everyone is aware that the cart endpoint is nasty, be careful looking there, there are lots of tests around there, it's extremely fast and we like that. But again, that's not the rule. And here I have an image of... who is a Simpsons fan, anybody? Might have heard of this show? I have an image of old Simpsons Homer and his self-designed car, which is a lovely green pistachio color and has all kinds of bells and whistles on it. And that's what happens when you do not make engineering trade-offs: you don't understand what the problems are, you try to please everybody, you end up pleasing nobody, and your solution costs $80,000.

So, next principle: limit the total work done. In other words, when you do work, do it as rarely as you can; do not do it over again. You're going to want to reuse computation whenever possible. So cache everything: cache SQL queries, cache JSON blobs, cache documents, everything. Throw things into memcache and set a sensible lifespan for them. Reuse things everywhere you can. And this is not going to be informed by technical constraints, it's going to be informed by your application, so you must know what is safe to cache and when, and when to show someone a stale result
versus a fresh one. And along those lines, use a caching proxy. Everyone in here should be using a caching proxy. Who is not using a caching proxy? I see some heads. We use Varnish. Varnish is the single greatest thing that saved our website. Thank you, Travis Hanson. And the reason is that we were spending lots of CPU time redoing expensive calls on hautelook.com. For a given events call it might take 6 or 7 seconds and about 300 queries to traverse the taxonomy and generate this giant JSON blob, and we only need to regenerate it when events change, which as we said is 8 a.m., 1 p.m., sometimes 4 p.m. And so do you think our database knows when that's going to expire? Yes. Do you think our API can set a caching header for Varnish to know when to expire it? Double yes. So if you include Varnish as part of your development life cycle, if you include this concept of caching everything you can, you're actually going to end up with a significantly reduced need for capacity, because Varnish will be doing most of the heavy lifting.

And finally, use a content delivery network: EdgeCast, Akamai, whatever you choose to use. This is significantly important because you will not be recalculating all of the work that you've done. For the same reasons that you want to use Varnish, use a CDN, and it will actually make your members' experience better due to geolocation of caches and other nice features that they give you for the arm and leg that they're going to charge you, and virgin blood and firstborns and so on.

Next principle, this is pretty big and it ties into the previous concept: choose technologies that are designed to scale. Not everything is designed to scale. Some things, like we said, are a trade-off: they're easy to write software in but they're not designed to scale. Conversely, some things are very challenging to write in and they scale very nicely. So in this case I'm endorsing the use of things that might be trickier but are designed for scale. And I know that a lot of developers don't like to think
in terms of Varnish when they're writing their APIs. Varnish is designed for scale. It actually prevents a stampeding herd problem, which I'm not sure if everybody knows about, but if you have a thousand clients, or n number of clients, connecting to your Varnish server and they're requesting one resource, which usually is a URL, then Varnish will consolidate those, make them wait, and fire off one backend call to your API server. It will take that result and send it back to those thousand clients, and they're none the wiser. So you just got a thousand times more servers for free, and it's going to cache it for you, so you might not actually revisit that server for 5, 10, 15, 20 minutes.

And this is probably the most controversial part. I've been talking about PHP for now, but you might want to rewrite expensive pieces of your app with appropriate technologies. For us that meant our search implementation, which is written in PHP; it's using Doctrine, it's using Symfony, it's very, very slow, and fortunately it's a fairly isolated endpoint that doesn't impact a lot of other things. So we actually rewrote the entire thing in Scala, and it is significantly faster, it is significantly more scalable, and we use significantly fewer servers. I'm not sure how many more times I can say significantly, but it is important.

So those are the principles of performance, and again, these are things that you want to pepper into your designs, in stack rank of course. And where it gets a little less negotiable are the three performance commandments, and again, these are things that you must do regardless of how you choose to tackle your performance problem, regardless of what it is.

The first performance commandment: know thy system. What do I mean when I say know thy system? It means that your app is running on something. What is it? Whether it's Windows, whether it's Linux, FreeBSD, doesn't matter. Know it well, know it inside and out. Have an expert on staff that can help you debug these problems, because
if you're building on a shaky foundation, you're not going to get anywhere with it.

Second commandment: know thy runtime. Know it exceptionally well. Whether it's PHP, Java, Ruby, Python, doesn't matter. Understand what the limitations are, and understand what you can do about those limitations. Know what your frameworks are good at, and know what they're actually really bad at. These are things where you cannot have strong opinions strongly held; they must be strong opinions weakly held, because you will be challenging your assumptions regularly.

Third commandment: know thy application. This is something that no one else can do for you. When I say know thy application, that means understand who is using it, understand what their behavior patterns are, and understand what that means for your application. Are carts more expensive than looking at events? Are product pages particularly challenging because there are so many of them? You need to know how they're being used before you can possibly debug and appropriately act on a performance problem.

Know your system. In our case we're using Linux, CentOS Linux 6 and 7, and we're very happy with it. This next one is a little controversial, I know that. We're talking about performance, not about manageability; again, we're talking about trade-offs. Choose bare metal wherever possible. This is really, really important. I know that rubs some people the wrong way. Don't use containers, don't use virtual machines, use bare metal. This is the absolute fastest way; you don't want anything between your runtime and the metal. And again, this is when you're tuning for performance, this is when you have Black Friday coming up and you have no other options. You don't like this option, but you're going to install PHP on all of your virtual machine hosts and you're going to run PHP right on the box.

Second for know thy system is tune the kernel. The kernel by default is very conservative and gentle with your hardware. It thinks your hardware is brittle. We know better. Get
aggressive with kernel tuning and you'll be much, much happier with it.

I'd like to take a break here to give you a short story. When I started at HauteLook, we had a big, big problem with connecting to our MySQL database. We got lots of PHP errors to the tune of "MySQL server has gone away", connections dropped, stuff like that. And necessarily all of our investigation goes right to the server box. We're tuning, we're changing parameters in the kernel, we're trying to figure out why this thing is dropping connections. Turns out the caller was in the house the entire time. The problem was never on the servers; it was actually on the clients. When PHP is connecting to MySQL, given a default kernel, it will actually use up all of the ephemeral ports it has available, and when PHP tries to make a new connection to run another SQL query, it's actually going to get a timeout, and that's where you get the "MySQL server has gone away" error. I'd like to thank Travis Hansen back there for that; he spent three months of his time figuring out what the hell was going on there. What we actually ended up doing was using a kernel flag called tcp_tw_reuse. Set that to 1; that means it will reuse existing connections, it will not create new ephemeral ports. All of a sudden, problem solved, and the website is so much faster. Funny how that works.

So that's just one value. If you want to tackle all of these values at once, you're going to want to use a tool called tuned for sensible defaults. It will give you a profile that you can apply, and it's got a number of them, like virtual-guest and virtual-host. You're going to want latency-performance, and that's very, very important: latency is what you want; you don't care about throughput or other issues.

I'm going to skip some stuff here. Let's go to knowing your runtime. In our case it's PHP. I've had people come up and tell me, "I didn't think PHP could be fast." You are right, it can't. PHP is very, very slow, because it does a lot of stuff. But the first
thing you want to do to make PHP faster is use recent versions. For some workloads you'll see that it's actually going to be twice as fast. So if you're on a legacy version of PHP, this is the absolute first thing you want to do: get that PHP upgraded. Even 5.5 to 5.6 is going to be good. 7 should blow us away, so we shall see.

Second about knowing your runtime: frameworks are inherently slow. They speed development, but they hog resources, and here we're particularly talking about Doctrine. Doctrine, for the uninitiated, is an ORM that is commonly used in PHP installs. It is actually the single biggest thing that uses CPU anywhere in our environment.

Further along those lines: PHP eats CPU. You heard it here first. It eats lots of it. So what you're going to want to do when you're performance sizing with PHP is think in terms of concurrency, not in terms of throughput. Throughput is a post-facto measure of what your server did; concurrency is what it is doing at any given time. So I'm going to give you the ultimate performance tuning formula for PHP. You take the number of physical CPUs on your host, you divide by the average response time in seconds, and what that equals is the total number of PHP workers you can run on that box at any given time. What that means is that every single request that comes in and wants some CPU time will get CPU time; it will not wait. Because if you have a ton of requests coming in, it will immediately start to snowball, and the workers will actually end up stealing CPU from each other. And this is not even in a virtualized environment, this is on a physical host: you will actually have the PHP threads steal CPU from each other. So if you do all this math and you say, "I know I'm going to have more than that many users on the site": buy more servers. There's no other option; you need more hardware than you have. That is how you perform.
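A minimal Python sketch of the sizing formula as stated in the talk, assuming the two inputs are the host's physical CPU count and the measured average response time in seconds. The function name `php_worker_count` and the example host are illustrative and not from the talk.

```python
import math


def php_worker_count(physical_cpus: int, avg_response_seconds: float) -> int:
    """Worker count per the talk's formula: physical CPUs / average response time (s)."""
    if physical_cpus <= 0 or avg_response_seconds <= 0:
        raise ValueError("both inputs must be positive")
    # Round down so the box is never sized above what the formula allows,
    # but always permit at least one worker.
    return max(1, math.floor(physical_cpus / avg_response_seconds))


if __name__ == "__main__":
    # Hypothetical host: 16 physical CPUs, 250 ms average response time.
    print(php_worker_count(16, 0.25))  # 64
```

If the expected concurrency is higher than the number this returns, the advice in the talk is to add hosts rather than raise the worker count on the same box.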
That is how you capacity plan.

So finally, knowing your application. I alluded to this earlier, but you want to focus on the parts that make you money. The parts that make you money are the most important parts. So if you have a performance problem with the happy path in your application, focus on that first. The critical path must be fast.

And the closer is, I think, the finest example of product-focused engineering that we've had at HauteLook. As part of knowing your application, take cues from user behavior. I have for you a tale of too many carts. I've mentioned carts a couple of times because they are very expensive for us; Doctrine is doing a lot. We realized that carts were actually the lion's share of our CPU time: about 40% was spent just looking at carts. Empty carts. So the engineers are digging into the code to figure out what's going on with PHP, and our VP, who is actually not here today, says, "All right, hold up. Step back, step back, step back. What do we know about the members? What do we know about how they use the cart?" And it turns out we didn't know anything about how they use the cart. As it turns out, what the numbers look like is: of 100% of the members that visit the site, only 5% have ever added to cart in their entire HauteLook career, and of those, only 10% have ever added to cart again. So what we decided to do is send back a cookie that says "do not look at my cart". That is on the web client and on the mobile client, and every time you add to cart, we clear the cookie. So what does that mean?
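One way to read the cookie scheme and the numbers above, as a hedged Python sketch. The cookie name `skip_cart`, the dict standing in for the cookie jar, and all function names are hypothetical; this illustrates the idea and is not the speaker's actual implementation.

```python
SKIP_CART_COOKIE = "skip_cart"  # hypothetical cookie name


def should_fetch_cart(cookies: dict) -> bool:
    """Only call the expensive cart endpoint when the skip cookie is absent."""
    return cookies.get(SKIP_CART_COOKIE) != "1"


def after_cart_lookup(cookies: dict, cart_items: list) -> dict:
    """If the cart turned out to be empty, tell the client to stop asking."""
    updated = dict(cookies)
    if not cart_items:
        updated[SKIP_CART_COOKIE] = "1"
    return updated


def after_add_to_cart(cookies: dict) -> dict:
    """Adding to cart clears the cookie so cart lookups resume."""
    updated = dict(cookies)
    updated.pop(SKIP_CART_COOKIE, None)
    return updated


def cart_call_reduction(percent_needing_cart: float) -> float:
    """If only this percent of members need cart lookups, calls drop by 100 / percent."""
    return 100 / percent_needing_cart


if __name__ == "__main__":
    cookies = after_cart_lookup({}, cart_items=[])  # window shopper with an empty cart
    print(should_fetch_cart(cookies))               # False
    cookies = after_add_to_cart(cookies)
    print(should_fetch_cart(cookies))               # True
    print(cart_call_reduction(5))                   # 20.0
```

With 5% of members ever adding to cart, skipping the lookup for everyone else leaves roughly one cart call in twenty.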
It means that we've essentially gotten 20-fold performance for free, because we're not doing as many cart lookups. The people that are shopping and never adding to cart, the window shoppers, have a faster, more positive experience, because they're not waiting on an empty-cart call that doesn't do them any good. And the people that are adding to cart actually have a faster time adding to cart, shopping, and getting through, because they're not competing with the people that are just window shoppers. With that we were actually able to drop our average response time by about 30 or 40%, which is significant after you bundle in all of the other stuff that we've already done.

So, any questions? I would have a slide up here for you, but you can reach me at Joel Esalis; that's on Twitter, that's on Gmail, that's on GitHub. So, any questions?

So the question is: we're talking about performance, what about scalability, how do we want to scale the hardware out? That's actually some of the stuff that I had planned to talk about. What we do is, again, use tuned to set the appropriate scheduler and elevator settings, and we tune our file systems very, very aggressively. PHP likes to talk to your disk. It might actually do that 100 to 200 times per web call; it'll open 100 to 200 files. So if you have access times being written on your file system, you're going to be writing to your file system constantly, 100 times per call, and all you're doing is updating a timestamp. So what we do with our mount options on ext4 is noatime, we disable barriers, and we set the commit time to 600 seconds. So for 600 seconds there's volatile data that might be lost, but again, we don't like to write to disk, so most of the data that matters is not on that box. That's how you get the same hardware to do more, basically. And again, don't virtualize.

Yes, sir? So the question was: when we re-implemented our search in Scala, did we consider any of the existing solutions?
So our search implementation is actually leveraging SolrCloud, and the Scala portion is the part where we take our gigantic, you know, best-in-class product index and give you nice filtered results based on brands, colors, and other facets that we add. That's what our API is doing, but we are using SolrCloud heavily.

So the question was: who in the organization actually initiated the idea of tackling performance? I'd like to say that it was the engineering group, but it really came from the product team, and it came from the leadership and the members. They said, "We need this to do better," and we decided to tackle that as a problem. But like I said, intrinsically we as engineers want to design things that are fast, that are efficient, that are well designed, that we can be proud of. So really, I don't think it was a very difficult conversation to say we need to tackle this problem. There was a bit of pushback, "performance is not a feature, it's not worth engineering time," but we got to the point where there was an inflection and the dollars ran itself down.

All right. All right, so I muted. Thank you, appreciate it.

All right, there we go. I'm going to bring up cars.com, and they're going to give us a quick spiel about what they're doing to sponsor and why they care so much about this event. So here you go.

Hello, everybody. Thanks for having me. This is like our third or fourth year of being a DevOps sponsor here, so we're really happy to be here and sponsor everything. We have a great DevOps department, so we're here enjoying all of the great talks. One of the things we do is work in Salt and in spinning up configuration. But I wanted to say first, I'm James Sacks, and I've been here for five years at cars.com, and I've also been to SCALE five times, which is a coincidence, because Cars got me here. We're in Santa Monica. We have a small group of nine engineers; half of us are in dev, half of us are in ops, but it's a very permeable environment, so
I've been working on Python, PHP, Java, Groovy, JavaScript, TypeScript, Angular, Postgres, MySQL. We're also touching Nix, NixOS, Salt, and Docker, so there's lots of fun. We've got front-end sites, we've got ETLs by the heap, so there's lots of fun stuff to do. One of the reasons I'm up here is because we also have a position open on the dev team, so if you're interested, you can come to the Python and the Nix booth and we'll be there. We're in the Python shirts, and it's a fun place. Our big company is in Chicago, but we have a small team out here. We have a startup, we have no clients, so it's kind of cool. Thanks.

All right, let's give him a round. And next up is Elizabeth, talking... I don't remember the talk, but it's awesome stuff. Let's go.

We don't have slides, but I'm @pleia2 on Twitter, like Princess Leia, pleia2. I'll tweet out the slides after my talk. So I'm here to talk to you about, well, first I'll define what that means. By distributed, I mean the team is distributed. I work with systems administrators from multiple companies all over the world, so our team is made up of people from Australia and Russia, several in the US, several throughout Europe, and so we need ways to collaborate in that sort of environment. My role on the team: I'm one of the systems administrators who has core and root on all the boxes. There are nine of us who have this, and then we're supported by a bunch of people who work on the team. Our job is to make sure the OpenStack project infrastructure works, and that's why we're made up of people from various companies. We're all from companies that contribute to OpenStack. I'm from Hewlett Packard Enterprise; we have contributors from IBM, Rackspace, Mirantis, and various other players inside the OpenStack community. That's also why we all work from home. We don't have an office to go into to work together, so we all pretty much telecommute.

So that's pretty much our team. Requests to us for changes in the OpenStack systems are sent to us as change requests in our system, rather than
tickets and things you might see in traditional OpenStack projects. So we don't typically have bug reports, and people don't submit things to our mailing list. They'll come and bother us on IRC and say, "Hey, this thing's broken," and we'll help them write a patch against our infrastructure, which I'll talk about. But we don't use a traditional bug system very much for people to report things to us; we expect them to help us solve problems on our team. This also means that the priority of requests that come into our systems administration team is not really determined by us. It's determined by the people who want to do the work to help our infrastructure, because it's all open source and anyone can submit patches.

We have all of our systems stuff up in Git. If you go to git.openstack.org and look under openstack-infra, it has a whole listing of, I think we have over a hundred projects now inside of our infrastructure that help everything run. We have instructions available for anyone to submit patches against that, so we have very detailed "this is how you download the Git repository, this is how you submit changes to our code review system," and then the rest of us will pitch in and help you review that code into our infrastructure.

As far as tooling: since we're an open source project and we work with a lot of companies inside of OpenStack, we have to use only open source tools. The other part of the title of the talk is only using open source tools to do all of our work. So we don't have a Slack channel; we use IRC, and we use Mailman mailing lists. And then on the side of what we actually run in the infrastructure, we run the full continuous integration system that OpenStack uses. So that's Gerrit for code review, a series of tooling to connect that up to Jenkins to run all the automated tests to test OpenStack, and then everything that a developer will interact with: all the wikis, all the IRC bots that run in OpenStack channels, the paste bin, the user plan, everything that runs in
OpenStack that people interact with is pretty much run by the infrastructure team. So we built this continuous integration system for OpenStack, and we had sort of been running our systems traditionally, like people would send bug reports and we'd collect them and then work on them. And we realized we had this full continuous integration system, and we could start testing our systems changes as well as all the OpenStack changes.

So the first tool that I want to talk about is our continuous integration system. OpenStack is pretty much written in Python, so we already had a lot of Python scripts in place. In the infrastructure team we pretty much decided early on that we'd write everything in Bash and Python, so that OpenStack contributors could help us with the infrastructure, and we could also run all of the automated tests that we were already using for OpenStack on all of our systems stuff. We also made sure that all of these tests were completely automated, so it really was not much extra work for us to run the infrastructure tests as well as the actual OpenStack project tests.

In our workflow, a developer or a systems administrator will clone the Git repository of whatever they want to change and make the change. It'll be uploaded to Gerrit, which is the code review system, and then it goes to a thing called Zuul, which is our gatekeeper, depending on whether you've watched Ghostbusters recently. Zuul will queue everything in place and make sure dependencies are tested against each other. This will then go off to a job queue called Gearman, which then sends it off to our fleet. I think we have eight Jenkins servers now. They're all set up as masters; they don't know about each other, but the Gearman workers will distribute things adequately across the Jenkins masters depending on the criteria. Again, this is used for all of OpenStack, and then it's also used for our infrastructure.

Some of the things we test: I mentioned Python, but we're also using a lot of Puppet in our
infrastructure, so we do puppet-lint tests against our code to make sure that the Puppet files look nice and are formatted nicely. Because again, we're a team that's distributed all over the world, with a bunch of companies, and we have random people submitting Puppet changes to our code all of the time, so we wanted to make sure that it always looks nice and is syntactically clean. That was kind of the lowest barrier for our Puppet tests. We then added puppet parser validate, which actually does, I don't know if you'd call it unit testing, it's like little tiny testing to make sure it hopefully looks syntactically correct. This will make sure that brackets are closed and your commas are there and other things. And then we also use Beaker-rspec to actually do full-on "can this thing actually deploy" tests. That's a relatively new thing that we're still working through, but those tests are going really well so far.

So then we're pretty sure once the code runs the tests, which it does as soon as we upload the code. It's not merged by anyone yet: it runs all these tests, then people come in, then it goes through another series of tests to make sure nothing else merged in the meantime, and then it finally lands in our infrastructure.

We also have thrown in a few other tests over the years. We know the syntax of some of our XML files, so we'll do checks against the XML syntax to make sure that's correct. We want to keep a lot of our project files alphabetized, and it turns out humans are really bad at the alphabet, so we have an automated test to make sure that the files are alphabetized. We also did a lot of manual work checking when people were adding IRC channels to the OpenStack project. We wanted certain permissions set up on those so we wouldn't lose control of them or have to bug the Freenode staff. So we now have an automated bot that will log on to Freenode and check that the channel has the right permissions, and then come back to us and say yes, that test passed. So this is
a really cool one, because I was the one who was always checking to make sure things were right in the channels, and I'm like, why am I doing this? We were able to just write whatever test we wanted, and it's actually been really cool for us.

It also means peer review. We're obviously having our peers review our code, so we don't have a change system where people who don't know what they're doing are reviewing our stuff and taking forever. It's just peers on the OpenStack infrastructure team. And since we're doing that before we're merging code and making it go live, it prevents some funny things from happening. I had a funny slide of this patch that I wrote that was really stupid. It had like a double negative, like "if not this thing", and then one of the guys that I work with was like, "Liz, you could just put equals." You know how it happens: you're writing something, and then you change your mind halfway through and you rewrite it, and there's a crazy thing left in there. But it would have worked, and I could have just merged that into the infrastructure and it would have been fine. But because we have peer review, we can pick up on silly things like that.

It also means that anyone, as I said, can submit changes to our infrastructure. So we have people from throughout the OpenStack project who say, "I want to add a test," or "I want to make a change." Or in one case, some of the guys wanted to add an Asterisk server, and all the core people on the team were like, "We're not running that, because it would be in the cloud, and I don't know about Asterisk, and PBX is hard." So some volunteers from the community came along and said, "Actually, we know a lot about this." So they launched the Asterisk server, and they have been caretakers of that. So it really empowered the community and the company that they were working for. The company was able to put an allocation of resources to that, and they didn't have to block on us. We reviewed the code and we tested it, but
we didn't actually have to do the work.

So I mentioned it gets merged into the repository after it's been tested and all the code reviews have been done, and then we use Puppet and Ansible to actually go and deploy those changes as soon as it gets merged. We also use, like, a VCS module in Puppet, so for some of the projects that we don't actually do releases of, we'll just watch the master branch, and every time an update is made to that project it goes live, and we hope nothing breaks. But we did a bunch of tests.

So this means we don't log into our servers very often. Everything's pretty much done through code review. When we need to log into servers, it's usually when we need to check on logs or restart something that stopped, or something. But this also means that a lot of people in the community don't have very good visibility into our systems. So we have a public Cacti instance: cacti.openstack.org has information on all of our servers. One of the really good things about this is if someone wants to replicate part of our infrastructure. We really like our CI system, and we want other people to use it, so if they say, "Well, how big is your Gerrit server?" we say, "I don't know, go look at Cacti, it'll tell you how much RAM it has." It also allows other people to debug problems for us. Sometimes disks just fill up and we're not monitoring that very well, so if something happens, anyone can just go look at Cacti and be like, "Well, logs filled up your server, so everything stopped." It allows people to be proactive about helping us debug things even if they don't have a login to the server. There's a picture of Cacti; you know what it's like.

We also use a tool called Puppetboard, which is a dashboard for Puppet. It's typically not something you'd run publicly, but if you go to puppetboard.openstack.org, we do. We've had to turn off a few things in Puppetboard to make it safe to run in public, but it does allow you to see when some changes are being submitted. So if you submitted a patch and it gets merged, you're able to see when it merged and what exactly
ran. So if your patch merges and it fails, like it didn't do the change you wanted it to do, you can see whether it made that change or not, and then you can write a new patch to follow up and fix what broke, without having to ask one of us to log into the machine and look at the log to find out what happened. This is probably my favorite part, being able to not have to log in, because I spent two years on the team before I got a login to the servers. I didn't want to be bothering my coworkers all day, being like, "I broke a thing again, why did it break?" I could just go to the dashboard.

We also have a lot of documentation. If you go to www.openstack.org, we have a link to a bunch of our system documentation that will document everything, from how to submit a change to our Puppet and test it, so you can do testing on your side beforehand, including all the modules and everything that you pull down. It also gives you specific documentation as far as all of our servers go. So if you want to add a server to Cacti, it will tell you what config file to edit and what repository that lives in. If you want to add a server to our infrastructure, we also have instructions for that. So the team that worked on Asterisk could just look at our documentation and say, "Okay, we're going to read through this, we're going to propose the changes, then we're going to talk to the infrastructure team about what we're going to do here." And so then they had all the data that they needed to actually launch the server at that point.

So automation is great, and not logging into servers is really nice, because my SSH key lives at home. But we do sometimes have to do things on the servers by hand. We've managed this a bit with tooling and a bit with social constructs in our project. We have come to the conclusion that complicated migrations and upgrades can't really be automated. When we upgrade Gerrit, we don't just change the version number and then let Puppet go to town, because that would destroy OpenStack. So we have
manual processes in place. In this case, if we're doing a manual migration or some sort of upgrade that needs that kind of attention, we'll typically get together in an IRC meeting beforehand and we'll work on an Etherpad to come up with a plan of attack. The Etherpad will typically have every single command that we plan on running during this migration or upgrade, all written out, even if it's easy, even if it's obvious. When you get in the zone and you're doing this upgrade, you forget how to run a MySQL command, so we make sure it's all spelled out. And also we can review it: Etherpad is a collaborative tool, so everyone can make comments, they can make edits, and then when we review it we can make sure that we're all on the same page and all the commands are correct. Plus we're already used to reviewing each other's work, because every single patch that comes into our infrastructure is already reviewed, so culturally we're in a really good place to do that.

We also have this issue where, when we initially launch a server, we're still doing that manually. We haven't quite figured out how to do that in a way that's automatic for us, because we have to make some human decisions when we do deployments. And honestly, sometimes, even though we test all of our Puppet code, we bring up a server and it still doesn't work, because we forgot something in our Apache config, or we didn't add a thing to start the daemon after we installed it. So we still have a manual process for launching a server, but we do have systems inside of some of our root servers that can actually do the deployments. We have secret Git repositories that have all of our passwords and keys and credentials and things, and the root admins have access to those, and there are instructions on the machine on how to do some of the manual processes. Of course, like I said, passwords are not open source, so we need to put those somewhere private. They're in Git, so they're pretty safe and there's history, so we can
look back and see when changes happened.

As far as day-to-day work, we don't really use phone or video, because we don't like it. So we're all on IRC all day. All of our channels are completely public, and they're on Freenode. We have an openstack-infra channel that we hang out in, because whenever anything goes wrong with the development workflow, there'll be a lot of people in infra talking about it. I think we've got like 400 people in our channel right now, just sort of watching and making sure Gerrit hasn't broken or Jenkins didn't fall over. So that's sort of where we are most days; we've got openstack-infra. When something goes horribly wrong, a bunch of us will pop over to openstack-infra-incident so we can focus on that. People will be joining the infra channel being like, "What's wrong? The world is on fire," but we'll be able to be in our incident channel focusing on actually fixing the problem. We do have some people who will run interference in the main channel, and they'll say, "Yes, everything's broken, they're fixing it right now," but we'll be able to be in our own space to work on it.

We also have an openstack-sprint channel. This was sort of born out of the fact that a lot of projects have in-person sprints, but apparently, as much as we don't like phones and video, we also don't like... well, we decided to do virtual sprints; we don't need to get together in person. So we have a sprint channel, and when we want to work on something, like our upgrade from Puppet 3 to Puppet 4, for instance, we'll all work in the sprint channel, get all of our patches in line, make sure everything is lined up in review, and we'll typically leave the infra channel that day and all of us will join the sprint channel so we can focus on that.

All of the logs are public. We have eavesdrop.openstack.org, where you can see not only infrastructure logs but all the OpenStack project logs. We also have weekly meetings, and those are always logged, and the
minutes are posted in the same place. The weekly meetings are kind of a check-in with everyone on the team: we review our priorities and we make sure we're all on the same page with what projects we should be working on.

We use our paste bin a lot. If someone has launched a service on one of our servers and they're like, "Hey, can you look at this log and find out what it's doing?" I can just dump the output in the paste bin without giving them a login.

And I joke about in-person stuff, but we actually do get together in person every six months for an OpenStack developer summit. So we actually get to see each other and talk in real time, in person, about our issues, and this is something that's really important to us. It's fine to have a team that's distributed around the world, but we find that the culture tends to decay over time if you don't see each other in person once a year. It's something I was surprised by, honestly, because I really like my IRC and my cat, but it turns out we have to go and see each other. It really revitalizes the team, and it helps us connect and helps us work together, because someone can be really curmudgeonly on IRC, but as soon as we go out and have drinks...

One of the things we have struggled with, that I know a lot of teams always struggle with, is handling time zones. We are, again, all over the world, and I think our guy in Australia hates all of us. A lot of us are in the US, and so when we talk, often he's sort of just starting his day, and it's really hard. The only thing we've been able to do to help the situation is add more people in his time zone so that he's not alone. So he often works with us in our evening, and then he'll work with the people in Europe during his evening and their morning, and that's when he'll make all of the major changes. He'll merge patches and restart services usually when someone else is around, and when he's alone, he usually does reviews, maybe holding off on some major changes, because if no one else is
around, he doesn't want to break everything. That's no fun for anyone. We also manage a lot of servers in our deployment, so we have one guy who's the expert in Elasticsearch, and I pretty much run the translation server, so he's not going to touch any of those things unless one of us is around to learn about those things. So really, adding more people in his time zone is the only thing we've done to help that.

We've also worked on improving our handoff between shifts. If there are incidents, we'll let them know "this thing has been going on," or "we just restarted the code review server because it was falling over again." We'll sort of let them know what's been going on today, and I think that's definitely real work.

The time zones also make onboarding slower. Since I was in the U.S., where all the other root admins were, I was able to pitch in; they'd be working on something and I'd be like, "Hey, I didn't learn that yet." But people in the other time zones don't really get that opportunity quite as much, so they have to do more formalized meetings with us to make sure they understand all the components. So onboarding is definitely slower for folks who are distributed further away.

But honestly, mostly the work, even in a distributed fashion, works pretty well for us using all of these tools. They're all open source, and we all pretty much love our jobs. And it's an open source project, so if anyone's bored or wants to get experience with ops, you're welcome to come hang out and learn about what we do and get some experience. I've found that people who do that often end up getting hired, because we realize we now can't live without you, so I go to my boss and I'm like, "You need to hire someone." So if you're looking for opportunities, experience, and maybe even a job at some point... And that's it. I have a couple minutes for questions, I think.

All right, it is time for lunch. We need to be back... oh, we've got two t-shirts. Come up and see Liz if you're interested; she'll ask you Open
stack questions no okay, it is time for lunch we're gonna be back here at 145 on the dot cause we're gonna get started with configuration management sucks and I probably shouldn't have said that cause I'm the chef guy but we'll go with it have a great lunch, everyone started in just under 10 minutes for your friends that are out still getting lunch, you should send them text messages and tweets and let them know hey, we're gonna get started soon so we'll be on right at 215 I'm sorry, 145 I'll repeat that, 145 oh, that's one, two, three okay, cool alright, we're gonna get started in about 5 minutes let's take our seats and let's send those text messages out to people that are still taking too long on lunch alright, what do you say we get started everybody have a good lunch? well, Justin is up with the talk configuration management sucks can you guys alright, here we go thank you scale for giving me slides thanks for coming to my group therapy session I saw the schedule and there's no talks after me I think I have the room for four I'll try not to go over yeah, I'm Justin, I work at Disney Animation, I'm not an artist what I am is a tool hoarder kind of just collect these things and keep them around and try them out and go over some of that stuff the talk is inspired by Brian Lendook has a Linux talk if you haven't seen that, I recommend he actually gave an update to it last night let's go ahead and just figure this out as we go let's complete this line so one disclaimer before we get started I'm going to read this because the video both people that are going to watch this on YouTube might not be able to see the slides so let's just go ahead and go through this together the thoughts and opinions expressed in this presentation are not my own they are the collective outcry of every user have ever used one of these tools and ran into one of the following limitations this presentation will only focus on a subset of tools that are advertised as a way to make your 
management of something easier in some way. You can see them at the back table. This presentation cannot address every issue in every... somewhere I lost my place... in every environment, but will attempt to make limitations more obvious. This presentation should not deter you from using a config management tool. In general they are awesome. I mean it. You would be a fool to manage infrastructure of any size without one. If you care for your life, your job, your loved ones, or preventing Skynet, please do not write your own. There is one caveat to writing your own: if you are going to write your own tool, please make it have a better theme than the current tools. If I'm called a puppet master one more time, I might cry.

One more disclaimer. I have not used every version of every tool in every size of environment, nor have I spoken to every user of every tool. I've spoken to a lot. If a limitation I claim does not affect you or cause you to cry into your pillow at night, please let me know on Google Plus so your comments can be ignored. The claims in this presentation are strictly for educational purposes and should be used as such. If I miss a limitation that you find particularly egregious, please let me know on Twitter and I won't ignore your comments: @rothgar.

The first step to fix the problems outlined in this presentation is to acknowledge they exist. The second step is to storm the respective GitHub issues and let the maintainers know what limitations to prioritize. The third step is to pay those companies so they will listen to your outcry and assign someone to ignore your request. The final step is to not, I repeat, do not ignore the issues and move everything to Docker and a config-management-free utopia where you will incorrectly assume no issues will ever come up again.

Now that that's out of the way, just look broadly: everything kind of sucks a little bit, and they all kind of have some similarities together. One of those similarities I'm very sorry about, but I cannot help
you with it. It's more of an "it's not me, it's you" thing. A lot of these tools... I'm sorry, if you're in a Windows environment, they all kind of suck.

Second place is documentation. You always want some example of something, and you're not going to find it in the documentation, because there's either full sections of the tools that are undocumented, or you have a weird environment, or something you need is just not going to be there. And that's okay. Documentation sucks everywhere, some a little more than others, but it all kind of sucks.

Testing is hard. Carlos gave a talk earlier about test-driven infrastructure, and it just explains that testing this stuff is hard. You make a change here, and a server in Oregon or something is going to do something. What is actually going to happen? Is the network the same as your local environment? Probably not. So you're either maintaining duplicate infrastructure, or you have crazy Vagrant boxes that try to replicate everything, and it's just hard. It's not going to happen right away, and there's a lot to do there.

Templating. Everyone loves some ERB and Jinja2, right? They're great. And why isn't there a way... I mean, I know there are some tools that kind of do this, but I have a template, and I want to do an Ansible run and spit out the template and just show me what that template was, here. Not like a no-op, where it's then going to try to put it there. I just want: here's my template file, here's the server I want to run on, someone figure out the environment variables and then make that template for me. Right? It should be easy, but sometimes it's not.

And then in those templates you have secrets to put, right? That's what templates are for; you need to put things in those files. And you only tell your friends secrets. Why would you
tell some strangers, and, like, you handle not your friend? This stuff is hard. With every config management tool, the answer is like, "Oh well, we have these passwords, they've got to go somewhere, right? Let's rub a little crypto on them and we're good. Let's stick that in a text file and no one can read it." Right? And it's true, because I can't read that. What does that say? I gave it my password and it lied to me. It told me this was my password, and I'm like, no, that's not what I said. I don't know, you read that. But then someone made another commit, and that's the new password. Oh, cool. Well, I don't know which one's which. Okay, let me go to the box and find out what it is, or decrypt it somehow. One of these is the password and the other one is, you know, /dev/random. Can you read it? I'll give you the PEM file; can you tell me what it is?

So you decide you're going to do this. They all suck a little bit, but let's just start. Let's pick a tool with, you know, a lot of marketing dollars, so they must be good, right? They're really mature, and, man, my Twitter stream has their ads like every fifth tweet, so let's go with it, because it must be good, because I see it all the time. So let's go to the documentation. What do I need to get this set up? There's got to be some simple diagram. It's a box or two, right? Oh. Well, kind of. A little more than I expected. I mean, this is large scale, this is 7,000 nodes. And yeah, I know people in this room that have that. I do. And that's kind of a lot to manage for this. And I don't like the lines sometimes; I'm colorblind, so, thanks, but no thanks. And who's going to set all this up? What's managing this? How did I get that there? I need config management to put that in place. But the true story,
very true story: my very first Ansible playbook I wrote was deploying Puppet to all my 7,000 nodes. That was what I did. And it's been all downhill for Puppet since then. It was just kind of like, hey, that worked, and that was cool.

But let's go on ahead. Puppet is so mature, they spell this out for me. What do I need? Those are the 13 boxes I need to manage this. That's okay. 13, I can do that. Amazon's cheap, right? I can convince management they'll do this. So it's, you know, 30 cores. That's not terrible. I think the Raspberry Pi 3 is going to have that, so we should be okay a little bit down the road. We'll be fine. And yeah, that's a little bit of a fight. Okay, well, let's keep going. This is, you know, a Fry's hard drive, who cares. Oh yeah, management, they're on board for that, right? We're good.

But then, you know, this is a new environment. This is hipster Puppet. This is C and Haskell. This is all the hotness of Puppet Server. Puppet 3, anyone? Puppet 2? You're on a puppet master, right? Let's see what the premise was there. Oh, that's a little bit worse. It kind of updates for everything. And then there's that other little thing, where they're like, "Oh yeah, with a puppet master there's only 2,000." Okay, that significantly dropped what you could do. And so if you have, you know... just double it. Alright, management's on board, let's go for it.

Got our infrastructure set up. So Puppet has Puppet code. There's a lot of good things about Puppet code. It's a DSL, a DSL just for config management, so it's got to be good, right? It's like JSON and Ruby kind of had a baby, and that's Puppet code. So it's got to be easy to write, right? So let's go for it, because this is... oh, sorry. It's got to be mature, because it's Puppet code and
it's been around forever. If anyone knows this bug, this has been around forever: a 10-year-old feature request from Puppet 0.24. I need to make a directory tree. Let me do that in Puppet code. Well, kind of. You can kind of do it. I mean, I have 13 boxes set up for my server farm, I want to run this on all of them, and I can't. You can't easily. I just imagine Luke, he's still in Portland, typing away on his computer. "I've got to fix this bug." "Hey Luke, come to lunch." "No, no, I've got to fix this." "Okay, we'll bring you back something." But does he know we don't use that bug tracker anymore? We moved. Did you see the banner? "Hey, we moved, go somewhere else." We wiped the slate clean.

Alright, so let's just dig into the Puppet code. We can write this, because it's easy, right? Let's make our directory tree. This is super easy, right? But it's so easy to write bad Puppet code. Because this is it, right? This is what you want. It's not, because it's bad. And why are there commas everywhere? Take those off. Nothing does that. Why are there commas on the end? I don't know.

So this is a class, but let's parameterize that class. This is what you need, because you're doing it wrong. Okay, the parentheses. Alright, so the class has parameters, so now we're good, right? No, you're still doing it wrong. You need a params class that has your variables, then you read that in and you inherit it. Okay, now we're good, because this is good Puppet code, right? Well, store your variables in Hiera. Set that up, and you've got to put it in Hiera first, and then inherit that. Okay, because you're still doing it wrong. But then we're good, right? This is Puppet code, this is easy. Well, it's not a role or a profile, so rebuild that so there's a role and a profile. I mean, come on, this isn't hard, right? And rub a little r10k on there
and you're set. And really, this is what ends up in my Puppet code. That's it. After all of that, I'm like, alright, screw it, let's just go. I'm in.

And then you have to figure out: I need that directory there first, right? Puppet figures it out for you. They compile it, and it doesn't go in the order that you wrote it. They do it right. Right? So you have to use these keywords. You know, I always look at mine, like: subscribe? Go back. No, you notify something. And then, okay, let's take a step back. Let's put classes together, so the classes will go in order, right? And you've got these little arrow things. Well, the squiggly is subscribe, because that starts with an S, right? And then reverse is unsubscribe, and then you don't get email? I don't think that's right. But then, you know, I need all my directories there before I put files down, and this is the example I get. And I'm like, what is that? I don't know. Pipes? Why did you put those in there? I don't know.

So, for Puppet people: I have a, like, Puppet 5.0. I know the syntax is going to change, because it changes every time, so here's my feature request. Alright: new syntax for before. Here's before. Alright, we're good. This is after. We still good? Fine. Alright, then there's "I don't care, Puppet, you decide. You can do whatever you want." And then, "You know what you did." Those are going all over my Puppet code now. That's what I want next.

There's got to be good parts, right? I mean, Puppet, they have good parts. MCollective. You guys are laughing; that's the punchline. I'll narrow it down a little bit. So we have MCollective to kind of broadcast stuff to everything, and you can talk to everything at once, right? So that's cool. What's the best plugin for it? Ping. I kind of already had a tool that did ping, that pinged lots of things at once. Then, you know, I could run Puppet. Well, they already told me to start the
puppet service, so Puppet's running, right? No one runs the puppet service, don't be silly. So you kind of have a service plugin to disable the puppet service, and then you can go back and run Puppet with that, so you're okay. But then anything else you need to do, like "I need to check something else in my infrastructure": RPC, that's your answer. Just give me a shell on all the boxes at once and we're good. We have podcasts everywhere.

But you know what I really like? I think that's kind of like, speaking of SSH in a loop, Ansible, right? That's what it does, right? I mean, Luke has said in the past, it's not a solution. But you know who's a really smart guy? Michael DeHaan's a really smart guy. He's like, "You know what, I'm not going to do that. I'm going to do it a little bit different, because I'm going to use Python, and Paramiko's not SSH, right? I'm going to abstract that a little bit." And I mean, Michael's smart, so he must have figured this out. He used to work at Red Hat, and then he worked at Puppet, and then he started Ansible, and then he saw the writing on the wall and he got out of there before Red Hat bought him again. He didn't want to go back.

So let's take a look at what we've got here. What tools exist that did this? I mean, we could get SSH on a bunch of boxes; there's a bunch of tools that do it. I could call up any of these, give it a list of servers, and I have SSH on all of them. And there's more. It just keeps going. Every other week it's like, oh, there's another parallel SSH thing. That's cool. But Ansible gives us a couple of different things, right? It gives us abstraction, and it gives us cowsay. I mean, none of those other tools gave me cowsay, and that's good. And there's two things that everyone does when they run Ansible for the first time. One is they turn on infinite scrollback in their shell, because cowsay is just taking up like 40 lines of shell and you're like, "Oh, where'd that go?" And
the second thing is they're uninstalling cowsay. Like, "Oh, I didn't actually want that. It was cute, but..."

Man, Ansible is really close to just a straight shell. I mean, there's not a lot of abstraction there. So let's go through an example here. Let's install a package, let's install some software. Chef and Puppet: hey, that's cool, "package." That makes sense to me, right? I see what they did there, because sysadmins are lazy. And then Ansible's like, you can do this too, right? Oh, shoot, what was the web box again? Do I use yum, do I use apt? I don't remember. Let's look at it. Okay. And before anyone gives me comments: yes, Ansible 2.0 has a package module. I have no idea which ones it includes or how it works.

So this is my playbook for Ansible. I need to do stuff, so I have comments, and I'm going to put some stuff on there. It can recursively make a directory. Web: we're going to install Apache, right? But, oh, well, is it httpd or apache2? I can't just put a variable there, because the modules are different, right? So then what do you have to do? See this abstraction? See all this "if" branching? It's like, well, Red Hat's yum and Ubuntu's apt. And then it's like, well, which version of Fedora is that? Is that 23? That's DNF, use another module. So you have a branch down there. It's just branches all the way down. I just want a little more abstraction in there, so it's a little less just a straight shell.

And then you're troubleshooting this. I don't know why, but for some reason, for me, Ansible seems hard with it. I don't get it, because you have to register: you run something, register a variable out of it, and I need to do something with that. But I never know what I actually need out of that. So I just register a variable and then debug a bunch of times. I just go through my shell, like, "I think this is the one I need." I am just using the
whole JSON output in a var and then grepping that, because I have no clue which one I actually needed to do something else. Then you want to use that variable for something. Ansible uses Jinja, and it's directly in the playbook, but it's only kind of Jinja sometimes. It's more of a... well, that's not really enclosed in Jinja brackets. Is that Jinja? You just pipe it to "skipped." But what does that actually mean? That's a variable piped to "skipped," meaning the task was skipped, and so you have to make sure you kind of get that in your head. But then it's like, well, let me cast that variable. My output was 1, but that's actually a string, so I always have to cast it to an int and then test if it was the right thing. And sometimes it's just, you know, is it Jinja, is it not, is it YAML? And if the YAML starts breaking down, you probably should get something a little more specific in the language. Puppet code? I don't know. And then you get stuff like this, where it's like, "Oh, I need to do this thing and put this file in place," and this is a lovely example from the documentation on getting to "skipped" right there. I don't think that's better.
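As a rough sketch of the "little more abstraction" being asked for above (this is not code from the talk, and it is not how Ansible's package module works internally), the per-distro branching can be pushed into a single lookup so the caller only names the logical package. The table names and the `resolve` helper below are hypothetical, and the distro facts are limited to the ones mentioned in the talk: Ubuntu uses apt with `apache2`, Red Hat style systems use yum with `httpd`, and newer Fedora (assumed here to mean 22 and later) uses dnf.

```python
# Hypothetical lookup tables for illustration only; none of these names
# come from the talk or from any real config management tool.
OS_FAMILY = {
    "ubuntu": "debian",
    "debian": "debian",
    "centos": "redhat",
    "rhel": "redhat",
    "fedora": "redhat",
}

# Logical package name -> real package name per OS family.
PACKAGE_NAMES = {
    "apache": {"debian": "apache2", "redhat": "httpd"},
}


def package_manager(distro, major_version):
    """Pick the package manager for a distro (assumes Fedora >= 22 uses dnf)."""
    family = OS_FAMILY[distro]
    if family == "debian":
        return "apt"
    if distro == "fedora" and major_version >= 22:
        return "dnf"
    return "yum"


def resolve(logical_name, distro, major_version):
    """Return (package manager, real package name) for a logical package."""
    family = OS_FAMILY[distro]
    return package_manager(distro, major_version), PACKAGE_NAMES[logical_name][family]


if __name__ == "__main__":
    for distro, version in [("ubuntu", 14), ("centos", 6), ("fedora", 23)]:
        print(distro, version, resolve("apache", distro, version))
```

The point of the sketch is only that the branching lives in one place instead of being repeated in every task; a real tool would also have to handle unknown distros and packages that have no equivalent on some platforms.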
Ansible, just sometimes... I mean, if you have a few boxes, cool, get going, and you're up and running. There's no agent, right? SSH is kind of an agent. But what's up with five forks? Five servers at a time? I mean, let's say I have a thousand. That's going to take forever. And yes, let's change it a little bit, but no one starts at this low of a bar and then just jumps all the way up. And then they added things like accelerated mode. It's like, "Oh, accelerated mode, that's faster." Well, sort of, but it's not, unless you've got EL6. Everyone has EL6. But then I was really sad when they came out with accelerated mode and fireball mode was gone. Fireball mode is awesome, just because of the name. Fireball, that's cool. I'm going to run all this in fireball mode, because that's just better. Okay, so Ansible 3, I have a name suggestion. Maybe we can't use fireball, because it might confuse people, but Hadouken is a really good name. We can just go Hadouken mode and it's faster, and I think that would work pretty well.

And then there's other things you've got to turn on to get things to work better. And it's like, get to 100 servers. Managing SSH config for 100 boxes, it's just not designed to do that. I can't import an SSH config folder for every server, and it gets really confusing which box is which, and how do I connect to each one? And then you might have keys for people, and it gets to be kind of a mess managing all that SSH, because it's one-on-one.

And so, you know, Salt. YAML, so it's easier to write, right? There's no SSH. Well, sort of. But they must have solved these problems, right? So how do they solve that complex YAML stuff? They've got YAML. Well, they do similar things to Ansible, but they inherently say, "No, no, no, it's Jinja2, then YAML." So we're going to filter that into
render one thing, and then render another thing, and then make it a thing. But then it's like, well, you can solve that, because you can just write Python. You can just write Python right in there, and that's great, because you don't have to worry about all that YAML. You just keep going. But then there's multiple Pythons. Well, which one did I want? It's like, well, write everything in the first one, and when you realize it's wrong, write everything in the next one, and just keep going from there. Maybe I'll do one of the other ones. Maybe not. And they have other YAML that's not cool. That's a salty YAML. It's like one step below Puppet code but one step above HD, right? I don't like that config. And then they have really obscure stuff. If you have config and Cheetah, I'm very sorry, you must be managing Windows boxes.

But just in case, SaltStack, hey, if config management doesn't work out for you guys, I have a pivot you could do. You could change from doing stuff and focus on these renderers. I think that's what's going on there, the way you can pipe them together. How many people wouldn't love Markdown to Confluence? I mean, just take Markdown files and put them right in the Confluence documentation. That's great. And then for anyone else, there's one other step: email, right to Excel, and then give me a state out of it that I can just deploy somewhere. That is business right there. You can just render that. It's just, search my email any time server A was mentioned, stick it in this Excel document, and give me a state file. That'd be cool.

And then they go one step further in their documentation, where it's like, "Oh, you know, you can do anything you want. You could write HTML or Puppet files." I want to write my Puppet manifest and then have Salt put that out somewhere. But they have that as an example, and then it doesn't exist. I was actually a
little sad. They give this as a "hey, you could do this," and they don't have it.

Let's look at Salt a little more. So, you know, Chef's pretty mature, right? GitHub: less than 400 issues on GitHub. That's pretty good. And Ansible, a little less mature, a little more. Salt, a little less mature than both of those, let's just say. I mean, a little less mature. Alright, that's a lot of fun, but I might run into one of those bugs when I'm deploying stuff, let's just say.

And then I feel like, with Salt documentation, they took all this cool documentation and then they gave it to Puppet and said, "You order it." Like, you just do something, and they just deployed it. "Okay, that works." No, it didn't. And then this is everywhere: "Hey, examples are in the code." You mean that code with 3,200 bugs? Let me just go read all those bugs, and let me triage those as I'm going, and I'll set this up and we're good. No, I don't want to just read the code every time, because there's bugs.

So then you're like, okay, let's find some common states. Examples for states. Salt, I'm sorry, a Git repo is not a good place for just a bunch of states that you wrote. Chef actually does have this a bit, in Supermarket. Puppet, a ton, in the Forge. Great.

So let's go right to Chef. I'm sorry, Chris. It's delightful. So, Supermarket, right? It's a good place. But why is nothing namespaced? I'm on Supermarket, I'm looking for a recipe, and I want Apache. But you can't name it something else, because there's only one that can be "apache." Nothing else can be called "apache." So people name them different things. And, oh well, there's other Apache things. So you end up just searching GitHub and you hope someone wrote "recipe" or "chef" somewhere in their Git repo. You know, that's the only way
you're going to find it, to find these examples.

And Chef does a fine thing. They figured out a little bit of... I mean, ordering is top to bottom, right on the page as you write it. That's good. Except stuff like this doesn't work. And it's like, well, why not? Well, when you compiled, that file didn't exist, so the execute never ran. So you have to trick it. You're like, "Oh, always execute this." And then it's like, psych, only if that's there. Don't worry, I always want you to put that there, but then don't always run it. There are still these weird oddities, because, I mean, Chef is straight Ruby, so I hope you like Ruby. Good.

And then Chef wants you to do things... Chef server, obviously. I've got to skip some other stuff because of time. Chef Solo, really, because it's better. These "disadvantages," I'm being sarcastic; these aren't necessarily disadvantages. In a lot of cases it works better with Chef Solo, and writing your own resources and providers. A little bit here, and we're out.

So here's a surprise, the theme for DevOps Day right here. I'll wrap it all back together: you should still use one of these, because seriously, you'd be stupid not to. That is the surprise. Yes, sometimes they suck, but look, this is what you had before, isn't it? You had an Excel document with all your servers in it, and the hardware, and, "Oh hey, I changed this server, this ran on this box," so email it out to the team and hopefully they'll save it and they'll edit that one. It's like, no, that sucks. That was so bad.

It's like, well, maybe I can just do something like direct shell scripts, because I don't need all this abstraction stuff. Well, those sort of do exist. There's Waffles: config management in Bash. Oh, and that one. I mean, SCaLE has kids around, so I censored some of this stuff. But seriously, this is another one. It's
still written in Ruby, but it's just shell scripts all the way down. And I already have a folder of shell scripts. I don't need to just deploy all those everywhere. That would suck.

And so here's the real "why you need to use these tools." Why is this better than doing your own thing, and how you were doing it before? Seriously: meatspace people do not scale. I cannot do this on my own. I can grow my team, and then we run out of space, and no, that still didn't work, because no one knows what other people are doing. It just doesn't work. So you need this stuff to scale. If you're going to go anywhere with even 50 boxes, use one of these tools.

And seriously, they're always consistent. The broken bits are always broken, and it's great, because you can rely on that. You can just say, "Oh, I know what you're doing now," and you can work around some of those things. And that's cool, because it works, and it always works the same way, until you do your major version upgrade, and then it's different times.

And then it is seriously just reliable. When was the last time you told the DB admin, "Hey, I need you to migrate this thing"? It's like, "I'll get to it at some time." Well, I need it before lunch. With some of these things you have a 30-minute window; you don't know exactly when it's going to run. But you don't know when the DBA is going to do it at all. Is it this week? At least with something else you can narrow down some of that scope, that variability. And it's really reliable, and it just works. You know, worse is better if it's consistent. That way you manage more things.

And then the number one reason you should still use one of these tools, because you are going to be an angry sysadmin when you're old: this whole room of people. This community is the reason for these types of tools, and it is great to have people, so that you're not just
hacking away at this Bash script in your basement. Well, everyone's in the basement. But you have people that you can reach out to, and you can work together with people on solving these issues, not just for yourself, but then the next person that comes along next year can benefit from what you did. And that's great, because that is what we all really need when we're just kind of supporting the system. So that's my talk. Thank you.

So I guess I get to say: we've got wonderful sponsors, Puppet and Chef. And awesome talk. Alright, so here's what's going to happen right now. Uber Geek Girl on Twitter is going to come up and start the process for open spaces. What we're going to do is actually walk through that process. I'm going to start making my way through the aisles passing out sticky notes and Sharpies, and for those of you that have done open spaces, you know why. If you haven't, you should totally stick around to find out why, because it's going to be awesome.

What's up? Everybody's leaving. They're all just going to convene by the open spaces board, right? For those of you that are staying, I want to introduce you to the concept of open spaces. I was first introduced to this at the She's Geeky conference up in Mountain View, and it was super life-changing. To watch 300 people in chaos self-organize and actually come away from that with a lot of going-forward ideas for themselves was amazing.

So, a couple of quick rules and the laws of Open Space Technology. The first is the law of two feet. It's the idea of having one foot of passion and one foot of responsibility. Basically, if you're not learning something or contributing to the discussion that you end up in, you should get up and go to another space. Whoever comes are the right people. Whatever happens is the only thing that could have. Whenever it starts is the right time. And when it's over, it's over. The number one thing is: be prepared to be surprised.

Here's how it works. Chris is passing
out Post-it notes. You will write a topic that you care about, that you're passionate enough about to at least start the conversation. Write it on a Post-it note. Everyone will get one minute to introduce their topic. Then you will go to the board and you will put it in a time slot. Everyone else who wants to talk about that topic with you will come, and they decide which one they're going to go to. You may find that you want to combine topics, which is completely awesome, and I'm really looking forward to seeing what y'all come up with. So with that, grab a Post-it note, write a topic that you care about, and let's do some open space.

Any questions about open spaces? Any questions so far? Who here has done open spaces at a conference before? Cool. Is it awesome? Nice, nice. Very cool. Alright, is there anyone that wants to start, or should I? Okay, I guess... oh, here we go.

Alright, my name is Matt. I want to talk about how you use Kubernetes.

Alright, hey, good afternoon. My name is Kevin. I want to talk about big data, dumb. We want to make big data easy enough to use. It doesn't have to be so complicated. We've got some ways that we do it at Canonical, and I want to share with you guys how we're doing it, and hopefully you want to talk about the same.

Okay, so I'm just going to say it. I want to talk about how configuration management sucks, because it sounded like that was a really interesting thing, and at the very least I'm going to pull Justin in to have some interesting conversations. So I'm going to write it down. But you've got to have something you want to talk about. Hopefully other people have ideas on this.

I want to talk about CDNs for DDoS mitigation. If you've been through that situation and you either used a CDN or didn't use a CDN for that, I would be happy to share some stories and see if you guys have ideas.

Hey, I love configuration management, but I'm really interested in what happens after configuration management: how you manage things like orchestration, and
what happens when you do configuration management on a scale of more than a couple hundred machines.

I have serious concerns about when too much trending data is actually too much trending data when it comes to metrics, and I don't know how to draw that line or where. But I suspect that, you know, software might eat the world after my data does.

Alright, cool. I'm interested in model-driven configuration management, so applying model-driven architecture to configuration.

Anyone else got a topic?

I'm interested in talking about infrastructure-as-code tools like HashiCorp Terraform and how teams collaborate using tools like that.

Alright, anyone else? Okay, so what's going to happen next is all the sticky notes should go up on the board over there, and whoever would like to organize that should totally go over to the board and figure out the schedule. I can come help, but it's more fun when it's super self-organizing, and it actually works out really well. Yeah, we've got until about three to actually begin, so if you guys want to collect all the stickies and actually kind of lay them out in the schedule, that would be fantastic.

It looks like we're starting to kind of come together. What we're going to do is, this corner over here is going to be table one. Sorry for the lack of a table. That corner over there is going to be corner number two, or table number two. This over here, over by the screen, is going to be corner number three, or table number three. And this is going to be table number four. So if you want to go find your group, we can start some unconferencing.

Alright, just a reminder: that corner over there is number three, that corner is number two, that corner is number one, and hey, I'm in corner number four.