All right, we'll go ahead and get started. I'm going to talk about agile infrastructure. It may or may not seem interesting right now. Hopefully by the end of it you'll have questions, maybe not answers, about what agile and infrastructure are and what they have to do with each other.
So I'm going to start by telling you a little bit about myself. I'm Andrew. I was a developer once upon a time. I've been a member of several agile teams. I'm tolerated at the Salt Lake Agile Roundtable quite frequently. I mostly work for startups. I'm the founding partner of a new startup called Reductive Labs. And I'm an all-around troublemaker. The rest is complicated, but that's probably enough to keep going with the story. There's always a duck.
So what is agile? Who knows what agile is? Who thinks software development is a solved problem? So we have this manifesto. The manifesto has four values and 12 principles. And if you really sit and reflect on the people that were at Snowbird and the influence and the impact that they've had on our industry, past, present, and future, then you just have to be amazed at the agile movement. We talked earlier about how maybe agile hasn't crossed the chasm but the word has, and there's still a lot there. There's a lot of benefit that people aren't getting from this meeting of minds and this idea of support.
What is agile? Was it really me? The way I see it, I break it in my own understanding into two camps. You basically have agile practices that help you plan and predict, and then agile practices that are engineering and developer centric. And if you reflect on the manifesto that was signed and the people that were there, most of them were developers or had a background as developers. So it's sort of developer centric. And you've got to have someone who makes decisions about what features you're going to do.
And you might want to make sure that they're working before you deliver the software. But then there's this whole list of other things. And you heard in some talks earlier, it's like the executives and the leaders don't even know what their role is anymore. They're trying to figure this thing out. Are we doing agile, or is agile doing us? We don't really know. And there are all these little tribes. And you can't even start talking about sales guys and marketing. It just breaks down.
So we kind of have this little circle of happiness. And there are lots of good ideas. I know lots of people here have strong ideas and implementable solutions for executives. And there's Ambler's stuff on databases, and these different things, but there's sort of this central thing. And then over here, there's this other stuff, right? We want to be agile, and we talk about being agile everywhere, but we're not always sure how to do it, right? And you can't be too careful with who you let into the circle of happiness.
So if you look at the timeline, when the manifesto was signed, it was sort of the rise of this client-server architecture, right? At that point, you had most software being shipped on CD. You would go to the store, and you'd buy a CD, and you'd install it, and it would work. And at the other end of that, you have all these support costs, because it has to work on every version of Windows. It has to work on all this stuff. And you have to manage all this complexity on the other side of things. And then we've moved more to a server-centric architecture, where now you're managing those servers, and those applications are deployed on those servers. But that creates its own complexity.
So I want to know, just in general, whose company works on web applications? Who makes web applications? So I'd say 60%, 70%, is that fair? So we're kind of moving.
You can still go to Walmart or whatever. You can buy some applications in shrink-wrap. But for the most part, everything is going to be downloaded, right? We're getting to the app store age. Everything is going to be streamed in bits. It might still have an installer, but it's no longer going to be shipped. So there's this chain of delivering value that's shifting. And there are all sorts of transformations happening right now. I don't get into the hype of some of the words too much, but there's definitely a change in the way infrastructure can be deployed. And my personal work is a big part of that. I like to think so.
So who's working on a web app? Where does that web app run? Who takes care of the servers? How do you interact with them? Is "them" people or servers?
So I want to step back and talk about agile practices. As I said at the beginning, there are engineering practices and planning practices. If we step back to what most people consider the developer-centric engineering practices that are agile, in my mind they all start with these two things: version control and building from source. Until you can do that, you can't really do anything else. Does anyone disagree with that? Without them you essentially can't think about TDD. You can't think about continuous integration. This is the first step to being able to embark on the best practices of agile development from the engineering perspective.
So who version controls the configurations for their servers? And who can automatically rebuild systems? There's this project that my company makes called Puppet. What Puppet allows you to do is encode this infrastructure, encode the server configurations, in a declarative way. And code has semantics. If you're a developer and you work on code, this has come up a couple of times. It's a theme. It's the what versus why.
Semantics can give you why. It can be recovered. And as code, it can be reproduced, so you can rebuild those servers. It's more maintainable. It's more extensible. It's shareable. And some people are using it. So this gives you an idea of the type of organizations that can take advantage of this. Anyone can take advantage of it because it is free and open source software. So it allows you, at very low cost, to do some things that you probably aren't thinking about doing today. Potentially.
I know we only have a couple of sysadmins here, so I'll try to explain it a little better because it's a mixed audience. I want this to make sense to the product manager and the executive and everyone else. Say you have a web app, and you have some web servers and some database servers. If you use the traditional approach to managing servers and systems, the chance that those servers are all configured identically gets pretty small, pretty fast. And why? Well, going back to what my hero, Brian Marick, said this morning, at a lot of places the sysadmins are taxed to the limit. They have this backlog of requests they can't quite get to. And if something goes down, there's not really documentation. Oh, there's this package, and there's this configuration, and you try to do it. You might have some scripts, but oh, the version of the operating system changed, or that package was upgraded, and you're not maintaining private repos. There's all this technical stuff that just gets in the way. I think everyone's familiar with the term yak shaving, and you basically end up shaving yaks.
And these inconsistencies cause all sorts of problems. They have real costs. And if you're having any success at all with your application, there are going to be more and more servers.
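To make the declarative idea concrete, here is a minimal Python sketch. It is not Puppet and not how Puppet is implemented; the `Package` class, the `plan_changes` and `apply_changes` helpers, and the package names and versions are all made up for illustration. The point is only that you describe the state you want, and a tool works out what each box needs, so two servers that started out different end up the same.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    """Desired state for one package: its name and the version we want."""
    name: str
    version: str


def plan_changes(desired, installed):
    """Return the actions needed to move `installed` to `desired`.

    `desired` is a list of Package; `installed` maps name -> version.
    """
    actions = []
    for pkg in desired:
        current = installed.get(pkg.name)
        if current is None:
            actions.append(("install", pkg.name, pkg.version))
        elif current != pkg.version:
            actions.append(("upgrade", pkg.name, pkg.version))
    return actions


def apply_changes(actions, installed):
    """Apply planned actions to a copy of the installed-state mapping."""
    result = dict(installed)
    for _verb, name, version in actions:
        result[name] = version
    return result


# Hypothetical description of what a web server should look like.
web_server = [Package("nginx", "1.0"), Package("openssl", "0.9.8")]

box_a = {"nginx": "0.8"}                      # drifted: old nginx, no openssl
box_b = {"nginx": "1.0", "openssl": "0.9.8"}  # already matches the description

for box in (box_a, box_b):
    converged = apply_changes(plan_changes(web_server, box), box)
    # Running the plan again finds nothing left to do.
    assert plan_changes(web_server, converged) == []
```

Because the description says what the box should be rather than which commands to type, running it a second time is harmless, and the same description can rebuild a box from nothing.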
So you're running as a sysadmin from the brightest fire to the brightest fire, and the chances that the server that's working, but isn't configured exactly like the server next to it that is also working, is going to get any attention are slim. It's going to get zero attention. This is a shout out to BrotherDarms.
So I used to work for a company that had a SaaS online e-commerce thing. And if there's an outage, it really impacts the company. Every second you're down is dollars. And it impacts not just the sysadmin who has the pager and is scrambling to figure out what's wrong and get it back up, but also the support guy. It sucks in everyone. The executives are just watching the money go, and it's not fun. And that just makes the problem worse, right? Because now you have another fire, the brightest fire.
So what does that really mean? Well, deployments and upgrades tend to be expensive, tedious, and error-prone in most organizations. And that's because there are a lot of manual steps involved. There's a lot of babysitting. There's a lot of tending. I see the sysadmin shaking his head. Can I get a hallelujah?
So who, as a tester, has ever run into a problem when testing an application because the test servers are not configured like the production servers? Something didn't work, and then you get on the other environment and it works. And it's just great, right? You can spend whole days dealing with that. Who's ever had that situation? Yeah. We're creating servers. Yeah, OK. I mean, there's staging. There are all these things you can put into the workflow to deal with this.
Another thing that happens: if you can't rebuild your servers from source, then if a server goes down, you're in a world of hurt, especially if that's a critical system. So one of the things that we try to think about when we're building systems that are redundant and scalable is that you should be able to take any box and throw it out the window.
If you can recover from that in zero time, or in some amount of time that's allowable based on the criticality of your system, then that's a good test. And when you start thinking about it, OK, we have these unstable servers, and if we touch them, they go down, then it starts to make sense that you need sign-off. You want to slow down and control the pace of change on these systems. And so things like a ticket system that has to be signed off, with this very, very heavyweight change control process, start to seem like a really, really good idea. And yeah, you're having some success, so there are going to be more and more servers to manage.
And now we're going to talk about virtual machines. So we also have this new stuff. It's been around for a while, but it's just really starting to take hold. Some people talk about the cloud, clouds, and they argue about all this stuff. It's just like arguing about Scrum versus XP. It's kind of stupid. But what it allows you to do is take it a step further. Now your infrastructure is even more encoded, potentially, because with an API call you can launch new machines. The previous MO was, if you needed more capacity, you had to get a purchase order and send it off. Then three weeks later you get machines, and then there are guys with CDs, or maybe you're PXE booting or whatever. So now you have some systems, and then you've got to configure them all. And now you have your new servers, but they're not configured just like the servers they were supposed to be identical to because, well, all the problems we already mentioned.
But now you can bring up servers in minutes with API calls. There's EC2, or you can have VMware ESX, Xen, whatever, in-house. And you can set up all sorts of infrastructure so that, in minutes, you can have new machines. And those machines can be used to do things like be your production environment.
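As a rough picture of what "launch machines with an API call" buys you, here is a small Python sketch. `FakeCloud` is an in-memory stand-in invented for this example, not a real provider SDK, and the `STACK` description and `build_environment` helper are equally hypothetical. It only shows the shape of the idea: one description of the stack, and a loop that asks an API for machines instead of a purchase order.

```python
import itertools


class FakeCloud:
    """In-memory stand-in for a provider API (hypothetical, not a real SDK)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.machines = {}

    def launch(self, role, environment):
        """Pretend to start one machine and return its id."""
        machine_id = f"vm-{next(self._ids)}"
        self.machines[machine_id] = {"role": role, "environment": environment}
        return machine_id


# One description of the stack, reused every time an environment is built.
STACK = {"web": 2, "db": 1}


def build_environment(cloud, environment, stack=STACK):
    """Launch every machine the stack calls for and return the new ids."""
    ids = []
    for role, count in stack.items():
        for _ in range(count):
            ids.append(cloud.launch(role, environment))
    return ids


def roles(cloud, ids):
    """Sorted list of roles for the given machine ids."""
    return sorted(cloud.machines[i]["role"] for i in ids)


cloud = FakeCloud()
prod = build_environment(cloud, "production")
test = build_environment(cloud, "test")

# Both environments come from the same description, so their shapes match.
assert roles(cloud, prod) == roles(cloud, test)
```

With a real provider the `launch` call would take minutes rather than microseconds, but that is still a long way from three weeks and a stack of CDs.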
And those could also be test and dev environments. So you could do a lot of interesting things that you can't really do if you're using traditional methods right now. It could also be the development environment. How many people, when they hire a new developer or a new tester or whatever, can give them the environment they need, configured, out of the box? Or how much time does that waste on the front end of bringing a new guy in?
So there are more and more machines. Some people like to think about making images of all the different systems that you have. And so you start out thinking, OK, well, I'm going to have a database. I'm going to have this. And then pretty soon you have like eight different images. And then you've got to deal with change management and all this stuff. And it's just image sprawl all over the place. And I guarantee you it's not as good as what we do, because you don't have any semantics. You can't look inside. Once you save that image, it's 500 megabytes of opacity. You don't know what's in there. You might have made notes that this is the server that does this or that, but at the end of the day, you don't know until you start it up.
And then you've minimized and collapsed some of your hardware problems, because you can run 10 machines on one piece of hardware, but you have 10 times more machines to configure now. So the configuration problems that we were talking about before are getting multiplied. And what are you going to do? You have no idea how to manage those. But that's supposed to be fun.
So infrastructure is code. And we're still talking about the engineering side of things. So my advice: as much as possible, automate everything. And I think there are a lot of strong parallels between this approach to building infrastructure and automated testing on the tester side of things.
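Once a server's configuration is written down as text, the ordinary developer tools apply to it, which is exactly what a saved image cannot offer. Here is a minimal Python sketch using the standard library's `difflib`. The one-line-per-resource file format and the `web.conf@r41` / `web.conf@r42` labels are invented for the example; they are not Puppet syntax or real revision names.

```python
import difflib

# Two revisions of a made-up, one-resource-per-line server description.
before = """\
package nginx 1.0
service nginx running
user deploy
""".splitlines(keepends=True)

after = """\
package nginx 1.2
service nginx running
user deploy
user metrics
""".splitlines(keepends=True)

diff = list(
    difflib.unified_diff(before, after, fromfile="web.conf@r41", tofile="web.conf@r42")
)

# Keep only the added and removed lines, skipping the two file headers.
changed = [
    line.rstrip("\n")
    for line in diff
    if line[0] in "+-" and not line.startswith(("+++", "---"))
]

print("".join(diff))
```

The diff answers "what changed on the web servers between these two revisions" in a few lines, where two 500-megabyte images would only tell you that they are different.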
So it's basically tools that open up new possibilities for how you can do stuff. We talked about the manifesto, and we say that we value individuals and interactions over processes and tools. But who can build agile code without a unit test framework? You need to have these processes and tools to make the individuals and interactions even work. So we want to get more done and spend less time doing it. Hopefully we can put out the fires. And we want to use our humans, who are smart and who we pay a lot of money, to make decisions. And then we want to let frameworks do all the work. That's what we're going for.
So what it allows you to do, and this is why some of the big companies like this approach, is it collapses the aspect of scale. And if you ask my partner, who did most of the work to pioneer this approach, which machines you should use this on, he'll tell you it's just like flossing your teeth. You should only do it on the ones you want to keep.
So what it allows us to do is take advantage of these processes and tools. Because it's code now, you can apply all the knowledge that you guys are growing and spreading and sharing, everything that we use for software development, to build our infrastructure. It's code, so you can diff it, you can put it in version control, you can blame, you can do all the stuff that we do with code, with the same process. And that's going to have some implications for planning that we're going to look at in just a second. But there are going to be more and more servers. I can pretty much guarantee that. And this gives you a way to get a handle on that.
So now I'm going to transition. One of the things we see in our work, when we go into organizations, has a lot of parallels with Agile in general: people can only change if they want to, right?
You can talk all you want about tools and about Agile and about all this stuff. But if someone doesn't want to change, they're not going to. So when you get to the heart of the matter, it's not necessarily a technology problem in a lot of organizations. It's a human engineering problem. It's a social engineering problem. You have to get people to believe. You have to get people to understand and be motivated. And I think that these core principles of the manifesto are the key to doing that. So building communication, building collaboration. And also, because now you have code and you've hopefully put out those brightest fires, you can come out of that firefighting mode and get much more predictable estimation and prioritization of the infrastructure that you need to build.
So who's heard the term non-functional requirements? What does that mean? I always hear people say non-functional requirements, and then I'm like, well, if you don't do that, then it won't work at all, right? So you have to do that, because when it comes right down to it, requirements are requirements. And if you're building a web app, it is the infrastructure. You got rid of the problem of shipping CDs and having to support every single platform you're shipping to, but now you have one platform, and it's yours, and you've got to take care of it. And if you don't, then you don't have an application, and that's the bottom line. Without infrastructure, without keeping this stuff going, there's no app.
And not only that, but say you have a successful app, and people are using it. What happens now? Well, the stuff that worked when you had 10 servers doesn't work when you have 100 servers. You have to analyze how the load is going down into those databases and hitting files, and that requires a deep understanding of both the application and the infrastructure.
And if you don't have that kind of collaboration and meeting of the minds between the dev and the ops, then you're going to have a hard time solving these problems and really scaling this to whatever your aspirations might be.
So we're going to talk about some people. We already talked about "there's only us," and that's a theme that I keep coming back to, going back and talking to Alistair and getting his thoughts on it. But when you think about the different roles and personas that are building applications, the operations guys are kind of outside the developers' circle of happiness in those cases. So there's this inherent problem communicating, right? And going back to what we talked about a minute ago, there's potentially this heavyweight change control process that's been put in place to protect us from ourselves, right? So pretty soon people aren't very happy, because the developers don't understand why the operations guys make the decisions they do. The operations guys carry pagers, and they don't understand why the developers keep writing bad SQL queries. And it's funny because it's true. So they have this heavy process, and they don't really talk to each other, and they enter tickets, and pretty soon people are even more upset about how things are.
And it's a bad scene. This is the point I was trying to make with "there's only us": you can see this sort of wall of confusion between other roles in other cases, and there are ways that you can think about breaking those walls down. One of the things that I found particularly insightful is an essay by Brian Marick, who was here talking earlier, about Boundary Objects. It's basically about creating these things that both sides can see, and he talks about these communities of practice and these communities of interest.
So in this case we have developers, who are a community of practice, and we have operations, who are a community of practice, but together they're a community of interest. And really, you can't deliver the business value of your organization unless you are this unified community of interest. As we already talked about, the app and the infrastructure are intimately tied together, so you need to establish or maintain these. Well, you don't need to, but the alternative is to spend a lot of money and throw a lot of people at the problem, and there's always going to be a fire. So that makes people happy, hopefully.
And that's a great essay. It's very short, but read the Boundary Objects PDF. I think you can take those ideas and apply them to lots of other places in an organization where you have breakdowns of communication, between developers and testers, developers and product managers, developers and executives, whatever. If you can figure out some sort of shared metaphor or some boundary object, I think you can solve a lot of problems.
And with some of our clients and some of the people that we're working with, what happens is the code that describes the infrastructure is this boundary object that allows people on both sides to see things from not necessarily the same perspective, and some conflicts could come up, right, because it allows you to kind of optimize world views. But what it allows them to do is look at the same thing and have the same information about it. And so because of that, in some of these organizations, what their process evolves to is that when developers write an application, instead of just throwing it over the wall of confusion and hoping it lands on the servers and works, the developers are actually responsible for writing the Puppet code that will configure that application, right?
So that's a boundary object that gets passed back and forth, and if the application is not being configured correctly, then that's a bug, and that can go back into the normal flow of software development. And because it's an application, you can put it through the normal cycle of dev, test, and prod, so by the time it gets to your production servers, your test servers are already configured the same way, and you just apply a lot of the same things.
So this is the takeaway on the planning side. You can plan for infrastructure requirements, but be willing and able to change them. Because you can encode them, and you have that flexibility coupled with virtualization, you can do a lot of experiments with your infrastructure that you couldn't do if you needed a purchase order and weeks to get it. You can bring up instances on EC2 and run experiments on the platform you're trying to run, with the applications you're trying to run, and get a lot of feedback. And I think that's another thing that Agile, in my mind, is built on: these types of feedback loops, and being able to feed that back into itself. So be willing to change them, be able to change them.
So I'm stepping through the principles and values a little bit. For operations, in my mind, their customer is the application. Their customer is the developer. So they need to be able to have that same dialogue with the developers, in some sense, as the product managers would have about the application. And if your infrastructure isn't working, nothing is. So if you're talking about shipping working software, that's the baseline, right? You need to have the lights on on the servers, or else you've got nothing. And of course, create a culture of collaboration. That's something that we've been hearing over and over. There's only us, there's only us, there's only us. And together we're going to ship some software.
And then I added this a little bit: we want to take advantage of the processes and tools we have for software development. And I crossed that out and I went to "individuals and interactions," but then after we had our little talk earlier, I decided that it was better to just go to "interactions." So.
So the most important statement from the manifesto, and this is the takeaway. I have this particular work I'm doing. It's interesting to me, I'm passionate about it, and I want to help people, and I feel like I am. But you guys maybe aren't in that role to see this. So I want to give you a takeaway that's sort of generic, and you guys can do whatever you need with it, whatever you want with it. The most important statement might not be the values, and it might not be the principles. The way that the manifesto starts, in my mind, is maybe the most important statement: we're uncovering better ways of developing software by doing it and helping others to do it. I'm happy with it.
Does this stuff you're working on help at all with the age-old issue of data-centric applications, you know, automatically converting existing data into new data structures and that kind of thing, or is this completely infrastructural?
So the question is, will this help you with the age-old problem of essentially mapping data from some older data structure into some new data structure that can be consumed? And the answer is no. It's not really about that. What Puppet provides is this abstraction layer over the resources that a system administrator would use, and those resources are abstractions for users, abstractions for groups, abstractions for packages, abstractions for mount points.
All the things that a sysadmin thinks about in his head, where he has to remember, on one platform I type this and on the other platform I type this, so it's RPM over here and then it's apt over here, and keep all this stuff straight in his head. Those are the things we're abstracting. It's not going to help you map data. It can help you move files around, but it's not going to help you transform data.
The meat cloud. Oh, so he asked, what is the meat cloud? And that's a loaded question. I wrote a blog post a while ago, over a year ago now, talking about just these different terms. We went to dinner last night with some of the speakers, and something Jeff Patton said kind of stuck in my head for a while, and that is: nothing exists until you have a word to describe it. He was talking about Lean and Kanban and how you do all this stuff, and until you reframe it with these other words, you don't necessarily understand it. It doesn't exist for you. And so from there, I was watching all the stuff that's happening with clouds, and meat cloud is basically the term that I came up with for throwing bodies at the problem. You have this line of servers, you need to scale up, and you just keep buying more guys, right? Instead of figuring out how to do it better, you buy more guys. That's the way I've been using it a lot. In a generic sense, I think of the meat cloud as all the people doing tedious, anonymizing work, right? So anytime you have people doing something that could be automated and probably should be automated, that's what we call it. Carl.
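The answer to the data-mapping question described resources as an abstraction over "on one platform I type this, and on the other platform I type this." Here is a minimal Python sketch of that one idea for packages. It is not Puppet's implementation; the `PROVIDERS` table and `install_command` helper are made up for illustration, and the only real things in it are the `apt-get install -y` and `yum install -y` commands themselves.

```python
# Illustrative mapping from a platform family to its package-install command.
PROVIDERS = {
    "debian": ["apt-get", "install", "-y"],
    "redhat": ["yum", "install", "-y"],
}


def install_command(platform, package):
    """Build the command that installs `package` on the given platform family."""
    try:
        base = PROVIDERS[platform]
    except KeyError:
        raise ValueError(f"no package provider for platform {platform!r}") from None
    return base + [package]


# The caller says what it wants; the lookup decides how to say it per platform.
for platform in ("debian", "redhat"):
    print(platform, "->", " ".join(install_command(platform, "nginx")))
```

The caller only ever says "I want this package," and the per-platform detail lives in one table instead of in the sysadmin's head.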
Yeah, this is more of a comment than a question. When you were saying that you think tools are still necessary even though people are important according to the manifesto and all that, I think a pretty good argument could be made from the fact that the tools you're talking about are not necessarily individual tools. They're more like techniques, patterns that are applied in a lot of different ways. There's more than one kind of version of Puppet out there that we use. And so I think there's a lot of value when you're saying this is more than just tools that you're talking about, these techniques, these patterns.
I totally agree. So the comment is that it's more than just tools, it's about techniques and patterns. And for example, there are other projects that try to do something similar to Puppet, and it's all about what your philosophy is and what side of certain decisions you want to be on. And then there's also this: Puppet is only supported right now on Unix-based platforms. So if you have Windows, there's not a lot of this stuff yet. It will come eventually, because it's becoming apparent, at least to me, that this is a superior way to manage systems. But that's why you owe it. And so I think that those gaps will get filled in on Windows, but right now, if you're managing Windows infrastructure, you have other tools, but you don't have this particular approach to solving the problem.
It's interesting, you look at the continuum of Ant, to Maven, into something like Puppet, where you're doing this at larger and larger scales.
Well, so that brings up an interesting point. So the question is about automation, and he brought up Ant, which is a build tool if you're doing Java stuff, as I'm sure you know, and then Maven, which is a higher-level abstraction to try to do some of the same thing.
And then Puppet. I agree with you on some level, but I think they're sort of orthogonal. They're solving different problems, where one is building the application code and one is building the infrastructure code. And maybe at some point they will weave together. I think that might be coming faster than we realize, but for now, I think there's some orthogonality.
I know a lot of what you're saying, so I really love the presentation. One other factor that comes into play for us is that we introduced Agile and we're delivering much faster. So now we're delivering releases, well, with multiple teams, potentially daily, to operations. And to watch operations go through this, you can see their heads exploding.
So that's a good point. That's something that I didn't explore as fully as I maybe could have. There are a couple of approaches to this, and there's a spectrum. I think it has to do with what your application is, how critical it is, and what your application is delivering. I think it's IMVU, there's an article that came out about this Lean startup stuff. Has anyone been reading that? Does anyone know? So I'll explain what they did. They decided that they're just going to get rid of testers, basically. They automated it to the point where when you commit code to version control and it passes continuous integration, it's automatically deployed. So they deploy 50 times a day. Now, that works for them. When I first heard it, you saw this backlash from the testing community. But if you really analyze their business and what they're delivering, they're delivering avatars for chat, OK? And that's great. They built a business and they made some money. But at the end of the day, the worst case scenario, if you have a bad deploy, is someone can't chat for 20 minutes, or however long it takes you to fix that, get it back, revert it, roll it forward, or whatever.
So if you try to apply that same mentality to a life-critical system or something, well, I think there's a middle ground. There's a spectrum. And I think it goes back to design, too. There's an analogy in design, when people talk about "you're not gonna need it" and doing it at the last responsible moment. Well, in some cases, for some of these applications, the last responsible moment was six months ago, right? So when you start talking about, OK, well, we're not going to test, the question is: what's the responsible level of testing for the critical nature of your app? And if you're doing chat avatars, you might be done, right?
So we could take a break. I can switch hats. Welcome to Agile Roots. No, we're scheduled for a break in four minutes, and there was five minutes of slush time, so we're basically right on time. And there's a half an hour break. Then we have some tutorials. It should be good. Everyone's invited to the social. There's going to be food and there's going to be conversation. The format is going to be ask the experts, and that means anyone who wants to be an expert can be. So you can take a table and say, I want to talk about this. It's not quite open space or facilitated, but if you want to have a conversation about a certain thing, just put up a little sign and we'll talk about that. And we're gonna mix it around and leave by then tonight. We'll play werewolf. Any questions? All right, have a break and then we'll see you guys back in four minutes.