All right, my clock tells me it's 2:15, so we'll get this last-slot session on the last day of DrupalCon started. Thanks, everybody, for coming. My name's Sam Boyer. I've been a Drupal developer for, I don't even know, five years, something like that. I maintain Panels and CTools with Earl Miles, and I'm also your friendly Git administrator on Drupal.org — all your pushes belong to me. And recently — what's that? Oh, speak up. All right, I'll move that around a little bit. More recently, I have been shifting my focus further and further down the stack of the Drupal world, towards the DevOps side of things, which is what we're here to talk about today.

So, who's heard of continuous delivery? Not continuous integration, but continuous delivery. Good, show of hands. All right, who's actually taken serious steps towards implementing continuous delivery? This is good, because this is about what I was expecting. Yeah, we in this community are generally somewhat unaware of a lot of the best practices around software delivery; we tend to kind of muscle things out. So the goal of this talk is to give a general overview of some of the principles of continuous delivery. I'm going to try to mix the forest and the trees — the high-level concepts as well as some practical applications — to give you a sense of how to apply these ideas. Because the first thing to say is: it's really, really hard to do full-on continuous delivery. It's a major commitment, and you can get a lot of value out of it even if you only do parts of it. But still, it takes a lot.

All right, so DevOps. And I'm going to steal a quote here: at its heart, DevOps focuses on enabling rapid, reliable releases to users through stronger collaboration between everybody involved in the solution delivery lifecycle.
One of the important results of this collaboration is the application of agile practices to the work of operations — in particular, aggressive automation of the build, deploy, test, and release processes for hardware and system environments. So let me just highlight those words: build, deploy, test, release. Because really, that's the key thing, along with aggressive automation. Jez Humble is the author of the book Continuous Delivery; he works for ThoughtWorks. I'll steal more things from him later in the presentation, but he and ThoughtWorks are really the thought leaders in this entire space. And this is a really important overall concept — I'm starting at the concept level here. The idea is that if we can aggressively, seriously automate most of the steps between when your code gets committed and when it's actually deployed, everything works better. And realistically, those processes tend to be build, deploy, test, and release.

So, as I mentioned, we are generally not that knowledgeable in this community about these kinds of practices, and I have a proposition for how I see it. If you're not familiar with the website Clients from Hell, become familiar with the website Clients from Hell, because your life will be better for it. But yeah — "Client, can you just get rid of this here? The 'Lorem Ipsum'?" "Yeah, just fill it with dummy text that doesn't mean anything." In this proposition, we are all a little bit like the client, because we know the end goal that we want — we want that dummy text that doesn't mean anything — but we are kind of vacuously unaware of many of the best practices out there. Apart from just insulting everyone in the room, there's an actual point to that, though. It's not just about saying that we're not really aware of how many of these things are supposed to work; there are really useful crutches that we rely on, and I want to talk about those.
So when we think about what our sites actually are — a Drupal site — we tend to think of it as "my site": the production version of it, with the set of features and modules installed on it that's there right now. In reality, though, no production site is just that. Well, OK, unless you're making changes directly on production — in which case, yeah. So unless you're making changes on production, any production site is much better understood as the endpoint of a process that involves a lot of different environments, with the code, the database, and everything else in a ton of different states. You've got development instances, then your staging instances, maybe some test instances. Whatever it is, we are relying on a crutch when we think of a site as being just one version.

And there are differences across these various instances — simple things like, for example, you probably don't want to send emails to your users from your dev instances, and you don't want to use your production memcache boxes from your test instance. When we think of a site as just that single end production thing and don't consider all of the other pieces, we're doing ourselves a disservice, because it's not magic by which code gets from one place to another in this process. It's a deliberate process, and if we haven't been really careful and planful about how that process works, it's a difficult one — getting code from development out onto production. The reason I say it's a useful crutch is that I don't actually want to think about all those things. When I'm thinking about making new changes and updates to my site, I would prefer to think of it as a single, simple, unified system, where I don't have to think about the process of getting code out. That's true for me as an engineer, and it's especially true for project managers, who don't necessarily understand the rest of the process.
But by allowing ourselves to think of the end site as this simple, contained little box that's not at the end of a long process, we make it much easier to be smart about the new features we're trying to plan. So really, DevOps is about those steps that get the code from development all the way through the stages and onto your final production site. And it's important because we need DevOps in order to deliver on the promise we make to ourselves. Every time we rely on the crutch that says the end site is all that really matters — not everything in between — we're promising ourselves that it's not that hard to get new code changes out. And if it is that hard to get new code changes out, then we've skewed our own expectations. We say, "Yeah, sure, we can get this out next week," because we're in the mindset of "well, it's not that complicated" — and then it actually comes time to do it, and it's much more difficult.

So there's a distinction they make in continuous delivery, which is basically what I'm saying: the notion of something being "dev complete" — your developer said it's done, maybe it even passed some tests — versus it actually being live. Or, my preferred way of saying it: done versus done done. The most dangerous place for code to be is somewhere between done (dev complete) and done done (actually deployed). Code that's in between these two places — we don't really know if it works yet. I mean, we think it works. All of our tests passed. We're super smart, so obviously we're not going to do anything wrong. But at the end of the day, when it's between done and done done, it's still in our minds as developers. You still have to think about that code, because you're not totally sure it's actually going to work. That's taking up brain space — brain space that could be dedicated to working on the new feature you actually should be iterating on.
So there's this cognitive weight that's occupied by code in between. Additionally, once you've actually sent a feature off and it's passed your basic tests, you want to be able to move on to the next thing. If you have to interrupt your day to work on something that came back — or even worse, two weeks later, because the code has been lying dormant, not actually deployed, and you just figured out that something went wrong — you have to go through the process of recreating everything that went into generating that feature in the first place, which is not good.

Continuous delivery's answer to this basic problem is that we should construct pipelines: clear systems by which code moves through each stage from development to production, with those steps locked into a structured and semi-automated process. Here I am stealing from Jez Humble again. The idea is that whenever new code is written, we start at the commit stage. In the commit stage, we might run some very simple analyses — we look at the code to ensure you can do things like run CodeSniffer and show that it passes PHP lint, whatever. Very brief, immediate tests. After that, there's a series of other tests to go through: acceptance testing, which can encompass a more exhaustive test suite, automated acceptance testing — well, a lot of different things. But the goal of the commit stage is to get immediate feedback to your developer saying, yes, there is nothing flagrantly wrong with this code. It should take no more than 10 minutes. No more than 10 minutes — I'll come back to that. The acceptance stage can take a longer period of time. But after you've passed the large, full test suite in your acceptance stage, at that point you can pass along to — in the lower right there — UAT, user acceptance testing.
You might have some performance testing or capacity testing you're doing. But at this point, if you've passed acceptance testing, that code is ready to deploy — and in fact, it should be deployed, if at all possible. Which, if you're thinking about that, is like: wait, I should deploy simply because the code is there, as opposed to because we're ready to put out some new feature? Yes. Key concept, maybe number one. The whole idea is that if you are deploying constantly, then there's really not a lot of risk in deployments. I can't think of any other time in my life as a coder when I am more stressed than when I am engaging in some complicated deployment. So taking the risk and the stress out of deployments is really worth doing. If we automate this whole process — or at the very least clearly identify all the things we want to accomplish between having code committed on the one hand and deploying it to production on the other — there is much less gray area, far fewer dark corners that code can sit in for a week or two until you realize: oh wait, it didn't pass this test, I have to come back to it and disrupt my entire schedule.

So the notion of the deployment pipeline is that for any given project, we articulate these different stages that we want to have, and we move code through them in a structured way. (I'm sorry, Mike — no, it just fritzed up on me for a second.) Now, fortunately — and this looks much less pretty because it's actually real — there is actually a lot of software out there to help with this. This is a screenshot of the continuous delivery pipeline plugin for Jenkins; there is in fact a Jenkins plugin that can capture this. If you look at what's being shown up here on the top, these are the basic steps a project goes through: a fast test, followed by some acceptance tests run at the same time as a second set of acceptance tests.
Then you run a backup of production, and at that point you can do user acceptance tests, and eventually you have a deployment at the bottom. It shows the state of an individual revision coming through the pipeline. So imagine we have the archetype on top, and on the bottom a new revision has been pushed into the codebase. It's triggered the initial tests — the fast test passed in 0.18 seconds, the developer got their immediate feedback — and then it started kicking off builds and running the additional tests. The key, though, is that it's not just about running tests; it's also about letting users interact with the system, which is why we have this automatic-versus-manual approval stage. We can teach computers to do certain things, but there are also plenty of things where we need to bring another person into the process. If you actually have a QA team, that's great. If it's just the person sitting next to you, that's also okay. The key thing is identifying that we actually want to have that step, and then having a tool like Jenkins help us capture the whole workflow: code moves through, we've done our automated tests, now we know it's time for humans to come in and do their work, and they can actually come in and check something off that says yes, it's good, and it can pass along to the next stage. Eventually you get to the point where it's passed all tests, human and computer, and with a single button you can deploy to production.

So, one last high-level piece before I move into the more concrete, Drupal-related aspects of this. This is usually the weirdest thing for people when they hear about continuous delivery: continuous delivery's conventional wisdom says you don't have anything other than mainline. You don't use feature branches. There are reasons for this. Divergence is risky.
And this is one of those things that — I mean, back when I was advocating for Git in Drupal, the pitch was: you can have feature branches, you can toss feature branches off, you can work on something and throw it away, and that sounds great and fine. But if you think about it, you're intentionally creating a divergence in that situation. Whenever you're on a project working with a team and you've created a divergence, every additional commit you make means a much higher likelihood that there'll be some conflict — some system-level conflict — with what has been worked on in mainline. And that's something you'll have to resolve later on, which again comes down to one of those instances where I've worked on some code, and here we are two weeks later when I try to merge it, and I've forgotten everything that I've done, and there's a conflict, and I have to resolve it then. When you switch off of your main conversation space — essentially, branches are conversation spaces — you run the risk of having to do nasty, nasty reintegration later.

It also complicates that whole deployment pipeline view. If we go back to here, you'd have to have a whole other one of these for each of your branches, if you're going to have different branches. And if you think about it, if the whole notion of a feature branch is that it isn't going to be deployable until it's eventually merged in, then half of this pipeline doesn't make sense for it. You would never be deploying your feature branch to production; that's not the purpose of a feature branch.

So, some drawbacks — I am not completely convinced that you should never branch, as I'll say. But one of the ways you get around the objection — "gee, maybe there are features where we ought to do this deployment even though the feature isn't necessarily ready to be shown to people yet" — is this:
You use either feature toggles or branching by abstraction. Feature toggles essentially mean writing a layer into your code where whatever user-facing feature you're working on is attached to some very simple switch — for Drupal, a variable_set() or something like that, a global config variable that says turn this on or turn this off. If you do that, then you can have your code out there, you can deploy it safely, and users will never see it until you're actually ready to turn it on for them. One of Facebook's engineers has been quoted as saying that their next six months of code is running on production right now — the next six months of features are already deployed. They use feature toggles to mask that from the users. (They actually use a lot more than just that.) The other option is branching by abstraction, which is the notion that in your architecture itself you create a couple of abstraction layers; it goes hand in hand with feature toggles. Generally you create some abstraction layers, and feature toggles choose which of the layers you ultimately select and work through. These are ways to let you work on and commit code while still staying in the very same tight, consolidated conversation space — so you're not creating those divergences — while at the same time not showing the features to users. Either way, the overall goal is that we are deploying as often as possible, and the fact that we are deploying has little to nothing to do with whether a given feature is being turned on today. Being able to separate those two things is great if you can.

So yeah, ponder on this, really. How different would it be if, in your own organization, you did not use branches? What changes would you actually have to make? And given that this is the suggestion of many of the continuous delivery gurus, consider the gravity of what they're saying.
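To make the toggle idea concrete in Drupal terms — this is a minimal sketch, and the module and variable names are entirely hypothetical: the module's code guards the new feature behind a variable lookup, so deploying the code and exposing the feature become two separate acts.

```shell
# Hypothetical toggle: suppose mymodule checks
#   variable_get('mymodule_new_checkout', FALSE)
# before exposing its new checkout flow, so the code can ship "dark".
# Turning the feature on for users is then a one-liner, decoupled from deploys:
drush vset mymodule_new_checkout 1
# ...and rolling it back is equally instant, with no redeploy:
drush vset mymodule_new_checkout 0
```

The point being that the risky event (deploying code) and the visible event (enabling the feature) no longer have to happen at the same moment.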
These continuous delivery gurus are really saying that in order for us to have good, tight, effective software development processes, we need to keep our conversations very focused and contained — and letting people wander off into their own feature-branch domain is not necessarily the best idea.

Okay. So let's talk about actually doing this with Drupal, because some of the pipelines I showed already, and certainly some of the ideas put out by Jez Humble and his company ThoughtWorks — I would say they tend to be oriented towards larger, more static teams and more static projects. As a result, many of these things aren't that directly applicable to Drupal and the kind of context we tend to work in. It might also just be that a lot of us come from — actually, show of hands: how many of you started with Drupal as a hobbyist? That's at least a third of the people in the room. In a situation where a third of the people started as hobbyists, the idea of a deployment pipeline for a hobbyist is insane. So it's entirely reasonable that you have to go through some steps and change some thinking, but since so many of us come out of that kind of mentality, we often slide into much simpler workflows.

So, talking about actually doing this and starting to achieve continuous delivery with Drupal. First of all: Jenkins in the middle, always. Everybody needs Jenkins, period, end of story. Set one up, like, today. If you're not already running a Jenkins instance for your company or your small organization or whatever, get one set up, and start by moving every DevOps-type process that you have into it. Things that are cron jobs — move those into Jenkins. You'll be able to add more things later, but at minimum, pull in the cron jobs that run important parts of your platform — I don't know, like pulling feeds from an external location that some website you run needs. Sure, something like that.
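As a sketch of the kind of move he's describing — the feed-puller script here is made up purely as an example — a hidden crontab entry becomes a visible, logged Jenkins job:

```shell
# Before: a crontab entry nobody can see, on one box somewhere:
#   */15 * * * * /usr/local/bin/pull-feeds.sh >/dev/null 2>&1
#
# After: a Jenkins free-style job with the schedule "H/15 * * * *",
# whose "Execute shell" build step is simply:
/usr/local/bin/pull-feeds.sh
# Jenkins now keeps every run's console log, shows pass/fail on the
# dashboard, and can notify you the moment the job starts breaking.
```

Nothing about the work itself changes; what changes is that the job's history and failures become visible to the whole team.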
Even moving those things into Jenkins helps you have a single dashboard where you get the overall state of much of your platform; it functions as a status dashboard for your whole company or project. Set up notifications like crazy. You can do a lot with this: it's got a built-in IRC bot which can spew things out; there's a Hubot plugin for it (Hubot is GitHub's super sexy Node.js bot); you can connect it with Jabber; and you can have it send scads and scads and scads of emails. Either way, what you want to do with notifications is create a rich information environment for your developers — and for other people as well — that says: this is exactly what the state of our code is. Because again, it's that gray space between done and done done where you're at risk. The more you can do to clearly communicate the state that transitionary code is in, the more you're moving towards a continuous delivery mindset.

There are also plenty of plugins you can play around with. jenkins-php.org lists a lot of them; that site's maintained by Sebastian Bergmann, who is the author of PHPUnit and basically every other useful PHP code analysis tool out there. Not all of these are necessarily that useful for Drupal, but still. There's — what is it called? — phploc, PHP lines of code. It does some interesting analysis, like, for example, the average number of lines of code in your classes, things like that. Even information that isn't directly, strategically useful to some overall goal you've defined can be handy to have, if only because we like looking at graphs — and like I said, a rich information environment helps people feel connected and tuned in to the process of producing the code and moving it through the overall pipeline.

Right, so back at the very beginning I highlighted some of those words in that quote from Jez Humble.
And really, what it's about is building, testing, and deploying — these are the three things you do over and over again, in one form or another, when you're working towards continuous delivery. Everybody's process is going to look a little bit different, but it's going to be these pieces over and over again.

So, building is about taking what's in version control — and oh yeah, if you're not using version control: you can't even have this conversation if you're not using version control. It takes what's in version control and creates some kind of deployable package, something you can send out to an environment where it's actually seen. It also ensures that the deployment target exists. (Oh, I'm sorry, I had that phrase better before and I tweaked it. Forget it.) Ensuring a deployment target exists: the whole pipeline thing is about having lots of environments — environments we develop in, environments we test in, environments we do this, that, and the other in. You need to make sure that environment is actually available. That might mean adding a new vhost or something like that, so that when you drop code into place it's actually web-accessible, it can be seen. It could mean a lot more, as I'll talk about later when we get into the more complex things you can do. If you Google around and look at various continuous delivery setups, you'll often see something listed as a "build step." Usually what's happening there is things like generating binaries, which is obviously not a concern for Drupal — but it is more pertinent if you're working on a distro, since the thing you're pushing into version control is your makefile and not the entire built site. So you need your CI server to go from that makefile and actually build out a deployable package.
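A sketch of what that distro build step might look like as a Jenkins shell step — the makefile name and paths are assumptions, and `$BUILD_NUMBER` is the variable Jenkins sets for each run:

```shell
# Turn the committed makefile into a full site tree, then package it so
# every later pipeline stage deploys the exact same immutable artifact.
drush make mysite.make build-$BUILD_NUMBER
tar -czf mysite-build-$BUILD_NUMBER.tar.gz build-$BUILD_NUMBER
```

Archiving the tarball as a build artifact means "what exactly did we deploy?" always has an answer.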
And of course, local environments need building too, and this is actually one of the trickiest areas with Drupal: the difference between our local environments and the deployment target environments — however we construct production. Are we, like, cloning from Git onto production? That's really not a good idea. Are we using Capistrano or something like that to move code out? One of the ways you can help with this is to use Phing or Ant style tasks, or Rake or pake or fake style tasks — there are sort of two families there. Let me do a quick show of hands, actually, because I'm curious. Heard of Phing? Ant? Rake? How about pake or fake? Yeah, that's what I thought. So Phing is a build system in PHP which is basically structured after Ant — it uses the same type of XML-based build files and everything. Rake, of course, is a Ruby make system. pake and fake are both styled after Rake, but they're in PHP. So there's a fair bit of — you will quickly find yourself wandering into Ruby territory as you start working with continuous delivery.

The other part of building, of course, is that you have to figure out where you're going to get your canonical data sources from. Since, as we all know, any Drupal site is basically database plus code, we have to figure out: where does the data come from? If I'm going to build a test environment, which database dump do I use? If I'm going to refresh my local environment, am I going to pull that down from production? Can we do some things with Deploy? It's something that has to be sorted out based on your very specific situation. But the big thing to do when you're figuring it out is to ponder on, and document, the when, how, and why for generating your database dumps — your artifacts.
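What a "generate a sanitized dump" job boils down to can be sketched in a few lines of shell. The two rows here are fabricated purely to show the transform; a real job would start from mysqldump output, and the file paths are illustrative:

```shell
# Fake two rows of a users table, standing in for real mysqldump output.
cat > /tmp/prod.sql <<'EOF'
INSERT INTO users VALUES (1,'alice','alice@real-company.com');
INSERT INTO users VALUES (2,'bob','bob@real-company.com');
EOF

# Scrub every email address before the dump ever leaves production, so a
# dev or test environment can never accidentally mail a real user.
sed -E 's/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[a-z]+/dev@example.com/g' \
    /tmp/prod.sql > /tmp/sanitized.sql
```

The same idea extends to blanking password hashes, session tables, and so on — the point is that sanitization is a documented, automated step, not something someone remembers to do by hand.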
For example, on Drupal.org we have daily jobs which create specialized, sanitized dumps — there are even a few different versions of the sanitized dumps that serve different purposes. Those are generated, and then we use them for a bunch of different tasks. Jenkins can and should produce these artifacts for you, but as part of figuring out what your build pipeline looks like, you need to actually tell Jenkins to do it. Like I said, Jenkins can do basically everything for you. So when in doubt, put it in Jenkins.

And then here's a little tip: who's heard of Stage File Proxy? Yeah, all right, super cool. I've had some problems with it, but the principle is not hard to understand. Install Stage File Proxy on a site and tell it the origin site it should download from, and every time you request a file from that site, it will go grab the file from the original one — which means no more copying files directories around, no more having to have, like, four gigs of space for every instance. It just grabs the files you need, on demand, from production. Basically an awesome idea, and it makes the whole problem of moving files around basically irrelevant. There are some limitations to it, and it needs some love — it hasn't moved much recently — but it's very good.

All right, so you also have deploying, which I kind of lump together with releasing. The idea behind a deploy is that you're getting the latest code, database, everything, out to some existing environment that's already there. And you need to be using this, at the very least, for whatever staging environment you've set up. That last stage before you actually deploy to production — you definitely need to be using your deploy process for it, because this is the thing that tests whether your deployment to production will function properly. So Features, exportables, and maybe Deploy are your friends.
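Back on Stage File Proxy for a second — setting it up is basically two Drush commands on the non-production copy. The origin URL here is a placeholder, and `stage_file_proxy_origin` is, to the best of my knowledge, the variable the module reads:

```shell
# On the dev/test copy of the site — never on production itself:
drush en -y stage_file_proxy
# Point it at the live site it should fetch missing files from on demand.
drush vset stage_file_proxy_origin "http://www.example.com"
```

After that, any request for a file that isn't present locally gets fetched from the origin and cached.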
Features and exportables obviously help you with expressing things in code, which is much easier to send out since it's in version control. Deploy is a much more complicated question. Again, I'm curious: has anybody actually used Deploy to move content — not code — say from dev to an integration server, or from a content staging server to a production server, and then for some reason move it further along? Yeah? Okay, interesting. It's not something that I've tried. But ultimately, when you're actually running the deployment, Jenkins is again your central piece, and you're going to ask it to kick off Capistrano or some similar system — maybe Drush Deploy. Is Sonnabaum in here? No? All right, don't use Drush Deploy then; never mind, it's a terrible idea. Drush Deploy is a Drush extension that Mark Sonnabaum wrote which is basically similar to Capistrano, but tailored for PHP and for Drupal.

And then you have your test steps. This is where a lot of us get tripped up. There are lots of different philosophies and approaches to testing, and I'm clearly not going to delve way in and explore all of them here — test-driven development, behavior-driven development, either way, pick your approach. There are a lot of factors that go into which is more appropriate for your particular case, but the essential tools you're looking at are the built-in SimpleTest that we have in Drupal (which has its problems) or PHPUnit, and then Selenium to do your web interaction tests — the basic ones most of us have heard of. The limitations of SimpleTest are many. Generally speaking, when I'm writing a set of tests for, like, a client site, I don't put those in SimpleTest; I write external PHPUnit tests and run those. There's also much better PHPUnit integration with Jenkins, so it can read PHPUnit's output and give you nice feedback as things proceed through your pipeline.
Some tools you may not have heard of, though: Capybara and Cucumber, and then Behat and Mink. On the left there you have your Ruby tools, and on the right you've got the PHP versions. Capybara and Cucumber are very, very well accepted. Behat and Mink are newer projects that I have not spent much time with — I've read the code and that's it; I haven't actually run them — but they serve exactly the same purpose. Capybara is kind of a test runner, and Cucumber is a behavior-driven development framework.

Who's heard of behavior-driven development? Wow, okay. I don't have a super good summary of it, but: whereas test-driven development is more oriented towards a developer's view, an engineer's view, of what problem is being solved, behavior-driven development says let's start with the behavior that the user is going to enact, define all of the conditions and the different ways that things can go, and then have our tests conform to the user's behavior. Really, really useful if you are using agile methodologies, clearly, which are very geared towards user stories and everything else — behavior-driven development goes hand in hand with a user-story-based approach to planning your work. A simple example — in fact, many of the canonical examples are like this — is a user going to an ATM. In behavior-driven development you would go through and say things like: all right, given that the person has enough money in their account, and given that they have the right PIN code, and this and that and the other. By laying out all the given-thats and therefores, you clearly lay out all the different paths of possibility, all the different routes that your code ultimately has to support. But you express that in a clear, narrative language which is understandable to engineers, but also to folks who are not at all part of your engineering team. So it creates a platform for dialogue about what your tests should actually be testing for.
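In Cucumber (and Behat), that narrative style is written in the Gherkin format. A sketch of the ATM example described above might look like this — every name and amount here is illustrative, not from any real suite:

```gherkin
Feature: Cash withdrawal
  Scenario: Account holder withdraws cash
    Given the user has $100 in their account
    And the user has entered a valid PIN
    When the user requests $20
    Then the ATM should dispense $20
    And the account balance should be $80
```

Each Given/When/Then line is backed by a small step definition in code, but the scenario itself stays readable by anyone on the project.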
That shared language, again, makes the whole process better: we're trying to involve folks who are not just engineers — acceptance folks — and if they can look at the same tests and understand what they're testing, it makes communication a lot better. So Cucumber is a framework for creating behavior-driven development tests; Behat and Mink are the PHP versions, while Capybara and Cucumber are both Ruby.

What can get tricky with tests, though, is time and scheduling — that's a really big deal. Again, the whole goal here is a humming, happy project system where code goes in, moves clearly through a series of steps, and then ultimately gets deployed as quickly as humanly possible. Well, if you've got a test suite which takes, like, three hours to run, then you're going to have a problem — because if every commit triggers the test suite, then what, do you have to wait another three hours to commit and send something up? It immediately throws sand in the gears. So when thinking about how to create your pipeline, this is the importance of — they're called a few different things, but I like to call them smoke tests. Quick, immediate tests which are run first; the rule of thumb is no more than 10 minutes, and preferably less than five. Quick tests that ensure nothing is actually on fire — hence, smoke tests. And that lets your developers move on to something else while the full-on test suite is running. I'm preemptively answering this question because it's the one that always gets asked: the way a test suite should operate is that you go through the smoke tests, and the smoke tests say, great, okay, this is functioning, Mr. Developer, you go on, continue what you're doing. The commit then continues on through the chain, and maybe goes into a queue for the three-hour-long suite of tests. By the time those tests are finished and everything's passed, you've got, like, 10 more commits that have come in, queued up behind it.
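A commit-stage smoke gate can be as simple as a shell wrapper that runs each quick check and reports it. This is only a sketch: the two checks are stubbed out with `true` as placeholders for real commands like `php -l` or `phpcs`, which a real job would substitute in.

```shell
#!/bin/sh
# Minimal smoke-test gate: run each fast check, keep going so the
# developer sees every failure at once, and flag the build via $fail.
fail=0
run_check () {
  desc=$1; shift
  if "$@" >/dev/null 2>&1; then
    echo "ok   - $desc"
  else
    echo "FAIL - $desc"
    fail=1
  fi
}
run_check "PHP syntax clean"  true   # e.g. find . -name '*.php' | xargs -n1 php -l
run_check "coding standards"  true   # e.g. phpcs --standard=Drupal sites/all/modules/custom
# A real Jenkins job would finish with: exit $fail
```

The whole gate has to stay inside that ten-minute budget, which is exactly why the heavy suites live in a later stage.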
The way that the system ought to operate, and does when it's functioning properly, is: all right, we finished with the one commit we were testing, now let's go back, grab all 10 of these, and test them all together. There's basically no way around it. But it tends to be a good enough solution, and if that suite of tests breaks, you don't exactly know which of those 10 commits broke it. But if you're iterating rapidly and everybody is on the same page and communicating well, because we're all in the same branch and everything like that, it's really not very hard to figure out which of those 10 commits was the problem. So, more specific now. Useful advice for any project, regardless of its size, since DevOps tends to change a lot. But actually, let's not even say regardless of its size: your simple Drupal projects. Not talking about distributions, not talking about something complex with multiple servers or anything like that, just a basic Drupal site. The first thing that you wanna do is, of course, decide if you want to use branches. I generally don't recommend that people try to go with the single-branch route in Drupaldom. For one thing, the alternative options that allow you to keep features out, feature toggles, are a hard thing to do in Drupal in general; the system is not very amenable to it. So, decide if you wanna use branches or not. Actually, physically sketch out your pipeline; lay out all the different stages that things are gonna pass through. What's your commit stage? What are your initial smoke tests? What's the larger test suite that's being run? Who is doing the user acceptance tests, what are those, and where are they getting that information from? At what point can we deploy this, and who is responsible for actually clicking the button once all of the indicators have gone green and Jenkins says this commit is ready to go?
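One of those early smoke-test stages might be sketched, very roughly, like this in shell. The check commands are placeholders; in a real pipeline they would be your fastest Behat or SimpleTest subset, a bootstrap check, and so on:

```shell
# Hypothetical smoke-test gate: run a handful of fast checks and
# fail the build the moment one breaks. The whole gate should stay
# under ten minutes, preferably under five.
smoke_gate() {
  local start check
  start=$(date +%s)
  for check in "$@"; do
    # Each argument is a command to run; any failure stops the gate.
    if ! eval "$check" > /dev/null 2>&1; then
      echo "SMOKE FAIL: $check"
      return 1
    fi
  done
  echo "SMOKE PASS in $(( $(date +%s) - start ))s"
}
```

Jenkins (or any CI server) would run this as the first stage, letting the developer move on while the slow suite queues up behind it.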
So, doing that entails that you figure out what the build and deploy strategy is for each individual environment. In the simplest case, you really don't have to do much building for most sites. What it basically comes down to is you have a version control repository; you're gonna clone that down locally and you're gonna grab a database dump from somewhere. So you do still have to figure out where your database dumps are coming from. And that's where the tools I mentioned before, Phing or Ant or whatever, can actually help you: you can write some simple build tools which will grab a database dump down and load it up locally. And if you create that logic and store it in your repository, those can be the exact same tools which Jenkins will later use when it's setting up your test environments or whatever other environments you may need. Oh, sorry, clearly I'm very well put together today. I apologize. Your repository: I've been up and down through this, and if you came to my dog presentation last year, it might seem a little weird that I would be advocating that you have only one repository. Still, some folks, when they look at this, say, well, we should have one repository over here where we're keeping track of all of our build instructions and whatever else. No: you want one repository, and in top-level subdirectories you've got Drupal itself, you've got your tests, and then you have any build information, any build scripts that might be necessary to do things like grab the database, grab a files directory, things like that. Here's one of my favorites. How many people put settings.php in version control? Usually people would say that's a bad thing, but I'm actually about to advocate that you do put settings.php in version control. I cannot think of a single situation that this would not work for, and work well, and scale well.
So, in your settings.php file, you should define general settings which are applicable to the site regardless of what environment it's actually in. On Drupal.org, for example, the mechanism by which we select which viewer is used for code, like when you browse code on drupal.org, is controlled by a plugin, and which plugin is used is determined by a variable setting. Therefore, we have a variable in settings.php that hardcodes it and makes sure that we use the gitweb one, so that it'll point you to drupal.org and everything will be hunky-dory. That information is the type of thing you put in settings.php. That's gonna be true regardless of which environment you're in, because you're never testing something else; you want all your environments to look the same at that level. You then, from settings.php, include a settings.local.php which is not in git. This is the file that you actually set up when you create a new environment, and in that file you put your database connection information and other environment-specific things: database connections, memcache bins, which ones are there, perhaps cache settings, things like that. And then, optionally, you specify a role, which is to say something like dev or test or stage or prod. And then what you have is a set of other files, settings.role.php, like settings.dev.php, settings.prod.php, et cetera, which are stored in git. They are used to store settings which are specific to the way that that environment should behave. So for example, settings.dev.php is the perfect place to set the setting that ensures no mail will ever leave this server, that it always dead-ends right back in, because no dev instance should, as a general rule, ever be sending mail.
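Pieced together, the settings.php being described might look roughly like this. Treat it as a sketch, not the actual Drupal.org code; the variable name in the first line is a made-up stand-in for whatever site-wide setting you need pinned:

```php
<?php
// settings.php -- lives in git. Site-wide settings first; these are
// true in every environment. (Variable name here is hypothetical.)
$conf['versioncontrol_code_viewer'] = 'gitweb';

// Per-environment file, NOT in git: database credentials, memcache
// bins, cache settings, and a $role such as 'dev', 'stage', 'prod'.
require_once dirname(__FILE__) . '/settings.local.php';

// Role-specific file, in git: e.g. settings.dev.php reroutes all
// outgoing mail so a dev box can never email real users.
if (isset($role) && file_exists(dirname(__FILE__) . '/settings.' . $role . '.php')) {
  require_once dirname(__FILE__) . '/settings.' . $role . '.php';
}
```

Every environment gets the same settings.php from git, a hand-created settings.local.php, and exactly one of the role files.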
So this lets you respect these role concepts, these environment concepts, and manage them inside of version control, which is the thing closest to your engineers, so that's where you want to do it. Your global settings.php is in git; your settings.local.php isn't, so you don't have your database connection strings in the repository anywhere, which is obviously crucial; but you can still have per-environment settings. It's also useful to do things this way because, when you granulate things at the level of a file, as opposed to PHP logic inside of a file, it's much easier to do tricky things with configuration management once you get into more complex situations. So, the basic system that I described before is really only applicable if you are working on a very simple Drupal site. If you are to the point where you have a web head and a database box, or multiple web heads and a database box, or even more than that, then you should be using configuration management. And that's a whole other layer to introduce into this. By the time you get to that point, you should have your repository laid out the way I described, with Drupal and tests and build information, whatever, and then you have another repository in which you've got Puppet, Chef, whatever configuration management system you want, which lays out how all of your different role-based machines work. You've got the configuration for your web head, the configuration for your database box, the configuration for your Solr box, whatever you have. And if you've gotten to the point where you need configuration management, then you've gotten to the point where you need to have people, instead of developing in whatever local environment they can throw together, start seriously looking at switching over to using local virtual machines. So at that point, you can use a tool like Vagrant.
Vagrant is a Ruby library for working with VirtualBox to very easily set up and provision virtual machines. So you can use the exact same configuration management system that you use to provision your production boxes and all those production environments to provision the local VMs that people are working in. At this point, what you've done is gone well beyond simply ensuring that, oh, we've got the same database and the same files; it's: oh, we are in exactly identical environments. These virtual machines have been provisioned in exactly the same way, and we know that they're exactly the same as the way the production ones work. So you can say goodbye to all those problems of, oh wait, I had a different MySQL version or PHP version or something like that. That's locked down. It's far more complex to actually set up, though. It is, however, what we're doing for Drupal.org. So yeah, I want to talk a little bit about that, because it kind of illustrates how a lot of these pieces can fit together. The system we would like to put together for Drupal.org... okay, who's tried to actually do work on Drupal.org, like get a hold of the code base and contribute? Yeah, it's hell. It's absolutely awful. Like I mentioned, I'm the Git maintainer for Drupal.org. It took months for me to even get a proper testing environment set up again after we launched, and I do not have a functioning local test environment for Drupal.org. I do not. It's that bad. The way we're ultimately advocating that this would work is: because we already use some configuration management to manage certain pieces of the Drupal.org infrastructure, we want to shift the rest of it over so it's all controlled by one single configuration management system.
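The Vagrant-plus-config-management arrangement described above might be sketched like this, using the Ruby Vagrantfile syntax of the Vagrant 1.x era. The box name, IP address, and manifest file names are all invented placeholders; the important idea is that the provisioner points at the same Puppet manifests used to build production:

```ruby
# Hypothetical Vagrantfile sketch: one local VM provisioned by the
# same Puppet manifests that provision the production boxes.
Vagrant::Config.run do |config|
  config.vm.box = "lucid64"                 # placeholder base box
  config.vm.network :hostonly, "192.168.33.10"
  config.vm.provision :puppet do |puppet|
    puppet.manifests_path = "manifests"     # same repo as production config
    puppet.manifest_file  = "webhead.pp"    # hypothetical role manifest
  end
end
```

Running `vagrant up` then builds a VM whose MySQL version, PHP version, and everything else match production, because the same manifests produced both.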
And we can then utilize that same configuration management information, which provisions our production boxes, to provision all of our test boxes and all of our staging environments, so we can be absolutely sure that when we deploy to staging, it is exactly like deploying to production. Nothing different is gonna happen. We can trust our deployments, and again, it's perfectly reasonable then to say: all right, we've deployed to staging, back in the context of our pipeline, and that means deploying to production is gonna work great. Off we go. But beyond that, it's also a system for people to be able to get their local instances set up. If we can make progress on this, then we will eventually get to the point where the steps required to work on Drupal.org will be nothing more than: download Vagrant, download VirtualBox, clone one repository, and then run vagrant up. And about four hours later, it'll be ready. It has to download like 10 gigs of stuff. But you will have on your local machine at that point four VMs running: a web head, a database server, the Git server, and a Solr server, all of them connected and talking to each other in exactly the way the production environment works. And you'll be able to go into those boxes and edit code directly on a functioning local copy of Drupal.org that's been insulated from sending emails or anything like that, using some of these techniques that I've described. You'll be in that magical place where you have your own functioning copy that you can safely break. There is nothing better than being able to safely break stuff when it comes to learning how a large, complex system works. But when you get to the point where you have an environment as complex as Drupal.org, you have to have this sort of full-stack system to be able to think about it, from my perspective, right?
Like, if somebody here were to set up their own local version of Drupal.org, I know how complex it is, and I know the likelihood that there's some piece in there which they've probably broken. So my ability to trust whether the changes they made actually worked locally is marginal at best. I basically have to go through my whole entire testing process myself. If, however, I know that you've worked off of this system we've set up, which has been provisioned properly and just works and is correct, it drastically lowers the amount of time and energy I have to invest in ensuring that your code's actually correct. So, from there, we start defining a pipeline for Drupal.org, a deployment pipeline for Drupal.org. And that, I think, might be one of the ways we could actually push back on some of these highly engineer-centric approaches that we have, because the user acceptance testing can be all of you, essentially. We get to the point where some feature has been worked on, it's been pushed up, it moves through the pipeline, it passes the automated tests, it's automatically deployed to an environment where people can poke at it. They can look at these features and say, yeah, this works, or no, it doesn't, and they can actually push back. But we have a structured way of having those conversations. Right now, it tends to be very difficult to get people to contribute to those conversations, and I would say that's only because, well, stipulating that everybody thinks Drupal.org is awesome and wants to work on it all the time, the fact that we do everything in such one-off ways makes it really hard for people to know where to listen, what to pay attention to, and where their energy should be focused in order to be most helpful.
You solve some of these basic coordination problems by agreeing on what your process looks like, and then people can spend a lot more energy actually making stuff happen, as opposed to figuring out how to make stuff happen. At the end of the day, though, good enough is what's most important. There are reams and reams and reams of stuff written on continuous delivery, and I would seriously encourage you to go have a look at it. Like I said, we make this sort of promise to ourselves about how we think our site works, and unless we have systems like this in place, we end up shooting ourselves in the foot. So read about it, research it. I'm happy to talk about these things at any time. But at the end of the day, it's really easy to get overwhelmed, so pick out the pieces that you can do most easily and stick with what's good enough. Thanks. Questions, please use the mic if you do. Sam, in that last scenario you described with the role-based machines, using Vagrant to provision local VMs, what kind of system requirements would you be looking at for a local machine to be running four VMs at one time to do Drupal.org work, for example? I've run four VMs on this thing, and it hasn't actually been that bad, but it depends. That is the type of thing where you generally need something with the power of a desktop workstation, and you usually can't run it on a teensy little laptop. So we've been exploring some alternate methods so that we're not eliminating anybody who doesn't have a hugely powerful machine from being able to work on Drupal.org. But you can pare down the memory usage and such, because you can configure how much memory is allocated to each VM; but yeah, there are limitations. Next question. Hi, thank you for your presentation. I'm just wondering if you could please elaborate on why you think that using Git to deploy, like, say, a live branch, is a bad idea. It's not that it's a terrible idea; it's that it can lead to laziness, and often does.
Maybe the best way I would put it is: the situations where I tend to see people using Git as a mechanism for deployment are also the situations where people tend to just be fiddling around near their production code. And whoops, I accidentally made a change in production, and hey, a whole bunch of things broke. There are concrete reasons, though. There is going to be a window of time in which your code is not fully updated, because a Git update takes much longer. The standard practice for doing deployments is basically that you swing a symlink, which is what Capistrano does. You take some packaged version of the code, you drop it into place in some subdirectory, and then you swing whatever symlink was pointing at the old production code over to the new production code, and that way it is instantaneous: all the new code is slotted into place at once. That can actually become fairly important. With a six- or seven-second window, if you get a bunch of requests during that period of time, then you're going to have some weird cache issues; there's a cascading series of effects that can occur. So the symlink swing is best. So, if the site is small enough and deploying that way is very fast, like less than a second, and you have a process, like you have a dev branch and a staging branch and a live branch and a branch that is in between live and staging, and you make sure that everything is... I mean, for me, that's what I've been doing for our website, and it has worked for us perfectly, but I didn't know if there was something else that could go wrong. If it's good enough, it's good enough. There are some small things that can go wrong; like I said, they can turn into a sort of giant thundering herd. Generally speaking, it's fine. It's good enough, it does function. Yeah. Hi, my name's Stephen.
So, somewhat related to that, but with Git and the manual approval process, or acceptance testing: would something like Gerrit or Gitolite or Gitosis fit in that spot? Or could it be used elsewhere? What do you think about those servers? Potentially, yes. I mean, Gerrit assumes a particular workflow, and it represents things in version control that a lot of continuous delivery advocates would say are not a good idea. Like, you can commit to an intermediate place, and then it requires a whole bunch of people to sign off in order to actually send it to the real place inside of version control, which is not technically part of the, I can't think of the condescending word that I want to use, the official mandate of continuous delivery. You could use it like that, but, like I said, there is this Jenkins plugin which provides the view into the pipeline and can help you control all those stages, which is designed to do the same thing. It is the same principle, though, and if you were to choose to construct your system on that, I'm sure you could do it just as well. I tend to prefer to put as much as I possibly can in Jenkins because, like I said, it becomes a dashboard for the state of your entire operation, and the more information you have there, the better. There is Gerrit-Jenkins integration, but I haven't done much with it. All right, now Patrick's going to beat me up. No, thanks for getting this conversation started. And honestly, I'm just really excited that this is happening. Can you speak a little more into the mic? I was just going to say, one thing I want to emphasize is, I was really worried that some people might be listening to this and thinking, whoa, this is way over our heads. This is way above us. This is way too complicated. We just want to make small sites.
The great part about this is that, I guess the comparison would be, like when someone explains Drupal's bootstrap to you: if you don't understand everything that's going on, that doesn't mean, whoa, Drupal's way too complicated for me, because it's an application that we keep in code and share easily, and not everyone needs to understand it. And the big push with DevOps is that your infrastructure, your toolchains, everything you're using, is in code as well. Not only does that give you the security of knowing that, hey, if our data center blows up, we can just bring it back; it also means this toolchain is in code and we can share it, and have a common platform for discussion, and share processes together. And you don't need to understand how it's all working; some people just need to have set it up, and then that can be shared. So I see awesome applications in just sharing processes in an open-source community. That's true, too. Yeah, yeah. Yeah, I just hope no one's intimidated. I saw a lot of people walking out right after. I just want to spread the word that it's not too much for large or small sites. No, there are discrete pieces. There are definitely bits that you can tackle. And thank you for saying that. I have this hope that, if we can put together that Drupal.org system that I described, and actually this talk aside, that system would be fantastic. And not just for Drupal.org. Because, think about it, right? We have a place to share Drupal code, which is Drupal.org; we push all of our code out there. But we don't have anywhere where we can actually collaborate together on running a site, on what these practices might look like.
And to have a space like Drupal.org be something that's managed by these processes, where we can work these kinks out, and folks can optionally jump in and learn about how this works and then take some of those best practices back to the places where they're actually employed: that could have amazing effects in the community, because we can develop some of the best practices out in the public space. So, yeah. Hey Sam, nice talk. It was great. Thanks. So in Drupal 7 now, there's a .gitignore file that covers settings.php; actually it's wildcarded, I just looked: settings*.php. Yeah. Are you suggesting we hack core? I absolutely am. I did not endorse the placement of that .gitignore file, nor do I think it's a good idea now. I don't like it either, because I have to add to .gitignore a lot, and so I'm hacking core all the time too. Yeah. Yeah. So what do you do? You just replace it when you update? Yeah. Okay. Thanks. I don't think this is the same question he just asked, but if it is, tell me. The settings.php goes into git, but settings.local.php, you do a PHP require_once inside of settings.php? Yes, I'm sorry, I tried to put the code up there and it wasn't working well. Your settings.php should basically look like: the settings you want to set at the top, then a require_once of settings.local.php, then, if $role is set, which should be set in settings.local.php, and the corresponding file exists, require the role-based file. Okay. Make sense? Yeah, it does. So then on every different machine, my local copy, the production copy, there will be a settings.local? Yes, and that's the file that you have to set up manually. Got it, okay. Right, and that ends up being, yeah, I didn't circle back to this point, that ends up being much easier to mess around with from a configuration management system.
It's way, way, way harder to write a regex or something in your config management that replaces a database string inside of settings.php. It's really easy, though, to do something like this: since your config management system is going to be aware of what role it's deploying to, that's the whole idea with them, it's much easier for it to have a bunch of settings.local files stored somewhere, in a separate secure repository, which it then drops into place depending on what role it's looking at. And so you can have your database connection strings actually stored somewhere, and it just drops the file right in, and you don't have to muck around with the internals of any files. Perfect. Okay, thanks. Yep. I wanted to ask you a little bit more about configuration management. So: one repo, build directory, and inside there are directories for each of your site types. Is it best to extract out the things that are going to be common, to have almost a fourth one, like a base, in addition to a web, database, and varnish? And then my second question is, once you start getting into web server and database configs, how should we be laying those files out? Just straight-up file paths, so you could check it out from root and it would lay right down onto the OS? Not sure. Well, let me just take the first one. Wait, I got so distracted by the second one that I forgot the first one. What was the first one again? So the first one: a web server and a database server have separate configurations within that build directory. Is it best to take things that are going to be common between those two machines, like NTP settings and stuff like that, and put them into a... Yeah. A good configuration management system, I mean, like, use Puppet or Chef. Either way, I don't care.
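As one concrete example of the drop-the-file-in approach described above, a Puppet file resource might look roughly like this. The module name, paths, ownership, and the $role variable are all hypothetical placeholders:

```puppet
# Hypothetical Puppet sketch: pick the settings.local.php matching
# this node's role and drop it into place from a module file source
# backed by a separate, secured repository.
file { '/var/www/site/sites/default/settings.local.php':
  ensure => file,
  source => "puppet:///modules/drupalsite/settings.local.${role}.php",
  owner  => 'www-data',
  group  => 'www-data',
  mode   => '0440',
}
```

The credentials never touch the main site repository, and no regex surgery on settings.php is ever needed: the whole file is swapped per role.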
But those are the big systems which are out there today. There are a few others, but either one of them will allow you to create a class hierarchy of node definitions which can inherit from one another. So you can have a base from which web inherits, from which a web-staging inherits, whatever, and you can create a nice structure where those things actually pass down. So yeah. So we should be replicating that in our version control? You should, yes. Where we're storing the configs. What's that? Replicating that abstraction in our version control system, where the configs are stored. Yes: you've got your repository which contains your Drupal site, its tests, and its build stuff, and then you've got a separate repository. Ah. Yeah, that's nice. Right, a separate repository which is for your config management and follows all the rules; there are plenty of rules and structures about how those are supposed to be laid out. Okay. Read about those, but yeah. And so the second question was... was, so if there's like a web server and I'm running Red Hat, should there just be /etc, you know, whatever the settings are, like httpd.conf, as subdirectories all within that sort of web-server top level? What's kind of the best way of keeping all these configs inside a repo? That's, it would be a long answer to that question, so we should let the next person go. Sorry. Okay, it's two questions. Could you speak a little closer to the mic, please? Sure, it's two questions. So Jenkins, or actually three. Jenkins used to be called Hudson, did it not? Yes. Yes. Is it possible, here's the real question, is it possible to install Jenkins on a system that doesn't have a fully qualified domain name? Yeah. So localhost, no problem. Okay, good. So Vagrant doesn't have anything to do with Jenkins? No, Vagrant is a separate system. Separate thing, okay. Yeah. Thanks. A quick question, I don't expect a full answer.
The whole question of what arrangement to use for the Drupal distribution itself, how you organize that in Git; in other words, should it be a submodule, all of that question. It's a big question, but it would be really cool if that information could be updated, because when I read the Drupal docs, there are some really cool ideas, but I don't know which one to use exactly. So I'm always unsure whether Drupal should be a submodule. Do you have any specific recommendation? And apart from that, it would be cool to discuss it later. Yeah, yeah, it's complex. The simplest answer, and you know, good enough. Yeah. That's right. That's right. Which, that is largely why I said one repo. Submodules are complicated. You know, dog, which I don't intend to work on anymore, could have introduced the possibility of us using more complex repo clusters and stuff like that, but without some consolidated tool like that, my recommendation is that you pull down the tarball of core, drop your other tarballs into place, and just have that in a subdirectory inside of one single repository that's joined with your tests and your build system. It can actually get even more complicated for Jenkins to have multiple repositories. People might be able to handle it, but your CI server might end up getting confused. If you've got test stuff over here and build stuff over here, but then the actual stuff it's testing is in a separate repository, it can be very awkward to get it to look at multiple repositories at once. Okay, okay, good enough. So again, about the settings file. Yeah. So does every server, including production, have a settings.local? Every functioning server will look like: settings.php, settings.local.php, and then a series of settings.dev.php, settings.prod.php, settings.stage.php.
And it will read settings.php and settings.local.php always, and then it will read one and only one of the role files. Okay, and so what that leads to is, I think the way that I've been doing it forever is just wrong, because I've been setting up, you know, if my site is example.com, when it gets to production, I've been using default. Yeah, oh, that's, sorry, yeah. Don't use multi-site for environments. That's what I've been doing. So like dev.example.com, and for my local machine I would have a multi-site for that; so that is not the right way. It is not a way that I would recommend anybody do it, and it is a case where I feel the benefits of doing it are not enough to offset the weirdness it can cause in the long term, especially when a technique like this works, you know, just as well. All you have to do is keep a settings.local.php somewhere that you can copy into every local dev instance that you have, because it's probably gonna be almost the same in each case, except for the database name. So, yeah. All right, cool, thanks. Yep. So my company has a lot of non-Drupal projects, so they've settled on TeamCity as a continuous integration tool. Okay, I'm sure you've heard of that. And it seems to be working well enough for the Drupal project I'm working on, but we're really not all the way there with our continuous deployment; we really just have dev to testing. Are there certain things about Jenkins you think would be better for us, specifically for Drupal, later on? I mean, since I don't know TeamCity, I couldn't speak specifically on one product versus the other.
All I can say is really just that Jenkins has clear community support; I mean, it is the leading CI server in the open-source space, and it gets the attention. I can make the same not-tightly-causally-linked argument that you make about open source in general, which is that you're gonna get benefits because everybody in the Drupal community is looking at Jenkins, so if there are ever specific plugins that pop up, they're gonna be for Jenkins and not for TeamCity. All right, thanks, and great talk. Yep, thanks everybody.