My name is Michael Godek, and I work with ThoughtWorks. And I'm Robert Stroff; I work with Acquia. We both have a lot of experience with large DevOps and continuous deployment efforts, and with managing big projects over long periods of time, and I think that leads into what we're talking about here. The session slides are currently available, so if you'd like to follow along on your laptop or on an iPad — you're not going to click through to the link, you're going to type that link in with your fingers: thoughtworks.com/insights/blog/drupalcon-austin. From there, there's a link to these same slides; you can download them, and they're available to you here. Also, on that same session-notes page, there's a whole series of links to background material for this discussion — all kinds of stuff that we would really love to talk about, but there's not that much time here. A lot of what we'd want to cover but can't is on there, and that includes the history of the Go project that we're going to be talking about. And Go is not the Go language — Google's Go. That's not what we're talking about here. The Go language is awesome, really very interesting, but it's not our topic. Go is a software build and delivery tool. It's been around for a very long time and it's got a great history; again, the history is linked from the notes, and I'd encourage you to go through and read it. There's a link to Martin Fowler's post from way back when making the case for continuous integration, and to the case for continuous delivery as an extension of continuous integration, plus other posts on related topics.
So really it gives you reading for the days and weeks to come, to fill out the context of this larger discussion: how we reliably deliver software from the conception of an idea or a requirement through to its successful, timely delivery to production. So that's the background — get into the session notes. All right, so welcome again to the official launch of the session. You know who we are at this point. This session is about build automation tools, but to really understand these tools, we need to talk about why we use them. Build automation is a component of a practice called continuous delivery, or CD, so we need some context on what CD is and why we use it. Continuous integration is really just a subset of CD: continuous delivery requires CI, but it involves a lot more than what CI does. In the introduction of Jez Humble's book right here — which is the most awesome book on continuous delivery, and we actually have two copies of it to give away to whoever's got the coolest tweets to come out of this; the hashtag is #gocd — and we've also got a copy of The ThoughtWorks Anthology, essays on software development. So get some tweets up. In the beginning of this book, Jez Humble refers us to the industrial consultant W. Edwards Deming, whose work in the 1970s explaining Japanese methods to American plant managers revolutionized thinking about the manufacturing process. Jez writes that Deming's work in the 70s and 80s is the foundation of what CD practice in software is today. So what better place to start than the opening paragraphs of Deming's 1982 book — back in the Reagan administration, right? Before our whole era of software.
He wrote a book called Quality, Productivity, and Competitive Position, and it sets exactly the right context for understanding why we value software build pipelines. So here's Deming — I'm going to read a couple of paragraphs from this. If you don't recognize the name W. Edwards Deming, at some point you should at least check out the Wikipedia page on him. "The aim of this book is to illustrate with simple examples that productivity increases with improvement of quality. Low quality means high cost and loss of competitive position. Folklore has it that in America quality and production are incompatible: that you can't have both. A plant manager will usually tell you that it is either-or. In his experience, if he pushes quality, he falls behind in production. If he pushes production, quality suffers. This will be his experience when he knows not what quality is nor how to achieve it. A clear, concise answer came forth in a meeting with 22 production workers, in response to my question: Why is it that productivity increases as quality improves? Less rework. There is no better answer. These people know how important quality is to their jobs. They know that quality is achieved by improvement of the process. Improvement of the process increases uniformity of output of product, reduces rework and mistakes, reduces waste of manpower, machine time, and materials, and thus increases output with less effort. Other benefits of improved quality are lower costs, better competitive position, happier people on the job, and more jobs, through better competitive position of the company. These are some of the lessons that management must learn and act on. Reduction of waste transfers man-hours and machine-hours from the manufacture of defectives into the manufacture of additional good product. In effect, the capacity of the production line is increased."
"The benefits of better quality through improvement of the process are thus not just better quality and the long-range improvement of market position that goes along with it, but better productivity and much better profit as well, plus improved morale of the workforce: they now see that management is making some effort themselves, and not blaming all faults on the production workers." All right, so what you're trying to achieve in your use of build automation tools is to have your team spend less time on rework, leaving more time available for client work. I think that's pretty clear. There are other goals and measures. For example, in continuous delivery, one of the things you're trying to get to is making the decision to release a business decision rather than a technical one. In other words, when you have a release candidate that contains the stories the client wants to see go live, you shouldn't be having to go back to the devs at that point to ask, is this really ready? And you shouldn't have to engage much of the devs' time in order to release it, because at that point it should be releasable — ideally with just the click of a button, with no manual interaction with the production system whatsoever; it's fully automated — and with a rollback plan that's similarly simple, where you have easy, safe rollback options. These are fundamental objectives of continuous delivery. In this sense, all the effort of build automation is like a musician's rehearsal: when it comes time to walk out on the stage, all your technical issues should already be resolved. The rest of the session is focused on dev stuff. We're going to drill down into concepts, code, and configuration, and hopefully leave you able to walk out of here with the idea that you actually can implement this stuff. It sounds high-minded, but it's genuinely possible to work your way into it.
But we want to make sure you don't lose the bigger picture and the ultimate measure: the reason you're doing this stuff is to get to more frequent, successful deliveries with less rework. That's a measure you want to keep putting in front of yourself, to evaluate whether the effort you're putting into build automation is yielding those kinds of results. So, DevOps is the foundation of all of this — it's the ground everything else stands on — so let's start with that. I'm going to run through a few slides that review the basics of what we call DevOps. The name is a portmanteau of "development" and "operations," which traditionally were two siloed departments, and the idea is that they have to work together. I'm sure most of you are familiar with it as a buzzword-slash-movement that has grown over the last five years or so. But beneath the hype, there are some good general principles, and I think as we read through them, they also echo all the way back to Deming's principles. One of these principles is configuration as code. The idea is: we figured out how to track changes in amazingly complex software systems through version control and testing and other things. So whatever else we can express as code, we may be able to apply those same principles to. If we can express our infrastructure, server configuration, cluster configuration, and so on as code, then we can put them in version control, and we can write tests against new configurations. Another general principle is automation — but not automation just to save labor, and not automation just to save small, repeatable amounts of time. You might spend a week automating something that you do once a week for a couple of years, and maybe it only saves you five minutes each time you do it.
And if you add that up, it doesn't look like it's paying off in saved time. The reason is that the five minutes each time you do it and it works — that's not the time you're trying to save. The time you're trying to save is from the one time you didn't do it exactly like every other time, and the hours or days it took you to figure out that that was the source of the problem and fix it. So automation is about exactly reproducing things, not necessarily just speeding them up. Another general principle is to put metrics on anything you can. In the world of DevOps and continuous integration, this usually means things like logging, New Relic metrics, and so on. Furthermore, it doesn't make much sense to keep this information siloed. Most of the value comes from pushing it throughout your dev organization, as far back in the chain as you can go — to subcontractor devs, if possible — so they can see your historical performance graphs, for instance, and know whether your project is overall getting slower or faster. And so, building on that DevOps-slash-continuous-integration idea, the question behind continuous delivery is: can we apply this to more? Just as we took the idea of configuration as code and applied these principles to it, what if we go further, higher up in the business? Instead of just a software unit test failing — with the general principle being that the committer sees that as quickly as possible and can fix it as quickly as possible — what if we push that up to, say, business requirements, so that project managers and client-facing people are also seeing the results of some sort of test quickly, reacting to them, and have a process around that? I think that leads into the goals of what Go is trying to achieve. Cool. So, the Agile Manifesto was about twelve years back, something like that, or longer.
The very first principle stated in the Agile Manifesto is: "Our highest priority is to satisfy the customer through early and continuous delivery of valuable software." And some of the people who actually pinned that thing to the board — Martin Fowler was supposedly the one who wrote it up on the whiteboard at the time — took that principle and said, okay, now we need to build some tools to implement this principle in practice. So starting back in '99, 2000: before Jenkins there was Hudson, and before Hudson there was this project called CruiseControl, which was really the first build automation tool in our modern history, and it was an open source project run by ThoughtWorks. Hudson came along, then Hudson forked into Jenkins, and somewhere around in there ThoughtWorks took Go commercial as an enterprise product — they were looking for a model to fund the development, so it became a licensed product, and it's been a licensed product for the last eight years or so. It was just open sourced this spring, right? So now it's an open source project again; it's back in the fold. But it's a code base that's been under active development for over fifteen years. It's a really, really cool enterprise tool, and now it's just on deck for you right there. Go, like Jenkins, consists of a server managing many agents. You'll typically install agents on development VMs, and the agents do the real work; the Go server is your interface to the agents, where you configure and monitor your build pipelines. Go has a number of important features baked in, such as support for trusted artifacts — we'll talk about what that means — and built-in fan-in support for dependency management. These are things you could maybe accomplish in Jenkins, but you'd have to do it with a lot of plugins and glue code. This stuff is baked right into Go, out of the box.
Go also has a plugin architecture, but a lot of features are built right in. With Go, you can deploy any version at any time — it's an interesting concept. Go really stands out when you get into modeling complex workflows and managing dependencies between software builds. The whole idea is to surface problems and breakages as close to the commit as you possibly can, and to do it all the way through the pipeline — not just with devs, but with QAs and business analysts, all the way through to production release. Go has very nice end-to-end visualization and auditing features, implementing a lot of the metrics Robert was speaking of, such as a build compare where you can diff both commit messages and actual files between two arbitrary builds, so you can get to the bottom of a problem pretty quickly. And Go has a more fine-grained permissions model than Jenkins, with per-user-group and per-pipeline authorization. One distinction is that Jenkins has a master-only mode — which many people, especially those starting out, use — where you don't use agents and everything is done on the server. Go doesn't have that kind of model. You always have agents actually doing the work, and a Go server driving it all. So conceptually, that's where it sits. What we're going to do right now is drill down into what a pipeline is, conceptually, step by step, and how you implement it. What I want you to walk away from here with is being able to say you understand at least how this stuff fits together — and it's not that hard to do. There are six key concepts that we're going to go through in the next 15-20 minutes: what build materials are and how you declare them, how to set up build stages, how to set up jobs for each build stage, how to match up your agents with your jobs, how to configure tasks, and what build artifacts — trusted artifacts — are and how you use them.
With these six concepts, you can build a Go pipeline. You can go out to the open source site, go.cd, download the server, install the agents, and start configuring your server. Some of the goals we have in our build process that aren't Go-specific, but are things we're looking for in CD practice: we want to be able to build our software out on production, have it completely built on the server, and be able to preview it before we cut it over — so we don't find out about problems afterwards, at the end. We want zero-downtime releases, and we want really simple rollbacks. Whether you can achieve all of those in your scenario depends on what you're actually trying to do, but those are some of the goals we're pursuing in improving our deployment process. So, out of this set of concepts — build materials, build stages, setting up jobs, matching agents, configuring tasks, and trusted artifacts — one of the key takeaways we want to leave you with is to conceptually understand the difference between a build material and a build artifact, because they can be similar kinds of things, but they serve really different purposes. A build material is what you push into the build process; an artifact is what comes out. We'll cover that from a couple of different angles as we go. The first thing you do when you set up a build pipeline: you download Go server and you start a pipeline, and you tell the pipeline what your source code is — what build materials are going into it. Here's a build-material screen from Go server. In this case we're using Git; Go supports SVN and a number of other version control systems.
So here you declare your build materials — your source materials — so that Go knows about them, and Go takes care of a lot of the management: polling these resources for changes, kicking off builds when they change, checking out the right branch at the right version. It's pretty straightforward, but it's the fundamental concept to start with: these are the materials going into your pipeline. The most fundamental pipeline screen in Go is the pipeline's general options. There aren't a lot of attributes here; the key one is the checkbox for automatic pipeline scheduling. Once you've got build materials declared and you check that box, then whenever a commit is pushed to one of those materials, your pipeline will kick off automatically. So even in the context of plain continuous integration, where you want to see a build kick off as a result of a commit, this is all you have to do to get it going. Now, we all have the concept of build stages, right? We're using a commit stage, a QA stage, a showcase stage, and a production stage. Maybe you have other stages — maybe you want a release-candidate stage after showcase, before production; whatever works. Maybe you call the commit stage the dev stage. But generally, most of us are moving software through these different stages in order to isolate builds at different levels, so people can look at them, evaluate them, and decide whether they really want to go forward into production. So really, one of the biggest obstacles we had before we built our pipelines was this anti-pattern of endless discussions — between devs and QAs, QAs and BAs, and the product owners — about what was actually on the QA server at any given point.
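To make that concrete, here's roughly what a material declaration and an auto-triggered pipeline look like in Go's cruise-config.xml — the same XML the admin UI writes for you. The repository URL, group, and names here are placeholders, not from our actual project:

```xml
<pipelines group="drupal">
  <pipeline name="website">
    <!-- Build materials: the source inputs Go polls for changes.
         With automatic scheduling on (the default), a push to this
         repo kicks the pipeline off on its own. -->
    <materials>
      <git url="git@example.com:ourorg/website.git" branch="master" />
    </materials>
    <stage name="dev-build">
      <jobs>
        <job name="build-dev-site">
          <tasks>
            <!-- Tasks come later in the talk; one shown for completeness -->
            <exec command="/usr/bin/phing">
              <arg>-f</arg>
              <arg>build.xml</arg>
              <arg>deploy-dev</arg>
            </exec>
          </tasks>
        </job>
      </jobs>
    </stage>
  </pipeline>
</pipelines>
```

The UI checkbox for automatic pipeline scheduling just toggles whether this pipeline triggers on material changes; in XML, that's the default behavior.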
We'd have to have these conference calls where we'd say, well, QA, we just put in the features from that story we were working on — that's on QA now — but it doesn't have this other thing... There were inconsistencies in what was deployed, and we ended up having to talk about that a lot. After we got our pipeline in, that noise level just dropped, to where now we just post an email out to the group saying, this is what's on QA right now. And there's a really high level of confidence that everybody knows the testers are testing what they think they're testing — and that testers are not testing stuff the devs already know is broken. We were doing that a lot before, where tester time was being wasted because they'd test something and we'd say, yeah, we know about that; it's in the next phase. That's the kind of waste you're trying to wring out of your build. So it brought us some stability. That's the concept of build stages. Here's how build stages look in Go: we go in and knock out these four records, a very simple configuration where we model our stages. Each stage has a trigger type, which says whether it runs automatically on success of the previous one — because the stages are a succession, a progression. But we actually do something a little more refined than this, because when we talk about build stages, there really are — to disambiguate the terms — deployment instances. We have VMs: a commit VM, a QA VM, a showcase VM, and a production server. And our stages are more granular than just "the commit stage": we actually have three stages for each deployment instance — we build the software, then we test the software, then we release the software. Pretty straightforward.
When we model that in Go, it ends up looking more like this. You probably can't see much there, so let's drill down into just one of those deployment instances. Here in Go it's the same screen, just zoomed in, and we have three stages that are the progression of deploying onto the showcase VM: we build the software, and after we build it, we test it. Having the tests in a separate stage from the build allows us to rerun the tests against the same build, without having to rebuild it. A simple concept, but that's why you want to break things out into stages. And separating the release out as its own stage has been really, really useful to us. Here's what happens: we push commits, and that automatically builds on the commit stage — the dev stage. Then the devs say, this feature is ready to go; we're going to push it to QA. So we kick it off into the QA stage: we run the QA build, and then we run the QA tests. There are more automated tests on QA than on dev — different tests, and more of them. And if the tests break at that point, we don't release that build onto the QA server. So even though the QA server has a very small community of maybe one or two people testing it, it means we're not pushing stuff in their faces that we then have to walk back, saying, oh, sorry, we shouldn't have pushed that out — it wasn't quite ready yet.
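As a sketch of how one deployment instance's three stages might be modeled in cruise-config.xml — stage, job, and target names here are illustrative, not our actual config. The `<approval type="manual"/>` element is what makes a stage wait for a button click instead of running automatically on success of the previous stage:

```xml
<!-- Three stages per deployment instance: build, then test, then release. -->
<stage name="showcase-build">
  <approval type="manual" />   <!-- pushed to showcase on demand -->
  <jobs>
    <job name="build-showcase-site">
      <tasks>
        <exec command="/usr/bin/phing"><arg>deploy-showcase</arg></exec>
      </tasks>
    </job>
  </jobs>
</stage>
<stage name="showcase-test">   <!-- no approval element: runs automatically
                                    when the build stage succeeds -->
  <jobs>
    <job name="test-showcase-site">
      <tasks>
        <exec command="/usr/bin/phing"><arg>test-showcase</arg></exec>
      </tasks>
    </job>
  </jobs>
</stage>
<stage name="showcase-release">
  <approval type="manual" />   <!-- built and tested, released by hand -->
  <jobs>
    <job name="release-showcase-site">
      <tasks>
        <exec command="/usr/bin/phing"><arg>release-showcase</arg></exec>
      </tasks>
    </job>
  </jobs>
</stage>
```

If a test stage fails, the release stage never becomes available, which is exactly the "don't push broken stuff in QA's face" behavior described above.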
So by breaking release out into a separate stage — a simple concept, and the release stage is a very small set of tasks — it's very helpful, and it's really awesome in production, because we do the same thing in prod: we build the software on the production server, we test it on the production server, and it sits out there ready to release but not released. That gives you a real level of confidence. So here's a checkpoint. What we've covered so far: build materials and how you configure them; stages — a pipeline is like a container for stages; and the trigger configuration, which is how you configure Go to kick off your build automatically from commits. Now we're going to drill down into jobs, tasks, and artifact configuration. What you're looking at right here is the postcard view of Go pipelines: when you've got a Go server and you've built out a number of pipelines, this is your first interface into them. Okay, let's keep moving. So: a pipeline is a container for stages — a progression of events you go through to get to production. Stages are containers for jobs. Jobs in the same stage run in parallel, so they have to be things that are functionally independent of one another, and all the jobs in a particular stage have to succeed in order for the stage to succeed. To run a job, the Go server finds an agent to run it. Nothing really happens on the Go server itself; the server is just the metadata you configure to orchestrate all this work out in your world of VMs, and the agent is where all the heavy lifting happens. One of the concepts here is that Go gives you a really flexible way of managing agents, so you can model some pretty cool stuff: Go finds a suitable agent for your job, and here's how you configure that.
Go hands the job off to the agent to execute, and it matches jobs to agents based on resources that you define in the job — that's what's highlighted in the lower part of the slide. We tagged one of these jobs with the resources centos, qa, and website, and the other with centos, dev, and store. Go finds agents that have those same resource tags in order to figure out which agent is qualified to run the job. This is one of those concepts where, if you walked into Go server on your own, it would probably take you a while reading through the documentation to figure out — but it's a very simple concept. So there you have it. In this stage, we're actually building two websites in parallel that, to the user, end up being one website: a Drupal Commerce store site and a main website. They're separate builds, and we keep them as separate code bases because they have a lot of differences, but we build them in parallel because they always work together, and this way we keep the APIs and interfaces synced up. Here's the agent view in Go server. This is the other side: you install agents out on VMs, and then you register each agent you installed with the Go server. When you installed the agent, you gave it some resource tags, and that's what's listed on the right-hand side here. It's pretty simple: Go server goes out to its known agents and finds an available one that has the same tags as the job, and then your job is handed off to that agent. So, a pipeline is a container for stages; stages are containers for jobs; and maybe it's no big surprise, getting to the next step, that jobs are really just containers for tasks. And tasks are just stuff that you do — a task is where some command gets executed.
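In the XML, that matching is just two `<resources>` lists — one on the job, one on the agent. Names, addresses, and the UUID below are placeholders; Go fills in the agent entry when the agent registers:

```xml
<!-- Job side: the resources this job requires -->
<job name="build-qa-site">
  <resources>
    <resource>centos</resource>
    <resource>qa</resource>
    <resource>website</resource>
  </resources>
  <tasks>
    <exec command="/usr/bin/phing"><arg>deploy-qa</arg></exec>
  </tasks>
</job>

<!-- Agent side (in the <agents> section): the resources a registered
     agent offers. Go will only hand build-qa-site to an agent whose
     tags include everything the job asks for. -->
<agent hostname="qa-vm" ipaddress="10.0.0.12" uuid="agent-uuid-here">
  <resources>
    <resource>centos</resource>
    <resource>qa</resource>
    <resource>website</resource>
  </resources>
</agent>
```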
And so now we're finally getting to where you actually do stuff, right? If you currently do everything interactively — you SSH into servers, run bash scripts, do various things in your deployments — then building a pipeline is just taking all of that, wrapping it up in tasks in the proper order, bundling those into jobs, and putting them in a pipeline. You're doing the stuff anyway; if you're doing deployments, you're doing all of this. The pipeline is just a way of ordering it so you get reproducibility, which is one of the core goals here. Not to save time, but to get it right. So the Go server is configuration metadata setting the stage, and configuring tasks is the script that really runs the show. Let's keep drilling down. Here's the stage we were just looking at. This is the task configuration dialog in Go server for a particular task, and this task does something you're probably all familiar with: copy. That's all it is — it's copying. In this case it says: take the build-properties file that's appropriate for this environment and make it the build-properties file the job is actually going to use. Pretty basic concept: take the dev properties file, make it build.properties, and that's what we run the next part on. Then the next task is what actually builds our site. In this case we're using Phing. Phing is Apache Ant for PHP, and there's a primer on it by Lullabot; the link is in the session notes. Go down to it if you're not familiar with Phing — or even if you are — and read through it; it covers all the essential details of getting started. You don't have to use Phing: you could use Drush make, or Bash scripts, whatever works for you to drive your process. That's what we're wrapping up here.
We're using Phing. Again, most of these build automation frameworks are either target-based or product-based: Apache Ant and Phing are target-based systems, and make is a product-based system. You can look at which one you want to use; probably the best choice is the devil you know. I don't know which one really is better, but I like Phing. It's easier to get started with than make, I think, and it's easier to change how things work and in what order. But as the project grows larger, these Phing and Ant projects can become kind of cryptic and difficult to manage — that's usually the biggest complaint about target-based systems compared to make. Anyway, here we're drilling down into that particular Phing task. The command is just what you would type on the command line: /usr/bin/phing — we give the whole path. Then the arguments: the -f just says which file you're loading, build.xml, because Phing is all XML-based; and within that file you run a target called deploy-dev. That's it. That's your command. So we have two tasks — copy the properties file and kick off this Phing target — and that builds the website. You're done with the Go task configuration for building this website. So, just to clarify it all, we can drill down into the Phing target itself that we just called — this deploy-dev target. This is the kind of code it looks like: you have some property setup, some environment setup — we're deploying a php.ini, cleaning up the doc root, deploying some files — but the main thing is that within this target, aside from the setup and cleanup, we call down to another target called deploy-website. It's just classic procedural encapsulation.
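A bare-bones sketch of what that deploy-dev target might look like in build.xml — the file paths and the `${docroot}` property are assumptions standing in for our actual setup:

```xml
<!-- build.xml: the target Go invokes, delegating the real work -->
<target name="deploy-dev">
  <!-- Environment-specific setup: pick the dev properties file -->
  <copy file="config/dev.properties" tofile="build.properties"
        overwrite="true" />
  <!-- Clean up the doc root before redeploying -->
  <delete dir="${docroot}" includeemptydirs="true" failonerror="false" />
  <!-- Classic procedural encapsulation: hand off to the shared target -->
  <phingcall target="deploy-website" />
</target>
```

Each environment (deploy-qa, deploy-showcase, and so on) gets its own thin wrapper like this, all calling into the same deploy-website target.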
And if we drill down into deploy-website, our next target, this is where, as Drupal devs, it probably ought to start looking familiar if you're doing any kind of automated build at all. We deploy Drupal core, we deploy sites/all and sites/default to lay out our file system, and we run drush site-install. So we're doing a build from zero — an empty database — all in code. That's the setup block. And the point is not that you do exactly this; you could do something simpler or different, whatever. The idea is: this is where you're building a Drupal site, and we wrap it up in Phing — again, you could do it in Bash if you wanted to. So, just to take away the magic: the pipeline is a wrapper for stages, stages are a wrapper for jobs, jobs are a wrapper for tasks, and tasks are a wrapper for commands — in our case, Phing. And Phing, to a large extent (not entirely), is serving as a wrapper for Drush. So if you're using Drush, all we're doing is putting these layers and layers around it so you have a really clean way of automating and repeating what you're doing anyway. And this is our Phing target for running drush site-install. It's based on a Phing task file, DrushTask.php, which you can get out there somewhere in the world — it's the plug-in that connects Drush to Phing, and it's also linked in the session notes. Pulling back out of that base Drush level, into the build-website Phing target: this is the block, the whole site-build block, in which we actually build the site. In our case we're using primarily Features to deliver our stuff. Whatever you're using — if you can get config management to work, I couldn't when I tried — but whatever you're using to build your site out of code, this is that block in here.
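As an aside on the drush site-install target: if you don't have the Drush plug-in for Phing handy, a bare-bones version using Phing's built-in exec task gets the same effect. The `${docroot}` and `${db.url}` properties here are assumptions — names you'd set in build.properties, not our actual values:

```xml
<!-- build.xml: a from-zero site install driven by plain Drush.
     checkreturn="true" makes the Phing build fail if Drush fails,
     which is what lets a broken install fail the Go job. -->
<target name="site-install">
  <exec command="drush site-install standard -y --db-url=${db.url}"
        dir="${docroot}"
        checkreturn="true" />
</target>
```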
And so what we're doing is we have targets to enable the contrib modules that we need, enable the custom modules, enable menu blocks, build our taxonomies, build our views, build our blocks, build our permissions and users, burn in some content and set our theme. Okay, so now you've got a Drupal site and that is just all a matter of like wrapping up the details. And then we have some cleanup at the end. We're deploying some varnish and retus settings and then clearing caches. One of the choices that you're making when you're building a pipeline, at a certain point you've got to make these decisions about what belongs in server configuration and what belongs in your build pipeline. Because your build pipeline is stuff that could be changing incrementally like your build is a recipe. And when you push commits, then you're changing the recipe. And usually we think about our code, but actually, well it is all code, but your retus, like if you've got a retus caching and you make changes to that configuration, do you want that in the server build, right, in puppet chef, however you're managing that, or do you want it in your build that's kind of a decision that you'll be making going through. But really in the core of this, one of the key things in this build website is like we build the whole site out, but then in that we have this target called save database. And what that does is that it takes the build, but before we've written any environment specific values that are only appropriate for dev, we've got this clean build, nobody's ever touched it, we know exactly what's in it, and we take a database dump of it at that point. And then from there, what we can do in go, this is introducing the concept of a go trusted artifact. We take that database dump and we go into this go build artifact configuration. We see we're in job configuration on the artifacts tab, and then in there we tell go server what to look for as a product of this build, of this job. 
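The save-database idea might be sketched like this — a dump of the pristine build taken with drush sql-dump and zipped up where the Go server will be told to look for it. The paths and file names here are assumptions:

```xml
<target name="save-database">
  <!-- dump the clean, untouched build before any env-specific values go in -->
  <exec command="drush sql-dump --result-file=${scripts.dir}/commitstage.db.sql"
        dir="${docroot}" checkreturn="true" />
  <!-- package it as the artifact the Go job will be configured to collect -->
  <zip destfile="${scripts.dir}/commitstage.db.zip">
    <fileset dir="${scripts.dir}">
      <include name="commitstage.db.sql" />
    </fileset>
  </zip>
</target>
```

The key design point is *when* this runs: after the site build proper, but before any dev-only settings are written, so the dump is a clean, known-good snapshot.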
We're saying this job is going to produce a file that you, Go server, need to keep track of for us — a trusted artifact. And so if at the end of the job there's not a commitstage.db.zip in the scripts directory, our job will fail. If it's there, the Go server will upload it from the VM — whatever VM it happened to run on — up to the Go server, and keep track of it for the rest of the pipeline. So if we did this on the commit stage, dev, then in the next stage, QA, we declare a fetch-artifact task. The other tasks we were just talking about are simple — a copy task, a Phing call. Well, this is a Go-specific task called fetch artifact. And you just put that in the stack of tasks in the job, along with the other stuff you're doing. And here's the configuration for it. So here you just tell Go: in the same pipeline — that's what it means when the pipeline field is blank, which implies that you can actually fetch artifacts from other pipelines besides your own, which gets into some really cool, complicated stuff. But in this case we're just saying: within the same build pipeline, go back to the previous stage, the dev build, right? And in that stage, go to the job called build-dev-site. And from that job, get us that trusted artifact called commitstage.db.zip, right? And so Go will manage this so that if you run dev and then you run it again, you get another commit, so you have another instance on dev. So the dev server's deployed instance is yet another build from what you did before. But say you're in Go server and you want to push the previous build out to QA — not the latest thing that the devs are doing, but one or two commits back. You've got all those; each one of those is its own pipeline run. And so when you push from dev to QA, Go will guarantee you that the artifact that you created with the recipe on that pipeline run — exactly what that code was — will reliably be delivered to wherever it is that you're building QA.
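In the Go server's XML configuration (cruise-config.xml), the two halves of that contract might look roughly like this — publishing the artifact in the dev-stage job, then fetching it in the QA stage. The stage and job names mirror the talk, but the exact markup here is a sketch, not the presenters' config:

```xml
<!-- dev stage: the job publishes the database dump as a trusted artifact -->
<job name="build-dev-site">
  <tasks>
    <exec command="/usr/bin/phing">
      <arg>-f</arg><arg>build.xml</arg><arg>deploy-dev</arg>
    </exec>
  </tasks>
  <artifacts>
    <artifact src="scripts/commitstage.db.zip" />
  </artifacts>
</job>

<!-- QA stage: fetch that exact artifact back before building on it -->
<job name="build-qa-site">
  <tasks>
    <!-- blank pipeline attribute means "this same pipeline" -->
    <fetchartifact pipeline="" stage="dev-build" job="build-dev-site"
                   srcfile="commitstage.db.zip" dest="scripts" />
    <exec command="/usr/bin/phing">
      <arg>-f</arg><arg>build.xml</arg><arg>deploy-qa</arg>
    </exec>
  </tasks>
</job>
```

If the `src` file isn't present when the dev job finishes, the job fails — that's what makes the artifact "trusted" downstream.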
So it's really cool. You don't have to do any glue at all to make all that happen. You get a trusted artifact simply by configuring a couple of dialogs. And it's a really, really powerful way to bring stability to the build process. So here's, conceptually, the distinction between build materials and build artifacts. We had a good question off of Twitter from General Redneck there at the back, and having wrapped up the deep dive through building the pipeline, this might be a good place to address it: how easy is it to automate projects that are already complete into these pipelines and jobs? I mean, observing that most of the time your continuous delivery kind of evolves as the project evolves — if you take Go and you're applying it to something that's already live, are you going to run into any special issues there? I'll step back and answer a related question first, and then that one. And that is: if you already have an existing Jenkins infrastructure that's running and doing continuous integration, you could take the existing Jenkins builds that you have and just configure them as a Go task — tell Go to kick off your Jenkins job, and wrap Jenkins in a larger delivery pipeline. But with an existing site, you're delivering it somehow today, right? And whatever it is that you're doing to deliver it today — on the most rudimentary basis, maybe you're SSH-ing into a production server and copying a bunch of files into a production directory, maybe you're forced to do a DB update, and then you have content and you have to log into admin — well, as long as you're not going through admin, you can script all of that as tasks one way or another. It depends on how sophisticated your current process is how easy it's going to be to do in Go.
If your code is in Git and you already build it somewhere other than production before production, then yeah, sure, it's not actually going to be that difficult to create a simple Go build pipeline. You're probably going to be managing a database dump. You have this whole issue of how you're going to get your production database into your next build, and so you're going to do a database dump from production, and you're going to do a wash and scrub to remove stuff that's in the database that you can't have in non-production environments, and then you're going to have to get a delivery back out there. So it's hard to answer in a single way, but... Is it okay if I elaborate on that a little bit? Yeah, absolutely. So I think with the question that Rob answered — it wasn't what I intended, but it's a very good one — you could actually manage that using something like your artifacts. If you created a database artifact from production on your last build, then you could use that in your development as a washed-and-scrubbed database. With that said, my question was actually intended along the lines of: we have several projects going on at one time. Is there a way to make a template project, so to speak, so that you can quickly set up this pipeline — the entire project as an entire continuous integration system — without having to go through and manually create the build in Go every single time for every single project? Sure, yeah. Go configuration is XML. And so Go power users, they don't necessarily use the admin interface at all in Go. They just take their XML files and run away with that. And so for sure, you can create pipeline templates and then build a pipeline from a template; that's built right into the core of Go.
And then you can go beyond that again by just doing stuff with the XML, where you can take an existing pipeline, grab the XML locally, change the markup, and then push it out there as a new pipeline. Absolutely all that stuff. I appreciate it. One of the things that you mentioned at the beginning of the question was about taking that production database dump and bringing it back, and you referred to it as this artifact from production. And it's an important concept in the sense that in continuous delivery, an artifact is a product of your build process. And so it can only really come from dev forward, right? If you're taking a database dump from production, then it's effectively a build material. It's one of the sources of your build, one of the source materials that you're going to push into the build. And so it's a really, really important concept, and it's easy for it to get fuzzy in between. In compiled software — like Java projects — in continuous delivery the idea is to build all your binaries in the dev stage and never build them again. So you have a contract that all your binaries are exactly what you said they were at the beginning of the pipeline for each build. And for us, in this kind of world, the database is sort of our binary in that way. So we could go forward into Phing, or we could go forward into questions — just jump in any time with questions. So what we've got here is a Phing call stack. So now, instead of looking at Phing code, this is more the conceptual view of our entire build in Phing, right? And it starts from — in Go we just have that one task that says deploy-dev. And when that task executes, this whole stack of Phing targets runs through, in our case. Just to give you an example end to end. And so the yellow and green targets that you see here, that's your Drupal site build.
And all the white-on-white targets — we could turn those into a template. They would be the basis for pretty much any build that you do in this style, right? It's going to be the dev-opsy stuff that you need to do in the context of your specific build. And then your world is this, the Drupal build stuff, however you're going to do it — whether you're using Features, however you're going to get your stuff out there, database dumps from production as materials, whatever it is — it's going to go in here. And then that save-database target, that's where we're cutting our artifact, and then we come over and release. In our case, what we're doing is every single build on dev, QA, stage, production is its own document root. And it's timestamped, right? So when we deliver to production, we're just sitting out there with this new document root with its own database, with a completed, fully built site ready to cut over, that we can go and run automated tests against and inspect manually through a private URL. It's right there in production. And when we cut over to production, all we have to do is change the symlink, current, from the existing production site to the next docroot over. And as you can figure from there, if we wanted to roll back, it's just a matter of rewriting the symlink. The caveat in this, of course, is that you've got data that can only possibly exist in production, that got written there by users like a second before you cut over, right? So that's not going to be in your build, and it's not going to be in this next database over. And so the other part that we have in the script is that the actual deployment process is: we build the pre-release, and then when we run the cutover, we put production into maintenance mode. We SQL-sync the handful of tables that are actually only updated in production. We SQL-sync those to this other database that's out there running.
And then we cut over, and then we're back up. And since the site's running behind Varnish, there are not that many URLs that are going to be going to the back end at that point anyway. So it's a totally seamless, zero-downtime release. So when we get to the QA stage, this is the call stack of Phing targets in QA. And the thing that you'll notice is, one, we have this other pink target, deploy-website-qa-stage. That's where we're fetching the artifact from dev and loading that database on QA. So when we start QA, not only is the build faster because we're not repeating all the steps of building the site, but we also have an absolute guarantee that the database on QA is exactly what we left off with on dev. And then you'll notice that in the yellow and green target set, which is the Drupal-y stuff, there's not much there, because it's already done. That's what we got in the artifact. And so here all we're doing is loading in more content, in our case, right? There's automated stuff coming in — content. And then, because we've touched the database again by adding content, we create a second artifact for the QA stage, because now we have a new database, and we save that out. And so when we go to the showcase stage, now you'll see that there's absolutely no yellow and green stuff here. All of the Drupal site-build stuff is already done. So when we deliver to showcase — when we cut over to say, this is our release candidate, this is what we're going to show to the product owners to say, is this what you want? Is this ready for release? — when we deliver there, we're not touching anything in terms of the Drupal build. All we're doing is taking that trusted artifact, loading that database, and then wrapping it up with all the other dev-opsy stuff that we have to do to actually deliver this thing.
So our contract that what's on showcase is what we think it is — it's pretty solid. And then when we go to production, we actually do exactly the same thing again, right? So again, when we release to production, the level of confidence that what we're actually putting out on production is what we think we're delivering is pretty high at this point. So, more questions. There's a mic. We can repeat, or you can go to the mic. I had a question about how you effectively — can you explain a bit of the steps that you take to sync data? Say you're pushing up your database from development — how do you sync the existing data that's in production into that rolled-up database? I wonder if you can explain a little bit of the steps. Sure, yes. It's not that complicated in the end, really, because there's a Phing target that implements some Drush SQL tasks, right? And then Drush just runs a few SQL statements. And those SQL statements are just insert statements from the production database into the to-be production database, right? And we do this — on our website we just have a few tables, because our comments are in Disqus and we don't have user-contributed content; all our stuff is editorial from the back end. And so there are just a few things we're capturing on the website. But we also have a Drupal Commerce site, on which you have a whole stack of tables where the data could only exist in production, right? All the order data and everything that goes with it. And we do the same thing for that. It's just a stack of statements that sync the data from one database to another at a moment in time, at the point of cutover, so that every release has its own database. It gives you a really strong audit history of exactly what you had at a previous point in time. It's fast. It's reliable. Okay.
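The cutover mechanics described above — maintenance mode, sync the production-only tables, flip the symlink — might be sketched like this. The drush steps are shown as comments since they need a live site; the symlink flip below uses throwaway directories so the mechanism itself is runnable. All paths and names are hypothetical:

```shell
set -e

# In the real deployment script, roughly (Drupal 7-era drush, shown only as a sketch):
#   drush vset maintenance_mode 1                 # put production into maintenance mode
#   drush sql-query --file=sync-prod-tables.sql   # insert prod-only rows into the to-be database
#   drush vset maintenance_mode 0                 # back up after the flip

# The flip itself is just rewriting a "current" symlink between timestamped docroots:
base=$(mktemp -d)
mkdir -p "$base/docroot-201406041200" "$base/docroot-201406041800"
ln -s "$base/docroot-201406041200" "$base/current"      # live site today

ln -sfn "$base/docroot-201406041800" "$base/current"    # cut over to the new build
echo "now serving: $(basename "$(readlink "$base/current")")"

ln -sfn "$base/docroot-201406041200" "$base/current"    # rollback is the same move in reverse
echo "rolled back: $(basename "$(readlink "$base/current")")"
```

Because each release keeps its own docroot and database, rollback is exactly as cheap as release: one symlink rewrite.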
Yeah, I was just asking because we have a really complicated Drupal install, and every time we push stuff to production, we basically have to run all the update hooks that need to be run directly there. Right. Yeah, the idea of running update hooks on a production database, to me — we definitely need to talk about this, because it's really just a matter of time. Yeah, it takes a little time. No, I mean, it's a matter of time until you get knocked upside the head by it, right? Yeah. The other question I had: you mentioned having different builds on each environment, like build, test, and then release. So the way you do it, is it that on each environment you have separate installs, or is each one of those steps a different VM? How do you manage that? Yes, there's a separate VM for each stage, right? So each stage VM is a replica of production, other than resources — it has less memory and fewer processors, but otherwise we manage them exactly the same as production. So build, test, and release are each a separate VM? No, no, no. This is where we disambiguate the term stage, right? Because we use stage to mean the deployment instance, the VM that we're running on, while Go uses the concept of a stage to mean each of these steps. So a deployment instance is a VM — QA. And then on QA we have an agent that goes through each of the three stages: build the software, test the software, and then, if the tests pass, release it on that VM, so that the publicly available URL for our QA site becomes the new build at that point in time, once we run release, right? Yes, okay. I think I'm done with that. Yeah. Okay, so first off, just a clarification. All of these deployments that you actually look at — are all of those being viewed from a VM? So it's not actually living directly on the server — the deployment is living in a VM that you're looking at?
The deployments, actually — I'm not sure I fully understand the question. The Go agents are installed on the VM that you're going to build the software on, build the website on. And so each deployment instance is a VM. It has its URL. That's where you're going to view it, test it, poke it, prod it. Is that what you mean? Yeah, so I was just trying to clarify. A more traditional deployment is that everything lives directly on the file system — you have your services installed on the operating system directly, and so it's running from that file system. You're building your website document root, like, in /var/lib/mywebsites, and then you're building out each instance from there, right? And then the rest of the server is just your basic server — what packages, how your Apache conf is set up, all that stuff. Okay, so just coming from our background: we've got a Jenkins setup running, and so we've got one build server, and then, once it builds all the files for the project, it syncs them over to the actual... Right, yeah, you can actually do that model. We don't install the agent on production, for example. We only install our agents on pre-production instances. And so on commit, QA, showcase, the agent's building right on that server. But when we do production, what we do is run the agent on the same instance as showcase. We run the agent there, but then we run it through SSH over to the production server. So the agent's not on the production server, but we actually build on the production server. We don't, as you say, build on one server and then push the stuff over to the other server — we run the agent via SSH against the remote server in that case.
There are disadvantages to doing that, because it means that production is different, and the production scripts are a little different, and that's the one thing you really want to avoid. As much as possible, you want production to be absolutely the same, because the smallest difference eventually is going to break something, right? You want the stuff to be repeatable, the same all the way through. Okay, and then my final question: you mentioned in passing being able to integrate Go with an existing Jenkins setup. Yeah. Could you elaborate on that a little bit? Kicking off the Jenkins build would just be a Go task, right? So it would just be the command and the parameters that Jenkins needs in order to run. And then, you know, if Jenkins is producing an artifact, you can tell Go what that artifact is, and it gets passed down the pipeline all the way through. And so that way you get to take existing infrastructure, not have to rewrite it in order to get it into Go, and get it into a context where you have more infrastructure around the whole process, from the initial commit all the way through to deployment on production. So you can take it beyond continuous integration and take it to continuous delivery. Just to add to that: I think a lot of people who use Jenkins often overlook the fact that it has a REST API as well as a command-line interface — kind of a drush for Jenkins — which makes it relatively accessible in terms of integrating with it and calling it from something else. Hi. First, thanks very much for your presentation. Is there a model that Go can support where the agents are not... Well, let me get back to the actual principle. Can you spin up a new VM with Go, or does the agent have to pre-exist?
So, like, is there something where, if Go does not find a matching agent, it triggers a Puppet process or a Vagrant process that spins up a VM, runs Puppet, puts the agent on the VM, immediately runs the tests, and then shuts it down, for example? Well, if you can script it, you can do it. Yeah. I mean, it's probably not a workflow that I've seen, but again, all the things that you describe are scriptable. So not necessarily a good fit for Go, but it definitely works if you can... It's a pretty complex scenario where you're... I'd say it's not — in the world of cloud computing, it's not that complex anymore. You're right. You're right. It's not that it's a bad... It may not be common practice, but it's as good a fit as it is in Jenkins. You can certainly do a plugin which attaches to any cloud API and says, hey, there should be at least this many VMs processing this, and if the backlog gets so big, start five more — all that sort of thing you definitely can do. So a typical case would be where your deployment involves some sort of massive amount of processing, like giant test suites or Hadoop processing, before you kick something out. Okay, cool. Thanks. Here's kind of a more enthusiastic answer: if your use case was that you're a hosting provider and you want to put Go in as software-as-a-service infrastructure as part of your offering, then yeah, you've got a use case where you're actually not going to know what you're going to need — it's going to be done on the fly. I was thinking of it more in terms of how an IT shop does their stuff internally, but yeah, you could build Go deployment as software as a service for a hosting company or an IT department that's that complicated. Hi. In your version control system, do you have a branch for each stage in that build process, and how does it trigger the process to kick off? Do you have, like, Git hooks that do all that?
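Going back to the Jenkins integration mentioned a moment ago: wrapping an existing Jenkins job as a Go task can be as simple as an exec task that hits Jenkins' remote-trigger REST endpoint. A rough sketch in Go's XML config — the URL, job name, and token here are all hypothetical:

```xml
<job name="run-jenkins-build">
  <tasks>
    <!-- Jenkins exposes an HTTP endpoint for remotely triggering a job;
         --fail makes curl return non-zero (failing the Go task) on an HTTP error -->
    <exec command="curl">
      <arg>--fail</arg>
      <arg>-X</arg><arg>POST</arg>
      <arg>http://jenkins.example.com:8080/job/site-build/build?token=SECRET</arg>
    </exec>
  </tasks>
</job>
```

From there, the Jenkins job's output can be declared as a Go artifact and passed down the rest of the delivery pipeline.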
That's all built into Go, so we don't really have to think about it. In Go, we just declare the materials — we just say, here's this Git repository — and we tick a checkbox that says poll this for changes, and then you're done, right? You can do that in the first five minutes, and then you push a commit into that repository on the proper branch and Go will kick off your build. So you have a branch for each stage, basically? We do not, no. Typically, for a build pipeline, the whole pipeline is going to be built from the same branch, because by definition the pipeline is a single recipe that you're trying to move down into production. And if you're talking about different branches, you're talking about different recipes, in which case you're talking about different builds. You could have different pipelines for different branches, right? So you could have your master branch and a feature branch and have both of those running as pipelines, or you could make the branch a configurable part of the pipeline, so that for any particular pipeline run — you know, run number 622 could be from master and 623 could be from a feature branch — whatever works there. One of the things we found is really cool: in continuous delivery, one of the things you're trying to do is cut down the cycle time between when you get a feature and when you deliver it to production. And we're a very small organization, a small team working on this, and we were doing GitFlow before we did the pipeline. But once we had the pipeline up, we found that because we were delivering stuff every day, we could get rid of our feature branches, and we only have master now, because it's granular enough for us.
Now, that's not going to work for everybody in every case, but it was granular enough — our stuff was actually getting through the pipelines fast enough that it wasn't that complicated if we had different things happening at the same time — and that simplified things for us a lot, being able to take it down that way. Okay, thanks. This question is really for you, Rob, as a follow-up to a couple of questions ago and the commentary about how this could be something that a hosting company might potentially offer. Where does something like continuous delivery fit in with Acquia's offerings right now, especially given that one of the things that's powerful, especially for somebody like an Enterprise Cloud user, is the fact that there are multiple environments available all on the same cloud, making for consistency between those environments? I mean, I could see how you could cobble this together with the existing Cloud API, but is there any future thought for how you might make this a little more accessible through Acquia's offerings? Or do either of you know of anybody else doing that sort of thing in the hosting space? I guess the current state of the art on most of the big projects does fall under the "cobble things together with the Cloud API" thing, and that's unfortunately kind of the state of the art across hosting a lot of the time. And so if you look at some of the various things like, you know, Site Factory — if you look at any big segment of the industry which has the problem of managing a lot of sites, often parts of everything we've talked about have been embedded into some more specific product, like Site Factory and so on. And so in some ways I think a lot of this is a generalization of that, and it could be a more broadly applicable thing down the road. But I don't know where the future is really going, obviously.
I think, you know, if you have a specific project, the best thing is to look at what you actually have to do with your own project and try to stay as close as reasonably possible to what other people who have the same problems are doing, so that you can live off their work. I think that's going to curve you toward using something like Go and, you know, building jobs and tasks in it that talk to cloud APIs. And I don't think Acquia has anything like that in the pipe. We haven't talked about offering a hosted Jenkins, or a hosted Go wrapped in a pretty GUI, or whatever; I don't think anyone's working on that except for specific big projects integrated with other things like Site Factory. So is that... Yeah, I mean, the question was coming from our perspective: there are some benefits that we could reap with a smaller team, but with minimal support available for any kind of DevOps operation — it's kind of called my spare time. I was curious if there's anything available that might ease the process without eating too much into the team's maintenance of this tool, which allows us to get a bunch of benefits. Yeah, and so there's a bit of commitment when you add an additional tool into your workflow: how many different things do you want to manage? So I think as you look at Go and whether it belongs in your project, try to look for other things that it could replace, so you reduce the total number of tools that you're maintaining. Like, try not to just add — try to add one thing and take away two things. Thanks.
There was also — I guess one last question; well, a couple more. Chris Luther had a question that I think would be a good summary question: where does Go stand compared to other tools? He mentioned specifically Travis CI, Bamboo, and Jenkins. I think we've touched on that throughout the other questions, but the basic answer is that Go is a little more ambitious in reaching into more of the overall process. Most of those tools deal with things inside the dev team, and the idea of Go is that it might replace those tools, or call those tools, but it's also going to reach out more into your total business process — so that business people are going to look at the screen, see green, and make a decision whether or not to deploy based on business reasons, like "do we want this to go out," instead of code reasons. Do you want to elaborate on that? One of the biggest problems in software development is the siloing of efforts between different kinds of teams — devs are devs, QAs are QAs, and business analysts are business analysts. And so the whole agile movement, for over a decade now, has been trying to emphasize and bring together how you get people actually working together, all the time, on the same software, in a meaningful way. And one of the things I think we can really do by building out automated build infrastructure — as a specific example — is bring the BA, the person who's actually negotiating with the product owners, into the build process in a meaningful way that's measurable. That's what's missing when we do continuous integration: with the kind of testing that's done in continuous integration, it's difficult to map it to the actual deliverables that were promised to a client at a point in time. And so there's this kind of faith-based thing, that we're doing the testing and the testing means we're good — but does the testing really mean that when we try to deliver, the client's going to sign off,
and it's going to go to production? That's hard to measure. And so what we want to do is figure out what the product owners are actually asking for in a particular iteration and make sure the devs build that. But then what we're proposing — and that's what Jez writes about in this book — is that BDD testing, behavior-driven development testing, is really the most useful framework they've found in trying to close that gap in continuous delivery. For PHP, that's what we use: we have these Behat tests. And so the proposal is, if you can get the BA — the person who's actually negotiating requirements, and the person who's saying that the dev team is actually delivering those requirements — to translate those requirements into domain-specific-language tests... For example, if you promised a JSON API from the website, then one of the BA's roles is to write a Behat test that just says: as an API user, at this endpoint, the result will be valid JSON, and these JSON elements will not be null. A contract that you have for that, right there, right? So if the BA translates that stuff into executable automated tests that they can understand as a non-technical person, and then puts that in as a gateway to delivery, now the BA is actually technically involved — not just philosophically, touchy-feely, going-to-meetings involved — in delivering the software. They actually have a technical role in writing the gate, right? And it defends dev, because if those tests are broken, then it was the BA who was supposed to clarify that — on the domain level, on the product level, on the stories that we're promising to deliver — they have a role of putting this right in the pipeline. Awesome stuff. It really can change the way teams work. So, do you have one more question? So, one of the things with website builds, as opposed to web applications, is that very often it's not only does it work right, but
does it look right. I was in a talk where they were using a tool
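The JSON API gateway test described above might be sketched in Behat's Gherkin syntax like this. The endpoint and the step phrasings are hypothetical — they assume a step library along the lines of a REST/JSON extension, not the presenters' actual suite:

```gherkin
Feature: Product JSON API
  As an API user
  I need the product endpoint to return valid JSON
  So that downstream consumers keep working

  Scenario: Product endpoint returns valid, complete JSON
    Given I send a GET request to "/api/v1/products/42"
    Then the response status code should be 200
    And the response should be valid JSON
    And the JSON node "title" should not be null
    And the JSON node "price" should not be null
```

The point of the sketch is the register: a BA can read and write this, yet it runs in the pipeline as an executable gate before release.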