My name is Michael Godek. I'm with ThoughtWorks, and I'm from Austin, Texas. ThoughtWorks is based in Chicago, has offices in 16 countries around the world, and is a software delivery company: ThoughtWorks delivers software. I'm co-presenting with Mike Menetsky. I'm Mike Menetsky, director of technology at Four Kitchens. We're a small but mighty web development shop in Austin, Texas, known for things like Pressflow and the Economist migration. So welcome aboard. The hashtag for this session is go4cd, which is on the bottom of every slide: G-O-F-O-R-C-D. Please tweet your questions and comments as we go along; we should have time for questions and discussion toward the end, and there should be plenty of it. Pushing some material out as we go will be helpful. This session is going to be fast and fairly dense. We're going to have to assume a lot, because we just can't explain all the context: we'll assume you know about Drush and that you know something about building and delivering software, because we want to get you to the really good stuff. We'll move quickly, so there will definitely be places where you're missing context, simply because the field is so broad. Wherever that happens, ask a question about it on Twitter, or come talk to me afterwards and we can discuss it more. This is a big topic, and there really is a lot to cover. To promote the hashtag, most importantly, we're going to give away a copy of this book to someone in this room at the end of the session. This is Jez Humble's book, really the seminal book on continuous delivery, and one of the best books you can read in software development today. It goes to whoever posts the most interesting tweet, question, or comment. We're also giving away a copy of The ThoughtWorks Anthology, a series of essays on software development that's actually pretty good. And if you have kids, this is a really good one: Kermit the Frog, "it's not easy being green," because once you get into continuous delivery your life becomes this thing of watching pipelines that either break and turn red or succeed and turn green. It's really true: it's not easy being green. There are also two sets of continuous delivery posters we're going to give away, so about five people will walk out of here with something. They're very cool posters, they're Creative Commons licensed, and you can find them on the internet and make your own if you want. They're an interesting study in and of themselves; it's worth spending time looking at the information on them. All right.
There are also session notes at the URL listed down here, thoughtworks.com, under Insights, Blog, DrupalCon Austin, because we gave this session in Austin and the same session notes are there. Those notes include a reading list: Martin Fowler's original article on continuous integration, the case for continuous delivery, the history of this product and where it came from, and links to some primers and other tools that are relevant here. So the session notes have a lot of valuable resources for following up on this session, and hopefully some of you will go away from here and actually do this stuff, or maybe you already are. Enough said, let's go. This session is about build automation tools, but in order to understand how to use these tools, we need to talk about why we use them. Build automation is a component of a practice called continuous delivery, or CD, so we need some context about the kinds of problems CD intends to solve. The definitive resource on continuous delivery is Jez Humble and David Farley's book Continuous Delivery, and in the introduction of that book they look to the industrial consultant W. Edwards Deming, whose work in the 1970s, back when Jimmy Carter was president, introducing the Japanese method to American plant managers, revolutionized thinking about the manufacturing process. Jez writes in Continuous Delivery that Deming's work is the foundation of CD practice in software today. So we're going to start with the opening paragraphs, the very first paragraphs, from Deming's 1982 book Quality, Productivity, and Competitive Position, because it sets exactly the right context for understanding why we value software build pipelines. So here's Deming: "The aim of this book is to illustrate with simple examples that productivity increases with the improvement of quality. Low quality means high cost and loss of competitive position. Some folklore: legend has it in America that quality and production are incompatible, that you can't have both. A plant manager will usually tell you that it's either/or; in his experience, if he pushes quality, he falls behind in production; if he pushes production, quality suffers. This will be his experience when he knows not what quality is nor how to achieve it. A clear, concise answer came forth in a meeting with 22 production workers, in response to my question: why is it that productivity increases as quality improves? Less rework. There is no better answer."
"These people know how important quality is to their jobs. They know that quality is achieved by improvement of the process. Improvement of the process increases uniformity of output of product, reduces rework and mistakes, reduces waste of manpower, machine time, and materials, and thus increases output with less effort. Other benefits of improved quality are lower costs, better competitive position, happier people on the job, and more jobs through better competitive position of the company. These are some of the lessons that management must learn and act on. Reduction of waste transfers man-hours and machine-hours from the manufacture of defectives into the manufacture of additional good product. In effect, the capacity of a production line is increased. The benefits of better quality through improvement of the process are thus not just better quality and the long-range improvement of market position that goes along with it, but also greater productivity and much better profit as well. Improved morale of the workforce is another gain: they now see that management is making some effort themselves, and not blaming all faults on the production workers." End quote. That last part was the one that really nailed me, because I've spent most of my career in IT shops where, in the end, when we pushed broken software, they were blaming us: blaming the dev, or QA, or whoever. And it's really more about the process than about the skills of the people. Well, it's about everything, but the process has a lot to do with it. So a key measure of what we're trying to achieve in our use of build automation tools is to have our team spend less of their time fixing things, i.e. rework, and more of their time available for new stories. There are other goals and measures, for example making the decision to release a business decision instead of a technical one. That is to say, when the stakeholders determine that a particular release candidate plays the stories they want to see go live, you shouldn't have to go back to the devs and ask whether it's really ready, or ask them for time to actually do the release, because at that point the release to production should be the click of a button, with rollback options similarly simple. Taking the risk out of release is a fundamental objective of continuous delivery.
That's one of the things we're trying to get done. In that sense, all of the effort of build automation is like a musician's rehearsal: when it comes time to walk out on stage, the technical issues should already be resolved. An automated build pipeline is where your team rehearses for consistently successful deliveries. The rest of the session focuses on the stuff devs love, but that's only one side of the bigger picture, and the ultimate measure is this: is the effort you're putting into build automation resulting in more frequent deliveries with less rework? If so, then it's something that will be profitable to do. If you're not getting more frequent deliveries with less rework, then you have to question whether what you're doing is actually useful, because you have to find some metric to determine whether it's worthwhile. The benefits to large, high-risk projects are pretty obvious, but agencies that do a lot of small one-off projects benefit from continuous delivery practices as well, especially in reducing the risk associated with delivering changes to projects where there isn't any budget left to fix things when they go wrong. If you can get a solid delivery pipeline underneath this stuff, you can control your costs effectively. Just to keep things in context, continuous delivery builds on agile and DevOps practices; it's an extension of those things. If you don't already automate infrastructure, which is not what we're talking about specifically here, you should probably do that first. So before we dive into this material, we're going to take a step back and talk about the context of what DevOps is, because it's the foundation underneath all of this. Okay, so how does this all fit with DevOps, and what is DevOps? The first line of Wikipedia on what DevOps is says it's a portmanteau of development and operations, just a combination of those words, and that actually speaks volumes as to what it really is. It started out more as a cultural movement, similar to agile, rather than a technical movement, out of a need to increase collaboration between operations and development teams. The problem was, and still is in many organizations today, that developers want to get new features out as fast as possible, and the operations team has the almost opposite task of making sure that the existing features keep working correctly. So it's no surprise that those two organizations butt heads very frequently. The word has really grown into a much bigger, broader term. It's grown into a job title, which a lot of people in the DevOps community complain about and find wrong-headed. But you can say the same things about agile: agile has become a buzzword, everybody's doing it, and everybody says that everybody else is doing it wrong.
But that doesn't make agile less useful; it's just that the word has grown to mean more than was intended, and the same is true for DevOps. A lot of the principles that make DevOps important for big enterprise organizations and big software projects really apply also to making the lives of all developers better. Some of the principles that came out of that movement are configuration as code, automation, metrics for everything, and finally continuous delivery. Configuration as code is really the first step in automation. How many people here have pushed a release that worked perfectly locally but completely blew up on production and didn't work at all? And it was because of some very small difference in a minor version of some library, or of PHP, or something like that. We've solved that problem on the code level, and everybody here, I hope, is using source code management to manage their software development. It's becoming increasingly common, but still very new for lots of people, to also put their servers into source control, storing those configurations in a way that is repeatable, automatable, and fits into a continuous delivery pipeline. In Drupal, we're almost there, but there's another step before we even get to server configuration: it's our Drupal configuration that we need to be able to put into code. So really, the first step for anybody working with Drupal is to start using Features and to start using update hooks. Features has sort of become a thing, but how many people actually use Features? All right, and how many people enable modules in update hooks? Okay, much fewer. So people are using Features, but then still deploying stuff by hand. The point you want to get to is not having to do anything on production once you push code, except run update.php. It's not as hard as it seems, and it's really worth it. And how does that relate to the quote from earlier? Less rework. Eliminating those differences between environments means you don't have to fix bugs between them: if it works locally, and you can prove that those two environments are the same, you can confidently push code forward and know that it's going to work in both environments.
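As an illustration of what that "push code, then just run the updates" end state can look like, here is a minimal sketch of a production deploy step once configuration lives in Features and update hooks. The @prod alias, the flags, and the exact ordering are illustrative assumptions, not the presenters' actual script:

```bash
#!/usr/bin/env bash
# Minimal sketch: once configuration lives in code (Features + update hooks),
# the production step is just "get the new code, run the updates".
# "@prod" is a hypothetical Drush site alias; adjust paths and aliases to taste.
set -e

git pull --ff-only      # new code arrives (or a new docroot is symlinked in)
drush @prod updb -y     # run update hooks: enable modules, apply schema changes
drush @prod fra -y      # revert all Features so exported config wins over the database
drush @prod cc all      # clear caches so the new configuration takes effect
```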
So the next step, once you have the ability to automate things, is of course automating things. An important thing to think about when you're thinking about automation is that it may take a week to automate something that takes five minutes every week, and on the surface that seems like it's not worth it. But factor in the time it takes to unfuck the one time you did those steps incorrectly, plus the client management time to calm them down after the site breaks, plus the time needed to have a couple of drinks to forget how bad a day that was, and suddenly that equation gets turned on its head, and it becomes worth it to automate that five-minute task. So it's not about automating for speed; it's for reproducibility, and again, to reduce the amount of rework. The next thing is metrics and measuring. At this point you have automated builds and they're going out, and if you're just following along and this is the process you're building out, you're right now in an extremely dangerous state. A colleague of mine who works at Chef, one of the automation tool builders, has a great quote: anybody can make mistakes, but to really screw something up you need automation. Because at this point you can push something to production completely hands off; there's a button you press. But you don't necessarily know if what you did actually worked. You're trusting that those environments are the same and that your functional tests were good, but did you really check every page in the site? Are there performance problems that have crept in? If you don't have metrics, if you're not measuring how things are working, then you don't know whether what you're doing is actually working. And it's not just about collecting that data; it's about presenting and providing that data to people who can take action on it. It's not enough just to have the errors logged somewhere. How many people here have a visualization of the number of notices, and whether they increased or decreased with a build, or of the number of errors that have started to creep up in production? If you can start looking at that and knowing where those errors crept in, you can catch problems with a build before they pile up, because performance problems come in two varieties: sometimes there's one big problem, somebody wrote a really long query, but most of the time they come from different things being combined, small problems building up on top of each other.
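To make the "metrics for everything" point concrete before moving on: even something as small as charting the logged error count per build gives you the trend line described above. A minimal sketch, assuming a Drupal 7 site with the dblog module enabled and a hypothetical @prod Drush alias; where the number goes (a dashboard, Graphite, a CSV) is up to you:

```bash
#!/usr/bin/env bash
# Count errors-or-worse in the watchdog table after a deploy and print a metric line.
# Drupal 7 severity levels: 0=emergency ... 3=error, 4=warning, 5=notice.
errors=$(drush @prod sql-query \
  "SELECT COUNT(*) FROM watchdog WHERE severity <= 3" | tail -n1)

# GO_PIPELINE_LABEL is an environment variable Go sets for jobs it runs;
# fall back to "local" when running the script by hand.
echo "build=${GO_PIPELINE_LABEL:-local} watchdog_errors=${errors}"
```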
So finally, that's where we get to continuous delivery, which includes other ideas such as automated testing, continuous integration, and continuous deployment. In big organizations and big software projects, the amount of time it takes to set up those pipelines and processes is really easy to justify. For a small shop, it doesn't need to be a full dive: doing any one of these things will bring you greater productivity; you don't have to do all of them at once. Automated testing is a great case where, if you do too much of it, you can actually slow down your whole production pipeline, because you have developers sitting around waiting for tests to run instead of writing new features. So it's about finding a balance: writing one test is better than writing none, and automating all the way up to staging and then having manual steps from staging onward is better than starting from scratch. The final step is being able to see and visualize that pipeline and how it's all working, and that's where Go comes into this. And finally, this all ties back to agile, in that the highest priority is to satisfy the customer through early and continuous delivery of valuable software. Putting all the effort into continuous delivery is really one of the things that enables agile: it takes work outside of the project to make the project able to move fast and produce features. Cool, so let's dive into the tool and see if we can build some stuff. Again, continuous delivery comes right out of that first statement of the Agile Manifesto from 2001; the practice is just an extension of it. This tool we're looking at, Go, started around 2000, when ThoughtWorks produced CruiseControl, which we think was the first modern CI tool. CruiseControl was an open source product, and I guess it's still kicking around somewhere, but around 2008 or 2009 Jez Humble, who was a ThoughtWorker then (he's at Chef now), led the team that rewrote CruiseControl into Go. It was then a licensed ThoughtWorks Studios product for enterprise continuous delivery up until this year, at which point the desire to advocate for software excellence by building a community around continuous delivery became more important than having a licensed product, so it made sense to open source it and let people use it. That's where it came from; there's more on the history in the notes. Let's just go into the product. Like Jenkins, Go consists of a server managing many agents, and probably the most important distinction is that in Go a pipeline is a first-class object. The Go server is your interface to the agents, where you configure and monitor your build pipelines; the Go agents get the job done, they're the ones actually doing the work. Jenkins has a master-only mode that lets you run builds on the server itself, but with Go you're always using agents: you have the server, and you have the agents out there doing the work. Go has a number of really important features baked right in, such as trusted artifacts and fan-in/fan-out support, that you'd have to build out with plugins and glue code if you wanted something similar with Jenkins. With Go, you can redeploy any version that you've previously deployed, at any time, very easily. Go really stands out when you get into modeling complex workflows and managing dependencies between software builds. Say, for example, you have a bunch of in-house custom modules that get included in most of your sites: you can model the dependencies in Go so that all of your projects that include those modules get automatically rebuilt when code gets committed to any one of them.
So when a developer makes a change, they're likely working on one story for one client, and they may miss how their change breaks other projects they're not actually working on, where the dependency is in code. Surfacing the break in the dependent projects close to the time of commit, rather than weeks or months later when that other project gets a change, is a really huge benefit to quality and productivity. Go also has a lot of nice end-to-end visualization and auditing features, such as a build compare view where you can diff both the commit messages and the actual files between any two arbitrary builds, so that if something goes wrong in production you can very quickly identify the sources of the change. Looking at the interface of the tool, some of the things you get in Go as first-class, built-in concepts are the ability to trigger a pipeline as a unit, to make one pipeline depend on another, to make artifacts flow through a pipeline, to have access control at the pipeline level, to associate pipelines with environments, and to compare changes between pipeline runs. Those will make more sense as we go forward. So here are six concepts we're going to go through in the next 20 minutes: what build materials are and how you declare them, how to set up build stages, how to set up jobs for each stage, how to match agents to run the jobs, how to configure tasks for agents to execute, and what build artifacts are and how you use them. With those six concepts, you'll be ready to go. A key takeaway from the session is getting the idea of what a build pipeline is really all about, so here's a quote from Martin Fowler, chief scientist at ThoughtWorks, on the concept of a build pipeline (I can't do the British accent): "One of the challenges of an automated build and test environment is you want your build to be fast, so that you can get fast feedback, but comprehensive tests take a long time. A deployment pipeline is a way to deal with this by breaking up your build into stages. Each stage provides increasing confidence, usually at the cost of extra time. Early stages can find the most problems, yielding faster feedback, while later stages provide slower, more thorough probing. Deployment pipelines are a central part of continuous delivery." So increasing confidence as you go forward is a really big part of the value you can pull out of this. Some of the other features we're looking for from our build pipeline: running our automated tests before we cut over a build, so that when a test fails we don't trash the environment going forward; having a pre-release, being able to release into production and preview it before you actually cut over, which is very nice in terms of building confidence; zero-downtime releases; and really simple rollbacks. Taking the risk out of delivery is a fundamental principle of continuous delivery; you can achieve it in different ways, but that's essentially the goal. Another key takeaway, hopefully, will be to understand the distinction between build materials and build artifacts. Build materials are the ingredients going into your recipe, typically source code and binaries; artifacts are the cupcakes that come out of the oven. Because you're always tinkering with the recipe, no two batches are really going to be the same.
Trusted artifacts allow you to have contractually consistent cupcakes within the context of a single build pipeline run. In the Java world, the trusted artifact for a build would be the binary, the product of the compilation that you do only once in a pipeline; you never compile something twice. For a Drupal build, our cupcake is probably the database we produce as a product of the build. Not what we're pushing into the build, but the product of it. That's our cupcake; we'll come back to that in a minute. Declaring your build materials is the foundation of your pipeline, so this is where you start out. Go supports a number of version control systems, including Git. Once your build materials are declared, which just means telling Go where your source code is, the Go server will poll those repositories and automatically kick off a build in response to a commit, and the Go agents will take care of pulling the source code when the pipeline actually runs. So now you're getting to the foundation of continuous integration: having a new build kick off automatically whenever your materials change. You just tick the automatic pipeline scheduling checkbox in the pipeline's general options dialog, tell Go where your source code is, and now you've got CI, or you're getting there. So let's talk about build stages. A pipeline is essentially a container for stages. Here's the set of stages we use: commit, QA, showcase, and production; that's the way our code flows. Maybe you'd call them something else, or you'd have more of them or fewer. Whatever works; it's your pipeline. Before we built our Go pipeline, when we were deploying with Drush and bash scripts, a really common anti-pattern was this conversation we'd have with devs, especially on a distributed team, about what was on each build. We'd have these conversations: what's on QA? Has this been on QA? It's not on QA. Devs were testing for things on QA that the devs knew were broken. And so when we got our stuff into the pipeline, it really brought a level of stability to the team, and it increased everybody's confidence that what they were looking at really was what they thought they were looking at. It was a really nice change. So here are our stages as built in Go. The commit stage kicks off automatically, based on that pipeline general options setting we configured earlier.
So the trigger type you see here doesn't matter for that first stage, but the on-success trigger type indicates that the QA stage and the showcase stage will automatically kick off as long as the previous stage doesn't fail. In this example, and typically, production only gets built when we kick it off manually. If you set production to on-success as well, now you have continuous deployment, as opposed to continuous delivery. Continuous delivery is a practice; continuous deployment is a point you can get to where, if you commit code and it passes all of your test coverage, it automatically goes to production. We do that for static HTML help files for product help: if the commit passes a link checker, we just automatically deploy it, because that's the only thing we think we can break with that. But for software we do more; we deploy to production manually. Now, your pipeline stages probably aren't going to correspond one-to-one to release instances, so let's disambiguate the terms. What your team thinks of as a stage, this thing of commit, QA, showcase, production, we'll start calling release instances; they're these VMs. What we actually have is three pipeline stages for each release instance: one stage to build the system, another stage to test it, and a third stage to release it, breaking it up. By separating the automated tests into their own stage, we can rerun the tests without rerunning the build, which is pretty straightforward, and by making the release its own stage, you get a lot more flexibility in managing things. Overall it puts you in control. So here's the stack of three stages for each of four release instances, which means we now have a pipeline with twelve stages that things run through. Let's drill down into one of the stages so we can actually see something. Here are the three stages in Go server for the showcase instance. Stages are containers for jobs; jobs are what agents actually run, and the number you see there on the right is the number of jobs in each stage. We'll come back to that. So far we've looked at build materials, triggers, and stages, and now we're going to drill down into jobs, tasks, and artifacts. To review: a pipeline is a container for stages, and stages are containers for jobs. Jobs in the same stage run in parallel, so they need to be functionally independent of one another: no dependencies between jobs, and all jobs need to succeed in order for the stage to succeed. To run a job, Go server finds an agent that's suitable to run it and hands it off to that agent to execute. The agent is typically out on a VM somewhere, away from the Go server, and Go server matches jobs to agents based on resources you define in the job and in the agent. Essentially, that just means you put tags in the job and the same tags in the agent, and then Go server figures out that that's an agent you want to run that job. It's not that hard. This view shows two jobs in one stage. In our example it's a Drupal website and a Drupal Commerce store, which are two separate builds that go into making one site: from the end user's perspective it's one website. The two sites have a common header and footer,
and they're themed the same. To the user they seem like one site, but to the team they're two separate, dependent code bases. So when a commit is pushed into either code base, we have Go rebuild both projects in parallel, with automated tests that validate that the interface between the two sites actually works the way we expect it to. So it's one pipeline, two jobs. Here's the agent view in Go server. When you install an agent out on a VM, you tell the agent where the server is, and the agent goes and registers itself back with the server and shows up in this list. You don't have to do anything else. Then it's just the tag matching, which is highlighted on the right: the same tags in the job as in the agent, and your job runs. It's not that hard. If you have multiple agents with the same tags as the job, Go just picks the first one it finds that's available to run it. If two jobs get scheduled and there's only one matching agent available, Go will just run them in sequence, one after the other. Makes sense. Okay, so if a pipeline is a container for stages, and stages are containers for jobs, then it's probably no big surprise that jobs are containers for tasks. A task is a command to be executed, so finally we're getting to where we actually get to do something. Setting up pipeline stages and jobs is effectively configuring the metadata of the pipeline, setting the stage for the action that's going to come; the tasks are the action. If you do your deployment manually today, building your pipeline is just taking everything you would do, one by one by one, and putting it all into tasks in the right sequence, and now you've got it automated. In this example, building our Drupal site is done in two tasks: the first task copies the appropriate properties file over for the target environment we're building, because we have different settings for QA, showcase, and production, and the second task actually builds the site. Before we drill down into the tasks, let's take a look at these four meta-concepts in a Go pipeline. Go has carefully considered abstractions, providing tasks inside of jobs inside of stages inside of pipelines, with a mix of parallel and serial: pipelines run in parallel and jobs run in parallel, while stages and tasks run serially. This design was motivated in part by Joel Spolsky's law of leaky abstractions. The idea is to have powerful enough abstractions, the right ones, to make it possible to model your path to production effectively and, more importantly, to remodel it as you learn and evolve over time, while at the same time resisting the temptation to continually introduce new, unnecessary abstractions that are only going to make things more difficult in the long run, because they will be leaky. You've probably seen pipelines in Jenkins, but pipelines in Jenkins are everybody's own abstraction of how you string together a series of tasks. Go was designed from the ground up with the idea of the pipeline as a first-class object, where you can get your abstractions right. Okay, so we're back to tasks, drilling down into the task configuration.
So here's the dialog for configuring the copy command, which was the first of our tasks. You can see there's no big magic going on here: it's cp. You just tell the Go agent what to do in the task, the same way you might on the command line. When you install the Go agent, you configure it to run as a particular user on whatever machine it's installed on, so you need to make sure that the command you're running is on the path for that user, or put in the full path of the command, and the agent user needs to have the rights to do what you're asking it to do. Each argument is specified on a separate line: if you did this on the command line you'd say cp with two parameters, and here you put each one on a separate line. The working directory you see below is relative to the agent's home directory, so wherever the agent is installed, that's ground zero for the working directory; here it's the scripts folder under the agent install. The second task in our job is a Phing call to actually build the system. Phing is Apache Ant for PHP, a target-based XML scripting language, and there's a link to a very good Lullabot primer on it in the session notes. Build systems are generally classified as either target-based systems, like Ant and Phing, or product-based systems, like Maven and make; make is a product-based build system. Target-based systems are easy to get started with, I think, but they tend to get hard to read as they grow in complexity, because you've got all these targets and it's hard to see how they all fit together. Product-based systems are kind of hard to debug; they tend to be all or nothing, it either makes it or it doesn't, and ad hoc reconfigurations are more difficult. With a target-based system you can pull a target right out of the middle of it and run it on the command line; you don't have to start at the beginning if you don't want to. Probably the best choice is the devil you know, whichever works for you. We chose Phing. Go is a tool for orchestrating your build, not for locking you into any one particular tool; use whichever tools you want. For example, if you already have Jenkins builds, you don't have to re-implement them to put them in a Go pipeline: you can kick off a Jenkins build as a Go task and essentially wrap existing Jenkins infrastructure inside a first-class pipeline. You don't have to start over to do this. So here's the task dialog for our Phing call, where we're actually building the site. We're giving the full path, /usr/bin/phing, as what's going to run; the -f says the file name comes next; the file we're calling is build.xml, which is the default in Phing; and the Phing target is deploy-dev. Phing is going to look that target up in build.xml and go do whatever that happens to be, and it's going to look in the scripts directory under where the agent is running.
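If it helps to see those two tasks outside the dialog boxes, this is roughly what the agent ends up running from its scripts working directory. The property file names here are illustrative assumptions; the phing path, file, and target match the dialog described above:

```bash
# Roughly what the two tasks in this job amount to when the agent runs them.
cp properties/qa.properties build.properties   # task 1: select the per-environment settings file
/usr/bin/phing -f build.xml deploy-dev         # task 2: run the Phing target that builds the site
```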
So let's drill down into that Phing target; this is where it starts to get more fun. Here's the actual Phing target we're calling. The target name there at the top is deploy-dev, and that's our commit phase, the first phase of the pipeline. The first line is a Phing built-in command, tstamp, which initializes Phing's date/time properties, which we then use for the docroot and database name in the build. So every build gets its own docroot and its own database, named with a timestamp, and we don't have to worry about it. The main action here is the call to the deploy-website target, which is the one highlighted there, so we'll drill down into that next. The rest of the stuff in there is environment cleanup and setup, everything that needs to be done step by step all the way through the build. So here's the deploy-website target; now we're getting into the Drupal site we're building. The first lines copy source code from the Git repos where the Go agent is managing it for us. When the Go agent kicks off, it pulls all your source code from your Git repositories, whatever your source repos are; it pulls in all the materials, and the materials are sitting there stacked up like on a loading dock over by the agent. It's sort of like the truck comes to your home and offloads the building materials. So here we actually move the building materials: we create the document root where the site's going to go, we move the source code of the Drupal site into it, we deploy settings.php, all the stuff that needs to be done, and then, when the code and the document root are in place, we hand it off to drush site-install. Drush is supported in Phing with a plugin called DrushTask.php; Drush is not natively supported in Phing, so you need to add this plugin. Once you have it, and obviously if you're doing Drupal sites you will, you drive your build with Drush. So everything you're already doing manually, or interactively, or scripting with bash scripts, if you're using Drush, you just wrap it up here in Phing, and it's beautiful.
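For readers who don't use Phing, here is the same sequence expressed as plain shell: a sketch of roughly what a deploy-website target like this does. The paths, database name, and install profile are illustrative assumptions, not the presenters' exact build:

```bash
#!/usr/bin/env bash
# Sketch of a timestamped Drupal build: fresh docroot, fresh database, every time.
set -e
STAMP=$(date +%Y%m%d%H%M%S)
DOCROOT=/var/www/builds/site-${STAMP}
DB=site_${STAMP}

mkdir -p "$DOCROOT"
cp -R checkout/drupal/. "$DOCROOT"/                 # source the Go agent already pulled from Git
cp config/settings.php "$DOCROOT"/sites/default/settings.php

cd "$DOCROOT"
drush site-install standard -y \
  --db-url="mysql://drupal:drupal@localhost/${DB}" \
  --site-name="My Site"                             # install into this build's own database
```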
Just to show you there's nothing up my sleeve here, here's a drill-down all the way to the drush call. It's just the same thing you'd do on the command line: if you can do it in the shell, you can do it as a Go task. The rest of it is mostly a matter of paths and permissions; permissions have been kind of a big one for me, it's sometimes been a lot of work making sure we have the permissions for everything right on the server. After this drush site-install returns, we have a series of targets that we run through that enable the appropriate contrib and custom modules and build our menus, blocks, taxonomies, views, users, and permissions, as well as any content we've declared as metadata for the site: content that really needs to be in place for the most basic stories to play. We'll build a little bit of content out; if you have content that's linked to a menu, we'll build that content so we can test the menus, that kind of stuff. So Go tasks really are a wrapper for Phing in our case (you could use other things), and Phing is a wrapper for Drush. You're all the way down to the ground now. In the end, this is again just an example of what we're doing in our build: at the very end of the build we deploy Redis and Varnish settings and clear the caches, and now we have our active site. Just as a digression, the Redis cache settings and the Varnish VCL are things you might prefer to manage on the server, not in the build. We put those files in the build so that they go through the wash cycle every time: if we make some little tweak to the Varnish VCL, it goes through the whole testing stack, so if we accidentally break something we're more likely to catch it. But it is server-configuration-type stuff. So here's the interesting part, at least for the way we're doing it. After we build the site, but before we've written any configuration to the database that's specific to the environment (so we're building QA, but we haven't yet written the variables that make that site QA-specific), we take a database dump, which we then hand off to the Go server as an artifact. We're not using database dumps as a way of delivering configuration or content into the build; that is, we don't use database dumps as build materials. Instead, we use database dumps as a signed and sealed contract of what this particular pipeline run is about, to make sure that once our cupcake, the database dump, comes out of the oven, every subsequent stage in the pipeline is guaranteed to get the same cupcake, no questions asked. It really, really builds confidence. So let's walk through how this part works. Here's the Artifacts tab in a Go job dialog. We tell Go server about the cupcake: its name, and where to expect it to be when the job completes. If it's not there at the end of the job, the job fails. When it's found, Go server uploads it from whatever environment the agent ran on, so that it's available to any subsequent stage in the pipeline, running on any other VM. So if you get into complex setups with a lot of VMs out there, it just doesn't matter where the Go agent is running: you don't have to keep track of where the file is or copy it anywhere yourself, because Go server just takes care of it.
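The cupcake itself is just a SQL dump taken at that point in the build. A minimal sketch, continuing the hypothetical docroot from the earlier sketch; the output path is an assumption and must simply match whatever the job's Artifacts tab declares:

```bash
# Take the trusted-artifact snapshot *before* any environment-specific
# configuration is written, then let the Go server pick it up as an artifact.
cd "$DOCROOT"
drush sql-dump --result-file="$PWD/../site.sql"   # path must match the Artifacts tab entry
```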
That particular database dump, that cupcake, is only going to be available in the context of the same build pipeline instance, because each run of the pipeline gets its own cupcakes, baked according to the recipe changes for that particular run. That's what the pipeline is about: keeping track of what your recipe is. So here we are in the task configuration for a job on the following stage. We pulled the cupcake out of the oven in the commit stage, and here we set up a task to pull the cupcake back out of the pantry: Go server is going to fetch it from the previous stage and deliver it to whatever environment the agent executing the current QA-stage job is running on. Here's a drill-down into that fetch-artifact configuration. The empty pipeline field there tells Go server to look for the artifact in the current pipeline, which gives you a clue that you can easily pass artifacts between pipelines as well as between stages, so you can build some really complex scenarios if necessary. We keep it simple: for now we just want to reach back to the previous stage of the same pipeline, the one here called dev-build. Then we tell Go server which job to look in, which is build-dev-site, and which file to pull, and when we leave the destination blank, it tells Go server just to deliver the cupcake to the working directory of the current job's agent, the same ground-zero concept for wherever the agent is running. So now you've got this distinction between build materials and build artifacts. Getting rid of the database as a build material, as something getting pushed into the build, goes a very long way toward getting total control over what you're delivering, because now nothing is in the database, nothing is in the build, that we didn't explicitly put there. You could do different hybrid strategies, such as an all-in-code delivery without content, and then use selective per-table database dumps, or Services, or node imports from JSON to deliver content. How you deliver content is a discussion in itself; there are a lot of different ways you could go about it. But the main point here is not our specific strategy; it's to emphasize the considerable benefits of trusted artifacts in a delivery pipeline. It really smooths things out, and when things go wrong, you start debugging with a much higher confidence level, which helps you find the problem faster.
That's our target name and The rest is the thing call stack to the completed through all the way through to completed site So we get the end we've got a website that's working Hopefully working The white on white targets are mostly your dev ops type stuff, right? And the yellow and green targets are the Drupal site build that's where we're doing this this the Drupal specific site And so that's the part you'd organize your build into whatever set of targets that made sense for The way you wanted to go about building your Drupal site You don't have to do an all-in-code site. However, you're doing it. You're deploying somehow now, right? The question is just wrapping up how you are deploying it into a set of tasks that can be repeated The save database target there in the middle that's where we pull the cupcake out of the oven, right? So in this next stage, right? So that last call stack was for the commit stage the first stage in the pipeline So now we're looking at the similar call stack for the QA stage so the target is to this deploy website QA stage and This target is where we load the database artifact fetch for us from go our little cupcake We pull it back out of the pantry and then we use that database dump to load into the site So we're not repeating anything that was written into the database on the first stage So we don't have to worry about like any there's any possibility that we could write something differently We don't have to worry about what VM the devs had run on if Another dev build is run since then which is you know like if you run dev and it creates a cupcake You're on another one. It creates another one. Which one do you want? You want the one in the same pipeline all the way through because you can have multiple stacks of pipelines running Simultaneously, right? You have pipeline running out. That's like you're trying to get out the production You can have others on top of it where you have stuff that's been committed after what you're trying to get to production And you can still be running at pipelines for that, right? And it keeps it all separate. You don't have to worry about it too much. It's nice So we're guaranteed to have exactly the same database as we had in the previous stage of the dev build All right, so the yellow and green target here is everything else We're going to do with to the database in the QA stage So for us what it means is getting the content in right so in the commit stage We build the site so we can do the most basic level of testing that we can right to validate that the devs didn't break the code In some kind of obvious way and then in in we in the QA stage then we load in all the content So now we have a full-fledged site with whatever content that the QA people need to actually test the site in any kind of Meaningful way and automated tests that are different right more extensive the tests on QA take a lot longer than the ones that run on the commit stage And so since we're actually modifying the database here We're going to push a new cupcake back up to go server. So the next stage showcase gets the next one down, right? 
So that's where we are here. When we go to showcase, you'll notice right away there are no yellow and green targets: showcase got QA's cupcake, and that's the end of the discussion. It doesn't matter if testers went onto the QA instance and added and deleted nodes, because the cupcake that showcase got was pulled out of the oven before the testers ever touched anything. It's very nice that way. It doesn't matter if devs go in and modify code on any of the servers; it just doesn't matter, because we always get the right thing on the right stage. And when we go to production, it's exactly the same process as with showcase: we take the same cupcake from QA, and that's what we deliver to production. It's a brand spanking new docroot with its own database, with no worries that somehow the recipe changed in any way in the last stage. When we're done building on the production stage, we have a completed site sitting out there on prod, in a pre-release docroot, available for private inspection, so you can look at it before you cut over to it on production. And it's ready to cut over simply by rewriting a symlink: we have a "current" symlink that the Apache vhost looks at, so you just rewrite the symlink and the site goes from being the old site to the new site. So you can imagine what rollback looks like: rollback is simply a matter of rewriting some symlinks. That's all we have for rollback. You do have issues when content is being generated live on the site, and there are strategies for dealing with all of that too, but we don't have time to go there.
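To make the cutover and rollback mechanics concrete, here is a minimal sketch; the paths and timestamps are illustrative, the only real requirement being that the vhost's document root points at the symlink:

```bash
# Cutover: the Apache vhost serves /var/www/current, so releasing is just
# repointing that symlink at the pre-release docroot.
ln -sfn /var/www/builds/site-20140605120000 /var/www/current

# Rollback: point the symlink back at the previous build's docroot.
ln -sfn /var/www/builds/site-20140604093000 /var/www/current
```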
So really, pipelines are for people. It's not just a dev thing; it has a lot to do with getting people from different parts of the organization to collaborate. Much of the waste in releasing software comes from its progress through testing and operations, and we use build pipelines to solve this problem, to find and remove bottlenecks. We want to get rid of inflexible, monolithic scripts; we don't want slow, sequential testing; we don't want flat, simplistic workflows; and we don't want one tool to rule them all, we want to use whatever tools we want to get the job done. We pursue relentless automation to avoid accidental changes, to shorten the feedback loop, and to ensure repeatability, which really builds confidence. And finally, we value optimization and visualization to get people from different parts of the organization to collaborate in the timely delivery of useful software. We want to drag the non-technical BAs and stakeholders into this release process, so that the release becomes a business decision: once we get the stuff out on showcase, the technical issues are resolved, and it's a business decision whether the features they're looking at are the ones they want to see delivered to production. If they are, they just click the button and the stuff goes out. This is the goal of continuous delivery. You don't have to get all the way there, but this is the path we're all on. So thank you. I think I've burned a lot of the time here, so we have a couple of minutes for questions, and I'll hang around afterwards. Do we have questions coming up? You want to lead off? Sure. I guess the first one is actually the last one, which was: when deploying to a live site, do you still use site-install? Yes, drush site-install. For us, we start with an empty database every time and build everything up from code; we do drush site-install, absolutely, and we use Drush for everything we can. What we do is build the new site out on production, so it's sitting in a new document root right next to the live one, symlinked as pre-release so we can go look at it. When we actually click the final release stage, we put the production site into maintenance mode, which users generally don't see because almost everything is in Varnish anyway; then we sql-sync any data that's coming from the live site into the new site, and then we cut over the symlink and we're up. The whole thing takes a couple of seconds; if you had more data, maybe it takes fifteen seconds, I don't know, but it's not very long. That's how we're doing it: sql-syncing the live data at the last minute, because you have to do it at the last minute. Up until the moment you put the site into maintenance mode, somebody could have committed a record, and you have to get all of that.
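A minimal sketch of that final cutover sequence for a Drupal 7 site. The @live and @prerelease Drush aliases and the paths are hypothetical, and in practice the presenters limit the sync to the tables that actually hold live data (orders, users, and so on) rather than the whole database:

```bash
#!/usr/bin/env bash
# Sketch of the last-minute live-data sync and cutover described above.
set -e
drush @live vset maintenance_mode 1 -y         # stop writes to the outgoing site
drush sql-sync @live @prerelease -y            # copy the freshest live data into the new build
ln -sfn /var/www/builds/site-NEW /var/www/current   # rewrite the symlink, as shown earlier
drush @prerelease vset maintenance_mode 0 -y   # the new site is now live
```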
Next question: is there a way to reuse tasks or even jobs, so you don't have to reconfigure every pipeline from scratch? Templates, yes. Behind that whole GUI, Go stores everything as XML, and the really heavy-duty Go users just go in and write the XML instead of using the GUI. You can make templates and recreate entire Go pipelines from a template, or you can copy XML from one task and paste it into another pipeline. All of that, no problem. The next one: in continuous delivery, do you use any other metric in the deployment decision besides whether or not the unit and behavior tests pass? There are a lot of emotional metrics that go into that decision, and we're still on the path of discovery with all of that. My view is that the BA, the business analyst, should have the responsibility not only for the decision to release to production but also for maintaining the test suite at that showcase stage. We should drag the business analyst in and say: this is your test suite, you're the one who has to decide what has to be here, because that's the person who is really mapping requirements back from the product owner. But that's a contentious point; whenever I talk about it, there are a lot of different opinions about how it's supposed to work. We don't do that; for us, we basically rely on our tests, and if our tests don't break, we take a look at it and push the stuff out. I think the last question, if I haven't missed any, is: do you have experience with big legacy code bases, and how would you migrate to continuous delivery while minimizing problems during the transition? Experience with big legacy code bases and how to get those into CD? Oh yes, I have a lot of that. That's why I'm in Drupal: I escaped from twenty years of big legacy corporate Microsoft, Windows, Oracle, SQL Server stuff, endless stuff. You should see what they have running out there. American Airlines has stuff that's still from the Pleistocene running out there; they still run BASIC. Bit by bit is the only way you can do it. Martin Fowler calls it the strangler pattern, and it's really the only thing I can think of to do, because I've seen so many of these multi-million-dollar "let's convert this legacy project" efforts fail; I can't even count them. Millions and millions of dollars, they try to rewrite the stuff, then the next team comes in saying "we're going to rewrite this," and two years and seven million dollars later it's a total failure, they're gone, and it's the next team's turn. So you have to be agile about it: figure out what's most important, figure out something you can actually deliver, and deliver it. And the second part of the question, about minimizing the pain: the thing I've found in actually doing it is that there was a lot of pain in building this stuff out, because while you're doing it, it takes a lot longer to automate your stuff into a pipeline than it does to do it manually. Manually you just kind of pull your way through, whereas when you're building automation you have to iterate through it: you break something further down the road, and you have to start from the beginning and go all the way through again. I probably put the most time into permissions issues; I had difficulty getting my head straight around what had to be in sudoers, under which user, in which place, and getting all of that into Chef, the infrastructure automation side of things. But again, the key is to bite things off in the right size. Don't try to automate everything as one task; just find some part of something you can get running, something that would actually be useful in some part of the delivery, and then build on that. Obviously, if you're not doing anything yet, you're going to start with CI: you'll build from commit every time a commit gets pushed, and write tests against that, and then do the rest of your deployment however you're doing it now, moving it to the next stage and then to production however you do it today. Then CI is running, and that's your first step. The next part would be to do something for QA, automatically pushing to QA, and if you still have a manual production deployment process, you just keep doing that until you get further along. Bit by bit, one bit at a time.
Don't stop, never give up. It's really worth it, because sooner or later we're all going to be doing this; at some point you're either doing this or you're dead. Continuous delivery is becoming standard practice, and we're still at the beginning stages of real, actual implementation, but once it becomes what people are really doing, your competitive position is going to be really bad if you're still putting all that time into hand-crafting your deployments. Hand-crafting a deployment uses talent, and generally that talent is not cheap; it's expensive people involved in that part of the process.

So, two more questions. The first one: how do you manage large databases that take hours to dump and restore?

Honestly, I don't think there's any single answer to that. The easy thing to say is that maybe you should be looking at document databases and sharding. One thing we like doing is taking the database dump and restore outside of the build pipeline, so that there's a warm database ready for a new build. On one project we actually tail it onto the build: the last thing that happens after a build is that another database job gets kicked off, and we also run it on cron, so there are warm databases that site builds are able to build on; there's a sketch of that after this answer. And if it's taking too long, see whether you can make it not serial; doing things in parallel is a big part of making this work.

The question really goes to those feedback loops. One way or another, you've got to get your build process into something you can live with in time, because it's the same as with software delivery overall: when the cycle gets very long, the time becomes a problem in itself. At American Airlines we did this thing where we said we wouldn't agree to do anything we couldn't get done in six months, so we had a shorter cycle for what we were doing. On our website now we release once or twice every day, and by releasing once or twice or three times a day, each release is just one thing, so the risk becomes very, very low.

So when you're dealing with these large databases, that's obviously an obstacle to the workflow, because when something goes wrong and you have to repeat a very long process, you start to get into trouble. There are probably a lot of strategies for it. Maybe you can push it back onto the database back-end and use replication. Maybe you can do your database dump in parallel, where you take one database and have four processes dump four different parts of it, if you can figure out how to make that work.
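Here's a loose sketch of that warm-database idea: a job run from cron, or tacked onto the end of a build, that restores the latest production dump into a fresh database so the next site build doesn't have to wait hours for it. The dump path, database names, and the hand-off file are all hypothetical.

```bash
#!/usr/bin/env bash
# Keep a "warm" restored copy of the big production database ready for builds.
set -euo pipefail

DUMP=/var/backups/prod-latest.sql.gz
WARM_DB=drupal_warm_$(date +%s)     # a fresh database name each run

# Restore the most recent production dump into a brand-new database.
mysql -e "CREATE DATABASE \`$WARM_DB\`"
gunzip -c "$DUMP" | mysql "$WARM_DB"

# Publish the name somewhere the build pipeline can find it, so a new build
# points at this database instead of waiting for its own restore.
echo "$WARM_DB" > /var/run/warm-db.name
```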
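And a loose sketch of the parallel-dump idea: one mysqldump process per table, all running at once. Purpose-built tools such as mydumper do this more carefully; this is just the shape of it, with a hypothetical database name.

```bash
#!/usr/bin/env bash
# Dump every table of a large database in parallel using shell job control.
# Note: per-table dumps like this are not one consistent snapshot of the whole
# database; a real setup would also cap how many dumps run at once.
set -euo pipefail

DB=drupal_prod
OUT=/var/backups/parallel-dump
mkdir -p "$OUT"

# One gzipped dump file per table, all running concurrently.
for table in $(mysql -N -e "SHOW TABLES" "$DB"); do
  mysqldump --single-transaction "$DB" "$table" | gzip > "$OUT/$table.sql.gz" &
done
wait   # block until every background dump has finished
```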
Yeah, there's an npm module that we built that does exactly that, too. If somebody tweets at me I'll dig up the URL for it; I literally forget the name of it right now. But it does the same thing: it allows parallel database dumps to speed these things up, because these are things we run into all the time. It's a real problem, but the warm builds actually make a big difference, so that's one of the things we're doing.

And when you're trying to get to where you can say "we're practicing CD," one of the things you're definitely doing is always hunting for bottlenecks. Wherever the bottleneck is, that's your problem; that's the thing you need to find an automation solution for, and once it's cleared you're on to whatever the next problem is. A big part of the game is just making sure you do the right thing first, because that's what allows you to keep working. It depends on whether you have support, whether you're trying to do this as a skunkworks thing because you can't get the organization behind it, or whether you've actually got a commitment on an organizational level that this is what you're going to be doing.

So, last question: how do you handle continuous delivery if you have user-generated content or e-commerce transactions happening in production?

We actually do exactly what I demonstrated here with our e-commerce site. It's not a heavy-traffic site; we just sell software licenses, so we don't get a lot of volume. But it's the same thing: after the new build is delivered, we put the commerce site into maintenance mode, and then we sql-sync all of the commerce tables from the outgoing production site into the new target production site. Obviously we don't have Varnish in front of that, so if somebody's in the store right at that moment they'll see maintenance mode for as long as it takes, which for our small site is just a couple of seconds. A serious commerce site obviously can't take much more time than that.

I don't think there's any other way to deal with it besides bringing the data over at the last moment, and a lot of the internet is user-generated content; commerce is user-generated content, Facebook is user-generated content. Somewhere you've got a database instance where, a second before you started this process, somebody posted a payment, and that record has got to end up in your next production database one way or the other.

The next really cool strategy would be to do exactly that: bring up your new site without the updated data, and then, immediately after the new site is up, have a process that goes back, figures out what you missed, and does the sql-sync right after. That way you could do this with zero downtime and you would get almost everything.
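Here's a sketch of that commerce variant: only the transactional tables come across at the last moment, behind a brief maintenance window. The table list, site aliases, and paths are hypothetical, and a real Drupal Commerce site has more tables than this.

```bash
#!/usr/bin/env bash
# Copy just the transactional tables from the outgoing site into the new build,
# then cut over. @prod and @new are assumed to be local drush site aliases.
set -euo pipefail

TABLES="commerce_order,commerce_line_item,commerce_payment_transaction"

# Freeze the store so no new orders land while we copy.
drush @prod vset maintenance_mode 1 -y

# Dump only the commerce tables from the outgoing site and load them into the new one.
drush @prod sql-dump --tables-list="$TABLES" | drush @new sql-cli

# Cut over and reopen the store.
ln -sfn /var/www/releases/current-build /var/www/live
drush @new vset maintenance_mode 0 -y
```

The zero-downtime variant described just above would drop the maintenance-mode step, cut the symlink over first, and then immediately re-run the same table copy to catch whatever landed on the old database in the gap; the trade-off is that you then have to reconcile anything written on both sides, including half-completed transactions.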
Right, but whenever you do this stuff, it's hard to say that you're always going to get absolutely everything.

Yeah, and the question of putting the site into read-only mode instead: with read-only, the site's fully functional, people just can't proceed with a transaction, so presumably you wouldn't be breaking transactions. There's this question of what happens if you have a half-completed transaction and all of that kind of problematic stuff; we just hope we can dance around it, because this stuff happens pretty fast.

Really, for the kinds of processes we're talking about, you can do zero-downtime releases with very, very low risk, and that makes you feel a lot better about releasing. And if something does go wrong, look at the confidence level you go back in with: I know the infrastructure is all managed, so I know this isn't a package issue; I know this is the same database we just finished all that testing on; I know that for a fact and I don't have to question it. Then I can quickly go back in and diff the files between what went out and what was there before. Whereas if you don't have this stuff and something goes wrong, all of a sudden you're asking: where do I even begin figuring out what's wrong? In regular development work that's what we do all day long, but in production, I don't know about you, that's when I sweat. I've spent most of my career doing these really wild deployments that now look backwards, where we had nothing as a guarantee going forward.

So, in the order that we read off the questions: are we still here? Is that Henry? Yeah? Are you still here? Would you like a copy?