Okay, we're going to crack on. I'm not sure if they fixed the video, but hey, it's 10.45. So, hello. Thanks for coming. There's an awful lot of you. I'm Marcus. My Twitter nick is @manarse — feel free to tweet about what I'm talking about, and tweet questions and so on. I shall be taking some questions at the end. So, with one click — what am I here to talk about? Can you make a build in one step? It's one of the questions of Joel Spolsky's famous Joel Test, a completely unscientific test as to whether or not an organisation is doing things right. So imagine you're there. You're starting a brand new project. Someone signed the contract. You've done all your business analysis. You know exactly what you want to build. You've got your designs, your wireframes. You're ready to start coding. Right. So what do you do? You might have an existing laptop; you might have a LAMP stack installed. But just pretend you've got a brand new computer. You're going to need to set up some sort of dev environment. You might install a Linux VM. You'll get yourself a repository — you are using source control, right? You can download Drupal into the new checkout that you've got. And then you'll probably want to start coding. You might be getting some contrib modules. You might be clicking through, configuring, using the admin UI. You might be doing cool things with Features. You might be building custom modules and things. Stop. Have you thought about how you're going to actually get it to live in the first place? It's all very well to sit down there and start coding and configuring. But if you haven't thought about how to get that build from your local dev environment to the live environment, and how you're going to test all of this, then you're going to be incurring technical debt. You might be making things a lot harder if you try to add this after you've started. So there are typically two parts to this. The first part is from day one.
Most people, you're not going to start with a fully built site. You're going to be starting with an empty DB, a blank code base. You need to install something in the first place. Then once you're into the build, you might deploy the site, you might have it up and running for six months, and then your client comes to you with a series of updates. Those are two very different problems. I'm going to talk about cold starts quickly, because typically this isn't where all the complexity comes in — you've got a blank DB, right? You can just build a new DB and start from scratch again. So you've got things like Drush and installation profiles, and most of the distributions, like OpenPublic, are just a big, complex installation profile: a whole bunch of modules and so on. So this might be your very, very simple web server. You're deploying to a VPS, or you're running a separate web server and DB, and maybe you've got a Varnish in the mix, some sort of proxy. But all we need to do at this point is get the code from the repository to the server. And maybe there's a bit more to it than that. You might have symlinks. You might need to clear caches. You might need to do things like run updatedb after you've deployed your code. So there are loads of ways to get the code from A to B. Pretty much everything you can think of has probably already been done. If you're using FTP, then you've probably got big problems from day one. But then you wouldn't be the only ones, because I worked for an organisation a couple of years back — 8 million hits a day — and they were using secure FTP, but it was literally copying the code from one place to the live server. I saw that and got very, very scared. There are lots of tools, like Ant and Phing. So how do we plan this deployment? Jenkins, Ant, Phing, Aegir, Drush.
I'm sure they're all familiar names, and there are talks on some of these at the conference, which I'm going to mention later. But the point is that we're starting to look more and more towards automation. So if you have this very simple setup — you've just got a DB and a web server, and they might all be on the one box — you could have a very, very simple deployment tool. The slides will be on SlideShare. I know it's really hard to read, but this is a simple script. At the top, I'm declaring a tag number to release, I'm giving the URL of the SVN server and the path to the doc root on the server. It's doing a checkout to a timestamped folder — so it's checking out to var/2011/08/20 and so on. And then once it's done the checkout, it's doing a symlink, so var/current is always pointing to the checkout that you've just created. And then it finishes right at the bottom — you can just about see — with a Drush clear-cache command. So this is simple enough to automate the deployment, and you might want to run updatedb as part of that script. This is actually what I use on my blog. It's really, really trivial. However, we work with sites like this: two HAProxies, a couple of Varnishes, two, three, four, ten web servers, a DB, a slave DB, and then a memcache box and a Solr box and a NAS and LDAP and an enterprise service bus. And then you go: that's really, really complex to deploy, because you've got loads of different hosts, they're all playing different roles, and there are different actions to do on the different roles. You want to check out the code, but you're not checking the code out to every box — you don't check out to the DB — but you do want to do things like restart the memcache server when you deploy. So if you try and do that with a bash script, or on the command line, or by hand, you are going to be in a world of pain. So to break it down, you'd start off with a list of hosts. These are just the IP addresses or the URLs of all of your different machines.
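Going back to that simple script for a moment: here is a rough sketch of what that sort of deploy script can look like. The tag, SVN URL and paths are my own illustrative stand-ins, not the actual slide, and by default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of a minimal deploy script: SVN checkout to a timestamped folder,
# repoint a "current" symlink, then run Drush post-deploy steps.
# Dry run by default: set REALLY=1 to actually execute the commands.
run() {
  if [ -n "$REALLY" ]; then "$@"; else echo "$@"; fi
}

deploy() {
  tag="$1"          # e.g. 1.0.3
  svn_url="$2"      # e.g. https://svn.example.com/myproject (assumption)
  docroot="$3"      # e.g. /var/www (assumption)

  # Timestamped release directory, so old checkouts stick around
  release="$docroot/$(date +%Y-%m-%d-%H%M%S)"

  run svn checkout "$svn_url/tags/$tag" "$release"
  run ln -sfn "$release" "$docroot/current"       # atomically repoint "current"
  run drush --root="$docroot/current" updatedb -y # run any pending update hooks
  run drush --root="$docroot/current" cache-clear all
}
```

Because the symlink flip is the last filesystem change, a broken checkout never becomes the live site, and rolling back is just repointing `current` at the previous timestamped directory.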
And you've got all of the different roles. So you might have ten web servers, two reverse proxies, something for network management, data caches and so on. You can take each individual action separately — that might be check out the code, restart Apache, clear the memcache — and then you can group them together as a task. So my task might be deploy to live; or, like Apple, you might want to shut the site down for maintenance and put a maintenance page up, and that might be something you do by changing your network layer to point to a different holding server whilst your main site updates are going on. So you've got these overlapping concepts: hosts and roles; tasks and actions; and the targets you want to run them on. And that target is environmental. So your individual action might be restart Apache, but that might be one of the steps of the task of deploying to live. So, looking back at all of those concepts, how can we make it simple? Do we want to write a bash script where we start declaring every single one of those hosts to begin with, and creating a mapping of hosts to the roles that they're playing? You could do this in bash, you could do something like that in Ant — Aegir has got a similar sort of philosophy — or you could do what some organisations do, which is roll their own, where things get moved about with SSH. By the time you get to this stage it's such a complex beast that you're not able to manage it; you can't maintain it. And when it goes wrong — and you can be sure that when it goes wrong it'll be when you did that last-minute deploy at five o'clock on a Friday and you really want to go to the pub — you won't know where it's gone wrong, because these things are really hard to debug. So what you want to do is take a look at the open source alternatives, take a look at the products that can make it simple.
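To make the hosts/roles/tasks idea concrete, here is a minimal sketch of the sort of thing you would otherwise be hand-rolling. Hostnames, roles and commands are all hypothetical, and it only prints what it would do — this is exactly the complexity the tools discussed next manage for you.

```shell
#!/bin/sh
# Sketch of hosts mapped to roles, single actions run per role, and a task
# grouping several actions. Dry run by default: set REALLY=1 to execute.
run() { if [ -n "$REALLY" ]; then "$@"; else echo "$@"; fi; }

# One line per host: "hostname role" (all names hypothetical)
HOSTS="web1.example.com web
web2.example.com web
proxy1.example.com proxy
cache1.example.com memcache"

for_role() {   # run a single action on every host playing a given role
  role="$1"; shift
  echo "$HOSTS" | while read -r host hostrole; do
    [ "$hostrole" = "$role" ] && run ssh "$host" "$*"
  done
}

deploy_to_live() {   # a task: a group of actions, each targeted at a role
  for_role web      svn update /var/www/current
  for_role web      apachectl graceful
  for_role memcache service memcached restart
}
```

Even this toy version shows the shape of the problem: the host list, the role mapping and the per-role actions all have to be kept in sync by hand, which is what Webistrano and friends give you for free.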
So I want to talk a bit about Webistrano, because all of those concepts — hosts, roles, actions — Webistrano has built in from the start. Webistrano is a Ruby on Rails app; Capistrano is the command-line equivalent, so if you like the command line, you can do everything with Capistrano. Webistrano gives you a really nice front end, and with this you can give your project manager the ability to deploy a new checkout of a tag to the demo site, and you can give your ops team one-click deployment to the live environment. So it makes it easy, and it means less work for you guys — it's good to be lazy. So, some of the things here: you've got a list of projects. These are the stages, or the environments, within your project — so you've got production, testing, demo, development. It gives you things like a history of the deployments. There's the list of hosts that we talked about. There are custom recipes. So I've got Webistrano running here, and if I'm lucky I'll see if I can show you something underneath the hood. I didn't plan to do this in 640x480. Can you see that? Okay. So Webistrano is a Rails app. By default it runs on port 3000. You can add in things like access control and so on. And here is where you manage things like projects. So this would be the equivalent of: you've got client A, your rail network site, and client B, your brochure website, and you would treat each of these as a different project. You can add hosts — you go in here, create another host. If you go in here, you can configure all of the config that's appropriate to this project. So your script to restart Apache is probably the same on every one. Your path here would typically be — you'd want to deploy to /var/www. You can change the repository; you might want to change that to tell it exactly which project you're checking out.
And so this is sort of the very basics, and the package it comes with will do a straightforward deployment: it will check out the code to /var/www, it will check it out to a timestamped directory and use a symlink, current, to point to the current one. But then you might think, well, I'm using memcache and Webistrano didn't take that into account. So you need to create your own custom recipe to restart your memcache servers. I go into here — this is where you get to learn Ruby, if PHP wasn't enough. I'll just make that a little bit bigger. This just puts it into a namespace so you can do different tasks. This is a really, really simple example, but it can do things like take input and output; you can restart different memcache servers, and so on. So once you've chosen which environment you want to deploy to, you've got a list of all of the individual actions. If you can't see that at the back, that's things like deploy:cold, deploy:finalize_update, deploy:migrate. And this is the standard Capistrano package, which means there are lots of Rails concepts in there. One of the limitations of Webistrano is that it's been designed to support Ruby on Rails applications. So something we're working on is to take the default templates that they've got and Drupalify them, so that we've got things like check out Drupal and restart Apache, and all of the things that you would do in a normal Drupal deployment. It's something we're working on, and we may well be planning a BoF on that, so if you're interested, we would love some input. And if you look through here, you can see each of the actions with a description of what it does. So, like I say, it's a really nice GUI that just makes everything simple: you can click deploy, start deployment, and it gives you Ajax feedback telling you whether the build passed or failed, plus a log of every deployment that's been attempted. So if I go back here, I can look at the deployment history.
I can see the build failed because it's not going to find svnt.example.com, so it was going to struggle to do a checkout with that. Some of the alternatives are going to be talked about later on. Tomorrow there's a talk on a system for post-Git-commit hooks — when you do a commit with Git, you can hook it all up to do some sort of wonderful deployment. That's Sam Boyer's talk, and given that he did all of the Drupal Git migration, he probably knows what he's talking about. There's some work on Drush, and there's a talk on Drush going on too. There's nothing about deploying with Jenkins or Hudson, but there is a talk on CI later on — Thursday, I believe — where you can see some of the principles of using CI to test. Stepping outside of deployment: I talked about setting up your virtual machine, and if you're sitting down having to install a new Linux VM and you're typing yum install apache, yum install php, you're relying on being able to do that the same way every single time, because if your dev environment isn't the same as your live environment, then none of your tests are particularly valid. You might find a surprise when you deploy to live. So we're using environment automation. We're using Puppet, and our Puppet manifests let us script all of this. You have a single Puppet master, and every new machine that you spin up talks to the Puppet master and says, hello, I'm a web server, what should I do? And the Puppet master replies with: this is your manifest; these are the things that you have to set up. And those things are: you need to install Apache; you need to have this user installed on your machine; you need to put this file in this place. So Puppet is the one we use. There's Chef, there's Vagrant, Fabric — they're all similar alternatives, similar approaches. There's also a project on Drupal.org called Quickstart, and that gives you a pre-built VirtualBox virtual machine. So you can just download Quickstart.
That's your whole environment set up, ready to go. Some of the test tools are things like Jenkins, Hudson, CodeSniffer, Selenium. Again, there's another talk about CI with Selenium later on in the week, so it's one to look out for. I'm also going to pull up some of our Puppet manifests so you can see the work that we've been doing there and what it looks like. Is that big enough to read? Can you read that at the back? Or do I need to make it bigger? Bigger, okay. So Puppet breaks things down into sections. You can have the different environments; there are manifests; there are modules. And a module is very much the same as a Drupal module in concept: it's a standard package that you can choose to use on all sorts of different machines. So there is an Apache module for Puppet, and that's instructions on how to install and configure Apache. You'll want to use that on your web servers, but you wouldn't want to use it on your proxy. And a manifest is a set of instructions to a server on what to build and how. So here, I've broken it down by saying that the $environment variable is something configured on the server that says I'm a web server, or I'm a reverse proxy — it's describing the server's role. And if I go into my manifests directory: site.pp is the default manifest, and what I've got here is a script to start replicating the Puppet master itself. This means that once I've got this set up, I can clone it, which means that if I go into a client's organisation, I can give them a complete copy of our system and walk away — and that can be done in about ten minutes. So this is very, very simple. All I'm trying to do here is show that I'm creating a file. I've called it this just to test. So what I can do is start the target server, and if this file is created, then I know that the server is configured properly and applying the right manifest. If I switch over to my target, I can see that it already has this file.
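A test manifest of the sort being demonstrated might look roughly like this — the node name, file path and content are my own assumptions, not the actual demo code:

```puppet
# site.pp -- default manifest: drop a marker file on every node, so you can
# verify the agent is talking to the puppet master and applying its catalog.
# All names here are illustrative.
node default {
  file { '/tmp/puppet-was-here':
    ensure  => file,
    content => "managed by puppet\n",
  }
}
```

The point of a trivial resource like this is purely diagnostic: if the file appears on a fresh node, the agent/master plumbing works, and you can move on to real modules like Apache.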
So I've deleted that, and I can see the file doesn't exist any more. If I restart Puppet, it will start to poll the Puppet master for its manifest. Let's look at what it's doing there. Sorry — its log output is mostly readable, but it doesn't really fit on the screen. So you can see it's already created this file. And if I go back to the recipe, that was what I told it to put in the file. And it's there. So it's a way of fully automating the deployment and setup of a new environment, which means that your environments are always going to be identical. And it means your ops guys, who have sat down there for, you know, an hour for every new developer, typing yum install apache, are going to thank God for that. And they'll be able to do the work that's important and difficult, like performance issues and dealing with NASes that have gone down, rather than the mundane stuff that we can just automate away. These are some of the modules that I've been setting up. And this is a module manifest so that all of the machines in our environment are always going to be using our custom repository. So it's a bit like object orientation, but it's still Puppet's DSL, its domain-specific language. And I'm saying that I want a file, and I want to put it in /etc/yum.repos.d/capgemini.repo. The content is just a standard repository file, so that when future Puppet scripts tell the machine to install something, it knows it doesn't have to look only at the standard Red Hat repository — it's also looking at our private Capgemini repository. And that's important because we've got tools like Drush which don't have a package on Red Hat, so we package them into an RPM ourselves, and this lets us automate everything, because the first step is to say: use our private repository for all of your future code. You can do a lot more with Puppet, but there are plenty of sites on the internet where you can find out all about that. So, a few other things I've come across. I've talked about the Joel test.
And the Joel test is a series of about 12 questions which ask things like: are you using source control? If you're not, then you've got quite deep problems. Lorna Jane speaks at a lot of PHP conferences and did a talk about how the Joel test applies to PHP, so that's quite a good reference. And my colleague Martin is doing a talk on CI, and that's going to cover Jenkins and Hudson and how to integrate testing with all of these deployment tools that we're talking about. So: I work for Capgemini, and they're kind enough to give me time and effort to spend on building all of this stuff and talking about it and running BoFs on it, so thanks to them, and to all of you guys for coming along. Do we have any questions? If you tried to tweet a question and it didn't come through — does anyone have anything they would like to ask? Yes. How do we do source control and deployment of the configuration? It's a good question. I'm going to see if I can find some code somewhere that shows this. Okay, so this is fairly typical of my standard source control setup — let me make that bigger; there we go. I've got a directory called conf, a directory called htdocs and a directory called updates. Those are the main parts of it. The main bits of configuration — things like Apache configs and Varnish configs — go into the conf directory. And in the updates directory — I'll see if I've got any in here — so, that was a test patch, but this is part of the deployment cycle: the post-go-live tasks that you can set up to run. So I showed a recipe to restart memcache; we also have an update script, and that update script reads the DB to look for something saying last release, and that gives me a patch level. In this directory of updates there's a set of timestamped files — you can see that one was the 5th of July at 5 past 11. If they end in .sql, then it's a series of pure SQL commands that are run directly on the web server.
That is, run from one of the web servers, applied to the DB server. And this means that if you're developing the site and you know that you've got duplicate entries in the node table, you could do something like delete from node where status = 0 to delete all of the unpublished ones. Now, I'm not saying that you want to use SQL commands for everything, because if you do that you'll be missing out on all of the Drupal APIs — none of your hooks will fire. So SQL is very much an edge case, for when something's gone horribly wrong. To give you an example — and I think Examiner.com have faced the same problem; it comes up in all of the migrations that they handle — sometimes you just don't have the performance to do everything through the hook system. If you need to migrate 10 million nodes, then you can't trigger everything through node_load and node_save; it just takes too long, so you have to do some things at the SQL level. But I don't have an exam— well, I can show you the sorts of things that we would see if I were to do an example of a Drush update. If I had created a new foo module, and as part of this deployment we needed to enable the foo module, I would have a file that just says en -y foo, and because it ends in .drush, the build script finds that file and runs it through Drush, with substitution for the doc root and the site variables and so on. So you can fully automate all of the steps that you want to do. You've also got update.php, which will run through all of the hook_update_N implementations, so if you have DB schema changes, you would normally write those in your module's .install file. The difficulty comes because typically you want a module to be self-contained, referencing just itself, but when you start building modules for sites, you end up with a sort of generic module for things that touch everything.
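Before moving on: the timestamped update runner described above could be sketched roughly like this. The file layout, the way the last-applied stamp is passed in (the real script reads it from the DB), and the exact Drush invocations are all my assumptions, and by default it only prints what it would run.

```shell
#!/bin/sh
# Sketch of an update runner: apply any timestamped update files newer than
# the last stamp already applied. Dry run by default: set REALLY=1 to execute.
run() { if [ -n "$REALLY" ]; then "$@"; else echo "$@"; fi; }

run_updates() {
  updates_dir="$1"   # e.g. /var/www/updates (assumption)
  docroot="$2"       # e.g. /var/www/current (assumption)
  last="$3"          # e.g. 201107051105 -- read from the DB in the real thing

  # The glob is expanded in lexical order, which sorts timestamps correctly.
  for f in "$updates_dir"/*; do
    stamp="$(basename "$f" | cut -d. -f1)"
    [ "$stamp" -le "$last" ] && continue       # already applied, skip it
    case "$f" in
      # Raw SQL: feed the file straight into the site's DB via drush sql-cli
      *.sql)   run sh -c "drush --root=$docroot sql-cli < $f" ;;
      # .drush files hold drush arguments, e.g. "en -y foo"; word-split them
      *.drush) run drush --root="$docroot" $(cat "$f") ;;
    esac
  done
}
```

Recording the last-applied stamp in the DB is what makes the deploy idempotent: running the script twice applies each update file exactly once.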
So if you had used the Node Profile module to give yourself user profiles, and then you had changed the makeup of that profile, you wouldn't want to write your hook_update_N in Node Profile, because it's a contrib module and you don't hack contrib modules. So you could write your hook_update_N in a generic module, and as part of update.php you can run through all of those steps that you've done. So there are lots of ways to skin this cat, and part of our approach is just putting all of the ways out there and leaving it to the developers to decide which one is most appropriate for what they're trying to do. That was quite a long answer — did that answer your question? Thank you. Yes. The question was: why don't we use Features? The answer is, we do. We use Features in different places; it's one of the tools in the arsenal. It's quite hard to put a finger on when I do use Features and when I don't. I know some of my colleagues on other projects have had issues with Features and changing the schema of CCK — it was a Drupal 6 site — and ended up writing a lot of hook_update_N code to get around that. But I would see that as a bug in Features that we need to spend time on addressing. I'm sure there will be bugs in Features that make things harder for us, and we just need to work out what the cause is and fix it. Yes. Do we have a strategy for environment-specific settings? The short answer is no. I've got a load of different tools and approaches I've used. One example is that in settings.php I usually have a line at the bottom that says: if file_exists settings.local.php, require_once settings.local.php. But as to a broad strategy for how to separate them out — no, I think it's something we need to work on. I think Greg Dunlap, who's heading up the configuration management initiative, has got some really interesting ideas, so I'm hoping to get together with him, and I'm sure there'll be a BoF on that, so I'm interested to find out what other people do. Oh, there's some from Twitter.
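For reference, the settings.php override pattern from that last answer, as a sketch — the sites/default file layout is an assumption:

```php
<?php
// At the very bottom of settings.php: pull in per-environment overrides
// (DB credentials, cache settings, etc.) from a local file if one exists.
// settings.local.php stays out of source control, so each environment
// can differ without touching the shared settings.php.
$local_settings = dirname(__FILE__) . '/settings.local.php';
if (file_exists($local_settings)) {
  require_once $local_settings;
}
```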
How do we handle Puppet with race conditions — it behaves differently on different dev machines? That's an interesting one. I've never had Puppet have race conditions, but one of the things about Puppet is that the manifests you apply run in arbitrary order if you don't provide any rules. If you don't specify, then you don't know that the checkout will happen before the cache clear, which you wouldn't necessarily want. Part of the Puppet manifest language allows you to specify dependencies, and they're chained, so if A depends on B and B depends on C, C will run first. Whoever asked that question, if you want to come and catch up with me later, I've got an example somewhere which shows how that works. Any others? Yes. I'm not sure I heard the question entirely — you asked about RPM deployment? Okay, so the question is: how do we use RPM with our deployment system? There are packages like Drush which don't have an RPM package, or if they do, they're several versions out of date. So first of all we build our own RPMs for these packages. There are lots of tutorials on it, but basically it involves setting up folders in a particular structure; you give it things like the sources and any patches that you want to apply, you use rpmbuild, and at the end you just get out a .rpm file. To serve it up, you can copy the RPM to a machine and install it locally using rpm -i, but it's much easier to manage things with a central repository, which uses Yum. A Yum repository itself is just a web server, so you would set one up — I've probably got one on here.
So in this repository I've got one single RPM, and I've got some repo data. You create the directory, you set up an Apache vhost to serve it, and you use a package called createrepo, which creates all this metadata here. And once you've set that up, it behaves exactly the same way as the Red Hat repository or the Apple repository or the Jenkins repository. Once you tell your machine how to find it, when you do yum install drush it will search your repositories as well as all of the others. Some of these servers might have it... there we go. You put this file in, and you tell it the description of your repository, the URL to find it, and things like the encryption key. Once you've got all of that, it behaves just like any other repository. So, the next question: when we use Puppet to install software that has an interactive installer, how do we manage that? The software in question is a traffic management system — the Zeus eXtensible Traffic Manager — and if you install that on the command line, it does give you an interactive install. But like a lot of Linux software, they know that people want to automate this, so although by default it asks you interactively, you can supply a pre-seeded set of answers. That depends on the people who wrote the software being nice, though. If they're not — and I'm pleased to say I haven't come across an example where I've needed this — you can use a piece of software called Expect. It's a Linux command-line tool that lets you script answers to interactive prompts, so the build is still automatable. So there are solutions out there, but the answer is to find packages that support unattended installs if you can. Yes — have I got some experience with Strongarm? We use it as part of Features, and we do use it for variable management in different cases. I don't often do things with Strongarm by itself; where I use it, it's normally part of a feature. Does that answer your question? Okay. So the question was: have I got any experience in using Aegir — I'm not entirely sure how it's
pronounced either — to build a deployment workflow? I've tried it; I've got to the point of being able to push stuff around. But you're right, there are issues: if you have a live site with lots of user-generated content, like files, and you want to test everything, then sometimes you need to do things like sync the files between your live and your test environments, and lots of other actions. The answer is, I don't know how to do those with it, but there is a talk on Aegir tomorrow, I believe, so hopefully some of those questions will be answered there. Anything else? In that case, that was the last slide, so feel free to tweet me, have a look at my blog — the slides will be going up on SlideShare in a few minutes. Thank you very much.