And it's easy to get these things going if you're starting on a green field. When I started my own company three years ago, it was easy to grow my own business culture: to start with automation from the first server, to get measurement in place, and obviously, as I'm curating the DevOps track, I'm trying to share knowledge as well. But what do you do if you have an existing company? Maybe one that's existed for decades and has a very mature culture. How do you get DevOps going in such a business? Well, I'm very happy that we have John Topper here to tell us about that, to tell us about his experience, good or bad, we'll see. And I'm very much looking forward to hearing what he has to tell us. Thank you, John.

Steady on, I haven't started yet. Thanks, Jochen. Yeah, as Jochen says, I'm John Topper. I run a small DevOps consultancy out of London. We work with small clients and large clients, so luckily we have a broad view of the whole spectrum of organizations. But this talk is largely about our experiences in one particular large organization, a couple of years ago now. We'll be talking about what we learned in that organization as we attempted to put in DevOps processes, specifically around Drupal.

So when we talk about DevOps, I'm assuming that since you're here on the DevOps track, you have some kind of concept of what we actually mean by that. If not, the way I look at DevOps, really, is that it's mostly about shared ownership of the quality and availability of code and platform in combination. This is the graphic from Wikipedia. It's a bit hokey, but it basically shows you that these three areas need to work together to share ownership of the quality and availability of what's being built.

DevOps is really a fairly recent term; it's quite a recent coinage. It's been rising in popularity since about 2010, as you can see from this graph from Google Trends. And really, I look at it as another name for good-practice sysadmin. That's what recruiters have started to use it as, which I'm kind of okay with, because I don't think we really need a new term for it anyway. But I guess having a name that we can put events together around is quite an important thing. It's a bit tribal, I guess. It's not quite the cultish thing that some people might view it as if they've been reading a lot of blogs or have only been exposed to it incidentally.

And as Jochen says, we have these four pillars that we refer to. These were coined by John Willis, who you might know in various internet places as Botchagalupe. He was formerly of Opscode and is, I think, currently at Enstratius, who have recently been acquired by Dell. He shares these four things with us. Culture is the shared values, vision and knowledge within an organization, without which we can't really do any good. Automation is all about removing manual steps, reducing the margin of error, getting more things scripted, basically to improve the quality and the efficiency of what we're doing. There's monitoring: improving the operational visibility of the platform, understanding what it's doing, why it's doing the things that it's doing, what's different now that it's gone wrong versus what it was doing previously. And there's sharing, which is: as sysadmins, we've traditionally been a little bit stuck in the basement of the building, not talking to anybody.
And the sharing aspect really is about making sure that these monitoring data are available to everybody: sharing tools, sharing processes, sharing approaches, blogging about it, talking about it, getting everybody involved in the conversation. It's all very important stuff.

And if you've been reading about DevOps online, or maybe you've watched some of the videos that have come out of previous DevOps Days, then you'll have heard of a lot of these kinds of organizations that we lift up as examples of people who are doing DevOps particularly well. These are people like Instagram, who at the time of their acquisition by Facebook had just 13 employees, which, considering the amount of cash that Facebook paid for them, I don't remember exactly how much it was, but it was a lot of money, is pretty impressive. That's a very efficient organization. People like Spotify, who are very often at Puppet events, sharing what they do in their world, how they use their configuration management tooling. People like Etsy, who've done an amazing job of sharing a lot of the tooling that they've created. And IMVU, GitHub, people who are talking about what they're doing, and whom we look at as good examples of the craft, if you like. IMVU was Eric Ries's lean startup, I guess, and he was blogging about releasing code once every 10 minutes, tested, working and into live, probably four or five years ago, which is really the inception of all this stuff anyway.

But one thing we notice when we look at these organizations is that they were all founded or launched within the last decade. They're quite young. They're small companies, and they're all ostensibly technology companies; that's what they're selling. The technology is an end in itself for these sorts of organizations. And really, that makes them a bit like this guy: he's noisy and young. But not all companies are like this. It's not true that everybody doing DevOps is a small, agile, technology-focused company. I don't know what the statistics are, but it could probably be shown that most companies are not like this; we just hear from the noisy ones. Really, some companies are a bit more like these guys. They're old, they're slow, they're a bit angry about everything, but they're quite well known. They've been sat in that box for a long time, and they're doing quite well. These are the sorts of organizations I'm talking about when we talk about large organizations. And so we'll look at one particular large organization that my consultancy spent some time working with, a couple of years back.

But before we do that, I'm going to ask you a seemingly unusual question. Those of you from the UK: how many people from the UK have we got in the room? Okay, this will make sense to some of you, then. What's the connection between a large stately home full of posh English people and a Z-list celebrity in the Australian jungle eating a kangaroo anus to win treats for the rest of his group? These are both television programs, put on the air by the same company. That's Downton Abbey on the left-hand side there, and I'm a Celebrity... Get Me Out of Here! on the right. And the thing that connects these two programs, aside from the fact that they're both broadcast by the same organization, is that they both, at one point in the past, had Drupal web properties associated with them.
And the company we're talking about, for those of you who are not from the UK, is a company called ITV. ITV is the UK's largest television network, commercial television network, I should say. It was launched in 1955, so this is a 60-year-old company. This is your Statler and Waldorf of organizations; they've been around a long time. There are over 4,000 employees, or at least that was true at the end of 2012, which is where I got these figures from. And they have sizable revenues.

The things to bear in mind about this organization are that they are old and large, but also that they're not primarily a technology company. Technology exists as a way of putting entertainment, if you want to call it that in the case of kangaroo-anus eating in the jungle, in front of the general public. Some of that technology is broadcast, some of it is putting things on cable television networks, and some of it is on the internet. The internet stuff is on the rise, and that's where we come into this story, really. Most of the revenue comes from advertising, right? So this is interstitial ads in between programs, which you sit through in order to get to the content you want to watch. The first advert that ITV put on air, back in 1955, was for a toothpaste called Gibbs SR, a long-forgotten brand these days, no doubt, but presumably the sort of thing you're going to want to buy if you've spent a week in the jungle.

So the fourth quarter of 2010 is where we pick this story up. When we started at ITV in 2010, the majority of the web content was served from a .NET content management system, a big internally grown, organically grown, homebrew CMS platform. There were some Drupal 6 sites in existence. These were quite small, primarily built by external agencies, arguably not very well: quite difficult to maintain and upgrade, with lots of hacking going on in core where it shouldn't really have been. The .NET platform was being released on a roughly six-weekly cycle. These releases would take place out of hours, with a bunch of developers and some pizza and some cola. A release would take a number of hours to happen, and no one would go home until it was finished or until it was rolled back. That isn't a comfortable place for anyone to be, really.

In order to deal with this big, complicated release process, there was a big, complicated change management process, where a man with a clipboard would require developers to write a lot of Word documents about how the release was going to happen. The intent was that this would improve the quality of the release; arguably it did, but not to the extent that was necessary, really. A lot of these releases, probably one in three, I think, were being rolled back at this point in time, to make way for another attempt on another evening, with another load of developers and beer and pizza.

The organization itself had a multi-tier operations team, a first-line, second-line, third-line style of operation. The first-line team were really performing manual tasks that the content management system should have been capable of doing. A lot of the front-line team's tickets were about adding rewrite rules to traffic managers so that a new TV show could have a vanity URL. And that sort of thing, as people who use Drupal, we look at and go: that's not really very smart. But there were people employed to do that sort of task on a day-to-day basis.
The availability and performance of these sites were pretty bad. The fact that Akamai was sat in front of these web properties probably saved an awful lot of grief, but it also hid an awful lot from everybody. And there was only minimal monitoring of this stuff. Some enterprise monitoring platforms were being paid for, but these had been tuned to ignore the first half hour of alerts, because that might be normal. Just not good practice, for the most part. The ops team, I think, was around 12 to 14 people, and there was a budget on the table for increasing that, to increase the operational hours of the platform. In spite of that, though, the ops team were only responsible for the application, the software. The hardware, the hypervisors, the switches, the network, the SANs, all of that was operated by third parties under a service management contract. And so there were people in that ops team whose main job was to triage tickets between end users within the business and this third-party team. A really inefficient use of people, basically.

The primary cause of things like the poor availability and performance was just some bad architectural mistakes. And these were architectural mistakes that had come about through miscommunication: there had been an assumption that the developers weren't ever allowed to use a database anywhere near the front end, and so they built a database out of XML and shared file systems. It was just crazy. And, you know, we're tempted to laugh at this, right? It's a comedy situation; it's a comedy of errors. Every organization of a certain size finds itself in this kind of position at some point in its life. But really, it's important to realize that nobody tries to get here. Nobody aims to be this bad at stuff. And there's a context to it; there are reasons why we're bad at these things.

The context, in part, as I see it, and this is largely my opinion rather than anything I would argue to have measured, is that a lot of the problems were down to the fact that web deliverables were tied to TV broadcast dates. You'd be building a piece of technology, someone would go, oh, I need that for this program, and all of a sudden you've got a deadline that you previously didn't have. These broadcast dates don't change. If the website doesn't work, no one's taking Downton Abbey off the air. Which is a kind of grim realization for those of us who have done work with startups, where actually, if things aren't ready and are likely to fall on their ass, you probably won't release them. But also, even though we were working to tight, immutable timescales, there were a lot of last-minute changes coming in. I mean, it's a creative organization. A media company, to me, feels an awful lot like an enormous agency. And the agency model is very much: get this out of the way, get the next client through the door. So there's no real focus on reusability or quality or those sorts of things.

This was an online operations team living within a 4,000-person organization. There were many other teams within that organization that could place demands on that operations team, none of which necessarily went through any common management hierarchy. Demands would just appear one day. You'd come in in the morning and there'd be a new thing that had to be live by Saturday. These are the sorts of things that just show up and distract from time to time.
In particular, for one piece of fairly small programming on one of the channels that very few people actually watch, the creatives had decided they wanted to put an opera singer in a studio, point a camera at them, have them sing tweets, and live stream this on the Internet on Saturday. This was mid-week. And we did our best to accommodate these sorts of things.

It's also important to see that in 2010 we were recovering from the financial crisis, to some extent. I guess we still are, to some extent. A lot of people had been laid off, and a lot of tribal knowledge had exited the building. Lots of people were licking their wounds; arguably some of the better people had left. The amount of the codebase that any individual member of the dev team had seen was probably lower, as a percentage, than it had been previously. And with third parties doing some of the service delivery, as an organization this team couldn't have complete control over the end-to-end delivery of all this stuff.

We also had some in-flight projects on the go at this point in time, such as a multi-million-pound data center refresh. After the layoffs, there were a number of buildings with fewer staff in them, and as a sensible precautionary measure, we were looking to move IT infrastructure out of office buildings and into data centers, so that the office buildings became less of a point of reliance. We'd also recently had a completely new board installed, a new CEO and chairman, and so there was a business transformation project on the go. The business transformation project was designed to get out of, or avoid in future, the situation that had led to these redundancies in the first place, which is: if you're a company whose sole income comes from advertising, then when the bottom drops out of the financial market, advertising revenue dries up. Everything's fucked, essentially. So part of the business transformation program was to find a way of generating revenue from sources other than advertising: selling directly to the consumer, expanding into international markets, all of those sorts of things, which is why things like Downton Abbey are all over the world right now.

As well as this, we had a completely new leadership team in the online, on-demand part of the business. So a new project was spun up to replace the content management system, which wasn't agile enough for anybody, and a new online video player project, just for good measure, in order to allow us to sell video content to the general public and hopefully avoid these financial problems in the future. And that's where Drupal comes into the equation. Both the online video player project and the CMS project began life as Drupal projects, after an RFP process where we went out to tender and looked at a number of different options, some of them .NET options, to address the fact that most of the internal knowledge was in .NET. But there were a couple of Drupal contenders, and eventually we chose Drupal as the platform.

So in order to build a Drupal platform, obviously we needed a new Linux platform. Up until this point, the only things being served from Linux in this organization were a couple of Drupal 6 sites with single points of failure all over the place: manually configured VMs running Red Hat 4, with old versions of PHP and MySQL on them. Kind of a mess, really, but reasonably contained.
The benefit of only having one server per property is that if you don't have many properties, there aren't many servers to worry about. And so my team, and people we brought in as new hires for the business, started to build our new Linux platform for all of this. We'll talk briefly about the technology pieces that we used as part of this deployment. We were automating configuration using Puppet. We were building development environments using Vagrant. We built a continuous deployment pipeline using Jenkins. And we'd gone through a hiring process, and the hiring process was a bit more than just an interview: we put people through technical tests and found a good team. I apologize to Paul, who's sat in the middle there, because one of the guys we hired used to work for him. And so we built these technology components. Incidentally, hiring in the DevOps market is difficult; if you're not currently working in DevOps, consider it as a career option. But we'll leave that to the end.

So, we used Puppet for our configuration management. If you're unfamiliar with these tools, I'll briefly cover them; if you want a bit more detail, then come and talk to me afterwards and I'll do my best to explain them. Puppet is a configuration management tool, one of a number of current tools in that space: you've also got things like Opscode Chef, Ansible, SaltStack, all those kinds of things. The question people often ask is: should I use Puppet or Chef? The answer to that question is yes. It doesn't really matter what you use, but this is an important area to look at. Configuration management is something that, if you're not doing it, you are probably going to be in trouble, particularly if you're running more than a handful of servers.

It allows us to build infrastructure as code. It allows us to sit down with a text editor and describe, in a text file, in a domain-specific language, how we would like our servers to look. It's convergent, which means that having built my description of how a server should look, if I run it on a server, the tool will take a look at the current state of the server, take a look at the required or desired state that I've described, and work out how to get from one to the other. It will only make as many changes as it needs to bring that configuration in line with what I've described. So the second time I run Puppet, using correctly written Puppet manifests, nothing will change, because the server is already in the desired state. And it runs client-server or masterless, which means it's suitable for running thousands of nodes or just one or two. It really doesn't matter which approach you take; it's able to do both.

And infrastructure as code means that we have access to software tools for writing this kind of stuff. You can use your IDEs, your Emacs, your vi or Eclipse or whatever other editor you like. You can use version control systems. You can use continuous integration and continuous deployment tooling to test these things. That puts you in a powerful position, because it means you can start treating your infrastructure as code components. And if you are building a DevOps team with developers and operations staff, a lot of this stuff is already familiar to developers. It's not necessarily a very straightforward language to pick up, but given time, it is easy to follow.
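To make that concrete, here's a minimal sketch of the sort of manifest I mean. The package, file and template names are illustrative, not anything we actually shipped at ITV:

```puppet
# Describe the desired state; Puppet works out what needs to change.
package { 'httpd':
  ensure => installed,
}

file { '/etc/httpd/conf.d/site.conf':
  ensure  => file,
  content => template('web/site.conf.erb'),  # hypothetical template
  require => Package['httpd'],
  notify  => Service['httpd'],               # reload only when this file changes
}

service { 'httpd':
  ensure => running,
  enable => true,
}
```

Apply that twice to the same server and the second run reports no changes, which is the convergence property I just described.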
And given an existing structure like that, it's fairly easy to figure out what changes you need to make to get the effect that you want. You use pull requests, as you would when collaborating on any Git project, to quality-control that stuff. It all becomes much more like developing code, and that, I think, is quite a powerful thing.

Another powerful tool that we used as part of this project, for the first time, is one that's now an absolute staple of pretty much everything we do: a tool called Vagrant. Some of you, if you've been following the DevOps scene, may have heard of Vagrant. Maybe you're using it yourselves. You may have heard Mitchell talk about it at one of the many events all over the world that he speaks at. Vagrant is essentially a tool for managing development environments. It's a set of Ruby command-line operations for managing virtual machines, either locally or on remote hypervisors or on AWS or whatever; it's all plugin-based. On the host operating system it supports Linux, Mac and Windows, so it doesn't matter what your developers are using. Well, okay, they're probably not going to get much joy if they're using FreeBSD, to be fair. But if they're using Linux, Mac or Windows on their desktop, then they're all going to be capable of running this thing.

And it allows me, as a DevOps person, to provide a base image, and some Puppet or other configuration management code, for standing up a new development environment that is as close to live as we can make it. Because it uses the same Puppet manifests, it uses the same operating system packages, and it uses the same version of the operating system as a starting point. This is, again, really powerful. It means that in a development team of, say, 20 people, if we want to upgrade the PHP version by a point release, we don't then have 20 people building PHP for a morning and trying to figure out linking arguments and all the other joyous things that go along with trying to build PHP. We can just build a new package, ship it into the package repository that we're using with Vagrant, and ask everyone to update their Vagrant environment.

It uses a folder-sharing mechanism to expose folders from your local machine inside the virtual development environment, which means that, again, you can stick with Emacs or Eclipse or TextMate or whatever your development environment and tools of choice are, and know that when you come to run the code, it's going to run inside a virtual machine using all the same packages, all the same versions of everything, as will be available on live. We version the Puppet manifests, so we can stage these things: we can provide pre-release versions that we then move through environments into staging and then live. This is a really good way of providing dev environments. Your developers need a fast laptop with a decent SSD in it and a decent amount of RAM, but if you're not buying them those sorts of things already, you hate your developers, and please don't do that. The environments also become portable, rather than being a shared dev server sat in a cupboard somewhere, which is a fairly standard pattern for development organisations. You can now stick the environment on your laptop and take it home. If you do hate your developers and you're going to make them work on the weekends, they can at least do it on their own laptops instead of coming into the office.
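As a sketch of what that looks like in practice, here's a minimal Vagrantfile of the sort you'd hand to developers. It uses today's Vagrantfile syntax rather than the 2011-era one, and the box name and paths are hypothetical; the important part is that provisioning runs the same Puppet code as the live servers:

```ruby
# Vagrantfile: "vagrant up" gives each developer a VM built from the
# same base image and Puppet manifests as the production platform.
Vagrant.configure("2") do |config|
  config.vm.box = "itv-rhel-base"   # hypothetical internal base box

  # Share the working copy into the VM, so developers keep their own editor
  # on the host while the code runs against production-matched packages.
  config.vm.synced_folder "./docroot", "/var/www/docroot"

  # Provision with the same manifests that build the live servers.
  config.vm.provision :puppet do |puppet|
    puppet.manifests_path = "puppet/manifests"
    puppet.module_path    = "puppet/modules"
    puppet.manifest_file  = "site.pp"
  end
end
```

When a new PHP package lands in the repository, developers just run `vagrant provision` to pick it up, rather than building anything by hand.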
After Vagrant, we look at Jenkins. Jenkins is a continuous deployment tool; it's essentially providing orchestration around build, test and deployment. Really, it's just a batch job runner with a nice interface. It's not a complicated tool, necessarily, but you can do some pretty complex things with it. It's like Lego: you can build some quite impressive stuff with Jenkins. And importantly, for a large organisation, you can have a dashboard that everybody can see. You can construct different-shaped dashboards for different business units, so the change and release manager can see what is available to release. It has role-based access control, so you can authenticate it against your Active Directory or other horrifying identity service, and use that to permit only people in the change-and-release-management group to release code. In an organisation like ITV, it's important that you don't push out a new version of the homepage in the middle of a really big show; the change and release manager is the person who owns that process. And there's a load of plugins for this. You can use Jenkins to spin up new test environments; it has plugins for all the standard version control tools; it's got all sorts of stuff.

We also put in Zabbix for monitoring. Those of you who are passionate about monitoring, not looking anywhere in particular, Chris, don't really like Zabbix, because it's old and a bit crap. That's true; I'm not going to disagree with that. But the thing Zabbix gives you is a single tool that does both time-series trend gathering and alerting. It's quite easy to get up and running, even if it is a bit point-and-click, and it's certainly way better than no monitoring. In our view, certainly in 2011, it was much better than sitting down with Graphite and trying to plug all those pieces together. It gave us a quick win on exposing this data. It also meant that we could monitor the Windows platform, because you can use SNMP traps and that sort of thing. This gave us insight into what the servers were actually doing, which was new and exciting data for a team that had never really looked under the hood in that way.

So, let's get to these projects. It started out initially as two Drupal sites. There was a refresh for a specific site that had been built in Drupal 6, I believe, by an agency; this was being improved and having better search added to it. On top of that, there was intended to be a generic CMS build, and the intention was to replace the .NET platform with whatever that resulted in. But the responsibilities of the pipeline, and of the team doing all this work, grew to take in all the other Linux properties. So, as a completely non-random, utterly real example, the opera star singing in a studio live-stream thing: that all had a PHP back end and a Flash front end that needed to be deployed, and it went through the deployment pipeline. For anything new that popped up, the mantra was: it goes out onto the platform via the pipeline, or it doesn't happen at all. And we'd made attempts to provide the Vagrant dev environments to the third parties, to whom a lot of the development was outsourced, so that they could develop their software on the platform, the understanding being that if it works in the Vagrant environment, it will probably work in production as well. I say that was the intent; it didn't quite work like that. Then, as other larger projects cropped up, for example the new player platform, it was generally understood across the business that new sites were being built in Drupal.
And so new teams started cropping up with, oh, let's build this in Drupal. They'd come and talk to the teams that were involved. This developed into a number of off-site and on-site teams, and by the end I would say there were probably 50 or possibly more people working on projects that related to Drupal with this toolchain. And I include themers, JavaScript developers, CSS people, product managers, project managers, all of those kinds of staff. This is a lot of people, a massive number of people involved in this project. And I think it got to that size in about three to four months, something along those lines; I may be misremembering. But the teams were owned by different business units, and a lot of the business units had no common management structure until you got to board level. Part of the business transformation was to try and solve some of these problems. And my team was the gatekeeper of the live platform. With some of these external teams in different countries, different time zones, different languages, it all got kind of complicated.

So how did we win? What did we do that worked pretty well? We'll cover that first, because there's more of the rest. The wins that we had: we had automated deployment of the Linux platform from day one. It was a mantra, and we enforced it. I was in charge of the team doing all of this build work, and they knew they were not to do anything that wasn't automated. The automate-everything mindset came partly as a result of our presence in the organization, but also partly because those are the types of people that we hired. And, it's probably fair to say, have we got any Windows people in the room? The toolchains that existed on Windows for automating stuff have, until recently, been absolutely shocking. Now, with things like PowerShell, and the fact that there are Puppet and Chef for Windows now, that situation is gradually improving. But it was definitely the case that the team as it existed in that organization was not really au fait with the idea of automating that much. There was some stuff with Group Policy Objects in Active Directory, those sorts of things, but nothing along the lines of: let's make all the IIS servers do the same thing at once. There was configuration drift everywhere.

New team members always bring outside experience, and after a period of layoffs, adding new members of staff with new ways of thinking was, I think, quite invigorating for the team. And a lot of the existing guys in the support teams, the operations teams, who hadn't really been that involved before, were given new Linux-related responsibilities, and so they were learning new things. People always thrive when they're learning new things.

We removed a lot of complexity from change management, because we were automating stuff. You didn't need to write a seven-page Word document on what you were changing, why and how; you were pressing a button in Jenkins, and it was running a script that you'd tested. It was no longer important to write all those documents for the change and release manager, who didn't really understand them anyway. That was a win. We increased operational visibility, so we had monitoring. We also, which I didn't mention, added Graylog at this point, so that as a developer you could go and see the logs coming out of the Drupal platform.
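Getting those logs flowing isn't complicated, by the way. As a sketch, with a hypothetical hostname: Drupal's syslog module writes to a local syslog facility, so a single rsyslog rule on each web server is enough to forward everything to a central Graylog server:

```
# /etc/rsyslog.d/graylog.conf (illustrative)
# Drupal's syslog module uses the local0 facility by default on Linux;
# forward those messages to the central Graylog server over UDP.
local0.*  @graylog.example.internal:514
```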
Everybody had a better idea of what things were doing. We put Pingdom in to watch things from the outside world. We put better information in front of people.

So what challenges did we have? We were building these environments and, as I said, the projects, plural, simultaneously. Any problem with the environments would play into the developer timelines. Most of the development was being performed by external parties who had already agreed to deadlines, and an internal team causing a blockage for an external team was obviously not going to make us any friends. These timescales were immutable. They shouldn't have been, but of course they were attached to the release of a television program, which meant they had to be delivered by that point in time. Things like the player project were board-mandated projects; these were things that had been promised to shareholders as a solution to the problem of income during a financial crisis. So it was a very politically charged project, and it had to happen. Ultimately it was a lot late, and a lot over budget, and a lot of the people involved don't have jobs anymore, but that's the way these things play out.

Many of the developers, I would say almost all of the developers and fewer of the operations team, were contract staff. And you can't fail to lose tribal knowledge at the end of a project, when somebody decides, I've had enough of this, I'm not supporting it, I'm going somewhere else, and rolls off. Not all the developers were full-stack competent, either. Back in 2011, when we were deploying Vagrant for the first time, Vagrant on Windows was a real struggle, and it involved lots of horrifying stuff like JRuby, and having to understand how gems worked, and configuring PuTTY for key-based SSH access. Lots of legwork. That's all solved now; the Vagrant project comes with installers where that's all dealt with. But at the time we were having to do a lot of that sort of manual support, and people who write CSS for a living don't necessarily see why they should have to install a Ruby interpreter on their box, which I think is probably fair.

And the team expanded rapidly, and I don't think I've worked anywhere that's had a rapid team expansion and also managed to hit deadlines immediately as a result of it. I'm sure we've all read The Mythical Man-Month; if you haven't, go and get a copy. It was written 40 years ago: you can't just keep adding people to a team; there's an overhead of communication, and that's going to set you back, at least initially.

The consequence of these sorts of challenges, really, was that the automated testing side of things was pretty much ignored. I mean, we were supporting the running of tests in the delivery pipeline, but we weren't actually writing any ourselves, and the teams that were building the software were not writing tests for it. So the deliverables were not especially high quality. This, combined with the timescale issues, led to quite a lot of blamestorming, quite a lot of finger-pointing, and a lot of that was just because we didn't have that kind of shared ownership: we were all working for different parts of the organization, with different agendas and different timescales. At the end of the day, between quality and delivery speed, there's always going to be a trade-off one way or the other. We had a fairly high staff turnover, so a lot of the people working on these projects rolled off their contracts, either because they were not deemed good enough or because they were themselves sick of doing the work. So we brought some more experts in, and the experts that we brought in brought more challenges to the table:
specifically, a redevelopment of part of the platform at quite a late stage, to solve performance problems that we hadn't proven existed. And that was a problem. Ultimately, the platforms limped over the finish line and got out there, but they take a lot of operating: they're not always up, they're quite slow, and there are still issues to be solved. They haven't solved some of the things that were problems with the .NET platform, as a result of some of the re-architecture that went on. And ultimately, across the whole organization, confidence in Drupal just dropped right off, and it was deemed not to be a suitable platform for new development. A whole new team of people came in, mostly from the BBC, and now they're rebuilding these sorts of things from scratch, in a combination of Java and PHP. Yes. Let's not talk about that.

So let's have a look at what we did there as an organization. How did we help ITV? Well, let's go back to the four pillars. We put automation in: Puppet and Jenkins went in, to reduce the number of manual tasks and to improve the efficiency of some of the work that was going on. We added monitoring: Zabbix went in, and we also put Graylog in, so the availability of data to the teams was much increased as a result of the work that we did. We encouraged sharing, by providing shareable development environments and making the logging and monitoring data available to all the teams. But ultimately, when we come back to this list, there's a fundamental point missing here, and that's culture. We didn't really do anything with that, and I think that's the main learning point from all of this.

We're technologists. We're at a technology conference. You and I are probably all happiest sat at a terminal, bashing commands into a black window with a prompt on it, or at the very least cutting code, or running scripts, or what have you. But success in anything like this is about more than just the technology, and the cultural factors will make or break any project. It doesn't matter how good the technology is; you can bring the cutting edge, fully working, to the table, but if the cultural factors aren't there, it's not going to work. That's really the bottom line. If you have a culture that says failure isn't an option, that if you make a mistake you can be fired, that we must hit deadlines whatever the cost, that testing can wait, it doesn't matter, we'll do that later, let's just work everybody a bit harder, they're all contractors, right, and we're paying them a lot of money: all of those cultural aspects will always undermine any technical quality that's going on.

But cultural change is hard. You're dealing with people; we're not dealing with pieces of software. If I could upload new firmware into an operations engineer and set them on a new task, I would absolutely do that. It's not possible; you can't do it. And cultural change comes from the top, downwards. If your manager doesn't have the cultural values that you have, it's very difficult to get those values recognised within your organisation, and in a company of 4,000 people you can imagine how deep that stack is before you start talking to people who share your cultural values, a long way down from the top. The people sat at the board table, the shareholders, have no idea what automation is, or what continuous delivery is, or why it would be important. You can sell them on it, but it takes time, and it's, again, not as easy as putting new firmware into
your board, although give it a try; maybe it'll work for you. And cultural change under pressure: I mean, change in itself is hard, cultural change is hard, and cultural change under pressure is the stuff of nightmares. There's a book written by Tom DeMarco, which some of you may have read, called Slack, which is about making time in your organisation for, basically, slack activities. He says, and I paraphrase: slack is the lubricant of change; bad companies can only obsess about removing it. And it's entirely true, I think. If you're obsessed with squeezing every last bit of slack out of people, you're not going to get cultural change out of them; you're not going to get risk-taking out of them; no one's going to do new, exciting things in that kind of environment. All of this kind of stuff takes time, and as managers within a large organisation, we have to find a way of providing that time to our people so that they can excel. Otherwise you end up with a bad taste in your mouth.

That's me. I'll take any questions about the technologies that we used, or maybe some of the specifics about the organisation, but maybe not in too much detail. Any questions?

Can I ask how often they are spinning out releases of their code now, as an organisation, in this kind of environment?

So, I mentioned the .NET platform was going out on a sort of six-weekly basis. In parallel with the work we were doing in the Linux world, we also had guys from ThoughtWorks doing a lot of legwork on bringing that kind of automation into the .NET world, and the releases were still not frequent by, kind of, IMVU standards, but they were arguably weekly, possibly fortnightly, and much less broken, and there was much higher confidence to release those things during the day rather than having to drag engineers in out of hours. There were some architectural things that were not going to be fixed, because the decision had already been made to back the Drupal horse, which meant not many of those changes went on. In the Linux world, we were able to release multiple times a day, but because there was only limited automated testing, we would release changes cautiously. But on average, they're doing releases multiple times a day now; they certainly were at the point at which all this stuff started getting released.

Yeah, much less than that. The deployment of a Drupal site, if you're automating it and doing it right, there's no reason why it should take more than a couple of minutes; you're just copying files around and running database migrations.

One last thing: you said there were 14 people to begin with. What's that scaled down to now, for the Linux part?

So right now there are, I believe, two or three third-line Linux engineers. But the front-line staff, there are still that many people there, because they've expanded the business-hours coverage, and also that team have taken on Linux responsibilities, though it's troubleshooting and escalation as opposed to full-on, deep Linux knowledge. So the 14 was for the entire operations team, and the existing operations team took on more Linux responsibilities; they weren't chopped and changed for Linux people, although some hiring went on to add more Linux capability to that team. But ultimately, my feeling is that if you're doing DevOps in the right way, hiring low-end Linux operators doesn't really help, because most of the troubleshooting you're going to do is complicated, and so all you're doing there is adding a new set of people to triage through.

Hello. You said that there's a monitoring system that you used in 2011. Has there been
any change two or three years after that? Are there better systems for this now?

Okay, so the question is: am I still using Zabbix? And the answer is yes, we do still use Zabbix. We find it suffices for the most part; most customers that we work with are of a certain size, right, and Zabbix is more than capable of handling that. We also have a set of libraries and patterns that we can reuse for deploying that sort of monitoring, so deploying Zabbix for us takes an afternoon, whereas doing something else would involve starting from scratch. The other monitoring systems, the newer technologies that are around now, things based on Graphite and collectd and all the other kind of exciting things in that world, which, if you were at Monitorama recently, you probably heard a lot about: for me, they're not a fully formed solution yet. We make our living taking solutions to people, and so we already have a formed solution. At the point at which we find there's a good way of plugging all those new, shiny, exciting components together, in a way that means we can deliver them quickly and reliably, then we'll be doing that as well.

Just a quick reminder: Chris will be doing a talk tomorrow specifically about those monitoring tools. He's a big fan of monitoring, and that's okay. And if you're interested in DevOps in general, and you're here today, which of course you are, because I'm talking to you in this room: Chris is also running... nope, someone else is running a DevOps... okay, there's a gentleman over there in a wine-colored t-shirt who is running a DevOps meetup in Prague this evening. Go and talk to him, because apparently neither I nor Chris is going to be talking about it. Cool.

You mentioned testing. What kind of testing were they doing? What was automated? The kind of BDD-type stuff?

Yeah, so the components that we were essentially spiking were around the Capybara and Cucumber side of things, and that ultimately may not have been the right toolchain, partly because the technology dependencies for making it work, certainly on the Red Hat 5 environment we were working on, were actually quite arduous: we had to rebuild all of the Qt windowing environment and all sorts of things. Plus, the testing tooling was then all Ruby, and given that we were hiring PHP developers for the most part, that didn't really sit properly. There was an existing automated test function within the organization, who were doing Gherkin-style testing on the Windows platform, and we assumed that we'd be able to make use of the same sort of tooling, but ultimately it didn't really happen; it wasn't adopted by the dev teams. We probably could have helped more, but that was largely where we were at. Right now, I couldn't say how I'd do it differently. I guess there are some Selenium-based things that might be worth looking at, and there's Behat. It's an area I'm not especially familiar with; it's all a bit too much like programming for me, I'm a bit further down the stack these days. But yeah, it's difficult to know how we'd do it better, so it's certainly an area that I should probably read a bit more about, because I think that's probably moved on quite a lot more in three years than much of the other stuff.
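For anyone who hasn't seen it, the Gherkin format mentioned here, shared by Cucumber in the Ruby world and Behat in the PHP world, looks something like the sketch below. The feature is a made-up illustration rather than one of ITV's, and the step definitions behind each line would live separately, in Ruby or PHP:

```gherkin
# Illustrative feature file: plain-text, business-readable scenarios
# that a tool like Cucumber or Behat executes against the site.
Feature: Programme vanity URLs
  Scenario: A show is reachable at its vanity URL
    Given a programme "Example Show" exists with the vanity URL "/exampleshow"
    When I visit "/exampleshow"
    Then the response status code should be 200
    And I should see "Example Show"
```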
Sort of following up on that, there's a Behat lab tomorrow at five o'clock, I think. But my question was: would you do any performance testing as part of your deployments, as well as functional testing?

Yeah, so we did. The non-functional tests that we put together were not part of the deployment pipeline. We engaged with a third party called SOASTA, who have a product called CloudTest, which allows you to build test building blocks and plug them together. It's a little bit like editing video: there's a sort of timeline, and you go, I want four of these and ten of these, and you can put random-based decision points in there. It's a really good tool. Pretty expensive, but for us it also saved a lot of time and grief, because it can plug into agents running behind your firewall and report back on what the actual result is. So you get to see a dashboard of what the tests are doing, what the errors look like, what actual error codes are coming back, what the database is doing, actually pulling out MySQL statistics and that sort of thing. Definitely worth checking out. But because of what was involved, it's something we did pre-release, or before large releases. Beyond that, for the non-functional characteristics, we measured page response times using a combination of Pingdom and the logs from the traffic directors on the front end, and put that through Zabbix, so you could see how the trend in page performance changed after a release. I know SOASTA are pushing integration hooks for their tool, so in theory you can integrate it with Jenkins and run it as part of a performance regression test, but the price tag was five or six figures, which was something we tried to get in there, but ultimately it was a difficult conversation to have when there were other things to spend money on.

Great. All right. Thanks for your time. Any other questions? Right then, give it up for John. Thanks very much.