Good morning, everyone. So this talk is "DevOps: Leveling Up Your Team." I'll mention right up front that it's a beginner talk, as I tagged it. This is not a nitty-gritty, seeing-a-bunch-of-code kind of talk. I did one of those in Austin for the DevOps track with a live demo, which didn't go so well. So I decided not to repeat my past mistakes and to give a higher-level talk, one that I don't see a lot of at the different Drupal events that I go to.

I think a lot of the people presenting on DevOps are really excited about talking about the specifics of Vagrant and logging and monitoring and all of the stuff that we love, but they don't always make a great case for why Drupal shops should be investing in that kind of work. And that's what this talk is about. It's about why I think it's worth making the investment in DevOps, and how the culture and the values are important things to implement even if you don't implement all of the tools that everybody spends so much time talking about. Although if you buy into the philosophy, you're probably going to start implementing those tools as well, right? It's kind of a package deal. So I'm going to be talking about the tools that we use throughout, but also about how the company that I work at uses them.

So, hi, everybody. I'm Howard. You probably know me as tizzo on the internet. This is Powdered Toast Man. I work at a company called Zivtech. We do mostly Drupal consulting, but we've expanded and are doing some other open source stuff as well: developing custom infrastructures and building apps with other technologies. I've been there for about five years. I'm the VP of engineering, so I spend a lot of my time picking technical direction, trying to advise about how we should be doing things, auditioning new technologies, and working on our hosting stack.
We do really small-scale hosting for some of our clients, just custom stacks where it doesn't make sense to put them somewhere else. I'll talk about that a little bit more later on too.

And we've been doing what I'd consider a DevOps workflow since way before that term existed. Which is not to say I liked DevOps before it was cool. It's just to say that DevOps has become this rallying cry, and it's also become this buzzword, right? If the word DevOps appears anywhere associated with your name, you just start getting three recruiter emails a week. And a lot of them really obviously don't have any idea what DevOps is. It gets used as a buzzword that just means operations, or just means doing sysadmin stuff, or just means automation, or means full stack.

So I wanted to talk a little bit about what I mean by this word and what I don't mean. I think what DevOps isn't is a job description. I think that's antithetical to the whole idea of DevOps. I'm being that guy right now, complaining that everybody's using the word wrong. But in my defense, everybody's using the word wrong. DevOps isn't just automation. And it definitely isn't a department.

Just a show of hands: who here is a developer? And who here does operations work as well? Pretty much everybody's hand went up. And who here would say that their organization does DevOps, works in that way? OK, good, a good split. So hopefully I can make the case to the rest of you that you want to bring more of these values into your culture.

DevOps is... this is going to be hippie-dippy.
I apologize right now. It is kind of a movement of people trying to organize around a particular way of working and a way of thinking about things. It's a philosophy, and most of all it's a better way to organize and a better way to collaborate.

So the history, I think, helps to explain why I'm so picky about this phrase, and I promise I'll be off of this in just a second. Historically there are two teams, right? There are developers, and they have just one mandate, which is add features and add features quickly. And then there's a separate operations team who are responsible for keeping the system up all the time. So the developers are measured by their ability to roll new stuff out and deliver business value on a regular basis. The operations team is measured by how many nines they had in their uptime rating last year.

These are naturally opposing forces. These mandates are fundamentally incompatible if you just have these two teams walled off from each other with their separate mandates. For one to win, the other needs to lose. For developers to roll out new features, they need to roll out new stacks; they need to upgrade stuff. That of course introduces instability. Operations gets woken up in the middle of the night. They don't want their pagers going off, so they're resistant.

So that's where the idea of DevOps comes from: reorganizing things so that we can build one team, not two, and not silo. What's actually in the business's interest is finding the right balance between being able to roll out the new stuff that the business needs, that the stakeholders are asking for, and not having it go down an embarrassing amount of time, right?
Probably we don't really need as many nines as we think we do. Probably we don't need SLAs as strict as we sometimes ask for. A lot of the time when you ask a client what's an acceptable amount of downtime, they look at you like you're from outer space and they say, none. Which is just crazy for most organizations. Usually we're way more paranoid than we need to be.

So the idea is that we want to build one team that can deliver more value by pulling together the seemingly opposing interests and getting a shared understanding, and also a shared responsibility, to enable experimentation. I think this is the key to the whole DevOps movement: enabling people to experiment, so that we can change more quickly without sacrificing the stability that everybody wants and thinks they absolutely need.

And then the other thing is it's just a lot more fun, I think, to work in this way, where it's a more collaborative environment, where operations people are allowed to contribute to developing the tools that they need, and where developers are freed to contribute to operations and how things run. When you tear down the wall between these silos, friendship is magic.

So what got me started thinking about this stuff is that I was the only tech guy working at a film company. I was working on Drupal-based sites back in the day, for sharing the video and the story behind how all this stuff was produced, building the marketing sites and all of that. And I was developing features on MAMP on my local environment and deploying them to IIS and CentOS and stuff managed with Plesk. And things really did keep breaking for me. I was constantly running into problems where there were minor server configuration differences between environments. I mean, especially IIS, you know, six or seven
years ago. And things would just constantly be breaking on me.

So I decided, look, I need to not have any differences between production and development. So I started rolling VMs with Parallels, the only thing I could get to do it at the time, and just manually configuring everything. And I found that to be a way better way of working, because all of a sudden the "it works on my machine" problem dissolved for me. And I stopped learning how to administer MAMP, which is not generally a useful skill, and I started learning how to configure an actual server. Which is really how I started to move from doing dev to doing ops stuff as well.

When I came to join Zivtech, everybody there was actually still developing on MAMP locally, and they were running into these same problems. So I slowly got everybody else to start using VMs. We had our golden image that I maintained, and whenever somebody wanted to change configuration, they had to convince me to leave client work for a while to go work on their Apache configuration change, and then we needed to manually roll that out to all the servers. But it was the beginning of starting to work in this way, where the skills that we were all developing were the ones that were relevant to what we actually used for real.

We'd start to tweak things like the Apache max children and all these tunables that aren't very accessible with out-of-the-box stacks. And even where we weren't running hosting ourselves, a lot of the time our clients would start to ask us questions like: How will this scale? How will this affect caching? What do we need to do to configure this?
And all of a sudden, instead of having one or two server people, ops people, on the team, we started realizing that everybody on the team had opinions and had an understanding of how this stuff worked all the way down.

Pantheon always likes to talk a lot about pushing people up to the top of the stack. That's one of Josh's big things. I think there's a lot of value and a lot of wisdom in saying, let's not waste people's time with solved problems; let's push ourselves into the realm where we're innovating. I think his argument is that that's usually at the application level, with building out your Drupal site, your Drupal features.

I think there's a trade-off there, though. If you say, look, I'm just going to use the off-the-shelf solution, I'm just going to pick Acquia or Pantheon or one of the other polished, packaged hosting providers, you have to know your own site, your own product, and whether that's the right choice for you. A lot of the time it is, and we push a lot of our clients to go that route. But the price that you're paying for not having to ever worry about whether a server is up or down, or whether MySQL's tuned properly, is that you're also sacrificing your ability to innovate at the stack level.

You can no longer decide to experiment with a new technology that isn't already on that hosting stack without having to solve a whole bunch more problems about connecting over the network to some other data provider living in some other part of the world. Or maybe you can get into the same data center, right? But you're still having to make a jump between what's installed at your hosting provider and where your MongoDB instance lives. You're giving up the ability to say, hey, I just want to install this custom PHP extension.
I just want to try this thing out. And I think there's a lot of value in being able to try those things.

Having that ability has led us to start doing more service-oriented architectures, where we're moving pieces that are logically self-contained units out into their own applications so that they can be versioned and used independently. We started using Silex and Node.js for building some of our API components, still tying into Drupal front ends pretty much all the time. But if you've gone the route of saying, look, we're just going to pick off-the-shelf cloud hosting, what you're also saying is that we're giving up the ability to audition those other things, to try those things out easily. And with the advent of Puppet and Chef and some of the mature recipes that you can use, providing Drupal hosting doesn't need to be that hard, especially for your development environments.

Luckily DevOps has given us a set of organizing principles to rally around. So this is the philosophy and the movement part of DevOps. It's not just a bunch of people saying, let's stop having developers and operations fight. It's about saying, let's identify and hold up some of the really valuable cultural things that we've discovered, and talk about how we can implement them and how they can benefit us.

So CAMS is the DevOps acronym: culture, automation, measurement, and sharing. These are all overlapping, complementary ideals, and I want to talk about each one and how it relates to Drupal, and specifically how we use it, how I use it in my Drupal practice
at Zivtech.

So, culture. Culture is one of those tricky ones where you hear the word and it means everything and nothing at the same time. I'm stealing this definition from another DevOps presentation, which stole it from a book. I like the definition: a set of shared mental assumptions that guide interpretation and action in an organization by defining appropriate behavior for various situations. Oh, typo: "for various situations." So it's a little bit wordy, but the idea is that we have a shared understanding that enables us to look at any given situation and all make the same judgment call. This is something you develop over time with your team. You certainly can't change it suddenly.

So, some of the DevOps-y cultural assumptions that I think we have at Zivtech. One is that everything should be repeatable. If you're doing something in a way that's not repeatable, we consider that a bug. We're by no means perfect in how we implement this stuff; there's lots of room for improvement. But anyone at Zivtech who sees something that's not repeatable would tell you that that's something we need to work on. This has a lot to do with automation, and I'll get to more of that later.

Another thing is developing locally, and the upshot of that is that you deploy for integration. It's amazing how many of these conferences I go to where people are still arguing that they shouldn't have development environments, or that they don't see the value in it.
I think Pantheon adding SFTP support, so that you can just switch it into "I'm just going to do stuff right on the server" mode, has empowered a lot more people to continue working this way, where it was starting to become fighting upstream before that.

So the point with developing locally is that each member of your team has their own copy of the site locally, and the more members of your team there are, the more critical this becomes, so that they can make their changes locally without breaking somebody else's. That's one of the benefits: if they forget a semicolon in a PHP file, it doesn't cause a catastrophic failure for everyone that's working on that dev site. But the other thing is that this starts to enforce that repeatability. Because if you develop the feature locally, how do you get it into the dev environment? And if you've got a staging environment, how do you roll it from the dev environment into staging, and from staging into production? There's a bunch of different ways to solve this, but I think it's critical to have a way. I'll come back to some of the ways that we do that in a minute.

If it matters, it's in version control. Content would be the one big exception to this. We tend to have a blessed, canonical database fairly early on in our process, which is where actual nodes, actual comments, actual users live. It's not that all of that stuff doesn't matter, but it's not what we're developing.
We don't have content writers on our team at this point. What we're working on really is the code. The client data, the content, lives in a backup system. But if it's a ticket assigned to a developer, it ends up in version control. And if that means adding something to a node, that usually means it ends up in an update hook, and somebody's written some code to be able to deploy that.

Everything gets reviewed. So at Zivtech, nobody closes their own ticket. We've adopted the Drupal.org workflow: you mark your own ticket as resolved, or ready to be reviewed, and then someone else marks it closed or fixed after they've had an opportunity to review the code, see it actually function in an environment, and demonstrate that it does what it was supposed to do.

Another one of ours is to open source whatever you can. I don't think this is strictly speaking tied into DevOps, but it overlaps with some of the sharing component that we'll come back to later. We try to push as much as we can out into contributed modules, contributed patches to existing modules, core, or whatever.

And that ties into the communication component, and again, we'll get back to this in the sharing section. But if someone's done some work and they haven't documented it, everybody says, right, you made the wrong call there. We try to make sure that everyone's as redundant as possible, so that somebody winning the lottery and leaving forever isn't going to ruin us. Or, maybe more likely, getting hit by a bus isn't going to ruin us.

Code review. I think it's worth mentioning this again, because I think this is the single biggest thing you can do to improve your team. Who here has code review as a required step in your process?
Right, so maybe a quarter of the audience. If you take nothing else from this talk, or this DrupalCon at all, I think it would be worth attending if you can just make that one change.

There's a bunch of different benefits to doing code review. Part of the deal is that this is how reviewing your own code usually goes: you kind of tap at it, but really it's your baby and you love it, and you're not going to be very rough with it. Your team members, who are afraid of being woken up in the middle of the night with a bug report, wail on your code a lot harder. And that's really what you want before things hit production, right? You want someone to really wail on this thing and let you know if it's wrong.

We're pretty serious about the code review step. We enforce Drupal's coding standards. If your code doesn't pass Coder, it gets sent back to you. If your comments don't start with a capital letter and end in a period, a lot of the time we'll pass it back. Our view is that everyone's code should look like it was written by the same person. Which, yeah, a lot of the time you can still kind of tell, but the point being...

Hmm? Yes, please. Our team has around 20 developers at the moment, I think. But they won't all necessarily be on one project. That's right. We always review the code from the team. Absolutely.
So the key, again, is getting it built into the culture. And we have it built into our issue tracking system as well. I mean, it's in people's ability to mark things closed directly, but you get in trouble for doing that. The workflow that we have is: you work on the ticket, you mark it resolved, and you pick another member of the team to assign it to. So it's not always the same person, and I think that is a critical component of code review as well. If you're on a team of two, you're on a team of two, right? But if there are other people, it's really important to have heterogeneous code review, because there are different techniques, there are different approaches. And sometimes what comes out in code review is not that what the person did was wrong; it's that there's a better, newer way. So by constantly passing code around, you end up communicating the best ways of doing things, communicating the standards more fully, and you start picking up more from other members of the team. I think it is the single thing that we've done at Zivtech to make the whole team better. It's really been transformative for us, and we started that fairly early on.

Other questions? That's a great question: do we have a hierarchy for code review, or can people on the same level review each other's code? Yes, it's sort of a mix. We don't send a code review of something deep from a senior person to a junior person, right? Some senior person's writing some obscure Views plugin. We don't send that to someone who mostly does site building and say, does it look good?
We try to find someone on the team who's going to have a critical eye and applicable feedback. That said, having junior people review that obscure, crazy plugin is an amazing opportunity for them to learn. So if we have the time, we will sometimes do what we call shadow code reviews, where a senior person will grab a junior developer and say, I'm about to sit down and review 10 tickets, I want you to just sit with me, and you just walk them through it. It's an investment, because now you're paying a junior person for an hour that you probably can't bill anyone for. But they learn a lot.

The other thing that we'll sometimes do is assign a junior developer a code review ticket and then have a senior developer take another pass at it. One of the required elements of our code review, like I mentioned before, is seeing it work. You can't just review the code and say, oh, it looks fine. You need to check out that feature branch, deploy that code, and see that it's actually functioning on your local environment or on the dev environment. See that code running somewhere. So having the junior developer work through it and write up what they were able to glean, and then having a senior developer still review the code, can mean a much more cursory review for the senior person: making sure that there are no security vulnerabilities or performance implications, some of those things that junior people are less capable of picking up.

But yeah, I think moving it around, having junior people see more senior people's code and vice versa, really gives you some of the best opportunities. So it's really about sharing things all around.

Other question? How much time should you allocate to code review?
I think our general number is somewhere around 10 percent. I think it was a much bigger time investment at the beginning of the project. And to some degree, if you get into a huge amount of debt in code review, it becomes very difficult to dig out. So another important component of code review is that the feedback loop should be as tight as possible.

I think doing a pull-request-based workflow, using something like GitHub or Bitbucket, makes a lot of sense, because it makes it much easier for that code to be isolated from all the other things that are going on, and you can still click that merge button. And there it gets even more important, because if you don't merge it in soon, the code starts to diverge, right? And you've got people chasing patches, like we do so much in the Drupal community.

Oh, right. So does the person who's reviewing do the merge? Well, they do the merge if it's ready. At some point, hopefully, you get through review. Hopefully it doesn't just keep bouncing back forever. Hopefully at some point your feature gets finished, gets polished, the comments get fixed, the tabs get replaced with two spaces, the security problems get resolved, and you click that merge button and pull it in.

And we don't always use feature branches. For some of our smaller projects that have smaller teams, we'll commit right to master, especially if there aren't very many junior people, where there's less stuff that you end up having to revert or roll back. So it's kind of a mix in terms of whether we're using pull requests or committing right to master. But for any of the bigger or scarier stuff, we use feature branches. And then, yeah, the person who's doing the review is usually the one that merges it, deploys it, rolls it into the next tag, whatever, once it's ready. If it's not ready, it gets bumped back.
It doesn't come in yet. Any other questions on that?

Right, what does the code review cover? Is it just where you put the curly brackets and whether you're using two spaces, or is it the functionality? It's both. The idea, again, is that this is the review before it lands in master, before you consider it done, before you check it off as finished. So if the functionality isn't fully there, it's not ready to be committed, to the degree that your specific ticket isn't finished.

What we try to do is break things down into the smallest, indivisible piece of work. So a lot of the time we won't have a ticket for "build the homepage." We'll have a ticket for, you know, "build the recent news items list," and then that gets incorporated into the homepage. So you're responsible for whatever functionality was in your ticket. We try to break things down as far as we can, because smaller tickets that have a shorter lifecycle make it easier to keep track of the code and especially easier to do review. It's hard to review a ticket that has a thousand commits in it. It's really easy to review one that has seven. Also, we try to work in two-week sprints, so if a ticket's going to take more than a day or two, that usually starts getting scarier in terms of estimating, scarier in terms of delivering on time, scarier in terms of holding up other people. So we try to break that down.

And the code review does resolve both the coding standards issues and whether things are actually functional. Usually writing any tests is built in at that level too. You're responsible for writing the tests for that ticket, for that piece of functionality. But again, someone needs to review the tests, right? Because it's really easy to have tests for all of your modules.
You can just add a little simple test that returns true, right? It doesn't actually need to test anything to have a test. What you really want to do is keep track of your code coverage. You want to know how much of the module you're testing. You want to make sure that someone else has looked at it and seen that you're testing what you actually care about. It's fine if you're testing things that you don't care about, but it's not fine if you're failing to catch things that you do.

Automation. Again, this is what a lot of people confuse with DevOps as a whole. But I think it's a really important component, particularly around that repeatability stuff. I think one of the keys, and again, this is one that really helped us, was having infrastructure as code. This is one of the hallmarks of the DevOps movement. A lot of people think DevOps just means Chef or Puppet. It doesn't, but you're able to do a lot more when you start looking at infrastructure as code.

As I alluded to before, we started out developing on MAMP, moved to a golden image, then eventually moved to a golden image built from Puppet, and then a golden image built on demand from Puppet using Vagrant. Does everybody here know what Puppet is? Most of you, OK, good. So for the few hands that didn't go up, it's sort of a way of... and Chef's kind of the same.
It's sort of a way of describing, as data, what should be installed and what should be configured on the server. Then you can run Puppet, and it'll compare what's there to what should be there and resolve the differences.

So having infrastructure as code really allows you to start experimenting, and allows more members of your team to be able to say, hey, I think we should switch from Memcache to Redis. Well, that's easy for a developer to say if it's not their problem: figuring out how you're going to deploy it, how you're going to manage the instances, how you're going to test that it actually works, how you're going to get it deployed onto everybody's local systems so that they can try out Redis, or, more importantly, how you're going to get it into production if you're running your own production instances. Having infrastructure as code really makes that easier, because they set it up on their local system using Puppet, they can repeat it over and over again using Vagrant, which automates doing that inside of a virtual machine, and then when it's ready, you can roll it out to production with the exact same code.

Testing. So who here does automated testing on their Drupal projects? Like a tenth of the audience, maybe. After code review, this is the next biggest thing that you could pull away from this whole conference, but definitely from this talk. Amitai is always big on saying, just have a test, even if that test just logs in. It's so much better than having nothing. It tests so many things. It catches the stupid merge conflict that you accidentally committed and pushed to master, right before anybody complains because they finally hit that dev site.

I think testing is one of those things where it's really easy to say:
It's too hard, or it's not worth it, because you sort of assume we don't really have good testing unless we have a hundred percent code coverage. Getting a hundred percent code coverage on a Drupal site... I mean, good luck. You're just not going to do it. It's not going to be worth it, in terms of being able to execute every one of those freaking conditionals from all of the permutations of all of the configuration options from all of the checkboxes that Amitai built into the Organic Groups administration, right? Forget it. There's way too much that you can do.

So I'm really big on testing the stuff that you care about, the stuff that you're worried is going to break, the stuff where your client might fire you if you roll out a new release and that thing doesn't work.

Also, there's a lot of things that you can do in Drupal where you're tying together lots and lots of really sophisticated pieces, right? You can take Organic Groups. You can index it into Solr using Search API. You can build a panel page that teases a part of a view that gets rendered from the Solr search results and lays it out. If the API of any of those two dozen modules involved in that process changes something, you can just have catastrophic failure. So that is a concrete and specific example of something that I'm always careful to test. Let's write a test that goes in, hits the search page, and makes sure that we're actually even getting results. And that has absolutely captured a lot of regressions.

Environments. Who here has multiple environments: development, staging, local? OK, good, just about everybody. I think Acquia and Pantheon have helped a lot. When I used to talk to people at conferences, it was amazing how many did things cowboy-style on the production server,
Or just had a dev server and a production server, and no one worked on local instances. I think having multiple environments is really key, and we'll talk about that more in a bit.

Yeah, question? I'm going to come back to that.

Cron is another thing that every single Drupal site needs to be able to run. Oh, so I just glossed over deployment.

This should be automated. Absolutely. If you're on Acquia or Pantheon, you know, congratulations, it pretty much is. Minus the fact that an amazing number of people still have a checklist of things that they need to go in and do manually on the site when they deploy their code. That might be worth it. Or, sorry, maybe it's not worth it to automate all of those steps. I've seen projects where it wasn't. It usually is. I think if you have a checklist of things that you need to do, that's not really repeatable, and you've done something really important: you've made an important mistake. You've made it a point of friction to be able to deploy your code.

So what you really want to be able to do is redeploy the development environment every single time somebody pushes code. We usually have an automated process every night that takes the production database and pulls it into the development environment and then runs the deployment step, which usually means reverting all the features and then running update hooks. So that happens every night, which kind of gives us that continuous integration: we know that nobody broke anything in production by changing something in the database, or doing something dumb that they shouldn't have done, that's going to trip up something on our dev site. So that just runs all the time.

And then we usually have a staging environment. When we think things are about stable, we deploy it to staging and prep that, and the client vets it. Or promises that they vetted it, and don't actually look at it, and then complain a lot when it goes to production. But they
complain less when you point out, look, it was deployed to staging three weeks ago. You said you looked at it.

Yeah, so you want to be able to roll your deployments all the time. If there's friction, you'll just do it less. That's just a fact. And so automating that stuff totally becomes worth it, because that's always the stuff that bites you. When you're in a different environment, there's human error. You're under a lot of pressure. That's where you'll forget a step. That's where you'll trip on something.

Cron is an important piece of automation, and so many people just go into crontab and set a cron job to run, you know, drush cron, and they call it done. Those people do not log in and check mail on the server to see if cron's been failing, generally, right? They just wait until they get a call from the client saying, my search isn't working, I can't find the content that I posted. And then they find out that cron's been failing for three weeks. Was it doing anything important over those three weeks? Let's hope not.

The way we manage cron is we have Jenkins run it for us. I'm not sure that that's necessarily the best way to do it, but it's a pretty good way. So Jenkins calls out to the server that is running the Drupal site and runs cron. What's nice about that is it captures not just whether it was successful or failed, not just contacting us if it is failing, but it also collects the standard out and standard error where the cron ran, so that you can go in later and see, if it did fail, did it throw an exception? Did it time out? What happened? And it also keeps track of how long cron is taking, so you can graph it over time and see, do you have some process that's going to become a problem later?
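The pieces listed above (success or failure, standard out, standard error, and how long the run took) can be sketched in a few lines. This is a minimal Python illustration, not the Jenkins job the talk describes; `CronRun` and `run_and_record` are hypothetical names, and the 124 and 127 exit codes for "timed out" and "command not found" are shell conventions borrowed here as an assumption.

```python
import subprocess
import sys
import time
from dataclasses import dataclass


@dataclass
class CronRun:
    """Everything worth keeping about one cron run."""
    command: list
    exit_code: int
    stdout: str
    stderr: str
    seconds: float

    @property
    def ok(self) -> bool:
        return self.exit_code == 0


def run_and_record(command, timeout=600):
    """Run a command; capture its output, exit status, and wall-clock time."""
    started = time.monotonic()
    try:
        proc = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
        code, out, err = proc.returncode, proc.stdout, proc.stderr
    except subprocess.TimeoutExpired:
        # Assumption: report a timeout the way the `timeout` utility does.
        code, out, err = 124, "", f"timed out after {timeout}s"
    except OSError as exc:
        # The command could not be started at all (for example, not installed).
        code, out, err = 127, "", str(exc)
    return CronRun(list(command), code, out, err, time.monotonic() - started)
```

Calling something like `run_and_record(["drush", "cron"])` on a schedule and sending the result to chat or a graph gives roughly the visibility described here. Jenkins provides the history, alerting, and duration graphs without writing any of this yourself.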
You can just kind of check in once in a while in your Jenkins instance and see if cron's starting to take longer and longer and longer. Some module's doing something stupid, probably crawling over all of your content or something, and at some point it'll fail. So being able to see that stuff before your clients do is really helpful.

Obviously backups, right? If backing up isn't an automated step, you're doing it wrong. Fix it. Again, if you're on Pantheon or Acquia, that's dead simple. You check a box in the UI. If you're on your own, Jenkins again is a good answer.

So, infrastructure as code. I think virtualization is extremely useful. A golden image is sort of less good. Again, Puppet, Chef, Ansible, for being able to see the differences, because then your server is in version control and you know exactly what's changing. And again, to adopt this, you don't need to start from scratch. You can take one of the existing projects. I work on one called Proviso. There's also Kalabox, where they've open sourced their Kalastack, their configuration of their LAMP server. At Zivtech we've open sourced ours, which has a lot of the same stuff that we're running on our actual servers, like production servers. You don't need to start from scratch. You can get something totally functional right out of the box and then make it better.

And Docker, I think, in some ways is sort of the future, because it allows you to have, you know, a single test server that can be testing heterogeneous software stacks. I'm working on a project right now to be able to have an environment stood up for each one of the pull requests, for each one of the tickets, so that you can go in and sort of see that stuff in the continuous integration environment. Because you can spin up any number of environments. They're really cheap. You can put them to sleep when nobody's looking at them. I think we're going to see a lot more around that.

Testing. What do we use? We use Behat.
That's our tooling. Behat is a project that came from Symfony. There's a whole bunch of great Drupal tooling around it. Jonathan Hedstrom and Melissa Anderson and a bunch of other people have worked on a Drupal Extension that knows how to create users and do a bunch of other things.

Behat is behavior-driven development. I think there's a couple of sessions about it here. If you can catch one, I highly recommend it. Basically, it comes from Cucumber and that set of Ruby tools. You can describe, in natural language, or a domain-specific language that's fairly natural: given some set of conditions, when a user does something, then this is what they should see. And that grammar really doesn't leave much room for ambiguity. It doesn't leave much room for writing flowery but unclear verse about how your Drupal site should work. It forces you to lay out: these are the conditions, this is the action, this is the resulting behavior.

I highly recommend Behat for doing Drupal testing, especially since you can sort of compose a lot of it from pre-existing steps and keep it readable even to the client. So our clients collaborate with us on writing the specs that become the executable tests that verify that the functionality that they care about continues to work forever. And it forces you to keep your documentation up to date, because the documentation is the set of tests.

Another thing that we've used to some success is CasperJS, sort of a framework on top of PhantomJS for controlling a browser. You can run these on Travis if you kind of get everything wired up right and if your tests don't take too long. I know Travis has a limit on exactly how long it takes, which is why we run our own Jenkins instances and run our tests there. So every time you push code, it hits Jenkins. It runs the tests.
It reports back to HipChat, which I'll mention again in a second.

If you're running PHP projects, and you're at DrupalCon, so you probably are, jenkins-php is an awesome template for setting up a Jenkins project. It's got pieces for doing code coverage, so that you, again, don't get freaked out if you're testing 20 percent of your Drupal code. That's way, way, way, way better than none, and probably most of what you're using. It's also got pieces for doing coding convention enforcement, so you can actually graph and see whether you've gotten more or fewer coding standards violations over time. If you're running this on your entire Drupal code base, you're going to have coding standards violations. I think there are still a lot even in core. But you're going to know whether you're making the world a better or worse place with each commit. And again, jenkins-php has some of that, just sort of the configurations and the Jenkins plugins that you need to do some of the code quality, Code Sniffer stuff.

Again, the point of this stuff is that you just want to be able to start rolling patches before there's a gaping hole that you're trying to cover. It's really better if you can know how something's going to go before you roll it out. It allows you to move a lot more quickly and just sort of always hit the ground running, so you can just kind of have release after release after release without actually losing any momentum.

I know it takes a second to really get what's going on in this GIF.

Any questions on the testing stuff?
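The smallest version of the search page test mentioned earlier (hit the search page and make sure results come back) might look like the following. This is a rough Python sketch rather than the Behat setup the talk actually uses; the `/search?keys=` URL, the `search-result` CSS class, and the function names are all assumptions that would need to match the real site.

```python
import urllib.parse
import urllib.request
from html.parser import HTMLParser


class _ClassCounter(HTMLParser):
    """Counts start tags whose class attribute includes a given class name."""

    def __init__(self, wanted):
        super().__init__()
        self.wanted = wanted
        self.count = 0

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self.wanted in classes:
            self.count += 1


def count_search_results(html, result_class="search-result"):
    """Return how many elements in the page carry the result class."""
    parser = _ClassCounter(result_class)
    parser.feed(html)
    parser.close()
    return parser.count


def fetch(url, timeout=10):
    """Download a page as text; raises on network or HTTP errors."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def search_smoke_test(base_url, keywords, fetcher=fetch):
    """True if the (hypothetical) search URL returns at least one result."""
    query = urllib.parse.quote(keywords)
    url = f"{base_url.rstrip('/')}/search?keys={query}"
    return count_search_results(fetcher(url)) > 0
```

The `fetcher` parameter exists only so the check can be exercised without a live site. A test this crude says nothing about result quality, but it does fail loudly when any module in the chain breaks the page.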
I know that's kind of a big topic that I sort of talked about at a high level.

Environments. Again, I think you should have at least three: that's dev, production, and a local for everybody that's working on the site. Dev ends up being an integration environment, because everybody's working on their local stuff and then they need to deploy it to the shared development environment. For us that usually happens in an automated way, either on Acquia or Pantheon, or we use a Jenkins job to, you know, see that the commit came in. Depending on how long the tests take and the nature of the site, either we run the tests and then deploy that commit if it worked, or just deploy the commit and kick off the tests to run in the background, because some of our Drupal test suites take 45 minutes, which is a problem in itself.

But yeah, maybe N. Having an environment for a feature branch makes a lot of sense a lot of the time, especially if you're developing some big change. If more than one person's working on a branch, I think that should have its own environment. That's dead simple on Pantheon with Multidev. If you've got that on your Pantheon account, you can just click a button and get a new environment.
Name it whatever you want. But it's really nice to be able to have that solved for you.

Automating environments. I think the best way to do it probably is to push as much as you can into an installation profile, so that you can install fresh and get everything there from scratch. We always start our sites from an installation profile. But like I said before, we usually end up with kind of a blessed database fairly early on. Our workflow involves getting the client in to start putting content into the actual site as early as possible, so that we can start getting feedback on the UX. So we usually start with the back end, building out the content types, building out the administrative interfaces. Then we start working on the front end, so that we can get those clients in there learning the system and giving us feedback on how it works for their business. But I know a lot of other shops are moving more and more to installation profiles, and the content comes from somewhere else.

Synchronizing environments. Having multiple environments isn't great if you don't have them synchronized in an automatic way. So again, that's mostly about your hosting provider, or if you want something open source, DevShop is an excellent provider that allows you to create as many instances as you want across multiple sites, multiple servers.
It's built on top of Aegir. Acquia and Pantheon have that stuff built in. You want it to be one button to synchronize an environment.

You can just use Drush. Drush has a subcommand called sql-sync, and another one, rsync, for file sync, which will move the database or files into place. I also built a Drush extension that we use called Drush Fetcher. I'll give you a quick commercial for it. Basically what it allows you to do is define all of the different sites and all of the different environments that you have in one place. It can either be in drushrc files that you can version control, or there's a Services module, so you can have a shared Drupal site where all of the different pieces go in one sort of cloud-accessible Drupal site. And then from the command line you can see the list of all the sites that are available and you can say, grab me this one, grab the database from the dev environment, from the production environment, from the whatever, and set me up a local copy.

The key is that you want this to be as effortless as possible, especially on local. If it's easy, you'll do it the right way more often. If it's not, you'll end up having, you know, oh, I can't actually test that right now because my database is three weeks old, right? You should be able to update your database while you're getting a coffee.

Features. Right, it sucks, but it's still the best we've got till D8 and configuration management solves all of our problems,
I hope. Failing that, there's also update hooks. There's lots of stuff that just needs to happen in the database still. Writing those hook_update_N functions, in my experience, is the only way that you can really roll out anything on a Drupal site.

Cron I already talked about. Backups, again, you can just use what's built into Drush and Jenkins if you're running your own hosting. That should happen automatically.

Measurement. So, you know, this whole idea of CAMS is a lot about sort of bringing the scientific method, repeatability, measurements, to Drupal development. Pretending computer science is really a science.

Oops. Montioring. Montioring is critical. Some of the things you might want to montior are uptime. This could be as simple as something like Pingdom, just hitting your site and letting you know if it's working. But ideally you also want to be monitoring your underlying services. You don't want to wait and see the whole site went down. You want to... I'm really glad that the chair is documenting that slide. What's that? Yeah, montioring also sucks. Monitoring and montioring, we need a hashtag for that as well.

So again, I think, kind of like automated testing, anything's better than nothing, and you need to ask yourself what good enough is. If you're on Acquia or Pantheon, you get their status page, and maybe Pingdom or something. One of the reasons that we still offer hosting is some of our clients, you know, besides that they want to run something that Acquia or Pantheon don't support, or they want it behind their firewall, they want better monitoring of their underlying services. They want to know how often Redis goes down, whether the database goes down. They want to know that immediately.

In terms of tooling, I've used Nagios, Icinga, and Sensu. Sensu is my current preferred answer to the monitoring solution.
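A check on an underlying service, of the "is Redis up, is the database up" kind, is usually a very small script. Here is a minimal Python sketch that follows the Nagios plugin exit-code convention (0 OK, 1 warning, 2 critical, 3 unknown); `check_tcp` and `main` are hypothetical names, and a real check would usually do more than open a TCP connection.

```python
import socket

# Nagios plugin exit-code convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_tcp(host, port, timeout=3.0):
    """Return (status, message) for whether host:port accepts connections."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return OK, f"OK: {host}:{port} is accepting connections"
    except OSError as exc:
        # Covers refused connections, timeouts, and name resolution failures.
        return CRITICAL, f"CRITICAL: {host}:{port} unreachable ({exc})"


def main(argv):
    """Command-line entry point: prints one status line, returns the code."""
    if len(argv) != 3:
        print("UNKNOWN: usage: check_tcp HOST PORT")
        return UNKNOWN
    status, message = check_tcp(argv[1], int(argv[2]))
    print(message)
    return status
```

Wired up with `sys.exit(main(sys.argv))` and pointed at, say, port 6379 for a default Redis install, a script like this reports through its exit status and one line of output, which is what monitoring tools that accept Nagios-style checks consume.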
I highly recommend checking it out if you're setting up monitoring. It takes a little bit of setup to get RabbitMQ and some of its other dependencies working, but once you have it, it's dead simple to roll out little pieces of checks, to add and remove servers. And you can just write little bits of code instead of wading through miles of configuration like you used to in the Nagios world.

Sensu. Sen-su. It's written in Ruby. It's short enough that you can read all of the code on your train ride home, probably. It's relatively small, relatively new, really pretty elegantly set up, and built for the cloud, where you have servers sort of coming up and going down and you want them to be able to join fluidly without having to roll out miles of configuration like you do in Nagios right now.

Another thing that's sort of harder to monitor, but useful if you can, is rate of change, however you want to measure that, whether that's number of signups per hour, number of click-throughs, lines of code being rolled out at any given time.

Some of the tools that we use: Google Analytics. You know, it's not great, but it does an awful lot, and if you've kind of learned your way around, it can be customized pretty well. Again, that's maybe less for your ops side. A lot of the time that ends up helping you to measure more of your content strategy and other things, but it can be pretty informative for a lot of different things. Again, Pingdom: really dumb, just is the site up or not. Sensu, that's how you spell it, really good for being able to keep an eye on, you know, is MySQL replication too far behind, is the Redis instance up, sort of name your thing. Highly scalable, really nice. Logstash for collecting logs.
We actually pipe it into Graylog for the UI, or you could use a service like Loggly. Graphite is a tool for graphing all of the things. It has a really nice scheme for being able to just chuck metrics into it in kind of a namespaced way, so that you can easily sort of aggregate them. Say, like, show me the number of hits per hour in this data center, on this environment, on this server. And you can sort of get as granular or as broad as you want if you group them properly. I've never used Librato, but it's a hosted solution, probably a lot easier to set up.

Yes? The Drupal module called Production Check, I haven't used that one. Production Monitor, I haven't used that one. I've used the Drupal Nagios module, which will let you know whether things are up, let you check on whether modules are out of date, let you check on when cron was last run. So we use that. And can you use that with Sensu as well? No, but Sensu can use Nagios plugins, so you can sort of get it working with the Nagios check.

Profiling. So just some things to kind of keep an eye on for keeping track of performance. Again, measuring all of the things is important, but these are just a few things that we try to keep an eye on. Slow query logs. You know, it's kind of a dumb check, but it is really informative with Drupal. A lot of the time that's where you end up losing a lot of the performance. Maybe even more, YSlow.
Being able to tell you why your front end's being slow, which external library the client required you to add is blocking the load of your page.

Just simple benchmarking, even something as simple as using Apache ab to just hit the site and tell you, you know, how many loads you can run in a minute. Don't forget to run it a few times and make sure you're not accidentally measuring when things aren't cached or something, or only when they are. And you can go much deeper. I've done some things with The Grinder or JMeter or other tools, but a lot of the time ab will tell you a lot.

If you need to get down into it, XHProf and Xdebug can be really good. XHProf you can actually run in production if you're running your own servers, and you can sort of find out how long you're spending in all the function calls. Initially, it'll seem like a bunch of gobbledygook that's not actually useful. But once you start to work with it for a while and get a feel for where Drupal spends most of its time, this kind of lets you learn a lot about when you're making things better or worse in your application.

And it's important to monitor in all of those environments, because if you're not monitoring in dev, it's kind of hard to have a good sense of whether you're making things better or worse before you roll it out to production.

Oh, monitoring in dev? Right, you might want to do some kind of stress test. Again, ab for really dumb tests, or you could use something more involved like The Grinder to be able to go through and do something sort of deeper. I don't have a ton of advice on that. A lot of the time I use ab, making sure that I'm not hitting the Varnish cache. That's in terms of profiling the dev environment. In terms of monitoring, it's sort of on the regular setup.
We don't do load monitoring on dev, but we'll just slam it and see how it holds up before and after we're making changes. Again, good enough. We usually do that where we know we might have a performance implication. If we're changing CSS, we don't go reprofile, right? It's just not likely to be the problem. So it's a matter of walking the line. You can't always predict what the implication of something's going to be, but a lot of the time you can kind of have a good sense for it, and you check when you're worried.

Yeah, stress the environment when we need to. Again, ab is the simplest way to just throw a ton of requests, with whatever tunable concurrency you have, and measure how fast you can serve pages back up. Again, infrastructure as code allows you to have your dev environment be very close to your production environment. If you're running cloud servers, you know, you can never totally isolate things from the noisy neighbor or whatever might impact the specific production instance, but you can get a really darn good idea. I'll spin up another eight gigabyte Rackspace instance with the exact same other specs, I mean, slam it and see what it can do, and that's usually pretty darn close.

And we do a lot of that, by the way: spinning up a new environment that has the actual production specs and testing a major release. Because, right, if you're running dev on a two gigabyte instance and your MySQL server alone has an eight gigabyte instance in production, you're not going to get a good sense of the real topology. And having all that stuff in Puppet makes it easy to be like, let me spin up 10 servers, I'll just pay for them for the next five hours while I do my tests, and then tear them back down.

So, sharing. One of the things: sharing responsibility. At Zivtech everybody's on call. Every once in a while it comes up: why don't we hire an ops person that can be the person that has to be on call?
And I'm always like, no, because I don't want that guy having to deal with everybody else's mistakes in the middle of the night. Everybody's on call. And that has totally been transformative for how confident developers are, and how much testing they want to do before they roll stuff out.

We host a lot of meetups, and we push other people to as well. That's another great way to level up your skills. You guys are all at DrupalCon, so I'm probably preaching to the choir about getting out there.

I thought I fixed that slide. Lunch and leers. It's a lawsuit waiting to happen. Lunch and learns, however, are really helpful. So what we do... I must have loaded an old copy of my slides. I'm so sorry, guys. I fixed both of those typos.

So lunch and learns are where we have someone from the team, and again we have junior developers and senior developers, more often senior, present, and attendance is optional, but the company buys lunch, so pretty much everybody stays to get free lunch. And I think it absolutely pays for itself in how much we level up the team, how much people learn, and it's a huge employee perk, because most of us are in this industry because we like learning new things. And so pushing other people to present, to practice getting up and talking in front of a group, to show whatever it is they're working on in their free time. Sometimes we'll, you know, do things that are way off topic, present on, you know, what you're doing with Bitcoin. None of our projects touch Bitcoin, but it's cool to learn about. Or more often, you know, show the monitoring stuff that we're rolling out, the new module that we discovered that we think everybody should be using.

Communication. Again, making sure that everyone's as redundant as possible, so that if you don't come into work tomorrow, things don't grind to a halt.

I know I think I'm just over time here.
So this is sort of my last slide. Email and chat. I highly recommend everybody use something like HipChat, Slack, IRC, something. I also highly recommend having a Hubot in there that can tie into your other tools. That's a chat bot. It's GitHub's Node.js project. It makes it really easy to be able to add on nice features and get, again, this isn't just from Hubot, but get an agent from your monitoring and from your testing in there. There are Jenkins plugins for IRC and HipChat, so that when your tests pass or fail, that pops up in the room for the project, so that everybody knows the current status of the builds, of the deployments. Every time anybody clicks the button in Jenkins to roll a new tag and deploy it to production, that pops up in that chat log. Which gives you this really nice history when you want to go back and sort of see what happened, because people are discussing the projects and you're seeing the automated steps pop up with notifications in the chat window. It's really nice for post-mortems: at 5:05 we deployed a tag, at 5:10 we were in there screaming about how the client was on the phone and furious, at 5:15 it was resolved.

And so the automated feedback, that's the key: getting those into that email and chat, so that you kind of have that event log of what's happened, and so that the whole team knows and is involved.

The other thing with sharing is giving everybody logins to any of the automated processes. Every member of the team should be able to get in and see the Puppet dashboard, see the Sensu notifications. We have a single sign-on proxy that sort of sits in front of all that stuff, so that you can do one single sign-on and get to all of the different services that we're using to do monitoring.

So in review: you don't need to roll all this stuff out at once, right? Anything is good. Anything is better than nothing.
And you need to keep in mind kind of what's good enough. You know, we started small and just kept adding a little bit, piece over piece, week over week, trying to level up our game, and pretty soon we had a pretty complete suite that was really making our lives a lot better.

If you don't have the right tools and the right methods, you know, you can kind of get by. You can sort of work around it. You don't necessarily need the keys to be able to accomplish your job. There's sometimes ways that you can kind of get around it, not do things the right way. But, right, you're never really going to reach nirvana. That's never going to get you riding through the sky on a unicorn, right? For that you need to level up your team, level up your game, and start rolling these pieces out.

Most of the GIFs came from devopsreactions.tumblr.com. I highly, highly, highly recommend that you follow that account. It's really good.

The end. Thanks, guys.