This is my puppet script: move this side of the room to the other side of the room. We really should have saved some time. Alright, is everyone sitting comfortably and able to see enough of the slide, or as much of the slide as they want to see? So I'm going to talk a little bit about SysOps. It sounds like a lot of you are kind of DevOps already: you do some development and some systems. I'm going to try to convince you, if you're not already convinced, that knowing a lot about systems and infrastructure will make you a better developer. So, some things you should know as a developer about infrastructure. And just a quick question: how many people are just developing modules and things like that? Themes? How many are deploying and maintaining entire Drupal sites? Yeah, that's a lot of you. So a lot of this stuff will hit home.

You want to know where your code is going to live. You want to know what resources it's going to need: memory, bandwidth, disks, things like that. You want to know how it gets deployed. Is it deployed with a Drush script, a Puppet script? Is it something you have to do manually? Is it some hidden Perl script in a directory somewhere that you don't know about? And how does it get maintained? How is it going to be upgraded? Who's going to upgrade it? Are you responsible for upgrading? Is it going to automatically upgrade from a particular Git branch? If so, it would be good to know that before you commit to that Git branch. In other words, you need to know about operations, or ops for short.

Knowing the system in which your code lives can make you write better code. If you know something about the bandwidth situation, you'll know whether your code needs to deal with a lot of latency or low bandwidth. Are you on a cluster of VMs behind a big load balancer with a big fat pipe? You can code a little differently. Understanding those issues will help. What's the storage situation? Is all of your disk storage on an NFS share over a slow link to Kansas City? It would be good to know that before you write your code and set up your assets. How slow is it really going to be? Do you have a way to measure how slow it is? Can you use a content delivery network? Can you cache it? Is memcache available? Is a key-value store like Redis available? Can you use a disk cache? What does the system your code is going to live on look like, and what can you do with it?

Knowing the system in which your code runs means you can plan. What version of PHP is on that system? What version of Drupal core? What libraries are available? Are you using a system library for graphics manipulation, and do you know if the correct version is on that system? What tools are available? Are they using Drush, or is it all manual? Do they have rsync and the other tools you might need to maintain your code? And yes, PHP 5.2 is still out there. Is your code able to gracefully handle the situation where it lands on a server with PHP 5.2? There are often reasons for old libraries, old systems, old PHP, and understanding the systems side of things, understanding why those things are in place, will help you deal with that situation: knowing when things get upgraded, why they get upgraded, and how they get upgraded. For instance, PHP 5.2 might be there because some core application needs it and they cannot upgrade because it's a shared environment.
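One simple way a module can handle the old-PHP case gracefully is to refuse to install with a clear message instead of fataling at runtime. A minimal sketch, Drupal 7 style; the module name is hypothetical:

```php
<?php
// mymodule.install -- hypothetical module name, Drupal 7 style.

/**
 * Implements hook_requirements().
 *
 * Refuse to install (and warn on the status report) when the host is
 * still running a PHP older than what this code was written for.
 */
function mymodule_requirements($phase) {
  $requirements = array();
  if (version_compare(PHP_VERSION, '5.3.0', '<')) {
    $requirements['mymodule_php'] = array(
      'title' => 'My Module PHP version',
      'value' => PHP_VERSION,
      'description' => 'My Module uses PHP 5.3 features and needs PHP 5.3 or newer.',
      'severity' => REQUIREMENT_ERROR,
    );
  }
  return $requirements;
}
```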
Understanding why that old PHP is still there will give you a lot more insight into what's going on on the systems side, and it makes the systems admins a little more sympathetic when you come begging at their door to upgrade to 5.3. So when are they going to upgrade it? Understanding that cycle, understanding when to expect an upgrade, will help you plan ahead. The same goes for security updates. Sometimes things get upgraded tonight at midnight because there's a security hole that was just found in some module you depend on. Does that affect you? If you don't understand systems, you don't know. So understanding what's going on on the systems side will help you plan ahead for that.

The other thing that knowing the system does is let you develop and test your code safely. How many people have a dedicated dev environment they can develop their code in? How many have a dedicated test environment? How many people's dev and test environments are just their laptop? There's a couple. Understanding systems will help you create and maintain those dev environments: not just a box with a LAMP stack on it where you can install Drupal, but an actual environment that completely mirrors what the production environment is going to look like. Because if it's different from production, you can test all you want, but you don't know what's going to happen when it actually goes into production.

The fastest way to learn infrastructure and get familiar with systems tasks is to build some infrastructure, and virtual machines make this very, very easy. There are tools that are free and easy to use on all platforms. VirtualBox and Vagrant are the ones I use; Vagrant is sort of a front-end framework for VirtualBox that lets you script and automate the creation and management of virtual machines. But there are VMware, Parallels, and other systems you can use. And there are configuration management tools. How many people are familiar with Puppet, Chef, CFEngine? There are a lot of them out there. If you're working somewhere that has sysadmins maintaining your servers, find out what they use and learn those tools. If you understand Puppet, you can apply the same rules to your own VMs that the operations folks are applying to the production environment. You can create a set of VMs that exactly mirror production, and when production settings or libraries change, you can easily change those on your virtual machines. If you're not deploying to a site and you don't have sysadmin people to talk to, learning these tools is very useful anyway, because it helps you understand how systems operate in the real world. So even if you're just developing a module, I really recommend looking into Chef and Puppet.

A great thing about virtual machines is that they are disposable. The other great thing is that you can build machines that exactly replicate your production environment; you can actually talk to your sysadmins, get the same scripts they use, and apply them to your virtual machines. And you can make lots of boxes. Make a test box. Make a dev box. Make an A test box and a B test box. Make dozens and dozens of boxes. And if you've ever, in frustration, wanted to go into your main server and do an rm -rf /, virtual machines are the way to do that: you can destroy them all. So I hope I've started to convince you that knowing systems is a good thing. It sounds like most of you are already at least halfway there.
So go forth and use your new sysadmin superpowers to develop better code. And with that, I'm going to turn it over to Rudy.

All right. So, be a devop: how to get ready for deployment. There are a few things to know. Are you releasing too frequently, or not frequently enough? There's a careful balance to maintain there. Knowing when it is time to actually deploy your changes to production is important, and having a simple, easy-to-follow deployment process helps keep everyone happy. If you're a devop, it just keeps you happy. Your dev side can rest easy knowing updates have been tested before they go out, if you have a good deployment process. Your ops side can rest easy knowing there's a simple deployment process in place, so there aren't a lot of things that can go wrong when your ops guy is deploying your code. You don't want to light your infrastructure on fire, so it's important to know your upgrade path and the dependencies that may come into play during your release.

So, reduce your moving parts. That's just a general "less is more" in this sense. To make things easier during deployments, try to decouple as much as possible. Avoid depending on things just to depend on them. Sometimes dependencies are inevitable, but understand them, whether it's a Drupal module depending on other modules or library versions, or, at a lower level, things like package dependencies at the OS level. For example, CentOS 5 comes bundled with PHP 5.1.6, I believe. It's very old. So how do you upgrade that separately from having to upgrade to CentOS 6? An example would be using an external, third-party package repository that has PHP 5.3. The IUS repository has this; it lets you upgrade to PHP 5.3 within CentOS 5 and test your code without having to actually upgrade to CentOS 6. Then when you're ready to upgrade to CentOS 6, you can do that separately from upgrading PHP at the same time. It reduces the number of changes required to deploy whatever you're trying to deploy.

When do we really need to upgrade? Security fixes and critical bugs, obviously those are important. These are generally smaller fixes, you hope, so they're easier to test; once they've been tested, deploy them when they're ready. New features are also important, but make sure to thoroughly test and take your time. If it's not a security vulnerability or something critical, take your time, test it, make sure it works. And if that means a deadline might slip, well, deadlines are important, but your PM probably has a contingency plan, right? Hopefully. We all make mistakes; testing before it's too late is key, but so is keeping it simple.

So what does your deployment process look like? I don't know how many of you are familiar with Gitflow; it's a fairly common Git development process. Over on the left is the keep-it-simple-stupid development process, which is what the GitHub style of development looks like. So what does yours look like? How hard is it to deploy your changes? There needs to be some rigidity here, because you want to be able to test easily, you want to know when things are ready, and you want to be able to easily move from dev to production in a way that doesn't end up with your development process being the cause of unplanned downtime, outages, and bugs in production because you forgot to merge the right branch, or you merged the wrong thing, or you didn't commit the right thing.
Just having a simple process helps keep the human error out of things. So what do you really need? Well, it kind of depends. For Drupal, Drush is your friend. Drush is a great example of an ops tool that works for both development and deployment. Web apps are wild beasts and there are many ways to deploy and maintain them; it's really a talk of its own. Actually, "How Oregon State University Manages Large-Scale Drupal" is a talk about exactly that, and it just happened, so if you were there, that covered exactly this. Just to touch on it, Drush is a great tool for testing your changes and getting things ready. There are commands like runserver and qd (quick-drupal) that download a quick copy of the latest Drupal core and run a built-in server, without having to have Apache installed on whatever you're testing on. Being able to rsync files from production to staging, being able to copy your production database to your laptop or your development environment or whatever it may be: Drush is extremely handy for your deployment process.

And finally, testing. I've already talked a lot about testing, so you know it's important. It's kind of a necessary evil and it can be tedious, but continuous testing makes things easier. Drush has commands that will run Drupal SimpleTests to test your code, if the modules or themes or whatever you're building have SimpleTests in them. And a tool like Jenkins, continuous integration, helps you continuously test those things; it will notify you when there are bugs in the dev branch that Jenkins is testing with Drush and SimpleTest and you've broken something. Automating that means you don't have to remember to always run it; you just know, okay, Jenkins passed the tests, so my software is ready to go into staging or into the full production environment. And when you fix a bug, and this is true for anything, write a test for it so you don't hit that bug again. You never know when some change you make brings that bug back, and you forgot to write the test and don't remember, or you remember and go, oh shoot, forgot to write that test. So have that ready and use continuous integration. On the ops side, we should test our infrastructure too. You can write system tests, behavior-driven-development-type tests, with Jenkins: things that check whether your web server starts httpd or Nginx or whatever your HTTP server is properly, and is running and serving Drupal. That could be a suite of tests you have for your infrastructure. Or: does my database server start MySQL with a working Drupal database? Is the web server able to connect to it? Tests like that we can do on the ops side too.
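As a concrete illustration of that "write a test when you fix a bug" habit, here is a minimal sketch of a Drupal 7 SimpleTest. The module, path, and issue number are hypothetical, and the .test file would need to be listed with files[] in the module's .info file:

```php
<?php
// mymodule.test -- a hypothetical regression test, Drupal 7 SimpleTest style.

class MyModuleRegressionTestCase extends DrupalWebTestCase {

  public static function getInfo() {
    return array(
      'name' => 'My Module regression tests',
      'description' => 'Locks in fixes so old bugs stay fixed.',
      'group' => 'My Module',
    );
  }

  public function setUp() {
    // Enable the module under test in the sandbox Drupal that SimpleTest
    // builds for each test run.
    parent::setUp('mymodule');
  }

  /**
   * Issue #1234 (hypothetical): the listing page used to 500 for anonymous users.
   */
  public function testListingPageLoadsForAnonymous() {
    $this->drupalGet('mymodule/listing');
    $this->assertResponse(200, 'Listing page returns 200 for anonymous users.');
    $this->assertText('Listing', 'Listing page renders its title.');
  }
}
```

A Jenkins job can then run the group against a throwaway site with something like drush test-run, and fail the build when a fix regresses.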
And once you have your deployment process defined and respected by everyone in your organization, it's time to go from development to production. With that, Greg's coming up to talk about going from dev to production.

First, I had no idea I was going to be on a game show today. That's a little different, so I hope the price is right. So, yeah. Once you've got your code written, and actually even before you write your code, as Ken and Rudy talked about, knowing what your production environment looks like and doing what you can to replicate it as closely as possible makes life a lot easier, both for you and for the systems people. And if you are also the systems people, you're making life easier for yourself. One thing to note is that development environments are very often small systems. Usually everything runs on one machine: your database server, your Apache server, all of that is on the same box, probably on a VPS somewhere or on, say, your laptop. So some assumptions creep in because of that when you're building something, and those assumptions might not map very well to a production environment where you're running, say, a load balancer with cache servers in front of it, multiple web heads, maybe some different internal caching, a database cluster rather than a single database server on the back end, and things like network storage. All of these are places you can trip up and end up with unexpected bugs, issues, or performance problems. The more you're aware of what the environment you're going into looks like, and the more you can try to replicate it, especially in your development process, the easier your life gets down the road. You spend a lot less time putting out fires in the end stages of a project, when the deadline is coming up and PMs are on your case to get things deployed, and you run into an issue where, oh gosh, it's not working properly because all of a sudden it's pulling a session from the wrong cache server or something like that. Keeping track of what your infrastructure looks like early on in the project makes your life a lot easier down the road.

So, some of the less obvious things I'd like to highlight to keep you from falling off that cliff. Be agnostic about the underlying systems. Don't assume you're running Apache. Don't assume you're running MySQL. Don't assume your database is on localhost. Because what's going to happen is you'll write something with that assumption, and then, oh gosh, this is getting deployed and the box is running Nginx. Or, oh, we're using Postgres or SQL Server on the back end. Suddenly you may have code in there that is going to completely blow up when you try to push it into production. The other thing is there are a lot of performance enhancements and opportunities where, if you write for them ahead of time, your code is going to work a lot better at scale. Things like taking advantage of slave databases: doing a split read/write, where you have a master database for all the writes, but you offload all your reads to a slave database, so you spread your load over the database cluster and your reads get a lot faster. Then if your traffic gets big, if you get slashdotted, you're okay, because you've already built that performance into the system. That's especially true when you're dealing with cache servers. It's very, very common; it's hard to find large-scale systems now that don't have some sort of load balancer or cache running in front of them.
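For the split read/write point, Drupal 7 has this wired in at the settings level. A minimal sketch, with placeholder hostnames and credentials:

```php
<?php
// settings.php -- sketch of a master/slave database setup in Drupal 7.
// Hostnames and credentials are placeholders.

$databases['default']['default'] = array(
  'driver'   => 'mysql',
  'database' => 'drupal',
  'username' => 'drupal',
  'password' => 'secret',
  'host'     => 'db-master.example.internal',
);

// Drupal 7 spreads queries marked for the 'slave' target across these
// servers; add more entries as the cluster grows.
$databases['default']['slave'][] = array(
  'driver'   => 'mysql',
  'database' => 'drupal',
  'username' => 'drupal',
  'password' => 'secret',
  'host'     => 'db-slave-1.example.internal',
);
```

Individual reads that can tolerate a little replication lag can then opt in with db_query($sql, $args, array('target' => 'slave')).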
Being aware of those front-end caches matters, especially being aware of how you deal with cookies and sessions with them, because you can run into situations where you're setting cookies and storing information in a cookie that really should be stored in the database or in session variables. By setting that cookie, more often than not you're invalidating the cache, because the cache sees the cookie and says, oh, I can't cache this, pass it off to the back end, and you're increasing the load on the back end for what is probably static content. So be really careful about what you do with cookies. And if you're ever manipulating headers or setting expiry times, which is uncommon in the Drupal world, remember that the front-end cache servers respect those expiries. If you set a very short expiry on a static asset, the cache server is going to respect that, keep querying the back end, and increase the load on the back-end servers, maybe unnecessarily. So be aware of that and be nice to your front-end cache by giving it the information it needs. For items that really do have a short lifespan, sure, set a short expiry or even a zero expiry. For items you know are static, that aren't going to change or change very rarely, set as long an expiry as you can, and you will make your life a lot easier down the road when the site's in production and load starts hitting it. It is very difficult to reproduce and test high-concurrency, real user traffic; you can do it, but it takes a lot of resources to simulate users. So by being ahead of the curve, expecting that sort of thing, and writing your code in a way that lets those front-end caches take the traffic, you make life a lot easier on yourself when load starts increasing, and you don't have to spend a ton of money spinning up extra instances to support the traffic.

Another key thing to note, especially when you're running multiple front ends, is sessions. Be really, really careful about that, because on large-scale systems your session table is often stored in something like memcache rather than the database, because it's so much faster. But be careful: if those memcache instances are running local to the web heads and aren't shared, you can end up in a situation where a user's session data gets stored on one web head, they hit the site again, and their session isn't on the next web head they hit. They're either going to be asked to re-authenticate or, worse, lose their session entirely, and you end up with very confused users. So when you're working with that sort of thing, make sure your ops people understand that you're using memcache, or storing things in a cache that needs to be shared across all the systems, so that you're using a centralized, shared cache infrastructure for things like your session variables. The other important thing is that very often, when you deploy to production, you may be sharing hardware with other systems. So set unique, site-specific keys for key-value stores like memcache, because there's nothing worse than hosing a site by having the web server pull session information out of the cache from the wrong site. Suddenly you have the theme registry from a completely different site, a very, very broken site, and a very unhappy user.
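With the contrib memcache module on Drupal 7, both of those points, the shared pool and the site-specific keys, come down to a few lines in settings.php. A sketch with placeholder server names and prefix; check the module's README for the exact variables your version supports:

```php
<?php
// settings.php -- sketch for the contrib memcache module (Drupal 7).

$conf['cache_backends'][] = 'sites/all/modules/memcache/memcache.inc';
$conf['cache_default_class'] = 'MemCacheDrupal';

// Point every web head at the same shared memcache pool, not at a
// memcached running locally on each web head.
$conf['memcache_servers'] = array(
  'cache.example.internal:11211' => 'default',
);

// Namespace this site's keys so it can share the memcache pool with
// other sites without ever pulling another site's data out of the cache.
$conf['memcache_key_prefix'] = 'mysite_prod';
```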
So being aware of these things, and looking at how you're deploying your systems, these are some of the places where you can trip up and end up with an unhappy manager yelling at you. I talked a little bit about this already, but when you're dealing with clusters, another important place to look is the files directory, especially CSS and JavaScript aggregation. If one web head generates the aggregated files into its own files directory and somebody then hits a different web head, those aren't going to line up properly. You need to make sure that things like the aggregated assets and the user-uploaded files are on a shared file system accessible to all the web heads; if they're not, you end up with broken JavaScript, broken CSS, and, again, unhappy users.

I hinted at this a minute ago too, but think about load on these systems. What happens when 10,000 users hit the site all at once? Are you ready for it? Is your code using cache efficiently? Is it using slave databases? Is it setting those headers properly so the front-end cache servers can actually serve that content and not overload your PHP web heads? At the same time, you want to make sure you're leveraging things when you have a situation like OSU has, with 1,000 sites running on the infrastructure. What happens when you start deploying more of those? Is that going to be a problem? How do you reduce the overhead of, one, managing all that, and two, things like your APC opcode cache? If you have 800 separate copies of Drupal, your opcode cache has to cache them all separately, because it doesn't know they're the same. Keeping track of that, and understanding what your target environment is and where the bits and pieces are, makes your life a lot easier down the road and makes it much more likely that, especially under load, your site isn't going to fall flat on its face.

And then lastly, this is my own personal pet peeve; I've run into it I don't know how many times. Do not assume that you're running in your own dedicated Drupal instance: you may be running in a multi-site environment. Don't hard-code paths to sites/all, because as often as not it's entirely possible you're running under sites/something-else. By assuming sites/default or sites/all, especially in themes but also in modules, you're going to create problems, because the folks actually deploying to operations are going to have to find all those references and change them to point where things really are, or do some crazy symlinking or something like that. That makes them grumpy, then they don't like you very much, and then they don't deploy your code as quickly as you'd want. So keep that in mind: check your expectations and your assumptions at the door. Work closely with the people managing your infrastructure, if other people are doing it, and understand what they're doing, so you can make their lives easier, because that makes your life easier.
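The usual fix for the hard-coded-path problem is to ask Drupal where things live instead of assuming. A small sketch, Drupal 7 style, with a hypothetical module and file names:

```php
<?php
// Hypothetical module code, Drupal 7 style.

// Fragile: assumes the module lives in sites/all and breaks under
// multi-site layouts like sites/example.com/modules.
// drupal_add_js('sites/all/modules/mymodule/js/widget.js');

// Portable: ask Drupal where the module actually lives.
drupal_add_js(drupal_get_path('module', 'mymodule') . '/js/widget.js');

// Likewise for user files: use the public:// stream wrapper instead of
// hard-coding sites/default/files, and let Drupal resolve the real URL.
$logo_url = file_create_url('public://logos/site-logo.png');
```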
And this wouldn't be a DrupalCon without the constant plea: don't hack core. I know it's really, really easy to just tweak this one little thing to make it all better. But when you're deploying, especially in large-scale environments with automated deployments, where you have systems that are automatically upgrading and automatically pulling down security updates, your hacks are going to get wiped out, besides the fact that it's just bad practice anyway. So really do think of the kittens, because hacking core, while it is easy, and oh, it's just a small site and I'm just changing one line, it'll be fine... yeah, it's going to come back and haunt you down the road. However many times we have to say it, again and again: don't mess with core. Put everything in site-specific directories for your themes, your modules, all that sort of stuff, because hacking core will kill you down the road.

We wanted to leave plenty of time, and we actually moved kind of quickly, which is good given all the problems, but we do have some time for questions. There's a microphone up here at the front so that we can actually hear you, because it is awfully noisy in here. Chris, come on up. We'd love to hear your questions. Oh, we've got opera now. How exciting. Didn't realize we were going to get an aria on top of everything else. Many skills; singing is not one of them.

So you were talking about optimizing cache and headers and that sort of stuff, and that's kind of from the developer's perspective. I'd like to turn that around from the operations perspective: how do I make sure that my developers are actually doing all of that for me?

That's a good question. From my point of view, when I'm going through and looking at code I'm getting ready to deploy, when I'm working with my front-end team and my developers, usually the first thing I do is hit the site with Firebug enabled and look at the headers on the page. I want to see what's there. It will tell me really quickly: if I pull up the headers on, say, the CSS and look at those, I can say, oh gosh, there's a cache lifetime of zero, or there's a cookie set on that. I also generally have some debug headers I inject at the Varnish level that tell you whether it was a cache hit or a cache miss and why. All of those things are usually pretty easy to check, so you just hit the site once or twice with Firebug and you can see that, and you can do it in staging or in development long before it ever goes out, even if you have a separate development team. The other thing you can do, especially if you're running something like Varnish, is use the log tools: use varnishlog to watch the back-end requests, because there's a lot of information there about what is going to the back end, why it's going to the back end, whether it was a cache miss or a hit-for-pass, things like that. Between those two things, usually within a few minutes of just hitting the site I have a pretty good idea of what needs to be tweaked, so you can flag it for the developers.
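On the developer side, the things that show up in those headers mostly come down to a couple of knobs. A hedged Drupal 7 sketch; the page callback and its helper are hypothetical, the variables are the stock performance settings, and the exact interaction with Drupal's own page cache depends on your setup:

```php
<?php
// settings.php: let anonymous page responses carry a real max-age so
// Varnish (or any front-end cache) is allowed to hold them.
$conf['cache'] = 1;                     // anonymous page cache on
$conf['page_cache_maximum_age'] = 900;  // Cache-Control: max-age=900

// In a hypothetical custom page callback that serves content which is
// safe to cache publicly for a day:
function mymodule_feed_page() {
  drupal_add_http_header('Cache-Control', 'public, max-age=86400');
  return mymodule_build_feed();  // hypothetical render function
}
```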
Hi, Mark Dorson from Martha Stewart. We have a custom deployment tool that takes the site offline when we do our deployments, which we're not very happy with. We have a number of ways to mitigate the effect of this on our end users, but there are some parts of the site where that's hard, such as search, with very unique queries that haven't been cached previously, where it's still a huge pain point when the site has to come offline. So I was curious whether you have any strategies for deployment that either mitigate or eliminate the need to take the site offline.

I'm answering all the questions; I'm happy to do it, but if you guys have something, go for it. There is one trick I've used in the past, and depending on how your site's structured it may or may not work, but usually, at least for Drupal core upgrades and most of the time for module upgrades (and obviously test the upgrade first in a test environment), you can almost guarantee that the schema changes in the database that need to happen for the upgrade can happen before you actually upgrade Drupal core. So you can separate out the database upgrade: by using your hosts file or some test site pointed at the latest version of Drupal, or with the latest modules installed, run the database update against production but leave the production site pointed at the older version of whatever you're running. Then when it's time to upgrade, you don't have to run the DB update at that point; you can just push your code into production and reduce the time the site is offline during the upgrade process.

Another thing you can do, kind of related to that, especially if you're running multiple web heads behind a load balancer: take one of the web heads out of rotation, upgrade the code there, test it again, run the DB updates, make sure it's working okay, put it back in rotation, and then do a rolling update through the rest of them. Depending on how your site is built that may not be the best solution, but for a lot of sites it works quite well, because you can stage it. You might have a period during that update where some users get the old site and some users get the new, but Drupal is generally pretty good about backward compatibility there.

So neither of you would be concerned... I mean, you're both kind of saying update the DB first, whether it's completely separate or on a single web head. You're not concerned with having a newer schema with the older code pointed at it?

As long as you test first, you'll catch it if it breaks when you're testing, but in my experience it almost always is backward compatible: you're adding an index or an extra column or something like that that the module doesn't know about, because you haven't actually updated the code in production yet. So you can do that, and, like Greg said, with multiple web heads take a few of them out of rotation, upgrade those, switch back, and do it on the fly. But definitely test that through with Jenkins or some other continuous integration first and make sure it actually works like that. Thank you.
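That backward-compatible, additive schema change usually lives in a hook_update_N() implementation. A minimal hypothetical Drupal 7 example, with made-up module and table names:

```php
<?php
// mymodule.install -- hypothetical update hook, Drupal 7 style.

/**
 * Add an index the new code will use. Purely additive, so the old code
 * keeps working against the updated schema while the release rolls out.
 */
function mymodule_update_7001() {
  if (!db_index_exists('mymodule_items', 'created')) {
    db_add_index('mymodule_items', 'created', array('created'));
  }
}
```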
Hi, I'm a system administrator with OpenConcept Consulting. I was looking a while back into Vagrant and VirtualBox, and I like the idea of virtualization, and I like what you were saying about building a disposable system that identically mirrors my production environment; that's a really great idea. But when I was looking into it a few months ago, VirtualBox on my laptop ran like crap, and I just couldn't get the performance out of it to even do development.

I know they've made a lot of improvements to it. I know Vagrant now has support for VMware, so if you want to use that, you can. There are also plugins out there, so if you have, for example, an AWS account, you can spin up EC2 instances using Vagrant, so you don't have to run something local. So there are various options with Vagrant. I haven't personally tried it outside of Vagrant with VirtualBox, but there are a lot of options, so you might check that out, maybe even just running it on a different system.

Another thing: the disk I/O in VirtualBox is utterly horrible, and that will kill you. What we've discovered is that, because it has support for NFS mounts, if you configure the Vagrantfile to tell it to mount the web root as NFS, even though it's a local NFS mount, it makes a huge difference.

In production, are you hosting the NFS mount from within the VirtualBox as well, or from the host system?

Basically what ends up happening is it configures your host machine to offer up whatever directory your code is sitting in as an NFS share, which the VirtualBox then mounts, and it's vastly faster. The one disadvantage is that, at least on Mac and Linux, it requires you to provide a sudo password when you're spinning up the VirtualBox, because it has to sudo to export the share that the VirtualBox then mounts. That just means you can't blindly spin up boxes; when you spin up a box you have to type in a password.

I've had problems with NFS on macOS in the past, but that's a whole other can of worms.

The VirtualBox shared-folder mechanism is terrible with bandwidth, so I would not try to transfer a lot of files over it; doing the NFS mount is definitely the way to go.

That's an interesting plan.

The other thing, just as a follow-on to that: one of the things I've done with my developers is we now ship a Vagrantfile in the root of every repo. For every project, we create a Vagrantfile that will spin up a box and mount the web root of the repo, so that's all our developers have to do when we create it, which is great. It's actually saved me a lot of headaches with our front-end folks: they're able to clone the repo, do vagrant up, and immediately start working on the box without having to worry about any of that, and they're also working in an environment that's much closer to production.

Can you speak to managing, testing, and deploying configuration changes that live outside of the Drupal root? Like if you have a config file change that you want to test and then promote through development, staging, and production?

Are you using configuration management? That's kind of the key purpose of configuration management: to give you a central place, in most cases a central Git repository, where your configuration lives, and then a tool like Puppet or Chef to take that configuration, depending on how you've templated it or put it into your configuration management, and deploy it places. With a tool like that it's very easy to, say, with Chef, spin up a virtual machine for testing and assign it the same role and recipe that you're using in production, to test your change before you actually roll it out. So you can use configuration management to stage your dev, staging, and production environments and actually test configuration changes to your core infrastructure before you push them to the production environment.
Hi there, my name is Albert, I'm from TP1 in Montreal. I have a question about what exactly I should ask of my ops people, because they're in a different building and I don't even know them, and often I ask some questions but I'm not exactly sure what to ask. So, if I understand you correctly: I use Jenkins with Vagrant to set up a virtual machine, which then puts on a LAMP stack with Drupal, and I test against that. So it would be the Puppet manifests that I would ask my ops people to give me, and then use those to populate my virtual machine? Is that a good way?

Quite possibly, yeah. Vagrant definitely supports that nicely. The other thing you can do, specific questions aside, is just get to know them personally; sometimes putting a face to the name helps a lot. But even just getting a list of what the standard deployment environment looks like as far as versions: what distribution are they running, what package versions are they running, how are they managing that. And, first of all, systems guys love having people ask us, well, how do you do that? So just going in and asking them, what does your process look like, what can I do to make the process work better for you? Even if it's not a specific question, sometimes just starting with the general question of what can we do to make this process easier for you will very often open up the discussion and get it going. But if they can open up the Puppet repo to you so that you can see what the Puppet configs look like, great. Even just grabbing the package versions and distro versions and things like that is usually a great place to start, because then you can work that into your Jenkins and Vagrant install so that you're at least replicating somewhat closely what they're doing. And if they're wary about giving you access to the Puppet repo, you can either ask for read-only access or say, can you just give me a VirtualBox VM that's set up like production, using your Puppet or whatever, already done, and then we can use that as our base box and replicate from there. That way you have something that's as close to production as possible.

Hi, I'm Selene from WebEnable. I want to know what components you guys are using in your stack. You mentioned Vagrant, you mentioned a cluster: how is the cluster set up, does it scale out, does it scale in, does it do it automatically, is it hosted internally? And how many developers do you have, do you use Gitflow? I know that's a lot of questions. I don't know if you have a starter guide for new developers or new sysadmins or new devops people that we can access from your site, something that walks us through your environment and your components; that would be really nice to have.

I don't think we have anything posted right now that describes the stack, and we also have many different stacks that we're managing. One I can talk about is the Oregon Virtual School District, which is K-12 hosting of Moodle and Drupal sites. We've developed a hosting environment for them, and it's multi-tiered: there's a pair of load balancers in front, configured with a virtual IP, with HAProxy doing SSL termination to a dynamic number of web nodes on the back end. There are, I think, six right now that just handle requests, and those are using Nginx and PHP-FPM. But it's very specific to the needs of this particular client of ours, so depending on what you need and what you're doing, instead of Nginx and PHP-FPM you might go
with Apache and mod_php or something like that. For this we're using Nginx and PHP-FPM, mostly for all the static content and static files they have hosted. On the database side we're running Percona 5.5 in a master-slave configuration with a virtual IP, but with a very manual failover process, because we don't want a failure to happen and start split-braining, switching back and forth and losing student data, because there is student data in this database. So that's just one large database server with a large slave, plain master-slave replication. As far as replicating the files across the clustered environment, we're using GlusterFS.

I was going to say, yeah, it was NFS or GlusterFS.

And GlusterFS has worked really well in this environment. It's a little more complicated in that we have a fairly standard Drupal Git repo on GitHub that has the core Drupal for ORVSD hosted on it, and that is replicated to the web nodes via Git, so they're kept in sync with Git. Then the sites directories for the deployed sites are actually on GlusterFS, so there's a symlink out to GlusterFS for sites/default and sites/all, and for the modules, themes, and files for the sites; everything else is kept on the local file system.

And what tools are you using to manage all this stuff? Just shell?

Chef is in progress.

Well, I was going to say, just shell and Bash or something?

A lot of that is through Jenkins. Jenkins is kind of the master controller for all of these web nodes and the deploying and upgrading and all of those things. We've written some Jenkins jobs that do the deployment when a certain branch is pushed to; for updates it will deploy, git pull on all of the sites, update those, and then run through the process. So there's the ability to test beforehand, there's a staging environment set up, and all of these other things, and Jenkins is kind of the mastermind behind that, in addition to CFEngine and Chef for the configuration management of the actual operating system and PHP and MySQL and all of those tools.

And how many developers and sysadmins do you have?
There are two sysadmins and one developer full time, and we have a team of students. I think there are about 16 developer/systems students right now that we utilize, and they work up to 15 to 20 hours a week.

Oh, not bad. Okay, thank you.

Just a word on our student developers and learning what to ask the systems folks: it goes back to the point that once you start trying to put things together, you will know what to ask, because you'll understand what it is you don't know. What we do with our student developers is just plunge them into the code. We bring them in, we orient them on the systems and how they work, we start assigning bugs, and they're dropped into the middle of the code base. It's similar on the systems side: they're in the middle of the configuration management, with tasks they have to do, and they learn how to do those tasks, and that's how they learn. That's my advice for setting up virtual boxes or environments and learning the systems stuff: just start doing it and you'll figure it out. Time for one more.

Does Varnish provide visibility into which pages are cached, and how can a developer ask Varnish to recache a certain set of pages?

The first question: not really. Theoretically there are ways; probably the easiest is actually just to request the page, and as long as the Varnish config is set up to set a header, it'll tell you whether it's cached or not. That's probably the simplest way, but it's a little bit brute force. The second half, though, how does the developer tell Varnish to flush a page: fortunately there is a very good Varnish module available for Drupal, but you can also do it manually. There's a command-line interface, a control port you can connect to, where you can send basically a flush command with a regex describing what page or group of pages you want to invalidate, and it will drop them out of cache; the next request that comes in will then pull them fresh from the back end and re-cache them. But the Varnish module is great: it's rock solid, stable, does a very good job, and it's available for both Drupal 6 and Drupal 7.

So will it cache pages for authenticated users as well?

It can, but it's generally not recommended for the most part. When you're dealing with authenticated users and Varnish caching, what's probably going to happen is all your static assets, your images and your CSS and all that, will get cached by Varnish, but the actual PHP page content is going to pass through to the back end. There are some interesting tools that Varnish supports that can get really interesting and esoteric, where you can actually start caching authenticated user traffic. There are also things you can do with cookies, using the cookie as a cache hash, so you can cache authenticated user content for maybe a shorter period of time. But generally you're going to end up with frustrated users, because what will happen is they'll hit something like a page they just edited and it comes back cached, which means they're not going to see their edit. So for the most part you want to try to avoid caching authenticated users when possible, but there are ways to do it.
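For the recache question, one common pattern, separate from the contrib Varnish module's control-port approach mentioned above, is to have the application send an HTTP PURGE for the path that changed. A hedged Drupal 7 sketch; the helper name and hostname are hypothetical, and it assumes your Varnish VCL has been configured to accept PURGE requests from the web heads:

```php
<?php
/**
 * Hypothetical helper: ask the front-end Varnish to drop one cached page.
 *
 * Sends the PURGE through the cache's public address so the Host header
 * matches what Varnish has cached; returns TRUE if Varnish acknowledged.
 */
function mymodule_purge_page($path) {
  $url = 'http://www.example.com/' . ltrim($path, '/');
  $response = drupal_http_request($url, array(
    'method'  => 'PURGE',
    'timeout' => 5,
  ));
  return isset($response->code) && (int) $response->code == 200;
}
```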
What about if you just load the dynamic content with Ajax?

Yeah, possibly, depending. There are ways to do it. You'd probably have to watch for that in your Varnish config, so you'd want to work with whoever's managing the Varnish server, because you're probably going to have to put some custom code in the Varnish config. But there are ways to do it; it's interesting and complicated, but it's possible. Thanks.

Well, thanks for coming, and thanks for putting up with all the hassles and the noise and the echoes. Enjoy the keynote. We'll be around all week, and we also have a presentation on Thursday about ORVSD if you want to check out what we're doing with that. Thanks.