Static works. It's dry up here, I don't know if you've heard that. So we're a little bit early, but we're just sitting here, and I was thinking that you guys got here early and maybe you're checking your Twitter streams, but I thought I could also share a story of a recent customer attack to set the mood for the importance of monitoring and the role of monitoring in site operations.

So this particular customer was running a Drupal site, of course, and they noticed that they were getting a lot of reports about spam coming from their domain. It all spiked on, I forget what day it was, some day in the middle of January. They thought, okay, what's going on that could be leading to a lot of spam reports coming from our domain? We started to dig in and noticed that there was an extra PHP file inside of their files directory that was getting called a lot. We did more investigation into that and found that the file was uploaded at a particular time. The referrer on the request that uploaded the file was a Google search for sites that had the PHP input format available to anonymous users. So we followed where that request ended up and found a particular form on the site, unrelated to the input format system. So this custom module was bypassing the permissions associated with input formats and allowing PHP execution for anonymous users.

So, a pretty interesting situation, and I think it's pretty illustrative of what can happen to your site, and of how monitoring for spam going out could be an indication of an arbitrary code execution vulnerability somewhere else in the site. So how many people want to admit to having that happen to them? One honest guy, good job. Thanks very much.
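The kind of check that would flag the rogue file in this story can be sketched in a few lines. This is a minimal illustration, assuming a Drupal-style public files directory; the function name `find_suspicious_uploads`, the example path, and the extension list are hypothetical choices, not something shown in the talk.

```python
import os

# Extensions that should normally never appear in an upload directory.
# This list is an illustrative assumption, not an exhaustive one.
SUSPICIOUS_EXTENSIONS = {".php", ".phtml", ".php5", ".cgi", ".pl"}


def find_suspicious_uploads(files_dir):
    """Return sorted paths under files_dir whose extension looks executable."""
    hits = []
    for root, _dirs, names in os.walk(files_dir):
        for name in names:
            ext = os.path.splitext(name)[1].lower()
            if ext in SUSPICIOUS_EXTENSIONS:
                hits.append(os.path.join(root, name))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical path; point this at your site's public files directory.
    for path in find_suspicious_uploads("sites/default/files"):
        print("unexpected script in uploads:", path)
```

Run from a scheduled monitoring job, a non-empty result could raise an alert long before outbound spam reports start arriving.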
I was glad Greg wasn't looking at me when I raised my hand, so I'm with you, whoever was out there before. But maybe the moral of the story is buy the book: if you're going to be writing custom Drupal modules, you should buy the book on Drupal security by our friend Greg here. Thank you.

Well, welcome everybody. We're so excited to be here. I hope all you hackers have gotten your afternoon coffee so that you're all as full of energy and looking forward to talking about keeping the lights on, operations and monitoring best practices, as we are up here. We're certainly excited about this talk. We believe that this is the fun part of websites. Certainly the design and the UI and UX are so much fun and they're super critical, but that's not the part we're focused on today. We're focused on the infrastructure part.

Before I dive in, my name is Ned McClain. I'm the CTO and one of two co-founders of Applied Trust, up here in sunny Boulder, Colorado. Love being in the area, and it's fun to see all these smiling faces out here joining us in our beautiful state. At Applied Trust we're an infrastructure company. We do a lot of operations, we do a lot of assessments, and we do a lot of architecture work. Being diverse in all those different areas means, I think, two things that are relevant to this talk. One is that we just see a lot of environments. We see what the best practices are coming out of hospitals and financial institutions and people that have to take care of really sensitive data. We do a lot of work with power plants that are generating power and have to comply with the national standards for securing that national grid infrastructure. So we have a perspective to see what those very best practices are. If you had unlimited time and money, if security and availability and performance were your number one priority, how might you achieve those?
We also work with startups and small organizations and nonprofits and government entities, and so we have a good pragmatic perspective as well. I guess that's the other part of being an operations company: we do carry a pager, we try to make us all carry pagers, and we're sort of the people that have to eat our own Wheaties. So while we'll be making recommendations up here for what we think are good best practices, we'll be trying to balance those with what's pragmatic, what's realistic to do. We all have features to get out there, websites and functional applications to launch, and we have to balance that with these aspects of operations. That's the perspective we're going to try to bring.

I brought two celebrity chefs along with me. We're going to tag in and out to try to keep this exciting for the little one hour we have together. First I'd like to introduce Trent Hein. As I mentioned, he's my co-founder at Applied Trust. We're celebrating our 10th year in business, something we're really proud of. Thank you so much. A couple of fun Trent stories: I know he has the distinct honor, on this stage, of being the person who wrote code the longest ago that's still in use today, and I would postulate that perhaps he's even that person in this room. That's back at UC Berkeley, when he was a student and wrote a lot of code contributing to the BSD branch of UNIX, before Linux even existed, and specifically the mtree code. Thank you, Trent. Good contribution.

The other two quick things I wanted to say about Trent: he's certainly one of the people who handled the very first internet security incident ever. Back in 1988 the Morris Internet Worm broke out and infected something like 80 or 90% of the internet. Granted, the internet was a lot smaller, but still it was a tremendous infection and certainly the first global internet security incident of its kind ever.
At CU Boulder he was one of the team of engineers that did the forensic analysis to determine how the worm was spreading, what ports and services were vulnerable, and how to shut those down to try to get the internet back online. That didn't happen in a matter of minutes; it was a matter of days to get the internet back. So that's a legacy experience and perspective he brings to this talk today. Trent's involved in the community a lot today. He's involved at CU Boulder in the computer science department there, helping to establish curriculum for the next generation of computer science people. So I'm really excited to have Trent here.

Also here is Greg Knaddison. If you have ever spent any time on Drupal.org... it's hard to know where to start to comment on Greg. You'll see greggles, his user ID, all over the place, and that's contributions from stuff like documentation and process, and gosh, he's always on the forums saying, hey, maybe here's a better way to handle that really snarky response you just gave to this new user. He's like the most patient forum person, but he's also super, super technical. Every day most of us use the Pathauto module to make our URLs really sexy for the users and search engines out there, and Greg was the initial contributor of that module, so every day we're using his code. On the other end of the spectrum, I was recently using the Plupload module, which allows you to drag and drop, using HTML5 goodness, a whole bunch of images into your image CCK field or whatever, and Greg was the original author of that module. So Greg has a ton of technical and hands-on experience with Drupal. Greg's also on the Drupal security team, so he deals with core vulnerabilities and contrib vulnerabilities, with getting those notifications out there, with managing the processes around, you know, how do we get a responsible notification about a vulnerability, when do we disclose that to the public, how
do we make sure people are applying the patches when we announce them. So a lot of both political and technical work there, and he brings a great perspective. He's also the co-founder of Growing Venture Solutions (GVS) and also of DrupalScout.com. Check out DrupalScout.com if you're looking for something to do during our talk. And now he's the director of security services at Acquia. So Greg is certainly a big hitter in the Acquia field, or rather in the Drupal field. Let's welcome Greg.

Awesome. And you know, we have a good-sized group today and only an hour, so if you have something that you're dying to say and you think it really impacts the group, please speak up and we'll definitely holler at you. But if you want to chat about this stuff, or something specific to your site, afterwards or online, we'll have our Twitter and email stuff at the end of this talk, and of course we're on Drupal.org.

Quick roadmap: we're going to split this into three sections. We're going to talk about monitoring, management and measurement; we're going to talk about security testing and monitoring; and then ongoing operational security. I have heard the comment, gosh, that's a lot of security content for an operations-focused presentation, and I wanted to take a minute to say that we as organizations, or we as Drupal site managers and developers, don't do this stuff because we're good Boy Scouts or to tell our friends that we're following the best practices in operations. The goal here is to take our resources, our limited human capital, and allow them to do the most useful stuff possible for our organization, and that useful stuff is not doing break-fix all day long. There's a really well-known study out there from Microsoft, and it's a really great study of organizations. High-performing organizations that have good change control processes, that have monitoring in place, they know what's happening in their organization, they're
applying patches, they have those basic operational processes in place: they spend 20% of their time doing break-fix and 80% of their time doing projects and deploying new functionality and all that awesome stuff that we love to do. At the other end of it are organizations that don't have that stuff. They spend 60 to 80% of their time doing break-fix, and that's the stuff that makes us not like our jobs: dealing with, we got hacked, our site is slow, something is down. That stuff will happen, but if it happens one fifth of our life instead of four fifths, we're going to be happier people. So the goal here is to put these things in place that make us more efficient, that allow us to focus on the fun stuff and not spend all our time dealing with patching and someone asking what happened, what's in the logs. That's really the big picture goal here.

A story I just heard recently that I have to retell is about Paul O'Neill, the Treasury Secretary. In 1987, before he was Treasury Secretary, he took over as the CEO of Alcoa, the aluminum company. They're a failing company. They have terrible quality problems, all the people who want aluminum are going overseas to get it because Alcoa is not producing quality aluminum, their efficiency is down and their stock price is dropping. Paul O'Neill comes in and, instead of focusing on quality or efficiency or whatever, he says the number one thing we're going to focus on is worker safety. We're going to make sure that we have no accidents here at Alcoa. And this is 60,000 people all pouring hot metal out of drums, right? It's not the safest place to be working in general. And he's like, we're going to focus on safety. The thing that came out of that was a keystone habit for these people: focusing on safety meant we had to follow our processes. If we're going to double check something, we're going to double check it
every time, and if the temperature of something's going to be x, we're going to make sure it's that, for safety reasons. But the outcome was that Alcoa's business exploded, right? In a time of industrial, you name it, for America, Alcoa is actually quite a successful company right now. And also, it's safer to work at one of Alcoa's 83 aluminum plants than it is to work at a computer job where you type. You have less chance of getting a physical injury working at one of these Alcoa plants, because they're still focused on safety. So while we're going to talk a lot about security, the point is that those security best practices establish an environment where we have control, where we know that our site's not crashing because of missing patches, where we know that our site is stable and performant, and we know where the weak points are and where we need to focus when there are issues.

So with that, let's take a step back and look really quickly at the evolution of industry itself, specific to pollution. At one time in America, smokestacks on the horizon in your city were a sign of progress. That was a sign of industry: this is the place to move if you want an awesome job in a factory, come to the city, we've got smokestacks. And Greg offered up this awesome detail. This is a recent photograph from a hyperlocal site driven by Drupal here in the Denver area, and in this photo on the top right, while it's hard to see, the Denver seal, which I'm sure you can Google, has a smokestack in it. So even Denver was advertising, oh, look at this beautiful place, we have mountains and prairies and smokestacks. So there was a time when industry meant pollution, and that was okay. And then that evolved. The first big progression there was all these regulations from the EPA, regulations about emissions that said, oh, you can only dump so much mercury into the river. And so everybody dumped as much
mercury as they had into the river, up to that level, right? And we've seen that evolve again, to where now it's a marketing and brand strength to say we're green. Toyota does it, and now their manufacturing costs are down, because they're figuring out every piece of what it costs and where all their waste is. They're thinking about green, but as a side effect they're super efficient and super productive.

So where are we at in the software industry? We're unfortunately not quite as evolved as the environmental world. We're often, sadly, still in this state where people say, oh, websites just get hacked, or my Microsoft Word just crashes, or my application just crashes, and people just accept that. Hopefully we're starting to evolve into this second phase, where now there are regulations. If you're going to collect credit cards, no matter how big or small your company is, you have to be PCI compliant, and that means you have to have specific things around logging and passwords and basic security controls that we should probably all be using anyway. And HIPAA: there are health care regulations. So we're kind of in this middle stage, but we're certainly not as evolved as we are in the environmental world. And you know, we can all say, gosh, the environmental world could be a lot better too.

What I'm trying to say with this is that I think it's a super, super exciting time for us. We're right in the middle of this evolution, and as Dries said in his opening keynote yesterday, awesome keynote, Drupal's also in a state of evolution. So we're right on top of this really dynamic, really exciting time, and as Drupal evolves, so must we. That's hopefully the punch line here. At one time maybe Drupal was mostly focused on newsletters and community sites and things where outages and security incidents didn't have as big of an impact. I've had the privilege of working recently on Drupal sites that are
responsible for life safety, that are in use at hospitals and by governments to manage incidents when there are floods or other incidents like that. And certainly many of us have worked on Drupal sites where the business person stands over our shoulder and says, for every ten minutes this site is down I'm losing twenty-three thousand dollars, or whatever. So Drupal's evolving, right? It's awesome. It's becoming this new powerful tool that people depend on for their business. It's no longer just a little luxury or a little internal intranet. This is something that many businesses depend on, and so we have to grow up a little, and a key ingredient in doing that is monitoring.

So we're going to talk a little about monitoring. I love this quote from Brian Ellis, that measurement is the link between math and science. All too often we go into an organization and they say, oh, we just bought a brand new internet connection because our site was slow, and you're like, oh my gosh, you have these uncompressed images all over the front page; it's never going to matter how fast your internet connection is. I think the point is that measurement is critical. Where operational processes in general move us from 80% break-fix to like 20% break-fix, monitoring specifically is focused on mean time to recovery. In hospitals or enterprises where they have metrics around how often is my stuff up, what happens if it's down, and planning around that availability, the mean time to recovery is a key metric for them. It's: if we have an incident, how long does it take us to fix it? No one has zero incidents. So when we have an incident, whether the site is down or it's hacked or something's not working right, some function or a view is just broken and the site looks crazy now, whatever that incident is, it's about how long we take to fix it. The trick to monitoring is taking that mean time to recovery and, instead of 80% of that recovery time being figuring out what's broken, it's that 20% of
the recovery time is figuring out what's broken. And since we've shaved all that time off, we have less mean time to recovery overall. The outages just last so much less. So that's the point of monitoring.

This is, I think, a key slide of messages I wanted to convey around monitoring. If you have an awesome monitoring system in place today, that's spectacular. If not, or if you have a monitoring system in place that you're not sure about, these are characteristics of a monitoring system that I consider essential today. This is certainly an evolving field. If you are on Twitter you can see the hashtag #monitoringsucks; there's a whole community of people out there talking, a little sarcastically, about how monitoring sucks, but really about how monitoring is not an easy problem. So it's certainly changing. Here are some characteristics of good monitoring systems. If yours doesn't have them, there are some amazing vendors on the show floor that can introduce you to more powerful monitoring systems, and there are also some awesome open source systems that meet all of these requirements.

The first one is real-time and trend monitoring. Most people have real-time monitoring: my pager goes off if the site is down. The other half of that is capacity planning information. That's often where people say, Ned, my site is slow, come help me out, and I say, great, show me your CPU utilization and your memory utilization for the last month, and show me last week, when you had that really slow site performance. Show me what it looked like at that point. Was your disk busy? How many requests per second were you getting? Some of that might be available through Google Analytics, but most of it needs to come from your infrastructure monitoring system. So that's the big one: lots of people alert only, and we need to be doing trend monitoring, seeing a graph of our measurements.

Custom plug-in system: whatever our organization, we probably
have really strong PHP programmers, perhaps we have Python programmers, or someone that loves to program in Perl like I still do, or maybe a bash scripting guru for whom bash is the only thing they will use. If you have a monitoring system that has a proprietary plug-in system, where you have to write it in .NET or you have to write it in Fortran or whatever, you're going to be very limited in who can contribute to your monitoring system. Often the monitoring role is relegated to someone: oh, you're the new guy, so we're going to make you do the monitoring and set it up. But really we need the smartest developers and site builders to be figuring out what we actually need to be monitoring.

Runs your functional tests: I'll talk about that in one slide.

Active and passive monitoring: this is something we've lost from the days of the 80s, when we had monitoring systems like HP's Network Node Manager, where the Simple Network Management Protocol was how devices sent reports back and sent alerts, like my network switch is going down or my router is too busy. We've really moved away from that to poll-based monitoring, and I think that's a little bit of a loss. We need to look for a monitoring tool that supports a way to push alerts into it, whether that's an API or a command line tool, you name it.

Log analysis we'll talk about in one slide.

Escalation is important for our quality of life. Today we should not have a monitoring system that just pages one person and doesn't support a complex rotation system and allow you to roll up to other people if alerts aren't acknowledged.

And then finally, I'll talk about job execution briefly in a couple of slides.

This is a graph of monitoring HTTP, and when I mentioned "runs your functional tests", the one thing I want to get out there as a tool is the name Selenium. If you don't have that tool in your tool belt, hopefully you're using a commercial tool that's even better and more awesome, but if you're not, then
Selenium is a tool we can all have in our tool belt. If we're just a tester and we're filling in forms, Selenium will automate that form filling. But if we're building a view or a complex Form API integration, Selenium will allow you to run through many steps of a website and give you that transaction back. So the point of this slide is: if you're just connecting to a site and getting a 200 code back, that's probably not sufficient monitoring today. We need to use a monitoring tool that can run in a real browser and step through real browser steps, and Selenium's a good tool to have up your sleeve.

I wanted to mention PageRank, not because I care what your PageRank looks like, or even that much what mine looks like. I'm not an SEO person. But I think it's important to say that your monitoring tools are often siloed with the DevOps or infrastructure sysadmin person. They know how to use the tool and they're adding monitoring, but the business should be adding metrics too. We can easily monitor PageRank or the number of subscribed users. Often you'll see organizations that have an awesome monitoring tool, but it's myopic; it's only looking at that lower layer, and we should be monitoring things that are relevant to the business too. Like Greg said, all that spam mail going out, that's the red flag. Maybe the number of users going way down or way up in our system is a red flag that there's an issue.

The APC tool for PHP: a lot of us are using it, and I wanted to call it out as an example of a component that's often overlooked in Drupal environments. This tool works 99% of the time, and when it doesn't, you'll get an error like this. That usually happens after you install more modules or something else that's going to require more RAM from APC than is allocated. It's simple to have your monitoring system monitor this, and I swear 99% of sites are missing it, so I wanted to try to point that out with a couple of pretty charts. The other
two or three things to keep in mind for monitoring start with cron. We're so used to running cron as either poor man's cron or from a crontab on our Unix system, and I think that's something for the previous generation that we need to move on from. What I'm specifically advocating is that whatever your monitoring system is, it should be the thing connecting to your Drupal site and executing cron. That gives you a lot of benefit that you don't otherwise have. Specifically, you can see how long your cron runs take. You might be able to dig through log files and infer some of that information, but by executing cron from a remote system you can see how long the runs take, and if they fail you get an immediate notification about that without digging through watchdog logs. Here's an example of a site where one of the cron jobs is failing, and it certainly stands out.

The last tool I wanted to mention in terms of monitoring is the Nagios module. The folks that have been working on the whitehouse.gov site contributed this module, and it provides an open source way to see inside your Drupal instance from any monitoring tool. Nagios is a good, standard open source monitoring tool, and most tools can run Nagios plugins, so it's worth considering. This module can tell you things like: you have errors in your watchdog log, or you have users who keep getting locked out, or you have module dependency errors that are causing something to be broken. So the Nagios module is worth knowing about. This is an example of a screen where we can see dozens of different Drupal sites, each monitored through that Nagios module. If we have one or two Drupal sites, it scales to have us log in and check each one every day, no problem. But if we have two hundred, we need to have some automated system, and should have some screen like this. Greg's going to talk about a couple of other alternatives.

Job automation and then logging are my last
two topics. So, job automation: Jenkins is the de facto standard for continuous integration, or for automating jobs like deploying to development, deploying to staging, deploying to production. Really all Jenkins does is run command line scripts; it just runs bash or batch scripts that you've written. I think the message here is that we should try to move towards scriptifying or codifying all of those system administration and deployment activities we do. If we can capture that in a script and commit it to Git or Subversion, then we have a repeatable deployment process, and Jenkins is a tool to look at if you're interested in learning more about that.

Logging: super important stuff. The one thing I want to say about logging is that you should consider turning on syslog logging. On a normal Drupal site out of the box, it's logging to the database but not to a text file anywhere, and if your site gets hacked it's easy for a hacker to edit that database and remove specific entries. If it's written to a text file on that same server, debatably they could still do that. But with syslog you can send logs remotely. If you're collecting credit cards, or security is important to you, you should have some kind of centralized, off-server logging system.

So I had to end my little section with a little Paris Hilton twist: what's hot or not. This is just a summary of what we talked about. Ping or HTTP result code monitoring: not hot. What is hot is live user story testing and trend analysis, through tools like Selenium. What's not hot is poor man's cron or crontabs for any job. Even if it's not Drupal related, don't put it in the crontab; instead run it from your centralized monitoring system so that you have a window into it. Logging to the database only: out. Now we do logging to a central host through syslog. Logging in to see Drupal errors on each site, and whether they need updates or not: that's not hot. What's hot is a central Drupal
management system, like Greg's going to talk more about. And finally, off-site backups: yes, those are still important, but now we also need to be thinking about off-cloud backups. If your data is only backed up with one cloud provider, you're probably missing the bus.

Thank you, guys. I wanted to end really quickly and say, if this stuff seems daunting, like, gosh, we can never go from zero to 60: no one can go from zero to 60. Don't try to eat the whole elephant. There are two pieces of advice I have. There's this great book called The Visible Ops Handbook. It has very simple steps for how you can get a handle on your environment. If you have outages all the time and don't know what's happening, it has some really good ideas. And then I would leave you with this: if you don't have this monitoring stuff in place, instead of trying to tackle it all at once, every time you have something break, every time you have a client call or a developer or engineer discover something wrong, like, why the heck, who changed this, why is this broken or down, add that to monitoring. Add it just when there are issues, and you'll soon discover that you've got all your most critical pain points in monitoring. And with that I will segue to my friend Greg.

I recommend to anybody, if you're thinking about giving a presentation, consider giving it with Ned, because I love the introduction. Thank you so much. So security testing and monitoring is my portion, and I broke it down into tools and services for detecting and responding to vulnerabilities and threats. Just to describe those a little bit more: detecting is about finding the problem. I think particularly with information security this is such an important issue because, you know, in the story that I started off with, if you don't know that your site is sending out spam, then you're being abused and you have no idea. If the credit card numbers that you're collecting from customers got stolen and you just don't know that that
happened, then that's a huge problem that you can't begin to address. So it's really important that you can detect things to find the problems. And then responding: I think all too often people come up with their response plan after an incident has happened. Ideally what you want to do is come up with this plan for response before the incident has happened, and then when you're frantic and you're freaked out about what's going wrong, you'll be able to follow that response plan, try to keep a little bit more of a cool head, and take a measured response. So vulnerabilities, those are the weaknesses inside of your site, whether or not you know about them. And then threats, those are all the ways that the bad guys are trying to get after you, whether or not those are successful.

To try to provide a framework for how we might think of the vulnerabilities, this is based on... anybody recognize it? Yes? What is it? Yes, so this is specifically the OWASP Top 10. If you're not familiar with OWASP, I definitely recommend getting to know them. They have an event actually here tomorrow in Denver, so if you're a security nut more so than a Drupal nut, you may want to look at SnowFROC. I'm not suggesting a competitor to DrupalCon, maybe, but it's a pretty cool group. They also have local meetups and lots of great online resources. So this is their top 10 set of vulnerabilities, the things that they think are the biggest threats to the web application world right now. In the Drupal world, the ones that we have to worry about the most as site builders, module developers, themers and site owners are the ones that I've highlighted in white.

So, injection: that includes both SQL injection and code injection like arbitrary PHP execution. We just had a vulnerability in the CKEditor modules that did that. XSS: the biggest problem in Drupal modules and Drupal sites is cross-site scripting vulnerabilities. Approximately 50% of the security advisories that come out
of Drupal.org are related to cross-site scripting, so it's a huge problem that we really need to figure out some better solutions to. CSRF, or cross-site request forgery, is the next one that we need to worry about.

And I guess I'll just back up for a second to broken authentication and session management. My sense here is that if you're using core then you're probably okay on that. It's the kind of thing that is so important to sites that when Drupal gets audited, that's one of the things that the researchers will look into the most. So as long as you're using a typical way of doing authentication and session management inside of Drupal, then you're probably in pretty good shape. If you're writing your own single sign-on tool, or if you're using one of the lesser known single sign-on tools, then you should definitely be putting some of your own resources into looking into that.

Insecure direct object reference: this is one that Drupal doesn't usually have a problem with, and we manage this through a lot of different tools. When we talk about this one, we don't talk about it this way. We talk about it more as access bypass, and that is a growing issue in Drupal, but we kind of come at it from a different direction.

So, misconfiguration: again going back to that story I told at the beginning, they had PHP available to the world. That was just a misconfiguration issue on their site, and it's something that's so easy to do in Drupal: just shoot yourself in the foot by checking off one more check box than you should have and giving people permissions that they shouldn't have.

Insecure cryptographic storage: this is something that needs to happen at a different layer than Drupal. This is about HTTPS, it's about using SSH when you're managing your server, using a VPN to encrypt all the traffic. It's something that Drupal kind of lives inside of rather than having to worry about itself.
There is one exception to that that I would say, which has to do with things like the password hash and whether you need to encrypt information inside your site. So if you're dealing with PII that you feel like you want to encrypt even when it's in the database, then there are some tools like the encryption module that can help you out with that. Failure to restrict URL access: this is mostly related to the access bypass that I mentioned earlier, so definitely a big problem that we need to worry about in Drupal. Insufficient transport layer protection: again, this is something that's kind of outside of Drupal's realm of concern. And then unvalidated redirects and forwards: we need to worry about that in Drupal. We have this fun drupal_goto function. It itself protects against this problem, but module developers can introduce problems if they're not using it properly. So how can we detect some of those vulnerabilities? I've broken this down into automated and manual, and then code reviews and penetration testing. So if you're looking for a static automated code review, there's a couple of different solutions: the Coder module, the Secure Code Review module, and Acquia offers that as a service. Not yet; it's on the roadmap, pardon me. So dynamic automated code analysis: it's not a very common thing, although there are some experiments in that. Barry Jaspan in particular worked on that a couple of years ago in his taint mode PHP. So if you're interested in that idea, then search for Barry and taint mode PHP. 
Automated penetration testing: one open source tool for this that I love, or that I think is interesting anyway, is Grendel-Scan, and then some other famous proprietary tools like Fortify and Rational also provide that, although they are quite expensive, so a bit of a downside there. Drupal-specific tools: that's also on the Acquia roadmap, things that we were working on as Drupal Scout and that are now part of the engineering process at Acquia. And then manual things. What can you do? A manual code review, right? If you know Drupal best practices, you can look through the code and identify problems. Here I have an example of a SQL injection problem in Drupal code. If you don't know how to spot that, then look at the session from yesterday that's going to be on video; Peter Wolanin and Jakub Suchy talked a lot more about how to do manual code reviews in Drupal. And then manual penetration testing: this is kind of like, you know, your job is to just see if you can inject some code. Look at the different input fields that you have and see if you can manipulate those to get a response that the developers weren't necessarily intending you to get, so that you can become an attacker. Some ways to automate that a little bit: there's the Vuln module, which is for Drupal 6, just begging for a port to Drupal 7, and it automates the process of injecting JavaScript into a lot of different places. And then if you're using something like Firefox, I think a great kind of gateway drug into the world of manual penetration testing is the Tamper Data extension to Firefox, which is very user-friendly for messing with the HTTP responses and requests. So if we take that information and lay it back on the Top 10 from OWASP, at least the ones we need to worry about, we basically see that we can do a lot of code analysis and testing for all of these different things, and so that's how I look at this. The one exception, of course, is
misconfiguration, so we need to do configuration analysis instead of code analysis for that. One big thing I'd recommend is the Security Review module, which is freely available on Drupal.org, and it has output that you can get through Drush so that you can send it off to your Nagios system and do some trend analysis on it. So how about responding to vulnerabilities? I've broken this down in two different ways. If you have this problem yourself, once you've identified the problem and need to fix it, you fix it, test it, deploy the fix, and then potentially contact the customers. This is something that's increasingly important: different government regulations, depending upon where you live, may force your site into this, that you need to let your audience know about whatever breach has happened. If you happen to find a problem in contributed code that's hosted on Drupal.org, then you need to do those things probably, but you should also be working with the Drupal security team. So you need to let them know about the problem, and they can work with the module maintainer or the theme maintainer to help all of the other Drupal sites in the world. As Jam put it so pleasantly in the opening presentation, your fixed bugs are my fixed bugs, and I think that's a really nice way that we can share fixes with each other. So how about detecting threats? What kinds of things can we do? I think if you've got spam comments or spam nodes being posted into your site, that's usually pretty obvious, right? But if it's a site that you don't visit often, then monitoring the number of nodes that are created on a site might give you a good clue about a site that's suddenly under an attack of spam. 
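As a rough illustration of monitoring the number of nodes created, here is a minimal Python sketch that flags days whose count sits far above the recent baseline. Everything in it is an assumption made for illustration: the function name `find_spikes`, the seven-day window, the threshold, and the sample numbers are hypothetical, and in practice the daily counts would come from whatever reporting the site already has.

```python
from statistics import mean, stdev

def find_spikes(daily_counts, window=7, threshold=3.0):
    """Return the indexes of days whose count is far above the trailing baseline."""
    spikes = []
    for i in range(window, len(daily_counts)):
        baseline = daily_counts[i - window:i]
        avg = mean(baseline)
        spread = stdev(baseline)
        # Floor the spread at 1.0 so a perfectly flat baseline
        # does not turn every small bump into an alert.
        limit = avg + threshold * max(spread, 1.0)
        if daily_counts[i] > limit:
            spikes.append(i)
    return spikes

# Hypothetical data: nodes created per day on a quiet site that starts getting spammed.
nodes_per_day = [4, 6, 5, 3, 7, 5, 4, 6, 5, 48, 52]
print(find_spikes(nodes_per_day))
```

Note that once a spike enters the trailing window it inflates the baseline, so the first bad day is the one most reliably flagged; the same check could be pointed at comment counts or outgoing email counts.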
A harder thing to track is if you're being used as a relay for spam that's going out, because it's transient on your server. It's just going right through, and you don't have as much evidence of the fact that that has happened. So some solutions for that are, again, monitoring the number of emails being sent from your site. If you have defacement, this is something that I think is interesting: people will sometimes hide the fact that they've defaced your site. So raise your hand if you've had your site hacked and you didn't see it, and then you looked in the Google cache of your site and saw that there were links for Viagra and things like that, right? I see a couple of hands. This is a surprisingly common problem; I think it's pretty scary. So how can you avoid that? I'll talk a little bit more about how to do that in a minute. Pardon me, right now. So version control, I think, is a big way to do that. It's often in something like index.php that people are adding that code, and so if you've got index.php in version control, you can see whether there is a diff between what's in your repository and what's on your server. That will let you know if there's been an attack on your server that modified the file, which of course didn't make it back into the revision control system. There's also the Hacked! module, which is a great tool for analyzing your site and comparing the version of code on your site to the version on Drupal.org. The Security Review module has a feature where it will look through node bodies and comments for some kind of telltale signs of defacement, so it looks for JavaScript or PHP injection inside of those areas. And again, it's not necessarily a problem that those are there, but you should do some trend analysis on it to see: okay, I've got three nodes that have JavaScript in them, and they're all there because it's an embedded video or something like that. 
So if tomorrow I see that there are 20 nodes with embedded JavaScript in them, I should look into those and kind of see what's going on. And then another great idea is to just take a look at the revisions of different nodes. I think that's another place that people like to hide spam. They'll create a revision that has spam in it and then change it back, so that if the revision is publicly available, then that individual revision can be a source for spam. Quite tricky. Another way to identify these problems is just crowdsourcing it. So on groups.drupal.org we have an up-down capability so that people can vote down comments. And in addition to allowing people to disagree with each other without taking the time to write a full comment, it's also really handy for identifying spam. So by letting your audience vote down on things, you can find those problems more easily. Two other tools specifically at the Drupal layer for looking at problems: PHPIDS and Tiny IDS. How many people have tried PHPIDS? And how many people are still running it, for those who had their hands up? Okay, one person, that's good. So one out of three still like it. I tried this for a little while and found that it had too many false positives, but I think it's worth experimenting with. Tiny IDS is a newer one. I actually haven't tried it. It was worked on a lot recently, and I think it's worth checking out, so let me know if you like it. Another interesting idea that's outside of the Drupal layer is some sort of a web application firewall, and there's lots of great commercial tools for this. Another one to consider is the ModSecurity module for Apache. My sense is that they require a fair bit of configuration, and I would just like to fix the problems in Drupal itself, but that's just my perspective. 
So brute-force password guessing is another kind of problem that people have to look for, and the Security Review module can help you out with that as well. It's something that Drupal 7 now has protection against, through limits on repeated failed logins, so it's hopefully less of a problem. You can also just look at your watchdog logs all day long, right, which of course we're doing, or you can use a tool like Droptor or any other monitoring solution for looking for those failed logins. So some tools to combat that. I think for some of the problems with spam, the tools are pretty well known: Mollom, Akismet, there's even the Spam module inside of your site, or Flag Abuse, which is again the crowdsourcing solution. If you have defacement, then the resolution for defacement, I think, is just trying to work out where the problem came from so that you're not hacked again, and then restoring from a known good copy of the site. Luckily with Drupal we can usually just download the version of the code again from Drupal.org, and we know that that's going to be a good version, or, knock on wood, we're pretty sure that's going to be a good version, right? So take a look in the node revisions, use an old database backup, go back to a known good version, and then, if you need to, copy and paste in new improvements to the content where necessary. For code injection altering your files, I think that there's a couple of different places to block this. As I mentioned, a revision control system is a great place to monitor that and be able to have an idea of whether or not things have been changed. You kind of put all your eggs in one basket there and keep a really good eye on that basket. Another thing that you can do is look for attacks at the firewall level: if you see somebody who is trying to modify that stuff, then block them there. 
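To make the idea of spotting changed files concrete, here is a small Python sketch that compares a deployed code tree against a known good copy, such as a fresh checkout from revision control or a fresh download. It is only an illustration under stated assumptions: the function names `hash_tree` and `compare_trees` are hypothetical, it is not how the Hacked! module or any version control system works internally, and on a real site you would usually just ask the version control tool for its own status or diff.

```python
import hashlib
from pathlib import Path

def hash_tree(root):
    """Map each file path (relative to root) to the SHA-256 digest of its contents."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            relative = path.relative_to(root).as_posix()
            digests[relative] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests

def compare_trees(known_good, deployed):
    """Report files that differ between a known good copy and the deployed copy."""
    good = hash_tree(known_good)
    live = hash_tree(deployed)
    return {
        # Present in both places but with different contents, e.g. a tampered index.php.
        "modified": sorted(p for p in good if p in live and good[p] != live[p]),
        # Present only on the server, e.g. an uploaded PHP file.
        "added": sorted(p for p in live if p not in good),
        # Present only in the known good copy.
        "missing": sorted(p for p in good if p not in live),
    }
```

Calling `compare_trees` with the path of a clean checkout and the path of the live docroot returns three sorted lists. User-uploaded files will naturally show up under "added", so the result needs a human eye or an exclusion list before it becomes an alert.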
So for brute-force password attacks, I mentioned just earlier that in Drupal 7 that's mostly been solved, or there's the Login Security module for those of you still on Drupal 6 or earlier. Another interesting idea, I think, is the http:BL module. So how many people have tried that guy out? Yeah, just one person? How many people have gone to Drupal.org? Okay, so Drupal.org is running the http:BL module, and it seems to be working pretty well; since none of you knew that it was, it's not producing too many false positives. It uses a kind of honeypot mechanism for identifying IPs that are exhibiting bad behavior, and then blocks people who are identified via that honeypot mechanism. So kind of an interesting idea. Another side benefit of using http:BL is that it can reduce the crawlers and spammers, or spiders, pardon me, that are coming after your site and acting in a malicious way, and so it can reduce your resource consumption. So an interesting thing to experiment with there. So I want to talk a little bit more about some of the tools for site monitoring that are either internal to your site or external to your site. One great tool for monitoring is Views, right? A lot of monitoring is just creating lists of content, and if you're managing a relatively small number of sites, then this is a scalable solution: just create some administrative views to show you what's going on, and you can inspect that. Particularly if you have a team of content moderators, you can give them some good administrative tools via the Views module that will allow them to do their job more efficiently. 
Mailman: I put an asterisk on it because it's a brand new module as of last week. I was thinking about that problem of Drupal sites sending out spam and said, hey, we should have a way to keep track of that, so I built this tiny module. I would love feedback from folks about it. And then there's a couple of different charting tools. I think the charts make it a lot easier to ingest this information quickly, at a glance. So Quant, Report, and the Chart module all have some default charts that come along with them and are, generally speaking, extensible, so you can add more charts to them. There's also some external and usually paid systems that will help out with this. So Acquia Network: I've got to represent my company. I think that they've got a great tool in the Insight platform that helps with a lot of these tasks. And then Droptor as well is a great solution. They monitor a slightly different set of things and present them in a different way. It is a much lower price point, but, pardon me, it's not 24 per month, it's 24 per year. When you're comparing those numbers, it's important to remember that the Acquia subscription comes with a lot of other things. So if you're just comparing on these two elements, then you're going to say, well, Droptor's a lot cheaper, but Acquia has a lot more stuff. Drupalmonitor.com: I actually didn't know about this until somebody tweeted it at me a couple of days ago, so I have not had any long-term experience with it. But is anybody using Drupalmonitor? I guess, yeah, one person, two people, great. I couldn't find any information about pricing, so I'm curious; I would love to learn more about that one as well. So here's a couple of quick screenshots. This is a default set of charts that come with the Quant module, again useful for looking within your site but not necessarily going across sites. This is just taking a look at the amount of content and the amount of comments on the site. Again, a
high variation in those could be indicative of a new problem that's cropped up in your site. The Report module provides, I think, an interface that was a little bit confusing to me at first. I had to poke around a little bit to get into it, but then it felt like it was pretty dense in the way that it presented information. The Chart module, I think, is pretty interesting. It's a general chart framework, and then there's a submodule, System Chart, which provides some default charts that are pretty useful. It's using pie graphs here, but you could also use it with a couple of others, or extend it to provide different views of the information. On this particular site there's not a lot going on, so we have just a chart of one color: everybody's active. So here's Droptor, which I think is a pretty interesting option. They provide, again, a really dense pack of information about your site, and they try to present it in a way that lets you manage a lot of sites inside of the same system. It's a Drupal module and a third-party service that work together. Another interesting feature of it is that it will show the amount of memory used on pages, so you can try to identify pages within your site that are memory hogs. So it has some monitoring purposes and performance benefits, lots of different areas. And then here, again related to the memory, it's showing some trend information about the amount of memory used. And it has this overview checklist page that I think is pretty handy for saying, okay, I'm going to get my site into good shape. This is particularly good if you're taking over a site from somebody else: you can install a tool like this and make sure that it's up to snuff. And then this is where they give you a list of all of your sites that you can drill down into. There's two different tools from Acquia that I think are interesting. There's the Acquia dashboard, which provides trending over time of a lot
of different elements, all this on the left, and then provides integration with a couple of different third-party providers like New Relic and Yottaa, so that you can see all that information in one dashboard. Another is a new tool that we launched in November and just recently improved last week, so if you haven't looked at it recently, definitely reconsider it; there's also a 30-day free trial. That's Acquia Insight. It tries to break down the status of your site as a single number, so this particular site that I was looking at was at 67 percent. I think it's pretty interesting to go through this and see, okay, I can spend an afternoon and take my number from 67 to 90, or from one level to another, and along the way I'm improving performance and improving security. And personally, it gives me a sense of satisfaction to know that I've made progress. So when you're looking at those numbers, you can drill down into the specific sections: security, performance, and also, let's see, the Volacci SEO Grader is incorporated into this. Right now the security section is looking at configuration checks, and, as I mentioned, next quarter it will be improved with some more configuration checks, more advice about additional modules to add in, and an XSS scanner. So one kind of overarching thought: I talked a lot about adding stuff into your site, and I just want to say that when you're adding stuff in, you have to be really conscious of what it is you're adding. Just because it's a tool related to increasing the security or stability of your site, it's not guaranteed that that addition won't have problems of its own. So this is the security announcement about the Barracuda Web Application Firewall that had a cross-site scripting vulnerability in it. So Barracuda, a very famous company, a large, enterprise-oriented organization, and yet by putting the Barracuda Web Application Firewall in front of your system, you've now
exposed yourself to cross-site scripting. So keep that in mind and try to balance things as you're going for the buffet of open source modules that you can just add in.
Thanks, Greg. You know, one of the things that's really interesting about the space we live in is that it's very launch-focused, right? We get all ready, we have project plans about how we're going to get to the point where we launch either an application or a site, and we get monitoring in place, and we do secure code review, and we do the launch, and we throw a big party, and then we often hear crickets, right? That is, we're done. We move on to some other project, we move on to some other site, until a problem arises, usually until we have an incident, and then that day is not a very great day, right? We're like, holy cow, we've exposed private data, or our site was defaced, or something like that. And so folks often ask, what are the things that I need to be doing after launch to make sure that on an ongoing basis my site is secure and operating well? So let's spend a few minutes thinking about that. It really boils down to these four points at some level. First is maintaining eternal vigilance. We want to automate that as much as possible; we want to be using an automated monitoring system or service, but we also have to be aware: what state is our site in, how is it being monitored, how is it performing, what does it look like on a good day? We'll talk about how to do that in a bit. We do want to automate as much as possible so we don't have a human error factor. Human error comes in lots of different forms, but particularly what we see when it comes to ongoing operational maintenance of a site is that human error often comes in the form of: I was too busy to get to this set of tasks, right? Yes, I knew that we were supposed to do patching; yes, I knew that I should have been monitoring disk space to make sure it wasn't full; but I had a hundred other things going on and I didn't get
to it. And so the more that we can automate that so that it doesn't fall off the list, the better. We need to do periodic auditing; I'll talk about that in some level of detail. And then of course the catch-all: just not sleeping and working 24 hours a day will fix it. When folks look at the list of stuff that we have to do (and I actually added at the end of this talk a kind of pledge that we can all take of all the things that we're going to do so that our sites are in awesome shape), when we look at all the things you have to do, sometimes it's overwhelming. Like, oh my gosh, I can't believe that if I run Drupal then I have to do X, Y, Z, K, and Q. Well, let's maybe just step back for a minute and think: is it do those things and do them right, or could we see how some other platform does it? So I thought it might be fun to just compare Drupal to other platforms for a moment in a couple of different areas. So let's take the case of the Twitter module available for another platform like WordPress, and of course we have the Twitter module for Drupal. What you'll notice here is that there is a page for it, and it's got some information that talks about its features, but we see very little data on what things are going well with this module, what things aren't going well with this module, what things have been fixed, that type of thing. If we compare that to the information that is available on Drupal.org for the Twitter module, you will see we have an incredible amount of data at our fingertips that can help us select a module to begin with and that can help us maintain it. We know how many issues there are. The Drupal community is very open about not only sharing code but sharing: here's deficiencies that you could help with, or here's bugs that need to be fixed, or here's a workaround to make something work, even if it's not working perfectly. We also get good data about how many sites are reporting that they're using a
particular module. Why is that interesting? It helps us pick modules that are going to be operationally supportable, that we know a lot of other sites are using and reporting bugs on and fixing, that type of thing. So yes, do we have to do maintenance on modules? Do we have to pick modules strategically? Absolutely. But Drupal in particular gives us a good basis for doing that in an intelligent way. We're not just throwing darts at a wall. What about coding guidelines? The WordPress coding guidelines: this text is probably way too small to read, especially in the back, but I'll give you just some highlights, right? This page is primarily focused on things like, well, you should have a firewall, and you should run a current version of your operating system, and so on. It's very what I would call hand-wavy. In contrast, the Drupal coding standards are very crisp and clear: you need to do these things, you need to avoid doing these other things. Here are our coding standards as a community, and that allows us to harden the platform together and make sure that it works well in an operational environment. Just a couple more comparisons. Examples of security announcements: with the WordPress security announcements, there is no way that you could concretely say, I am running a WordPress site and I know what to do to be secure, based on their announcement stream. In contrast, the Drupal security team does just an outstanding, A-plus job of giving very detailed, timely announcements and fixes, and quantifying and describing the risk that's involved, so you can say, well, this vulnerability is going to affect my production site, and it is so critical that I have to take it down and patch it today, or this is an issue that I can deal with during our normal maintenance window in a couple of weeks. Those guys do a great job. Finally, and this is probably my strongest argument when it comes to operating and choosing a platform: my strongest argument is around module maintenance. If you are living
in the WordPress world or one of the many other open source CMSes, almost all of them have some type of central way to describe that modules exist. But if you have a problem with a module, or there's a security vulnerability in a module, what they do is refer you back to the module maintainer, who may or may not be aware of the issues or how it interacts with others or whatever. With Drupal, as Greg talked about, all of this is centralized. So if there is a known security problem, the Drupal security team is going to address that and work with the module maintainer, and all of that information is available on Drupal.org in one central place. That makes our operational lives much easier. Finally, and Greg mentioned this, but I want to highlight it: this is Drupal core. This is straight off of Drupal.org, and I've added the little red highlighting box. When we talk about site operations and security, it is incredibly critical that we are all running the same core code, so that if there is a vulnerability we can get a fix quickly, and so that we aren't introducing vulnerabilities. Core is an area that is just not intended to be customized. For those of you who can't read it in the back, this is a page that says "Do not hack core," and the box says, "Are there exceptions to this rule? Nope." We should be using the very extensible framework that Drupal provides us to do any type of customization, and that's going to help us with operational security, performance, and availability going forward. You guys, as site administrators and as general Drupal advocates, need to be communicating this with everyone who touches the site. Let's talk about periodic audit. This is also too small to read in the back, but this presentation of course will be available, and if you drop me an email I will also send you a PDF of this. What this is, is just a stab at a periodic audit program: what are the things that I should be auditing quarterly around paths of attack,
what are the things that I should be doing annually for network security architecture, or quarterly for encryption usage and log handling and key handling, and so on. So it gives us a checklist of what we should be doing every week, every month, every quarter, every year to assess the security of our infrastructure and our Drupal sites, to make sure that they're in good shape. You don't have to use all of these, but my hope is that you'll look through this list and it'll spark ideas: yes, these are things that we should be doing for our site; these other things don't matter because we don't handle sensitive data or we don't have that type of infrastructure, that type of thing. So either grab this whole list out of the slides or drop me an email and I'll send you a PDF of it. One of the things that I think is especially difficult to convey, especially to non-IT people, when it comes to operational management and security of sites is that we have to build a strong chain from top to bottom, and I've tried to illustrate this here. Way down at the bottom, in the green box, I've tried to illustrate all users. We need all of our users, even if it's users that are fairly non-technical, to have some level of security awareness, right? We need to educate them on how to choose a good password, and on not sharing their accounts, and if someone calls them and says, hi, I'm from IT, can you share your password with me, hopefully they say no. Unfortunately, if we don't educate users about that, then they don't know how to handle those situations, and they end up being a weak link in the chain. We can have done all of this great secure code review and monitoring and periodic auditing, and it will be all for naught if we have users that end up being the weak link in the chain. What we expect on the top side of this, the gray box, is that we're training all of our IT security specialists and professionals about the technologies that we're using. So this
is where we kind of flip the table on this. Yes, in many organizations we have folks that are familiar with IT operations or IT security, and they're really good at their jobs, but they don't know specifics about Drupal. And so that's an opportunity where we, as a community and as advocates for the platform, have to go to them and say, look, here are the things you need to be aware of: here's the Drupal security team, here's their stream of vulnerability announcements you should be monitoring, here's how the system is architected. And then when we look at the middle layers, for all of our, what should we call them, our IT administrators, folks that manage our infrastructure and manage our content and so on, we want to give them a structure where they can operate in a secure way. Oftentimes that's a life cycle. On this blue square it says assess, plan, implement, operate, and monitor: we need to teach them all about these operational steps that we're going to follow, from the top to the bottom of that. The unfortunate problem or challenge that we have is that I cannot give you a technology tool, I cannot give you a piece of software, that solves all of this. And so in addition to doing all these great things that Ned and Greg talked about, and writing great code and all those things, we have to deal with a softer side of this in order to have ongoing secure operations of our site. We've got to do training, we've got to do awareness, we've got to do education, top to bottom. One of the things I want to highlight is patching. Perhaps it's obvious, and I hope all of you guys are patching, but the reality is that every day new vulnerabilities are discovered, new patches are out there, new methods of attack are developed, and so for that reason we have to continually evolve a site. Again, I started by talking about this launch mentality. I have run across a lot of organizations where they launch their site and they let it sit, and, you know, whatever, it
launched as a Drupal 5 site a couple of years ago, and today there it sits, unpatched, untouched, and that just does not work. That does not work with commercial software, and that does not work with open source software. So we need to make sure that we have a plan for dealing with that. For instance, Ned talked about standards and about us evolving from the smokestack world; now we see standards like PCI DSS telling us that we have to apply critical patches within 30 days, so that bounds the window. I would suggest that, especially in today's world, if you have a truly critical patch, it probably should be applied in a much shorter window, more like 72 hours. Almost done, a couple more slides. One of the things that we're going to talk about in this pledge in a minute is that we need to have an incident management plan or incident response plan. I would like to say none of you will ever have to experience a breach or an incident, and, knock on wood, that will never happen. If it does, what you guys need to have in place is a documented plan that has something like this. You could write it on one piece of paper, or you can Google for it and find 50-page examples, but it basically boils down to some type of response, some plan around notification and escalation. The tip there is that we want to keep the notification to the smallest possible group for as long as possible, so that we really know what happened and we know how we're going to fix it, and then we can figure out what's the appropriate communication to the larger organization or to the outside world. And then, what is our long-term response strategy? Do we need to upgrade a module, or do we need to go notify users that their data was disclosed, something like that. An important takeaway on this is that, as site administrators, sometimes I think we're lazy. We use the same password. This is not an effective security strategy. There have been many examples of this in the news recently, including the
fine folks at the PlayStation Network. One thing that you can do as a privileged administrator to increase the security of all the sites you manage is simply to choose a different password for each of them, so that if one site gets compromised, all of the other sites or instances aren't also compromised. All right, finally we're down to the last three slides, guys. Thanks for bearing with me. I threw this together; hopefully this is kind of a summary of all the things that we've talked about today, but let's go through them one bullet at a time, really quick. This is what I'm hoping all of you guys, as Drupal site administrators and site builders, can pledge to. I pledge to do the following. I'll set a unique, strong password for any accounts with administrative privileges, and I do not share passwords across multiple sites. We can all do that. I use multi-factor authentication, often in the form of something like SSH keys, for operating system level access, and have password-only access disabled on my system. So if you're maintaining the operating system layer, like Linux, you have a key system in place rather than just a password. It's free and easy to do. I have and execute a patching plan that includes the operating system, web server, and Drupal layers, including core modules and custom code. Right, I have to patch all of those. If I develop custom code, I have to have a plan to evaluate and evolve it going forward. I have and execute at least a minimalist periodic audit plan. That's the slide that you couldn't read; it said, well, we need to do the following quarterly and the following annually. Whatever's right for you guys. Maybe you just do an annual audit on stuff. I hope you're auditing some items more frequently, but you need to have some plan for that. I am aware of and comply with applicable information security requirements for the data that my site handles. So if you handle healthcare information, you need to comply with HIPAA; if you handle credit cards, you need to comply with
PCI DSS, and so on and so forth. You need to be aware of what those standards are and make sure you're in compliance. I monitor vulnerability announcement mailing lists for the technologies I use on my site. That's the simplest way to find vulnerabilities: just read the email. I monitor my system regularly, such that I know how it behaves under normal conditions. Why do we need to do that? So that on a bad day, when someone says the site's really slow, you know: is it that that person didn't have their coffee in the morning, or is it truly that there's something wrong, and that's an indicator that we need to go investigate? I have a documented incident handling plan that I'm familiar with and can use in an emergency. We talked about the need for that. Even if it's just one page, you should have something that, when that moment of panic sets in, you can pull out of your desk drawer and say, this is what I'm going to follow to get me through the next few hours. I take responsibility for ensuring that any custom code is developed according to secure coding best practices and is evaluated before being put into production. If we're going to write custom code, it probably needs to be carefully reviewed before it's used on a production site. I will be eternally vigilant and investigate any unusual or suspicious site behaviors. I know we have all done this, I have done this: we're like, wow, that page didn't look right, but I'll come back to it another day. Sure enough, that was a telltale sign of a much larger problem. I have a process in place to ensure non-production sites are appropriately protected from external access or crawling. You know, maybe that's a special robots.txt file or a packet filter, whatever you use in front of that. And finally, I'm an advocate for practical information security practices like the ones we've talked about today, but avoid security theater showmanship. Right, we're not trying to scare people. We're trying to say these are the practical things that we
need to do, and leave it at that. I'll just mention, for those of you who took a plane to Denver, you probably experienced a lot of security theater getting here. All right, that's it. Thank you guys so much for spending the last hour with us and talking a little bit about some of these operational issues. We love to do this. As Ned said, this is the stuff that we find fun, so please contact us if you have questions or suggestions or ideas. There are all of our email addresses. And just to wrap up, we love feedback. There's a survey that you can fill out about this session; we'd love for you to spend five minutes just clicking through some buttons and giving us a little feedback about what we can do better. Finally, this is the last session in this room tonight, so the facilities management folks have asked if you guys, as you leave, could please look around, and if there's trash, or someone left their cell phone or something, grab it and get it to this fine lady in the red shirt, who will help you with that. Thank you guys so much for coming. Have a great night.
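The patching windows mentioned in the talk (critical patches within 30 days under PCI DSS, and something closer to 72 hours for a truly critical patch) can be turned into a simple deadline check. This is only a rough sketch in Python: the function and constant names are hypothetical, and the two windows are just the figures quoted in the talk, not a full reading of the standard.

```python
from datetime import datetime, timedelta

# Figures quoted in the talk; adjust to whatever your own patching plan says.
PCI_DSS_WINDOW = timedelta(days=30)              # upper bound for critical patches
SUGGESTED_CRITICAL_WINDOW = timedelta(hours=72)  # speaker's suggestion for truly critical ones


def patch_deadline(announced: datetime, truly_critical: bool) -> datetime:
    """Latest time the patch should be applied (hypothetical helper)."""
    window = SUGGESTED_CRITICAL_WINDOW if truly_critical else PCI_DSS_WINDOW
    return announced + window


def is_overdue(announced: datetime, truly_critical: bool, now: datetime) -> bool:
    """True if 'now' is past the deadline for this announcement."""
    return now > patch_deadline(announced, truly_critical)
```

A check like this could run against the dates of the vulnerability announcements you already monitor, so that an unpatched site gets flagged instead of just sitting there.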