Alright, hopefully everybody's in the right place. So, where's the fire? This is not real fire, but the fire drills that we go through. We'll talk a bit about how to deal with disasters when you have websites. My name is Kristin, and I've been doing Drupal for a long time, since 2004, and other projects in Java before that, and in Perl and CGI before that. So I've been around for a while. There's some information here if you want to get ahold of me. I'm going to turn the tables here for a minute and see who's in the house. Who here considers themselves a newbie? Maybe you've just been doing Drupal stuff for the last year or so. Oh, great. Awesome, I'm glad you're here. So we've got a few of those. More intermediate, maybe two or three years of Drupal? Oh, quite a few. Veterans, been doing this for a long, long time? Wow, a lot of you, great. So maybe you can throw in some of your own tips and tricks as well. Who's more on the site building side, more on the clicking and that kind of thing? Not a lot, okay. Project management, where you have to talk to the client? Okay, a few of you. Theming, any themers in the house? A few, a little bit. More on the development side, doing lots of back-end development? A lot more, okay, so that's probably the right audience. Who does everything? Oh, wow, okay, that is definitely the right audience. Okay, cool. All right, so what are some website disasters that might happen? I'm going to turn the tables again, and I'll repeat the answers because the audio recording can't hear everybody else, but just throw out some things. What can go wrong? White screen: you go to the site and there's nothing there. Okay, white screen of death. Anything else? Just throw out some things. 504. A 504, so that's a response code saying something's gone wrong, some network connection or whatever. Sorry? Oh, PDO exception. Oh yeah, I love those, those are awesome. All right, you get hacked, right?
Okay, so there are lots of things that can happen. We'll talk about some of those. It might be hard down, right? Maybe the site's just really slow. Maybe some files got deleted and you don't know why, or some code got deleted and you don't know why. Maybe the database got deleted, that would be pretty bad. Maybe just some of the infrastructure's not working: some emails aren't working, or some feature, or something's going on with a third party. So when we're talking about the kind of disasters and fires that we're fighting, it's not always just that the site's hard down. There might be a whole bunch of things going on. So now, what are some causes of these things? Again, I'm gonna turn the tables and see if anyone has any answers to throw out. Permissions issues, so file permissions. Maybe people can't see the files because someone messed up the file permissions. Any other ideas? Sorry? Traffic spike. Traffic spike, yes, exactly. All of a sudden you've been Reddited or Slashdotted. I don't know if they're so popular anymore, but back in the day. Active development on live. Ooh, don't do that, yeah, exactly. Someone maybe fat-fingered something on there. Yeah, that wasn't a good thing. Anything else? Varnish trouble? Varnish, okay, so some cache, some weird caching thing going on. Anyone wanna throw anything else out? Committing code without testing it, yeah? Not so good. All right, so we've got some good things here. We talked about some of these. We had a bump in traffic. That could be good traffic: yay, we're in the New York Times. That could be not so good traffic: we're getting some hackers trying to take down the site. Maybe there are CDN problems. Maybe we're using Cloudflare or Fastly and there's some problem with them. Maybe there's actually a hosting infrastructure problem, something with the routers or the database servers or who knows what, all sorts of stuff.
Then there are third-party services that we're using in order to support the site. Maybe we don't handle those gracefully when they're not available, and so we're seeing problems that maybe we shouldn't see. Then the application itself. Maybe we committed code that we shouldn't have committed because we forgot to test it. Maybe we made this big, huge cron job and it's running every 15 minutes and it's just hanging the site. So there are all sorts of things that can happen there. And then there's the fat finger. We did something on live we shouldn't have, we did something with permissions we shouldn't have, we deleted files and we shouldn't have. So many things could happen, and we just have to be aware of that, and it's okay, people make mistakes. It's not time to be pointing fingers. Sometimes you're gonna do it yourself, and that's gonna happen. So today we'll talk a bit about how we handle these things. We'll talk about planning, monitoring, diagnostics, support, recovery and prevention. Prevention is key. I mean, you're not gonna prevent everything, but you're gonna try to at least prevent the things you can prevent. So the first thing is: try not to panic. But this one's really, really hard, because if it's 11 o'clock and you just got your kids to bed (well, hopefully you got them to bed earlier, but maybe not) and you're like, okay, maybe I'll watch a bit of TV and go to sleep, and then it's "fire!" and you're like, ah, I really wanted to go to sleep soon. And you're debugging and it's late and you're hoping that you're not gonna be up all night. So this part is hard, but hopefully if we put enough processes in place and distribute enough knowledge to the team, then only the fires that are really, really fires get to you, and not false alarms, and that'll help with this. So that goes into planning.
So for disaster planning we wanna document some sort of process that works for us, so that we can protect our business infrastructure and make sure that we're handling things as best we can during a disaster. And the most important thing here is that you want to find a process that works for you and your team, and your client if you have a client, or your organization or whatever. This is not a one-size-fits-all thing, but I'll give you an example of parts of a process that might work. This would be something that you would document and make available, like on an intranet. If you use Confluence, or if you have an internal intranet, or a Google Doc, it doesn't really matter, as long as it's made available so that the client or whoever owns that site can see it, as well as the internal developer team. So one of the things they might do is go check some other websites to make sure it's really the site in question. So I'm like, "Oh, example.com is down," and I'm contacting the developer team and going, "Oh, emergency," and then it's like, oh well, if I go to Google, it's down too. I guess maybe that's on my end, my own network or infrastructure problem. So make sure it's really a problem with the site itself. Then have a list of status pages that we're gonna go out and take a look at. This is something you could actually give to the non-technical people. You can say, "Go check these places and see if maybe there's something outside of our control that has a problem." There's a thing called a traceroute. We'll go over that in a bit, but they could run a traceroute. Then you could set up some sort of email alias that emails the team, and they can say, "Okay, urgent, we've got a problem," and describe it.
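The "is it really the site?" step above can be written down as a tiny decision rule. This is only a sketch: the function name, the messages, and the idea of passing in reachability results as plain booleans are all assumptions made for illustration, and how you actually test reachability is left out.

```python
def diagnose(target_up, references_up):
    """Guess where the problem is from simple reachability results.

    target_up: True if the site in question responded.
    references_up: dict mapping unrelated sites to True/False reachability.
    """
    if target_up:
        return "site reachable"
    if references_up and not any(references_up.values()):
        # Nothing is reachable, so the reporter's own connection is the likely culprit.
        return "probably a local network problem"
    return "probably a problem with the site"
```

For example, `diagnose(False, {"google.com": False})` points at the reporter's own network, which is the case where nobody needs to be woken up.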
Another good thing is to have a calendar, especially around the holidays or times around DrupalCon when a lot of people won't be available, so that people know, "When it's that time, this is the order of people I'm going to be contacting." And then as far as how you contact people, that's usually personal preference, as long as it's documented. Personally, if I get a text, I don't get that many, so I know that something funny is going on. So that's actually a good way if it's an emergency. I get lots of emails, so that's not necessarily the best way if it's an emergency. Maybe they should call me, or whatever. And then if you have a support system, an internal ticketing system, which you should if you're doing a project, then open a ticket if there needs to be follow-through, if it's not a short-term thing. So you need to make sure that you're documenting all this and putting it someplace where everyone has access and it all makes sense to them. You also need to make sure that your team knows how to do what they need to do in the event of an emergency. So you need documentation that's more focused on your dev team. Make sure that they can access all the things that they need to access, that they have access to any diagnostic tools that you have available, that they can open support tickets. Because sometimes you'll have accounts where the account has really been created by the client or a certain organization. There's one login, and sometimes there are tricky things if you're logging in as that account but you're a different person, and the support team doesn't want to deal with you, and things like that. So make sure that they're able to do what they need to do to be successful. If they need to deploy new code in an emergency situation, they need to know how to do that. For example, in Pantheon, there's a special hotfix process.
Say you already have code that's sitting in the dev/master environment where you've been working on stuff, and all of a sudden there's a hotfix. There's a way of kind of circumventing master in order to get the hotfix out while not disturbing your development workflow. You need to be able to access your backups. Make sure you have backups, but also that people can access them and know how to recover from them, and so forth. So all of these things need to be well documented and easily accessible to the team. Review it with them and make sure they understand what's going on. Okay, we'll talk a bit about monitoring. For monitoring, we want to be able to verify that the site is working as expected. There are lots of levels of monitoring you could do. You could go from super simple to more complex. So here are some popular tools for monitoring, and I personally have the most experience with Pingdom and New Relic. There are lots of tools available. Some of these tools have free levels, so if you need a tool but you're budget constrained and you just want some basic monitoring in place, you still have options and you can look at the different ones. I don't remember all of the ones that have free levels, but quite a few of them do. This particular example is Pingdom, and in Pingdom you can set up what are called checks. A check is basically going to go and hit your site, or ping your site, for those who are a little more on the developer side, which a lot of you were. So it's going to go and ping your site and try to find out if the page is coming back. This is a very, very basic monitoring tool. I just want to know, in this case for hook42.com, that's our home page: is it okay? Is it coming back with a status 200, or is it coming back with, probably not a 404, but a 504 like we were talking about earlier, so there's some network problem? Or is it coming back, but super, super slow?
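To make the idea of a basic check concrete, here is a minimal sketch of how one result might be classified. It is not how Pingdom works internally; the function name, the labels, and the three-second threshold are assumptions chosen for illustration. A real check would first fetch the page and time the response, then hand the status code and timing to something like this.

```python
def classify_check(status_code, response_seconds, slow_threshold=3.0):
    """Classify one check result as 'down', 'error', 'slow', or 'ok'.

    status_code: HTTP status of the response, or None if nothing came back.
    response_seconds: how long the response took.
    slow_threshold: assumed cutoff in seconds; tune it for your own site.
    """
    if status_code is None or status_code >= 500:
        return "down"   # no response, or a server-side failure such as a 504
    if status_code >= 400:
        return "error"  # the server answered, but not with the page
    if response_seconds > slow_threshold:
        return "slow"   # the page came back, just far too slowly
    return "ok"
```

Keeping "slow" separate from "down" matters because the two usually lead to different follow-ups: a slow site sends you to performance tooling, a 504 sends you to status pages and support tickets.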
So you can configure all these checks, and you can have fancier checks, but the simplest one is basically just to ping a page. Then you can look at some of the reports it gives you and track things like uptime over the last three months or six months, and get a sense of the health of your site, which is great. You could also get alerts, which are good and bad. I do set up all my sites so that they send me alerts, which is really annoying when you're at DrupalCon and you're trying to enjoy yourself, and then you get alerts and you're like, what's going on? I don't want a fire right now. These are configurable. You can set any number of people to get them. You can get your alert by email or by text or various other ways. But I highly recommend that you do that, so that it's in your face. This is very important: with any kind of alert, any kind of monitoring, you should really try to have things sent to you when things are wrong, because are you really gonna remember to go and log in there every day and make sure things are right? Of course, if the site's down, someone's probably gonna tell you if you didn't already know. Okay, so those are some monitoring tools. There are also some diagnostic tools. In terms of diagnostics, we want tools that allow us to look at the data, see what's going on, and try to evaluate our software. And here are some options. Traceroutes, which I'll go into in more detail. Status pages, which are probably pretty self-evident. The logs, and we'll go into different kinds of logs. Application performance management software, which is an awesome tool; we'll go into a little more detail on that. And there are Drupal modules we can use as well. So a traceroute is basically a way to figure out how much time it takes to go from point A to point B.
So you're at DrupalCon and your website's in New York, and you can do a traceroute to that website, and it's gonna give you some idea of how long that round trip is taking. Here's an example of a bad traceroute. There's a lot of gobbledygook, and it's not super important that you understand all of it, but it usually does three different tries, and that's why there tend to be three different columns. If you see an asterisk, you can usually pretty much ignore that. What you're looking for are places where it's jumping hugely from one step to the next. So we're seeing in this example that once we get to this New York City edge area, all of a sudden our times have jumped a huge amount, and then they keep increasing after that. So when you see that kind of stuff, you're like, oh, what servers are those? Is that part of our network? Is that our hosting, or is that our CDN, or what? And you can start pinpointing things. Usually I don't do anything with this information other than provide it to other people. I should have said before, I'm not a sysadmin. I'm an architect, a project manager, a developer. I could do system administration, basic stuff, I could do a bunch of it, but that's not my goal. So I usually use this as something to provide to somebody else who is in charge of the networking and that kind of thing. As far as status pages, which status pages you're checking really depends on how your site is set up, right? Because if you're on Acquia, you go to one place; if you're on Pantheon, you're going somewhere else. So you do wanna check your hosting status page, and the one for any kind of CDN you're using. I highly recommend using a CDN. There are ones that have a free level; Cloudflare has a free level for CDN. That gives an extra layer of caching and also usually a bit of a firewall to try to keep the bad people out.
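Going back to the traceroute reading described above (three timing columns per hop, asterisks you can mostly ignore, and a big jump from one step to the next as the thing to look for), here is a rough sketch of doing that reading in code. The sample output is made up, using documentation-only IP ranges and invented hostnames, and the parsing assumes the common Unix `traceroute` layout; other tools format their output differently.

```python
import re

# Made-up sample in the common Unix traceroute layout (hypothetical hosts).
SAMPLE = """traceroute to example.com (203.0.113.80), 30 hops max
 1  192.0.2.1 (192.0.2.1)  1.2 ms  1.1 ms  1.0 ms
 2  198.51.100.1 (198.51.100.1)  9.8 ms  10.1 ms  9.9 ms
 3  * * *
 4  edge.nyc.example.net (203.0.113.5)  180.3 ms  175.0 ms  190.2 ms
"""

HOP_RE = re.compile(r"^\s*(\d+)\s+(.*)$")
TIME_RE = re.compile(r"(\d+(?:\.\d+)?)\s+ms\b")

def parse_hops(output):
    """Return (hop_number, average_ms) for every hop that reported timings."""
    hops = []
    for line in output.splitlines():
        match = HOP_RE.match(line)
        if not match:
            continue  # header or blank line
        times = [float(t) for t in TIME_RE.findall(match.group(2))]
        if times:  # hops that show only '*' have no timings and are skipped
            hops.append((int(match.group(1)), sum(times) / len(times)))
    return hops

def biggest_jump(hops):
    """Return (hop_number, increase_ms) for the largest rise between reported hops."""
    jumps = [(cur[0], cur[1] - prev[1]) for prev, cur in zip(hops, hops[1:])]
    return max(jumps, key=lambda jump: jump[1]) if jumps else None
```

With the sample above, `biggest_jump(parse_hops(SAMPLE))` points at hop 4, which is the sort of detail worth pasting into a support ticket alongside the raw traceroute.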
Third-party services are gonna have their own status pages, and so forth. So there are gonna be a lot of different places, and you should just keep that in your documentation: this status page, this one, this one, this one. A lot of them are kinda similar, because a lot of them use the same service to generate their status page. In this case, this is one from Acquia; they had some things going on. So it's one of the first things you're gonna wanna do. Something's going on with the site, you're gonna go in and look at all the status pages, and if there's an obvious culprit, great. It's not you, you didn't delete the code, you didn't fat-finger the live site. Someone else is having the problem, and then you monitor that. But just because the status page says things are good does not mean it's not them. That's the other very important thing to keep in mind. I'll have a problem that's hosting related, and I won't name names, but I'll go on the status page and I'm like, I swear it's them. There's nothing there. I swear it's them. And then I'll open a support ticket and sure enough it's them, but the status page doesn't change. So don't completely trust it. I mean, definitely, if it's there, it's probably that, but if it's not there, it still might be that. The other thing you're gonna wanna look at is the logs, and where the logs live depends on the hosting. I just gave two examples here. On Acquia, they have one set of logs that look a bit like that, because I believe they're still on Apache, and Pantheon is on Nginx, so it looks a little bit different. So you just need to document: okay, where do I get to the logs? And then usually if you look at the logs, if you tail the end of them, it's pretty obvious that something really bad is happening, because you'll see it there. You can also check the Drupal logs, though not always, actually, if the site's super hard down and you can't even get in there.
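As a small illustration of "tail the logs and it's pretty obvious," the sketch below counts HTTP status codes in the last lines of a web server access log and reports what share of them are 5xx errors. It assumes the common/combined access log layout, where the status code follows the quoted request line; that is typical of default Apache and Nginx configurations, but custom log formats will need a different pattern. The function names are made up for this example.

```python
import re
from collections import Counter

# The status code follows the closing quote of the request, e.g. "GET / HTTP/1.1" 200
STATUS_RE = re.compile(r'"\s(\d{3})\s')

def status_counts(lines, last_n=1000):
    """Count HTTP status codes in the last `last_n` access-log lines."""
    counts = Counter()
    for line in list(lines)[-last_n:]:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts

def server_error_ratio(counts):
    """Share of counted requests that ended in a 5xx status."""
    total = sum(counts.values())
    errors = sum(n for code, n in counts.items() if code.startswith("5"))
    return errors / total if total else 0.0
```

You could feed it the lines of whatever access log your host exposes; a ratio that suddenly climbs is the numeric version of seeing a wall of 502s and 504s scroll past.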
So you would wanna use the server logs in that case, but sometimes you can get in and see the Drupal logs and see what's going on. The first one noted is the Database Logging module. That's a core module, and you can turn that on. There are some performance concerns with that, so if there's any way to not use that and use something else, syslog or something different, that's actually better in terms of performance. But sometimes your hosting environment doesn't let you do that in an easy way, and so you're more constrained. So that's the core option, and there are lots of other ways that you can log. Where you're gonna go and look for those Drupal logs just depends on how you have set up the site. Now we'll talk a little bit about application performance management. Again, there are a lot of tools that you can use. I am most familiar with New Relic, but there are a lot of different ones, and some of them do have a free tier. There are some hosting companies that partner with some of these, so you can actually have some of that as part of your hosting package, which is a big value add. Keep that in mind if you are shopping for hosting: if they have that available, that's really great. So here's an example from New Relic. In this case I was looking at the 24-hour range. Something had gone wacky on the site, we had gotten a few errors, we were wondering what was going on, and then later in the day we were trying to analyze and see what happened. You can see that there was some sort of spike of stuff going on, and you can see the throughput, so there was a lot more processing happening during that time. Then you can start looking at other things to try to figure out what was happening there.
I believe in this case someone had decided to go run some process, some sort of admin script, right in the middle of the day. It was using too many resources and that was not a good idea. So then it's education: okay, if we really need to run that, maybe we run it overnight or some other time. You can look at the PHP times, you can look at the database, you can actually analyze how long things take, the throughput, and start looking at slow queries and things like that. Super helpful. So before I get to some of the Drupal modules that can help diagnose this, I'm gonna open it back up to keep everybody awake here. Are there any modules that you like to use to help diagnose problems on your site? Okay, great, yeah. So one technique, and it's not Drupal modules specifically, is to drill down more into New Relic to look at specific views. New Relic actually has Drupal support. It understands some Drupal things, so you can actually look at specific views and certain modules that are heavier, and that is really great.
The other thing is that you are able to write a little bit of custom code, if you're a developer, and a lot of you are, and add some extra stuff to New Relic to track, which is really nice. Things that we've added are, for example, the user ID in case they're logged in, because we have a lot of admin pages and a lot of heavy pages that do a lot of crunching, and those are supposed to only be used at certain times. So we'll monitor that and we'll say, oh, so-and-so was hitting this page, and maybe we need to make that page better for them, because now they're hitting it a lot more than we thought they were going to. Or maybe we need to tell them, "You know what, that's some reporting stuff. Why don't you go over and do that on the test site or another environment, because you're just trying to grab some analytics and you don't really need the real-time site in order to do that." So we've added the user's role and the user ID and a few other things to enhance that experience. Any Drupal modules that people like to use for diagnosing? All right, I'll go over some. The Blame module is one that we've used before, and Blame is interesting. What it does is track forms on your site and the changes that people are making through those forms. So if you don't want someone to change a view on the live site, which they shouldn't be doing, you could track that if you wanted. Another way to track that would be to use the Features module, which is what you should really do for configuration in order to make sure that people aren't doing things on live. With the Blame module you can set it up per form and say, I only wanna track these particular forms. The permissions page might be a great form to track, to make sure that people aren't messing with it. Again, you could put the permissions in Features, so you could do that as well.
The Hacked module: Hacked is a module that lets you check whether your code has been modified compared to what is on Drupal.org. It'll check core and it'll also check contrib modules. If you need to change code in core or contrib because there's some patch that you need to apply, you should be tracking that and keeping it somewhere safe. Usually we keep a special directory where we have patches, we have a readme file, and we keep it all really tidy so we know that we did that. You should never change that code without tracking it, and be very careful about making changes to it. Ideally you're only grabbing patches that are from Drupal.org. Sometimes we will make our own custom patches because there's some weird behavior we can't really analyze. We'll add a little bit of watchdog logging in certain places, like if a certain condition happens, so we'll add a three-line patch, put it in our patches directory, and put it in the readme so that we can track it. But the Hacked module would see, oh, there's some code here that's not part of core, not part of this contrib module. So that's a good one. The Security Review module will check for security problems. You should also be checking the status page on the site, in the Drupal reports, to see if there are any security updates. And I really like this one, Logging and alerts. I use the email log part; there are some submodules in this project. This is a contrib module, and I get emailed every error. Drupal's funny, there's emergency, urgent, critical, I don't remember exactly, critical, error, warning. So there are these top four, and I mean, what's the difference between critical and error and whatever, but there are different levels. I just get emailed all of the ones that are basically error and above, and so it's really in my face, which is good and bad. It's like, whoa, why am I getting all these errors?
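For reference, Drupal's log levels follow the standard syslog severities (RFC 5424), where a lower number means a more severe message, so the top four are emergency, alert, critical and error. The sketch below shows what an "email me error and above" rule amounts to. It is plain Python for illustration, not the Logging and alerts module's actual code, and the function name is made up.

```python
# Syslog severity levels (RFC 5424); lower numbers are more severe.
SEVERITY = {
    "emergency": 0,
    "alert": 1,
    "critical": 2,
    "error": 3,
    "warning": 4,
    "notice": 5,
    "info": 6,
    "debug": 7,
}

def should_email(level, threshold="error"):
    """True if a log entry at `level` is at least as severe as `threshold`."""
    return SEVERITY[level] <= SEVERITY[threshold]
```

With the default threshold, `should_email("critical")` is true and `should_email("warning")` is false, which matches the setup described above.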
But it kind of forces you to go in and deal with them. Whereas otherwise you're like, yeah, I'm gonna go in and check the errors, oh shoot, I forgot to do that yesterday, yeah, I'm gonna go get a coffee, go check in with my friend. If they're getting emailed to you, it's really in your face. So I personally like that. Okay, so that was a bit about diagnostics, and now we'll talk a little bit about support. Tech support is kind of a dirty word, I think, but basically you want to have assistance to deal with your software or your technical product. And inevitably you're gonna be opening support tickets. Almost always, at least every few months, you're gonna be opening some sort of support ticket. The important thing here is that these are support tickets for external vendors: your hosting, your CDN, or your third-party services. First, try to make sure it's not you, right? Was it really something else? Just do your due diligence there. Look at New Relic: was it some crazy query I was running, was so-and-so running this big cron job, any of that kind of stuff. Try to make sure it's not something application specific before you go and open that ticket. Then you need to know where to open the tickets, and this isn't always obvious. Especially if you have a CDN and hosting, it's not necessarily obvious where you open the ticket, because often they'll point the blame at each other: "Oh yeah, that's a CDN problem." "Oh no, that's a hosting problem." So usually what I'll do is open the hosting ticket first, and occasionally, if it's bad enough, I'll open both at the same time, because I'm like, this is hard down, I don't see any problem with our stuff, so I'll just open both and someone's gonna go and look at this as fast as possible. That tends to be a good tactic. And some tech support will have two levels of tickets.
They'll have just the regular ticket, "I need some help, I have a question," and then, "This is really bad, this is an emergency." Not all tech support does, but some of them do, so keep that in mind so you know which kind of ticket to open. I am typically a very polite person, but sometimes in the heat of the moment, and my kids will tell you this, I'm not necessarily the most polite person. So sometimes you have to take a deep breath, because everything's melting and you're kind of like, "Fix it, fix it!" Try to be polite with them and they're more likely to help you, and also thank them for the help that they do provide. It's really important, just like if you're opening an issue on Drupal.org, that it's not just, "Oh, this is broken, it doesn't work." What? That didn't help me, right? So you need to give them as much information as you can. Give them the traceroutes. Explain, "I went to the homepage and I got this error," or "The site is sluggish, it takes this long," whatever you can. That's super helpful for them. Maybe you only have problems when you go to one part of the site, or whatever. If you have access to New Relic or something similar, and the hosting company also has access to that, I will actually link out to certain things and say, "This looks weird, and this is not us, this looks like a network error, so please go and take a look at this." It's common sense, but it just makes things go so much faster. All right, now we'll talk a bit about recovery, because you're gonna have a problem at some point. With recovery, we need to enable our site to work again after some sort of disaster. And how do you recover? Unfortunately, it depends, right? Because we listed that big laundry list of all sorts of things that could go wrong, and we didn't even list everything, because there are so many things that could go wrong.
We had this weird case where we have this mail system, and it's kind of like Facebook, where you can email people back and forth through your mail client but it's routed through Facebook so it shows up there, or like eBay does. And the email just stopped. It was broken, everything was clogged up, and we're like, what's going on? It turned out to be an emoji. An emoji broke the email system. It was a four-byte emoji, which Drupal didn't support, and we got a PDO exception, which was one of the things someone mentioned earlier, which was interesting. So the email came inbound from someone's mail client. They put in this cute little smiley and it got into Drupal. Unfortunately, at the time we weren't handling that exception properly, so it tried to write to the database, the database blew up, and then it talked back to the mail server and said something's wrong. Well, then the mail server kept trying and trying, and it just clogged the whole thing, and it took us a little time to figure out what was going on. So do you expect an emoji to break your website? No, not really, right? And we didn't even realize that Drupal at the time was not supporting that. It actually still doesn't support it, something to keep in mind. I think they have patched that for Drupal 8, I believe, but for Drupal 7 there are still some workarounds for that. So the upshot is: it depends.
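To see why one smiley can be a problem: most emoji sit above U+FFFF, and those characters take four bytes in UTF-8. The usual cause of this kind of database error is MySQL's older three-byte `utf8` character set, which cannot store them; that cause is an assumption here, since the talk does not spell out the database setup. The sketch below detects such characters and, as one possible stopgap, swaps them for the Unicode replacement character before the text is written. The function names are made up for illustration.

```python
def four_byte_chars(text):
    """Return the characters in `text` that need four bytes in UTF-8."""
    # Code points above U+FFFF are exactly the ones UTF-8 encodes in four bytes.
    return [ch for ch in text if ord(ch) > 0xFFFF]

def replace_four_byte_chars(text, replacement="\ufffd"):
    """Swap four-byte characters for a placeholder so a three-byte column accepts the text."""
    return "".join(replacement if ord(ch) > 0xFFFF else ch for ch in text)
```

Replacing characters loses information, so this is a workaround rather than a fix; handling the exception gracefully, or moving the database to a four-byte character set, addresses the underlying issue.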
If it's hackers, and you've identified that because there's been a traffic spike and it's the same IP that keeps hitting the site, then an obvious thing is that you could block IPs. Unfortunately they might change the IP, so yes, you could do some short-term things. For this one, I highly recommend a CDN. Cloudflare is great, and maybe others like Fastly, I don't know as much about them, but they also act as an application firewall where they will try to keep out those bad folks. They'll monitor things much better than we can, and then they can just do the job. Sometimes you'll still get some leak-through. Sometimes it's just one person, it's not even software, it's just one person trying to be really annoying and trying to get in, and then you can block those manually. If it's hosting or the CDN or some other thing, there's not really much you can do other than open the support tickets, so we go through that process and monitor those, and hopefully they'll fix it as fast as possible. If you screwed up the code or the configuration or that kind of thing, then ideally you can update the code or update the configuration and get back to where you were without too much difficulty. You might have to go through this hotfix process, depending on where the rest of your code is in your workflow, and that's why you need to make sure that's well documented. Sometimes things are just FUBAR, in the original definition, and I won't say exactly, but basically pretty messed up. At that point there's not much else you can do. You have to recover from your backups.
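For the "same IP keeps hitting the site" case mentioned above, a quick tally of client IPs from the access log is often enough to spot the offender. This sketch assumes the client IP is the first whitespace-separated field on each line, which is true of the default Apache and Nginx log formats but not if the site sits behind a proxy or CDN that logs its own address instead. The function name is made up.

```python
from collections import Counter

def top_clients(lines, n=3):
    """Return the n most frequent client IPs as (ip, request_count) pairs."""
    ips = Counter(line.split(maxsplit=1)[0] for line in lines if line.strip())
    return ips.most_common(n)
```

The result is a short list to review by hand before blocking anything, since a busy IP can also be a legitimate crawler or an office full of editors behind one address.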
Knock on wood (I don't think that's wood), I haven't had to do this on a major site. I mean, I've done it on little sites, and it was not a big deal, we had backups and stuff, but I haven't had to do it on a major site. But if you don't have the backups, then that's kind of a problem. I was in a situation where I'd given someone the task of moving my old site, kristin.org, which I haven't touched in years, unfortunately. I'm like, "Why don't you just move this from this hosting company to this other hosting company?" because we were trying to clean up and I figured it was a good learning experience. And then I get this chat back asking about a backup of that, and I'm like, "Well, why?" "Well, I deleted all the code." I'm like, all righty then. Fortunately I did have that. It was a site that I hadn't touched for a long time, but I had some backup squirreled away somewhere and it was fine. But I was like, oh yeah, maybe I should make daily backups of that, or monthly, yearly, I don't know, something. So you need the backups in order to recover, super important, and people need to have access to those backups and know how to restore them. This is another plug for using managed hosting, where it's all just kind of done for you. It's awesome when you don't even have to think about it, when you can't even disable it, and backups just happen. So ideally you wanna try to prevent some things. That would be good. You're not gonna be able to prevent the hosting having problems, or the CDN having problems, or some third party having problems, but you can try to prevent some of these disasters.
So, some things. I actually just mentioned it: managed hosting. That is super important if you can afford it. Unfortunately we get some non-profits that work with us sometimes and they just have no money. The thing is, it's actually more expensive to have developers going in and doing your system administration, dealing with the backups that didn't happen, trying to recover, or moving code around. It costs money to pay people to do this stuff, so in the end it's really cheaper to do managed hosting. It's just that people don't see it that way. So I'm a big, big fan of managed hosting.
Make sure your backups are working. Use a code repository. If you're using managed hosting you'll be using a code repository because it's all built in, but if you're going to roll your own then you still need to do that. Most people use Git these days, but whatever, just use something so that you can go and see the history of people checking stuff in and make sure that something bad didn't happen.
You're going to want to track and tag releases. Even if it's something small, you want to make sure that you tag the code and say, okay, here's this release, this is rolling out. On managed hosting that's usually done for you as you push stuff along, but if not, then you tag it yourself. Another thing: if you are using New Relic or some tool like that, you can also notify that tool that you've just done a deployment, and then you'll get a little line in New Relic and a little log entry saying, okay, I've just deployed this. This is great because then you can see, oh wait, pages have gotten slower in the last week or two, and oh that's right, we pushed out this one little fix two weeks ago or whatever, and oh yeah, the timing aligns.
I bet you there was something that happened in there, and then you can go back and look at your tag and say, well, what did we add, and narrow things down.
It's very important to have some sort of workflow for your development: dev, test, live, ideally. Sometimes people don't have that, but have at least a dev and a live. And this is not local to live. Ideally you have another environment besides live where the client or the organization, someone who is invested in that site, can go in and do QA, and that's before you push it out. Test and back up before you push to live.
Personally, I like to look at New Relic usually weekly just to get a pulse on whether anything interesting is going on, because I have alerts, but it's not like I can set up an alert for every possible weird thing. So I'll just click around, look for any kind of spikes, drill down, and see if there's anything strange going on, and that's super helpful. Similarly, and this goes along with Pingdom or New Relic or something along those lines, look at your load times in general. Maybe you're not noticing a change because you're looking too closely, but if you expand out and look over a month, two months, three months, six months, are you seeing your load times trending up? Then maybe you just keep adding more features and more features and your site is getting a little unmanageable. So you can take that back to whoever's in charge and say, look, we've just been adding, and we need to spend some time doing some performance enhancements. Maybe some of these features you don't need anymore, and that kind of thing.
Some other things you can do: configure caching. Out of the box in Drupal 7, caching is not enabled. In Drupal 8 it is enabled out of the box. So if you're using Drupal 7 you'll have to go in and do that manually.
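To make the long-term load time check mentioned above a little more concrete, here is a minimal Python sketch. It is only an illustration and not anything shown in the talk: the function name, the sample numbers, and the 0.05 seconds-per-month threshold are all made up, and the averages would in practice be exported by hand from whatever monitoring tool you use. It fits a straight line through the monthly averages and reports the slope.

```python
def load_time_trend(samples):
    """Return the least-squares slope of load time per period.

    `samples` is a list of average load times (seconds), oldest first,
    one per period (for example, one per month).
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2            # periods are numbered 0..n-1
    mean_y = sum(samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Hypothetical monthly averages in seconds, not real measurements.
monthly_avg_seconds = [1.8, 1.9, 2.1, 2.0, 2.4, 2.6]
slope = load_time_trend(monthly_avg_seconds)

# Hypothetical threshold: flag anything creeping up by more than
# 0.05 seconds per month.
if slope > 0.05:
    print(f"Load time is trending up by about {slope:.2f}s per month")
```

A positive slope over several months is the kind of evidence you could take back to whoever is in charge when asking for time to work on performance.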
And then there are other things like Varnish and the CDN or whatever that you can use as well.
Something that I've run into, which you might not think of: we write a lot of cron code. Cron is code that you're going to run periodically. You need to be careful with it, because those jobs can be very heavy, and if you have a lot of cron code crunching stuff while other people are on the site, you could have an impact. So monitor those, and you can spread them out. There's a module called Elysia Cron which can help you spread jobs out throughout the day, or put them in slow times of the day or night for your site.
So this is hard. This is a hard one. I have found that I can't seem to get my sites down to under 150 modules once I include feature modules and core modules and all that. We build complex sites and it's this constant struggle. Oh, we want this feature. Oh, well, there's a module that does that, of course. So do you really want that feature? Are you sure you really, really want that feature? Okay, well, I guess we'll add that feature. But the thing about that is you need to be constantly checking in and asking, do we still need that? Six months from now, are we still using that feature, or can I actually get rid of it now? And be careful how you uninstall modules.
So update things regularly. Definitely stay on top of security patches, but keep on top of your regular updates too. You kind of think, well, it's just a regular update, things are working, I don't need to worry about it. But what happens is there'll be another update, and another update. If you wait too long, sometimes really wacky things will happen and it just doesn't update seamlessly, because there's this interaction between different modules, right?
And so if you waited a year to update a module and there have been like 10 releases in between, and the same for some other module, there may have been some interplay that you're not aware of, and things can get weird.
So, proactively fixing errors in the logs. If you're getting those emails saying fix me, fix me, fix me, you're more inclined to try to fix it, because it gets rather annoying to have those emails.
Something that we've done is try to auto-block IP addresses, and we wrote a little bit of code for it. We're considering trying to generalize it and release it as a contrib module. The idea is that if someone keeps getting a 404 a lot, or a 403 a lot, it's questionable. What are they doing? So we wrote a little bit of code that keeps a count of those in the database, and after a certain number of them it will actually just auto-block the address. It sends us an email so we can go and take a look. Typically we'll check the IP address and it'll be something from some random country that shouldn't be looking at the site anyway, so it's like, oh yeah, that looks fine. If they're logged in, we actually let it go, because we have certain users for whom it's fine, but we log it so that at least we can contact them and see what's going on. So that's been pretty helpful.
Peer review code, if you have a peer, ideally.
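As a rough illustration of the auto-blocking idea described above, here is a minimal Python sketch. It is not the code from the talk, which was Drupal code and is not shown. The class name, the threshold of 20, and the in-memory counters are all assumptions: the dictionary stands in for the database table, and the `notify` callback stands in for the email that gets sent.

```python
from collections import defaultdict

# Hypothetical threshold; the talk does not give an actual number.
BLOCK_THRESHOLD = 20
SUSPICIOUS_STATUSES = {403, 404}


class AutoBlocker:
    """Counts 403/404 responses per IP and blocks repeat offenders."""

    def __init__(self, threshold=BLOCK_THRESHOLD, notify=print):
        self.threshold = threshold
        self.notify = notify            # stand-in for "sends us an email"
        self.counts = defaultdict(int)  # stand-in for the database table
        self.blocked = set()

    def record(self, ip, status, logged_in=False):
        """Record one response; return True if the IP is blocked afterwards."""
        if ip in self.blocked:
            return True
        if status not in SUSPICIOUS_STATUSES:
            return False
        if logged_in:
            # Logged-in users are let through, but the event is reported
            # so someone can follow up with them.
            self.notify(f"logged-in user at {ip} got a {status}")
            return False
        self.counts[ip] += 1
        if self.counts[ip] >= self.threshold:
            self.blocked.add(ip)
            self.notify(
                f"auto-blocked {ip} after {self.counts[ip]} 403/404 responses"
            )
            return True
        return False
```

In this sketch you would call `record()` once per response, for example `blocker.record("203.0.113.9", 404)`, and check the return value or `blocker.blocked` before serving the next request. A real version would probably also expire old counts over time, which this sketch leaves out.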
Sometimes you have to put a duck on the desk and talk to the duck. Or sometimes, if it's late at night and my husband's around (he codes, but he's a physicist), I'll actually walk through the problem with him even though he doesn't necessarily know all the stuff I'm talking about. Just verbalizing it can a lot of times be a really great tool for working through a problem. Sometimes he'll just say, well, what about such and such? I'm like, oh, okay, and I go and look, and it's like, oh yeah, that's great. So that's not just for code but for ideas, brainstorming, trying to figure out why you're having issues, and that kind of stuff.
And the last big one I have is limiting access. This is a big one. We've had rescue sites where we got the code and everyone has full administration rights. People who don't know anything about Drupal can edit views, change permissions, and add users. They don't have a clue and they've got everything, and then you're going to get a lot of problems. So that one's super important. You don't have to go crazy with roles, and sometimes people go a little overboard, but ideally the administrator role should be for Drupal developers, period. People who really understand Drupal. And even then, if you have a lot of Drupal developers, maybe you can pare that down. We haven't usually had to do that, but that's a possibility. Then you might have a content manager or content editor role with certain access, and maybe there's an admin on the client's side who can add a user or do certain things that are more limited. So that one's super helpful.
So on that note, I'm going to open it up to questions. There's a microphone here, or if you throw out a question I'll repeat it. I know this is the last slot of the last day, so people are like, yeah, I was at Pantheon all night and I'm totally toasted, but if you have any questions we'll open it up. Yes?
Yeah, that's actually a great idea. I hadn't thought about specifically saying, here's something I've been working on code-wise, and then just showing it to the team or the people there. So yeah, use your local user groups as best you can, and if you don't have one, start one. You just need two people, right? Meet at a cafe if need be, or whatever. That's great.
The location-based one is also great. Fortunately, on our team we usually have people from around the world, so we'll be like, hey, is this site down for anyone else? Oh yeah, it's down for me. Oh yeah, down for me. Okay, right, then let's get on it. Sometimes it will be location-based. With Cloudflare we had some problems a couple of years ago where they continually seemed to have an issue just in the San Jose area, which was kind of annoying. The company was actually based in San Jose or Mountain View, close by, so all the rest of the world was hunky-dory and having a good time, and the core folks couldn't use the site. So that's a huge, huge thing. Any other questions?
Yeah, sure, the APM? Yeah, let me jump out of that faster. APM, oh, and it's so small. We'll put these slides up on the session page in a little bit.
Yeah, I should throw that one in there. Save your bacon, yeah. In fact that's a very good point, so I should add that to the slides. The Backup and Migrate module: someone remind me to add that to the slides before I post them. Backup and Migrate is great. It's funny, because there was a client recently who said, oh, the site's down, we're having all these problems. I took a look. It was a Drupal 6 site, and I was like, ooh, sorry, looks like you've been hacked. I poked around, and I was like, oh, this doesn't look good. They didn't have a lot of money, and they were like, okay, and I said, well, you could restore from backups. I said, where are your backups?
This was not a site that we had. We did another site for them, and this was some other side thing of theirs. They're like, oh, backups, and they went and checked with the hosting company, which was some random hosting company, I don't even know which, and they said, oh, the hosting company didn't set up backups. I'm like, okay. But then I went and looked in the files directory, and sure enough, Backup and Migrate had been set up. So someone did the right thing, but they didn't even realize that they had used it and that they had backups. I was like, oh look, you've got backups, awesome, and they're like, yay. But then they decided it was going to be too much money to sink into this thing, because they weren't going to update all the modules. It was Drupal 6 and just woefully out of date. So they just shelved the whole thing at that point, but they did have backups.
Yeah, that is a really good point. I'll repeat it because you weren't at the mic: make sure that the database backups are actually valid and that you could actually restore the site from them. And this is my business partner.
Hi. Yeah, extending on that topic: you'll document emergency recovery procedures, but you don't actually do them all the time. So we actually go through a refresher, especially for some of our big clients. It's like, wow, we haven't touched that feature, just because it keeps running, but to actually recover it or handle it in an emergency situation can take some time. So we do refresh the team, especially over the holidays as we trim down our team, and we review these things. It does require practice, and sometimes the infrastructure changes and you're like, what's going on? Like, ah.
So one way to check those would be just to grab a backup, install it on your local, click around, and make sure it does seem right. Or if you have any automated tests, which would be amazing, then you can run those.
Right, so the service that's free is NodeSquirrel, and somebody took that over. Was it Pantheon? Okay, so Pantheon bought them out. NodeSquirrel is a free service that leverages the Backup and Migrate module to do that. So there's really absolutely no excuse not to have backups. No matter what your hosting is, you can set it up, and it's free and good. Any other questions or tools, tricks?
Everyone's up? Yeah, so I've used, actually I do it for load time, but there's webpagetest.org. That's really meant for load time, but you can use it just by checking from different locations, and then you'll know if it's not loading, it's pretty obvious. Sure, what other ones?
Yeah, and a lot of the monitoring tools will let you hit the site from different places. With Pingdom you can check from certain locations. With New Relic you can as well: New Relic does have a monitoring piece with a free part where you can pick some different locations and have it hit the site. I imagine these other ones do too. That's a good one, especially if you have an international audience. Then it would make sense to at least ping it from a few different spots.
Okay, so any other tips or questions or anything? Yeah? Okay, so that's awesome to know. So non-profits can get New Relic for free, and that's New Relic Pro?
Okay, so about New Relic: there are different levels that you can get. There's a free one where you only get like a day of data or something, which is good if you're having a problem and you go looking right now. But we use New Relic Pro, which is a longer-term thing where you get data for a long time, and it was just mentioned that if you're a non-profit then you have access to that for free, which is amazing.
All right, so I think we're pretty much done here. If you're around tomorrow, please come to the sprints. If you've never been to one, a sprint is where we get together and try to make Drupal better. You don't have to be a coder to do that. Anyone can help out, so that's really good. And if you could evaluate the talk, just go to the session page and there's a link there to do that. And that's it. Thank you for your time.