I'm going to talk about launching a global community for one million users. I intentionally won't mention the client's name, because that gives me the opportunity to speak more openly, without being bound by any restrictions. So I'm going to start with that. My name is Kresen Wienger. I'm the CEO of Adapt. I started developing for the web back in 1994 as a front-end developer, but that stuff got way too complicated, so I became the CEO instead. Now, how many of you here saw Dries dressed up as a leprechaun yesterday? OK, he was dressed up as a leprechaun, but you know what? He's not the real leprechaun. I am. I can prove it. Look at that. If we're both leprechauns, then I can tell you I'm the happier leprechaun, at least.

So today I'm going to tell you about Adapt. I'll do that in 45 seconds, because that's not why you're here. You're here to listen to the case. I'll tell you about the challenge we had, about the process and the solution we came up with, and then I want to focus on the learnings we took from this project. Hopefully that will leave enough room for questions afterwards.

Adapt is a digital agency. We promise our clients that we will grow their digital business. We focus on digital strategy, digital execution, and digital conversion optimization. So it's analyze, execute, optimize. We've been doing this since 1998, and we have 100-plus employees. Being around for so long with that many people, we of course have a lot of digital solutions on our conscience: 2,000-plus of them. We are now located in five countries and six cities: Barcelona, Berlin, Boston, Copenhagen, Kaunas, and Vilnius. OK, that was 45 seconds. Now we go.

The technical challenge was that we had basically two Drupal sites in one. We are talking about a software company here, a game development company, which had a public site and a community site within the same Drupal site.
The task was to take out one part, the community, and turn it into a site of its own, with its own URL. These two sites-in-one had mutated over the last five years. Moreover, a lot of different developers had been on the project during those years, and there was no development strategy. You could say they didn't even have a development plan. And, not least, there was no documentation. I know this is totally new for you; you always have documentation, of course. But this specific client didn't. So those were the technical challenges.

The business challenge was that when you run a company with more than 800 people on board, and your software has been downloaded more than 5 million times, your website is business critical. That was one of the business challenges. We also had around 1 million users providing content to this site, so it was a content-driven site. There were editors too, but most of the content came from the users. To do the migration, we needed to shut down content updates for 24 hours. And when you do that, you're working up against a deadline you can't really move, because once you announce it, you need to keep it. We weren't in a position where we could just say, hey, we need to push the deadline at least a month because we're not done with the website development. So it was really an issue, working up against a hard deadline like that.

Moreover, when you take an existing website and move it over to a new one, you will most likely lose around 10% of your Google traffic. Your traffic will just drop like that.
But if you can imagine how much money a company like this spends on search engine optimization and AdWords, dropping 10% of that traffic would be very critical. And of course, when you're facing challenges like this, the CMO is breathing down your neck, because this is a high-profile project for him. He was based in the US; I'm based in Denmark. I'm pretty sure that no matter what time of day it was, I would have gotten a phone call if this hadn't been a successful project.

OK, the process. How many here use agile development? OK, I should ask who doesn't. So I won't go into details; you know all the stuff about user stories, product backlogs, working in sprints, and so on. The only difference here was that we had to take over the project, run the community on the old site, and build a bit on the old site first, and then begin the development phase on the new site. So we also had to keep the old site alive while we developed the new one.

OK, the solution and the launch. The new site was built on Drupal 7. You might ask, why would we build it on Drupal 7? The thing is, the decision was made in November 2015, and if I recall the release dates right, Drupal 8 got its release candidate around October 2015. So we discussed whether it should be Drupal 7 or Drupal 8, but at that point we chose Drupal 7, due to the fact that the client wanted tried and tested software, they wanted the site to live for only two to three years, a short lifespan, and they wanted a lot of modules they could draw on. So now they have documented code. And by documented code, I also mean that we now have it on GitHub, we have a decent deployment setup, and we can roll back if something goes wrong.
That's also very neat. We had no loss in traffic at all; we spent quite a while making sure we didn't lose organic traffic. We managed to keep the content freeze to 24 hours, and we managed to hit the deadline as well.

So, the learnings. We would build it on Drupal 8 today. I have to be honest and say we would have put more emphasis on making the client choose Drupal 8. But at that point, I didn't realize how quickly Drupal 8 adoption would pick up. It has really been taken up much faster than Drupal 7 was when it was released five or six years ago.

Then there's closing a sprint versus finishing a sprint. We run our projects in JIRA. How many of you know JIRA? Yeah, you know it. OK, so you also know that you can easily close a sprint and say that the sprint is done. But it might not be done, done. If you close your sprint and say, this sprint is done, we just need to finish up a few minor details, and you do that across five sprints, each leaving 5% undone, then by the end of the last sprint you won't have just 5% of the project left. You'll most likely be lagging 25% or more of the work before you can actually go live. What saved us here was a conservative project plan, where we stated that we wanted a whole month to test. That saved us, basically. So if I were to give you any learning from this, I would say: don't just close your sprint, because that's easy to do in JIRA. Finish your sprint. And by finishing, I mean you should be able to show and test what you have been doing in the current sprint.

We had a launch in summertime. Well, you can make a project plan, and you can have everybody going on holiday.
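As a quick aside, the closing-versus-finishing arithmetic is just compounding leftovers. A tiny sketch, with the 5%-per-sprint figure assumed for illustration:

```python
# Each sprint is "closed" with 5% of the project's scope still unfinished
# (numbers assumed for illustration, matching the talk's example).
leftover_per_sprint = 0.05
carried = 0.0
for sprint in range(1, 6):
    carried += leftover_per_sprint
    print(f"after sprint {sprint}: {carried:.0%} of the project still open")
# after sprint 5: 25% of the project still open
```

The leftovers don't shrink on their own; they stack until launch.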
And you know exactly when they go on holiday and when they return. The thing is, you can only control your own environment. We had a client here where there was some miscommunication on their side. So when we were ready to launch and said, OK, now you can push the button, go live, the guy who was supposed to push the button was on the Autobahn in Germany, heading for a campsite. We couldn't get hold of him, and we didn't know how to get hold of him. So we basically had to call the US, the US had to call back to Denmark, and after a lot of back and forth we finally got hold of the guy who was supposed to push the button for our client. Moreover, if you want to staff up, it's really hard to do during summertime. So just a simple mental note about this.

And then I would say: document the changes in scope. This was a mid-to-large-sized project. The beauty of JIRA is, again, that it's very dynamic; you can really do a lot in JIRA. But my recommendation is that when you start, you take all your user stories, export them, and save them as a PDF. Then you have that as documentation. At the end, when you launch, you do the same, and you can match those two documents up against each other. Because with a project running for seven to eight months, you might have employees who stop working at your company, you might have people stopping on the client side, and new people coming in. And when they come in, they look at what's going on right here, right now; they don't look back. So you need to be able to document what the starting point of all this was. Document the changes in scope: that's also a pretty good learning, I would say.

OK, so with that, I have a few minutes for questions. If you have any questions, fine. You can find us at booth 201. You're welcome to visit. And I'm not the only beauty in the company.
Yeah, I have a lot of beautiful people hired as well. Woo, yes. This is our booth. So if you walk into the room, you will find us on the left side of the conference room. OK. Well, thank you very much.

Thank you, Kristen. Thank you very much. Excellent stuff. I'm Gary. Hello. I work for JetBrains. I don't know if this microphone is any good, but I'll inevitably forget to lean into it, so you'll have to shout at me if I do. Does anyone here use PhpStorm? Excellent. So I've been working for JetBrains for about a year as a developer advocate, and I'd been using PhpStorm as a PHP developer for probably four years before that, paying for the software myself. So when we were given a chance to do a slot here, what I really wanted to do was to share the stuff that I've found in PhpStorm that I never even knew existed until I started working for JetBrains.

So yeah, there's a few people here who don't use PhpStorm, right? So I'll try to start off with the obvious stuff and move on to some of the hidden stuff later. I'm so PhpStorm mad now that I even write my slides in it, apparently. I didn't completely forget to prepare a welcome slide and just scramble to do this, honestly. I've got some notes, we're all good. Let's do this. I'm going to be live coding, so if there's anything you want to ask, just shout out. This is completely fluid; I've got some things I'd like to show you, but I'm completely happy to go off on a tangent. So please do.

Can you read that OK? Is that readable? I can try and make it a bit bigger. Yeah, that's cool. So the first thing, for the people who use PhpStorm: this is the Material Theme, which is a plug-in, because people always ask what theme I'm using. And this is an extra theme called Material Peacock.
You can just find that: the plug-in is available out of the box from the plug-in repository, and the theme is findable on GitHub.

So the first thing I want to mention is that we have two things that we use to help you code. They're called inspections and intentions internally, and you'll see those words used inside PhpStorm. You may wonder what the difference between the two is, so I want to quickly run through what they are. Inspections are the parts that look at your code and tell you when there's something wrong. Did anyone go to the static analysis talk today? Yeah, so it's kind of exactly what Joseph was talking about in that talk, but right in your IDE. So I don't know if you can see over here, right in the gutter at the top. Can you see there's an amber box there? No? OK, you'll have to take my word for it then. There's an amber box up there, and it just says one warning found. In this file, we've got a really, really simple class which takes an instance of a logger as a constructor parameter. And when we create an instance of this class without meeting that dependency, the IDE is telling us straight away that there's a problem here. We can see these problems with the inspections before we even run our code, which is kind of useful. So we can easily fix that. This is an inspection: it has inspected the code and found something wrong. We can fix it by just passing in a new logger. And there's also a problem because this needs a name; this is a Monolog logger. I don't know if that's bundled with Drupal 8. Does Drupal 8 use Monolog? Yeah, I thought it would; it does use the Symfony components, so I guessed it would.

So these are the inspections. The intentions are slightly different: they will do things for you. And you can fire them up generally, or in context.
So in this instance, we're being passed a logger here, but we're not actually doing anything with it. The default thing we would do in object-oriented PHP is to take what's passed into the constructor and just throw it into a property. We can do that using an intention. If I use Alt-Enter (you should see at the bottom of the screen the keys that I'm actually using), you can see that this is a context-sensitive helper. At this point, it's saying I can do two things from here: I can initialize the fields, or I can generate the PHPDoc. Interestingly, the PHPDoc stuff will work and generate Drupal-specific PHPDoc if you've got Drupal mode enabled in PhpStorm; if you want to learn more about that, come and find me at the booth later. It should just prompt you automatically. But in this instance, I want to initialize the fields. And which fields do I want to initialize from the constructor? Just the logger, please. And you can see that the intention has created the code I need to actually do something with this logger, and it's also added any annotations and stuff I need. So these are the intentions. If you ever see the words inspections and intentions, that's pretty much what they mean.

Interestingly for inspections, and something I really like: you can run the inspections over your whole code base. You don't actually need to have the file open in the IDE to see what the inspections are finding for you. So here in the Code menu, I can run an inspection by name. And one of my favorite inspections to run on a whole code base is the PHP 7 compatibility one. If anybody's running code bases that they're interested in upgrading to PHP 7, but they'd like to know what's going to go wrong, this is really cool. So we can run the PHP 7 compatibility inspection, and I'm going to run it over the whole project.
And you can see straight away that it's found three problems in this code base, mainly because I created a file in this project so I could demo it. 'string' and 'bool' are not allowed as class names in PHP 7; they're reserved words now. But you can imagine that this will find all sorts of problems in your code base. There are differences in PHP 7 around brackets when you're chaining method calls, where the order in which things are evaluated has changed. It'll find all these little nuances for you. It's really pretty cool.

You can also run any of the other inspections. So if we look at the worst inspection in the whole IDE, which is the typo detection, the false-positive inspection, you can see that it's found some typos here. Yeah, look, it doesn't know my name. And it says I can't spell Drupal, which is actually right, so that's all good. But yeah, you can run any of the inspections across your whole code base, which is pretty cool.

Yeah, shoot. Yes, you can run the code sniffer. I'm actually going to mention a little bit about Code Sniffer in a minute, but the Code Sniffer stuff is just an inspection that passes the code off to a third-party tool and parses the results that come back. So you can absolutely run Code Sniffer on a whole project through here if you really want to. So there we are, PHP Code Sniffer validation. No, you wouldn't be able to do that; it would be all of Code Sniffer as it's configured for your project. Code Sniffer is configured globally for the project in the settings pane; I've configured it here for the entire project, so I couldn't just turn bits off for that run. What you'd have to do is update the configuration so that it's only using, for example, an XML file with the Code Sniffer rules you want, run it, and then revert back to your normal one. Any more questions?
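As an aside, the constructor-injection boilerplate that the "initialize fields" intention generates is a language-agnostic pattern. A minimal sketch of the same idea in Python (the live demo is PHP; the class and names here are made up for illustration):

```python
import logging


class ReportGenerator:
    """Hypothetical class standing in for the demo's logger-consuming class."""

    def __init__(self, logger: logging.Logger):
        # What the "initialize fields" step produces: take the constructor
        # argument and store it on a property so the class can use it later.
        self.logger = logger

    def run(self) -> str:
        self.logger.info("generating report")
        return "done"


gen = ReportGenerator(logging.getLogger("demo"))
print(gen.run())  # -> done
```

The IDE is just automating this assignment (plus the type annotations) so you don't type it by hand.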
So that's really cool. The other thing that I really like are the live templates. Let me tidy up here a minute. The live templates come bundled, and they effectively allow you to code quicker. It's almost like clever copy and paste at the end of the day, but they're called live templates because copy and paste is bad in programming, so you can't use that term. For example, I can type pubf and then Tab, and that will give me public function. That's really nice. So I just type pubf and then foo. Or I can do prif, or prof, for private and protected functions. If I want to be coding, I don't want to be typing all the stuff that I don't need to, and we can use these live templates for that. There's a ton of them bundled.

I'm using Command-Shift-A whenever I want to find a setting in the IDE when I know it's there but can't be bothered to go into the settings and find exactly where it is. You can effectively search actions and options using Command-Shift-A on a Mac. So here I can type live templates, because I know that's what I want to find, and go straight to the live templates settings. You can see these are what we have bundled in the box for PHP: foreach, private and public functions, and so on. And you can just create your own. If you want to create your own shortcuts that you can use by typing a few letters and pressing Tab, you can do that right in this section. You can also find live templates to download. I use phpspec to do behavior-driven development, and there's some stuff that comes with it; I just found the templates on GitHub and imported them into my IDE. So that's really cool.

OK, the other thing I really liked when I found it is the fact that the IDE has a RESTful web client built right into it. Well, I say it's a RESTful web client; it's an HTTP client, right?
It's not particularly RESTful, but it's a fully working, proper HTTP client. So if you're working on an API and you want to test the endpoints, you can do it from right within the IDE. For example, we can do a GET request here to api.joind.in, hit run, and you can see that we've got the JSON back. I can format that nice and neatly with this button, and I can see the response headers, the request headers, everything you'd like. I'd been using curl for this kind of thing, or standalone, even paid, tools, and it's built right into the IDE. I only found out about six months ago and thought, oh, that's pretty cool. Why didn't I know that before?

Cool. The other thing that is nice when you're working, like I am now, on battery while giving a presentation, which is kind of scary (I'm OK, 60%, we're all good), or if you're traveling: I don't tend to do any HTML or JavaScript work. I'm a back-end PHP developer; that's what I do. So quite often, what I'll do is, right down in the corner here, can you see the little man? Hang on, let me see if I can make that better. Can you see the little man's face here, with the bowler hat on? Just down here. How's that? Can you see that? OK, so he's casually named Hector the Inspector, and he's a way that you can quickly and easily turn up and down the level of inspections, syntax highlighting, and help that you get in the IDE. The reason for that is the parsing of the files is quite intensive: quite processor intensive, quite battery intensive. So if you fly in and you want to finish something off, or just play around, and your battery's on 20%, you've got this power save mode that will wind everything right back. What it basically does is mean that you need to invoke any actions yourself; it won't automatically monitor as you type.
So if I want code completion, I need to physically use the key combination to invoke code completion rather than getting it out of the box. You can see it still works, but it doesn't prompt me automatically. That's still good enough for most situations. Also, when I'm working in new code bases where I only want to do a pull request on something and I don't care about the HTML, I typically just turn the HTML highlighting and inspections off. I don't really care about that stuff, and it speeds everything up a little bit. That's kind of a handy thing.

So the last thing I want to show is external tools. Does anyone here use PHP CS right in the IDE, as an inspection? So that's really, really useful, but it can sometimes become really painful, particularly if, like me, you do a lot of open source work. I'll pull in projects that I want to run CS on, but I don't want it running constantly while I'm actually working. So what I tend to do is use the external tools functionality in PhpStorm. If I type external tools: basically, you can define external command-line tools as tools in PhpStorm, and then you can bind shortcuts to them; you can do everything as if they were part of the IDE. Code Sniffer and the Code Beautifier are perfect for this. So here you can see I've set up Code Sniffer as an external tool instead of integrating it into the IDE, so I can run it on demand rather than have it on all the time. All I've done, basically, is install Code Sniffer globally into my global Composer folder. And this working directory, you can insert using Insert Macro, and it'll tell you what you can use.
So basically I've just said: when you run this (because this is set up globally for PhpStorm), look for the configuration in the root directory of the project. That's all I've said. I can then run that using shortcuts, because if I go to the keymap, you can see that down here we've got external tools, and anything I set up as an external tool is available to have a keyboard shortcut bound to it. This is super useful. If you often find yourself running the same command-line commands over and over again, this is a great way to stop having to do that. So my PHP CS is bound to Alt-Shift-P. If I press Alt-Shift-P, you can see that it's opened and run PHP CS for me, and actually told me there's a problem here. It's telling me I'm a terrible developer, basically. That's fine. I've also got the PHP Code Beautifier, which is the way you can automatically fix CS failures, bound to another shortcut. So when I do find these problems, I can quickly fix them, if they can be fixed automatically, using a different key combination. I know some people who've used external tools for things like rsync, because they develop locally and then want to rsync everything up to a testing server: they have an external tool set up to run that command and push everything to development. That's pretty cool.

So we've got a couple of minutes if there's any questions. Yeah, shoot. So you could definitely do that. Because you can pass in parameters here, you could set up your own key combination for PHP CS where the parameter was a different config file, because you can pass a parameter into PHP CS pointing at a different ruleset. Yeah. No, it's fine. Any more questions? No? Cool. Well, thank you very much, and I'll be at the booth today and tomorrow morning.
So if you've got any questions, or you just want a demo of any of this stuff, just come and find me. Thank you very much.

Hey, hey everyone. Thanks for coming and for joining us today. I will be talking about enterprise Drupal application and hosting infrastructure level monitoring, and yes, that's as catchy as the name gets. If you ignore the word enterprise, pretty much everything applies the same as what you're used to, but I would like to scratch the surface and talk about how we at SiteGround, a web hosting provider, approach huge hosting infrastructures, and how we manage to keep everything up and running without interrupting the work of the websites. My name is Daniel. I'm a senior site reliability engineer at SiteGround, and I've been working there for the last nine years. Right now we host about 30,000 Drupal websites, and we focus primarily on huge projects. So let's get started.

What is enterprise Drupal hosting in the first place? It usually consists of multiple servers. You usually need to provide high availability: if one server goes down, you don't care; the application just continues to work as expected. It offers auto-scalability: for example, we use Linux containers for everything, and whenever an application server or database server needs more resources, you can allocate those automatically, or just add another physical server to the cluster and allocate its CPU or memory to a container somewhere in the cluster. And it usually requires multiple services to work as expected; such projects usually need Elasticsearch, Apache Solr, MongoDB, Node.js, you know how it is. And the really interesting thing, which no one talks about: it's really expensive, and no one wants to manage it, because it's really complicated. So when I talk about complexity, I would like to mention the traditional hosting that you are used to.
You get shared hosting, and on shared hosting you probably cannot expect any of the things that I mentioned. After that, you can get a single virtual server, a single dedicated server, or platform as a service. Platform as a service is probably the best thing to go for; but if the platform does not allow you to use Node.js, for example, because it's not part of the stack, or Elasticsearch, you have to do it yourself. Usually you have to get a server from another company and connect to it, which is not what you want to focus on, because you want to focus on development. The most complex setups are our custom private or public cloud infrastructures, because what we do for our clients is build those infrastructures specifically for them, and after that we have to monitor them.

So here is an example. You have load balancers, you have public or private switches for the client (we control all of that), you have replication of the database, you have backup servers, shared storage, and a pool of application servers. The idea here is that if one of those servers goes down, you don't care: you can just spin up another server and it takes over; it joins the cluster automatically. It gets even more complicated when you introduce the custom services that the project might need. Let's say you're using GraphQL for your API, or you're using MongoDB to store some data. If that MongoDB service is running on only one server, then you have a huge problem, because it's a single point of failure. And that's exactly what we want to avoid, and that's what we're doing with our monitoring.

The monitoring is split into two parts: the first is monitoring the websites that are hosted, and the second is monitoring the hosting infrastructure. I'm just going to scratch the surface and give you the architectural design of how we approach that problem.
And I'm hoping that some of you will dig deeper after that. The website monitoring part is pretty familiar to some of you. Usually what other services offer is that you can check the website from multiple locations, and whenever the site goes down from one of the locations, you get a notification. But for us, that's not enough. Let's say I'm in Amsterdam, and the internet service provider there is having issues with their BGP and cannot reach the server, which is in London. That's not a problem with the hosting infrastructure, and the client is not particularly interested in knowing that the website is down from there. But he could be interested to know if the website is simultaneously not accessible from more than one location. So because we have four data centers around the world, we check the website simultaneously from several locations, and after that we cross-reference the information and send a single notification. We know that it's down from Amsterdam, but if at the same time it's down from Munich or London, then we can alert that there is a huge problem.

We have defined incidents by severity. A critical incident, for example, would be the website being down from all locations. A major one could be that the website is down from only a single location, or the MySQL replication is broken, there are PHP fatal errors in the logs, or the file system is in read-only mode. Of course, the client can say, I would like PHP fatal errors to be a critical incident for me, because they want to get those notifications right away. That is also something that can be configured.

So what do you do with so much data? We have some core concepts that we like to follow. Here is a simple infrastructure for one of our big clients: you have two load balancers, and if one of them goes down, the other one takes over the IP address of the first one.
So the website is pointed to two IP addresses, and both IP addresses stay active, because we control the network and can move IP addresses between machines. After that, you have application servers and storage, and in that example you have a MySQL master and a slave, where by default the slave is there only for fault tolerance: no queries actually go to the MySQL slave.

Some core principles related to our monitoring; I think everyone should be doing this, or at least you should be looking for hosting providers that already do. Log all the events and archive them, because you are going to need that information at some point; when something goes wrong, we always write postmortems to find the source of the issue. Check every single incident, even the minor ones and the notices, because what we found out when we launched the system is that most of the issues that happened were cascade issues: a single server fails, which brings the load on the other servers up, which eventually causes another outage. A single server going down won't affect the website, but after that it may cause a bigger problem. If you don't check the minor incident, you could have big problems, say, two days later. Beware of those cascade failures: if you analyze a single incident, you will know whether it may cause a bigger one. And always strive to go back and restore operation to the state it had before that single incident.

One thing we do is always check one thing at a time; we try to make each checker as agnostic as possible. Let's say I would like to check MySQL: we have about 14 different checks related to MySQL. How many connections do you have at the time? How many slow queries have you recorded in the last half hour? What is the memory consumption? How many threads are running on the MySQL server? And define your limits.
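A single-metric check with a defined limit, as just described, could be sketched like this (the metric names, limits, and 90% threshold are illustrative assumptions, not SiteGround's actual values):

```python
def check_metric(name: str, current: float, limit: float,
                 warn_ratio: float = 0.9) -> str:
    """One agnostic checker: one metric, one limit.

    Returns 'exceeded' past the limit, 'warning' at or beyond 90% of it,
    otherwise 'ok'.
    """
    if current >= limit:
        return "exceeded"
    if current >= warn_ratio * limit:
        return "warning"
    return "ok"


# e.g. max_connections = 500, with 460 connections currently open:
print(check_metric("mysql_connections", 460, 500))  # -> warning
```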
When you define those limits and you see that the number of connections is going beyond the limit, or you have reached 90% of it, alert someone — alert the database administrator, who will be able to check what is happening there. So here are some examples of cascade failures that have actually happened. What we did when they happened is we analyzed every single incident, went back in time, and then automated the recovery for those issues. In the first one, you have five application servers and one of them goes down. The load on the other four increases by 20%. After that, someone invalidated the Redis caches, because they needed to restart Redis on every one of those four application servers, which caused overload on those four servers. At that moment the fifth server was still down, and Varnish was also restarted on the load balancers. We have full-page caching for Drupal on the load balancers, and when Varnish was restarted the cache was automatically flushed. That generated even more PHP executions on the four servers. What happened is that all the application servers started to return 503 errors — which means that if someone at step number one had checked the incident and brought up server number five, which went down, this could have been avoided to some extent. The other example is related to MySQL. Let's say you have a master and two slaves, all working together. The master goes down, the first slave takes over and becomes the master. At that moment the queries are split between just two servers and not three. The second slave could actually be behind the new master, and if the new master fails again, you will have either a broken replication — your database will be behind by two hours or so — or you will not be able to recover at all. So it could look like a minor incident to you that one of the servers went down.
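The MySQL cascade above is essentially a replication-lag question: a lagging slave must not be promoted after a second failure, or writes are silently lost. A minimal sketch of such a guard, with an assumed threshold and function name (a real system would read `Seconds_Behind_Master` from the slave's replication status):

```python
# Hypothetical sketch: refuse to promote a slave that is too far behind
# the master, rather than recovering into a database missing recent writes.
MAX_LAG_SECONDS = 60  # illustrative threshold, not from the talk

def safe_to_promote(seconds_behind_master):
    """Return True only if the slave is close enough to the master.
    None means replication is broken, which must block promotion."""
    if seconds_behind_master is None:
        return False
    return seconds_behind_master <= MAX_LAG_SECONDS
```

A check like this is exactly the kind of "minor incident" inspection the speaker recommends: the first failover still works, but the guard surfaces the lagging slave before a second failure makes recovery impossible.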
Okay, the application is still running, but someone has to check that and bring the server back up. So this is a screenshot from our monitoring system, and this is how a single incident looks. You can see that PHP fatal errors were detected, and you see the exact time of the incident — when it started and when it ended. So for one minute here, some PHP fatal errors were recorded in the log files on that specific server, and not on any other application server, and then you see the exact errors. That is available through an interface to our clients and also to our system administrators. And this is another thing that we do. As I mentioned, you have to collect all the data — it's really important — and we use Highcharts to draw these graphs. Here you see the CPU consumption and the memory consumption of those four application servers, and the purple lines are code deployments. You can see that at the first code deployment the CPU load of web server number four went down, that there was no effect from the next code deployment, and that when another code deployment was made, the CPU load on that server went up again. So what you can do is correlate your information: for example, if you have a lot of session-related errors in your PHP log files, you can look at the deployments and find out, okay, that specific commit over here caused those specific issues. I would like to wrap up with some key takeaways. Embrace failure and always design for failure. One thing we're striving for as our next goal is to never SSH into a server again: if a server fails, we just kill that server, spin up another one, and it all works out of the box and automatically joins the cluster. Automate recovery — for the issues I mentioned, we worked to automate that, so if the same thing happens again, we have scripts that will try to solve it.
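The deployment-correlation idea — drawing deployment lines over the CPU graph — can also be computed directly. This is a simplified sketch with an assumed data shape (a list of `(timestamp, cpu_percent)` samples) and an assumed window size, not the speaker's implementation:

```python
# Hypothetical sketch: compare average CPU in a window before and after
# a code deployment, to spot deployments that changed server load.
def cpu_change_after_deploy(samples, deploy_time, window=300):
    """samples is a list of (timestamp, cpu_percent) tuples.
    Returns the average-CPU delta across the deployment, or None if a
    window has no samples."""
    before = [c for t, c in samples if deploy_time - window <= t < deploy_time]
    after = [c for t, c in samples if deploy_time <= t < deploy_time + window]
    if not before or not after:
        return None
    return sum(after) / len(after) - sum(before) / len(before)
```

A positive delta after a deployment, like the web server number four case in the talk, points straight at the commit that was just rolled out; the same comparison works for error rates or failed logins.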
If they fail, they will alert us at that moment. Log all incidents, and measure and graph the performance of all components. One thing that I really love is to regularly break things on purpose, to see if your monitoring and recovery are actually working — don't rely on chance. For example, just bring down a whole MySQL server and see what happens. If you have issues, then automate the recovery, so the next time it happens you won't be affected. And if you're interested in learning more, I strongly advise you to read those blog posts. There is a book about site reliability engineering here, and the last blog post is from Etsy, where they actually introduced this concept of putting graphs one on top of another, so you can see, when code deployments are made, how they affect your CPU consumption or, for example, failed login attempts. Well, join us for the contribution sprints tomorrow, and thank you. If you have any questions, just grab me in the hallway.