All right, so again, Campbell, Ernest and I, we kind of deal with Varnish on a daily basis, so we came up with this idea to do this lab. Like I said, it's gonna be a little bit hands-on, a little bit through slides. We're gonna try to mix it up a little bit and get you involved. And like I said, whoever ends up with the key last gets to keep it. So again, about us: you already heard about this, so keep moving.

We want to start off with a survey of where everybody's at. Who here has done a VCL before in Varnish, anybody? Good, we got a couple, awesome. Who here is familiar with the different layers of the Drupal stack, so Redis, Memcache, APC, those different caching layers? That's better. Okay, good, so we're in the right place.

The stuff I'm gonna talk about is gonna be a very high-level overview. It's gonna be more architecture-type things, things that we do as a hosting company and as someone working with Drupal a lot. We look at it from a system administration point of view; they look at it from a development point of view. So you're gonna get both sides of the coin.

A couple of overviews of technology that we use: I'm gonna go over some of the architecture and get you to install this first one locally. Then we're gonna go through and do basically Drupal and Varnish type setups: building a VCL from scratch, having some edge cases that we can show you, letting you siege this on the VM that you have on your laptop, and then how to troubleshoot those, and give some other examples.

So again, this is gonna be overview for some of you that are here, but I have to cover it, you know, from the ground up: different life cycles, the HTTP protocol, right?
So CDN caching: you can use Varnish as a CDN. Basically, by doing that, you're allowing yourself to have this offloaded, so Apache and Drupal aren't actually the things that are getting hit as hard. You want these cache layers to happen, and you want to do them in layers, so you don't want to have just one thing being the end-all, be-all of your caching.

SSL offload presents a way to cache your SSL traffic. You can basically put something in front that terminates your SSL and pushes that into Varnish, or pushes that into something else, and then you can cache it at that level as well. So you can get around your SSL issues.

Load balancing out that traffic, making sure it goes to multiple web heads, just using Varnish in general, obviously the Apache stuff that is there: we're not going to get into the holy war of who uses nginx versus Apache versus lighty versus all those things, but we'll talk about different things that you can do to make those things go faster. Highly available solutions, again: using load balancers, offloading, just trying to keep things load balanced, whether you use Varnish for a load balancer or something else like IPVS or Pound or something like that.

So the biggest problem that you have is, as you add more modules, things get slower, right? Sometimes you can't avoid this. So how do you get around it? You try to cache as much as you possibly can on the front end, or in the mid tiers.
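A rough illustration of the layering idea above, as a minimal Python sketch: a request is checked against each cache tier from front to back, and only a miss in every tier reaches the slow back end. The `Tier` class, the `fetch` function, the tier names and the `/heavy` key are all hypothetical stand-ins for illustration, not anything taken from Varnish, Memcache or Drupal.

```python
from typing import Callable, Dict, List


class Tier:
    """One cache layer, for example a front-end cache or a mid-tier store."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.store: Dict[str, str] = {}
        self.hits = 0


def fetch(key: str, tiers: List[Tier], backend: Callable[[str], str]) -> str:
    """Return the value for key, checking each tier from front to back."""
    for i, tier in enumerate(tiers):
        if key in tier.store:
            tier.hits += 1
            value = tier.store[key]
            # Refill the tiers in front of the one that answered.
            for front in tiers[:i]:
                front.store[key] = value
            return value
    # Every tier missed, so the slow back end does the work once.
    value = backend(key)
    for tier in tiers:
        tier.store[key] = value
    return value


backend_calls: List[str] = []


def slow_backend(key: str) -> str:
    """Stand-in for the expensive page build (hypothetical)."""
    backend_calls.append(key)
    return f"rendered:{key}"


tiers = [Tier("front"), Tier("mid")]
first = fetch("/heavy", tiers, slow_backend)   # misses everywhere, hits the back end
second = fetch("/heavy", tiers, slow_backend)  # answered by the front tier
print(len(backend_calls))
```

In this sketch the second request never reaches the back end, which is the point of not relying on a single cache layer: if the front tier loses the entry, the mid tier can still answer.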
So some things that people try to use are Memcache or Redis to cache out the database here. This is, again, the caching in tiers. While you have Varnish in the front, you're also doing caching at APC and Memcache in the center for PHP, and then you also have caching going on inside the database as well.

In the case of MySQL, the most common thing that we find is you want to turn off your query cache if you're using something like MySQL 5.5 or higher. It generally hurts you. Do the caching in Memcache, do it in Redis, do it somewhere else. Don't do it in MySQL. It just slows things down; it actually does I/O waits a lot more. And make sure you're using InnoDB buffer pools as well. If you want to use MyISAM for caches, you can do that through your cache tables. There are some performance increases there, but if you want to just use InnoDB across the board, that's probably the best thing to do.

So, an optimization approach. This means: don't look at the very small things.
Don't go for the very small little problems that you see. Don't go for the things that are going to save you a millisecond or two of optimization. You want to basically optimize from the bottom up. The HTTP stack is so dense, and in a Drupal stack you have so many places you can actually cache. Varnish is just one piece, and that's what we're going to talk about a lot today.

As we call it in the office, play "chase the bottleneck". You always want to go after the largest bottleneck you have. Most of the time, it is either some function that you can find with XHProf, or there's something that is missing Memcache or Redis, that is too large; it's slipping past and you think it's being caught. Those types of things you have to go and look at, and you have to instrument, basically put instruments in place in order to go and find those things. So use things like apachetop, or a MySQL top, or MySQL bench, varnishtop. There are different tools, and we'll get into a lot of those here in a little bit.

So just, again, MySQL basics: use MySQL 5.6 or 5.5. Do not use 5.1. In 5.5, the InnoDB default is a thousand times better than InnoDB in 5.1. In 5.6, again, the same thing: they continue to make more and more improvements. If you want an alternative, go to MariaDB 10.0 or above. It is equivalent, actually in most cases faster. It's a drop-in replacement.
It works great. Again, a couple of things that most people forget to look at are the InnoDB buffer pool, the log file, flushing: those all have to be set properly. If you're hitting the top of your log or InnoDB buffer pool and it can't cache any more tables, then you're basically hitting the top of your cache and it's going to break. So that's what your tools have to instrument for. And then there are other things that you may want to look at, but they're on a per-connection basis typically, or a per-application basis, like foul word table, things of that nature.

I'm not going to go through all these, just because it's not really the theme of this, but more to get it out there: there are a ton of ways to make MySQL scale. Tungsten; you can use master-master replication; you can do read slaves; on and on.

So Memcache and Redis. Who here has worked with Redis before, anybody? Good. Use it if you're not, because it's actually a really good tool. It survives reboots, so you don't have to worry about your caches getting corrupted and basically flushed. There's a lot of compatibility; D8 is actually starting to write a lot of stuff for that as well. For D7 you have a couple of modules.
I don't use it. Again, back to what we were saying before: slim is better. So turn off as much stuff as you can, make it as clean as you possibly can, and continue to try to not have to cache as much as you can, because the less data you're pushing out, the less overhead you have, and the less you're going to need to have cached on the front. Be aggressive.

If you're still using D6, God help you, but use Pressflow. If you're using D7 or newer, you can use Pressflow, but honestly, most of the time most people don't. And again, just to round up the whole architecture: use Solr where you can for search; don't use Drupal search.

So on the HTTP stack, we have Apache 2.4. Use that, because 2.2 is obviously getting older. If you can use 2.4, do it. If you have really, really resource-constrained servers, then consider doing horizontally scaled virtual instances. mod_php is still kind of the gold standard for us. There are a lot of people who use nginx, but we're seeing that as scale goes up, especially if you have Varnish in front of it, it doesn't really matter if you have nginx behind it or Apache. The AllowOverride thing: just set it. Basically it saves you a couple of steps of Apache processing. So again, it's all about just getting as much as you can out of your stack. And again, do we care what web server we're running? Probably not.

So with Varnish, basically, the trick is: what's Varnish doing?
Varnish has pretty much been the gold standard, again, of making Drupal issues disappear. As you have problems with your page load times, you have a hideous query that's running and it takes forever. How do you hide that? We only run it once every five minutes or something like that, and you basically have TTLs in place to cache all those queries.

Things you have to look for: make sure your HTTP headers are set correctly so that you have the time to live on your cache set. If it's not, then you're probably gonna hit your back end more frequently, and then you're basically gonna have more load on your back end. Same thing with cookies: make sure that it is actually getting hit instead of missed. Go through your VCL over and over again. That's the important part.

If you're using mod_security, which some people do, there's actually a Varnish replacement for mod_security. You can run it completely in Varnish. So now you can basically put host-based IDP, or intrusion detection, or cross-site scripting hacking, all that stuff can be pushed into Varnish now. You can actually go through and filter all those bad things out before they even hit your application at all. So if you're concerned about security, or if you're having compliance issues, things of that nature, that's probably someplace you want to go look.

Let's see here. So, enough about the slides. Who here has run VirtualBox before? Awesome, that's good. I was concerned about that.

All right, so this VM I basically built, it was on this DVD. I have the installers for OS X and Windows as well, so if you have those already on there, great. It takes two gigs of memory, four gigs of disk space.
So make sure you have that. Make sure you have virtualization support, obviously, in your CPU. And then obviously bravery helps, because you're gonna break it, and you're gonna break it hard. It's not gonna break your laptop, but it'll break the VM. If you don't want to do this, obviously you can follow along with us as well. We'll be doing it up here on the stage.

So, the USB keys and the DVDs I gave out: if you copy that whole DVD over onto your desktop, the installers are in the VirtualBox directory. Go ahead and install that. There's obviously a Windows one, and there's also the OS X one. If you're running a really old version, I'd recommend upgrading to this version. Who here doesn't have a DVD or a USB key and needs one?

So while people are copying things over still, and definitely share those USB keys around: Jason, I know you just threw out a ton of different acronyms, and that was a lot of information very quickly. So this is a good time for anybody to say, what was that? What is Varnish? If you have any questions... does all this make sense to everybody? That was very fast for me. I'm quite impressed if everybody... Awesome. All right, well, if anyone can remember any of the acronyms to ask about, I know Jason is very good at explaining them. But we're about to go through this in practice and actually set this all up together.

After you get installed, there's a Varnish 101 directory in there as well. Just go in that directory and unzip that file, the v3 file. It's going to explode out to about three gigs or so. And then inside of there, there's gonna be a CentOS 64 .vbox. Once you get that, just right-click on it. On Mac, I know you can right-click on it and say "open with", and then VirtualBox.
I think it's the manager to start, I think it is, and Windows is probably much the same thing. So raise your hands if you have problems getting that up, getting that started, and then we can come around and help you as well.

In the meantime: acronyms. There were a lot of them in there. Anybody have any initial questions on what I was talking about up there? Because we're gonna go through some of this stuff. Not so much the Redis, but we actually did something with Memcache instead. Go for it.

So typically we're seeing... the problem with the query cache in MySQL is that it causes it to actually go to disk. Once the buffer in memory fills up, it starts using disk, so there are seeks that happen. With Redis or Memcache, you don't have that, because it's all memory at that point. So typically, on a highly loaded site, and I think we're actually going to show you here, we go from like 10 to like 40 simultaneous connections on just this little VM. So about a four times increase, roughly. You're gonna see that as soon as we do this in practice a little bit, too.

Show of hands, who's already got this set up and running? Right, good. Talk amongst yourselves. Yeah, this is pretty good, if it is working for everyone. Thank you, BlackMesh. Awesome, thank you.

Anybody need a USB key? I'm not gonna throw it, but it's gonna run. All right, anybody on the other side of the room need a USB key, so we can make Ernest comically run over there? I'm kind of disappointed.

That's coming up in the next set of slides, but the login is root, and the password is varnish101, all lowercase, so, one zero one. Okay, if you haven't started that, raise your hand. And everyone who's got it already on the machine and running? Anyone who's not listening? Yeah, all right, let's go ahead. If you're having issues, just raise your hand.
I'll come up. All right, is everyone ready to get going with the practical side of this? That was a lot of information very fast. Okay, so I want to stress that I want you to interrupt me with questions and theoretical situations. I really like hearing things like, "But this is gonna fall over when somebody does x, y and z," and hopefully half the time I'll get to say, "Yeah, just give me two slides and we'll get there." But I really want this to be more interactive than just a session talk. If we do two hours and 15 minutes of just us up here talking, I'm gonna fall asleep. I don't know about you guys.

All right, so in this VirtualBox we've got CentOS 6. We tried to set it up so you won't have to download anything, because, conference internet. We've already given you the username and password for root. The Drupal install on there has username and password admin/admin. Very simple. We've tried to make sure that everything is installed with at least close to the normal configuration that you'll get out of the box when you just do yum install, whatever the package name is. As we go through and customize the configs, we've actually left copies of our customized configs in root's home directory. You can see them in root's stuff. And the Drupal site that's on there has about 16,000 nodes, give or take, which is gonna be really fun coming up.

So the username is root, password is varnish101, one zero one. I'm so happy somebody got this. I wasn't sure if people were gonna get that or not, and I was ready with a song and dance about it. It's a drush in a box, but yeah. Right.

So, just some information so you don't have to poke around; you may have already figured this out. We have Apache serving Drupal from /var/www/html on port 80. That's mapped to port 2280 on your local host.
So from your actual OS X or Windows browser, you can access localhost:2280 to see the Apache version of things. We have Varnish; I think by default it's actually not running yet, but it will be running on port 8080, which is 2288 on your local host. We've already got Memcache and APC downloaded and installed, and there are a ton of disabled Drupal modules, basically anything I could think of that you guys were gonna ask us about or that we were gonna tell you about. So with questions, we can all enable these things and hammer our own sites and see how this works.

All right, so I just want to make sure that everybody can access their local site through Firefox, whatever browser you use. Has anybody got any trouble opening up a web browser and going to localhost:2280? One person, two people. You're the only one who can't get the whole thing started. I'm sorry.

The port here... in fact, what I'm gonna do, if I switch over to Firefox... The slides, yes, they'll be available after the talk, and also, apparently the whole thing gets videoed. So if you ask a dumb question, it gets preserved forever.

This is what it should look like when you're starting your VM. Anybody else having trouble or having questions so far, while I wait for this to boot? Go ahead. Edit... it'll be in .bashrc, I think, or somewhere similar. We'll just have a look right now. I like colors. Here you can see what I've got up on the screen: color prompt. Just disable that. I actually don't like the colors that it does in VirtualBox either, so I'm gonna SSH in. I'm sorry.

All right, mostly everyone is with us so far. We can all access the site in our browser. Are you on a Mac? The left Option key immediately beside your spacebar, and I think Tab will take it out. The Drupal login is admin/admin. It's good, I'm expecting to take some time for this, so it's not a problem.
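As a quick reference for the port mapping described above, here is a small Python sketch that builds the right URL depending on whether you are on the laptop or inside the VM. The port numbers come from the talk; the function names and the dictionary are hypothetical helpers for illustration only.

```python
# Guest (VM) port -> host (laptop) port, as stated in the talk.
PORT_FORWARDS = {
    80: 2280,    # Apache serving Drupal
    8080: 2288,  # Varnish, once it has been started
}


def url_from_laptop(guest_port: int, path: str = "/") -> str:
    """URL to use from a browser or tool running on the laptop."""
    host_port = PORT_FORWARDS[guest_port]
    return f"http://localhost:{host_port}{path}"


def url_inside_vm(guest_port: int, path: str = "/") -> str:
    """URL to use from a shell inside the VM, where no mapping applies."""
    return f"http://localhost:{guest_port}{path}"


print(url_from_laptop(80))
print(url_from_laptop(8080, "/heavy"))
print(url_inside_vm(80, "/heavy"))
```

The same service is reachable on different ports depending on where the command runs, which is why it matters which environment a command is typed into.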
Oh, here's the slide that I was supposed to leave up while you guys were doing this, so I didn't have to answer those questions over and over again.

So even just clicking around through your browser, you'll find the site feels reasonably... actually, show of hands, who finds the site feels reasonably snappy? Yeah, mine too. Actually, the first time I did it I was going, this performs way better than it does on Amazon. This is where I'm supposed to put in a pitch for BlackMesh rather than AWS, but that's just because you've only got one user, and we're about to crush that.

So let's figure out what the limits are. By default, right now, Drupal isn't caching anything out of the box. We have one reasonably heavy view that you can see in the main menu, at /heavy. We're going to use a tool called Apache Bench. How many of you guys have used Apache Bench before? Oh, thank goodness. All right, good. So these commands are going to be reasonably familiar to you. This is about the simplest version of an Apache Bench attack.

We have another question. Correct. Yeah, that's coming, that's definitely coming. The simplest thing is just to do it completely anonymous, with no cookies or anything like that.

So this is the way an Apache Bench command works. Is this news for anyone? All right, very quickly: you can see what the arguments are to Apache Bench. Concurrency means how many requests get submitted at the same time, number is the total number, and the target URL is pretty self-explanatory.

I want you guys to play around with the concurrency and the number. How high do you have to get the values to break it? Hands up and let me know. How high do you have to get it to actually make the site stop giving you responses? I would love it if you would kill the virtual machine. That would actually be perfect. It's exactly what I'm looking for right now. Does concurrency of 50 do it for everyone?
Mine seems to do all right. Apple engineering. Does everybody get response times like this? I was actually quite curious about this. I'm gonna make sure about that. Yeah. Yeah, back up a couple slides. That's totally fine.

All right, does anybody need this information still, or can I move on from this slide? You're supposed to run it... yes, please. Otherwise you will bring down your own laptop, which would also be kind of fun, but I mentioned this is preserved on video.

So this is where you have to be really aware of which environment you are submitting commands on. From your laptop, you will need to use port 2280. From the VM, you can just use port 80. Yeah, because my expectation is that you're running your browser from your laptop and not from inside the VM. Yeah, 2288 is Varnish, which is probably turned off still. We'll turn it on in a minute.

I'm gonna step in and help people, because I think we still have some trouble, so talk amongst yourselves.

If you're not familiar with VirtualBox and you lose your mouse, then on a Mac you hit the left Command key. In Windows, I think it's the left Shift key, so you can get it back. Right Shift key? Okay, thank you.

So some people are having a hard time connecting with their browser if they're using Chrome. Try using Firefox. Apparently it works well with Safari; I'm not gonna mention that. Apparently it also works well in Internet Explorer, but if you're using Internet Explorer... I mentioned this is videoed for posterity, right?

All right, I'm gonna advance the slides now, if you haven't written this information down yet. I don't know, there's a quiz later.

So this is the Apache Bench command that I'd like you to run. This one should be run from inside the VM, which means your prompt will have that red "load me" in front of it. You can actually just hit the front page; if you haven't had a look at what the heavy view is like, it's up to you. Has everyone gotten this far with us?
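The slide with the exact command is not part of this transcript, so the following is only a sketch of how such an Apache Bench invocation is put together, written in Python. It assumes the standard `ab` flags `-c` (concurrency) and `-n` (total number of requests); the helper name `ab_command` and the values 10 and 100 are illustrative assumptions, not the numbers from the slide.

```python
import shlex
from typing import List


def ab_command(url: str, concurrency: int, total: int) -> List[str]:
    """Build an argument list for ab: -c is concurrency, -n is the total."""
    if concurrency > total:
        # ab refuses a concurrency level larger than the number of requests.
        raise ValueError("concurrency cannot exceed the total number of requests")
    return ["ab", "-c", str(concurrency), "-n", str(total), url]


# From inside the VM the site is on port 80, so no port is needed in the URL.
# The concurrency and total below are made-up example values.
cmd = ab_command("http://localhost/heavy", concurrency=10, total=100)
print(shlex.join(cmd))
```

From the laptop, the same command would need the forwarded port in the URL instead, and the load would be generated by the laptop rather than by the VM itself.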
Mm-hmm, one request in three seconds. Yeah, that sounds like you're hitting the heavy view. Yes to the cartoon. Yeah, at least while we're answering questions. Okay. All right, questions? Yeah, that's a really good point, especially if you're just hitting the front page. Yep. Are you running it on heavy? Try against the front page, so you'll get... yeah, okay.

So this is a view that displays... somebody check it for me, but I think it tries to load 200 nodes at once. Maybe it tries to load 600 nodes, I forget. I mean, I wanted something that would break it. This is actually not quite as unrealistic a situation as it might seem. Actually, last week I was dealing with exactly this, but it was putting the nodes on a map. We had 600 objects to load, and it actually had to load them up along with taxonomy, so that makes 1,200 objects to load and then render in geo, and, you know what, that's just a lot of work no matter what. Yeah. Yeah, 10,000 users at once. OpenLayers is gonna have a really hard time with 10,000. Well, maybe with points it'll be all right, but maybe not. Just rendering it is hard. It's worse with shapes, I have to say.

Did everybody get to the ab part? Is it all right for me to take away this slide? I'm gonna show you what my output from ab looks like. All right, good. I'm just gonna do this on the front page so you can see what a reasonably normal output is. So you can see that it walks you through and says, okay, well, I'm not completely hanging. It gives you feedback as it goes through, and then it gives you a nice detailed feedback of the statistics of absolutely everything. In fact, when I'm doing proper load testing for a client and trying to develop an architecture for them, you spend a lot of time copying and pasting from here into spreadsheets. So, total time: it took 13 seconds.
That's fine. 500 requests complete, nothing failed. There are no errors, which is nice to know. The average requests per second, more or less, is about 30 requests a second, and you can see the way this breaks down in percentiles. Something great to graph, actually, if you're a graphing kind of nerd. And if I'm gonna try this against that heavy view, actually, we'll probably be here for a while, or we're gonna start seeing a few errors, because that view should take 10 seconds or so every time it wants to load.

So the number that I'm gonna ask you to pay attention to here is requests per second. It's not that the other numbers aren't valuable, but it's the easiest quick metric to see how you're performing. Also bear in mind that the reason we have concurrency and total number that we can set in ab is because processes also have some overhead in starting up and shutting down, and the difference between having 100 total and 1,000 total is that those kinds of little overheads really start to add up. It's why, in practice, you're not actually gonna get 37 page loads per second out of this in any kind of consistent way. You could get a burst up to 37 a second, but if you have 37 a second all afternoon, it's gonna die.

I haven't even gotten through a hundred requests. I'm just gonna Control-C out of this. Did it tell us how many complete requests? Oh, we made it through 60. That's pretty impressive, better than I was expecting.

All right, so this is when you're supposed to do it. What ab values do you need to bring your site down on the front page? Can you guys just call them out? You don't need to put up your hands. Three? Concurrency 300, or total number 300? Anybody manage to get higher? 300? What are you running on? Can I have that virtual instance?
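The requests-per-second figure discussed above is just completed requests divided by elapsed time. The Python sketch below works that out from the rounded numbers quoted in the talk (500 requests, about 13 seconds) and shows one simple way to read percentiles from a list of response times. The sample timings are made up for illustration, and the nearest-rank percentile used here is an assumption; ab's own calculation may differ in detail.

```python
from typing import List


def requests_per_second(completed: int, seconds: float) -> float:
    """Mean throughput: completed requests divided by wall-clock time."""
    if seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return completed / seconds


def percentile(times_ms: List[int], pct: int) -> int:
    """Nearest-rank percentile: the time within which pct% of requests finished."""
    ordered = sorted(times_ms)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling
    return ordered[rank - 1]


rps = requests_per_second(500, 13)
print(f"{rps:.1f} requests/second")  # 38.5 with these rounded inputs

# Hypothetical response times in milliseconds, not real measurements.
sample_ms = [100, 200, 300, 400, 500, 600, 700, 800, 900, 1000]
for pct in (50, 90, 100):
    print(pct, percentile(sample_ms, pct))
```

With the rounded inputs this lands in the high thirties, the same range as the burst figure quoted in the talk; the percentile breakdown is what shows whether a few slow requests are hiding behind a healthy average.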
Yeah, exactly. All right, so we're gonna improve this, and we're gonna improve this using some stupidly simple methods at first, and we're gonna go down the rabbit hole as far as we dare, as far as we have time for.

So this is my personal mental model of how Drupal loads pages. This is quite a simplistic approach, and that's okay for what we are trying to do. Actually, the way I usually talk about this, for every instance of the word "data" I just say "shit", because that seems more natural to me. So shit is loaded from the database. Shit is rendered in its first little pieces of the page. Drupal actually renders everything in quite a granular way, so if you're rendering a view, for example, it would do each individual field, and then each individual row of the table, and then the whole table. Finally, when all of the page elements have been rendered, it assembles them in HTML. That's most of the work of the theming layer. And then we deliver it to the client browser. That's more or less the total package, and anything that we're gonna optimize is gonna be about speeding up one or more of these steps, or actually speeding up the arrows in between. This model is simple enough that my mother understood it, but does anybody have any questions? You should be embarrassed if you have questions about this part.

All right, so we're gonna start with the best practices setup. I really like this Dilbert. Can everybody read it from where they are? Because I do an excellent screen reader impersonation. Right. There's some truth to this, that any best practices is going to be a mediocre setup. It's not the perfect optimization in every individual site.
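The four-step mental model of a page load described above (load data, render the small pieces, assemble the HTML, deliver) can be written down as a tiny Python pipeline. This is only a toy illustration of the model; every function name here is hypothetical and none of it reflects Drupal's actual internals.

```python
from typing import Dict, List


def load_data(node_ids: List[int]) -> List[Dict[str, object]]:
    """Step 1: data is loaded (a stand-in for the database queries)."""
    return [{"id": n, "title": f"Node {n}"} for n in node_ids]


def render_pieces(rows: List[Dict[str, object]]) -> List[str]:
    """Step 2: each small piece is rendered on its own."""
    return [f"<li>{row['title']}</li>" for row in rows]


def assemble_page(pieces: List[str]) -> str:
    """Step 3: the rendered pieces are assembled into one HTML document."""
    return "<ul>" + "".join(pieces) + "</ul>"


def deliver(html: str) -> Dict[str, object]:
    """Step 4: the finished page is handed to the client."""
    return {"status": 200, "body": html}


def handle_request(node_ids: List[int]) -> Dict[str, object]:
    """The whole model: each call below is one box, each handoff one arrow."""
    return deliver(assemble_page(render_pieces(load_data(node_ids))))


response = handle_request([1, 2])
print(response["body"])
```

Each caching layer in the rest of the session can be read as a way to skip one or more of these calls, or to make the handoff between two of them cheaper.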
There are ways that you're going to be able to eke out an extra 10 or 15%, if you're really going to tweak things in detail. There's a whole universe of tweaking in Redis and Memcache and APC, and even the way Drupal does its cache bins. But as Jason was saying, we're not really interested in chasing those kinds of small issues. What we really want are the big best practices that are gonna work for 95% of your sites, and are gonna get them to a state where they're really fast, and certainly fast enough. That's right, so this should be your baseline.

So yeah, if you're coming from your computer, it's port 2222, and I can never remember if it's lowercase p or uppercase P, but it's the opposite for scp, because that makes sense. Yeah, because, you know, we're a little bit masochistic in the sysadmin world.

So this is a best practices setup in a broad kind of sense. We have Drupal running all of its own internal caching. You have APC configured, you have Memcache configured, you have Varnish installed and enabled. Has everybody here done this level of caching before? That's pretty good. Okay, so I'm gonna leave you guys to walk through that, but I'm gonna do it at the same time so people can follow along.

I love these kinds of questions. All right, we're gonna talk about this as people are doing this best practices setup. For everyone who knows how to do this: we've already got APC installed and set up with the shm_size, so you don't have to worry about that. You are gonna have to start the memcached daemon. And don't worry if you don't know how to do any of this yet. I'm about to do it on the screen live, so you can follow along, and I'm gonna explain what you're doing.

Actually, that's a really excellent question there. It's very hard to come to an absolute
"this is the right way to do it" in caching. In almost every situation there's a "your mileage may vary", and there are tweaks, and there are different ways of approaching the same problem. In fact, we're gonna get to that when Ernest starts talking, though we will have already addressed a bunch of load problems. You'll see the way economists does it may not be exactly the same way that we do it, and that doesn't mean that we're wrong or economists is wrong. It's that this is actually quite a complicated problem set, and there are multiple workable solutions. There isn't a concrete if-then kind of way to approach this. You really do have to understand and use our mental model.

All right, so I'm gonna start enabling things here. Everyone ready? Well, sure. So we'll start from where you're logged in on the virtual machine, and we have to start the memcached service. Memcached is its own program that runs on its own and communicates with Drupal over a port or a socket. So we have to say service memcached, with a "d" because it's a daemon, start. Everybody with me so far?

Now, I'm logged into the site already as admin. I'm just gonna make sure to enable every kind of caching I can think of. It is helpful to set an expiration for cached pages, no matter how small it is. Basically, enable everything. Everybody know where to find this administrative screen? Admin/admin. Should have tried it before you asked.

All right, it is important with most caching modules to actually enable them through the graphical interface, because PHP runs in separate processes on the command line versus through Apache or nginx or whatever you're using, and it's really easy to get into some odd and awkward problems where you think modules were enabled and they weren't.

Yes, please. Sorry, I'll go back. Well, let me wait for it to load. So what is it for? All right, so we've got the APC service available to us. We've got the APC module compiled into PHP, or in as a module.
It is available to us. The same way, we've got memcached, the program, running. These modules enable Drupal to actually use these in an active way. APC does some clever things in a generic way with PHP; no matter what, it will speed you up a little bit. This is going to enable us to actually tell Drupal specifically, "I want you to put the menu cache into APC. I want you to put just the block cache, but not the page cache," and, you know, things like that.

And you've got the good questions. All right. So both of them are what we call key-value stores, where you have a key, like, we'll say, page cache and then a unique ID for that, and then we have the value of it. The biggest difference is the way it operates. In fact, APC is closer to PHP: because it's actually compiled in, or in a PHP module, you get a lot faster responses with it. It is not as good at handling large objects, and it's a bit more expensive on cache clears. And in fact, in certain configurations, APC will reserve memory with each separate process you have running; httpd or nginx or FastCGI will spin up a lot of processes. So they both can do very similar things, but they are optimized for different situations. Sorry, because you're using... yeah. Generally you want to have both of them available.
It's not going to be a huge speed-up, but because it's not a difficult thing to do, it's a bit of a freebie. Memcache I've used for larger objects, or any time that you have to share the cache between more than one server. If you have a load balancer and multiple web heads behind it, you really can't make use of APC, because the cache is on the individual box. Yes? That's an excellent idea. Could I repeat the questions so that everyone knows what it is? All right, is everyone with me so far on what we've enabled, what's going on? Yes, so we've enabled APC, which, just to make it difficult to use Module Filter, has the module title Alternative PHP Cache. Yes, that's correct, and you don't need to have Memcache Admin enabled, but it provides some handy tools for us when we're debugging. If we go to the status report for the site, we can see that now Drupal is reasonably aware of APC and Memcache. It's going to complain that Memcache integration is not currently loaded. This is because there's more to including a caching system than just enabling a module; you actually have to make some modifications to your settings.php. Let me make sure that I don't have a slide about that first before I go and actually do it. No, I don't have a slide about that.
Yeah, awesome. Right, so. Everyone sees and understands what I'm doing here? Okay, I'm going to go right down to the bottom of settings.php. It doesn't actually matter, but it's nice to keep all of this stuff organized. And in fact, I am going to give you guys the big cheat sheet, which is that there is a README in each of these caching modules that has the lines you need to add. There's a little typo in APC's, but I'll show you that, and in fact I check it every time just to make sure that I'm not doing something stupid. So I'm going to log in in a separate terminal, just so I can have the READMEs open in one place and paste them into settings.php in another. Everyone sees and understands? We're just going to look at a README in a different tab. They give you a nice general recommended use case for APC. We can customize this a little bit, but we don't have to do too much. So this is the one, my one issue with APC's README. The way this works... does everybody here understand PHP reasonably enough for me to talk about things like arrays? You know what, you never know. No, it's totally fair if you're a system administrator.
There's a lot of this stuff that you can be willfully ignorant of. Yes? I'm sorry? Yeah, the README file is in the module's root directory, so this is for APC. So the way this works is we have an array of all of the available caching mechanisms, and we want to set up this array in settings.php so that Drupal can be aware of it as early in the bootstrap process as possible. This is because the whole point of caching is saving yourself from having to load extra bits of Drupal that you don't need to load, and we want to make this lightweight. There's nothing wrong with the syntax of this PHP; the problem is that it will blow away any existing cache backends you've already got. So actually, the way this should look is like this. Can everybody read that? Careful. So the important part is to remove the array and its round brackets, and to add these two square brackets at the end here. Does everybody have this? No, it looks like people are still typing away. Good. You know, I didn't think about that. Yeah, sorry. Sorry, we're vim guys. So, yeah, if you just type yum, y-u-m, space, install, space, nano, n-a-n-o, that'll put a pico-style editor on there. I didn't even think about that. If you don't know how to use vi, that is worth learning how to do. It's the one thing that's available on every system. As long as you don't say Emacs, then I'm good to go. Yeah, I get violent. That's two things now, the keyboard and nano. What's that? The keyboard settings and nano, I didn't even think about. You know, it's kind of embarrassing: I live in Germany and I still have never learned to type on a German-layout keyboard. So, like, I buy my computers in Germany, they have the German layout written on them, and I just tell the machines that it's a US layout. Drives my wife crazy. Yeah, I used to do that with a Dvorak layout, actually. Then no one can use it but you. All right, is the internet down? Can somebody tell me? Oh, it's up.
Okay, it's just DrupalCon, you never know. I know, I know. Yeah, I have a theory that the amount of bandwidth we use rises to meet however much bandwidth they provide. If they gave each of us a fiber line, we'd all be torrenting. And if you're in nano, yeah, Control-X, and then make sure you save. All right, I'm switching back so everyone can see what we need to have in settings.php. So I'm gonna explain what these commands do here. The first one says: here's our array in cache_backends, you want to include this file, drupal_apc_cache.inc. You can see by the path to it, it's provided by the APC module. In fact, I'm gonna add a little line in here to keep this clear. And then below here, what we're doing is setting a couple of cache bins to go to APC. Yeah? That better? So Drupal, Pressflow and Drupal 7, have a concept of cache bins. That means we don't just have one thing that's the global cache, a cache glom. Instead we can divide it up into different kinds of caches, and you can specify where Drupal is gonna store each kind of cache. So you can see... good, it still works. You can see that here we're saying, okay, the bootstrap cache goes to DrupalAPCCache. The class cache, which is keeping track of the PHP classes, goes to DrupalAPCCache. These are both good ones to put in APC, because the cache is accessed frequently and it's small. Anything that you can think of that is gonna meet those two criteria is a really good candidate for APC. It's also worthwhile considering that APC is not great about flushing its cache, so it's nice if it's a cache that's not gonna have to change too often. Is everyone with me so far? We're gonna do the same thing with memcache, just exactly the same thing that we did. Yep?
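The APC lines described above can be written out roughly as follows. This is a sketch, not the module's authoritative text: the install path sites/all/modules/apc is an assumption, and the lines are written to a scratch file so they can be compared against the module's README before being pasted at the bottom of settings.php.

```shell
#!/bin/sh
# Write the APC lines to a scratch file for review.
# ASSUMPTION: the apc module is installed under sites/all/modules/apc.
snippet="$(mktemp)"
cat > "$snippet" <<'EOF'
// APC: append to the list of backends ([]) instead of replacing it (array()).
$conf['cache_backends'][] = 'sites/all/modules/apc/drupal_apc_cache.inc';
// Small, frequently read bins are the good candidates for APC.
$conf['cache_class_cache'] = 'DrupalAPCCache';
$conf['cache_class_cache_bootstrap'] = 'DrupalAPCCache';
EOF
cat "$snippet"
```

The square brackets on the first line are the fix from the talk: they add APC to whatever backends are already registered, so a later memcache line does not get blown away.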
Oh, sorry. Yeah, let me just have a look. So the question is why, in the APC README, there's step 2a and 2b, depending on what you've got. Does it say why you'd want... and I'm not sure why you would not cache bootstrap in there. Oh, this is to do with the amount of memory that you've got. Yeah, it's actually the big difference. The difference is not that bootstrap is not included; the difference is that we have this one, cache_default_class. cache_default_class means this is the default: if it's a cache, put it in here unless it's specified otherwise. So if you have a lot of memory for APC, and especially if you don't have other caching systems, this is a reasonable setup. I never do it this way, but I wouldn't give you a hard time. I've mentioned a couple of times that APC is not great about clearing its cache. It's a bit slow, and what happens is, when the APC cache fills up, it wipes the whole thing and then starts caching it again, and those operations are expensive. So if your APC cache is too small, it actually is going to slow down your site to have APC enabled. The relevant setting here, we actually set for you. It is in /etc/php.d/apc.ini, and I will show you where it is. It's this setting here, apc.shm_size, and for most situations 128 megs is sufficient. Question at the back? It depends on the size of your server and your particular configuration. In this case we don't have the problem where each process takes up its own block of shared memory, so if you wanted to cache your entire Drupal installation in here, put page caching in here, then you could probably set it at a gig, I guess. Any good reason not to do that? Yeah. What's the good reason not to do that?
I mean, APC, because it's basically compiling the stuff into bytecode, and, depending on what settings you have in here, there are times that this cache is gonna become stale. Whenever you actually put all that stuff in there, you're gonna have to have settings so it's gonna clear that whenever that changes, and that bytecode is gonna change, and then it has to recompile it. Generally, what you're gonna see is you'd rather do those operations on the side of memcache, where it's gonna automatically clean those things for you, where APC isn't gonna do that for you. So, I mean, generally, whenever you're doing these types of things, just the Drupal bootstrap stuff, just the PHP bytecode that's gonna be compiled out of normal PHP operations, is what you want to focus on here. And then use other tools for things where you can afford to have the socket connection going between Drupal and Memcache, for instance, or Redis, and then have that back end actually manage those things for you. Because then you can isolate those out as well, because you can create different buckets of memcache. Let's say, in a multi-site setup, you can create 10 memcache buckets and have each site be in a different bucket, for instance, where with APC you can't do that. You can't do that kind of isolation. It's all or nothing. Okay, now I'm gonna do the exact same thing with the settings that we get out of the memcache module for Drupal. The only difference is they're in INSTALLATION.txt. It's very complicated to find. They actually are nice, and they include instructions for installing memcache. What we really want is... did I just skip over it? Oh, yeah, here we are. You're right.
It is in the README. So there are three lines that we... well, there are two lines that we need to have here. One of them is this same include that just says we have memcache available as a cache back end. The other one that we need to have is actually one of the configurations here for a cache bin. This is the form cache, and it cannot be in memcache. Can anybody tell me why? Say again? It's not a cache. Can you explain that a little bit more? Right, so he says it's not a cache, it's actually state storage. The other way to say this is: the problem is that the form cache actually stores the state of forms in progress, and God forbid your cache gets cleared in the middle of that. Someone's in the middle of signing up for a user account and you have to clear a cache: they're gonna lose everything that they've submitted in their forms. The form cache has to be somewhere that's what we call non-volatile storage, something that will survive a reboot, will survive somebody coming and kicking your servers. So we put it in the database cache. There are other places that can be persistent, but the database cache tends to be fine. This is not a big one for performance. The other option that they recommend for us to include in the README is this one, cache_default_class. Now, we saw this a moment ago with APC, where they said, yeah, if you have a huge APC memory, you can say the cache default class should be here. Memcache is actually a more reasonable place to have your default cache, and again, this just means that if it's a cache and Drupal isn't sure where to put it, put it in memcache. I'm gonna step out of the way and let people copy this down now. But are there questions about what we've got so far? Has everybody already copied this in? That is a great question. Was it a leading question? Do you know the answer? It's a good one. Yeah, okay. Thank you. Do we have any cookies or gold stars?
That was awesome. USB keys? A USB key with a VM on it, but it doesn't have nano. Right, so we mentioned that these are both key-value stores. I'm just gonna do this so I can be out of the way. It means that there is a key and a value to it. It's like having a field name and then a value that's stored in there. And if you have two Drupal sites that are using the same memcache install, then there are gonna be two Drupal sites that say, this is what the front page looks like, for example, or this is what the bootstrap looks like, and it's not gonna be accurate. There is an option for having a prefix applied to all of those cache keys for memcache. Actually, the prefixing option is different for each caching module that you use. It's another one that I can never remember. Prefixing... here we go: memcache_key_prefix. So you will need to have something unique stored in memcache_key_prefix. This is only an issue if you have multiple sites running on one server. Do you guys need to see how we put this into settings.php? All right. Yes, it is, and the most recent versions of APC have a different variable to set it, actually. And when you have this prefixing problem, it's one of the more frustrating things to solve if you haven't seen it before, because all of a sudden it's things like menus disappearing. Yeah, the cache memory is still shared, unless you're on FastCGI. FastCGI still has the issue where the memory for APC is separate per process; in that case it shouldn't be an issue. So you only need to set this key prefixing when you have multiple sites that are going to be sharing the same memcache installation, and this can be absolutely anything, as long as it's unique. Correct. All right, I'm gonna get rid of this because I actually don't like the prefix. It makes it harder for me to read the logs. All right, we're ready to move on.
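The memcache lines discussed above, including the optional key prefix, look roughly like this. It is a sketch under assumptions: the path sites/all/modules/memcache and the prefix value site_a are illustrative, and the module's own README or INSTALLATION.txt remains the reference. As with APC, the lines go to a scratch file for review first.

```shell
#!/bin/sh
# Write the memcache lines to a scratch file for review.
# ASSUMPTIONS: module path and the prefix value 'site_a' are placeholders.
snippet="$(mktemp)"
cat > "$snippet" <<'EOF'
// Register memcache as an additional cache backend.
$conf['cache_backends'][] = 'sites/all/modules/memcache/memcache.inc';
// Any bin without an explicit setting goes to memcache.
$conf['cache_default_class'] = 'MemCacheDrupal';
// Form state must survive a cache flush, so it stays in the database.
$conf['cache_class_cache_form'] = 'DrupalDatabaseCache';
// Only needed when several sites share one memcached instance.
$conf['memcache_key_prefix'] = 'site_a';
EOF
cat "$snippet"
```

With a single site per memcached instance the last line can be left out, which is what the speaker does at the end of this step.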
We're gonna test this again. Everyone's got it all copied down? Man, the quiz at the end is gonna be awesome. Most of that you don't have to worry about; the default is fine. Memcache operates on port 11211, and the Drupal module uses that by default. Yes, you can put it all on one. Sorry, the question was: is it okay to have all of my sites caching to the same memcache instance running on the same port? And the answer is yes. As long as they are prefixed, memcache won't care. You just need to make sure you have enough memory available for it. All right, and generally, whenever you want to do something like, let's say you have a really heavily loaded site, you'll create another bin for just that site, because it's gonna really beat the snot out of that cache. So you want to separate that out, isolate it so that it becomes its own process, because then Unix is looking at it as its own process. You can actually give it more resources, you can run it on a different box, you know, you can isolate that out that way, too. All right, we are ready for me to go back to the slides. We're gonna hit Apache Bench again. We're gonna see the improvement that we've got. That's right, this is actually the other part, which is to explain what it was we were doing. So again, our best-practices setup. So, the first thing: data used to be loaded from the database.
It's now actually coming from memcache, which is way faster in most cases. It's way faster in all cases, but it's going to be coming from memcache in most cases. We're using APC to speed up all of the blue arrows. And we've enabled Drupal's generic page cache, which right now is going into memcache, so that whenever possible it's going to just keep a copy of the finished HTML page and hand that out. So we don't even need to care about the blue arrows or how fast it's coming from the database. Because we set memcache as the default, really, we've got memcache all over the place. All right, here comes the... missing anything? No. Here comes your second local denial of service. So let's try the exact same commands that you did before. You'll find that you can raise the numbers a lot higher now. And in fact, because the numbers returned from ab are averages, the higher the number is, the better. And I'm gonna run this myself. Actually, first I'm gonna check status, because with all the questions I'm not sure that I remembered to do everything. So can everybody see the difference in what you get out of ab for this? I don't remember, what did I get last time? I got 27 requests per second. 38, yeah, somewhere around there, which is, I have to say, pretty good. And now we're up to 250, 246, and that's the first time, when it was loading it into cache. If we do it again, we'll find the average number is actually a little bit better. We're gonna do it also for heavy. So now what values do you guys need to have to break it? Who's found the breaking point yet? A thousand concurrency still breaks? It still works? Has anybody managed to get above 2,000 concurrent requests? I don't think I could do it, actually. Yeah, so if you start running into a sockets error, that's not the server going down per se. That's the server saying, I can't handle any more connections than this at once, period.
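For reference, a benchmark run of the kind described above might look like this. The exact commands used earlier in the lab are not in this part of the transcript, so the host, request count and concurrency here are assumptions; only the /heavy path comes from the talk.

```shell
#!/bin/sh
# ASSUMPTIONS: URL, request count and concurrency are illustrative values.
URL="${URL:-http://localhost/heavy}"
REQUESTS=1000
CONCURRENCY=50

if command -v ab >/dev/null 2>&1; then
  # -n is the total number of requests, -c how many run at the same time.
  ab -n "$REQUESTS" -c "$CONCURRENCY" "$URL" \
    | grep -E 'Requests per second|Transfer rate|Failed requests' \
    || echo "benchmark produced no summary (is the site reachable?)" >&2
else
  echo "ab (ApacheBench) is not installed" >&2
fi
```

Raising CONCURRENCY is what eventually produces the socket error mentioned above; raising REQUESTS alone mostly just makes the run longer.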
That's not even to do with the load. So just from these basic steps, we've increased our ability to handle anonymous page requests up to 200-some-odd a second, depending on how big the pages are. You know what, let me get you a mic, because I'm not gonna repeat the whole thing. Do you want to just use this one that's standing in the middle? Because that was actually a really good point. I just want to mention that the transfer rate which is displayed there is sometimes the bottleneck first, before the CPU comes up, if you compare it to the network interface your server will have at that moment. Mm-hmm. That becomes an issue when you have really big amounts of traffic. Is everyone with me so far? I'm gonna go back to the slides. So the part of our best-practices setup that we haven't done yet is Varnish. Varnish sits right at the front of the whole stack, and it keeps you from ever having to bootstrap Drupal. There are various ways of integrating Varnish with Drupal, but it's important to know that they are all optional. Varnish works perfectly well as what we call a dumb cache, a system that is not at all integrated with Drupal, that Drupal doesn't really particularly know about. A good Varnish configuration keeps everything in memory.
It's saving just static HTML files, but it's all coming out of RAM, which means it's super fast. So we're gonna start with just a very basic Varnish configuration, and then we're gonna show you how to make Varnish sit up and dance. Varnish has actually quite a complicated... well, no, it has quite a simple language for determining exactly how to treat each kind of request. Actually, Ernest is gonna do a lot of it. He's going to be walking you through how to create your own Varnish configuration file. So the first thing is, out of the box, the way Varnish is configured when you just install it on your server is not optimal. We are going to modify some of this. So what we're gonna do is we're gonna configure Varnish so it's listening on port 8080, because that's convenient for us to test with. In a normal production environment, you would make Apache run on port 8080 and Varnish on 80; that handles all of normal people's web requests. And you can see the complete configuration in this root stuff directory, etc/sysconfig/varnish, but we can also do this all together. Everyone see the file name? So this is the file that controls the options for how to start the Varnish program on the server. Well, now, this configuration file is reasonably well commented. It's not too difficult to understand what is going on. What's most important for us in this demonstration?
We want to change the listen port so that it's 8080, because that's what my examples use and that's what's forwarded on your computer. In a production site you would actually use this option above, VARNISH_LISTEN_ADDRESS, and specify exactly the IP address and port that you want to listen on, to make sure that it's only listening for exactly the sites that you want. In this case we're safe to just use listen port 8080. Varnish communicates with the back-end site, if we're going to integrate with Drupal for example, by opening up its own port and having the site connect to it. We're not going to use many of these options, so I'm going to leave these two as they are, but for people who are interested, you can telnet into port 6082 and play with it. It's a special port for administrative actions. So for example, if you want to clear just one page out of the cache, you connect to this port and you issue a command that says kill the cache for this particular page. There's actually a Drupal module that you can integrate into this as well. It will clear the cache: whenever you clear caches, it'll use this port in order to communicate with it. So if you have a web farm, let's say like eight or nine web heads that are all running Varnish, you can clear all of them at the same time by just clearing your Drupal cache, and it will go away. Yeah, in this presentation we're not going to be using the Varnish module.
So if you want to enable it and play around with it, you're welcome to basically it Does the same thing that we've just done with APC and memcache It makes Drupal aware that there's this other caching layer that it can play with and send bins to Actually, it's not a bad module, but you don't necessarily need it to get the really impressive speed improvements So most of these options are just fine a storage size of one gigabyte is fine for most For most servers if you have two gigs or more, there's nothing wrong with lowering that to 512 megs if you need to This is a big one by default varnish wants to store everything in a file. It creates one big file to be the cache That's rarely the best option We're gonna change this to use memory So you guys are gonna follow along and see what I'm doing is I'm deleting the word file And I'm changing it to M. Allock Malock always sounds like some kind of super villain to me I'm going to get rid of this variable that says varnish storage file We don't need to have a storage file because it's gonna be in RAM and we can leave this varnish storage size You can see that was defined right above here You leave this up for a second so you guys can see Correct. Well, it actually makes one Whatever remember is that it makes one big block in RAM. Is that right? That's right. It doesn't it doesn't segment it out Correct, so so one of the varnish process starts it goes into it goes the links kernel Hey, I want to reserve a gig of RAM and links this will say yes or no to that And if the case is yes, then it allocates it and it reserves it so no other program can actually use it So it's the same thing as what you're doing in the file system The benefit here is is that memory is obviously a hell of a lot faster than anything you're gonna do on this guy Oh, even SSDs are slow compared to memory. 
So again, we know there's a lot of information in this session, so we've tried to put everything that we can in the root directory under stuff. We have an explanation even of the configuration for Varnish here. You can see these are the modifications to sysconfig varnish. You see that we actually recommend that you specify the listen address. Here's your storage size. This is that line that we just edited. Jason also likes to increase the time to live. It's not going to matter for our demonstration, but there's nothing wrong with increasing it to 300 if you like. Has everyone got their Varnish configuration file modified now? Good, you guys are speeding up. So we save it, and now we're going to start Varnish: service varnish restart, in case it's already running. Anybody getting errors? Of course you did, it's Varnish. All right, we're gonna make sure that this is running. This is using an out-of-the-box Varnish configuration file, which, we will see, is not actually too useful for Drupal, but it's useful enough for us to see how much faster this is. So we're gonna go to localhost; this time it's port 2288. If you're able to get a response here on port 2288, that means Varnish is running and you've done your configuration changes correctly. Okay. So 2288 is a port on your actual local machine that is mapped to the virtual machine's port 8080. The question was, where do we make that configuration? That's actually in VirtualBox. Okay, so if you're not getting a response from there, that means you may have made a mistake in editing Varnish's config file, and actually Jason will help you with that. Is everybody else managing to get a response? Sorry, just behind you. Yeah. Varnish, the way it's set up now...
Yeah, there's a lot of things that are gonna break, and we'll talk about that in a moment. Varnish is a generic system that actually works very well in a lot of situations, but mostly because you can configure the hell out of it. You can actually see there's a visible difference here between the version that's running through Varnish and the version that's not. We have no sidebar blocks; that may be because I'm not logged in. Forms will break. Yeah, actually, there's a lot of stuff that's gonna break. Yes, a question in the back? Okay, Ernest, would you mind giving a hand? Yep. No, no, that's right. So the comment was just that you can really see the difference clicking around. I mean, that comes up instantaneously for me. Oh, yes, the first time you visit the page, it's gonna create the cache and store it in RAM. We're gonna do another ab test, and we're just gonna see, completely out of the box with a stupid Varnish configuration, what the difference is. Jason, we have another person with some issue here. So we're running this from inside the virtual machine. Note that the port number is different because we're inside the virtual machine. We want to make sure this goes through Varnish, so make sure to include that colon 8080. Raise your hand if you can see the difference in the response you get. Only one person sees a difference? Two, three. What kind of numbers are you getting? How many page hits per second? Just call it out. 2,500. Anybody got higher than that? 4,300. I'm sorry, sir, you've been beat. 4,604. This is not surprising. When we say Varnish is amazing, oh my god, Varnish, use Varnish, this is why. We just went from 40, 35 page loads per second to upwards of 4,000, certainly upwards of 3,500, and that's running on a virtual machine on your own computer.
That's without a ton of resources to work with. I'm gonna run it on mine so you can see what it looks like, just because I find that fun. 5,000? Say it again? 5,138. I love this game. Oops, wrong port. So the first time it's gonna take a second, and then it loads. The numbers shouldn't make a difference, actually. Let me try on just the front page. 2,000. I'm really disappointed. Come on, Varnish, you can do better. 4,100. So now what values do you need to break this? Go ahead and try. I want you to take down your machine, please. This is why we have a virtual machine running. You're still good at 10k concurrency? Yeah, you'll find concurrency has a bigger impact when we're looking for these sorts of issues. Yeah, I'm still having no problem with a thousand concurrency. Oh, yeah, and now: I can't handle this many connections at once. Yes? From the Drupal status report. Did you enable the Varnish module? That's why. It's the same thing as with memcache and APC: it's not just about enabling the module, you have to actually configure a bit in settings.php. Questions? So you don't see any effect? The question is, is the limit of the sockets configurable? Yes, that's a system limit. But for the purpose of this demonstration, we're going to figure that you're cool at 4,000 hits a second. If you have higher load than that, then you need to phone Jason. Once you get that socket error, try restarting Varnish. Okay, good, because we're gonna deal with that problem now. So the question was that we're having trouble because it looks like Varnish isn't doing anything. We're not getting anything faster. I mentioned this isn't an exact science, right?
The technical term for this is that Varnish is a bitch to debug. It's really hard, and I'm gonna walk you through a couple of ways of looking to figure out why Varnish is not caching. For the time being, though, I'm gonna have to have Jason and Ernest come and actually help you, because it's too difficult to do in a general situation. Right, so this is where we start to see the difference between the results that you get from ab and the results that you'll get from an actual web browser. ab is just doing a dumb connection. It has no cookies. It has no JavaScript. It downloads the image files, but it doesn't deal with any of those kinds of display-layer issues. If you actually have real-world user traffic, it's not gonna hit the cache very often. This is because of the way Varnish deals with cookies. Does everybody here know what a cookie is? Not this kind, the other cookie, though I like both of them. Right, Varnish assumes that any time a site sets a cookie on your computer, it's because it wants to customize something on the page, and therefore it wouldn't make sense to serve that page from cache. Very often cookies are keeping track of logged-in users, but they can also just keep track of your history. You don't have to be logged in to Google, for example, for Google to customize your results page. It just sets a cookie and it knows. With Drupal we set quite a few cookies. Any time there's the possibility of having JavaScript on a page, we set a cookie that says yes, this browser has JavaScript. Any time we use tabledrag, lots of forms use cookies, and any of those cases are gonna cause Varnish not to cache. This is a bit tricky to test with ab. We can't use the same setup that we were using before, and I'm gonna show you how to do what we call spoofing a session.
We're going to We're going to have a B act as if it's a logged-in user But first we're gonna I'm going to show you how to tell that whether varnish is caching or not Now for this we're going to use a tool that's called varnish log Varnish comes with a handful of tools like this to see what's going on Doesn't make it any easier to debug but varnish log varnish top varnish stat There's actually a whole bunch of them and every one of them is extremely configurable Most of these I'm gonna let you I'm gonna let you discover on your own But varnish log is just way too important This is what the output of varnish log looks like That's one page request and it goes on for way more pages than this I'd like you just just try type varnish log in your terminal One word and hit enter and then go to your browser and try and load something through port 2288 I have an advanced Simulation for what this experience is like The best part is trying to recover your dignity afterwards Right so have you guys all typed varnish log and you can see what this output is like Yeah, and that was one page load if you do it with a b. It's way worse One of those marble garbled dogs, right? So varnish log comes with a lot of with a lot of filtering options so that you don't have to deal with the whole hose This is a URL. That's really worth bookmarking. I don't know why I didn't think to shorten this So it would be easier One of the other issues that makes varnish such so hard to debug is that they change the arguments For things like varnish log and varnish stat between versions. So like we're running varnish log varnish 3.03 3.01 the arguments that I'm going to show you won't work 2.0. They won't work. I think 3.1 dev It's a completely other set of arguments and not only does that make it really hard for me It also makes it really hard for you if you're trying to Google for what they what the right arguments should be Let's see. So, oh, sorry. 
We have 3.0.4. Right. So the key arguments that we're going to use here. Can you guys see that? It's a bit small on my screen, too, and I chose a stupid font. The first one is either -c or -b, and that tells it to filter and only show you connections with the client, that's your user out there in userland, or with -b the backend, which is your internal Apache server. The second option that's really useful is -i, and that says I want you to include only these tags. So maybe you only want to see what the URL is, maybe you only want to see what Varnish's decision is, cache or don't cache. You can filter down to those tags. And the last one is -m, which I particularly hate because for me it works really inconsistently, but that says only show me responses where this tag matches this regular expression. So for example, you'll notice, if you actually read through that Varnish firehose, that when you load a page Varnish actually looks at all of your URLs: not just the one the person requested but also the CSS file and any images that are going to be there and all the secondary crap that gets loaded. Varnish gets to see all of those, and it's nice and handy to say no, I just want to see the basic page. So this is a varnishlog string that I would like you guys to try. Oh, that's way too small for you guys, isn't it? You know what would actually be really good: I'm just going to type it in myself, and then it's easier for you guys to see. Yeah, so what we have here: we have varnishlog, easy enough. Then -c, show me only the information related to a client request. Then -i, I want you to include these tags. I chose these tags because these are ones that give you a nice summary of each request. VCL_call and VCL_return are what Varnish decides to do, whether it decides to cache or to try and get it from the backend. Object status is when it tries to get it from the cache: is it stale? Does it exist in cache?
RxURL is the URL that was received, that was requested. And I forget what RxRequest is. Jason, do you remember what RxRequest is? Oh, yeah, that makes sense. So it's either GET or POST or whatever the HTTP verb was. I've also filtered it with -m: RxURL is /heavy. That's correct. So this should not show you any output when you go to the home page. It should only show you output when you go to /heavy. Let's see if it works for me this time. The same command, copied and pasted, sometimes works for me and sometimes doesn't. Absolutely drives me nuts. Yes, the question? Varnish will cache absolutely anything that gets passed from the backend. That includes images, CSS files, and, somewhat problematically, large file downloads. You don't really want 800 megs of your Varnish cache to be used up by an ISO file. Right, that's not particularly useful. Campbell, it seems like a common issue is that, for at least half of the people I saw, it's actually not even going into Varnish at all. It's actually just passing, so it's not actually getting cached. I'm wondering if it's a cookie. Well, no, this is what I wanted to show, actually. So, everybody I was talking to, this is the part that is hopefully going to fix what you were having problems with. Correct, right? Yes, ab should have been caching, but it wasn't. All right, so we'll figure out why not. It's good, that's actually what we're here for. So you can see that the firehose for those two requests is a lot more filtered now. This is much more drinkable. You see the session open, this is a GET, there's the URL. VCL_call: so it receives the request, and pass. Pass means I am not going to try caching this, I'm going to pass this to the backend. Hash: hash is how it compares to see what's going on in the backend.
We don't need to worry about it for today. Pass passes the call, and then fetch, hit for pass. Fetch is the operation that actually pulls the data from your Apache backend. Hit for pass means not only is this a pass, not only are we going to ask the backend, we're going to cache the fact that it's a pass so we don't have to look that up every time. So every time you go to /heavy, automatically it's not even going to bother analyzing the request. It's just going to pull the fact that it needs to be a pass out of cache and send you to Apache. Okay, I have not answered the question, which is: why is it not caching this page? Does anybody have any ideas why it would not be caching this page? Yeah, cookies. The right answer is, probably for 80% of you, cookies. For the people who have problems, we may be finding another one. I love it that we're all on the same VM. Yeah, it's the best we could do. We're all using the same machine, and for half of us we're going to get a different response. This is what you pay people like BlackMesh to deal with for you, so you don't have to cry yourself to sleep at night. So this gave us a really nice, easy way to look at that firehose and say, oh shoot, we're not actually caching when it's a real browser request. Excuse me. And even worse, Varnish is now caching the fact that it shouldn't cache on a real browser request. This is where we're going to enter into Varnish configuration land. So one of the things that we could have altered when we were doing that Varnish daemon configuration file is what file to look at for the specific logic about how to decide whether to cache something or not, and we didn't change it, so it's in /etc/varnish. Actually, you know what, I just thought of this, but here's one thing that might help people who are having trouble with ab not getting cached.
We have the configuration file in root's home directory. You could just try copying that. Some people overwrote it because they weren't... Yeah, make sure you edit the right Varnish config files. They sit in /etc, not in stuff, because Varnish looks in /etc/sysconfig/varnish, for instance, for getting its config files. There are a couple of people writing to stuff/etc/sysconfig/varnish, and obviously those won't take, so make sure you put the leading slash on these commands. That leading slash is the important one. Just gonna make sure I don't keep forgetting where I am in my slides here. Oh, yeah, this one's appropriate for where we are. Right. This is Varnish hell. It's an accurate description of your job, right? 99% of the day, yes. Awesome. That's what sysadmins do. All right. So we're going to edit the file default.vcl. We call them VCLs for short. That's got to be one of the acronyms that a bunch of people didn't get at the beginning. That is in /etc/varnish. So this is the VCL that actually comes with Varnish. Yeah, it's all commented out. Oh man, sorry, mine uses the same backend default at the top. So the only thing that it does is say we have one backend server, one Apache server. It's running on 127.0.0.1.
That's localhost, and it's running on port 80. That's it. The default Varnish logic is: if there is a cookie, don't cache it. We mentioned that already, and that's a big part of why we're having trouble when you load a site through the browser. Drupal sets a bunch of cookies. Well, one, two or three cookies, depending. So we are going to modify the heck out of this. Jason, do you have a copy of my modified one up somewhere? You know what, it's all right. It's not a secret that I'm going to be taking this from my template version, because this can be such a pain in the ass. Everybody saves a version of their configuration file, and most people just use it like a template. Mine is in your root directory, under stuff. That's Varnish. So you could just copy this VCL file into place. I'm going to go through it one bit at a time and explain what the logic is that we're applying. You can see right away: this is my VCL, this is Varnish's VCL. Well, let me zoom in a bit. Yeah: my VCL, Varnish's VCL. This is in the home directory under stuff. In etc/varnish, varnish.txt actually has BlackMesh's VCL, which is a little bit different from mine, and it has more explanation. And it's worth saying here that, as Campbell said earlier, there is no one VCL that solves all problems. Ours is sometimes a little less dense than some others. Obviously Ernest's, when he goes through his, is going to be crazy. You can do a lot of other things inside of it that we don't do, so it's a good starting place, and same thing with Campbell's. It's a good starting place. It'll get Drupal's cookies.
It'll allow Google to pass through, it'll allow other stuff to actually happen. So I'm gonna put the path to this Campbell version up again so that everyone can see, if you haven't been able to find it. And you'll see when Ernest gets up to talk, if we end up having time for Ernest to get up and talk, that the VCL that economists uses is structured quite differently, though the logic is not that different. All right, has everybody found both VCL files? All right, I'm going to go edit it here. Sorry. Okay, so the first section here is declaring backends. Varnish can actually handle having multiple servers behind it. You declare them as backends. In most cases, when we're just on a single server, you don't need to worry about modifying this, but if you had three or four servers behind it, you might also say backend... How would it know where to forward each request? Oh, it's IPv7, you probably haven't heard about it yet. It's a little bit hipster. So the question was: when you have multiple backends, how does it know where to forward the request? You actually will have a function that is called a director. I have to think. It's difficult, I'm at DrupalCon. You have a function that's called a director. Most of the time you'll have it just alternate. Yeah, it can be as simple as a switch a lot of the time. Actually, you can also set up status checks, so Varnish makes sure that the servers are alive before sending anything to them. With the whole director thing, Varnish can become a complete load balancer for you, but this is a really simple version.
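To make the "just alternate, but skip dead servers" idea concrete, here is a minimal Python sketch. It is a toy model only: it is not VCL and not how Varnish implements directors, and the class name, method names and backend labels are all made up for illustration.

```python
class RoundRobinDirector:
    """Toy model of a director that alternates between healthy backends."""

    def __init__(self, backends):
        self.backends = list(backends)               # hypothetical backend labels
        self.healthy = {b: True for b in self.backends}
        self._next = 0                               # index of the next candidate

    def mark(self, backend, is_healthy):
        # Stand-in for a status check result (probe passed or failed).
        self.healthy[backend] = is_healthy

    def pick(self):
        # Try each backend at most once, starting where we left off.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if self.healthy[backend]:
                return backend
        return None                                  # nobody is alive


director = RoundRobinDirector(["web1", "web2", "web3"])
print([director.pick() for _ in range(4)])  # ['web1', 'web2', 'web3', 'web1']
```

Marking a backend unhealthy with `director.mark("web2", False)` makes later picks skip it, which is the behaviour the status checks are meant to give you.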
So we're not bothering with a director. One really popular version of the VCL template file you can find: Lullabot keeps it on their public wiki. Make sure you get the version for Varnish 3. I think I mentioned before that Varnish breaks a ton of stuff between versions, so make sure you get the one for the version you're using. It has an example director in it. So the question is: how important is the latency between Apache and Varnish? I mean, it's only important if it's a really huge latency. The only time it's going to matter is when Varnish has to ask Apache for information, and we want to minimize that as much as possible. We would rather be able to handle 4,000 hits a second than 400. Does that answer your question? Yeah, you could use it in a different country. As long as you don't have to go back to Apache very often, it's fine. And to be honest, with the kind of improvements that we're looking at in caching, a hundred milliseconds of latency is not going to make an enormous difference. And if you really think about it, Varnish is just a poor man's CDN, right? So if you go out to Akamai, Limelight, whoever, and get a CDN, all they're doing is caching your site. In this case you're selectively caching your site. You're not just doing static content, or in the case of Akamai you can do dynamic content as well. But the latency part: I mean, Akamai has stuff all over the world, right? And they use GeoIP to route you to the right location. In this case your origin servers, which would be this if you had Akamai or something in front of it, could be anywhere in the world. It doesn't matter.
It could be in Australia and you could be serving someone in the States, and it wouldn't matter, because that query coming from Akamai back to the origin to grab the content is going to be really slow, but as soon as it's in Akamai, it's gonna be fast. Just a thought, but if you're not caching a hundred percent of your page, which most times you're not going to, as we're seeing now with the cookies, then you're still gonna have that latency in there to load that initial page up so Varnish can serve it to the end user. Oh, another question. Yeah, so we deal with this a lot. Just to repeat the question: do you use a CDN to query Varnish, or do you use a CDN to query Apache directly, if you have Varnish in front of it? So as I was saying earlier, you want to cache in layers, right? So if you have a CDN out in front, you want the CDN to talk to Varnish and you want Varnish to talk to Apache. Keep that layer there, because if the CDN disappears, you still have Varnish protecting your site from getting onslaughted by a wall of traffic. Also, whenever Akamai goes and grabs stuff from origin, it's not nice.
It's almost like a spider hitting your site. It's gonna smash it. It's just a large wall, because they have thousands of servers sitting there, maybe getting queried for your data. If all those TTLs expire within a couple of seconds of each other, all those cache servers in that CDN are now querying your Varnish servers directly. So the first one will be slow, but every one after that is gonna be fast, because Varnish is gonna serve that stuff. If you didn't have Varnish sitting there, Apache would have to chew through all those requests coming through, just like all those unique users hitting you. So this is getting into some next-level stuff, and I really like architecting cache structures. I know Jason does too, so we are happy to sit down with you, and it actually helps to diagram this out. But for the time being we should keep going in this kind of stepwise motion. The VCL is complicated enough to understand. So a VCL is divided into sections based on the big operations that Varnish does. There are actually several sections. The only ones that we're gonna particularly care about are VCL receive (vcl_recv) and VCL fetch (vcl_fetch). I mean, actually, the explanation gets shortened quite nicely if you just take out a bunch of vowels, because this is Unix, so we expect to be able to read without vowels. Everyone understands the difference between vcl_recv and vcl_fetch? That's right. So in each of those stages Varnish is gonna make a call about how to handle the request. I'm gonna read this out because it is completely illegible. The first call is pass. We discussed what pass is: that means I'm not gonna cache this, it's not appropriate to cache it, I'm gonna hand it off to the backend server. The second is lookup.
We also discussed that. That means: yeah, this looks like it's cacheable, I'm gonna look it up in my cache and see if I've got anything for it. If it looks up and finds there's nothing in the cache, then it will get a copy from the backend server and cache it. That's right. Lookup means this is cacheable: if I've got a copy, serve it from cache, and if I don't, get it from the backend, keep it in cache, and then serve it. It's a good distinction to make. And the last one is pipe. Does anybody know what pipe does? Can you imagine from the name what pipe might do? It's a trick question, actually. Or can you read it well enough, because this is a terrible color? Yeah, so it sends it through to Apache. The difference between this and pass is that pipe says to Varnish: close your eyes and just pass on whatever Apache gives you. Don't cache it, don't store a record that it has to go through, just pass it directly through. You use pipe for things like large files. We talked about how problematic it could be to have an ISO file taking up your entire Varnish cache. You would want to just pipe that, because it's not a big deal to have that served out of the files. So pass means analyze and check the backend, and when it comes back from the backend, analyze again and send it on. Pipe is like: ignore the rest of the rules. I don't care what I've said in my configuration file, just pipe it. Yeah, Jason, am I wrong? Yeah, that's exactly right.
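The three outcomes just described (pass, lookup, pipe) can be sketched as a small Python model. This is only an illustration of the distinction as explained in the talk, not VCL and not Varnish's real internals. The class, the `inspected` list and the fake backend are hypothetical names invented for the example.

```python
class ToyVarnish:
    """Toy model of pass / lookup / pipe. Not Varnish's real internals."""

    def __init__(self, backend):
        self.backend = backend   # hypothetical callable: url -> response body
        self.cache = {}          # url -> cached body
        self.inspected = []      # urls whose backend response we examined

    def handle(self, url, action):
        if action == "pipe":
            # Close your eyes: relay whatever the backend sends, record nothing.
            return self.backend(url)
        if action == "pass":
            # Always ask the backend, but still look at the response.
            body = self.backend(url)
            self.inspected.append(url)
            return body
        if action == "lookup":
            # Serve from cache if we can; otherwise fetch, store, then serve.
            if url not in self.cache:
                self.cache[url] = self.backend(url)
                self.inspected.append(url)
            return self.cache[url]
        raise ValueError(f"unknown action: {action}")


calls = []

def fake_backend(url):
    calls.append(url)            # count how often "Apache" is actually hit
    return f"body of {url}"

varnish = ToyVarnish(fake_backend)
varnish.handle("/heavy", "lookup")
varnish.handle("/heavy", "lookup")
print(len(calls))  # 1, because the second lookup was served from cache
```

In this model a pass hits the backend every time but still records that it looked at the response, while a pipe hits the backend and records nothing at all.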
Yeah, that's good. Because if you want to do pipes, you actually have to put those in the config file, so in your VCL. What Varnish basically does is, if it's something that's uncacheable, let's say like these cookies for instance, we're not caching cookies, those pieces are actually getting passed, because Varnish doesn't know how to deal with dynamic content. So basically that content is something Varnish can't deal with, so it just passes it. But it has to do that analysis to figure that out, both incoming and outgoing. In a pipe, you're explicitly saying in your VCL: hey, Varnish, don't look at this. Just pass this HTTP request directly on to Apache. Whenever Apache replies, don't look at it again. Just stream the bits as they come out of Apache directly to the client, and just act as a proxy, essentially. So that's the difference. With a pass, you don't have to do any VCL. It is fine. It is a great difference. All right, so let's look at the VCL. Rather than have you guys copy-pasta everything that I've got up here, we will ultimately just copy my VCL into place, but I am going to explain what the structure is. Yeah, thank you. And you have to keep in mind that big structure. So first we have a section for vcl_recv. This is what we're going to do with the information that we get from the client. So the very first thing we're going to do is make sure that the backend is healthy, make sure the backend is responding okay to pings. If it's not, don't bother with anything else and just serve a cached page. You'll notice that, as a shorthand for getting it out of cache, we often will unset cookies. This is because Varnish's default behavior is: if there are no cookies, just use the cache.
We could just as easily say here, return lookup. Yep. Yeah. Yeah, it'll give you an error, and actually in this default VCL I define a nicer-looking error page and some nicer-looking default behavior. Just scroll down to the bottom of the file and you'll see where it is. I'm not going to talk about it much. But can anybody tell me what the difference would be between unsetting cookies and just saying directly return lookup? Mm-hmm. The difference is that if you just unset cookies, it's going to continue processing this logic. It's still possible to say no, don't cache this under any circumstances. Return lookup just ends processing and looks up in the cache right now. Yeah, something like that. That's right. Yes? The question was: will it actually remove the cookie from the request entirely, so Apache won't see it? And the answer is yes. Apache is dead anyway. If you unset cookies, logged-in users won't work. In fact, last week I was on Skype with Ernest, and he was saying that they were having a huge denial-of-service attack, and one of the mitigation strategies is to just unset all cookies. Everyone's anonymous, everyone's going through Varnish now, and then we can gradually allow exceptions through. Another question? Yes. The question was: if you don't unset the cookie, would it also depend on your hash configuration? Yes, you can actually also enter custom logic into the hash configuration section. We're not going to do that here today because it's a bit of an edge case, but yes, it will depend on how it hashes things. So: set the grace period, to allow the backend to serve up stale content if it is responding slowly. Pipe these paths directly to Apache for streaming. This is commented out by default in my configuration, but it might be useful. This is exactly the kind of situation.
We were talking about: if the URL is admin/content/migrate, admin/content/backup_migrate/export. If you guys don't know about the Backup and Migrate module, it creates a database dump of your site. There is no reason for us to cache that. So if the URL looks like it's coming from Backup and Migrate, then just pipe it. Questions so far? Good. Yes, all of these paths are regular expressions. Well, when you have the tilde rather than equals, it's a regular expression. Yes, this is inside the VCL file. It's to say that if the backend is responding slowly, then it's okay for us to serve stale cache, so cache that should have expired but that we still keep around. This is actually an important concept to get for D8. One of the big cache improvements is that just because a piece of cached content is expired, it doesn't mean we get rid of it, because, you know, who knows, maybe the backend is going to go down and that old copy of the front page would come in real handy, right? So expiry is separate from purge, and this lets you serve expired content. Okay. I'm not going to explain the logic. Copy and paste this part. The important part is that we set a header saying X-Forwarded-For. This is the most common way to indicate that you have a front end in front of Apache, so that Apache knows the request isn't actually coming from localhost. If you don't have this section, what Apache is going to see is: oh, localhost wants to load the front page again. And in all of your logs, everybody's requests are going to be coming from localhost. That causes issues. Maybe you want to geo-IP locate, or maybe your log files are actually useful to you. So we set X-Forwarded-For. You actually have to set an option in settings.php for Drupal to respect that. It's already there. You just have to uncomment it. It's well commented in settings.php itself, so we're not going to bother with that.
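The talk skips the logic of the X-Forwarded-For section, so the following Python sketch is an assumption about what such a step commonly does, not a transcription of the VCL on screen: if the header already exists, append the client address, otherwise set it. The function name and the plain-dict representation of headers are hypothetical.

```python
def add_forwarded_for(headers, client_ip):
    """Return a copy of the request headers with the client IP recorded.

    Assumed behaviour: append to an existing X-Forwarded-For value,
    or create the header if it is not there yet.
    """
    updated = dict(headers)                      # leave the caller's dict alone
    existing = updated.get("X-Forwarded-For")
    if existing:
        updated["X-Forwarded-For"] = f"{existing}, {client_ip}"
    else:
        updated["X-Forwarded-For"] = client_ip
    return updated


# Without this step the backend only ever sees the proxy's own address.
print(add_forwarded_for({"Host": "example.com"}, "203.0.113.7"))
```

With something like this in place, the backend logs can show the visitor's address instead of localhost for every request.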
But this is a good standard thing to have. Do not cache these paths. So you have a series of regular expressions. Notice that we have the double pipes to say or. So if we're going to status.php or update.php or an admin page: there's no reason for us to cache an admin page, and we don't want to cache AJAX responses. So if the path is any of those things, we're going to return a pass. The login page shouldn't be a problem. You'll see a bit further down that if somebody is already a logged-in user, we're not caching for them anyway, but yes, it's a good question. The question was: what about /user, why wouldn't you cache that? Question at the back. That's a good question. I mean, you could use pipe. Is there a good reason not to use pipe there? I can't think of one. Yeah, it will cache the fact that it should be a pass response. That would be my only two cents on that. I mean, you could pipe it, but... No, it will still go through that if statement. But the difference is, if you did the pipe, Varnish wouldn't actually evaluate the responses coming back. So in this case you could certainly do it, but I would rather Varnish know what it's not supposed to cache. I want that metadata to still be there, so it caches that metadata, and then as it's pushing it back it goes: oh, I've already seen this before, I know not to cache this, and it just keeps going. With the pipe, it doesn't know anything about it. It has to make that evaluation every time. All right, so below this section we have a lot of things that I have in here because they're handy for me to keep around, so I don't have to figure out the syntax again every time. We have this commented out. So this is: do not allow outside access to cron.php or install.php. This means that rather than having Drupal deal with the fact that you're not allowed to run cron.php or install.php, we can do that in Varnish.
This is handy specifically if somebody knows that you're running Drupal and they're hammering the hell out of your site. And we would just say here: just return a 404. So it looks to them like there is no cron.php. Actually, who was it? We were talking about this the other day: there are times when you're just getting hammered by a bunch of hackers, and they tend to be in the same IP range. You can just take that whole IP range and say no: if it's from this IP range, give them a 404 on every page, as long as it comes from this IP. You can actually even go one step further and look at user agents. You can look at URLs. You can put in any regex you want, on anything you can think of in the HTTP header. Right, you can filter through Varnish and return anything you want, so you can truly manipulate anything that this user is seeing just by searching for some key regex. Now, granted, that happens for everybody. So what we use it for is denial-of-service mitigation. So someone's beating a site, a Chinese web crawler for instance, maybe beating the site, and I know that the user agent is the same on every one. Guess what: that user agent is now getting a 404, or it's getting something else that I define. And now Varnish is dealing with that, and my site is operating correctly for everyone else. Yeah, generally you wouldn't want to do that on a long-term basis, but it's a great way to protect yourself if you need to. Yep. You're about to ask about this next line, right? Yeah, but it's pretty immediate.
Yeah, I mean, if you're getting a denial of service and getting your server handed to you, it doesn't matter at that point. You're down anyway at that point. So the question is: if the URL has extra arguments attached to it. This happens, for example, if you're tracking an ad campaign, and maybe instead of going to the front page you go to the front page with argument equals one. Can Varnish recognize the fact that that should be the same URL? Out of the box it actually uses the complete URL string, I think, but you can certainly use a regex to look for that. That would be something you'd likely do in the hash function. Yes, so the question was: is it possible to throttle, to have Varnish issue temporary blocks like that? I don't do it in this file. Do you have a particular configuration that you use for that? I mean, you wouldn't slow them down. Yeah, I don't throttle. It's either block or not. Oh, yeah, you can absolutely do that. I mean, you could do includes inside of here as well. So just do an include of some file, and you can do a regex with sed or something like that in a cron job. Say that, hey, in an hour I want this to be gone, sed it out, and off it goes. It's gone. Exactly, or you have it removed. What we do is, and this is a little bit off topic, you can create files dynamically of the IP addresses that are hitting your site, out of the HTTP header, and you can filter based upon those IPs and then feed those back into Varnish. So you can use those include files dynamically, and you can actually restart it as well to feed that stuff back. We can talk about it later.
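Going back to the ad-campaign question a moment ago: one way to picture "recognize that it's the same URL" is to strip the tracking arguments before the URL is used as a cache key. The Python sketch below is illustrative only. It is not VCL, and the `utm_` prefix is an assumed example of a campaign parameter, since the talk only mentions a generic "argument equals one".

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption for illustration: campaign parameters share these prefixes.
TRACKING_PREFIXES = ("utm_",)

def cache_key_url(url):
    """Drop tracking arguments so campaign links share one cache entry."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if not name.startswith(TRACKING_PREFIXES)
    ]
    # Rebuild the URL with only the arguments that change the page content.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


print(cache_key_url("/?utm_source=newsletter"))     # /
print(cache_key_url("/heavy?page=2&utm_source=x"))  # /heavy?page=2
```

Arguments that really do change the page, like `page=2` here, are kept, so only the campaign noise collapses onto one cache entry.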
Yeah, VCL is either an enormous configuration language or a reasonably complete scripting language. So for anything that comes through in the HTTP headers, you can take really advanced action based on it. The next section here, we say: all right, if the request URL... This is a regex. The simple explanation of it is: if it ends in .pdf, .asc, .dat, .txt, .doc, then it's okay to cache it. It doesn't matter if the user is authenticated. It doesn't matter what cookies they've got. It's not going to matter, because they're just downloading a damn file. This particular list is files that tend to be quite small, so it's not a problem, and it's faster to serve them out of memory. But you can modify this list as you like. This is now the entertaining part: how we can tell the difference between somebody who's logged in and somebody who just has a cookie set because you've got Google Analytics or something like that. Excuse me. So this is another part that most people would just copy and paste. What we do is we use simple modifications to the text string that is our cookies, to make it easy for us to tell with a regex when there is a session cookie or other cookies. Actually, this explanation is not bad. Put a semicolon at the front of the whole string. All your cookies are sent as just one string. It's one header. We remove all the spaces that appear after semicolons. That's step two. Step three is that we match the cookies that we want to keep, adding back the space that we removed a moment ago. So a cookie that begins with SESS and then has any string of characters after it, or SSESS and then any string afterwards: those are Drupal session cookie structures. Or if it's a cookie that says no cache, then we want to keep it, and we're going to replace it.
We're going to include a space in front of it, even though we just got rid of all the rest of the spaces. Then we're going to go through all of it and say: okay, anything that has no space after the semicolon, get rid of it. Then we can remove all spaces and semicolons from the beginning and end, because we don't need them. That's just convenience. And what that means is that it's really easy to have an if statement, because the only cookies that are remaining are the session cookies, secure session, or no cache. So if the cookie string is empty, unset it, because an empty string actually still counts as being set. Unset it, and you're going to get a cached page. Otherwise, if there is anything left over in that cookie string, pass it to the backend. I mentioned that it's complicated, and this is a copy-paste that a lot of people do. If there are particular cookies that you want to let through, a really common one might be has_js, you add it to this regular expression. Or you call Jason and me and we help you figure it out. Yeah. So, just for you guys: I know I need coffee, but, yeah, I can go through mine real quick. That's perfectly excellent. Anyone who has questions afterwards, we can definitely work through each problem with everyone who's got a particular situation they'd like to handle. Does that sound good to the group, or what do you guys think? All right, so the only step that you need to do to make sure you can test this stuff on your box is going to be to copy our default VCL to Varnish and restart Varnish. Good idea.
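The cookie-stripping steps described above are easier to follow when you can run them. Here is a minimal Python sketch that mirrors the steps as spoken, using `re.sub` where the VCL would use its own regex substitution. It is a model for experimenting, not the VCL file from the lab. The exact patterns, the `NO_CACHE` spelling and the function name are assumptions made for illustration.

```python
import re

# Assumed names of cookies worth keeping: Drupal session cookies
# (SESS..., SSESS...) plus a no-cache marker cookie.
KEEP = r"SESS[a-z0-9]+|SSESS[a-z0-9]+|NO_CACHE"

def strip_cookies(cookie_header):
    """Return the cookies that matter for caching, or None if none remain."""
    c = ";" + cookie_header                        # 1. semicolon in front of the string
    c = re.sub(r"; +", ";", c)                     # 2. remove spaces after semicolons
    c = re.sub(r";(" + KEEP + r")=", r"; \1=", c)  # 3. re-add a space before keepers
    c = re.sub(r";[^ ][^;]*", "", c)               # 4. drop cookies with no space after ';'
    c = re.sub(r"^[; ]+|[; ]+$", "", c)            # 5. trim stray ';' and spaces
    # An empty string would still count as "set", so report it as None (unset).
    return c or None


print(strip_cookies("has_js=1; _ga=GA1.2.3"))             # None
print(strip_cookies("has_js=1; SESSabc123=xyz; _ga=1"))   # SESSabc123=xyz
```

A result of `None` corresponds to unsetting the header, so the request can be served from cache. Anything else means a session-type cookie survived and the request should be passed to the backend. To let another cookie through, such as has_js, you would add its name to `KEEP`.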