Okay, I think it's about 2:15 and people have stopped trickling in, so we'll get started. The lights will go down momentarily to make this easier to read. My name is Narayan Newton. I am one of the system administrators for Drupal.org, and I've been doing system administration and performance work in Drupal for a while now.

You can't hear me? Okay. I was warned about the directional nature of this mic. How's that? Okay, great.

So, as I was saying, my name is Narayan Newton. I'm a system administrator for Drupal.org, I've done performance work in Drupal for a while now, and this is actually one of the first presentations in quite a while that I'm excited about, because it's not like the ones I've done in the past. I've done a lot of presentations about very high-performance situations: how to configure MySQL to absolutely saturate I/O devices, how to deal with a MySQL server that you want to be incredibly concurrent in a vertical way, or how to take a site and scale it out horizontally. This is not that. This is a much more practical application: me explaining my process for coming to a site that is performing poorly, the steps I go through to deal with that, and why I think it's a good idea to do it very iteratively, with a methodology you strictly stick to. I'll try to explain the reasons behind that.

I work with a lot of people, and a lot of them are extremely good at exactly what I'm good at: knowing how to set up APC, knowing how to configure Apache and MySQL, using Xdebug and Webgrind to go through a site, using MySQL slow log analysis utilities to go through a slow log and figure out what the big problems of a site are, and going into Views to fix a view or fixing a query in actual module code. Those are skills that are pretty widely held
at this point. What I've seen that is not widely held is the approach: going to a site that is extremely slow and not getting bogged down in minutiae. I have had clients where the first 50 minutes of a contract is spent talking to their very skilled developers and system administrators about the perfect way to configure APC, which is an opcode cache for PHP. APC is very important, but once it's on and working, the minutiae of reducing its hits to disk is not something we should be talking about when there are 20-second page loads on the site. APC is not going to fix that; it's not going to impact that in a way that we should be talking about. That knowledge, being able to address a site correctly from layer to layer to layer without letting yourself get bogged down in these little areas, and getting the low-hanging fruit so that a site becomes usable and meets its goals the quickest, is something I don't see talked about. So I'm going to really focus on that here.

And this is not a presentation, so this is one of the worst ideas I've ever had, I'm sure, and we'll see how it goes. I have a few more slides past this, and then I'm going to exit to a VPS that's running on my laptop, with a site that I have configured to be terrible, and we are going to debug it together.

He's going to come back and try to dim the lights.
I don't know how to. Is this hard to read? Okay, there we go. Thank you.

Okay, so the first step. We are coming in as a new hire at a company, or we have a new client, or we've finished development on a site and need to do the performance work at this point, or we need to figure out why it's so slow on the production server when it was fine on staging, or something has changed and now it's really slow. There are a number of reasons for this. What the client, or your boss, or the users will always say is "it's slow," or "the front page is slow," or "it's down all the time." These are not useful metrics. They fully believe they're useful metrics, and they're trying to help, but it's not really indicative of the problem in many cases. It's a good start, but when I start talking to a client, I need to get a lot more details from them.

What sites are slow? That is a weird question, but sometimes they're talking about a site that is completely different from their front page or their main site, and I have gotten 45 minutes into an engagement before they pointed that out, so I just want to mention it. What pages are slow? Is it slow for anonymous users? Is it slow for authenticated users? Do you only have anonymous users? What is the load pattern for the site? Is it 50/50 anonymous to authenticated? Is it 80/20? Is it five or six people on the site all the time until a commercial comes up, and then hundreds?

These are all questions that need to be asked at the beginning, because otherwise you're just going into a site that is incredibly complicated. It's Drupal. It's stacks of modules, and themes on top of that. It's PHP. It's Apache. It's possibly Varnish.
It's possibly Memcache. It's MySQL. It's the operating system beneath that. Is the hardware virtualized? Is the hardware virtualization done right? Is the SAN that the virtualization is backed by overloaded? These are too many questions to address all at once. Limitations are good in design, in engineering, in almost everything, and the first conversation I have with the stakeholder or client for a website is about finding those limitations. If they say it's all anonymous, that's a huge limitation, and that's extremely useful. If they say anonymous users load the site in under a second and authenticated users load the site in 25 seconds, that's incredibly useful information. That means caching is working. It means that if anonymous users aren't complaining, the cache hit rate is high. Maybe they're not posting articles that often; maybe block caching would be useful. These are all things you get from these conversations, and they are very important to have. Honestly, as much as we hate meetings, the longer you can talk to a client about exactly what the load patterns and goals for a website are before you start performance analysis, the better. And take notes, even though I'm terrible at that.

So we define goals and load patterns to maintain focus. For our example, here are our goal and our patterns. We're going to say that it's split down the middle: we have 50% anonymous users, 50% authenticated users.
This is by far the worst case, because you have no one place to focus. The anonymous users are just as important as the authenticated users because they lead to new users, but the authenticated users, let's say, are paying, so we can't ignore them either. Our goal is 800 millisecond latency. That means that from the time the request starts, we want the first bit of data of the actual site to be delivered within 800 milliseconds. Note that this does not take front-end rendering time into account at all. We're not going to talk about that; that is its own presentation, its own conference. We're just going to talk about the actual Drupal bootstrap and page output, which is also extremely important, which is why it's often focused on.

So our first step: we've had this conversation with the client. We have some load times, some complaints about specific pages, some ideas about load patterns. Now we need to step away from the client and assume that they just lied to us: that everything they said is wrong, and in fact that they have never been to the page and are visiting Google and reporting those results.

So we're going to start. Our site is Drupal 7 with Views and Flag. The content is Devel-generated. There are about 600,000 nodes and about 100,000 users, so it's a fairly large site. It's larger than I wanted it to be, because if you're going to do this sort of presentation, I highly recommend you don't put an SSD in your laptop before you do. It's amazing how quickly SSDs can do full table scans.

So we're going to exit out of this now, and here's our site. I'm using Firefox for this; you can use Chrome, Safari, whatever. The important thing is that you need something like Firebug, Chrome Developer Tools, or the Safari web console. You need a net console to show how quickly the front page is loading, how quickly the resources are loading, and the headers for all of those things when you're going to start looking at the performance of
sites.

So here's our front page. The first thing we're going to do is say, okay, the client said the front page was slow, so let's load the front page. While it's loading: I know that I loaded it before, and so it is slow. It's 8.23 seconds, so it's actually extremely slow. That's the point where users start leaving the site. Actually, that's beyond the point. At this level of slowness, new users are only going to stick around if they have a vested interest in coming to the site. There are going to be no users coming in just because they thought it looked cool; those people aren't going to wait eight seconds. Your Google PageRank is also going to go down from this, and likely your servers are going to go down when they get load.

Okay, so we're going to start actually taking some notes here. That took about eight seconds to load. We were also told that there's a gallery feature and that it's terrible, so we're going to load that, and while that's loading we're going to load an article page. We're doing this because you'll often get instances where a client doesn't realize what they're actually seeing. The best example I have is old Drupal 6 sites with the admin toolbar turned on, or Admin Menu, I guess that's what it's called. Admin Menu actually makes sites pretty slow in some cases. If anyone has used it, there's a place where you can switch users, and it lists all the users by role. That's an incredibly bad query. So sometimes clients don't realize that their site may actually be performing fine and they just don't know it.

This is not one of those cases. This page just took 23 seconds to load. This page just took 12 seconds to load. And now we're going to log in, and we update our notes, because we're going to do everything very strictly. I'm not going to load this, but that is also 12 seconds.
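The note-taking step above can be sketched in a few lines of Python. This is a minimal sketch, not something shown in the talk: the helper names `time_to_first_byte_ms` and `over_budget` are hypothetical, and it measures only time to first byte, in line with the 800 millisecond goal, not front-end rendering time.

```python
import time
import urllib.request

BUDGET_MS = 800  # latency goal stated in the talk


def time_to_first_byte_ms(url, timeout=60):
    """Milliseconds from request start until the first body byte arrives."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        response.read(1)  # blocks until the first byte of the body is read
    return (time.perf_counter() - start) * 1000.0


def over_budget(timings_ms, budget_ms=BUDGET_MS):
    """Return (page, ms) pairs that miss the budget, slowest first."""
    slow = [(page, ms) for page, ms in timings_ms.items() if ms > budget_ms]
    return sorted(slow, key=lambda item: item[1], reverse=True)
```

One way to use it is to call `time_to_first_byte_ms` once per page the client complained about, store the results in a dict keyed by path, and pass that dict to `over_budget` after every change, so that each iteration is compared against the same baseline.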
They're basically the same page. Once we log in, we have a favorite content view, and the client has also reported that this favorite content view, while it's pretty much required from a feature perspective, times out; it just doesn't load in time at all. We're seeing that right now, because it took 13.4 seconds to load. So now we have some validation.

At this point it's very tempting to turn on the Devel query list, go to these pages, go through every SQL query there, and try to figure out which query is causing the problem. I'm guilty of this all the time; it seems like the natural step. The problem is that it doesn't represent what might actually be going on. For example, what if there's a block that's cached some of the time, but in some cases does run and slows down the site, and that's what people are reporting? If we reload this with the Devel query list, we might not see it. Then we have to start iterating on that specific page, and suddenly it's already done: we're locked into a specific page, we're looking for a specific query, and we're not doing an overview anymore. We're going to miss things that might be extremely low-hanging fruit that we could fix very quickly.

This is one of those areas I'm going to point out repeatedly: times where I'm tempted to skip ahead to something in particular when I really should be zooming back out to look at the big picture again. It's almost never incorrect to stop what you're doing and look at the big picture again. You'll definitely waste time, but the one time in five where you zoom out and find that a piece of low-hanging fruit is on every single page, runs 12,000 times a minute, and is taking a huge amount of time is well worth the time you might waste by zooming out when you could
have just gone in and fixed it right there.

So we now know that there are most likely some bad queries on this page; it's taking a long time to load. What we're going to do is not assume that they're bad queries. I already have Devel enabled, but I do not have any of the Devel options enabled. We're going to go in and say display the query log, display the page timer, display memory usage, and save. Now we're going to reload just the front page, because we're zooming back out. All we want here is to look at the page timer and make sure that the SQL query time is dominant, because we think that queries are the problem here, but we don't know that, and before we switch over to the MySQL server to see what's going on, we need to know. And here it is: we have confirmation that queries are the problem. Queries themselves are taking 8.5 seconds on the home page.

Again, it's tempting to scroll down here and start looking at why, and I would advise you not to. In this particular case it would work, but in the general case it's not a good idea. So now we're going to go over to the actual VM this is running on.

How many people have used the MySQL slow log in any way? Great. How many people have used a slow log analysis utility? Great. Okay, so we're going to use something called the Percona Toolkit. It used to be called Maatkit. Before that, it was called something else.
I'm hoping they don't change the name next year. What we're going to use is pt-query-digest, a tool that you point at a slow log. It does analysis over the slow log and gives you an aggregated list of the top slow queries. This is much better than just looking at the slow log, because you get an aggregate view: queries by count and queries by execution time. It even weights them. If one query runs 20 times but takes 30 minutes in total, and another query runs 5,000 times and takes about five minutes in total, that second one is still a bad query, but the one that runs less and takes more time is going to float to the top above it. That's useful, because for a while these query digest applications favored count a lot more than runtime.

So we're going to make sure that the slow log is enabled, and it is, and it's set to a long query time of two seconds. I would probably set it to one, but it doesn't really matter for this case.

How's that? How about that? Okay, this is what pt-query-digest output looks like. Right here is a summary of everything. It's semi-useful, but not really when you're looking for a problem query and you already know your queries are bad. It is useful if you're going to run this on cron and have it emailed to you; it can tell you when things are going off the rails.

This is now too small. This is one of the most important parts. These are the ranks for slow queries by their total time, the number of calls, and the Apdex score, which is actually fairly cool; that's a new addition. We're going to go through this, and here's our first query. Every query has a query time distribution, which tells you the distribution of how long this query runs.
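The ranking idea described above, grouping queries that differ only in their arguments and ordering the groups by total time rather than by call count, can be sketched as follows. This is a simplified illustration, not how pt-query-digest is actually implemented: the `fingerprint` and `rank_by_total_time` helpers are hypothetical, the normalization is far cruder than the real tool's, and the slow log is assumed to have already been parsed into (SQL text, seconds) pairs.

```python
import re
from collections import defaultdict


def fingerprint(sql):
    """Collapse literals so queries differing only in arguments group together."""
    sql = sql.strip().lower()
    sql = re.sub(r"'(?:[^'\\]|\\.)*'", "?", sql)  # quoted string literals
    sql = re.sub(r"\b\d+\b", "?", sql)            # bare numeric literals
    sql = re.sub(r"\s+", " ", sql)                # runs of whitespace
    return sql


def rank_by_total_time(entries):
    """entries: iterable of (sql, seconds) pairs.

    Returns (fingerprint, call_count, total_seconds) tuples, with the
    largest total time first.
    """
    totals = defaultdict(lambda: {"count": 0, "total": 0.0})
    for sql, seconds in entries:
        row = totals[fingerprint(sql)]
        row["count"] += 1
        row["total"] += seconds
    return sorted(
        ((fp, row["count"], row["total"]) for fp, row in totals.items()),
        key=lambda item: item[2],
        reverse=True,
    )
```

With the numbers from the talk, a query that runs 20 times for 30 minutes in total sorts above one that runs 5,000 times for five minutes in total, even though the second has far more calls.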
A query's run time will often vary with the arguments you give it. The most common case is a listing of content by author, for example on a user page. If you have an admin user who migrates in a bunch of content from an old site, and you accidentally make that admin user own all of the migrated content, it's very likely that the query is going to perform extremely poorly for the admin user but fine for everyone else. The distribution lets you see whether that's the case, which is very useful because, again, it prevents you from wasting time.

Going down here, we see that this one is SQL_NO_CACHE and it's a straight SELECT, so this is likely a backup job. We're going to skip it and go down to this one. This is our first real slow query. I don't expect everyone to look at this and know where this query is, or what it might be, or even how to fix it. What I want to do is show everyone how to find it here and possibly report it. Definitely keep an eye on it: have this open in another window for our next step, so that we can map this query to where it is on the site. Then we can say, okay, this is our top real query and it occurs on these pages. Just that piece of information is extremely useful, much better than "this page is slow," and of course much, much better than "the site is slow." At that point you can either try to fix the query yourself or find someone to help you fix it, and it's a big jump forward.

The big thing I note here is ORDER BY node_comment_statistics.last_comment_timestamp. I'm likely to use that to identify this query, and we're going to go on. This next one is pretty much the same query as above, but you see it's SELECT COUNT(*). When you see that, you immediately think: this is a pager query, and this is the count to show how many pages could be in
this pager. Amusingly, this is one of the worst culprits on standard websites that have complicated views and lots of data. A lot of the time, the solution for a view that is performing poorly and can't really be fixed is to turn on what's called the lite pager, which is a module that sadly has not been fully ported to Drupal 7 yet. There is a patch that you can apply, but for Drupal 6 it exists. All it does is not run this SELECT COUNT; it just says "next page." So instead of saying one, two, three, four, dot dot dot, and then the end page, it just says "next." That little difference can make a huge impact on performance, because the first query that runs has a LIMIT on it, and that LIMIT allows MySQL to optimize it in very interesting and really effective ways, whereas the actual count query, not so much.

Views Lite Pager: views_litepager, one word. At the moment you wait for Views Lite Pager to be ported, but Views for Drupal 7 is also a little bit smarter, which is nice: for that count query it will often remove the ORDER BY. For now I've actually used Lite Pager with that patch. There's an issue open for porting it to D7, it works, and the patch is very limited, so if you really need it you can likely just patch it. I would expect a release soon, because it is fairly widely used.

Okay, next one. And again, you look at this and you've got no idea what the hell this is. I can identify this by this right here, and that's what I'm going to do. What I'd recommend is that you have a text editor open and copy the top four or so queries, if they're dominant for a site, into the text editor, so that in the next step you'll be able to identify them and map them:
Okay, this is in the slow log, and it appears on these pages. That's a very crucial step.

So that's our third real query, and this is our fourth. This one I'm again going to identify by this right here, because it's actually pretty identifiable. Again, what you do is copy this to a text editor. And this is actually a duplicate. You're going to have to look for these. Slow query analysis tools are very good at what they do, but a lot of the time they're not able to match something exactly, especially with how Drupal builds queries, so there might be duplicates. You just have to look at the query and go: oh, that has all the same WHERE conditions, the same joins, the same types of joins, and the same ORDER BYs as that query up there. And here's another pager query that we're going to note but kind of ignore.

And that's it; those are the only slow queries. We have a few pager queries and basically three other queries that we're interested in. At this point the difference might seem small, but instead of going through the Devel query listing page by page, we now absolutely know these are the three slow queries. They're completely dominant in the slow log. The slow log is going to contain everything over two seconds in this example, and our pages are loading in eight seconds or more, so that's going to be good for us. Now we can go back to the query listing per page and figure out exactly what these are.

So we're going to go to the front page and start scrolling down. We see 153 milliseconds. I don't care; we're way above that. Six. I don't care. We keep going, and bam: five seconds, two seconds. Here's the pager query.
Here's the regular query. And we can see that this is a Views plugin, which makes our job much easier. We're going to scroll through the rest of this to make sure there isn't something else, and there isn't.

Because this is Views, we got a break, which is the last time I'm ever going to say that from a performance perspective. We're going to go into Structure, Views, Settings, Advanced, and add view signatures. Then we reload, and what we see is that now there is this right here: the signature. This is tracker, block 1. That's the break: if it's a view, we can turn on view signatures and know exactly which one it is. That's amazingly useful, especially for Drupal 7, because everything is done through a query builder and it's very hard to find things in the query builder. Now you get to know exactly which view is the problem.

In this case it's the tracker view. What you'd probably do is ask someone for help, go through the issue queue, or just search Google, because tracker is actually a pretty common query to have problems with. What you'll probably see is: hey, you don't need the pager, and the pager is a problem; and hey, you might want to cache it. And that's it. The query is still terrible. The tracker query is terrible. There's a way to fix it; it's called Tracker 2. But by default,
it's awful. But now we're not running that pager, and on the second load this block is going to be cached. And this block appears on every page. So just by disabling the pager we've cut out two seconds, and with it cached we've cut a total of seven seconds from every page load.

Now I'm going to go to favorite content and do that one, and then I'm going to back out of SQL for a bit. The favorite content page is one of the problems highlighted by our client, and we saw that there were a few SQL queries in the slow log. One of them is actually this one. One of the things I highlighted was an ORDER BY on flags, and any sort of favorite content view, page, or block is likely going to be Flag-based. We see exactly that here: ORDER BY flag content. This is exactly the line I highlighted in the slow log.

This is one of those areas where you go: okay, this is a lot of content. That seems like too much content for me to have actually favorited. I'm going to go in here, and this is one of the biggest mistakes I see: this isn't actually showing favorited content. It's showing all content, sorted with your favorites at the top. That is an idiom that is almost impossible to make perform without writing your own module. What I recommend is to have a listing of content and a separate listing of flagged content. Then, on this relationship, check "Include only flagged content," and that will actually just full-on fix this. It doesn't have much to do with my methodology here, but I wanted to include it because I've seen it so often. It's an extremely easy change to make, and having two listings usually doesn't break a feature.

Okay, so we've made two changes. We've removed the pager from the tracker view, and we've made this view list only flagged content. Now we're going to stop and back out. We're going to back all the way up and see what the impact was on these
pages, and basically go back to our global view and make sure that the changes we made are (a) helping and (b) that there isn't something else that's a problem.

The first thing I see is that this took too long. We should have fixed that. This is six seconds, and it was 13 seconds, so that's better, but six seconds is still too much. We look here, and the query time is still pretty high, and the execution time is higher. That's weird. Then I start looking through here, and things look a little odd. These are a lot of registry calls. These are a lot of path calls. Wait a minute, we fixed this; this should be cached.

So at this point I back out again and go check the caching settings. There isn't a minimum cache lifetime set, which there should be, and we're not caching for anonymous users, but that shouldn't affect any of the things I just saw. So I'm going to fix those, make a note of that, and go no further. I'm not going to set up Varnish or anything like that, because I just found something that's a global issue, and I need to keep backing out to find the solution for it.

So now I'm going to back all the way out of SQL, even though there are still bad queries. We note that there are still bad queries and back out, and I'm going to use something called Xdebug. Xdebug is a PHP PECL module. It's in IUS or EPEL if you're using CentOS; it's in pretty much every distribution at this point somewhere, and you can install it easily. Then what I do is enable the trigger. If you don't do this, you're going to have to enable profiling globally, and trust me, you don't want to. The trigger allows you to append this to a URL, and then that URL will be run through Xdebug. So let's do that.

The other thing I'm going to show you is Webgrind. If you just Google "webgrind," it's the first hit, and this is, I think, the easiest way to look at this output. I just stuck it in the web root.
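The trigger step mentioned above, appending a parameter to the URL so that only that one request is profiled, can be sketched like this. The talk does not name the parameter; `XDEBUG_PROFILE` is an assumption based on the Xdebug 2 profiler trigger (used with `xdebug.profiler_enable_trigger`), and the helper name `with_profile_trigger` is hypothetical.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_profile_trigger(url, trigger="XDEBUG_PROFILE"):
    """Return url with the profiler trigger parameter added exactly once."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    if trigger not in dict(query):
        query.append((trigger, "1"))  # assumed trigger name, see lead-in
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Keeping the trigger per request rather than profiling globally means ordinary page loads are not slowed down by the profiler and the output directory only receives files for the pages under investigation.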
It's very easy to install: you literally just stick it in the web root. All it does is look for the cachegrind files that Xdebug produces in /tmp, and it automatically picks the newest one. You open up Webgrind, press update, and it parses the file, and that's all. How many people have seen this screen before? Fantastic.

Okay, so the first thing I notice here is: yes, we clearly still have some query problems. What I also notice is these two lines. Now, I'm not going to say that everyone should be able to instantly look at this and pick out those lines. What I want is for you to keep poking at things that don't look right, and keep zooming all the way back out to the original page load time and working your way back in to figure out what the actual problem is.

One thing you should note is that memcache connect should never, ever be this high, and that theme process registry really should only run once unless caching is broken. And we already saw that we cached a block and it had no impact. All of that put together means caching seems broken. So let's check our settings. We are using memcache, okay. Let's try to connect to memcache. You can use telnet for this: you just telnet to the IP and the port. Well, that doesn't look promising. Is memcache running? Yes. Is the firewall running? Yes. Do any of these ports look like the memcache port?
No. So we have no caching; that is what we take away from this.

There are definitely faster ways to find this. You could decide that for every client you go to, you're going to audit their firewall. But that's fairly difficult to do, sometimes impossible, and it doesn't account for all the other issues you might hit on a standard client engagement, or any sort of engagement. Whereas with what we just did, we keep backing out and going: okay, it's not a query, because we cached that query; we're looking at PHP execution time and seeing something weird. If you just keep doing that, no matter what the issue is, you're going to find it. You don't need these lists of one-off things that you always check. Everyone will have that list to some extent, but it's not something you should count on.

So suddenly things are looking a lot better. This page is loading much faster. If we go down to the Devel query listing, we see that there are a lot fewer calls here that should be cached. But look at this: there are still a bunch of registry calls.

And what I just did is wrong. I looked at it, thought, oh, there are a bunch of registry calls, and what I was about to do was go look at the theme, because in my head I have a rule that says if the registry keeps being rebuilt, someone somewhere has put a registry rebuild in the theme. That is not something you should do, and it's something I clearly still do, because I almost just did it there. What we're actually going to do is run Webgrind again. It's that easy, by the way: you just refresh Webgrind and it finds the newest file. Hey, look, memcache connect is not here. File scan directory seems weirdly high. Registry check seems really high. Registry parse files. This is wrong.
If you just look at this, it looks wrong: the registry is being rebuilt. So now we can back up and go: okay, I've validated what I think is happening, and now I'm going to look in my theme, which in this case is Bartik, and look for a registry call. And I've hopefully even highlighted it. That is a terrible idea, so I'm going to delete that. I'm going to restart Apache, because it's one of my tics, to make sure that APC isn't caching it. And we're going to go back, not run the profiler this time, reload twice to warm the cache, and look: no registry calls.

Now, clearly that was completely contrived, but how I showed finding it is not. If I hadn't caused the problem myself, that's exactly what I would do to back out and find it.

So what we have at this point: we've focused on authenticated users, and authenticated users are now fast for this page and for favorite content. I'm not going to touch the gallery, because I'm running out of time and it's kind of boring, but what I'd do there is the exact same idea: go to the Devel query listing and match it to the slow log to make sure we're looking at the right query. There might be another slow query on the gallery page, but if it's a one-second query and it's not at the top of the slow log, ignore it. It's not low-hanging fruit.
You'll get it on the next iteration. Stay very iterative, do only exactly what you're trying to do, and don't get distracted by other things. Make a note of them, but don't get distracted.

So we're going to back out of this. We've only covered authenticated users at this point, mostly because the anonymous ones are easy, but we're going to make sure that, yes, anonymous page caching is working. I was actually planning on going into setting up Varnish, because I have it almost, mildly, set up. But it's 2:54, which means there are about 20 minutes left, and I really wanted to take questions, so I'm going to leave it up to you. Do you want me to cover setting up Varnish to tune for anonymous users, or do you want to go to questions? Okay.

So at this point I've gone through, I think, about two iterations. We found that there were PHP execution time problems, mainly the registry rebuilding. We've gone through the slow log. We have not tuned MySQL, you might note, because we went through the slow log and found actual, real slow queries, and that was the low-hanging fruit. In this setup MySQL is actually tuned horribly.
This is an InnoDB site, and it's using, I think, 8 megs of buffer pool. It's terrible, but that's not the low-hanging fruit. We're going iteratively and strictly, trying to get the big problems and not focusing on the random stuff we find along the way.

So what we're going to do is talk to the client. The client is happy with authenticated user performance now. It's under 800 milliseconds, and they don't really care about anything beyond that. But they're unhappy with anonymous users. It's fast, but it's not as fast as it could be. These are the new users coming in, there's a commercial coming up, and they're not sure they're going to be able to handle the traffic. So we're backing out at this point and saying, okay, Varnish would help you a lot.

What we do, because this is Drupal 7, is first enable Drupal 7 to use Varnish. And having worked on Pressflow for a really long time, you have no idea how nice it is to just uncomment these two lines. What this is is the magical, almost undocumented tunables to make Varnish work, and mainly it's this one. This one you can set in the UI, but I tend to like setting it in settings.php, because for some reason people just go into the performance panel and change things. So I like setting things in settings.php; at least then they get frustrated and then blame you.

So in Drupal 6 you're going to have to use Pressflow. In Drupal 7 it's built in. In Pressflow for Drupal 6 the process is actually a little bit easier, because you install Pressflow, then you go to the performance panel, and everything's just there. There isn't a magic tunable. You could argue about that; well, you could go to the Drupal 7 issue queue and find the actual arguments pro and con. But that's how it is.

So now Drupal will send the correct headers, basically. What Varnish does is proxy your request to Drupal, and Drupal says to Varnish:
"Hey, you can cache this page for this long, assuming this isn't true." And that assumption is Drupal saying Vary on Cookie. What that means is it's telling Varnish: hey, if a request comes in and that request has a cookie, and the object in your cache has a cookie, then unless those two cookies are the same, don't serve it from cache.

So what we're going to do now is look at Varnish. I have already installed and configured Varnish, except I didn't really configure it: what this is is actually copied and pasted off the Four Kitchens wiki. This is a VCL that at this point almost the entire community has contributed to. I'm going to edit my slides, add the links to the various things I've mentioned to the end of the slides, and then put them on the node for this presentation, and this will be part of that.

So all you really have to do is install Varnish and put this VCL in. The only part of this VCL you should really pay attention to is this. Varnish doesn't cache requests that carry cookies; that's basically the default you can think of. What this says is: I'm going to look at every request coming in, and I'm going to strip off the cookies I know don't matter to the back end. They don't make this request different. What makes a request different? The session cookie. Or if you have a dumb module, and I'm not going to mention the name, that sets a cookie to decide something like showing a block, and that's not done in JS, that matters to the back end. Things like Google Analytics, things like a lot of packages that track paths through websites, that's all done in front-end JS, and you can strip those cookies and cache the request. So what you do is basically add the identifiers of the cookies you want to strip to this little thing. They're separated by these pipes, and it's just a regular expression.
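To make the pipe-separated regular expression concrete, here is a minimal Python sketch of the same idea. It is not the VCL from the wiki; the cookie name patterns (`__[a-z]+` for analytics-style cookies, `has_js`) and the function name are assumptions chosen for illustration.

```python
import re

# Hypothetical pipe-separated list of front-end-only cookie name patterns.
# Add more names between the pipes to strip more cookies.
STRIP_NAMES = r"__[a-z]+|has_js"

# Match a whole "name=value" pair, including the separator in front of it.
STRIP_RE = re.compile(r"(^|;\s*)(" + STRIP_NAMES + r")=[^;]*")


def strip_frontend_cookies(cookie_header):
    """Remove cookies the back end does not care about.

    Returns the remaining Cookie header, or None when nothing is left,
    which is the case where a cache could treat the request as cookie-free.
    """
    cleaned = STRIP_RE.sub("", cookie_header)
    cleaned = re.sub(r"^;\s*", "", cleaned)  # tidy a leftover leading separator
    return cleaned or None


only_tracking = strip_frontend_cookies("__utma=123.456; has_js=1")
with_session = strip_frontend_cookies("__utma=1; SESSabc123=xyz; has_js=1")
```

In this sketch a request carrying only tracking cookies ends up with no cookie at all, while a request with a session cookie keeps it and would still go to the back end.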
It's very simple.

At this point you can look, and Varnish is configured to listen on port 80. If you look back at the VCL, by default it wants Apache, the back end, to be on port 8080. So all we have to do is go into the Apache configuration, search for the Listen line, which is here and which decides Apache's port, and set it to 8080. Then we restart Apache and restart Varnish, or start Varnish.

Yes: if you have a virtual host that has its own Listen directive, you're going to have to change that as well. What he said is to make sure that your virtual hosts don't screw up the port.

And then if we go here: well, it was fast before, and this is why Firebug is eternally useful. If we look, we see that X-Drupal-Cache was a hit. We're passing through Varnish now, and it was a miss. We're going to clear our cookies, because if I press force refresh, it's actually going to force refresh the cache. And it's still a miss. Nice. That's fine.

So here's how you debug that. Varnish has a utility called varnishlog that actually logs the requests. So we're going to make that request again. Oh, and it's Firefox being stupid. This happens a lot. Browsers have decided that they're not going to respect anything anymore, which is just fun. So we did hit; you can see it right here. Here's the request coming in. We do a lookup, we hash it, and we hit, and then we deliver, and it was a 304 Not Modified. So it worked. It just didn't show that it worked, because what happened is Firefox asked whether the data had changed, and it hadn't, and Firebug right now sadly doesn't update headers when that happens.
So, I don't know, it's just one of those things: if you're really unsure, just go look at varnishlog. It shows what is actually happening. Your browser likely doesn't. If you're using Chrome, your browser just doesn't. Firefox is semi-close to reality. Safari is actually wrong in some cases, like entirely, entirely wrong. So if you're going to use Varnish, be very comfortable with varnishlog, because it's absolutely your friend.

So we're now using Varnish. We've backed up. We still have things left to do: APC isn't configured at all, not in the slightest. MySQL is completely unconfigured, and we're basically reading everything off disk. But the page is fast.

We've been here for a little bit less than an hour, like 45 minutes. We could have spent exactly this time configuring APC. I can spend 45 minutes configuring APC, especially for Drupal 7. We could have spent three hours configuring MySQL. But because we did this iterative process, continually making sure that we were absolutely focused on exactly the goal, we have a fast website without doing that. And we can either stop now and be happy that the site meets its goals, or, more likely, we can iterate from here with a much happier client or much happier stakeholders, and spend more time on the issues that matter.

But I would still recommend, even when you're going deeper, that at every level you back out and make sure what's happening actually works. For example, APC. Does anyone here know about the include_once override for APC?
Okay, for those people that don't: the include_once override, when configured correctly, basically makes it so that instead of Drupal touching all of its PHP files basically all the time, whenever it feels like it, whenever the wind blows through slightly hard, APC will override those calls and make it basically touch the cache instead. You wouldn't think that'd be a big deal. But a lot of places have, for example, their web roots on NFS, or they have pretty slow drives, or they just have a huge amount of traffic. You can definitely throw enough traffic at a single web server that those little touches on the disk start mattering.

When you start configuring APC, you're going to realize that actually making that work is incredibly hard to do, and the only way to really do it is to make a change, a single change, and back all the way out, because there's no other way to make sure that things are actually working. In theory, it's two settings. In actuality, it's two settings, plus making sure you're not doing specific things with your PHP code, making sure the cache isn't doing specific things, and making sure Apache is configured correctly. You have to keep backing up.

And that was the goal for this presentation, I guess. Along the way I showed, you know, kind of fixing this site partway, showing the Percona Toolkit and how to go through a slow log, showing a little bit of how to set up Varnish, showing a little bit of all these different things. But the goal of this is to try to get across how I approach these sites, and how I try (I fail a lot, but I try) to keep to this strict methodology of staying focused and continually looking globally. Because I don't see it done that often, and in my view it's incredibly useful.

So we have about 10 minutes left. Does anyone have any questions?
He asked if the include_once override conflicts with PHP 5.3. To some extent. Honestly, the include_once override conflicts with almost everything. You could say the same about APC, honestly. It's an incredibly useful thing; PHP without APC performs terribly, and it has performed terribly for a long time. But it's a hack. It's the most enterprise-supported, maintained hack that I've ever seen, but it's still a hack, and you just run into problems. The include_once override is an amazing thing that I was really excited about right up until I started using it. Which is depressing. It's why system administrators drink.

[Audience question] So in the beginning you showed how to use the slow log to identify the bad queries. But then you just went into Firebug, or not into Firebug, you turned on the option in the Devel module to see the queries. So why couldn't you just go directly to those queries on the Devel page?

Right. Okay, so what I was trying to get across with that is this. You can turn on the Devel query listing and go through page by page to look at those slow queries. What you would have to do, and you can do it this way, actually, is have Notepad or TextEdit or a spreadsheet open, look at those queries, go through the pages, and say, okay, this page has three slow queries, this page has four slow queries, this page has five slow queries. Where do they overlap? Which one should I fix? These queries can take hours, days, weeks to fix. You need to be able to prioritize this, or you're going to be spending a huge amount of time on something that didn't end up mattering. Which is, I'm sorry,
I apparently didn't explain that particular point well, because that was one of the points I wanted to get across: it's fairly easy to open up the Devel query listing and go through and start fixing random slow queries. But what separates someone who can come in and fix a site in a day from someone who can fix random slow queries is being able to find a way to prioritize what you're doing. And in my opinion, the slow log with something like pt-query-digest is the best way to do that.

[Audience question] So, excellent talk. I was just curious why you went with Xdebug and Webgrind as opposed to XHProf.

Yes. So he asked why I used Xdebug instead of XHProf. XHProf is a little bit newer than Xdebug and is much, much better in a variety of ways, in my opinion. But it's a little bit harder to use, and it's much, much harder to set up. It is integrated into Devel, which is actually really, really cool. You can set up XHProf and integrate it into Devel, so when you're going through your site there's actually a little link at the bottom of the page that takes you directly to the query profile, sorry, the code profile. And that's really, really cool. But it doesn't compare, in my opinion, to just, you know, installing something out of PECL... that's incorrect, I'm sorry: installing something out of your distribution's yum repository or apt repository, putting a few lines into the INI file, throwing on a trigger, and then throwing Webgrind into a web root. Getting easier than that is pretty hard, and that's what I wanted to show. But if you want to use XHProf, use XHProf.

Yeah, he was saying there's a Drupal module for it. It's actually integrated into Devel as well. But those are the theoretical instructions. In actuality you can run into significant issues with XHProf, just because it's not integrated into the distribution package system, and that's why I went with this.

[Audience question] Do you have a few,
like, common or top MySQL misconfigurations? Because I'm not a DBA, but, you know, I find myself trying to make edits to some of the main configuration options there. So I don't know if you have a riff on that.

By default InnoDB, when these are commented out, has a buffer pool size of 8 megs. Drupal 7 defaults to InnoDB. That's it. There's your top one: make your InnoDB buffer pool bigger. The single most important thing you can do to a MySQL server is to make sure that the InnoDB buffer pool covers your data size. If you've got that down, you're ahead of the game. And honestly, if I were to say one thing, that's the only thing I'd focus on.

There's another tunable here, which is the sync method. If you go on drupal.org and search for InnoDB and then sync method, you'll find a whole conversation on it. InnoDB by default runs in full ACID mode, which means it tries to prevent any sort of data loss, ever. What that means is it syncs to the disk very, very, very often, in particular because by default it syncs on every commit, and Drupal commits a lot. There are some numbers in the issue, but it's a huge performance impact. When you make it sync every few seconds instead of syncing to the disk on every commit, that's a big, big performance improvement.

It is also not something where I like to say "just do this," because there's a reason that's the default. If you have bad hardware, I wouldn't do that, and you're going to take a huge performance penalty, and my answer would be: fix your hardware. But yeah, the one thing I'd say to everyone is make sure your buffer pool holds your data size. And to a minority of people: look at the sync method, but understand what you're doing, and understand that if there's a hardware failure, you're going to be in trouble.

[Audience question] I've got two quick questions. The first one is: is Xdebug suitable for installation on a production server?
No. You don't install it on a production server at all. That's where you should use XHProf. XHProf was designed for that. It was designed by Facebook to do exactly that: to run on production servers and be able to be enabled, or triggered, or just run every so often. It was actually built for that, whereas Xdebug was built as a development enhancement. So I use Xdebug on staging and development sites, but I never, ever use it on production. Honestly, I wouldn't use XHProf on production either, but if I were to choose one, I would choose that.

[Audience question] And the second question: you mentioned the Views Lite Pager module. Is there really a difference between that and the default Views mini pager, which just has kind of the forward and next?

So I don't have time to show it, but it has the forward and it has the next, and it also has the last page. Yeah, which is unfortunate. I wish it didn't.

[Audience comment] Just a couple of quick comments, actually. One thing that I've run into often when trying to work with client sites is full disks. You want to talk about low-hanging fruit: a quick df. Oh hey, you're at 97 percent. That's probably part of your problem.

Yep. And usually it's log files, in my experience, or database backups that haven't been cleared.

[Audience comment] The second part, which is... I completely forget what I was going to say. Okay, well, I'll let it go at that.

Okay. Well, as a follow-up to your low-hanging fruit, here's another one. This is called vmstat, and it's the best utility in the world. It shows many things, but the big thing is swap, right here. On the topic of low-hanging fruit: if your server is swapping, all bets are off. So make sure it's not.

[Audience question] Other than potentially running XHProf in production, do you have any other tools or techniques that you recommend for trying to trace performance issues that only show up in production and are not reproducible in any other environment?
Um, yeah. Honestly, it's a little difficult, and what I tend to do depends on how much money is available. If there's a sufficient bank account involved, I will often go and get a New Relic account. I like it a lot, actually. It's very cool. It lets you drill down into production systems. But it's a little buggy, honestly, in my experience. We've worked with them a lot, on like five clients, and I think we're a partner of some sort, but we've had to file bugs. It has caused issues. But I think there's a demo of some kind somewhere where you can use it for like a month, and if you just install it and look at what it provides, you'll quickly realize that there's very little else out there that provides that level of analysis.

The other thing I'll do is use either Cacti or Munin. I like Cacti just because I co-maintain some of the graph templates with Baron Schwartz, so I have an interest, but Munin is much easier to set up and I'd probably recommend it. That doesn't give you performance analysis on an application level, but it does give you a lot of historical trends for your server, and for production servers that's important: for knowing when something changed, for knowing when you're starting to head off the rails but are not off the rails yet, and just for being able to track history. You have that, you have your slow log, you have vmstat that you can run, and you can try to, you know, combine them together and get an idea of what's going on. And that's usually what I use when, you know, there isn't a budget involved.

[Audience comment] Okay, I remembered the other thing. The other low-hanging fruit that I have run into, in fact very recently, in this last week or two, is to look to see what else is going on on the server.
Obviously, this is applicable with shared hosting, in which you have no control over it. But in our case we had a very old, very low-traffic legacy phpBB site on the same server. Nobody thought it was going to be an issue, and it wasn't, until Yahoo and Google decided to crawl it on the same day. It killed the Drupal site on that server, and we had no idea why, because the Drupal site itself was running fine. You know, we didn't see it until we dug into the slow query logs: hey, two-thirds of these are not even against the Drupal database.

Yep.

[Audience comment] So in that case I think the solution is eventually going to be, well, we're going to put that on a different server. But it is something to pay attention to. It may not be the Drupal site that you're trying to debug.

Yeah. And I think one of the initial questions I said to ask the client is "what site?" And I think I also said to assume they're lying. I think that's a good example of that, and a good example of why sometimes you should just back out and look at the slow query log. Look at the global view of the server's logs. The logs don't lie. Until they do, but that's not an issue.

I think we're out of time. Thanks, everyone. Thank you very much.