Hello. Hi, everyone. You'll have to excuse me. I don't have any notes or my slides in front of me. We had some technical difficulties. But I'm here today to chat with you about caching in WordPress. So a little bit about me. As was mentioned, I am a software engineer at Pagely. We are a managed WordPress host, so we deal with a lot of interesting caching challenges that come up at scale and on enterprise client sites. And before that, I worked at Time Inc. as a WordPress engineer working on large multi-site networks. I'm also a bit of an emoji and meme enthusiast, so you can find my Slack emoji repo. Feel free to open a pull request or make a request there. They're all formatted and ready to go. And I am always happy to chat about it. So we're talking today about caching. It's a little bit of an enigma to some people. But when we're talking about caching, we're mostly talking about data that is stored closer to the end user, or at least closer to where it needs to be utilized the most. And that's my desktop. So yeah, that's a beautiful desert scene that I took. So we're talking about data that's stored closer to the end user. And here's a quick example of what happens when a user requests a website. On the first request, caching is not on. So a user requests the site, the request goes out, it runs PHP, it's fetching data, perhaps making a remote API call, a remote database call, something like that, building out all the posts, et cetera, coming back, maybe fetching some more data, putting things together in PHP, and then turning the resulting information into the website. So the next person comes and requests the site. This is where it gets interesting when you have caching enabled. Caching provides an option to store some information temporarily in a caching layer of some sort. So say you're fetching all that data, and you put a little bit out in the caching layer, you save it in there.
The next person that requests the website is going to have a much quicker response. You don't have to reach all the way out to that remote API; you can look right into a caching layer that's much closer on the server to where you need it. So the next requests are all much faster because you've stored information in the cache. So what about caching? There are a couple of types. Object caching is one of the primary ways that we cache things. It's a little bit more advanced. This is a server-level option that needs a little bit more setup. And so usually when we're talking about object caching, we're talking about Memcached or Redis, something along those lines. But out of the box with WordPress, there are some other alternative storage mechanisms. When you install WordPress, if you have nothing additional set up on your site, transients will be stored in the wp_options table. So we've got an additional little spot to store things there. Some people create additional database tables that are just meant to store cached items. And finally, we've got local browser-level storage. That's certainly a little bit different, local cache, but it's one of the other elements that we're dealing with when we're talking about caching. So one of the items that I wanted to mention, I didn't want to forget to discuss this, is that caching, and the data that we store in cache, is not meant to exist forever. This is a temporary form of storage that we're working with. So when you store something in cache, you set an expiration date. Even if you don't set an expiration date, this temporary storage is meant to be easily cleared out and refreshed with newly fetched information. It's not like storing something in the database and always depending on it being there. For some reason, we might need to clear that out, to refresh the server, to restart it, something like that. And when things are stored in a place like memory, that information can be easily cleared out.
So do not store things like keys in there, or any specific data that you need to know is there forever, or for a long period of time. Okay. So we have a basic understanding of what caching is. Let's talk about the quickest way to get your site cached, the quickest way to get some extra performance out of your site. What is that quick win? That is full page caching. And a lot of WordPress hosts take advantage of this. It's a fantastic way to get a super quick response if you've got a static WordPress site, something like that, where you're basically taking the full HTML output of a page and then saving it into this cache. So in that caching layer that I talked about earlier, all of the rendered HTML markup is stored. Imagine how quick that is when a user visits your site. They get the HTML immediately. There's no need to go run PHP. There's no need to call the database. No need to make any extra API calls. It's all there and ready. And that's great, except for when we start talking about what logged in users need to see. So logged in content poses a bit more of a challenge. It needs to potentially be specific to the user. We can't store that in raw HTML in the cache, unless it's specific to that user. Even then, it creates this need for something a bit more dynamic, and full page cache does not solve that issue. Similarly, if we're talking about admin content and frequently changing content, when you're dealing with a ton of logged in users at the same time, usually even caching plugins in WordPress are excluding any logged in content, any administrative screens, things like that. Frequently changing content falls under the same category: we're dealing with people needing to edit posts, needing to see information in real time, needing to edit other authors' information. All of this information, we need to make sure that we deal with. And this basic idea of full page caching doesn't solve it.
So to state it simply, we do not just cache all the things. We do not just take the entire lump sum of HTML that we've created and store it, for anything more complex than a basic blog. For a basic WordPress website, it's perfectly effective. But usually once we start talking about caching more and going into more complex ways of storing this information, it's because we've got more than a basic blog or a basic WordPress site. So what do we cache? Let's talk a little bit about that. Well, there are a couple of different things that you can store when you're looking at a basic WordPress website or a basic blog site that we can take chunks of. One of the really obvious things that we could do, if we take our concept of full page caching, is take a piece of the page. So say a sidebar widget with popular posts. If we're looking at that, we could take the rendered HTML and stick that somewhere in cache. I don't advise that, for what it's worth. I would generally say that you should take the next option, which is taking JSON output or some sort of chunk of information that's not fully rendered into HTML and then turning that into whatever you need. Additionally, we could, say, cache the entire footer of a site, cache the entire header, cache a remote API call response. So we've got all of that information, a couple of chunks of data that we might want to store. And WordPress out of the box comes with this beautiful Transients API. Now, when it's initially set up, like I said, it will just put information into the wp_options table. And you can see here that what you would set is a key that is specific to the data that you want to fetch. Much like in the database itself, you've got key-value information that you're storing and fetching quickly. So we're using get_transient(), and that's going to pull the data. set_transient() is exactly what it says on the tin.
So we're going to place the value, which could be any sort of information: JSON, an array, again, a string of HTML markup. Anything else that you would normally store in the database, you could store in this field. And then again, delete_transient(). Now I should note that delete_transient() is a little bit interesting because it goes beyond just setting an expiration date. You can store things in the cache for a minute, an hour, a day, whatever custom duration you want. And that expiration is set in seconds, so you can get quite fine-grained with it. But delete_transient() is extremely useful when you're talking about dynamic content. Because you can take advantage of, say, storing a WP_Post in the cache if it needs to be super quickly accessible and utilized in many places. But then, say, you have an author come in and update the title. Well, that cache needs to be purged. It needs to be cleared. You don't want to have to clear the entire cache of the website just for one post that you edited. You have the key there. So you can delete that one transient out of the database or object cache, what have you. And it will clear out quite nicely. The site will keep performing. You won't have to add everything else into the cache all over again. It's incredibly useful. And here's a quick example of what a function looks like to fetch data from a cache. We're getting the transient. If that data is expired or if it wasn't there to begin with (we never trust that it was there), we always have a method that's checking an if statement. And that if statement is going to say, OK, well, if it's not there, if it's resolving to false, then fetch the data all over again. This could be my remote API call. This could be a complex query that we're making to the database, anything like that. We're going to refetch it, and then we're going to set the transient again. This is really important, obviously, to make sure that we save that data. And again, don't set it to false. I should make that note in here.
Do not set the value to false if you can help it, because we're also checking for false. And that's why I have a hard check there, three equals signs in a row. So I'm ensuring that it's false, not just empty, because we could also potentially save an empty value to the cache. And finally, we're always going to return the data. And notice I've used the same variable in both places, so we're always returning the freshest information that we have. So secondary to that, we'll get into wp_cache_set() and wp_cache_get(). This is a slightly different API. When you have object caching on, transients actually utilize it, or can utilize it, depending on how complex your setup is. But wp_cache_set() and wp_cache_get(), you'll note, have a group parameter. And this parameter provides some really fine-grained and interesting control that allows any site administrator or developer to set and manage things so you could potentially, on the server level, clear out a whole set of data without purging your entire site cache. This is something that we tend to utilize a lot at Pagely and recommend, because if a site clears the entire cache, it might become unstable. But then you've got a tiny group of transients or something in the object cache that is just related to a custom post type, say. Just clearing that, and just being able to manage and find that information separately, is a huge help. So again, I mentioned this just earlier: if you have object caching enabled, then set_transient() under the hood actually utilizes wp_cache_set(), where the group is set to 'transient'. We had a recent experience with this where we had to set transients to, actually, I can't remember if it was save to the database or save to a different area in the cache, but we had to silo that information out, and that's where having object caching on can be hugely helpful, utilizing those groups.
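The fetch-or-refill pattern described a moment ago looks roughly like this as code. This is a sketch rather than the slide itself; the function name, cache key, and the expensive query it wraps are all hypothetical:

```php
<?php
// Sketch of the transient fetch-or-refill pattern. The function name,
// cache key, and my_expensive_popular_posts_query() are hypothetical.
function my_get_popular_posts() {
    $data = get_transient( 'my_popular_posts' );

    // Hard check, three equals signs in a row: an empty array or empty
    // string is a perfectly valid cached value, so only a strict false
    // means "missing or expired". (And avoid caching a literal false.)
    if ( false === $data ) {
        // The expensive work goes here: a complex WP_Query, a remote
        // API call, and so on.
        $data = my_expensive_popular_posts_query();

        // Expirations are given in seconds; HOUR_IN_SECONDS is a
        // WordPress constant equal to 3600.
        set_transient( 'my_popular_posts', $data, HOUR_IN_SECONDS );
    }

    // The same variable is used in both branches, so we always return
    // the freshest data we have.
    return $data;
}
```

When the underlying content changes, say in a save_post hook, delete_transient( 'my_popular_posts' ) clears just this one entry without touching the rest of the cache. With an object cache enabled, the wp_cache_set( $key, $data, $group, $expiration ) form adds the group parameter discussed above.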
I should also note that wp_cache_set() is something that works out of the box even when you have WordPress without object caching installed, but the only place it saves this information is in PHP memory. So as PHP runs and renders the page, you've got this in-memory store when you use wp_cache_set(). But at the end of the PHP runtime, that information goes away again. So even if you tell it to expire in five hours, once PHP is done running, that memory is gone and you don't get it back; you have to refetch it again. So it takes a bit of care and a bit of finagling to know which you should use at which time. Generally, I say stick with transients unless you are going for something with a more complex and managed setup. So that kind of brings us to the question: do we need to cache more? And my answer is generally, yes. We always want to cache more. We always want to improve the performance of a site we're working on. As developers, that's something we're always judged on, be it site speed, be it the amount of CPU usage. There are so many things we tend to be judged on, and so I generally go through a list of different things that I need to evaluate to determine not only do I need to cache more, because the answer is yes, but also where I need to cache and what I need to focus my efforts on. So generally, I look at performance indicators of a site. I look at traffic indicators: are there any expected upticks? I look at the use of external APIs. Am I heavily reliant on those APIs to serve information to my users? Or am I generally just posting to those APIs? And the use of content and access goals. Am I going to be managing a site that has 1,800 writers on it that all need to access the site within a day's time? That can put the admin side under heavy load, and something like front-end caching isn't that effective there. So once we get into performance indicators, generally my favorite personal tool is New Relic. That's a paid tool that you have at the server level.
And it can give you a ton of feedback, graphing and information about CPU usage, memory usage, database queries, WP_Query calls. It's extremely effective, but it's a paid tool. So my second recommendation is always to fall back to something like Query Monitor. Query Monitor is a plugin by John Blackbourn. And it is kind of one of those essential development tools I keep in my toolbox. This runs at PHP load. So it runs and follows the PHP session, keeps an eye on how many database queries you have, how much time it took for each query, what plugins are hooking in at what points. Because if you have a plugin that hooks in and causes some crazy delay, you can dig in a bit more and try to follow those queries. Next, I'm looking at the traffic indicators. So by traffic indicators, I'm usually talking about any expected upticks. There's going to be a sale that a site needs to focus on, some sort of event or holiday where you know you're going to have a ton of traffic coming to the site. Beyond that, I mean, we all tend to want to prepare for events like that. We need to look at the expected usage of the site on a day-to-day basis. What does the traffic look like? Is it a news website where all of the writers are in the middle of the night trying to publish their content at the same time, and at the same time evaluate and update the content? Is it a site that everyone is browsing and purchasing things on late at night, or maybe during the workday when they're bored? How long can you cache things? How long can you get away with caching things to keep things running, but still allow users to see the most up-to-date information? All of these things kind of go through my head when I'm evaluating what I need to add. Can I get away with maybe caching the sidebar for longer if I've got related posts? Or does that need to be breaking the most up-to-date information possible to keep people browsing my site and thinking, man, they're really on top of things?
This is all fresh content. The use of external APIs. At a previous job, we relied extremely heavily on an external API that collected all of the post content from our site. It's something that can take a lot of time, depending on how that external API is set up and how your internal setup goes. You've got time for the server-to-server handshake. So that creates however many milliseconds, and depending on distance, that could be seconds. And then you've got time for that server to fetch its information. Then you've got to get the returned information and process all of that; perhaps you're reformatting it into a different JSON blob to send out to the front end. All of these things take time, and if you don't need to make a fresh query to this external API, you shouldn't. It should be noted that when you're working with external APIs, occasionally we're tempted to write in loops that say, OK, if the server didn't return a good response, we're going to keep querying against it. So you've got to be careful with whether or not you decide to cache a bad response, or whether you add in retries. Usually I'll add a counter and only retry so many times, and then accept that that server isn't responding for the time being. We actually ran into a case where we had a very small, underpowered API that we were working with. And the API fell down. So we kept making queries to it, trying to get a good response, and eventually our site got to the point where we were running into trouble on both ends trying to fix it, with only one dev team between the two. So it's kind of interesting how you can utilize caching. And we wanted to save this information to cache. But even then, working with this additional layer (and the API didn't have any caching on it, by the way) can cause a lot of interesting complexity that we need to evaluate as developers. So I want to take a quick segue here. I want to talk a little bit about the BrowserCache API. Now, I personally am a back-end developer.
I work on the REST APIs at Pagely. But the BrowserCache API is something that I personally need to be aware of, and as developers we all do. So I'm going to provide a little bit of additional information here for you, but I would recommend you go and research this on your own. The BrowserCache API is a fully front-end-focused solution. So when I'm talking about browser cache, I'm usually talking about local storage: storing something like session data right on the user's machine. But it's something that we need to use with great care, because it's very easy to fall into the trap of caching a lot and saving a lot to the user's machine. But what if someone has limited storage on their device? What if they're on a cell phone? And that cell phone is going to throw up an alert saying, hey, this site is trying to download a lot of stuff to your machine. You're going to lose user trust. So it's actually far more concerning to me as a developer when I'm thinking about something like front-end caching, front-end local browser storage. So with great power comes this great responsibility. Here's a quick code example of how you would cache that information with the JavaScript browser storage API. It's still relatively simple. It's still a key-value pair of information that you're fetching. So as far as becoming comfortable with the concept of caching, this should be pretty cut and dried. Here are a couple of examples. I will be sure to share these later, location depending on what the organizers would like. But I would recommend that you definitely dive into reading about this, see what's appropriate to cache and not, because it obviously will be on a case-by-case basis, but there are some good standards to look over. So let's talk a little bit about some of my lessons learned at scale, some of the things that we've seen personally, either at Pagely or otherwise in my career. And this graph is very true to me.
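To put something concrete behind the browser storage example mentioned a moment ago, here is a minimal localStorage sketch. The key name and the expiry handling are illustrative, not from the slides; localStorage itself has no built-in expiration, so a timestamp is stored alongside the value and checked on read:

```javascript
// Sketch of browser-side caching with the Web Storage API.
const CACHE_KEY = 'my-cached-data';  // hypothetical key name
const TTL_MS = 60 * 60 * 1000;       // treat entries older than one hour as stale

function cacheSet(value) {
  // localStorage only stores strings, so serialize along with a timestamp.
  localStorage.setItem(CACHE_KEY, JSON.stringify({ value, savedAt: Date.now() }));
}

function cacheGet() {
  const raw = localStorage.getItem(CACHE_KEY);
  if (raw === null) {
    return null; // never cached
  }
  const { value, savedAt } = JSON.parse(raw);
  if (Date.now() - savedAt > TTL_MS) {
    localStorage.removeItem(CACHE_KEY); // expired: clean up and treat as a miss
    return null;
  }
  return value;
}
```

As with the server-side layers, treat this as temporary storage and be sparing with how much you put on a user's device.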
I was writing my talk, and I think I had already given it once. And a colleague sent me a message and said, hey, you should check this graph out. He's on the DevOps team, and so he's always helping customers with digging into why is my site performing slowly, why am I maxing out my CPU, anything like that. And you'll notice that's 100% right at the top. There are two lines there. So this site actually has two different nodes that it's running on, as well as a database node. And it's consistently maxing out. And so the DevOps team started to dig into it a little bit. They said, OK, well, they're running this one big plugin. It makes a ton of queries to the database. I bet it's just running too many. It's just killing the site. But then they started digging a little bit deeper. And it turns out there was a recently installed plugin. I don't even remember what it did. But it was a pretty narrow focus. It did a small function that they wanted and needed. And they thought, OK, we'll toss this on the site. But that plugin was calling a PHP function called session_start(). And it was actually hooked in all the way up at WordPress init. So it ran in the process of WordPress itself starting up, way at the top. And this plugin was saying, I need to start my own session so that I can capture some additional information. When it did this, all caching that was happening after session_start() got turned off. To be honest, I couldn't tell you if it was the server setup or if it is intrinsic to caching that this usually gets turned off. But everything after session_start() stopped getting cached. And the site continued to max out. So once we found that issue, we notified the developers. And they said, OK, we'll just turn it off for now. And you can see that graph level way back out, back to a normal amount of usage. But it was kind of a surprising thing. We went in.
We said, OK, these are some long-running database queries. It's muddling things up. It's slowing things down. But it turned out to be such a small little plugin, the kind you tend to think is harmless, causing a lot of chaos for us, causing alerts to go off, PagerDuty to go off, all sorts of things. So let's talk about some other lessons learned at scale. And that would be cache warming. Cache warming is an interesting thing. I have dealt a lot with it in my time. And when I'm referring to cache warming, I'm talking about the additional layers of caching that you're adding. So you say, OK, I'm going to cache this one query for five posts on my blog. That might be the most recent posts that we want to feature. Once we've queried that, we're storing that in cache. And then we say, OK, well, that's cool. That made things way faster. Let's cache some more. Let's cache all of the users, something quick like that, maybe not for an administrative need, but for an About Us page where we're querying users. And then we're caching some remote API calls. We're like, OK, our site's growing, though. And we're making more unique remote API calls that we're caching. So we've got all of these layers. And all of these things have overlapping expiration dates. So at one time, you might be pulling most of your data from the cache. And this cache is growing, which is good. We also have to account for the fact that the cache might get cleared at any given time. Don't forget that we can never rely on these cache values being there all the time. So say the server goes down, or some developer doesn't think about it and clears out a whole bunch of cache, a whole group of cache that was critical to the site staying up and performing decently. Cache warming is real, and you need to be aware that any time you're writing code that saves to and calls the cache, it still needs to be performant at the end. Significantly reducing that load is fantastic.
But again, when someone hits that button and clears that whole cache, everything slows down. And the same goes for a CDN. If you've got a CDN caching assets for your site, when you hit clear cache, you are potentially slowing everything down for the end user. And finally, this kind of goes in a similar vein: you should load test. So load testing is something that we always mention at Pagely to our customers who want to be sure their site is going to stay up during an event. We say, OK, we'll set up an additional server that's exactly like your existing server. We'll give you the whole same environment, and you can load test against that. Send a bunch of traffic, see how things perform, then add some additional caching somewhere. Maybe you're over-caching, or saving some sort of database value that gets called constantly that doesn't need to be called. So you're loading things into memory that don't need to be cached. All of these elements kind of come into play that you would have never seen had you not load tested your site. And even then, you could also potentially go further and say, OK, well, what would happen if we lost caching at the same time? There are a lot of fantastic things you can do with load testing. I would highly recommend it; if you haven't researched it before, you should. It's a bit beyond the scope of this talk, but we could certainly chat about it afterward, and I could get you some resources. So finally, expired does not equal deleted. When you've got WordPress out of the box utilizing caching, utilizing the Transients API, then you're saving things into the cache. You're storing things with this key-value pair and giving it an expiration date; say you are storing it for an hour. And then you change your code or you update your code, and that value never gets called again. WordPress doesn't have any garbage collection that's saying, OK, I'm going to look through the wp_options table where these transients are stored out of the box.
I'm going to look through there, and I'm going to say, this needs to be cleared out. We don't need it anymore. It's expired. No, WordPress does not do that out of the box. So you could potentially be stuffing a site's wp_options table full of stale data that's not even called anymore. I would recommend, if you've got an old WordPress site, you should check out that table. You might find information that you have long since deleted still stored in there, that needs to be tidied up. So it's something to consider. Just because something is expired doesn't mean that data is not hanging around somewhere, taking up space. And finally, I want to leave you with this, which is: you can cache all you want, but as developers, it is still our responsibility to write performant code. If you've got a long-running MySQL query, then you should consider looking at the performance of that query before you go ahead and cache it. Because at some point, that cache is going to expire, and that query is going to run again and potentially affect a user who is trying to visit the site, slowing things down while the page is trying to render and the cache is trying to fill that information again. So it's on us to still write good, performant code, to still do testing, to still review and analyze what we're working with. Caching is a fantastic tool to improve performance, but always keep an eye on that and write the best code that you possibly can. So finally, I'm Mara Teal. I work with Pagely. We do WordPress hosting. And feel free to find me afterward. If you have a question here, think about it, because we're going to do a little bit of Q&A. But if you're not comfortable asking your question here, find me at the after party, find me out in the hallways somewhere, and let's chat. Thank you. So thank you, Mara, for your excellent talk. Sorry for the bit of problems we had with the sound. So we're going to get you a microphone now for the Q&A to have that fixed.
But in the meantime, if there's someone who wants to ask a question, maybe a mic runner can go and find you. Who would like to ask a question on caching? Oh, here, sorry. It's very hard to see. Hi, you mentioned load testing. Can you recommend some tools that could be used right out of the box, rather than coding something special? For load testing, there are quite a few tools. Honestly, I think it's specific to the environment. I would have to check in with my team, honestly, to make a good recommendation. I have a couple I'm thinking of, but I don't want to go too far. But a quick search should return some great results for you. So sorry about that. Thank you. OK, thanks. Any other questions? Yes, there's one in the back. This one coming. Thanks. So we use your services, your plugin for caching. So I have a question, because when we have some changes on the front end, what will be the safest way to view the changes, to purge all the cache or? Yeah, so depending on what you have set up, there may also be CDN caching that needs to be purged. So you can clear that CDN cache as needed. There should be a button in the dashboard plugin section. Or any other cache clearing mechanisms at the database level or page caching level should get cleared when you make an update. So occasionally, browser caching comes into play. And as a backend developer, I don't generally have control over that. It's one of those layers that we always have to be watching out for. And I find myself doing a hard refresh; I usually do Shift-Command-R on a Mac. But I find that utilizing the hard refresh is still something that I have to do in 2018 to ensure that if I'm telling someone to look at a page that's not logged in content, and they're not a logged in user, they should utilize that hard refresh. So if you have logged in users, usually that's not that big of a deal. Thanks. Thank you. Question on this side? Yes? Hello.
When talking about WordPress, we're using nonces all the time. So I would like to have your opinion on how you're dealing with those. And the second part of my question is, obviously the most important caching is for the first page load. But when on the first page you have something like exit intent pop-ups appearing, or you want to check users' location, things like that, how do you do that? I mean, it's really tricky. So for user-specific and JavaScript things like exit intent, different user actions like that, I actually would find that local storage and browser caching, or even cookies, are good options there. So as far as that, I would say that tends to all lean toward front-end caching of some sort. Because if someone clears out their browser cache and cookies, they should get that same exit intent or policy acceptance, something like that, usually. That is, if it's not a logged in user, where you could save that content to the database. Usually in WordPress, for things like administrative notices, you'll notice that actually saves to the user meta so that they never get that information again, or that pop-up, what have you. As far as nonces, and someone can correct me here, I would say that that's not something that should ever be cached. So that's always something that should be fresh, because you are essentially validating that the user is who they say they are, or that the request is coming from where it's supposed to be coming from. So that should never touch anything cache related. It should always be fresh, yeah. Excellent. Question here at the front. The mic runner is actually running to bring the mic. Can you raise your hand again, please? You talked about caching on websites. So for a news website, for example, different people visit at different times of day. Do you recommend variable caching depending on traffic or user events? For example, if an admin adds more content, to refresh the cache, or if we have a peak in the traffic? Thank you. Yes.
So certainly, I would recommend that. Usually my goal is to cache information as long as I can get away with, without people seeing really old data. That said, if you've got a front page with the latest news on it and you've published a set of latest news, or you have some event going on, I would reduce the amount of time that's cached, or have an option to clear just sections of the site. Again, that's where things like the groups in object caching would be really helpful, because you could save things in multiple places or in multiple groups, or make sure you have a key set that you can be specific to. That way, you don't have to clear the entire cache of the site and hurt everything, but you're making sure people see the freshest content where it matters most. That might be the sidebar. That might be a banner in the top header. It could be various places, but yeah, that's where getting specific really can help maintain performance while you're under heavy load. So, thank you. So we may have time for one more question. Anyone? Once, yeah, one here. So, last question. So, say you've got a news site and you've got a piece of content that you update. That piece of content could be in dozens of places on the site, right? It could be in a category page, it could be in the sidebar. How do you do cache invalidation when it could be anywhere? It feels like there's no good way to work out where it could be, all the different places. It could be on page 25 of a category page. It could be in the sidebar. How do you cache invalidate without throwing the entire site out? I actually ran into a very interesting issue on the news site I worked at at Time, which was we were querying for related posts and we were caching the information, but it had an expiration date, obviously. And some of these posts were only ever accessed by bots.
We ended up permanently caching that information, because these bots would trigger all kinds of MySQL queries, and related posts are usually quite expensive queries. It caused a lot of problems. I realize that's not exactly the issue you're experiencing, but it's something worth noting, and that's something that we made a concession for: we accepted that we could not serve fresh new content on an old post that was, you know, five years old. We wanted to focus our efforts on the top of Google, on the fresh news content, things like that. So it was something we kind of just accepted. But I would make a similar recommendation in that I would create caches that are group-specific and try to clear a cache based on when you hit update, so you could check what category it's set in, and perhaps use categories if you're utilizing related posts that way. An additional item that I hadn't mentioned, though, is for something like that, say you do have a related section and you can query related posts from something like Elasticsearch so it's much quicker, you could set that up on your server, and then instead of loading it within the HTML of the page you could utilize something like an Ajax request. So especially if it's toward the bottom of the page, below the fold, you could utilize an on-scroll event: when the div that you're utilizing for containing related posts comes into view, you could either preload, start making the Ajax call early, or just say, we're loading in fresh content for you. This is especially critical for something like comments on the page, to keep those up to date, because it's extremely hard to balance caching that information, and that's something that I would say you almost can't cache, to make sure users feel like they're being heard and seeing the most up-to-date conversation. I hope that answers your question. Okay, thank you.
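That on-scroll loading approach could be sketched like this. The endpoint path, element ID, and response shape here are hypothetical stand-ins, not a real Pagely or WordPress API; the pattern is simply "fetch the dynamic fragment only when its container nears the viewport, so the cached page stays static":

```javascript
// Sketch: load related posts via Ajax only when their container
// scrolls into view. Endpoint and element ID are hypothetical.
const container = document.getElementById('related-posts');

const observer = new IntersectionObserver((entries, obs) => {
  entries.forEach((entry) => {
    if (!entry.isIntersecting) return;
    obs.disconnect(); // only fetch once

    fetch('/wp-json/myplugin/v1/related?post=123')
      .then((res) => res.json())
      .then((posts) => {
        // Render the fresh results into the otherwise-cached page.
        container.innerHTML = posts
          .map((p) => `<li><a href="${p.link}">${p.title}</a></li>`)
          .join('');
      });
  });
}, { rootMargin: '200px' }); // start the request just before it's visible

observer.observe(container);
```

The rootMargin value effectively preloads the content slightly early, which is the "start making the Ajax call early" idea from the answer above.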
Well, I've seen more people raising their hands, but unfortunately we have to leave it here. But as you said, you're available for questions off-stage, so she's easily findable with this lovely heart, so that shouldn't be a problem. So thank you for attending and thank you for your talk. So a big applause for Mara.