Okay, so I'm Jody, this is Janez, and we're here to go deep with the render cache. Hopefully everyone knows a little bit about the Cache API, because we're not going to go too much into the basics; we're going to get more into the weeds of it. We've been going pretty deep into Drupal caching and found out that there's a lot to know to really improve your caching. Here's a link to our slides if you want to follow along. So my name is Jody, I've been around the Drupal community for a long time, doing Drupal since 2008. For a long time I was CTO of a Drupal agency called Zivtech. For the past few years I've been working for a Japanese semiconductor company called Renesas. At an agency, I always wished I could get really deep into a project long term, so now I run one website all the time, www.renesas.com, a big semiconductor company site. I'm leading a team of 25 or 30 engineers there, and I'm also on the product team. So this performance stuff is more of my passion project than my actual job, but I really love digging into complex performance problems, and I really think a good website is a fast website. Speed is the feature; good software is fast software. If it's slow, it's not good. It's not easy, but it's what we need to work on. I'm Janez. I come from Tag1 Consulting. We are the second biggest contributor to Drupal. We also donate a full-time infrastructure expert to the Drupal Association to help them run the infrastructure, Drupal.org, and all the tooling. In the past, I was the lead of the Media Initiative, and I also worked at Examiner.com, which used to be the biggest Drupal website on the Internet, where we had quite a lot of interesting performance challenges. Even before that, performance has always been dear to my heart, and I find it interesting, so here I am.
So the reason that we're doing this talk together is that when I came to Renesas, the site was really slow, like 10 seconds to load a page. Nothing was cached at any level, and I thought, well, this is going to change. This is totally unacceptable to me. I'm not going to sit here and work on a site that takes 10 seconds; we're going to fix this, whether it's my job or not. As I was getting into it, I realized I needed some really serious professional help, some experts. Our dev team at the time didn't really know a lot about Drupal caching or Drupal performance, so I remembered that the people I needed to call were Tag1. We got a hold of Tag1 and got to work with Janez and his team, who really helped me learn more about the Cache API, which I had thought I understood well. I think most Drupal developers think they grasp it, but it's pretty complex, and there was a lot that we dug into and found out. So the reasons that renesas.com has such serious cache challenges: first of all, it wasn't built with performance in mind. It's so much harder when you have a site that's been around for a long time, I think it started on Drupal 6, under active development with dozens of developers for a decade maybe, adding code without understanding, measuring, or thinking about performance. So obviously we have a huge amount of technical debt to dig out of there. But we also have other unique challenges. For example, we have tens of thousands of pages; we're basically an e-commerce site with tens of thousands of unique product variations. We also have three different languages on the site and nine different regional variations, so we have 27 URLs for every single page on the site. And of course, a lot of caching happens at the page URL level.
We have a lot of content editors and a lot of automated integrations updating content constantly, which needs to be invalidated all the time. And we support authenticated users: we have users who log into the site to access secure content, so we have a lot of people logged in who need it to be fast. We can't just make the site fast for anonymous users, because our customer workflow is: we're going to do everything we can to convince you to make an account on our site, that's how we get a conversion. And as soon as you do, it takes 15 seconds to load a page. That's your reward. We also have probably almost 100 custom modules, endless custom code, and hundreds of contributed modules. And we had, though we've come a long way since, very heavy use of the Views module, with everything on the site built around it; every page was like a view of a view of a view. We also have lots of pages with hundreds and sometimes thousands of entity references on them, really big, complex pages that can cause a lot of cache invalidations, or just be slow when uncached because there's so much stuff on the page. So we had some serious challenges with no simple fix, and it's not Drupal's fault. There is no system out there where you can have authenticated users with per-user experiences, dynamic content that's always up to date, tens of thousands of URL variations, and still have all of it cached well and fast without doing tons and tons of custom development. Okay.
We will first go through some render cache basics, explain the concepts behind it and how it works, then some more advanced techniques used in Drupal, like placeholdering, and the layers of caches that we have in Drupal. Then we will go into debugging, getting stats out of the cache system, things that can go wrong, and so on. But first, the basics. What is render cache? Render cache was introduced in Drupal 8. Before that it may have existed as a contributed module, but in the form we have it now, it came into core in version 8. It caches markup that was rendered, so that the next time you need that specific piece of markup, instead of re-rendering it, you can get it from your cache, which is way faster. It is enabled by default, so if you are producing your own markup, it will be cached, and if you don't think about that, you will get potentially very weird results. By default, everything is cached indefinitely: it doesn't have any lifetime, it's cached forever, and it uses tags to invalidate. It significantly improves performance, and it's so powerful that, as Jody mentioned, it lets you cache things dynamically for authenticated users. It's not like page cache, which only works for anonymous users. But it can cause very strange and weird problems if it's not used correctly, and in order to use it correctly, we have to understand some basic concepts behind it, which are really not that hard to understand once you wrap your mind around them. So let's look at what those concepts are. First are tags. Tags are used to invalidate things that are cached. We mentioned that by default everything is cached indefinitely, so if we didn't have tags, once we rendered a page it would never change. Cache tags let us invalidate exactly the things that need to change, and they look like the examples here on the slide.
The first one essentially means that when the node with ID 11 is updated, everything that has this cache tag on it will also be invalidated. The second one is similar, but for the user with ID 4. Then we can have cache tags that are invalidated when certain config is updated, like the one for a view that we have here. Or we can have tags that are much broader, like rendered, and that one invalidates everything that has been rendered, basically every page. You can also create your own custom cache tags to fine-tune the invalidation logic that you want. And when you are building anything that produces markup, usually a block plugin for example, you are responsible for defining cache tags on that markup, otherwise it won't be cached correctly. The second concept in the render cache is contexts. Contexts define how a piece of markup varies. If we have markup that is the same for every user, it won't vary. But if the markup is different in different situations, we have to use contexts. Different contexts tell the caching system how to vary. A few examples: with route, the markup will be cached separately for every route, so if it appears on route A it can be different than on route B. With session, it will basically be different for every user, because every user has a different session. Then we can vary by URL, and this one is interesting: you can use the url cache context, which means the item will be cached separately for every URL, or you can control it in more detail. You can say, I don't want it to be different for every URL, just for some part of the URL, in this case the query arguments, and you can even limit it to a single query argument. That means you will get a higher cache hit ratio, because you're not varying as much.
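Putting tags, contexts, and max-age together, here's a minimal sketch of how this metadata looks on a render array, say in a custom block plugin's build() method (the IDs, tag values, and markup here are made up for illustration):

```php
$build = [
  '#markup' => '<p>Latest update for this article.</p>',
  '#cache' => [
    // Invalidated whenever node 11 or user 4 is saved.
    'tags' => ['node:11', 'user:4'],
    // Vary only by the ?page query argument, not the whole URL,
    // for a higher cache hit ratio.
    'contexts' => ['url.query_args:page'],
    // -1 (Cache::PERMANENT): cache forever and rely on tags to invalidate.
    'max-age' => \Drupal\Core\Cache\Cache::PERMANENT,
  ],
];
```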
There is a huge difference between varying just by the page query argument and varying by the entire URL, so it's wise to define cache contexts as precisely as possible. Again, when you write custom markup, you are responsible for defining the contexts on that markup. There are also default cache contexts that are configured in the services.yml file, and those contexts will always be applied. The defaults are language, theme, and user permissions, and if you think about it, that makes a lot of sense. If you have a site where somebody can switch themes, you probably always want to cache markup for those two themes separately. The same goes for language, because you don't want to cache something that is in Spanish the same way as something in English; that wouldn't work. And then at the end, we have the max-age attribute of the cache system, which defines how long something will be cached. It's an integer: how many seconds something will be stored in the cache. The default is minus one, which means forever, and that is usually the best approach, because we want to cache as long as we can and use tags to invalidate, not time-based invalidation. And now we come to metadata bubbling. Imagine a page as a tree structure: say the page displays an article, and inside that article you have an image component, which has the actual image plus a caption. At the root, the top of the tree, is the entire page, and the leaves are the smallest individual components; an example would be the image tag. When cache metadata is defined on a leaf, it always bubbles up towards the top, so all the cache metadata that every leaf at the bottom defined ends up assigned to the top. The entire page basically inherits all the cache metadata from all the small pieces on that page.
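This bubbling can be sketched with core's Cache helper class; a simplified illustration of what the renderer effectively does internally:

```php
use Drupal\Core\Cache\Cache;

// Two components on the page each declare their own metadata…
$article_tags = ['node:11'];
$teaser_tags  = ['node:22'];

// …and when it bubbles up, tags (and contexts) are merged as a union,
// so the page ends up carrying both.
$page_tags = Cache::mergeTags($article_tags, $teaser_tags);
// $page_tags is now ['node:11', 'node:22'].

// Max-age takes the minimum of all values, so a single max-age 0 leaf
// makes the whole page uncacheable.
$page_max_age = Cache::mergeMaxAges(Cache::PERMANENT, 0); // 0
```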
When things bubble up, tags and contexts are merged. We saw the cache tag node:11 before; if another component on the page adds node:22, the page will have both cache tags at the end. The same goes for contexts, which also means that if you put a problematic context or tag on some really small, maybe not that significant part of the page, the entire page will be affected. Max-age, when bubbling, uses the shortest age of all, so if you set max-age zero, effectively making something uncacheable, the entire page becomes uncacheable. There is a concept called placeholdering which can stop this bubbling, and we will cover it a little later. We already talked a bit about the consequences. This is very common: we have a block or some piece of the page that behaves in a weird way, and a lot of the time people will go and put max-age zero on it to solve the problem, quote unquote. But then the entire page can become uncacheable, which is obviously not what we want. And as much as max-age zero is a problem, cache contexts that vary a lot, vary per user would be a good example, also cause problems, because they make things harder for Drupal to cache. Render cache is not the only caching layer we have. We have the internal page cache, which has been part of Drupal, I'm not even sure for how long, but a very long time. That one works only for anonymous users, which also means it doesn't need cache contexts, because you just cache one version of the markup. Then we have the dynamic page cache, which also works for logged-in users and uses this cache metadata to be smart about having dynamic content while still caching it as efficiently as possible. And the dynamic page cache also works with BigPipe, which can stream placeholdered elements in later in the HTTP response.
And then we also have external page caches like Varnish, or CDNs like Akamai, things like that. Using the Purge module, you can use cache metadata to invalidate items in those caches too. So if you've set your tags correctly, you can send the tags via a header to a CDN, and when you want to invalidate something, you make an API call to the CDN and instruct it to invalidate that tag, which is also very powerful. But it requires that you provide correct cache metadata. So, the most common problems that we see. This one we already mentioned: usually max-age zero gets added because there are cache-related bugs, and the "solution" is to make something uncacheable. Please don't ever do it, because there are practically never good reasons for it. And if you still think you have to, it can pay off to cache for a short period of time instead, maybe a few seconds or a minute, and then you have to placeholder it. Another really common problem is missing cache tags. For example, you have a block that prints a list of the five most recent articles, and you forget to add cache tags for those five articles to the output of the block. That means that if one of those five articles updates its title, the title won't update in the block. You solve that by adding cache tags for all five nodes you displayed to the markup you're generating. If you forget cache tags, content won't update, and then admins and editors, in order to make the change appear, usually go and clear the caches on the site, because we have a very nice button in the UI which clears all the caches on the entire site, which we don't want. The problem is when people get into the habit of doing that: every time something isn't updating, let's clear the cache. That's a really bad thing, because it affects performance a lot.
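For the five-most-recent-articles example, the fix might be sketched like this; the build() method shape is real Drupal block plugin convention, but the loadRecentArticles() helper is an assumption:

```php
public function build(): array {
  // Hypothetical helper returning the five newest article nodes.
  $nodes = $this->loadRecentArticles(5);

  // Put cache metadata on the render array up front, so even an
  // empty result is cached and invalidated correctly.
  $build = [
    '#cache' => [
      // Rebuild when articles are added or removed. The bundle-specific
      // node_list:article tag exists in newer core versions; older
      // sites only have the much broader node_list.
      'tags' => ['node_list:article'],
    ],
  ];

  foreach ($nodes as $node) {
    // Add each node's own tag (e.g. 'node:11') so a title edit
    // shows up in the block right away.
    $build['#cache']['tags'] = \Drupal\Core\Cache\Cache::mergeTags(
      $build['#cache']['tags'],
      $node->getCacheTags()
    );
    $build['items'][] = ['#markup' => $node->label()];
  }
  return $build;
}
```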
And when people develop a habit, it's really hard to stop it. Then we also have cache tags that are too general, like node_list or user_list. Those are invalidated any time any node (in this example) is updated, and Views, by the way, adds this cache tag by default. It's much better to use cache tags that are more specific, like one that is only invalidated when a node of content type article is updated. And a lot of times people will invalidate cache tags just in case, which also results in frequent invalidations, and that hurts your performance. So don't do it; do it when you know why you're doing it and what you want to achieve. With contexts, the usual problem is a missing context: things won't vary the way we want, which means we might display content that was meant for one user to another user. So we really have to think about which contexts we need, because if we put too many on a piece of markup, we get a poor cache hit ratio. A common example is an empty block. You have a block that displays something, but under certain conditions it doesn't print anything, and usually, when those conditions are met, people return an empty render array. But you still have to put cache metadata even on that empty render array, because you don't want it to be cached indefinitely and never vary. Sometimes it's not needed, but in a lot of cases you have to do it. So a good practice is to prepare the render array, put the basic cache metadata on it first, and then, under the condition, add the markup to the existing render array. Which brings us to placeholdering. Placeholdering is a tool that lets us cache pages that contain items that vary a lot or that are frequently invalidated.
It works by not including the problematic part of the page in the cached item for the entire page; instead it puts a placeholder there, renders the problematic part separately, and injects it into the markup of the page after the fact. This lets us cache the entire page and still have pieces that are really dynamic and change a lot. Drupal will do this automatically for you and tries to be smart about it. For example, core blocks use it by default, so if you create a block that varies per user, it will be automatically placeholdered; you don't need to do anything about it, which is very powerful. But then we found out that if you're using the Context module, it doesn't do that, and if you have a block that varies per user placed via Context, suddenly the entire page varies by user, which really hurts performance. There is an issue for it that has been open for six years; it has a patch, but it's still not committed. So if you're using Context, check that issue and use the patch. An example of placeholdering is status messages. Status messages are meant just for the user that is currently viewing the site, just this one time, so they would effectively make the entire page uncacheable if you didn't placeholder them. If we check how status messages are rendered, it looks like this: it doesn't load the status messages, it doesn't try to display them or do anything; it only defines the lazy builder callback and says, please placeholder this. Drupal then renders the entire page, and after rendering it, calls the lazy builder callback, which looks like this, and basically renders the status messages we have above. It takes that markup and injects it into the page. That's how the page stays cacheable while status messages, which are only displayed once, are rendered like this.
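A hedged sketch of the same lazy builder pattern for your own code; the my_module.lazy_builder service and buildGreeting method are assumptions, but #lazy_builder and #create_placeholder are the real render array keys core uses:

```php
// In a block or hook: don't render the per-user piece now; register
// a placeholder for it instead. The callback must be trusted, e.g. a
// service method or one listed via TrustedCallbackInterface.
$build['greeting'] = [
  '#lazy_builder' => ['my_module.lazy_builder:buildGreeting', []],
  '#create_placeholder' => TRUE,
];

// The callback runs after the rest of the page has been rendered (or
// is streamed in later by BigPipe) and returns a normal render array:
public function buildGreeting(): array {
  return [
    '#markup' => $this->t('Hello @name', [
      '@name' => $this->currentUser->getDisplayName(),
    ]),
    // Only this fragment varies per user; the page around it is shared.
    '#cache' => ['contexts' => ['user']],
  ];
}
```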
On renesas.com we used this technique for this block, which, as you can see, looks quite static. It has a few links that I think are the same for every user, but then there's "subscribe to document updates", which, if you are already subscribed, says "unsubscribe from document updates". So only that link varies per user, because it has to check the subscription status. We placeholder just that link, so we can cache the block once for everyone and then inject that one per-user link into it. Thanks, Janez. Okay, I'm going to give you some tips and tricks, the kind of stuff I wish I had known earlier, for really digging into this. First of all, you need to be able to debug these cache tags and cache contexts, and see which tags and contexts are on everything, in order to improve them. The first place to start is services.yml; it should be in sites/default. That's where the render configuration lives: your default contexts, your auto-placeholdering settings, the kinds of things Janez was just talking about. There's also something you can turn on that you would not do on production, but would on a local or preview environment: you can enable the debug cacheability headers, and that lets you see, at the page level, after all the tags and contexts have bubbled up, all of the tags and all of the contexts for a page in your response headers. And from there, you can start to dig in and see which context is a problem. For example, I think url.query_args is a problematic cache context, because as soon as somebody puts a random string into the query, they've got a cache mess.
user.node_grants:view, that's a problem, because if you're logged in and you have specific grants to nodes, now you're varying per user, which means you basically have no caching at all for your authenticated users. Then you look at the cache tags and try to really understand them, and spot any problematic cache tags that are getting cleared a lot; I'll show you how to figure that out in a little bit. And down here, this is really bad: this whole page is set to uncacheable, though that's a separate header you get by default anyway. So yes, you definitely want to enable that locally and always be looking at it. But there's another way to debug in more detail, at the render array level instead of the page level. I think Janez actually helped this patch finally land: there's a setting in services.yml, under the renderer config, called debug. You set that to true, again for your local environment, and it then gives you, in the HTML right before each block or each rendered piece of the page, just for that little bit, all the cache tags, all the cache contexts, how long it took to render, and whether it was a cache hit or a cache miss. I use that all the time, because you have a mix of the custom tags and contexts you're adding, plus all kinds of things core and contrib might be adding, and ultimately you've got to see what the end result is. It even shows you the pre-bubbling cache tags and contexts. So yeah, turn on that debug mode; that's where you start to get into this stuff.
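Both debug switches live in sites/default/services.yml; a sketch for a local or preview environment, never production:

```yaml
parameters:
  # Adds X-Drupal-Cache-Tags and X-Drupal-Cache-Contexts response headers.
  http.response.debug_cacheability_headers: true
  renderer.config:
    # Wraps each rendered element in HTML comments listing its pre- and
    # post-bubbling tags, contexts, max-age, and cache hit/miss info.
    # Available in newer core versions. Note: overriding renderer.config
    # replaces the whole block, so in practice copy the full
    # renderer.config section from default.services.yml and flip this.
    debug: true
```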
But that shows you what's going on at the development level, and the thing about performance is that it's always a combination of the code and the real user traffic. You have to see what's really going on on your production site, with the real traffic and the real patterns, to understand what actually matters for caching. So Janez introduced me to the Cache Metrics module, which is by Moshe, who I think is here. Moshe made this module, Cache Metrics, that can send this cache data into New Relic, and I think it can probably send it to some other places if you're not using New Relic. This way you get data on production for every time anything invalidates a cache tag, on which URL it happened, when, and by which user. You can also get all the data on cache hits and misses for dynamic page cache, render cache, page cache, everything, and see everything that's going on in production with the cache tags. That was a real game changer: once we started investing in getting better data and really seeing what was going on, we could make sure we were spending our time fixing the things that would actually help, because you could spend your lifetime trying to fix these cache issues and maybe not make much of a difference; it could go on and on. You really need good data. So this is a screenshot of a dashboard I've built from Cache Metrics data, and you can see which cache tags are getting invalidated, pages that are cache misses, and all that kind of stuff.
But it doesn't answer the whole question of which cache tags are really problematic, because it will tell you which cache tags are getting invalidated, and it will tell you which cache tags you have on a certain page, but you can't really query the two together. So if you look here, it will say, okay, these cache tags are getting invalidated a lot, but are we even using these cache tags, right? So what I do is pick a certain URL that I think has a poor cache hit rate and a lot of traffic, then go in and grab which tags it has; here are the tags on this page coming from dynamic page cache. Then I copy those and do another query against the tag invalidations, to check what's invalidating the tags I'm actually using. So these are the actual tags in use on that page, and in the past week this one got cleared 14 times, this one three, this one once, and the others not at all, so those really didn't matter; nothing was clearing them. And 14 isn't bad, though it's not great. But sometimes I've done this and it says 5,000 times, and you go, oh great, and then you realize this cache tag is on every single page of our site and it's getting cleared 5,000 times a week, so we really have no caching at all. That's how easy it is to have no caching, and that's why this stuff is so tricky: you can destroy your caching in so many amazingly easy ways that not doing it is kind of a modern miracle. And then this is showing, for those same cache tags we were just looking at, which URLs are invalidating them, so you can start to understand: okay, this is the one that was getting invalidated fairly frequently, and it's happening when this node was edited by a content editor, or this one was deleted, and you
can start to understand: hey, maybe we need to be more specific about when we're invalidating these tags, or how we're using them. So that's the Cache Metrics module; I really like it. Okay, let me talk about Views for a bit. First of all, if you're working on an enterprise site or a high-performance site, or you have a big team on a site, I would just not use Views. I really think that as a Drupal development community we need to get a little clearer about what is a site-builder tool and what is a serious engineering tool for big sites. Views is a great tool for small sites that you're building quickly, or for things that don't get much use, like an admin dashboard, but it's not a great tool for high performance or for doing a lot of custom development on top of, and there's really no reason to be using it; just write a database query and build your stuff. So here are some of the gotchas with Views. When you go into Views and look at the caching options, there are three: tag-based, time-based, and none, and I think the default is tag-based. If you use tag-based, which is the default, it automatically adds this node_list tag (views are usually mostly nodes, but whatever the entity type is, it adds that type's list tag), because it doesn't know which specific cache tags to add. So it just adds the one that wipes everything out as broadly as possible, because it's not able to understand closely enough when it really needs to invalidate. Well, node_list is one of these toxic cache tags, so adding it is like having almost no caching at all if your content gets updated a lot, because node_list gets invalidated every time you edit or add any node. If you have a heavily updated site, that's not a cache tag you want on things. So then you might think, well, maybe I should switch it to time-based; they have that option.
Well, here's a messed-up thing about time-based: it adds a max-age, but it doesn't remove the tags. You'd think it would stop invalidating things by tag, because you switched from tag-based to time-based, but it doesn't; it still uses the same tags, and then it also adds a time basis on top, which is completely unnecessary, because the view is already getting invalidated all the time anyway. So time-based is worthless; there's an issue in core about it. The other option is none. If you set it to none, I think that gives a max-age of zero. Hopefully that will trigger auto-placeholdering, so you can still get a dynamic page cache hit for the rest of the page, but it might result in no caching on the entire page at all, so I wouldn't mess around with none. Oh, and if you set sorting to random, that's another good way to have no caching at all on a view. So if you want to use Views and get rid of these tag problems, you have to install the contrib module Views Custom Cache Tag. That gives you a new option, so you'll have tag-based, time-based, or custom, and when you use the custom option, first of all it removes that node_list, the general entity list tag, and lets you put in the cache tags that you actually want. One way you can do it is to put in a more specific one, like node_list:article or something, or you can put in some other custom cache tag. And so we got into this idea of custom cache tags, which was really necessary for us, and the reason was that we wanted to use cache tags to invalidate Varnish and our CDN, so that things would update right away when they needed to at those levels, without requiring the editors to wait a certain number of hours or to manually invalidate things. And what we found was that, at least on Acquia, there's a size limit to how big a header you can send to Varnish.
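To sketch the custom cache tag idea in code before we get to the Varnish side: every name here, the tag string included, is an assumption for illustration. You attach one custom tag to the big listing instead of one tag per row, then invalidate it only when something relevant changes:

```php
use Drupal\Core\Cache\Cache;
use Drupal\node\NodeInterface;

// On the page that renders the big table, attach one custom tag
// instead of thousands of per-node tags.
$build['table']['#cache']['tags'][] = 'mymodule:parametric_table:products';

/**
 * Implements hook_node_update().
 */
function mymodule_node_update(NodeInterface $node) {
  // Only invalidate when a field the table actually displays changed,
  // e.g. the title, to minimize invalidation events.
  $original = $node->original;
  if ($original && $node->getTitle() !== $original->getTitle()) {
    Cache::invalidateTags(['mymodule:parametric_table:products']);
  }
}
```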
So the way that Varnish purging by cache tag works is that it sends the cache tags in a response header, and if you have too many cache tags, you're just going to get fatal errors all over your site. We had way too many cache tags, because what we were doing was, say we had a big table of 500 nodes, we would have node 1, node 2, node 3, 500 node cache tags on it, so that if any of those nodes changed it would invalidate. And actually we had some pages with 10,000 rows in a table, so we had these crazy pages with way too many cache tags. So we added some custom data to Cache Metrics so that in New Relic we could see the length of our cache tag lists, and then we would know which pages we needed to work on, because otherwise we would just keep getting these fatal errors. Hoel on our team did a ton of work around this to get both the purging and all this cache tag stuff working, and he would have to go and figure out which pages had too many cache tags and what we were going to do about it. So what we did was, instead of putting a separate cache tag for each node, we started making our own custom cache tags. For example, in this part there are about 2,000 records in this table, each one a node, so instead of putting 2,000 cache tags on it like we used to, he added a custom cache tag called parametric table, plus the page that we're on, and then we made custom logic for when to clear that cache tag. So we could say: we don't need to clear it whenever anything on these nodes is updated; we need to clear it if a title changes, or a URL changes, or some other little piece that's actually relevant here, so that we can minimize the invalidation events based on what actually has to change. And that was something we had kind of avoided at first; we were like, oh, we'll just use the core
cache tags that are already there. But once we started doing that, it made a lot more sense: we really need our own cache tags and our own invalidation rules, because then we can control the invalidation much better. So I could go on and on, but does anybody have any questions? (Just repeat the question into the mic.) You just want the link to the slides, okay. You said you have one block that's on every page, and how do you cache it? Yeah, so an example of that, if I understand what you're saying, is our main menu. We have this big giant main menu, and it's the same on every single page of the site, but Drupal by default adds a cache context like the active menu trail, so by default it caches the menu per page, and we'd have to render the entire menu fresh on every page and cache it separately. So we had to go in and remove that cache context, because if the block is on every single page, the important thing is to make sure it doesn't vary by URL or by route or anything like that. Really, if it's the same on every single page, it doesn't even need a cache context; it should have no cache contexts at all, right? Oh, you're getting real-time data, so how do you cache that? Okay, well, I would say you can't really cache it; I would try to grab it from JavaScript. We haven't really talked about that, but that's kind of the key to all this stuff: try to do it on the front end, so that your back-end stuff is all cached, and then do a call in JavaScript to get the data and stick it on the page, so the whole rest of the page will be cached. Does that make sense? Or yeah, placeholder it. Or, if you want to compromise, maybe you can cache it for just a minute; whatever is more than zero is better, right? So if you're pulling in tweets and you can, you know, be fine with
having it not update for 15 minutes, then cache it for 15 minutes and use the time-based approach. We're getting the wrap-it-up flag, so you can come up and ask individual questions, and we'll say goodbye.