Good morning, everyone. We'll just give it a few seconds for people to start dribbling in.

All right, I'll start with the introductions. My name is Toby Vaughan. I'm the product lead here at amazee.io. I'm based in Canberra, but we're here for Sean; we're here to learn about caching and gamification. Sean Hamlin is the main technical account manager in the APAC region for amazee.io. He's been with us a little bit over two years now. We were just joking that his most popular module on Drupal.org is Responsive Favicons, but secretly he wants to know who the four people are who still use the Nyan Cat progress bar. So if you're one of them, let Sean know.

During the presentation, Sean's completely open to taking questions as you think of them, so feel free to drop them in the Q&A section of the session. If he can answer them as he's going, he will do that, but otherwise we'll have a wrap-up session at the end where we'll get through those questions. So drop them in as you go. I'll hand over to Sean.

Thanks, Toby. Thanks, everyone, for jumping in. All right, we've gone through that. First thing I want to say: thank you for showing up. I know we've got a lot of slides, and there are other competing talks on, which are all high quality as well. So thanks to you for showing up and hopefully learning a little bit about what we're going to talk about today. I'll go through amazee.io in one minute, in case you don't know who we are; a little bit about HTTP caching, in case this is new to you or you want a bit of a refresher; and a little bit about why HTTP caching is a good thing and why you should care. We'll talk about this new tool that was developed, called Caching Score, and I'll unleash you upon that as well, so you can actually use the tool whilst I'm doing my presentation and ask some really curly questions if you want. Up to you. And where to from here: where's the tool going, how's it going to be adapted, et cetera.

So, quick fire, amazee.io: we host things, not necessarily websites.
It could be other things as well, but it's all WebOps. You push code, and the code gets translated into objects running inside Kubernetes. It's all magic; highly recommend doing it. And we can host anywhere in the world, so if you've got a favorite data center that you really like things to run in, that's all good. We have people in pretty much every major time zone, so we support 24/7/365. A big point of difference is that we're open source first. The hosting product we've got, Lagoon, which Toby is the product manager for, is entirely open source, which is absolutely awesome, and as a result we've got a lot of people that value that. If you want to know more, you can ask us after the session.

And a quick refresher on HTTP caching. All right, this is what websites did back in 1999. We had a web server, which was probably Apache 2 or 1, and you had a bunch of people, and those people visited your website, and your little web server kind of chugged along. The main problem is that you have only so much capacity on your web server, and sometimes you'll have a lot more visitors than you expect.

So the next step up, I guess, is having local browser caching, which helps to some degree because your browser can remember it's downloaded an image and maybe not request it the next time. So this helps a little bit, but doesn't really solve all the problems.

The next step really is introducing the concept of a shared cache, which you can plug in, say, Varnish for if you will, and have all your users visit your shared cache. It sits in front of your web server and helps to offload it.

The next problem is what happens when your users actually come from all over the globe. Having one Varnish server plonked in one country: is it necessarily going to give you the gains that you think it might?
The latency between geographic regions might be two to three hundred milliseconds, and then multiply that by the number of assets you have, and suddenly your website is perceivably slower than what it could be. So the next level up, I guess, is having a shared cache that sits in the appropriate region. In this example you have one in Australia, one in the US and one in England. (Can't say Europe now. Not Europe.)

Leveling up even further: say you've got 70 points of presence around the world. You don't want 70 requests for the home page to come through. So what you do is you actually tier your shared caches. This is called cache tiering, or tiered caching, and it's kind of where you want to be. This way you're funneling all the traffic through a series of caches to get to your web server.

And why should you care about any of this? Why not just rock out a single Apache server for all your visitors?

So, it's faster. Time to first byte, or TTFB, is the time taken between sending the request and the very first byte of the response coming back. It's a measure of how much latency there is between you and the site. But the time to download the assets has a marked difference as well.

There's a cost benefit here, so you can actually serve more people with the same stuff. If you're rocking out a couple of web servers in the background, then that's maybe all you need. And say you get popular, say your tweet goes viral, or you get TikToked, like the youngsters do these days, then maybe your site has a chance of staying up.

One of the features that most of these caching services have is stampede protection. If you have a hundred people requesting the home page at the exact same second, only one request will pop through back to your origin, and then when that response comes back to the edge layer, a hundred responses go out to the end users.
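To make the stampede protection idea concrete, here is a minimal sketch of request coalescing in Python. It is not how any particular CDN implements it; the class name, the `fetch_from_origin` callable and the `demo` helper are all hypothetical, and cache expiry and error handling are left out.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class CoalescingCache:
    """Toy edge cache: concurrent misses for one key share a single origin fetch."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin  # hypothetical callable: key -> response
        self._lock = threading.Lock()
        self._cache = {}      # key -> cached response
        self._in_flight = {}  # key -> Event, set once the origin fetch completes

    def get(self, key):
        with self._lock:
            if key in self._cache:
                return self._cache[key]          # plain cache hit
            done = self._in_flight.get(key)
            leader = done is None
            if leader:
                # First requester for this key: it alone goes to origin.
                done = self._in_flight[key] = threading.Event()
        if leader:
            response = self._fetch(key)          # error handling omitted for brevity
            with self._lock:
                self._cache[key] = response
                del self._in_flight[key]
            done.set()                           # wake everyone who queued up
            return response
        done.wait()                              # followers block until the leader is done
        with self._lock:
            return self._cache[key]


def demo(visitors=100):
    """Simulate `visitors` simultaneous requests for the home page."""
    origin_hits = []

    def slow_origin(path):
        origin_hits.append(path)
        time.sleep(0.1)                          # pretend the origin is slow
        return f"<html>{path}</html>"

    cache = CoalescingCache(slow_origin)
    with ThreadPoolExecutor(max_workers=visitors) as pool:
        responses = list(pool.map(cache.get, ["/"] * visitors))
    return len(responses), len(origin_hits)
```

Calling `demo(100)` hands back the number of responses served and the number of times the origin was actually asked, which is the hundred-to-one ratio described in the talk.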
So it helps to stop the onslaught of people hitting you.

And if your origin's ever down, say you need to, I don't know, update Apache, then your origin won't be serving any traffic for that brief period of time. In this case, most of these caching services will offer some form of serve-while-stale, which is the ability to serve previous content in the event that the cache can no longer update it.

To that point, this has been in my mind for, I don't know, a year or two now. I often get asked questions about caching like, "Hey, is this good?" Quite generic, broad questions like that, and I need to do a series of checks to work it out. I thought, surely I can write something that does this stuff for me, because I like to automate my life where possible, and this can really help offload my brain into a tool. So what you're about to see is my brain, codified in a tool.

And this is something you can go to now as well. If you have a browser up, go to cachingscore.com. It is live. Up until this point it has not been tweeted about, so if you find it useful, then get on the tweeters and the TikToks and make something. I'll talk to you a little bit about this tool, how you can use it and how you can interpret it, and we'll go through some sample sites and talk a little bit about them as well.

So, this is its mission statement,
I guess, distilled into one line: it's a tool that you can use to assess how strong the caching abilities of a given site are. It's like what SSL Labs is for SSL, but the equivalent for caching. That's what I'm hoping it will be. It's simple to use and understand at a glance, so I'm trying to distill all the complex information down into something more understandable, especially for people that don't really care about tiered caching and PoPs and CDN providers and stuff like that. It provides a lot of education around the current score and also how to improve it. And it is written with a lot of plugins, so the idea is we can add checks, additional CDNs and additional CMSs as and when the need arises.

How does it work? There's a handy giant-sized text box at the top. When you plug any domain in there, it will perform a series of HTTP requests. Quite a lot, maybe more than a dozen now, maybe more than 15; I don't know, I've lost count. Then it records the responses and analyzes the HTTP headers, and it works out what CDN, if any, the site is using. Then, based on what CDN it's using, it can interpret the response headers more accurately. There are other tools out there at the moment that can just give you the response headers, but the problem is that every CDN is so quirky, and I'll go through a few examples just to show you what I mean. It's a minefield out there. So that's why this tool is a bit different: it actually puts the HTTP headers in the context of the edge provider that the site's using.

And there is rate limiting in effect, mainly to stop people using it as a vector for amplification attacks, because I figured someone's probably going to try to do that. It's a rather generous rate limit, so you won't hit it as a normal user.

These are the CDN plugins I've written so far. It's by no means every CDN on the planet.
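As a rough illustration of the "detect the CDN first, then read the headers in that context" idea, here is a small Python sketch. The detection rules and function names are assumptions for illustration, not the tool's actual plugins: they rely on response headers these providers commonly send (`CF-Ray`, `X-Amz-Cf-Id`, `X-Iinfo`, `X-Served-By`, `X-Varnish`), and the CloudFront rule follows the "Error from cloudfront plus an Age header" reading given later in the talk.

```python
from typing import Callable, Dict, Mapping, Optional

Headers = Mapping[str, str]

# Illustrative heuristics only, ordered most specific first;
# the generic Varnish check acts as the fallback.
CDN_DETECTORS: Dict[str, Callable[[Headers], bool]] = {
    "cloudflare": lambda h: "cf-ray" in h,
    "cloudfront": lambda h: "x-amz-cf-id" in h,
    "imperva": lambda h: "x-iinfo" in h,
    "fastly": lambda h: h.get("x-served-by", "").startswith("cache-"),
    "varnish": lambda h: "x-varnish" in h,
}


def normalise(headers: Headers) -> Dict[str, str]:
    """HTTP header names are case-insensitive, so compare them in lower case."""
    return {name.lower(): value.strip() for name, value in headers.items()}


def detect_cdn(headers: Headers) -> Optional[str]:
    """Return the first CDN whose detector matches, or None if nothing matches."""
    h = normalise(headers)
    for name, matches in CDN_DETECTORS.items():
        if matches(h):
            return name
    return None


def is_cache_hit(cdn: Optional[str], headers: Headers) -> Optional[bool]:
    """Interpret hit/miss in the context of the detected CDN (None = unknown)."""
    h = normalise(headers)
    if cdn == "cloudflare":
        return h.get("cf-cache-status", "").upper() == "HIT"
    if cdn == "cloudfront":
        x_cache = h.get("x-cache", "").lower()
        if x_cache.startswith("hit"):
            return True
        # A cached error response: "Error from cloudfront" together with an Age header.
        return x_cache.startswith("error") and "age" in h
    return None  # no interpreter sketched for this provider
```

The point of the split is that the same `X-Cache` value can mean different things depending on which provider produced it, so the interpretation step only runs once the provider is known.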
It's only a subset at the moment; it's just the ones that I've come across most recently. And there's also one for a generic Varnish reverse proxy, so we cater for both plain caching proxies and actual CDNs.

This is probably my call to action for everyone on this call: if you find something that Caching Score can do better, or maybe something you can help explain a bit better, or maybe you want a CDN supported that's not currently supported, or maybe I'm interpreting the headers incorrectly for a given CDN, which is entirely possible (it's definitely not perfect right now), then there's a link to a contact form. Plug in your email here. Your email is only needed so I can reply to you; if you don't want to be replied to, then just make up an email. I'm not going to check it. Then put in your feedback message here and click send, and that will send off an email to me.

The tests at the moment are executed out of Sydney, but this won't matter if your website's hosted in, say, another country, if you have a CDN in place.

It's not perfect. These are just the limitations I know about. It only allows 10 seconds per HTTP request, and if your site's slower than 10 seconds, then I think you kind of know what your score is without me having to tell you. I have seen some sites block Caching Score because of bot management tools, so you won't be able to score them, obviously, in that case. And some people configure their CDNs to strip every useful modicum of information out, which I find rather disappointing, because how do you debug something that you can't inspect? I don't find that particularly useful, but there are a few sites out there that do that. And some more advanced features I just can't test.
It's purely a limitation of this tool. I can't test what happens to your content when your origin is down, unless you want to take your origin out, do a scan, and then put it back up again, but that's kind of risky. And if you do see logs with a user agent of cachingscore/1.0, then you know where it's from: someone's been scanning your site.

And who decided on these checks and the scores? They're just based on my experience in scaling websites, so by no means perfect. They will be tuned and adapted, and no doubt there'll be some feedback after this talk, so I welcome it. I haven't really mentioned it, but the point of this tool isn't to bag on anybody or say, "Oh, it's terrible, you suck." It's not supposed to be like that. It's more: how do we bring you up, and how do we level up your caching?

To that end, I'll go through a few examples. Not the contact us page. I picked on our own website as well, which is using Gatsby and a few other weird things that I'm not quite privy to, but anyway, it gets a B minus. It's not great, it's not terrible, it's kind of in the middle. I think if you get a B, you're not doing too badly.

You can see a list of checks that have run. If it's green, you've got top marks; if it's red, you've got no marks; and if it's yellow, you've got some marks. And I'm not a graphic designer, so if you see stuff in here that looks like it could be a bit better rendered, then that's probably why. You can click on each check to expand it and learn a bit more about it. Every check starts off with "What is it, and why should you care?" In order to get maximum points on this check, you need to have a time to first byte of 30 milliseconds or less. You can see the current time to first byte is two, so I can tell you for a fact that Fastly has a PoP in Sydney. They have to, because you can't go very far in Australia on two milliseconds.
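The "two milliseconds doesn't get you far" reasoning can be put into numbers. This is a back-of-the-envelope Python sketch, assuming light in optical fibre covers roughly 200 km per millisecond and that the whole time to first byte is spent on one round trip; the function names are made up for illustration, and the 30 ms threshold is the one quoted in the talk.

```python
FIBRE_KM_PER_MS = 200.0     # assumption: roughly two thirds of the speed of light
FULL_MARKS_TTFB_MS = 30.0   # full-marks threshold quoted in the talk


def max_one_way_distance_km(ttfb_ms: float) -> float:
    """Upper bound on the distance to whatever answered the request.

    Treats the whole TTFB as one network round trip and ignores handshakes
    and server processing, so the real distance can only be shorter.
    """
    return (ttfb_ms / 2.0) * FIBRE_KM_PER_MS


def ttfb_gets_full_marks(ttfb_ms: float) -> bool:
    """True when the measured TTFB is at or under the 30 ms threshold."""
    return ttfb_ms <= FULL_MARKS_TTFB_MS
```

With a 2 ms time to first byte, the bound works out to 200 km from a test node in Sydney, which is why the response must have come from a nearby point of presence rather than from another city or continent.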
You can't go very far in australia on two milliseconds There are a few other Checks that we can Probably talk about that like e tag support and last modified are very similar But again, I'm hoping to help explain what an e tag identifier is Yeah, so That's kind of another useful feature of Caching score actually goes off and does a background hdp check on your behalf With a header of an if none match and then the e tag to see if the actual website responded with the 304 which is uh, You know, essentially you're still good that content you've got is still okay Um, Ted cashing. I'm still working on some cdm providers to work out whether they actually do support it properly For fast, I can guarantee that this is accurate, but other cdm providers. I'm not as sure Um 404 has been cached well Often people don't really think about their 404s until like after they go live and all of a sudden they're serving 10 000 404s out of origin that were from the previous website. So I think all 404 should be cached for some amount of time Like 60 seconds is better than nothing And uh, how it does that it actually does a request to a url. It's definitely not going to be valid Unless you're really cheeky and create a url for this on your side, but for the most part this should be broken and I request it and then I request it again and this is the second time I request it. 
It's not returning a cache hit, so we know it's definitely not caching 404s.

These two are related. Query parameter stripping is something that you don't really think about until your marketing team DoSes your site, and that has happened to numerous sites that I've been involved with in the past. The first one is the Facebook click ID, or fbclid. Typically Facebook will append this sort of garbage to the end of every URL you pop into Facebook. And, fun fact, this identifier is guaranteed unique for every user and click, so if you really want to destroy your site, post it on Facebook. So this check is all about: if I request this URL, I should get a cache hit just requesting it once. And the same with UTM. This is more about people doing Google AdWords and stuff like that. When popping links into Twitter, people will often append this UTM garbage to the end, like utm_medium and stuff like that.

The last one here is how long the cache lifetime is set to be. As you can see, there's a sweet bug there where I'm rounding up to one second when it should clearly be zero. But in order to get maximum points in your cache lifetime score, you need to have a cache lifetime of four weeks. Controversial, probably; I'm keen to have feedback on that. But I think with the advent of cache tags, and Drupal 8, 9 and 10 supporting cache tags, there's no reason why you should be rocking out a short fixed time to live. You should be one week, two weeks, three weeks, four weeks minimum, in my opinion.

And where it gets really crazy is I'm actually parsing the HTML that's returned, and then I'm pulling out a CSS file, a JavaScript file and any images that live on the exact domain that you requested. So if you're using a subdomain of assets dot or cdn dot, I'm not going to bother checking those. Assets need to live on the exact domain that you've requested, and I do pop in the
list the URL that you've actually got there. These checks are the same checks that we've put on the HTML, but I just weight them a little bit less because they're not as influential, in my opinion, to the overall score.

And if you really want, there are the full HTTP response headers that are pertinent to caching down below, and if you really, really, really want, you can execute a curl command to get the same information and draw your own conclusions that way as well. You'll notice that there's the special -H, which means "send a special request header". It knows that this site is using Fastly, and it knows that Fastly has a debug header, so it's actually getting you to plug in the debug header that pertains to the CDN in question. Not all CDNs have debug headers; some of them just emit that information by default. But if it does, then you'll see it there.

And I haven't really talked about this, but if you want to see the headers for the JavaScript file, you can click through and see what the asset in question actually is. So you can see here, we're definitely not caching this for very long. We should probably have a chat to the team, Toby, and make this one not a zero.

Cool.
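The cache lifetime check, which applies to the HTML and to the assets alike, can be sketched in Python as follows. This is an illustration under stated assumptions rather than the tool's own logic: the function names are hypothetical, the parser is deliberately simplified (it does not handle commas inside quoted values), and a response with no explicit lifetime is treated as zero. The one firm rule it encodes is from the HTTP caching specification: for a shared cache such as a CDN, `s-maxage` takes precedence over `max-age`.

```python
from typing import Dict

FOUR_WEEKS_SECONDS = 4 * 7 * 24 * 60 * 60  # the full-marks lifetime suggested in the talk


def parse_cache_control(value: str) -> Dict[str, str]:
    """Split a Cache-Control header into lower-cased directives (simplified parser)."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, argument = part.partition("=")
        directives[name.strip().lower()] = argument.strip().strip('"')
    return directives


def shared_cache_lifetime(value: str) -> int:
    """Seconds a shared cache (CDN, Varnish) may reuse the response without revalidating."""
    directives = parse_cache_control(value)
    if any(name in directives for name in ("no-store", "no-cache", "private")):
        return 0
    # s-maxage overrides max-age for shared caches.
    for name in ("s-maxage", "max-age"):
        if name in directives:
            try:
                return max(0, int(directives[name]))
            except ValueError:
                return 0
    return 0  # assumption: no explicit lifetime counts as zero here


def meets_four_week_target(value: str) -> bool:
    """True when the shared-cache lifetime is at least four weeks."""
    return shared_cache_lifetime(value) >= FOUR_WEEKS_SECONDS
```

A header such as `public, max-age=60, s-maxage=2419200` passes the four-week target even though browsers only keep the response for a minute, which is the usual way to combine a long edge lifetime with cache-tag purging.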
So in terms of picking other sites, just go for gold. You can have a look and make sure that we've identified the CDN correctly, and just have a look at the scores and make sure you're comfortable with all that.

I'll go through a few interesting cases that I've found. This is a site that's using Imperva as the CDN. I need to put more details in here about the cache hit, but it's actually not a cache hit, and the only way you can discern this, and this is where it gets really crazy, and I hope this is big enough to see: they've got this header called X-Iinfo, and the part that tells you whether it's a cache hit or not is this. It's the second character in this long string of gumph; N indicates not cacheable. So that's what I mean: it's really hard to tell if a request was actually served by the CDN if you don't have this information about the CDN in question. A naive approach might be, "Oh, look at X-Cache and see if it's got the word hit," but actually that's the Varnish server that sits behind Imperva in this particular situation, and that Varnish server is probably not going to be geographically very close. So, time to first byte: here you can see we're penalizing them as well. It's 103 milliseconds for this, so even though it was a Varnish hit behind Imperva, because it wasn't a hit at Imperva, then...

I think we may have temporarily lost Sean. Maybe his eight-gig internet link has dropped or something, but we'll give it a few seconds to come back. In the meantime, if anybody's got any questions on what they've seen, or similar sort of information they're looking for, please throw something in the Q&A, and if Sean doesn't come back, I'll have to answer it.

So, Sean's just told me the power's just died in his house. Many apologies for this. And I'll point out that I have nothing to do with making the amazee.io website even more awesome.
That's handled by people far more important than I.

So, a question came in from Steven: is the CDN mandatory if you don't want to run one, as in, we use the CDN to calculate part of the score. There are very few examples I can think of where you should be running your site without a CDN. Generally, if you've got a very niche target audience and you don't need any of the distributed consumption models of a CDN, then yes. But there are so many other things. As a hosting provider, we see every single hit that comes to our back end, everything that hits your database, every request that comes through Varnish. The CDN will do an awful lot to block some of the bad actors. Sean and I are going to be in one of the sponsor sessions later today, talking about how we proactively monitor a lot of customer stuff and how we do a lot of that blocking off our infrastructure. But yeah, definitely, and I think Tyra said it: any CDN is better than none, because it just lifts that away from your infrastructure. Whilst you might have infrastructure that can scale, and you might build your application to be able to scale, it just takes a tweet to go viral or someone to TikTok your website. I don't know if people can TikTok websites. And that's why we put importance on the CDN in that score.

A question from Nick about including a power ranking leaderboard of popular sites and a wall of shame. I don't know if Nick was in the session when Sean said it was explicitly not about shaming people, but we all know that there is no greater motivator than a poor score. Being able to show someone an F will make people jump through hoops. Whether we need to do that publicly, I don't know, but I'd certainly encourage people to find examples of sites that are done well and share them as a best-practice thing. Let's not have the humiliation of poorly scored sites being laughed at, because we've all been there. We've all made something that we know works.
We've we've all made something that we know works um and copy the presentation slides. Oh Hey Hey, sorry about that. I and I I kid you not my power just died at my entire house um, I don't know if it's out wider, but um, we're rocking out on some Some 5g and now single monitor laptop Um, so yeah, we'll see how all that goes um Catch me up toby. What happened? So I answered a few questions Another difficult ones. Um But yeah, the the couple questions about is the cdn necessary and um, yeah, we said cdns are important So, yeah, if you've got if you've got a couple more sites to go through Uh, I do Let me see if I can I'm now going to share the window. I'm currently Viewing so it's going to be fun um Yes, we'll work out how to make these presentation slides available afterwards as well But most of it's just screenshots of the website. So Yeah Is this um viewable toby? that is Okay, sweet. Okay cool, um I've heard this being talked about in the past. There's a few people probably on the call that deal with the site as well um And here you can see my sweet css. So yeah. Oh my god. So good um, yeah, so this is an example of a site that's uh, Drupal and um, one of the things I did want to talk about was um, the fact that there are actually some Drupal specific checks now in place so um, we can discern that JavaScript aggregation is turned on and I don't think css is because I think there's a front-end framework. So yeah, maybe take that one with a grain of salt But something that I do see a lot of um with Drupal sites is The page cache module which kind of comes by default turned on I think in d8 plus And it's really good for tiny websites. 
If I go back to my slides, it's really good for the people that are rocking out on a single Apache. Really good for them, but not so good for anyone using a CDN; this site was on CloudFront. So you're not going to get points for this one: if you have a CDN, you need to disable the Page Cache module.

At the moment there are only three Drupal-specific checks. If you have an idea on what else we could discern from the outside, and keep in mind we can't run Drush and we can't access the database, we can only look at the response headers and the response body, but if there's something that you can discern that would be useful, I'm definitely keen to know more about that.

And if I do talk about CloudFront for a little while... Oh, sorry, my keyboard is not working because it's plugged into the... yeah. Okay, it's not actually showing up, but I'm pretty sure there is 404 caching for CloudFront, so I just need to potentially force a cache miss.

So there are actually a few of these URLs that are quite nuanced. If you pick on this one here: how do you tell that this was a cache hit? This is the stuff that the plugin system does as well. You see "Error from cloudfront" and you're like, "Oh, is that a cache hit or not?" And actually, if you read the docs, if you have an Age of anything and "Error from cloudfront", it's a cache hit. So that's how to tell if CloudFront supports 404 caching.

And probably the next question someone might ask is, "Give me an example of a perfect site." I don't want to pronounce this because I will make an absolute mess of it, but it's Swiss German. It means wildlife in English.
So it's kind of like an animal site. But I haven't yet found any checks that would fail on this particular site. They're using a cache lifetime of an extremely long period of time. They support everything that they need to. They are a Drupal site. They have everything kind of perfect, as far as I can tell. And I do need to talk more about this, but I'll do it in the explanations when I actually write the content: I need to talk more about how s-maxage and max-age coexist with each other.

Cool, so I think that's everything I wanted to talk about from the actual tool perspective. But I'm definitely keen to have any feedback pop through on the tool or through the discussion chat. Are there any questions, Toby?

Carl wants an API.

Oh, Carl. There is an API. You've got /scan; if you put /api/scan in instead, you know, a slight hack, then you get a full JSON enumeration of the visual representation of what you just saw. It's subject to the same rate limiting as the UI. No API key is needed, so just go nuts and tell me what you build. That's all I really want to know, to be honest.

And his follow-up question was: does it have built-in support for refresh hit responses, or is it assuming only a miss or a hit?
I do interpret some refresh hits as a hit, when the CDN will serve the response back to the end user immediately and refresh in the background. If the refresh requires the CDN to talk to the origin, and that then needs to come back and the whole pipeline finish, then I count that as not a cache hit, if you know what I mean. And that's where I might need someone's help: if I am interpreting that wrong for your particular CDN, then let me know. I know Akamai does refresh hits, and they come back as refresh hits in the UI, but I think in that particular situation Akamai actually goes all the way back to origin to verify it's valid and then sends it on to the end user.

Your lane is very complimentary, in case any of the developers of the Tierwelt site are here, about the response times of their pages. So, have you found any other A-plus sites on your travels?

No, I haven't really had a chance to use the tool so much. I'm too busy building the tool.

That's no different; every year we're trying to build the plane while we're flying it.

So yeah, I'm definitely keen to know if anyone else can get an A plus, or if there's anything you find that's maybe unjustified. But, coming into the future,
I haven't really ranked and stacked sites, and I don't really want to do that because, again, as I said earlier, I don't really want to come up with a shame list. It's not the point of the tool. But what I might do, and I'm keen for everyone's suggestions here as well: if you're scanning, say, a .com.au, I might give you a position against the other .com.au sites. So like, "Hey, you're position nine out of 57 .com.au sites that have been scanned." You'd know where you stand, roughly, without someone poking a big pointy stick in your face. So in terms of the gamification elements, it's quite light at the moment, more akin to SSL Labs, where it just gives you the score, a grade, and kind of leaves you be. There's another site I do use called securityheaders.io, which has leaderboards and stuff like that as well. I'm just on the fence about leaderboards and shame boards, so at the moment they don't exist.

It's good to know that we both said exactly the same thing. I think this is a great tool, and it's fantastic alongside SSL Labs and alongside Security Headers. We all want to be better. I know that in a lot of situations, trying to get the resources or the time or the allocations we need to be better is difficult. So if you can use this site and others as part of your business case towards why caching is important and why performance is important, then let us know if it's of use to you.

Thank you so much to everyone for coming, for asking questions and for putting up with Sean's power outage. Much appreciated. Sean and I are going to be in the amazee.io sponsor session in a couple of hours' time, talking a little bit more about some of the more proactive stuff we do to help keep people's sites up, running, alive and happy. Have a good day.