application. It's a lot to cover and it's like 5 p.m., so, you know, caching can be a very technical talk. There's a lot of jargon and it can be a bit dry, so I'll try to get your attention with the first slide: gotta cache 'em all. Okay. Oh, it's cut off. Let me retune the display a little bit. Let me see what the arrangement is. Yeah, just don't do that, and then I would shift this guy over, and then I would do this. I hope that will work. Oh, I just lost my Wi-Fi. Okay, there we go. Let me make sure I'm there. Yep.

So these are the contents. I'll go through a definition, how it works, why it's important, when to apply it, what to cache, cache expiry, busting and invalidation, some common examples, and some myths.

Now, Wikipedia has a definition. I won't bore you with all of it, but there's page cache, web cache, DNS cache, database caching. That is interesting, but it's more of a computer science definition: a cache is basically just a high-speed data store. A practical definition would be the use of previously stored or computed results to shorten the time to process a future request. That's basically the purpose of caching. And caching is only effective if retrieving the stored result is faster than computing it; otherwise it defeats the point.

So how does it work? Most caches use a key-value system: you pass in a key and it returns the value, and the value is usually way larger than the key. The key could be an integer, an email address, something of that sort. In a typical transaction you have an input request. An application that caches will check whether the result for that request is already in the cache. If it's not, that's called a cache miss: you compute the result, write it to the cache, and then send it out as the response. The next time round, we see, oh, it's in the cache, I'm just going to fetch it from the cache and shoot it out right away.

Okay, so why is it important? It
actually increases perceived performance. Performance, technically speaking, is very hard to define. A lot of us look at request and response time and call that performance. But truly, performance comes down to the number of instructions the CPU runs. There are only two ways to make it run faster: you buy a faster CPU or you reduce the number of instructions. There's no other way to increase performance. But you can increase perceived performance with caching, because that reduces the number of instructions your code has to run. So it reduces the time to respond to requests, and it reduces the cost of computing power. It clears application bottlenecks, allowing your web servers to process new requests instead of the same request over and over again, and it offloads the application's I/O to faster storage.

So when do you apply caching? Obviously when there's a bottleneck. When the output is the same for a high number of users: let's say you're on a shopping cart application, you don't want to generate the same page again and again for thousands of users. You do it once, cache it, and serve the cache to a lot of users. Also when the time taken to compute or fetch the output is not acceptable: even if the results are unique, if it takes five minutes to generate the page, you might want to cache it. And when the output can stay consistent for a period of time. Again, for a shopping website, take the products page: once products are added, they're going to stay there for a while. Even if you add a new product or remove a product, you can have the old pages stay in the cache for probably another hour before you change them, and that's fine. If a new product is added, it will not show up until an hour later. Not so bad. But if a product is removed, it will still be in the cache. The user clicks on it
and wants to buy it, you call your controller, and your controller says no, the product does not exist. Still not so bad; you can still control that kind of UX.

When real-time accuracy of the output is not really required. You can cache the location of a taxi, the lat-long; that's fine. You shouldn't cache it for an hour, but I think five seconds, ten seconds, even 20 seconds is fine. Someone once told me that caching for even 20 seconds can be a big thing if the number of connecting clients is high. If you have 10,000 clients connecting and they all want the same number, just caching for 20 or 30 seconds helps a lot.

When only a small part of the output will change. If you have a huge web page or a huge data hash and only 10%, or less than 5%, will change, you can cache the whole thing, and whenever there's a change, modify only that little bit and cache it again. So you work on the cache instead of recomputing the entire data hash.

Especially when the output only changes at a predictable time. In an enterprise environment there'll be a lot of manual workflows, people coming back with lots of invoices to key into the system. You know when the data will change, so you can cache it until then.

And when an always-ready copy of the data or the output is required. The data in your data store or your database can change, but you always need a ready copy to render a page, to send an email, or whatever. And when it's more cost-effective.

So what can you cache? You can cache compiled bytecode; even in PHP you can do that, that's the opcode cache. You can cache database query results, HTML output fragments, the return of a JSON API. You can cache user session data and user preferences as well, and you can cache those in the browser if you want. Arrays, hashes, transient data that isn't too permanent. So there are many layers in your infrastructure where you can apply caching, but a good number
of us are very familiar with web caching at the edge level, like Cloudflare, fronting with your CDNs and all. But I'm actually talking about the application layer, within and around PHP.

So when you write a lot of stuff to the cache, say Memcache, Redis, or DynamoDB, especially with Memcache, you can set an expiry timestamp, so items expire after, say, an hour. You don't have to remember to clean them up after an hour or a day; they will just go away by themselves. Memcache will also remove items on a least-recently-used basis. And there's manual removal: you can call the key and say, hey, unset this thing. So applications can cache and forget. You can just keep caching stuff and not worry about it; if it's not used, it'll just go away. If a new key comes along, say you use a timestamp as part of the key, the new one will be used and the old one will be forgotten. That's fine.

Cache busting is a technique where you augment the key to get a new copy of the data. A very good example: you cache by URL, so the full URL is the key. However many times you hit the same URL, you're going to get the cached data. But you know the values in the database have changed, and if you called the controller now, a new page would be rendered. So you can augment the URL by appending something very safe, like a question mark and t=1 or t=10. That changes the key, so the request hits the controller and returns a new page. That's probably more for testing and development purposes, to get a live version of the output.

Cache invalidation refers to removing items from the cache. There are times when the data in the database can stay the same for a long time. Let's say a customer list: you have hundreds of thousands or a million customers in a database. So what you do is a map-reduce:
you pull out all the names based on postal code, you create this hash, and you cache it. It's not going to change so often. Maybe every week you're going to add one or two, but you do not know when. Sometimes there's no addition, sometimes there'll be a lot. So you're going to cache it for a long time. You can cache it for a month or a year, who cares, because you know when it changes. When it changes, just invalidate the cache and refresh it again. You can do that by key. Certain caching solutions allow you to flush items older than a certain date, so you just keep the last two weeks, if that makes sense for your application. Or, in the case of Cloudflare, you can say, hey, all these pages you cached for me, everything under /customers, remove them; they're no longer valid.

Okay, so for an e-commerce website, for example, you can cache all the public pages, because they're the same for all users, thousands or hundreds of thousands of them. In addition to that, you can cache every product item as a fragment. A web page could show, say, 20 products; each of them could be a fragment, and you can cache every one of them. If one product description has changed, you only update that product fragment, and then you invalidate the top-level cache. The top-level cache can now be reconstructed, and it'll be quick, because it takes all the fragments and assembles a new cached page. It doesn't have to re-render the entire page, so it'll be very, very fast. But private pages you will not cache. You just leave them out of the cache, and they'll be generated dynamically.

The same goes for a social media website; it's very similar. You have the content page, say a blog post. The content of the blog post, once written, is not going to change too much, so you can cache it. And then the comments section can have multiple
comments. For the entire comments section you create one cache entry; you can cache that as well. Those are fragments. The page cache will basically be the blog post fragment and a comments fragment. So when a new comment comes in, you take the comments fragment, append the new comment, refresh the comments fragment, and then refresh the page cache using the content fragment and the comments fragment. It'll be way, way faster than rendering the whole page.

Oops, sorry, let me go back. Okay, these are some of the common app behaviors, by nature. You have high write volume with high read volume. Normally in that case real-time accuracy is not so important. Facebook like counts, for instance: they are actually not exactly the same across all devices. They are usually off by quite a bit, and after they reach a thousand they just show 1K, so they can be off by quite a bit and it doesn't really matter. But they are eventually consistent. Low write volume with high read volume: those are things like web pages. Write in real time but read later: that's analytics, all the analytics trackers. Then you have write in real time and read in real time; that's really hard to cache. Chat messages, for example, are really hard to cache. And then you can write in batches and read in chunks.

Lastly, I'm going to cover some of the myths. First: it is not possible to cache a page because the username is different for each user. Say I'm on a shopping cart again. I log in to one of these shopping websites, and in the top right-hand corner I have my username. Just because I have my own username at the top does not mean I cannot cache the entire page. I can cache the page without the username, and I can save the username in a cookie in the browser, so that when the page is
loaded, when the cached page is loaded, JavaScript kicks in, checks the cookie, and writes the username in the top right-hand corner. Of course, you can say that your user can change the cookie store and whatever, but that's fine, because it's not really going to hack your system; it's just display. And the session key will still apply to a cached page, so that's fine.

It's not possible to cache a page because a section of the page is dynamic. Like I mentioned before, you can have fragments, so if a single fragment has changed, you can still reconstruct the entire page cache using fragments.

It's not possible to cache the results of an Ajax call. That depends on the nature of the Ajax call. If the result is always about the same, that's fine; an Ajax response is just text, and you can cache that using Memcache, or Redis even.

Caching is bad because it leads to stale data or HTML. Yes, unless you put in some cache invalidation rules.

Caching should be avoided because you can always tune your code for better performance. I have heard this time and again: don't cache at all, write really good code, make it fast. If Ruby is slow, switch to Golang; if Golang is slow, switch to Scala. Well, I always say use the best tool for the job. In this case, if it's static HTML for millions of users, use a high-I/O solution like Memcache, I think that's fine, or Varnish, I think that's fine. Don't render the page over and over again. So that's really a myth.

I think this is the last one: caching is only good for read operations. That's also quite common thinking, that you only cache stuff that you read. Actually not true. Remember, a cache is a high-speed data store, so you can also use it for fast writes. All Facebook likes are stored in memory first; they are never stored on disk first. When you like a Facebook article, the like doesn't get written to the DB directly. It gets written to high-speed memory, and periodically there's a routine that
will read the high-speed memory, collect the likes, and write them to disk. Facebook would crawl if they wrote to the DB every time you liked an article, so they don't. If you need high-speed writes, caching is also good. You need a fast data store like Redis or DynamoDB. They are pretty fast: Redis keeps its data in memory, DynamoDB runs on SSDs, and they're non-blocking, they return immediately after the write.

So, okay, any questions? Was I too fast?

A caching service? Yes. For page caches, web page caches, you can use Cloudflare; that's pretty okay. Cloudflare gives you free SSL as well, and if you pay a little bit they give you DDoS protection. It's also pretty good. If you want to host your own, you could run Varnish, I guess, or memcached. Those are fine as well.

Oh yes. What I did not cover was HTTP headers. You can set the headers to tell the browser not to retrieve the page again. That's also a good technique; you could do that as well.

Right, so if you abuse the cache, yes, it can lead to abuse, where developers don't write really performant code and just cache everything. For a certain nature of apps, where they generate very static pages, the developer can get away with that. But take something like Netflix: your video dashboard or channel dashboard on Netflix differs from user to user, so they can't cache one and serve thousands. The best you can do in that scenario, knowing every user has a different dashboard, is that every night or every hour you look at the preferences in the DB and refresh the cache by running your code, constructing the dashboard before the user even retrieves it. So by the time they ask for it, it's immediate. But yeah, it's true, you can actually abuse it. It's like both extremes: you don't abuse it, and at the same time you don't want to rely a hundred percent on
code.

I would say this: in the ideal world you write hundred-percent performant code. But achieving the last 10% of performance probably takes 90% of the time. So at what cost? Are server costs more expensive than engineer costs? I don't think so. So if you turn on the cache and avoid the last 10% of tuning, you probably save a lot more money running your infrastructure than paying salaries to engineers. And more often than not, there are more cutting-edge platforms or programming languages that promise really, really good performance at the expense of fewer features: no TDD, no BDD, no testing suites. The developer can make a hello world that's freaking fast, but when it comes to really complex business logic, they just cannot deliver the code. So it's more of a balancing act. Use a tool that you're very comfortable with, that gets you 80% or 90% of the performance, and leave the last 10 to 20% to caching. I think that's fine, so long as the nature of the app suits caching.

Like I said just now, chat apps are very tough to cache, because when the message comes in you need to broadcast it immediately. You might write it to DynamoDB very quickly and then broadcast it, but you don't really need to cache it, because you have to push it out almost immediately. So you only so-called cache it, not really cache it. In fact, you can use DynamoDB as your persistence layer, just in case the server dies and boots up again. But if you have a very good redundant infrastructure, another server will pick it up and just send it out. So the developer has to look at the nature of the app and pick the correct caching strategy.

Cool. All right, any other questions? Oh yes, very good. Thank you.

Yeah, so Sam has covered the principles behind caching, so I'll just be touching a little bit on the different tools that we
would use as PHP developers. So, according to PHP: The Right Way, they talk about opcode caching. What Sam called bytecode caching is basically what we in the PHP world call opcode caching. There are a lot of libraries already available for PHP. OPcache, for example, is the latest one, and it comes packaged with PHP from 5.5 onwards, so if you're already using that, you can turn it on. It's a common module in PHP; it's bundled from 5.5 upwards, and I think it's also available for lower versions of PHP. Before that it was APC, the Alternative PHP Cache, and there were also XCache, Zend Optimizer, and a few others. There's a whole bunch of them out there. This is usually a module: you choose one, you just switch it on, and it should work. Try not to mix them together, because you get some weird results: an opcode cache of an opcode cache, which is kind of weird.

I can show you a quick example of doing this. If you look at the sample code repo, there's a new folder there called caching. Inside there's a small Dockerfile, which has some of the code I'm talking about. So I'll show you one of the examples right now. Here I have a Docker instance running PHP 5.6.24 as an Apache app, and the app is basically just showing you the phpinfo. I'll show you the page right now, which is this. Over here, if I look for the opcode cache, there's no mention of it among the modules, which means it's not turned on. And if I change this now to just hello world.

To know how much we've actually improved the performance using caching, we use tools to measure it. For example, we use Apache
Bench over here. What it's doing is taking 10 concurrent connections and hitting the server 5,000 times. It's hitting port 8000, which is where the app is, and this is without opcode caching. Let's look at the performance. It reports progress every 500 requests as it goes through, and then it shows you the results. On average it takes about 27 milliseconds, and the longest request takes about 148 milliseconds. This is without opcode caching.

So right now let's turn on opcode caching by rebuilding the Docker container with OPcache turned on. Live demo, please work. Seems to work. Okay, this is the page; look at it again, it's showing nothing but hello world. The same code. I'm going to run the test again. Notice it took about 27 milliseconds on average and the maximum was 148. If I run the same test again, look at the improvement: it's about 10 milliseconds faster, which means your page loads a bit faster. This example is a bit contrived, because it's just one line of PHP code, but it's still cached, and the performance is close to 2x. So imagine what it could do for a bigger application. So that's opcode caching.

Another thing we could do is object caching. Object caching covers things like a class, an array, or a value that is stored somewhere on the server, so we retrieve it instead of computing it again. For example, you do a SQL query which gives you 50 records, and you know these 50 records don't change. You can cache that entire result set: you can serialize it, store it in a cache like Redis or Memcache, and then retrieve it without
recomputing it or making any more round trips to the database server, which will improve the performance of your application. APC, for instance, has an API for you to add stuff. These are all in-memory caches. Even with many instances of PHP running, when you use APC it stores the data locally, in the memory of that server. Where you have a cluster of servers that need to share data, you can use Memcache. With Memcache, your cluster of PHP web servers can store stuff together in a single Memcache instance, or a cluster of Memcache instances, so they all share the same data. Say the first time the first page of your blog was hit, five records were retrieved from the database, and they won't change for the next half an hour. You can store those five records in Redis, and then when other servers in your cluster ask for the same thing, they find it and can just show the records without making another round trip to the database. So we could use this quite readily.

Here is an example of how it could be done with Memcache in PHP. Basically it's adding a server; you can add multiple servers, so if you have a cluster of two or three Memcache servers, you add them all together here. Then you check whether the key exists. If the key doesn't exist, which means it has not been stored in Memcache yet, I do a set with the data. After that it will be saved, and the next time it will be showing you the same data from Memcache. You can wrap this up into a class, so instead of making calls like this you get an abstraction layer where you say, I want this record now, but if the record isn't there, set this as a default value
for example, and then your abstraction class handles checking whether the value is there, storing it and returning it, or, if it's already there, just retrieving it and showing it to you. You can write that abstraction layer on your own. So this is one way of handling that.

We could also use the same principle of object caching when it comes to fragments of your web page. In your Laravel app, for example, you can cache the rendered version of a piece of code, say the view of your blog posts. If the first five blog posts won't change in the next two hours, you can cache the rendered HTML of that page in memory. A lot of frameworks like Laravel and CakePHP have this built into the system. In this case it's a Laravel add-on called Matryoshka, a caching framework which you can use to cache rendered fragments. It's a really interesting one.

Most frameworks like Laravel or CakePHP have also introduced their own abstraction layer, which lets you work with different types of cache backends, key-value stores like Memcache, Redis, and a few others. In this Laravel example you can store and retrieve stuff from either a database, a Memcache cluster, or a Redis cluster. You don't have to worry about knowing how to write code that interacts with the Memcache server or with the Redis server, because the framework abstracts away that complexity and gives you one common interface that lets you interact with different types of backends.

This is an example of caching done in CakePHP. You echo the element and you cache it. You can also specify
the key that you use to cache this with, and you can specify the expiry time, how long this cache entry will remain there.

So, on what Sam talked about in terms of cache invalidation: it could either be time-based, where the entry expires, or it could be key-based. With key-based invalidation you embed certain values in your cache key, so that the key changes when the content changes. For example, the rendered version of the content will have a certain MD5 hash; you can use that as your key. That's one example of doing it. If you're using something like Redis, you can even nest your keys. Depending on the way you write your cache keys, you can invalidate a whole bunch of IDs that come after a prefix, say /user/5/blogs or whatever. In Redis you can match that entire block of keys with a star pattern and delete them. You say anything that matches a certain pattern, and you can remove all those values from the cache, which is kind of cool. But that of course requires you to know how to interact with the Redis server directly. That's one way.

So that's basically all there is to caching; I think we covered most everything. Are there any questions with regards to what Sam has talked about and what I've talked about?

Yes, which is why you work with an abstraction layer. Use a framework like Laravel; Laravel has a generic cache class, so what you use behind it is basically up to you to decide. Say you're using a Memcache cluster: if the library changes, the framework should handle that change for you. And if you switch from a Memcache cluster to a Redis cluster for your key-value caching, you just need to swap around, change the credentials, change
the server IPs, and basically tell it, I'm using Redis now. That's it. It should handle all that difference for you.

Right. I think APC and OPcache, the difference is this: APC has a key-value object caching component and also an opcode caching component. OPcache does not have object caching; it's just an opcode cache. So it's basically replacing one extension with another. What it does underneath, I think, is pretty similar; I don't think there should be any difference. Just go with whatever is the latest and most supported. Why they changed, I think, is a political issue, which I don't want to get into.

Yes. Usually memcached is not in the same container, so there's still network latency. But the thing is that the DB is usually also out of the container, it's a process outside the container, so there's disk and network latency there as well. Given that, Memcache is faster. Having said that, for a specific nature of application I would even recommend a second memcached within the same container as your app. It can be recommended if the nature of the app allows, as in, for whatever reason you need a super-high-speed cache that's only for this instance of the web server.

What about our own in-memory cache? You can. In the previous company I worked in, we used XCache to load a giant array. There's an associative array which holds our feature toggles. We had a database table which is basically a bunch of feature toggles, and what we did was retrieve that as an associative array and store the associative array in memory using XCache. It's RAM, basically, memory. So either
way, if you put it in your app process it's memory, and memcached is memory. In fact, there's this technique with a RAM disk that I'm not going to go into, if you really want a disk as fast as RAM.

Many years ago I used to work on a social network, and we had an inbox, and we used Couchbase for the inbox. Whenever things went wrong we had to restart the Couchbase database, or data went missing, or things became inconsistent very fast. So key-value stores like Couchbase are fine and good as a caching layer, but don't use them as a primary data store. Always have an authoritative data store, like MySQL, or Mongo to a certain extent, though I wouldn't recommend it. Go for something which is properly ACID, like Postgres, MySQL, or MariaDB, as the authoritative store. And then you have a cache layer in front of it, for transient data in Memcache, Redis, or something else which you can read very quickly, so that things don't require you to do a lot of computation. For example, a home feed like Facebook's requires a lot of computation to gather up the data and find out what's going on. That computation takes a lot of time, and doing it every time a person loads the home page is not practical. So you cache things. You can cache in Redis, in Memcache, in local memory, or as a file somewhere. There's some partial caching, some view caching, that just takes the rendered version of the view and stores it as a temp file in the file system. That can be done.

Cool. Any other questions? All right, so that's it. We've come to the end of the PHP workshop. Thank you very much.
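
The flow described in both talks (check the cache, compute on a miss, store the result with an expiry, invalidate by key when the data changes) can be sketched roughly as follows. This is only an illustration under stated assumptions: it is written in Python rather than PHP, it uses an in-process dictionary in place of Memcache or Redis, and the names `TTLCache`, `remember`, and the key `blog:page:1` are made up for the example and do not come from the speakers' sample code.

```python
import time


class TTLCache:
    """Tiny in-process key-value cache with per-item expiry (stand-in for Memcache/Redis)."""

    def __init__(self, clock=time.monotonic):
        self._items = {}      # key -> (value, expires_at)
        self._clock = clock   # injectable so expiry can be exercised without sleeping

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None                   # cache miss: never stored
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]          # expired: forget it and report a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, self._clock() + ttl_seconds)

    def delete(self, key):
        # Manual invalidation by key; deleting a missing key is not an error.
        self._items.pop(key, None)


def remember(cache, key, ttl_seconds, compute):
    """Return the cached value for key, computing and storing it on a miss.

    Assumption for this sketch: compute() never returns None, because None
    is used here to signal a miss.
    """
    value = cache.get(key)
    if value is None:
        value = compute()                 # e.g. the database query or page render
        cache.set(key, value, ttl_seconds)
    return value
```

A caller would wrap the expensive step, for example `remember(cache, "blog:page:1", 1800, load_first_page)` for the half-hour blog listing mentioned in the second talk, and call `cache.delete("blog:page:1")` when a post is edited. With a shared store such as Memcache or Redis the same shape applies, but the dictionary is replaced by network calls to the cache server.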