Thanks for being here for this guided tour of caching patterns. I'm Nicolas Frankel. I've been a developer for two decades now, and I recently became a developer advocate. In the past I've also been a lecturer at universities and higher-education institutions, so if at times this talk feels a bit like a lecture, that's probably why; a guided tour also lends itself to lecturing. I work for a company called Hazelcast. Though this talk won't be about Hazelcast, I will be using it to illustrate what I'm showing. Hazelcast has two big features. The first is an in-memory data grid, meaning you can store data in memory, distributed by nature. The other is in-memory data streaming: you can transform your data, enrich it, combine it, whatever. Anyway, if you remember one thing from this talk, it's that caching is a trade-off. You might have heard that caching is the sign of a badly designed system and that you should shy away from it. That's not true. Of course, sometimes you implement caching because your system is badly designed, but the system is as it is and you must cope with it in the most straightforward way; you cannot redesign every system every time. Now, regarding this trade-off, there are two contexts in which caching is a good idea: either you accept stale data because you want fast data, or you accept stale data because otherwise there would be no data at all. I always use the same example of an e-commerce application, one designed around microservices: a catalog microservice, a cart microservice, a payment microservice, a checkout microservice, a pricing microservice. The customer puts items into their cart from the catalog, and at some point they go to checkout.
During checkout, you call the pricing microservice again, and because it's a microservice architecture, distributed by essence, there is a possibility that at this point it takes too long to get the price. Customers, or potential customers, don't like to wait: they will leave the shop, they won't buy, and that is bad. In that case it's better to cache the prices of the most-used items in the checkout microservice. If the pricing microservice returns in time, that's perfectly fine, and we can cache the price. But if it takes too long, we time out very fast and get the price from the cache. Of course, the data won't be exactly the correct data, but from a business point of view it's better to sell at a slightly outdated price than not to sell at all. And it can get even worse: the pricing microservice might not just be slow, it might be down completely. In that case it's not about waiting, it's about returning anything at all, and again a cache can help us. So remember the point of view of the business: especially in that first case, it's better to return slightly wrong data than not to return anything at all. Business is happy.

The second lesson from this talk: you might have heard "don't roll your own crypto". Well, don't roll your own cache either; it's going to be a mess. I've been guilty of that in the past: I was asked to make something more performant, and I used a hash map. In general, when you're a young engineer, and I've been a young engineer, and you're asked to cache some values, you use a dictionary. And it works perfectly well, until it doesn't. Because the problem with a dictionary is that while it fulfills the basic feature, it's unbounded.
Meaning you will put more and more entries into the dictionary, and there is no limit, so at some point your dictionary will take a huge amount of memory and compete with your application. So, first rule: you should set a limit. But it's not only about the size limit; that's just the first hurdle. The second one: imagine you have reached that limit and you need to put a new entry into the cache. Which entry should you evict? That's not trivial; you need to think about it. In general, professional caching solutions have multiple strategies for that: it can be least recently used, it can be least frequently used, it can be priority-based, where every time you set an entry into the cache you give it a priority. There can even be a custom plugin point, so you can plug in any strategy you'd like. But this is something you need to address. Also, when you set an entry into the cache, in general it's only good for a certain time frame; afterwards, that entry should be invalidated, and there should be a background thread that removes expired entries. Again, if you roll your own cache, that's something you need to develop yourself, and it's not fun. This is the time-to-live concept: when you put an entry into the cache, it's valid for a certain time, either per entry or per cache region, but in any case you shouldn't put an entry into the cache forever. Then there are optional features, things that might be interesting even though you probably won't need them every time. Some caches, for example, are distributed. That brings a whole new level of problems: once you have a distributed cache, you probably want to form a cluster, so there is the auto-discovery problem: how do you find the other nodes to form a cluster? And because we are talking about distributed systems, you have probably heard about split-brain.
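A bare dictionary gives you none of these must-have features. As a rough sketch of what a real caching provider handles for you, here is a toy bounded cache with LRU eviction and a per-entry time-to-live; all the names here are mine, not from any particular library:

```python
import time
from collections import OrderedDict

class BoundedTtlCache:
    """Toy cache showing the must-have features a bare dict lacks:
    a size bound, an eviction policy (LRU here), and a per-entry
    time-to-live. Real providers also run a background thread for
    expiration; here expired entries are dropped lazily on read."""

    def __init__(self, max_size=128, ttl_seconds=300.0):
        self._data = OrderedDict()          # key -> (value, expires_at)
        self._max_size = max_size
        self._ttl = ttl_seconds

    def set(self, key, value):
        if key in self._data:
            del self._data[key]             # refresh position and TTL
        elif len(self._data) >= self._max_size:
            self._data.popitem(last=False)  # evict the least recently used
        self._data[key] = (value, time.monotonic() + self._ttl)

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None                     # miss
        value, expires_at = item
        if time.monotonic() > expires_at:
            del self._data[key]             # expired: treat as a miss
            return None
        self._data.move_to_end(key)         # mark as most recently used
        return value
```

With `max_size=2`, setting a third key evicts the least recently used entry, and an entry older than its TTL simply disappears on the next read.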
Split-brain is: there is a network partition, there are clients still using both sides of the cache, and at some point the cluster forms again. Which are the valid entries, and how do you reconcile them? That's a completely new level of issues. Then, when you put stuff into the cache, how do you serialize it? Do you serialize it in a way dedicated to your stack, Python or Java or whatever? Or do you serialize to JSON, so that the cache can be accessed by different clients in different stacks?

So, let's start simple. I have an application and I just want to cache some stuff, and the easiest way is for the application to be the orchestrator. The application first tries to get the value from the cache. If the value is missing, meaning there is nothing in the cache, or there was a value in the past but it has been invalidated, then we go to the data store, in general a database, and get the value from there. If we find it, because there is a chance we don't, we put it in the cache and return it to the client. The next time, the client gets the entry from the cache: it's there, and it's returned immediately without going to the data store. So let's check how it's done.
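The flow just described, plus the write variant the talk gets to a bit later, can be sketched like this. `cache` is any dict-like store (the blocking view of a Hazelcast map behaves much the same), while `db_load` and `db_save` are hypothetical stand-ins for the SQLAlchemy queries in the demo:

```python
# Cache-aside: the application orchestrates both the cache and the
# data store. db_load/db_save are hypothetical data-store callables.

def get_person(pk, cache, db_load):
    person = cache.get(pk)
    if person is not None:
        return person              # cache hit: the database is not touched
    person = db_load(pk)           # cache miss: go to the data store
    if person is not None:
        cache[pk] = person         # remember it for the next request
    return person

def create_person(pk, person, cache, db_save):
    db_save(pk, person)            # write to the data store first...
    cache[pk] = person             # ...then mirror the write in the cache
```

The key point is that all the orchestration lives in the application: the cache itself is completely passive.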
So here I've created a simple Flask application with a couple of routes: the first route gets all entities from the database, the second gets one by its pk, its primary key, and a third route allows me to put entities into the database. It's a simple CRUD application handling Person entities. Before starting the application, I populate the database with dummy data if it isn't already populated. I'm using SQLAlchemy, and I'm using SQLite to store the data; of course that's not how you would do it in production, but it's fine for a simple demo. So let's start it and see that it works. I can curl localhost:5000 and I get all entities, and you can see they correspond exactly to what I created, because it's a brand-new database. Or I can ask for one pk, or I can try to put some data into it: curl -X POST localhost:5000, and now I must remember the syntax, with a Content-Type: application/json header, passing the following data. And now if I curl again for everything, I get this new entity: yes, it has been created. So I'm very happy, I have an application that works. But for whatever reason it's too slow, and I want to implement caching.

So I go to my faithful IDE, where I've already created everything, because live-coding is too dangerous. I will be using Hazelcast, as I mentioned, and the only thing I'm doing is creating a Hazelcast client, and from this client I get a dedicated map. By default the client has non-blocking behavior, but here I want to keep things as simple as possible, so I create a blocking map: every time I put something into the cache, I directly get the result. And this is how it works: in get-all, for every person I got from the database, I put it in the cache and write a line into the log, just to be sure. In get-one, I take the pk, check whether the cache contains it, and if it does, I return it from the cache; otherwise, as I mentioned, I write it into the cache and return the person directly. So let's see how it works. First I start my cache: Hazelcast starts; I already installed it. Again, it's not about showing you Hazelcast, just imagine that I have a cache running, a distributed one. Now I can start the application and curl it: curl localhost:5000/1 to get the first entity, and it returns John Doe; at this point I should see "person with pk 1 not found in cache" and "person with pk 1 set in cache" in the log. If I ask for the same key again, I see "pk 1 found in cache", so I don't go to the database. And of course I can curl everything, so now every entity is in the cache, and if I then ask for any one of them, it's found in the cache, because I've set them all, so we don't go to the database again. That's cache-aside.

Of course, that's only about reading, but we can also do the same for writing, and that's actually even simpler: every time you write to the database, and you check that the write succeeded, of course, afterwards you set the entry in the cache. In the code it's the following translation: we commit, and at that point, if there is no error, we set the key in the cache. So here we create a new entity, let's say Jack Doe, because John has already been created, and now when I curl everything on localhost:5000 we can see Jack, and it's in the cache. Perfect, that seems easy enough. And actually you don't need any real caching provider for that; any hash map would do, even though you'd need a size limit, an invalidation strategy, and so on. This is in general what people think about when they use a cache, but a cache can be so much more. So that first pattern is cache-aside, for read and write. The second pattern is read-through, and with read-through, your
application is no longer the orchestrator: it only interacts with the cache itself, and the cache is configured to talk to the database. On one side that's beneficial, because you only talk to the cache facade; on the other side, you need to configure the cache accordingly and be sure it works according to your needs. Let's see how it works in the code. For convenience I've created a new project for the cache itself, because it allows me to make some changes; normally you would configure this once and then not care about it anymore. In that case it's a Java project, I'm sorry, but even if you are a Python developer the idea is quite straightforward. There is this class called SqlMapLoader, and basically, when load is called, and we won't be calling it explicitly, it will be handled by the caching provider itself, it looks into the database. There is of course also loadAll, and loadAllKeys, and some initialization stuff; there are a few methods you need to implement, though not that many actually. Here I'm saying: read from this database. Once that's done, we can forget about it: we just run the cache and keep our application. As you can see, on the read routes we are not interacting with the database anymore; sorry, on the write route we still are, but on the reads we are not. We moved the code that interacts with the database from the application to the cache. Our application is now much simpler: it just reads from the cache, and the cache is the one to say "it's not in the cache, so I need to load it from the database", retrieve it from the database, put it in the cache, and return it to the application. Likewise, you can do the same for the pk route.

So let's see how it works: I can now call the application on localhost:5001 and directly get the entity. If we check the Python application, there is no logging; we directly get the data. And on the caching-product side, we can see that some data has already been loaded: we are eagerly loading here. That's not necessary, but we are eagerly loading. On the write side, we are still interacting with the database, because we are only doing read-through, not write-through. So the next step: let's see write-through. Again, it's the same as before: we interact only with the cache, and the cache itself interacts with the data store. So let's simplify our code using write-through; that requires the next step. In that case the class now implements a new interface, MapStore, and again, this is Java code, I don't want to go into the details, but it adds a couple of additional methods: store, storeAll, which I'm too lazy to implement now, and then delete and deleteAll. Let's start the cache with that in mind, and afterwards we can start the application itself. OK, we can still curl, everything is still fine, and now we can POST, and the POST will go write-through. So curl, copy-pasting because I'm lazy, and I have John, I have Jack, and now I will add Jane, Jane Doe. And here I have an issue: I don't get the ID of the entity, because I've migrated the store logic to the cache, so I need to provide the ID myself, which can be an issue in some cases, but here let's suppose it's not: I will use ID 10. Now it works, and of course when we curl ID 10 it's already in the cache, because again, we only interact with the cache now. That's pretty good; so far, I'm quite happy.
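Conceptually, read-through and write-through move the loader and store logic behind the cache facade. The talk implements this as a Hazelcast MapLoader/MapStore in Java; here is a language-neutral sketch of the same idea in Python, where `loader` and `storer` are hypothetical callables standing in for the SQL code:

```python
class ReadWriteThroughCache:
    """Read-through / write-through sketch: the application talks only
    to the cache, and the cache owns the load/store logic. In Hazelcast
    this role is played by a MapLoader/MapStore plugged into the
    server; loader/storer here are hypothetical stand-ins."""

    def __init__(self, loader, storer):
        self._data = {}
        self._loader = loader      # e.g. a SELECT by primary key
        self._storer = storer      # e.g. an INSERT or UPDATE

    def get(self, key):
        if key not in self._data:            # miss: the cache loads itself
            value = self._loader(key)
            if value is not None:
                self._data[key] = value
        return self._data.get(key)

    def put(self, key, value):
        self._storer(key, value)             # synchronous write-through
        self._data[key] = value              # cached once the store succeeds
```

Compare with the cache-aside helpers earlier: the application code shrinks to plain `get`/`put` calls, and all the orchestration has moved inside the cache.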
However, in this case we wait for confirmation every time. If you remember your UML lessons, this is synchronous: we put data into the cache, the cache puts the data in the data store, and we wait until the data store has done its job before we get the response and return it. Depending on your use case that might be good or not so good; we are not as fast as we could be. So the next step, once you have implemented write-through, is to say: perhaps we could be much, much faster. How much faster can we get? Well, it's easy: when you put something into the cache, you just return immediately, and you leave it up to the cache to put the data into the data store. Again, it has benefits and disadvantages. The benefit, as I mentioned, is that it's super fast: you only interact with the cache and you return immediately. The problem is that you are not guaranteed it will actually happen: there might be an issue with the caching provider, it might go down; there might be an issue with the data store, it might go down. So you might have a value in the cache that never made it to the data store, while the client thinks everything has been done the right way. This is eventual consistency, and sometimes it might not be consistent at all, but you are fast. Again, that's not bad per se; there are use cases where you value speed over consistency, and it's up to you to decide. This is write-behind: you just say, OK, I will do everything asynchronously. Let's see how it works. So here we go to write-behind, and the only thing we need to change is the configuration; in that case it's very Hazelcast-specific, we just change a single line of configuration: write-delay-seconds. Between the time you actually make a change and the time the change gets pushed to the data store, there are 20 seconds, and the good thing is that during that time there might be other writes coming in for the same entry.
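Write-behind can be sketched as follows: `put` touches only memory and marks the entry dirty, and a later flush writes only the latest value per key, mimicking the coalescing that happens during Hazelcast's write-delay-seconds window. The `flush()` method stands in for the background thread a real provider would run, and `storer` is again a hypothetical data-store callable:

```python
class WriteBehindCache:
    """Write-behind sketch: put() returns immediately; dirty keys are
    persisted later, coalesced so that only the latest value per key
    reaches the data store. A real provider flushes on a background
    thread after a configured delay; flush() is explicit here."""

    def __init__(self, storer):
        self._data = {}
        self._dirty = set()
        self._storer = storer

    def put(self, key, value):
        self._data[key] = value    # fast path: memory only
        self._dirty.add(key)       # remember to persist it later

    def flush(self):
        for key in self._dirty:    # one store call per key, latest value only
            self._storer(key, self._data[key])
        self._dirty.clear()
```

Note the failure mode discussed above: anything still in the dirty set when the process dies never reaches the data store.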
In that case there will be only one call to the data store, with the latest value. Again, pros and cons everywhere. The next pattern I want to show you is refresh-ahead. One of the problems of caching, as I mentioned, is stale data. To cope with stale data we do cache invalidation: when you put an entry into the cache, you say it's valid for a specific amount of time, let's say 5 minutes; afterwards it gets cleared, and the next time your client asks for the entry, it's not there, so we need to fetch it from the database and go through the whole flow again, and that takes time. So you will have slow requests that need to access the database and fast requests that hit the cache. That's not a necessity. What you can do, again with some caching providers, is that when an entry is close to invalidation, it is eagerly fetched from the data store. That is asynchronous; your client doesn't know anything about it. Once you've put an entry into the cache, it will be refreshed automatically when it's close to expiration: if you set the invalidation period to 5 minutes, then close to the 5-minute mark the data is fetched from the data store again, so you know each entry is at most 5 minutes old, and more importantly, the client always fetches the data from the cache. Of course, the problem in that case is that your cache will always be full: once you've put an entry into the cache, it stays there forever, the cache can only grow, because you never really invalidate anything; whenever an entry would be invalidated, you put it back. That's pretty good if you accept this trade-off, but it means the first request will still be slow, because at that point there is no entry at all and you need to fetch from the data store.
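Refresh-ahead, stripped to its essence, looks like this: reads are always served from memory except the very first one, and an entry nearing its expiry triggers a reload. The reload is done inline here for simplicity; a real provider would do it on a background thread. `loader` is a hypothetical data-store callable, and the parameter names are mine:

```python
import time

class RefreshAheadCache:
    """Refresh-ahead sketch: only the very first read of a key is slow;
    afterwards the entry is re-loaded whenever a read finds it close to
    expiry, so clients always hit a warm cache. A real provider
    refreshes asynchronously; here it happens inline on the read."""

    def __init__(self, loader, ttl=300.0, refresh_margin=60.0):
        self._data = {}                  # key -> (value, expires_at)
        self._loader = loader
        self._ttl = ttl
        self._margin = refresh_margin    # how close to expiry we refresh

    def get(self, key):
        now = time.monotonic()
        item = self._data.get(key)
        if item is None:                       # only the first read is slow
            value = self._loader(key)
            self._data[key] = (value, now + self._ttl)
            return value
        value, expires_at = item
        if now > expires_at - self._margin:    # nearing expiry: refresh ahead
            fresh = self._loader(key)          # real providers do this async
            self._data[key] = (fresh, now + self._ttl)
        return value                           # client always gets a cached value
```

Note the trade-off described above: nothing is ever evicted, so the cache only grows, and a key never requested before still pays the data-store round trip once.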
Now imagine we prepopulate the cache with everything: that could be a pretty neat idea if you've got a lot of storage, if memory is not an issue. But once we've prepopulated the cache, we need to keep it updated, and refresh-ahead, again, is one idea; but if we set a time-to-live of 5 minutes, that means there is a time span, a time window, in which the data won't be valid. When you start thinking in windows of time, and you have an issue with that, it's probably not the right way to address the situation. In that case, if you really want the window of inconsistency between the cache and the data store to be as small as possible, you should be event-driven: every time there is a change in the data store, it gets reflected in the cache. That's another level of complexity, but it works. For that we need an additional capability: streaming. We will have a streaming pipeline that gets the changes from the data store, and every time there is a change, it sets the value in the cache. On the application side, not only do we interact only with the cache, but the cache itself no longer fetches data from the data store: everything has been moved to the other component. It's now the streaming pipeline's job to get the data from the data store and into the cache. So it's another level of architectural complexity for another level of benefits. Let's see how it works. The cache itself doesn't need any logic anymore, because, as I mentioned, it doesn't interact directly with the database, so we can remove the SqlMapStore we created. Instead, there is the cache-ahead part: again, this is Java code and I don't want to delve into the specifics, but we've created a pipeline, and even if you are not a Java developer you can read it: I read from MySQL, I get an object called a change record, I transform the change record to a map, the map to JSON, and from the JSON I extract the ID, which becomes the key in the cache.
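The mapping steps just described can be sketched as a tiny pipeline. The real thing is a Hazelcast streaming job consuming MySQL change-data-capture records, so `change_stream`, `cache`, and the record shape here are simplified stand-ins; I'm assuming Debezium-style events carrying an "after" image of the changed row:

```python
import json

def run_pipeline(change_stream, cache):
    """Event-driven ("cache-ahead") sketch: consume change events from
    the data store and push them into the cache, so that neither the
    application nor the cache ever has to query the database."""
    for record in change_stream:        # e.g. CDC records from MySQL
        row = json.loads(record)        # change record -> dict
        after = row["after"]            # state of the row after the change
        cache[after["id"]] = after      # the ID becomes the cache key
```

Because the latest event for a key always wins, the cache converges to the database's current state as fast as events arrive.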
The rest is not super interesting: basically just configuration to say where we get the data from, so here, since SQLite is not made for that, I'm using MySQL: the host, the login, the password; and where we write to: a remote cache. For all that I've created a Docker Compose file, and you can see here I have my cache, a very simple one, again Hazelcast, but you can use anything you want; here I have my application, which I have dockerized; here I have the pipeline; and of course I have the database. Everything has been dockerized. So let's start this; it might take a bit of time, of course, so let's do some magic. I'm ready, I can start playing with my application. Since everything is dockerized, I don't need to start anything in the IDE; I can directly curl. So curl localhost:5000/1: good, I got it, and I wouldn't say blazingly fast, because I haven't measured it, but very, very fast. Let's ask for 2: as you can see, again very, very fast. If I get everything, it's very fast too, though I have an issue with the formatting. And lastly I want to POST, so let's post something; I'm on a fresh application, so I can put everything like this, and if I curl localhost:5000, it's already in the cache. That's our latest pattern: cache-ahead.

So here is a summary. In this talk I've shown you several patterns to interact with your cache. The first one, as all developers do, is cache-aside: you don't care about the capabilities of your caching provider, your application is the orchestrator, and it goes to the data store and to the cache for both reads and writes; it's your application's responsibility. Once you get comfortable with your caching provider, in general you will use read-through and write-through, because they remove the responsibility of this orchestration flow from the application: your application doesn't care about the cache anymore.
You use one data store, and in that case the cache can be seen as the data store. At some point you might be faced with performance issues, because you are blocking; if you don't have strong consistency needs, then you should probably check out write-behind. With write-behind, when you are hammered by several requests on the same entry, you defer the writes asynchronously and only write the last one, which makes your system much more resilient. If you don't like the fact that the data you return might be stale, and you value the fact that your application is fast, you know that every time the cache is cold you will need to go to the data store: once an entry has been requested and put into the cache, there will be slow paths again whenever that entry has been invalidated. So one idea is that once an entry has been put into the cache, when it's nearing its invalidation time, you refresh it asynchronously. In that case, once an entry has been put into the cache, there will only be fast requests, because you will always hit the cache: it will be a hot cache. That of course means you need memory available, and the problem is that it doesn't handle new data. Now, if you have a lot of memory and you want the window of desynchronization between the cache and the data store to be as small as possible, you can use cache-ahead: you remove everything from your application, you remove all the logic from the cache itself, and you create a pipeline that reads the changes directly from the database and puts them into the cache, so that the application actually considers the cache as the data store, and you know that this cache is very, very closely in sync with the data store. It's not completely consistent, because we are talking about distributed systems, so it can never be consistent in the sense you would think about with ACID, but the window of inconsistency is as small
as possible. So, thank you a lot for your attention. You can read my blog, you can follow me on Twitter, and this talk was based on a blog post I wrote, so if something is not clear, you can check the blog post. More importantly, if you want to check the code, everything is on GitHub, freely available, so please have a look, and if you have comments or issues or whatever, I will be very happy to read them. And if I got you somehow interested in Hazelcast, you can join our Slack or get some free training.