I am working as a software development engineer at Bundl Technologies, which is famously known as Swiggy. It is mostly into food delivery in India and is expanding into the hyperlocal delivery market. Today I will be presenting on a MySQL plugin for caching with Redis. This is the usual safe harbor statement, which basically states that whatever I am presenting today is for informational purposes only.

This is our agenda for today. First we will look at microservice architecture in general, then we will see how applications use MySQL and Redis, and then how we can use the plugin to enhance some of those use cases.

So let us get started with microservice architecture. What are microservices? Microservices are small applications, each for a defined use case, which interact with each other through defined API contracts and together solve a larger purpose. The benefits of microservices are that they are independently deployable and independently developed. If a new developer joins the project, it is very easy for them to learn one small service and get started. And if you have load on one part of the software, you do not need to scale your whole application; you can scale just the service that has more load, and your system will keep going. The last property is a small tech stack: each service uses only the stack it needs, which saves cost because you need fewer licenses, or you run the same software on fewer machines.

This is a common sample architecture for microservices. These are the small microservices interacting with each other, with the users interacting with the applications, and the database shared in the cases where it needs to be. In the case of MySQL, the database usually cannot be separated very much because of the joins.
So your data needs to remain together in most cases, though you can have a data access layer in between to optimize access. So if everything is fine so far, why do we need Redis and other caching mechanisms? As the application grows and scaling grows, the load on the databases keeps increasing. Databases do keep scaling, but they cannot handle the load after a certain point. Once that happens, we need some kind of modification in our data access layer to maintain the industry-standard response time of around 200 milliseconds. This is where Redis comes in. Redis sits alongside the data access layer, and your apps can interact with Redis directly or through the data access layer. What that means, I will cover in the following slides.

First, let's see how applications use MySQL and Redis. MySQL is one of the most popular database choices for applications, and Redis is an in-memory key-value data store. Redis can provide persistent storage as well, depending on the requirement, but that is one of the less common uses of Redis.

First, MySQL. This is the architecture diagram of MySQL. When a connection comes in, the client is connected after authentication and authorization. Whenever a query comes, it goes to the parser; from the parser it goes to the optimizer, which interacts with the storage engine to fetch the result, and the data returns along the same path to the client.

You can optimize MySQL in a few ways depending on the requirement: optimize your queries, keep optimizing your tables at regular intervals, use thread pools so that connection setup time reduces, write to the master and read from slaves, and divide your data according to the use cases so that no single slave gets all the read requests. And finally, deal with data that is old and no longer being used:
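The write-to-master, read-from-slaves split mentioned above can be sketched as a tiny query router. This is a hypothetical illustration (the class name, the crude SELECT heuristic, and the string stand-ins for connections are all mine), not Swiggy's actual data access layer:

```python
import itertools

class ReadWriteRouter:
    """Route writes to the master and spread reads across replicas (round-robin)."""

    def __init__(self, master, replicas):
        self.master = master
        self._replicas = itertools.cycle(replicas)

    def pick(self, sql):
        # Crude heuristic: anything that is not a SELECT goes to the master.
        first_word = sql.lstrip().split(None, 1)[0].upper()
        return next(self._replicas) if first_word == "SELECT" else self.master

router = ReadWriteRouter("master-db", ["replica-1", "replica-2"])
print(router.pick("SELECT * FROM customer"))    # replica-1
print(router.pick("INSERT INTO customer ..."))  # master-db
print(router.pick("SELECT 1"))                  # replica-2
```

A real router would also have to keep a session on the master after its own writes, since replicas lag; that consistency concern is exactly why this logic usually lives in one shared data access layer.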
Archive old data from MySQL to some archival storage, so that the data size stays small and queries stay fast.

A bit about Redis now. Redis is an in-memory database platform used for high-performance operations, and it supports data structures like strings, lists, hashes, sets, bit maps, and the list goes on.

Going through some use cases of Redis: the first is caching, through which we can provide low-latency responses. Caching can be done with any of the data structures mentioned here, and the application can cache the data directly. Then there is session data storage, for things like OTPs and active sessions. Redis provides a TTL with its operations, so the data expires after some time; your session data will expire automatically, which makes Redis perfect for session-class data storage. Then comes the rate limiter. Redis provides a basic counter structure, and if we use a TTL with the counter, Redis can act as a rate limiter. Because Redis is single-threaded, your counter will not have synchronization problems. You can say, for example, that you want to allow only 10,000 connections per minute: use the counter with a TTL of one minute to provide the rate limiting. Then comes the Pub/Sub model, the publisher-subscriber model, where different systems can subscribe to a topic, and whenever a publisher sends data to one of the topics, all the subscribers get the data. This model is quite fast and gives low-latency delivery.

Now, let's see a use case from Swiggy, which uses both MySQL and Redis extensively. Orders on Swiggy can go beyond 1 million on a daily basis, and one order life cycle requires multiple reads and writes. The actual data fetching starts when a user opens the app to order food.
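The counter-plus-TTL rate limiter described above (in Redis terms, INCR on a key with EXPIRE set on the first hit) can be sketched like this. The in-memory dictionary stands in for Redis so the sketch stays self-contained, and the class and parameter names are my own:

```python
import time

class FixedWindowRateLimiter:
    """Allow at most `limit` hits per `window` seconds, mimicking Redis INCR + EXPIRE."""

    def __init__(self, limit, window, clock=time.monotonic):
        self.limit, self.window, self.clock = limit, window, clock
        self._counters = {}  # key -> (count, expiry); in production Redis holds this

    def allow(self, key):
        now = self.clock()
        count, expiry = self._counters.get(key, (0, now + self.window))
        if now >= expiry:                  # TTL elapsed: the window resets
            count, expiry = 0, now + self.window
        count += 1                         # Redis: INCR key (EXPIRE on first hit)
        self._counters[key] = (count, expiry)
        return count <= self.limit

t = [0.0]  # fake clock so the example is deterministic
rl = FixedWindowRateLimiter(limit=2, window=60, clock=lambda: t[0])
print(rl.allow("ip:1.2.3.4"))  # True
print(rl.allow("ip:1.2.3.4"))  # True
print(rl.allow("ip:1.2.3.4"))  # False (third hit inside the window)
t[0] = 61.0
print(rl.allow("ip:1.2.3.4"))  # True  (window expired)
```

The single-threaded point from the talk is what makes the Redis version safe: INCR is atomic on the server, so concurrent clients cannot race on the counter.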
So the first request goes for the listing of restaurants, then the listing of the menu from a restaurant. When you create a cart, a write request happens for the cart. When you place an order, the order and its transactions are written. Then the order is shown to the restaurant, which confirms it; then the order is assigned to a delivery partner, and there is proper tracking at every step. So one order may involve maybe 50-plus writes and 100-plus reads. For 1 million orders, that becomes too huge to hit MySQL directly for every read and write. So much of the data that is reused is cached. The data of an order is constant through the order life cycle; similarly, the data of the delivery person, our delivery executive, is constant through the whole delivery life cycle for one order. So this data is cached for the various systems to request. For one service, the get/set operations can go beyond 100k per second at peak.

So what I did was a small timing of the operations on my local machine. I created a customer table with only three fields, id, name, and city_id, with id as the primary key, and I did 100,000 fetches from MySQL and 100,000 fetches from Redis. Redis was about 4 times faster than MySQL, because everything is in memory. That shows the timing for simple operations; but if your table has joins, or any operation that might go to a full table scan, Redis can be 100 times faster than MySQL.

From here the idea came to me. This is the basic application model of how data is handled: when a data request comes, the application first tries to fetch from Redis. If the data is found, the application returns it. If the data is not found, it fetches from MySQL, stores the result in Redis, and then returns the data.
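The cache-aside flow just described (check Redis, fall back to MySQL on a miss, populate Redis, return) can be sketched as follows. Plain dictionaries stand in for the Redis and MySQL clients so the sketch stays self-contained; the class and key format are illustrative, not the talk's actual code:

```python
class CacheAsideRepo:
    """Cache-aside: serve from the cache, fall back to the database on a miss."""

    def __init__(self, cache, db):
        self.cache = cache   # stand-in for a Redis client
        self.db = db         # stand-in for a MySQL connection
        self.hits = self.misses = 0

    def get_customer(self, customer_id):
        key = f"customer_{customer_id}"
        value = self.cache.get(key)
        if value is not None:            # cache hit: no database round trip
            self.hits += 1
            return value
        self.misses += 1
        value = self.db[customer_id]     # miss: read the source of truth
        self.cache[key] = value          # populate the cache for the next reader
        return value

repo = CacheAsideRepo(cache={}, db={1: "Asha", 2: "Ravi"})
repo.get_customer(1)                 # miss: reads MySQL, fills Redis
repo.get_customer(1)                 # hit: served from Redis
print(repo.hits, repo.misses)        # 1 1
```

This is exactly the boilerplate the plugin is trying to eliminate: every service that wants caching has to repeat this get/fallback/set dance.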
And this has to be done in each and every piece of code that stores data in Redis, and if Redis or MySQL is shared, it needs to be consistent everywhere. So what does MySQL itself do for caching? MySQL has a query cache, but it is deprecated in 5.7 and removed in 8.0, so MySQL does not provide any native caching now.

So this is a plugin I wrote, which is a POC for inserts only. What we do is store the caching rules in the MySQL database itself. When the plugin is initialized, all the caching rules are loaded into memory, and the connection to Redis is also created at plugin initialization. The rules are loaded into three data structures. One is a set in memory, keyed on schema name plus table name, so that we know whether an insert coming into a particular table of a particular schema needs any Redis operation at all. Then there is a hash keyed on schema name, table name, and column name, for when we want to store that particular column's data in Redis. And there is a hash keyed on rule id as well. That enables a particular feature: let's say I loaded one customer into the Redis cache, but I know that the customer belongs to the city of Singapore. As a next step, my application is going to need the Singapore city information in memory too, so its next operation would load that into Redis anyway. So why not load it directly when I am loading my customer data? For this purpose, the hash on rule id is used, as shown in later slides.

I am using the hiredis client, which provides both sync and async calls to Redis. Currently I am using sync calls; in the future I may come up with async calls after doing the benchmarking, as there will be cases for it. Currently the connections are persistent, so there is no delay in making the connection to Redis.
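The three in-memory lookup structures described above can be sketched like this. The field names follow the caching-rule table from the talk, but this is an illustration of the idea in Python, not the plugin's actual C++ code:

```python
from collections import namedtuple

CacheRule = namedtuple(
    "CacheRule",
    "rule_id schema_name table_name column_name cache_key linked_rule_id timeout")

def load_rules(rules):
    """Build the three lookups: a table set, a per-column map, and a by-id map."""
    cached_tables = set()   # (schema, table): does this insert need Redis at all?
    column_rules = {}       # (schema, table, column) -> rule
    rules_by_id = {}        # rule_id -> rule, used to follow linked rules
    for r in rules:
        cached_tables.add((r.schema_name, r.table_name))
        column_rules[(r.schema_name, r.table_name, r.column_name)] = r
        rules_by_id[r.rule_id] = r
    return cached_tables, column_rules, rules_by_id

rules = [
    CacheRule(1, "profile", "customer", "name", "customer_name", 2, 100),
    CacheRule(2, "profile", "city", "name", "city_name", None, 200),
]
tables, cols, by_id = load_rules(rules)
print(("profile", "customer") in tables)  # True
```

The set gives an O(1) "is this table interesting at all?" check on the hot insert path, so tables with no rules pay almost nothing for the plugin being installed.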
The connections are terminated when the plugin is deinitialized.

So this is the caching rule table: a rule id, schema name, table name, and column name, where the column name is the column whose data you want to cache. Then there is a cache key. Generally you could cache just on the id, but if different tables are cached on their ids, there will be conflicts; so you add some kind of prefix at the start of the id to distinguish between the different things you are caching. Then there is a linked cache rule id, which I will demonstrate in the next slides. Basically, once one rule is executed, you can say that you want to execute one more rule, and the rules can be chained together, so the chain keeps going until the link terminates. For example: you load a customer; the customer belongs to a zone, so you load the zone; the zone belongs to a city, so you load the city. One insert will load all three pieces of data into the Redis cache. The cache warmup column can be used for an initial warmup of the cache, when you are not actually inserting data but want to reload all the data into Redis. The timeout is in seconds; by default, if you specify the value minus one, the data will not be timed out. And there is a special column name, "all": if you want to cache the whole row of the table in JSON format, you can use the column name "all", and it will cache the whole row as JSON.

Taking an example, it is the same customer table I was using before, with random names, and this is the city table with id and name, and I have added some caching rules in the caching table.
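The linked-rule chaining described above can be sketched as a loop that follows `linked_rule_id` until it is unset. This is a simplified walk-through of the customer → zone → city example: for brevity the same row id is reused for every key, whereas the real plugin would follow the foreign-key value (e.g. the customer's city_id) at each hop:

```python
# Hypothetical rules: inserting a customer also caches its zone and city.
rules_by_id = {
    1: {"cache_key": "customer", "linked_rule_id": 2},
    2: {"cache_key": "zone", "linked_rule_id": 3},
    3: {"cache_key": "city", "linked_rule_id": None},
}

def keys_for_insert(first_rule_id, row_id, rules_by_id):
    """Follow the linked-rule chain, emitting one Redis key per rule."""
    keys, rule_id, seen = [], first_rule_id, set()
    while rule_id is not None and rule_id not in seen:  # `seen` guards against cycles
        seen.add(rule_id)
        rule = rules_by_id[rule_id]
        keys.append(f"{rule['cache_key']}_{row_id}")
        rule_id = rule["linked_rule_id"]
    return keys

print(keys_for_insert(1, 20, rules_by_id))  # ['customer_20', 'zone_20', 'city_20']
```

Note the cycle guard: since the links live in a user-edited table, rule 3 could accidentally point back at rule 1, and without the `seen` set the chain would never terminate.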
So basically rule one says: for the schema profile and table customer, cache the name with the cache key customer_name, with a timeout of 100, and there is a linked caching rule too, rule two. Rule two says: for the profile schema and city table, cache the name of the city with the cache key city_name, with a timeout of 200; this one does not have any linked rule attached to it. And there is a third rule, also on the profile schema and the city table, which says: whenever an insert comes, cache all the data of the row for this table with the cache key city_, with a timeout of 100.

The flow goes like this: you install the plugin, and then the plugin list shows the Redis cache plugin as active. It is currently written as an audit plugin; this is implemented on 5.7, and if it moves to 8.0 in the future, it might change to something more suitable. Then you insert into profile.city, say a city named "city 21", and you get the last insert id, which is 20. You go to Redis, search for keys starting with "city", and you see two keys: city_20 and city_name_20. city_20 gives all the data of the row in JSON format, and city_name_20 gives the name of the city.

This is a video which shows a demo of the things I presented just now, and also an example where I insert a customer name and it fetches and stores the customer's city as well, which demonstrates the linking of the caching rules. So I had no keys in Redis. Now I am installing the plugin, and it shows here that the Redis plugin is active after the install. First I insert into city with the name "city 22". I fetch the last insert id, which comes back as 22. Then you go into Redis, search for keys with "city", and you get city_22 directly from Redis.
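The insert-to-keys behaviour in the demo can be simulated end to end: given the city rules above, one notified insert on profile.city produces both the whole-row JSON key and the per-column key. This is a sketch of what the audit-plugin hook does, with a dict standing in for Redis (a real implementation would issue SET commands with TTLs over hiredis):

```python
import json

# The two city rules from the example slide (the per-column rule and the "all" rule).
rules = [
    {"schema": "profile", "table": "city", "column": "name",
     "cache_key": "city_name", "timeout": 200},
    {"schema": "profile", "table": "city", "column": "all",
     "cache_key": "city", "timeout": 100},
]

redis_store = {}  # stand-in for Redis

def on_insert(schema, table, row):
    """Mimic the audit-plugin hook: apply every matching rule to the new row."""
    for rule in rules:
        if (rule["schema"], rule["table"]) != (schema, table):
            continue
        key = f"{rule['cache_key']}_{row['id']}"
        if rule["column"] == "all":        # cache the whole row as JSON
            redis_store[key] = json.dumps(row)
        else:                              # cache a single column's value
            redis_store[key] = row[rule["column"]]

on_insert("profile", "city", {"id": 20, "name": "city 21"})
print(sorted(redis_store))  # ['city_20', 'city_name_20']
```

The point of doing this inside the server is that every client writing to the table gets the caching for free, instead of each service re-implementing the cache-population logic.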
The same happens with the city_name_22 key; it depends on the kind of data you want to cache. Then we insert into customer with the name "avisek test" and city id 21. Searching for keys with "city" now, city_name_21 comes into the list of keys. The value here is fixed because I have not implemented fetching the city data for that id yet. And we get the last insert id for the customer, and the customer name is cached as well. That was the demo. These are the references I used. Any questions?

On the timeout column: if you specify a timeout value, then after that timeout Redis itself provides this facility. You can set a timeout via a TTL for a cached key, and whenever that TTL expires, the key is automatically deleted by Redis. As for the minimum time, I believe Redis supports TTLs in milliseconds, so you can provide the values in milliseconds.

On the benchmark: yes, I ran the query multiple times, so the data can be in memory through the InnoDB buffer pool; the buffer pool was configured with a reasonable amount so that the data would be in memory at least. So this is the average over multiple fetches of the same set of data.

On invalidation of the cache: yes, that is for the application to decide. If the application says it cannot work with stale data, then the application will always go to the source of truth and ask for the data. But if the application says it can live with slightly stale data as long as it gets it in the fastest time, then it will fetch from Redis. Invalidation can be driven from the application side, but if we develop this plugin for updates as well (currently I have done only inserts), then whenever data is changed, it will automatically be updated in Redis. That is one path to the actual production use case. The rules are specified in a MySQL table, and the data is stored in Redis, the in-memory database.
Okay, so the timeout actually depends on how long you want to store it. Yes, it is Redis's job to invalidate the data after the timeout.