Hello, everyone. Hope you are enjoying your first full day at KubeCon North America 2023. My name is Madeline Olson. We're all here today to talk about how to build real-time, high-performing applications, and how you can do that with Redis. I'm hoping you all at least know what Redis is. Does anyone here not know what Redis is? Good. I will not be explaining it at all. None of the basics. We're only here for advanced stuff. Yeah. I'm here to talk about some basic concepts in caching, how you can extend those with Redis, and then some of the more advanced patterns I've seen people use. I said my name is Madeline. I'm a principal engineer at AWS, and I'm also one of the open-source Redis maintainers.

I want to start this talk with a bit of a warning, which is that a lot of people these days view latency as the new outage. A lot of people hear the quote that a slow application is the number one reason users don't come back. That doesn't necessarily mean you should put a cache there, which is what we see a lot of people do, and it doesn't always work great. So before anyone suggests putting a cache in an application, you should really make sure the data is cacheable: you can identify that piece of information with a key, you're doing a lot of frequent lookups on that key, and you're not constantly updating that data and having to re-ingest it. That's the first thing. The second thing is to make sure you're actually experiencing the problem on the read path. We see a lot of people trying to figure out how to put caches on the write path, and it generally doesn't work out well. You can do something like buffer writes, but that's not quite caching, and we see people start losing data. Don't do that either. Another reason people reach for caches is when they start hitting scaling limits, but if that's the problem you're having, you usually want to rethink how your architecture works; it's probably not caching. And finally, you need to be able to tolerate eventual consistency. So many people think, oh, we can just put a cache in front of our user APIs and it will work, and it doesn't. I'll talk more about making sure you're following these in a bit, but let's dive in, go over the basics, and talk about how Redis fits into this picture.

Most people, when they start caching, start with what's called lazy loading, or read-through. The idea is that you have a query against a backend database, and when you want that piece of information, you first check whether it's in the cache, like Redis; if it's not, you fetch it from the backend database and pull the data back through into the cache. So how does this data eventually get invalidated, which is to say, how do we handle it no longer being the most up-to-date source of truth? You typically set a TTL on it. Redis obviously has support for TTLs, but so do all the other major caching engines. One thing I want to highlight here is the independence of the read and write paths. This is often important for microservice architectures because they follow command query responsibility segregation (CQRS): you want independent writes and reads so they can scale independently. If you're just doing lazy loading, everything is on the read path, which is great. So what are some Redis-specific things you can use to make that read path better?
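Before moving on, here's roughly what that basic lazy-loading pattern might look like. This is a minimal sketch in Python with redis-py; the key naming, the TTL value, and the fetch_from_database helper are all illustrative assumptions, not anything specific from the talk.

```python
import json
import redis

r = redis.Redis(host="localhost", port=6379)

CACHE_TTL_SECONDS = 300  # hypothetical TTL; tune to how stale your data can be

def fetch_from_database(product_id: int) -> dict:
    # Stand-in for a real query against the backend database.
    return {"product_id": product_id, "name": "example"}

def get_product(product_id: int) -> dict:
    key = f"product:{product_id}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)          # cache hit: serve straight from Redis
    row = fetch_from_database(product_id)  # cache miss: go to the backend database
    # Write the result back with a TTL so stale entries age out on their own.
    r.set(key, json.dumps(row), ex=CACHE_TTL_SECONDS)
    return row
```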
So the most common thing we see people do is what's called a refresh-ahead pattern with Redis. The idea is that you can listen to keyspace notifications in Redis, which is a best-effort Pub/Sub notification mechanism, and it will notify you when a key gets deleted or expires. People use this so that when the key gets deleted, they know the key name, and from that they can refresh the data back from the backend database. This helps with tail latency, since you no longer pay for that first cache miss the way you do with lazy loading. The other thing we see people doing, which is starting to become more popular now, is client-assisted, sorry, server-assisted client-side caching. Traditional client-side caching usually involves keeping track of TTLs on the client, but now Redis supports directly invalidating the records the client has read. The way it works is that the client opts in to client-side caching; every time it reads a value, Redis keeps track of that, and when that value is no longer up to date, Redis sends a notification to the client, and the client can evict it. Some popular clients do this automatically, and most clients already support the fundamentals needed for it.

The last thing I'm going to call a caching fundamental is how to do invalidations. Invalidation is the path where we proactively kick stuff out of the cache. This is usually done in one of two ways: through a write-through caching pattern, where we go to the database and put everything back in the cache, or we just kick the entry out and let lazy loading repopulate the cache. The point I want to highlight here is that now we have to make changes on both the read and write paths. As we talked about, we like to have those separate, so these two systems might be built by different teams, and now they have to coordinate to keep the cache coherent. So those are the fundamentals. If you can do all that, you're doing caching more or less right. But there's nothing Redis-specific or too interesting in that.

What most people really want to be using with Redis are all the advanced data structures we have. Everything we've talked about so far uses just the string data type, and Redis strings are just binary blobs of data. You can stick compressed blobs in them, JSON blobs, SQL-result-set-like blobs, whatever you want; strings are great for that. But there are other big data structures Redis users lean on for caching. The hash is basically a key holding sub-key-value pairs. People use it to store sub-pieces of information, or they might store a large block of text as different components so you can fetch pieces individually. The next data structure is the set, which people use a lot for things like fraud detection. You might put bad actors in a set and ask, is this user in the set or not? Or do something like an allow list: a very quick check to see whether the user is in the set. The next data structure is what we call the sorted set, which is just a fancy way of saying an ordered set based on a number. This lets you rank the elements of a set.
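Going back to the refresh-ahead pattern from the top of this section for a second: a minimal sketch of listening for key-expiry notifications with redis-py. The notification flags, the database number in the channel name, and the key prefix are all assumptions, and delivery is best effort, so treat this as a starting point rather than the talk's implementation.

```python
import redis

r = redis.Redis()

def refresh_from_database(key: str) -> None:
    # Hypothetical helper: re-query the backend and repopulate the key
    # (with a fresh TTL) before the next reader takes a cold miss.
    ...

# Keyspace notifications are off by default; "Ex" enables key-expired events.
# (On managed services CONFIG SET may be blocked; set it in the server config.)
r.config_set("notify-keyspace-events", "Ex")

p = r.pubsub()
p.subscribe("__keyevent@0__:expired")  # database 0; delivery is best effort

for message in p.listen():
    if message["type"] != "message":
        continue
    key = message["data"].decode()
    if key.startswith("product:"):  # same illustrative key space as before
        refresh_from_database(key)
```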
The classic practical example is what we typically call a leaderboard, which is when you have a ranking of items. It's common in the gaming context, like, hey, who has the highest score, but much more practically it's used a lot to answer things like: what are the top 10 products being sold in the store today? The score is how many times a product has sold, and the member name is the actual product being sold. I'm not going to touch much on geospatial, but that's basically a sorted set where, instead of a plain score, it uses what's called a geohash, which lets you run pretty efficient queries like the nearest N neighbors to an element. What's great is that Redis is also completely open source, so there are a bunch of open-source extensions to Redis called modules, which add even more data structures.

So what do all these data structures have in common that makes them good for caching? Because Redis has other data structures too; it also has the list and stream data types. The ones above are great because they all have what I like to call declarative APIs, which is to say: hey, we're setting the value to be something. This is good because you often need to keep your cache consistent with your backend database. You don't want to worry about cases where sending the same API call twice gives you a different result, right? Because you're usually committing the transaction and then updating the cache, and you're kind of okay if you don't update the cache, but you're really in trouble if you update it twice. To be more concrete: when you're putting something in a set, you can only put it in once. If you try to put the same thing in the set twice, you'll get the same result. This property is called idempotency, which is really great for caching.

So let's walk through what these might look like in real workloads. I'm going to talk about this relational table for a bit, which is just: someone's ordering products, there's an order ID, there's a product ID that defines which product is being ordered, and some customer is buying it. When you're thinking about caching with Redis, you should think a lot about the use cases: how do you want to access this data? Say we have an example where we want to see whether a user has bought a given item before. We can use a set for that, right? We can ask, hey, is this in the set or not? So we can reimagine the data from this relational table as a Redis set. We're following Redis best practices, where you have a common prefix for all the same kind of data, and we're using the right data type. So looking at this, how would we actually construct this set? Because none of the mechanisms we have so far allow us to do this. We can't lazy load this information; we don't want to run a complex query against the relational table every time we request it. Maybe we can do something like write-through: every time the product gets ordered, we put it in the cache. But then if the cache becomes inconsistent, how do we refresh the data back? I want you to sit with that for a second. Let's talk about one other case. The benefit we had in that previous example is that all the data was still there, and if we tried to add something multiple times, we'd have idempotency protecting us. The other big use case I want to talk about is the leaderboard case.
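To make both use cases concrete, here's a sketch in Python with redis-py. The key names (customer:{id}:purchases, leaderboard:{category}) are illustrative assumptions, not from the talk; the comments flag the idempotency problem discussed next.

```python
import redis

r = redis.Redis()

# "Has this customer ever bought this product?" modeled as a Redis set.
def record_purchase(customer_id: int, product_id: int) -> None:
    # SADD is idempotent: adding the same member twice is a no-op.
    r.sadd(f"customer:{customer_id}:purchases", product_id)

def has_bought(customer_id: int, product_id: int) -> bool:
    return r.sismember(f"customer:{customer_id}:purchases", product_id)

# Leaderboard: top products by units sold, modeled as a sorted set.
def record_sale(category: str, product_id: int) -> None:
    # ZINCRBY is NOT idempotent: replaying it double-counts the sale,
    # which is exactly the problem discussed next.
    r.zincrby(f"leaderboard:{category}", 1, product_id)

def top_products(category: str, n: int = 10):
    return r.zrevrange(f"leaderboard:{category}", 0, n - 1, withscores=True)
```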
So let's say we want to know the top items ordered for a given category. We have a couple of items; in this case this is all a single key, not multiple keys, and we can see how many of each item has been purchased. We still have the same problems as the set case: we can't lazy load this information, and if we write through to it, we don't have the idempotency protection. Because if we want to, say, increment an item's count by one, and we actually send the command twice, the cache becomes incoherent with the backend. Just to reiterate: we do have the ability to do an idempotent add. In Redis, sorted sets have an idempotent add with ZADD, which sets the score of a member to a specific value. We would like to be able to use that, but we don't have a good mechanism for knowing what value to set. If there were some way to keep track of which increments we had already applied, we'd be able to solve this problem, because we could turn a non-idempotent operation into an idempotent one.

The meta point I'm getting at is that this sounds a lot like what the database is already doing. The database's job is basically to keep track of all the transactions and make sure they get applied in order, consistently. So it would be really nice if, instead of what we're currently doing, trying to keep the database and the cache in sync ourselves, we could just listen to what the database is doing, and every time it updates a record, it would let the cache know directly, and we wouldn't have to worry about it. It's already maintaining logs; basically all major databases use a write-ahead log to keep track of updates. Thank goodness for open source, there's already a project that does all of this, because we don't want to have to think about how to manage the different log formats of different databases. It's called Debezium. Debezium is, like, glue code: it knows how to listen to all the different databases, and it knows how to push out notifications about the updates from each database. The quintessential use case is pushing these into a Kafka topic, but this slide had Redis on it, so I wanted to use that instead. Those sinks are a little less common, but you can use all this stuff. And the great thing about Debezium being open source is that it supports most major databases, including cloud-native ones like Vitess.

So what does Debezium actually do? It listens in on the updates and produces update records like this. The core pieces of information here: there's a schema, so on every update it tells you what the schema looks like. You can turn this off, but it's important to make sure the schema isn't changing beneath you. And then you get a pre-image and a post-image. In this case this is a create record, so there's no before image; if this were an update, there would be a before image. But there is an after image, which is what the row looks like after the update. And what's really important here is that we have transaction IDs, so we know where we are in the stream, and we can know whether we've already applied this update to the cache or not.
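For readers following along, a trimmed, illustrative example of what a Debezium change event looks like. All values here are made up; the shape (before/after images, the source block, and "op": "c" for create) follows Debezium's documented envelope, and for MySQL the binlog file/position (or GTID) in the source block plays the transaction-ID role for tracking where you are in the stream.

```json
{
  "schema": { "comment": "per-event schema description; can be disabled" },
  "payload": {
    "before": null,
    "after": { "order_id": 1001, "product_id": 42, "customer_id": 7 },
    "source": {
      "connector": "mysql",
      "db": "store",
      "table": "orders",
      "file": "binlog.000003",
      "pos": 4521
    },
    "op": "c",
    "ts_ms": 1699999999999
  }
}
```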
For most databases that end up being sharded, like Vitess, we also get useful information like shard IDs, so we can fan out and use multiple Debezium processes to read the data. I've been talking mostly about materializing data into a cache, but this is also a great tool for doing invalidations. Instead of having to update the write path, you can just listen in on the database, see when your data is no longer relevant, kick it out, and lazy load it back through.

Okay, so I'd like to spend a few minutes showing you how this all works in practice. Let's hope everything works, because I originally had an online demo, but everyone was telling me the internet sucks, so I'm doing this all offline. What do I want to show first? So, Debezium, pretty straightforward. Everyone can see this, right? Yeah, cool. We have a very simple configuration that says: we're using a sink type, which is basically where we're dumping data into, and we're just using Redis and dumping it all in; and we have a source configuration, which is where we're pulling data out of, in this case a MySQL database with a very insecure password. Most people probably won't use Debezium Server; they'll use normal Debezium, which is more natively connected with Kafka. But for this demo I'm just going to use Debezium Server itself, which is the core piece of Debezium. I don't have Debezium running right now; I just have a MySQL database and a Redis instance running. We have the exact same table we started with earlier, and I have Redis here, and there's basically nothing in it. There's the CDC location, which I'll talk about in a little bit, but right now nothing else is going on. To start off, I'm going to apply the change stream file we just had, and Debezium is now going to kick up and start running. The other thing I need to do is actually start consuming that data, so this is us consuming it. You don't need to read all this, it's a lot of information, but we're pulling from these streams and doing something with the records.
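For reference, the kind of Debezium Server configuration described here might look roughly like the sketch below. Exact property names vary by Debezium version, and the hostnames, credentials, and table names are placeholders, so treat this as a shape, not the demo's actual file.

```properties
# Sink: where Debezium Server pushes change events (Redis Streams here).
debezium.sink.type=redis
debezium.sink.redis.address=localhost:6379

# Source: what we are capturing changes from (the MySQL binlog).
debezium.source.connector.class=io.debezium.connector.mysql.MySqlConnector
debezium.source.database.hostname=localhost
debezium.source.database.port=3306
debezium.source.database.user=debezium
debezium.source.database.password=changeme   # placeholder, obviously
debezium.source.database.server.id=1
debezium.source.topic.prefix=store
debezium.source.table.include.list=store.orders
```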
So we have that stream we're looking at, and now we can see we actually have data. This data was all pulled from the backend and put into our cache. I always get the casing of the commands wrong, but we can basically run the query that asks: hey, what are the top products we've ordered? But this isn't super interesting, so let's actually put some load on it. Now, not only are we materializing, we're also generating random data and filling this up. We're throwing random data at it, and we can go validate that it's actually staying in sync. Let's grab some random product ID and validate that this does actually work. Oh, because this is in Redis, that doesn't make sense; this is where I want to put this. So we have four items, and we can do this. Well, I promise this works; that's because there are records ongoing and I got a little unlucky. One other great property of this: I'm going to throw MONITOR up here, which shows basically everything running on Redis at the time, and you can see all these commands we've been talking about. And with Redis, if we go and just delete everything, it will start refilling it all. That happened a little faster than I was hoping, but I deleted everything, and we can show that eventually it all gets repopulated, because we're able to go back into the Debezium stream and resume everything.

There's a lot of glue code in this file, but there's really not that much actual code we need to write. We talked a little bit about how we're generating these CDC events, and all we're doing is parsing the messages that were generated and checking which operation it was. We talked about 'c' being the create event; MySQL specifically also has an 'r' event, which is a read event from a snapshot. This is for when we're loading data back from a snapshot, because the way Debezium works is it takes a full snapshot of the data set and then starts listening to ongoing changes. So you can turn this on on a running database, or set it up before you start the database, and it works the same. This also means that if you somehow get corruption and your cache becomes incoherent, you can basically restart Debezium, it'll take a new snapshot, and it'll catch all the way back up. And here's kind of all the important code, it's a little small: when we get a new record for the first time, we call SADD on it, set-add, and put it in, and duplicate records are handled gracefully. And the ZINCRBY case is when we're incrementing the sorted set score by one. I mentioned transaction IDs a little earlier; you can also handle those with Lua scripts. Every time you update, you can pass in: hey, this is the transaction ID I expect; is it the one you have? If it is, apply the update and bump the transaction ID. Cool, so back to the slides.

It's worth saying now that this is a generic mechanism; you don't have to be consuming from the database's stream. You could also put this as a write-ahead buffer before the database, which is a common pattern to let you batch writes into your backend database while still making sure the data has been durably stored. But the benefit of this approach is that you do have those decoupled reads and writes, so the applications can more or less iterate independently.
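The consumer glue code described in the demo might look something like this Python sketch. The stream name, the field layout of the stream entries, and the key names are all assumptions about how the sink serializes events, so adjust to your setup.

```python
import json
import redis

r = redis.Redis()

# Hypothetical stream name: Debezium Server's Redis sink writes one stream per
# table, named like a Kafka topic ("{topic.prefix}.{database}.{table}").
STREAM = "store.store.orders"

def apply_event(payload: dict) -> None:
    op = payload.get("op")
    if op in ("c", "r"):  # "c" = create, "r" = snapshot read
        row = payload["after"]
        # Idempotent: re-adding the same member to a set is a no-op.
        r.sadd(f"customer:{row['customer_id']}:purchases", row["product_id"])
        # NOT idempotent on replay; see the transaction-ID guard below.
        r.zincrby("leaderboard:all", 1, row["product_id"])

last_id = "0"
while True:
    # Block up to 5s waiting for new change events on the stream.
    for _, entries in r.xread({STREAM: last_id}, block=5000) or []:
        for entry_id, fields in entries:
            # Assumes the sink stores the event JSON in a "value" field;
            # adjust to however your sink serializes stream entries.
            event = json.loads(fields[b"value"])
            apply_event(event.get("payload", event))
            last_id = entry_id
```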
And you have a great way to scale the backend, because then all you really need to worry about is how you scale Debezium so it's pulling data uniformly across a bunch of database instances. A common use case we've been seeing is that a lot of people are starting to invest more in data analytics, because gen AI, I hear it's a thing, needs lots of data, so people are using this to push all of their data out of their database into analytics warehouses and other places where they can do machine learning.

The other great thing about doing this strategy with Redis is that it works well with how Redis does high availability. There are two main ways to do high availability with caching. The first is multi-writer: when you have a write, you write to, say, three backend caching nodes and respond back once two have written successfully. This means there are often very subtle incoherencies between individual nodes in the cache. Redis takes a different approach, where you only write to the primary, and the primary replicates verbatim what it has to the replica. That relates to the CDC location I briefly mentioned earlier: you're able to store where the cache is in the CDC stream being produced by Debezium, so that when the primary fails and a replica takes over, it knows where it is in the stream and can resume from that point.

Cool. So this isn't a panacea; as I said at the beginning, there are always trade-offs to everything we build. The main downside is that even though Debezium has really matured as an open-source project in the last couple of years, there's still a lot of assembly required to hook everything up and make sure everything works as expected. I've also glossed over some details: with Vitess we have to handle things like shard movements, and we have to handle transactions; if you send multiple different updates together in a transaction, you have to handle them as a block. There's a lot of little stuff you can still get wrong, and it still requires quite a bit of tribal knowledge to get right. But if you build up that expertise, you can apply it generically across a bunch of different applications. The other downside is that if you have a very write-heavy application, you're going to generate a lot of unnecessary data and do a lot of cache updates for no reason. Another big issue we've often seen: in caching you usually have a small working set; maybe you have 200 gigabytes of data and someone's only operating on 10 gigabytes. There's no good way to shrink this data set down and only care about what's hot, because everything is based on the previous updates. It's hard to know what the hot 10 percent of your data is without a deep understanding of your data and proactively working on it. Another downside we hear a lot: people often say it would be really nice to replicate a view of the database, as in, I want to create a virtual view and replicate that. But Debezium only operates on the raw updates, so there's no good way to do that without also materializing another read replica somewhere. There are ways to do that; they're just not fully fleshed out at the moment. The other main downside we've seen is that when you do have data failures and you need to restart the stream, you can't lazy load stuff in, so you tend to take a major outage while restoring data.
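Before leaving this section, here's a minimal sketch of the Lua transaction-ID guard mentioned in the demo and the HA discussion, in Python with redis-py. It assumes an integer transaction ID (real MySQL GTIDs would need a comparison suited to their format), and the key names are illustrative. Because the script's writes replicate to replicas along with the stored position, a promoted replica knows where it is in the CDC stream.

```python
import redis

r = redis.Redis()

# Apply a ZINCRBY only if this event's transaction ID is newer than the last
# one applied, and record the new position in the same atomic step.
GUARDED_INCR = r.register_script("""
local last = tonumber(redis.call('GET', KEYS[1]) or '-1')
local txid = tonumber(ARGV[1])
if txid <= last then
  return 0                                  -- already applied: skip the replay
end
redis.call('ZINCRBY', KEYS[2], ARGV[2], ARGV[3])
redis.call('SET', KEYS[1], ARGV[1])         -- advance the stored CDC location
return 1
""")

# Hypothetical usage: event with transaction ID 4521 sells one unit of product 42.
applied = GUARDED_INCR(keys=["cdc:location", "leaderboard:all"],
                       args=[4521, 1, 42])
```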
So that's the end of the advanced use case section, and now I want to talk a little bit about the best practices I've seen people employ as I've worked with folks using Redis, specifically in cloud-native environments. We see a lot of people running managed Redis; I am from AWS, and I do work on the managed Redis service, but Redis still works pretty well in Kubernetes deployments, and it's fairly cloud native. The best way to run Redis in a Kubernetes cluster is with cluster mode. A big gap we used to hear about a lot was that with the dynamic IPs of pods, cluster mode would get confused and IPs would be assigned to multiple nodes, but that's mostly been fixed with Redis 7. So if that deterred you in the past, it's working a lot better now. And we, we being the community, are still working to make Redis Cluster better with a new initiative we're calling cluster v2, which should make it a lot easier to scale in and out: adding new shards, removing shards, and so on.

The next thing I want to talk about is monitoring. In this theme of making sure you're doing caching right, it's very important to monitor your cache's behavior, because if you start having problems, they tend to escalate very quickly. Redis does not have built-in support for integrating with telemetry or metrics systems, but there are other open-source projects that bridge that gap. The main one I recommend is the Redis exporter for Prometheus. You can configure the exporter to provide the data from Redis INFO in a format that Prometheus likes, and you can adjust all of it. This is a Grafana dashboard I was showing last year to several folks at KubeCon. I assume everyone here is fairly well versed in how to measure cache performance, but the big things you want to be watching are cache hit rate, which is easy to do, and usage.

The next thing I want to talk a little about: a lot of people ask me why they can't just measure latency here. Redis recently added support for measuring end-to-end latency within a given command, with percentiles like P50 and P99, and I want to talk a little bit about that. This is a graph of what Redis's response latency looks like from the client side at different throughputs. You can't quite see the bottom, but the horizontal axis is a scale from basically zero requests per second to 160,000-ish requests per second, and the vertical axis is latency. You can see that the latency stays pretty flat: the bottom line, the maroon-ish color, is the P50 average, and it stays pretty flat until it gets to about 160,000 requests per second and then shoots up vertically. It doesn't stop at that line; we just can't measure it anymore, because the latency jumps toward infinity. This is because latency in high-performance systems grows with the square of the number of commands queued up, so Redis tends to start seeing latency increases around 95% of max throughput and then scales basically with the square as you approach the 100% limit. You really have no headroom for latency spikes, which is just to say you need to understand the limits of your system, because if you're waiting for latency to spike, you're already too late to respond to the crisis. And this is because caching, and distributed caching in general, including Redis, are what are called metastable systems.
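Before going deeper on metastability, a quick aside on the monitoring point above: the hit rate the dashboard tracks comes straight from the keyspace_hits and keyspace_misses counters in INFO stats, which are the same counters the Redis exporter surfaces to Prometheus. A minimal Python sketch:

```python
import redis

r = redis.Redis()

def cache_hit_rate() -> float:
    # Cumulative counters since the server started; for a rate-over-time
    # view, diff successive samples (or let Prometheus do it for you).
    stats = r.info("stats")
    hits = stats["keyspace_hits"]
    misses = stats["keyspace_misses"]
    total = hits + misses
    return hits / total if total else 1.0

print(f"hit rate: {cache_hit_rate():.2%}")
```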
There's a bunch of good writing on metastable systems; Marc Brooker, who's also an engineer at AWS, has written well about this. It's the idea that caches tend to be stable until they aren't. This graph shows, at a high level, the number of successful cache hits; once you exceed an offered load, which is the number of requests you're sending the system, the number of successful requests drops off, and once the system gets into this dropped-off state, it often can't get back to the good state. Once you start seeing errors in your cache, you'll often start doing retries, or you'll end up with a cold cache. I haven't talked much about AWS in this talk, but one thing AWS does really well is post-mortems of issues. A couple of years ago there was a major Kinesis event, and it was purely because of cold caches. Kinesis was expecting that when it got a new piece of information, it would put it in the cache and then in a queue, and once the item got to the end of the queue, it would go back and check the cache. But if the cache became cold, you would put something in the cache, put something in the queue, it would get processed a while later, and by then the cache entry would be cold and you wouldn't be able to process it quickly. So Kinesis just kept getting further and further behind. Which is all a very long-winded way of saying you should really be testing the failure modes of your caches.

There are a couple of recommendations we normally give to our customers at AWS, as well as open-source users. Make sure you're testing individual node failures; this is very easy in Kubernetes, just kill pods randomly and make sure everything's still good and stable. You should also be testing high-load scenarios. A lot of people only test failures, but that case where we have high latency is also a big problem. There's an open-source module, a plugin for Redis, called Redis fault injection, which allows you to artificially increase latency. The other big thing I would strongly recommend is to avoid multi-node dependencies, because they can exacerbate failures. We've seen a lot of this with Feast and other types of feature stores, where people have an offline primary store and an online cache of features. The way it works is they fan out and request a lot of data at once, and the latency of that fan-out becomes the tail: the slowest node in the cluster becomes the latency of the entire request. Although it's sometimes unavoidable, it can cause high latency. And lastly, you should periodically do what I did in the demo, which is just flush all the data and make sure everything comes back up. You might not want to do this on a prod cluster; if anything, that's the one thing I'd tell you not to do on a prod cluster. But you should be testing it, and you should understand what happens, so that when something does happen, you're prepared for it.

So that's everything I had today. I'm so happy I'm almost perfectly on time. You can follow me on Twitter if you want, or what's formerly known as Twitter. All the code was posted here. If you can, please take some time to scan the QR code and give feedback; it's really important for the organizers here to understand what you all want. If this was completely wrong, give the feedback and tell them it's all wrong, or if you want something else, if you want more gen AI talks, scan the code.
So I've been told that if you have questions, you can ask them now for a little bit. If you have a question, you should go up and ask it at the microphone, or if you want to go to the bar crawl, you can go do that too.

Hey, thanks for sharing. I have two questions. First, on Kubernetes, regarding performance: do you recommend putting your computation pod on the same node as the Redis pod?

I typically recommend not doing that, because you can get into a situation where, if you end up scaling them independently, you'll have inconsistent cache performance. A lot of stuff works at small scale if you co-locate them, and once they split, you start seeing weird performance patterns. So I normally say, by default, keep them away from each other, so you can make sure it still works as expected for that non-uniform access. But there are plenty of people who definitely run sidecar pods of Redis so they scale together, and if you're going to do that, that's fine; just make sure you don't run into the situations where you have that mismatch.

Okay, thanks.

Hi, thanks for the talk. I'm an SRE who has recently inherited a whole bunch of huge Redis nodes, and we're working through a lot of tech debt to get up to date, working on getting to Redis 6, excited for Redis 7, thank you for the work you've done on that. My question: one of the hardest things we've found to deal with, coming from the Kubernetes world where we have a control plane, is that Redis is a little different, you know, Redis is a cluster of peers. Will some of the things coming with cluster v2 help with that?

Yeah, so I didn't want to spend too much time talking about that, but that is the idea: we really want to give Redis clusters a control plane. Normally you have to basically add a peer, reshard the data over, and set the slots, and we really want to get to the point where you can say, hey, I want seven shards with an even slot distribution, and there will be some operations where the cluster will be able to scale out itself: it will add the shards, move the data over, and it happens automatically, without you having to go do all of that manually. So that is a stated goal. I didn't link it, but if you reach out, I can share the issue with you if you want to add your input, because this is exactly the kind of information we want, if there's anything specifically painful you're seeing. But yeah, we are hoping to solve those types of problems, absolutely.

Yeah, thank you so much.

Cool. Well, if no one else has any questions, I'll be up here, I'll chill out. I think there might be beer in the main hall, but I'll be here for a little bit anyway. Cool, and thank you all for coming.