Of course, I'm really happy to be here at FlossAsia, and we're going to be talking about databases at scale. This isn't just about Redis and Postgres, both of which are great databases; it's about how we have to think about databases when we're operating above certain thresholds, how that changes as our databases get big and busy, and also how Redis and Postgres are very different from each other. I have a bunch of experience with Postgres, and also a bunch of experience with Redis. I'm not going to go into it too much here for lack of time, but I've worked with Postgres for 24 years, and I've probably worked with Redis for six or seven years. As I say, they're both good databases, I like them both, but they're also very different, and there are a bunch of misunderstandings people have about both of them when it comes to scale. I wanted to thank, in particular, Adjust, where my case study came from. I used to work there, and we actually moved a bunch of Redis stuff to Postgres specifically for scaling reasons. Also, when I worked at Delivery Hero, we had cases where we were looking at Redis and Postgres together in certain environments, asking when it is appropriate to use each. That was really helpful in developing my ideas on this. And of course, I'm representing OrioleDB here; this whole slide deck and presentation came out of a collaboration I started with them some months ago, where the question was: what can we do to make Postgres able to replace Redis in some cases? I wouldn't say you're ever going to replace it in all cases, but that gets into the question of what's good about each system and what's bad about each system.
So for our agenda today: we're going to talk a lot about databases at scale, then we'll have a quick tour through both Redis and Postgres in terms of their basic architecture, and we'll talk about how they are affected by volume and velocity of data, because both of those can impact scalability in both systems, so we'll look at them carefully. We'll discuss the case study and talk about why, at Adjust when I was there, we moved all these Redis systems into Postgres; it'll become very clear as we go through the architectural side why that was the case. We'll talk about some common solutions using one or both together, and finally some general recommendations. So first, databases at scale. There are a large number of things people believe about running databases at scale that tend not to match the experiences we actually have when we get there. Of course, different levels of scale are different, and we'll talk about that. But people sometimes say, and I've heard this many times, that Redis is faster, that it scales more easily, et cetera. This may or may not be true depending on what exactly you're doing. Similarly, if you're at a point where you have to scale out Postgres, there may be solutions that can help you with that, or you can probably afford to hire people who know what they're doing to make it happen. That's the overall motivation of this talk, and the question is how we use the tools we have in these cases. The first thing I really want to drill home is that how we use databases changes as they get big and as they get busy. When I started working with databases, I started with toy databases less than a gigabyte in size, and eventually, through my career, I've worked on Postgres databases that were 170 terabytes.
In these different environments, you end up thinking about your data and your database very differently, and so where we start in our careers and where we can end up as we scale can be very different. I'm going to go through a few size tiers here, and we're going to talk about how, every time the database grows by a factor of 10, some of our considerations change. First tier: a database under one gigabyte. Who cares about indexing? Who cares how efficient your SQL is? Who cares what the planner does? You write a query and it comes back almost instantly unless you do something really stupid, because all the data fits in memory and you can sequentially scan through a gigabyte of data really fast. So at one gigabyte, we typically want to focus on learning how to do things correctly: make sure our data is properly normalized, and all the basic things we would learn from a purely mathematical perspective. We can look at the database here as just a math engine and nothing else. When we get to 10 gigabytes, storage starts to matter a little, because not all the data is going to stay in memory unless you have a lot of it. You're typically going to want to think about indexes, because even if everything is in memory, sequentially scanning through 10 gigabytes of data takes noticeably longer. And if you're writing a really crazy reporting query that, say, processes 5 gigabytes of data 20 times in order to produce one report, that's going to perform badly, and you're going to have to start thinking about how on earth to write it better. So query efficiency starts to matter here a little, and indexes start to matter a little, but storage is usually not a big concern yet. When you get to 100 gigabytes, storage starts to matter a bit more.
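The talk doesn't show code, but the point about indexes kicking in around this size can be sketched concretely. The example below uses SQLite purely so it's self-contained; the same reasoning applies to Postgres, where you'd use `EXPLAIN` instead of SQLite's `EXPLAIN QUERY PLAN`. Table and index names here are made up for illustration.

```python
import sqlite3

# Self-contained sketch (SQLite stands in for Postgres): the same query
# goes from a full-table scan to an index lookup once an index exists.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, payload TEXT)")
conn.executemany(
    "INSERT INTO events (user_id, payload) VALUES (?, ?)",
    [(i % 1000, "x") for i in range(10_000)],
)

def plan(sql):
    # EXPLAIN QUERY PLAN is SQLite's rough analogue of Postgres's EXPLAIN;
    # column 3 of each row is the human-readable plan step.
    return " ".join(row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

q = "SELECT * FROM events WHERE user_id = 42"
before = plan(q)                     # a sequential scan of the whole table
conn.execute("CREATE INDEX events_user_id_idx ON events (user_id)")
after = plan(q)                      # now an index search
print(before)
print(after)
```

At a gigabyte neither plan matters much; at 10 gigabytes and up, the difference between these two plans is the difference between milliseconds and minutes.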
You need to be pretty good at indexes by the time you get up here: you need to understand what an index is, what it's doing, and how to make it work, and you're probably going to have to start tuning your database for performance. At 10 gigabytes, what tuning do you really need to do? At 100 gigabytes, you're probably going to have to tune some things. At one terabyte, we've typically gotten really good at indexing already; we don't have new indexing problems. This level is much more about storage and access patterns, and we have to start thinking in terms of these lower levels and facing problems that were easier at smaller sizes. For example, I usually find that if your database is 100 gigabytes, certain approaches to backups work very nicely, but when you get to a terabyte, they become really difficult, and you may have to switch backup tools or rethink how you're doing backups entirely. A lot of these administration things get much harder when you go from 100 gigabytes to one terabyte. So what happens when you go from one terabyte to 10 terabytes? By the time you get to 10 terabytes, you're typically having to reason about the internals of the database in almost everything you're doing. The database is no longer a black box; it is a very transparent set of working equipment that you actually have to operate as such. So internals matter a lot, and you'll run up against them. And by 10 terabytes, there are no longer new kinds of problems to face as you grow, because you're already up against the limits of your hardware and your software. So 10 terabytes to 100 terabytes is the same set of problems. Basically, early on we think of databases as black boxes.
Internals don't matter, we don't really have to worry about performance tuning, queries are straightforward, et cetera. But as things grow, we have to think very carefully about internals, work around the hardware, think about the hardware; a lot of these other things become really important. So let me go quickly through the case study: why we moved some systems from Redis to Postgres. This was a big ad tech environment. We had a large number of very big Redis servers, somewhere around 20 if I remember right. Each server was probably running 10 instances of Redis, and it was a really complicated setup with Nutcracker and Sentinel. I guess with newer versions of Redis you have the clustering support, which might help a little, but it was a massive headache. It got to the point where we couldn't work on the servers with reasonable confidence, because we weren't sure that if Sentinel failed over, it would fail over all the right things with Nutcracker and so forth. Touching that was terrifying. So the question is how we got there and why we moved it. Another major reason to move, besides the administrative headache, was that Redis hardware is more expensive than Postgres hardware: Redis has to have everything in memory, while Postgres puts everything on disk. The other big issue was that this big Redis infrastructure was brittle. To quickly describe the history: the first minimal proof of concept they wrote, way back when they started, was in MongoDB and Node.js. Then they discovered MongoDB didn't scale, and neither did Node.js, so they switched to Redis and Golang, and that worked for a while. Then they moved some things from Redis to Aerospike and some things from Redis to Postgres, and eventually we got rid of most of the Redis and moved almost everything to Postgres.
So let's talk a little about Redis's internals, so it makes more sense why we did this move. Redis is a main-memory database, but it has a very specific architectural decision that makes it go really fast on small workloads: it is a single-threaded event loop. There is no concurrency, and there is no ability to run queries in parallel on the same Redis instance. Newer versions can do disk I/O with threads, but they still cannot serve requests with threads, because that is all one critical section. What this means is that if you saturate that event loop, it just cannot go any faster. Redis goes one speed only, and you cannot throw hardware at it to make it go faster. Persistence is optional, which becomes important if you're doing queues and such, and replication is more or less similar to Postgres in the sense that there's effectively a binary stream of changes that gets distributed and applied. It is also possible to write Lua scripts. I haven't done much of this myself, but as far as I understand they run inside that same event loop, so a long-running script can starve everything else. I mentioned persistence; the big thing about replication is that you're replicating writes, and those also have to go through the single-threaded event loop. So if you saturate an instance with writes, you will saturate every replica with writes, and writes compete with reads: a single write will compete with reads across every replica of the same Redis database. So: no parallelism, easy to saturate, things like that. Typically, as things grow with Redis, you set things up with Nutcracker; I guess the newer clustering support should make some of this a little easier.
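To make the single-threaded point concrete, here is a minimal sketch of a Redis-style command loop. This is not Redis code, just an illustration of the architecture: every command from every client funnels through one thread, so execution is strictly serial and one slow command delays everything queued behind it.

```python
import queue
import threading

# Sketch of a single-threaded event loop over an in-memory dict
# (illustrative only; real Redis is far more sophisticated).
commands = queue.Queue()
store = {}
results = []

def event_loop():
    # The ONLY thread that touches `store` -- like Redis's main loop,
    # commands are applied one at a time, in arrival order.
    while True:
        cmd = commands.get()
        if cmd is None:            # shutdown sentinel
            break
        op, key, *val = cmd
        if op == "SET":
            store[key] = val[0]
            results.append(("OK", key))
        elif op == "GET":
            results.append((store.get(key), key))

loop = threading.Thread(target=event_loop)
loop.start()

# Any number of "clients" can enqueue concurrently, but no amount of
# extra cores makes the loop itself process commands faster.
commands.put(("SET", "a", 1))
commands.put(("SET", "b", 2))
commands.put(("GET", "a"))
commands.put(None)
loop.join()
print(results)
```

Adding CPUs helps the clients enqueue faster, but throughput is capped by the one loop thread: that is exactly why a saturated Redis instance cannot be rescued with bigger hardware.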
The issue is that every replica has to hold the full dataset in memory, so when you try to scale, you're paying for that memory again on every box. Typically, what I've had to do in the past has been Sentinel plus Nutcracker, and that gets really complicated really quickly. Now, quickly, Postgres; I assume people here mostly know it relatively well. It's multi-process, not multi-threaded: every connection gets its own backend process. This makes it much slower to start up and gives it much higher latency, but you can run many more queries in parallel and you can scale it up easily within the same box. If you have to scale it out, though, you have to write your own tooling or use systems that may not be built for what you're doing; many people end up writing their own tooling in those cases. So it scales up; it doesn't scale out so well. Postgres is persistent by default. It is possible to more or less make it non-persistent, for example by throwing things on a RAM disk, but don't do that. Replication is tied to the persistence, so if you get rid of persistence, say with an unlogged table, or by putting data on a RAM disk, it won't replicate. And you have many options for replication: there's logical replication, we heard about Spock, and you can build some really complicated things there. So compared to Redis, Postgres is slower per core and higher latency, and that is never going to change, because the architectural decisions mandate it. But Postgres scales up much more easily and much more cheaply than Redis does. At scale, Postgres is just a lot simpler to manage. If you're centralizing everything, you may find it much easier to put things on Postgres in many cases, but at massive scale you've still got to write all of your own sharding pieces. So costs are usually less; I've mentioned that.
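As one concrete illustration of Postgres covering a Redis-like role, here is a sketch of a key-value table with an expiry column that mimics Redis's TTL. SQLite is used only so the example is self-contained; the schema and the `ON CONFLICT` upsert carry over to Postgres (which uses `%s` placeholders rather than `?`, and where you'd also index `expires_at` and purge expired rows with a periodic `DELETE`). The table and function names are made up for this sketch.

```python
import sqlite3
import time

# Sketch: a Redis-like key-value store with TTL on a SQL database.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE kv (
        key        TEXT PRIMARY KEY,
        value      TEXT NOT NULL,
        expires_at REAL          -- NULL means no expiry
    )
""")

def set_key(key, value, ttl=None):
    # Upsert, like Redis SET; ON CONFLICT ... DO UPDATE also works on Postgres.
    expires = time.time() + ttl if ttl is not None else None
    conn.execute(
        "INSERT INTO kv (key, value, expires_at) VALUES (?, ?, ?) "
        "ON CONFLICT(key) DO UPDATE SET value = excluded.value, "
        "expires_at = excluded.expires_at",
        (key, value, expires),
    )

def get_key(key):
    # Expired rows are simply filtered out at read time.
    row = conn.execute(
        "SELECT value FROM kv WHERE key = ? "
        "AND (expires_at IS NULL OR expires_at > ?)",
        (key, time.time()),
    ).fetchone()
    return row[0] if row else None

set_key("session:42", "alice", ttl=3600)
set_key("greeting", "hello")
print(get_key("session:42"))
```

Unlike Redis, nothing here evicts expired rows automatically, which is exactly the kind of thing you have to write yourself when you lift a Redis workload onto Postgres.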
I'm going to skip over the scenario again; this was the case study. It had massive amounts of Sentinel, and the complexity of dealing with failover was very high. Redis can work very well as a cache. There's even a project that will read your logical replication stream from Postgres and put it into Redis as a cache, which gets rid of cache invalidation problems; where you need that, it's great. Redis also has time-to-live (TTL), which is really helpful, so where you can leverage TTL, Redis can in fact be a really good alternative. Postgres can also work as a key-value store; there are tons of ways of doing this, and you can store a lot of other things besides. Replacing Redis as a queue, though, with the list types and pushing and popping, is not as good a fit: if you're running a queue off of Postgres, you need to write it to run a queue off of Postgres. You're not just going to lift and shift. So, a couple of closing points: Redis and Postgres are different, and when you're at scale, you need to make sure you have a variety of tools. They both have easily readable code bases, and it's easy to understand how they work. Typically I find Postgres much easier to manage at scale, but there may be many cases where you might still push something out into Redis. One case I found really helpful in my career was authentication tokens on a website: the time-to-live can do a lot of the work for you, the workload was read-mostly with seldom writes, and it had to be managed that way. So, I've just gone over my recommendations, and I'm open to questions.

Audience: I liked what you mentioned about Postgres not handling queues. Can we mix Postgres with Redis and let Redis manage the queue?

To be absolutely clear, it's not that you can't build a queue on Postgres; I've done it too.
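For what "building a queue on Postgres" typically looks like, here is a sketch of the claim-one-job pattern. On Postgres you would claim with `SELECT ... FOR UPDATE SKIP LOCKED` or an `UPDATE ... RETURNING`, so concurrent workers never grab the same row; SQLite (3.35+ for `RETURNING`) is used here only to keep the example self-contained and single-connection. Table and column names are invented for the sketch.

```python
import sqlite3

# Sketch of a claim-one-job queue table (SQLite stand-in for Postgres;
# on Postgres, add FOR UPDATE SKIP LOCKED so workers don't block each other).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE jobs (
        id      INTEGER PRIMARY KEY,
        payload TEXT NOT NULL,
        claimed INTEGER NOT NULL DEFAULT 0
    )
""")
conn.executemany("INSERT INTO jobs (payload) VALUES (?)",
                 [("send-email",), ("resize-image",)])

def claim_job():
    # Atomically mark the oldest unclaimed job and return it in one statement.
    row = conn.execute("""
        UPDATE jobs SET claimed = 1
        WHERE id = (SELECT id FROM jobs WHERE claimed = 0
                    ORDER BY id LIMIT 1)
        RETURNING id, payload
    """).fetchone()
    return row  # None when the queue is empty

first = claim_job()    # (1, 'send-email')
second = claim_job()   # (2, 'resize-image')
third = claim_job()    # None -- queue drained
print(first, second, third)
```

Note how much machinery this is compared to Redis's one-line `LPUSH`/`BRPOP`, which is the "you're not just going to lift and shift" point above.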
The issue is that Redis gives you the list data type, where you can just push and pop. That doesn't handle persistence very well, so people usually just keep it in memory. For an in-memory queue, Redis is going to be the easiest thing to use, and I've worked in environments where that was split off into a Redis instance running on the same system as Postgres. If you're trying to build something that can store more information, and you don't want to worry about memory limits, or you need persistence, then I probably wouldn't do the queue on Redis; I'd do it on Postgres.

Audience: You mentioned some experience with OrioleDB. Could you elaborate on some improvements you might see coming?

Yeah. What we're building at Oriole is a flash-optimized storage plugin for Postgres. The Postgres data table format was built around the idea that sequential reads are a lot cheaper than random reads; if you optimize for flash instead, random I/O is much less of a penalty, and you can start to do things like block-level compression and some other relatively nice things. What I've seen, at least in the benchmarks we've put together, is multiple times the throughput if you're running on flash storage with transactional workloads. It may still have some benefits on spinning disks, but those benefits will not include speed; it would probably be slower there. I would also say the compression is really helpful, and that by itself is something I'm really excited to see.

Audience: Do you have a place to start looking at how to do a queue in Postgres?

There are a couple of implementations of this. There's of course PGQ, which was originally, I think, part of SkyTools. That's one possibility. I also wrote a simple implementation called PGMessageQ.
It's not built for high loads or anything; it was designed to answer the question of how you send an email from the database: queue it and have something else handle it. And then there's a project I created called PG Titanites, which I haven't really completed, but it's on GitHub and you can take a look at it. It was based on work I did on a big life-sciences database, for really big, heavy queues and high workloads. It can probably be improved at some point, because the code base was built around Postgres 9.4, but those would be the three things I would look at. I would probably start with PGQ, because it's the most mature. And that's all the time we have for the...