I want to thank everyone for coming. It's been a long conference, but a pretty good one, and this is the last slot. Before I get into the main part of the talk, one thing I like to do specifically at Postgres conferences is share two little things I keep in my psqlrc file that aren't super well known.

The first one is ON_ERROR_ROLLBACK set to interactive. Here's a hypothetical situation: you have a big table, you're doing some manual maintenance on it, you run a big update, and you're doing it in a transaction, sure, but it takes, let's say, four or five hours to finish. Afterwards, you want to see if you did it correctly, so you do a little `select * form users`, and Postgres hopefully lets you know that it's supposed to be "from" and not "form". That's an error, so it blows away your entire transaction and all four or five hours of computation. A little secret: it's not so much a hypothetical situation as something that happened to me. After that happened, I started looking around for how to avoid it. One of my colleagues said, "Oh, you should use savepoints." Well, I'm not going to remember to use savepoints every single time I run a statement. After a lot of digging (it was actually kind of hard to find in the manual), I learned that if you set ON_ERROR_ROLLBACK to interactive, psql will set up a savepoint every time you run a command. The "interactive" part means it only does this for interactive psql sessions, so if you're piping a script through, you still get the proper behavior: it will blow up on exceptions and such. But in interactive mode, you save yourself. Now when I mistype commands, which I still do all the time, my transaction is still safe, and I can commit and continue on.

The second one is a little more well known, I think, but every time I use it I'm just so happy it exists. `\x`, as you're probably aware, switches output from a regular table to having the columns go one per line across several lines. `\x auto`, which I think came in 9.2 or 9.3, will automatically pick the right format for you, so you don't have to re-run queries and switch back and forth. So that's a little PSA I like to give Postgres audiences, because those two things have helped me a lot, and if you're unfamiliar with them, I hope they can help you too.
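To make those concrete, here's roughly what the two settings look like in a psqlrc file. This is a minimal sketch; the comments are mine.

```sql
-- ~/.psqlrc
-- Wrap each interactively typed statement in an implicit savepoint, so a
-- typo doesn't blow away the surrounding transaction. Scripts piped
-- through psql still abort on errors as usual.
\set ON_ERROR_ROLLBACK interactive

-- Let psql choose between normal and expanded (one column per line)
-- output based on how wide the result set is.
\x auto
```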
So, a little background: I work on Heroku Postgres, which is a very large Postgres-as-a-service offering. I've been there a little over four years, and in that time I've seen some terrible things; what people do to their databases is just bad. But from watching a bunch of customers grow their data sizes and grow their applications bigger and bigger, I've seen a lot of different situations. I've been able to put some of that back in, by directing Postgres development and features we've sponsored. And being in a good position to see so many different customers and what they do with their data is what let me put this talk together.

When I was putting the talk together, I was reminded of a story I heard several years ago. It really resonated with me when I heard it, so I want to take some time and share it with you. Back in World War II, the Allies would send out planes on bombing missions. Some of them would come back, but some would get shot down; the ones that came back, they would send out again, and again some would get shot down and some would come back. So they started looking at where the returning planes had been shot, and it turned out they weren't getting shot all over the plane; the holes clustered in certain areas. So they started reinforcing the armor on the spots where the planes were getting shot, which seemed like a reasonable thing to do. But it turned out that didn't help the recovery rate of these planes: about the same percentage was getting shot down even with the extra armor. It took a little while, but what they realized was that out there, the planes were getting shot uniformly across the entire body. The spots where the returning planes had no bullet holes, such as where the wing connects to the body, were exactly the places where a hit would cause the plane to crash. Armoring the parts where they saw bullet holes was exactly the wrong thing to do; they needed to armor the other spots.

So in this talk, I'm going to talk about why you might want to keep your data size small. A lot of the exciting stuff happening in Postgres and other databases is focused on bigger and bigger data sets, but I think there are things you can do to, metaphorically, armor the other parts of your system: pay some attention to keeping your data sets smaller, rather than only to the effects of what happens when they get large. A quick overview of what we're going to cover: why small data might be important; how to avoid growth in the first place; if things have started to grow, how to stem the tide and slow that rate; and finally, what to do when you do end up with a large data set. Many other talks at this conference, at other conferences, and online focus on that last part, so I'm only going to touch on it briefly; the focus will be on the other areas.

So, why small data? I don't want to give the wrong impression: Postgres can handle a lot of data. But in many cases it's happier, it performs better, and things are easier when there is less data. Also, when I was making this talk, I found a lot of pictures of baby elephants, and I figured that's like small data, so they'll be here through the talk. One of the basic wins is that when you have smaller data, it's much easier to do backups, much easier to test the backups, and much easier to restore the backups. Upgrades are easier, because you can just do a dump and restore; you don't necessarily need pg_upgrade. And that's nice because, unfortunately, today when you upgrade with pg_upgrade you can't turn on the newer data-checksums feature, since that has to be enabled when the cluster is first initialized. Small data makes all of that easier.
Another thing: when you can keep your data set in RAM, when your machine is big enough or your data set is small enough that RAM covers all of it, things are really fast, because any time you go to disk, it's orders of magnitude slower than the filesystem cache and the Postgres buffer cache. Even if you get crazy hardware like Fusion-io drives, there's still nothing like having everything in RAM.

But the real reason is that maintenance on larger data sets is a pain. Take adding a column with a default. This is one of my gripes with Postgres: it makes adding columns so easy, but when you try to do it with a default, you don't really know that's a problem until the first time you take down your production system. Then you learn, because Postgres has to rewrite the entire table to fill in that default. Adding indexes becomes harder too: they take longer and longer to compute and build, and the larger indexes take up more storage space. Changing data types gets really tricky; there are a couple of data types you can switch between while keeping the same on-disk representation, but a lot of changes do require a rewrite. It's also harder to create followers, that is, read replicas: taking a base backup of a very big system takes much longer, and because it takes so long, the amount of WAL that has to be replayed before the replica becomes consistent keeps growing as well. And being able to create lots of standbys, we've found, is instrumental in having resilient systems.
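To make the first of those maintenance pains concrete, here's a minimal sketch. The table and column names are hypothetical, and this describes the Postgres versions that existed at the time of this talk; Postgres 11 later made adding a column with a constant default cheap.

```sql
-- This form rewrites the entire table to stamp the default into every
-- existing row, holding an exclusive lock the whole time; painful on a
-- big production table:
ALTER TABLE users ADD COLUMN plan text DEFAULT 'free';

-- A gentler pattern: add the column without a default (a cheap catalog
-- change), set the default for future rows, then backfill in batches.
ALTER TABLE users ADD COLUMN plan text;
ALTER TABLE users ALTER COLUMN plan SET DEFAULT 'free';
UPDATE users SET plan = 'free'
 WHERE id BETWEEN 1 AND 10000;  -- repeat over successive id ranges
```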
One aside here that could be an entire separate talk is the idea of immutable infrastructure. Something that works tremendously well for us is that once we provision a server and put a database on it, we try not to mess with it after that point. Instead, when there has to be a new operating system or new whatever, we bring up a new read replica and do a quick changeover. When you can create these read replicas much faster, it's much easier to work around problems. For our customers with lots and lots of data, it's all automated, of course, but it just takes longer.

And the truth is, once you start accumulating data, it's hard to stop. Like this little elephant: he's not going to give up that bottle. The accumulation seems to take on a life of its own. So if you can start the fight early, start it today; the sooner you start, the sooner you win. And I have one more here, with the elephant fighting a hose.

So let's get into the first area: how do you avoid growth? What are some things you can think about, to perhaps look at your systems a different way and avoid the growth before it happens? One strategy that my team and I have been developing recently is, whenever you look at a new system or an existing one, take some time to think about the data flux of that system: how much data per second is coming in, how much is going out, how much is accumulating.

For example, let's say you have a table where each row is about a kilobyte, and you're getting 1,024 inserts a minute. That isn't screamingly fast, but it adds up to a megabyte of storage per minute, and if you play that out over time, it ends up being about half a terabyte a year. Half a terabyte, that's fine. But two years from now it's a terabyte; four years from now it's two terabytes. If you can think about your systems this way, you can plan ahead: how big is this going to be in the future? When are things going to start falling apart? What will the problems be, and so on.

One thing that stands out from looking at a lot of real-life systems is that a lot of the data is garbage. I don't think we really need much of the data we store. You might think about your own systems, "Oh no, in my systems all the data is important." But, I don't know, is it? I'm going to go through a couple of instances to bring it home.

A big one I see over and over again is soft deletes, or paranoid deletes. This is where, instead of actually deleting the data, you add a timestamp column recording when it was deleted. This comes from a very good desire: if a user accidentally deletes data, you can recover it for them, if not automatically, then with a support ticket and some work. And if your code goes haywire and starts deleting things, nothing is actually deleted, so you can recover. But the problem is that the data just sits there and grows forever. And everywhere in the application, you have to add "where deleted_at is null". Sure, you could hide that behind a view, but it's still always there, and the data just grows and you don't notice it until it's too late.
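As a sketch of what that pattern looks like in practice (the table and column names here are hypothetical):

```sql
-- Soft delete: "deleting" a row just stamps a timestamp; nothing is
-- ever actually removed, so the table grows forever.
ALTER TABLE users ADD COLUMN deleted_at timestamptz;

UPDATE users SET deleted_at = now() WHERE id = 42;

-- ...and every query in the application has to remember the filter:
SELECT * FROM users WHERE deleted_at IS NULL;
```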
Another thing I see fairly commonly is people putting their logs right in the database, in a logs table. That's very convenient, but again, it's going to grow unbounded. The good thing is that it's fairly easy to prune as time goes on, unlike the logs table's evil cousin, the events table. The events table is a lot harder to prune, because it might be there for compliance reasons, or other parts of the system start to rely on the event stream being there and consume it programmatically, so it's harder to just drop. And the next one doesn't really fall into these other categories; it's just old data, data that's been around in the system for a couple of years. No one's really looking at it, but there's no built-in way to remove it, so it just sits there and builds up against the walls of your database like plaque, encroaching in and restricting everything else from being nice.

The most surprising thing, though, is that I've seen cases where garbage data is actually the central point of the app, which is sort of paradoxical. Often this manifests as lots of little tables, a small users table, a small accounts table, and then one table that is just orders and orders of magnitude bigger than everything else. An example: we had one company whose app was a messaging service, where you could send messages to other people. They had a small users table and then this giant messages table of all the messages, which makes sense, but it's never going away; it's just accumulating and accumulating forever. If you take a step back and look at the usage, none of their customers are going back years, or even months, to look at old messages. If those were deleted, or moved to some archive service, the data would be much, much smaller, everything would be much snappier, and everyone would be happy. Instead, this one table just grew forever and ever.

So with those sorts of things in mind, what are some strategies for slowing the growth? If you can't prevent it entirely, how do you at least buy yourself some more time? Again: actually delete stuff. One of the big things that makes this easier is having a way to do background jobs in your application. It used to be less common, but most apps these days have a built-in mechanism for background workers and async jobs. If you can do the deletes in that background system, you can finish responding to the request really quickly and get to the work of cleaning things up after the fact.

The thing that fights against this strategy, again, is often compliance or other regulatory concerns. One pattern that works pretty well there is an archive table: you delete rows from the real table but put them in the archive table. The data is still in the database and still growing there, but you know your main application code isn't accessing it, and it gives you a staging place from which to, say, create a CSV and put it on S3 or whatever other file storage you choose. If you follow this pattern of putting the data in a temporary place, you still have the peace of mind that should something go wrong, you can get it back really quickly, but you also have a staging place from which to stick it somewhere else. (I found this picture and couldn't find a relevant place to put it, but it's a cute baby elephant eating bananas.)
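A minimal sketch of that archive-table pattern (the table names are hypothetical): rows get moved out of the hot table in a single atomic statement and stay queryable until they're exported somewhere else.

```sql
-- An archive table with the same shape as the live one.
CREATE TABLE messages_archive (LIKE messages INCLUDING ALL);

-- Move everything older than 90 days in one statement: the data-modifying
-- CTE deletes from the live table and feeds the deleted rows straight
-- into the archive.
WITH moved AS (
    DELETE FROM messages
    WHERE created_at < now() - interval '90 days'
    RETURNING *
)
INSERT INTO messages_archive
SELECT * FROM moved;
```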
Another common source of this sort of data is temporal, or time-series, data, and a really good strategy for dealing with it is table partitioning. pg_partman is a great, battle-tested open-source tool that manages this sort of thing. Unfortunately, I don't have a whole lot of personal experience running it in production; I've mostly just toyed around with it. Instead, we recently spun up a service that does this sort of partitioning, but we didn't need all the knobs and whistles that pg_partman has, so my colleague Maciek did his own partitioning; I'm just calling it Maciek rotation. The first time I heard about this sort of thing, I thought, oh, this seems pretty complicated, but I want to show you it's actually not too bad. The idea is to use Postgres table inheritance to have a parent table and child tables, and to create a type that lets you bookkeep the identity of those child tables and the timestamp at which each one starts handling new data. You create a couple of functions: one to add a partition, which looks at the parent table and creates a new child table covering the correct time range, and one to drop a partition, which just drops a child table. The way it actually works is that the parent table has an insert trigger that knows about all these child tables and the current time, and puts each row in the proper spot. Then you wrap all of this up in a manage function that you tell how many tables you want ahead of time and how many you want to keep around for forensics or operations, and it uses those other functions to create new tables and drop old ones. The really nice thing about having the data in separate child tables is that you avoid a lot of the bloat problems of one table that is constantly getting rows added and deleted: when a child table's time has passed, it's simply dropped, and you don't have to worry about bloat or vacuuming it at all.
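Here's a minimal sketch of that kind of inheritance-plus-trigger rotation, compressed down to a single child table to show the moving parts. All the names are hypothetical, and a real version would create and drop children from the bookkeeping functions described above rather than by hand.

```sql
-- Parent table: the application only ever inserts into this one.
CREATE TABLE observations (
    observed_at timestamptz NOT NULL,
    payload     jsonb
);

-- One child per day; the CHECK constraint lets the planner skip
-- irrelevant children (constraint exclusion).
CREATE TABLE observations_2015_07_01 (
    CHECK (observed_at >= '2015-07-01' AND observed_at < '2015-07-02')
) INHERITS (observations);

-- An insert trigger on the parent routes each row to the right child.
CREATE FUNCTION observations_insert() RETURNS trigger AS $$
BEGIN
    IF NEW.observed_at >= '2015-07-01' AND NEW.observed_at < '2015-07-02' THEN
        INSERT INTO observations_2015_07_01 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'no partition for %', NEW.observed_at;
    END IF;
    RETURN NULL;  -- the row was routed; don't also insert into the parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER observations_partition
    BEFORE INSERT ON observations
    FOR EACH ROW EXECUTE PROCEDURE observations_insert();

-- Retiring old data is then a cheap, bloat-free drop:
-- DROP TABLE observations_2015_07_01;
```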
So again, let's say those other strategies didn't work, and your application's data is growing and growing. When you already have a lot of data, there are still a couple of things you can do to deal with it. There's a lot of content out there, and also solutions like Citus Data, one of the sponsors here, for doing really nice horizontal scaling. There's a lot of common wisdom around dealing with large data sets, so I'm just going to touch on a few things.

Again, I don't want to be flippant here or state the obvious, but one solution is to delete the data. It's simple, but it's not easy. Once the data is already there, it takes time to put together a strategy to get rid of it. Often you have to talk with a lot of people in your company, a lot of the product side, because by now your product depends on all this data being there. An example of this: at Heroku, when you push new code, or change a configuration, or install an add-on, any of these sorts of things, we create a new release for your app. What's really nice about that is when you have a problem with some code, you can roll back. Let's say you're on release 100 and there's a problem with it: you can roll back to 99, and that creates a new release 101 with the same contents as release 99. The problem is that we store all of the releases forever. And because each of these releases has a copy of the config vars, and because we encrypt them, TOAST can't do much with those values anymore. So this table is just huge, and it's becoming a problem over time. I've been advocating that, yes, you're probably going to roll back a couple of releases, but you're never going to roll your app back to a release from a month ago; that's really not going to work anymore. If we were able to delete that data and only keep the working set in Postgres, we could drop hundreds of gigs from that table. But this is the product as it stands: it's all documented that you can see all of your releases forever. So making this sort of change requires documentation changes and communication with existing customers, because, albeit in a small way, we're changing how the product works, and that gets tricky. Again, it's much better if you can address this early in your product's life cycle.

Another obvious approach is to scale up: get a machine with more RAM and faster disks. That actually buys you a surprising amount, especially if you're on real hardware, where you can get some pretty beefy machines, but it only takes you so far.

If you're in one of those situations where you're doing soft, paranoid deletes, one thing I see surprisingly rarely is Postgres's partial indexes. For those of you who aren't familiar with them, one of the really great features of Postgres is that when you create an index, you can give it a WHERE clause, and the index then only covers rows where that predicate is true. So if you create your indexes with "where deleted_at is null", not only is the index much more applicable to your data, it's also much smaller and takes up less space, and that helps a bunch.

There are a couple more things you can do to delay the inevitable. Indexes like that, on tables that change a lot, can themselves accumulate bloat. We talk a lot in Postgres about table bloat, but we don't give much attention to index bloat. And unfortunately, while Postgres has CREATE INDEX CONCURRENTLY, and now more recently DROP INDEX CONCURRENTLY, it doesn't have a REINDEX CONCURRENTLY. So we have a small script; it's open source, you can use it, or just look at what it's doing. It follows a simple strategy of creating a new index concurrently, renaming it, and dropping the old one, so in one step you get a nicely compacted, de-bloated index.
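A sketch of both of those tricks (the index and table names are hypothetical). The first statement is the partial index for the soft-delete case; the rest is the rebuild-by-hand dance that stood in for REINDEX CONCURRENTLY at the time (Postgres 12 eventually added the real thing):

```sql
-- A partial index: it only covers live rows, so it's smaller and matches
-- the "WHERE deleted_at IS NULL" that every query adds anyway.
CREATE INDEX users_email_idx ON users (email)
    WHERE deleted_at IS NULL;

-- Rebuilding it without locking out writes: build a fresh copy
-- concurrently, drop the bloated one, and rename the new one into place.
CREATE INDEX CONCURRENTLY users_email_idx_new ON users (email)
    WHERE deleted_at IS NULL;
DROP INDEX CONCURRENTLY users_email_idx;
ALTER INDEX users_email_idx_new RENAME TO users_email_idx;
```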
That script was written for one of our internal customers. Our database service has lots of external customers, but the other teams at Heroku also use us, and they're much worse customers, because they know where we sit and can come right up to us. So we wrote this to help some of them out. Again, there are lots of resources out there for vacuum tuning. I forgot to put it on this slide, but I have a link on a later slide to a bunch of queries you can use to analyze your databases for bloat and tune autovacuum. There were also some really great talks earlier at this conference on tuning for this, so I didn't want to re-cover them; those resources are out there.

But as I said, all of this just delays the inevitable. If you have an ever-growing database and you can't stop it and you can't slow it, you're going to have to shard. And the thing about that is, it's not easy. It's a real pain to implement sharding late, after the fact. All too often I see it with customers: we say, "Hey, you're going to have to shard," and they say, "Oh yeah, okay." Their data keeps growing; "you're going to have to shard"; "yeah, yeah." And then things start falling apart, and we say, "You have to shard," and now it's "oh, we can't possibly." So go think about the data flux of your system, look at it, and figure out how much runway you have. If it's ever-growing, you should probably start now.

There are some good resources out there. One of the things that has made sharding significantly less painful is the addition of the foreign data wrapper in Postgres, so you can connect out to other databases. This heroku-pg-extras shown up at the top here is the thing I should have mentioned earlier. It's a nice bundle of things meant for Heroku Postgres customers, so you can run it against your database and see things, but at its heart it's basically just a bunch of SQL queries you can run on any existing Postgres database. If you take a look through the source, you'll find some nice queries for investigating how your system is doing. One of them addresses this: while foreign data wrappers are very nice, they're terrible to set up, because you can't just say "give me everything on that other database"; you have to create a foreign table for every single table. So there's a hideous SQL statement that generates another SQL statement, which you then run on your first database. It's a huge hack, but it works.
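In the same spirit, here's a sketch of that kind of SQL-that-writes-SQL: run against the remote database, it emits one CREATE FOREIGN TABLE statement per table, which you then execute on the local side. The server name and schema are hypothetical, and note that Postgres 9.5 later added IMPORT FOREIGN SCHEMA, which makes this hack unnecessary.

```sql
-- Generate CREATE FOREIGN TABLE statements for every ordinary table in
-- the remote database's public schema.
SELECT format(
         'CREATE FOREIGN TABLE %I (%s) SERVER remote_db OPTIONS (table_name %L);',
         c.relname,
         string_agg(
             format('%I %s', a.attname, format_type(a.atttypid, a.atttypmod)),
             ', ' ORDER BY a.attnum),
         c.relname)
  FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
  JOIN pg_attribute a ON a.attrelid = c.oid
 WHERE n.nspname = 'public'
   AND c.relkind = 'r'
   AND a.attnum > 0
   AND NOT a.attisdropped
 GROUP BY c.relname;
```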
The other strategy, which I think is one of the better ones if you have enough preparation up front, is the approach Instagram used. If you search for "Instagram Postgres sharding," they've written up a number of blog posts on how they did it, with specifics, and the way it works, I think, is pretty great. When they had just one Postgres server, they created maybe 2,048 schemas and treated each schema as a separate logical database, even though they all happened to be on one Postgres instance. Then, when that database got too big, they created a read replica, turned it into a fork, and on one side deleted schemas 1 through 1,024 while the other kept 1,025 and up. That way they were able to split one database into two, and then split those again, and so on, as any one of them got too hot. If you do that up front, you have a really good path forward when it becomes time to shard. I thought that was a really great approach.

And the last sort of sharding I want to talk about, I don't have a good name for, and because I don't have a good name for it, I haven't been able to find other people doing it, though I'm sure they are. So if any of you know what it's called, I'd really appreciate it. The idea is this: when I look at some of the systems I've built, and some of our customers' systems, I often see two kinds of tables. One is a slow-moving table. It doesn't change very often, but it holds very important data; this is your users table or your billing table. It doesn't change all that often, but if you were to lose that data irrevocably, your business is probably not a business anymore. The other kind is a fast-moving table: data is being added to it all the time, data is changing on it all the time, but any given row in there really isn't that important. If something happened to that table, it'd probably be a big pain, but something you could recover from. When you have both of these tables in the same database at the same time, it makes things difficult. You conflate the importance of the two. It's usually the fast-moving table that causes problems and outages on your system, and it's hard to separate the attention you need to pay to each. So one of the things we've started doing over time is moving these fast-moving tables out, either to their own database, so the app just has two databases, or even out to their own systems.

As an example: the main app that manages our Postgres fleet started out as just one app and has now been split into a couple more. One of those splits was for doing observations on the fleet. Every 15 to 30 seconds, we look at each database and each server, ask it some questions, do some health checks, do some diagnostics, and store that data. The table of these observations just grows and grows and grows, but we really only ever need the last two or three rows per resource, per Postgres, per server. We'd sometimes get into sticky situations; we once saw an anti-wraparound vacuum on it that had been running for three days while we were getting close to wraparound, which is pretty scary. We'd just truncate that table, because we don't strictly need the data; it's only nice to have. So what we've done now is put that into its own service, and that's where the table rotation I showed came from: making that system really good at this sort of high-rate data. It's not to be taken lightly; we've been working on it for quite some time, and it's hard to split something out after the fact while keeping everything running. But being able to look at your database and think about these two kinds of tables, looking at it through that lens, is a pretty good way of deciding what to split out.

So, to recap: we talked about why you'd want to keep your database small and how Postgres behaves better at smaller sizes, how to prevent growth, how to slow it in those cases where you can't prevent it, and then how to deal with larger data sets as they grow. And again, if the data set is large, dealing with those problems sometimes is inevitable; sometimes that's just how it has to be. But I think a lot of the time it amounts to putting armor on the bullet holes you can see, rather than on the underlying spots that actually need it. Thank you very much. I'll be around after this.

[Audience] You've got stuff you want to get rid of, and you say no one's using it, but then you get challenged on that. So to what extent would you collect more data, to prove that data is not being used, so you can point to it and say, look, the last time this stuff was accessed was never?

Yeah, so the question, just to repeat it, was: what about strategies to store more data to keep track, so you can point and say this data has never been used?
That's one of the reasons it's harder to start dealing with this after the fact, once the data is there; it's much easier to fight it early. But in those cases, it's definitely a good strategy. An easy way is to just add another column, a last_accessed_at timestamp. That is more data, but not that much more. One thing I actually did do, not for deleting data but to kill a feature: we have this thing where you can give read-only access to your semi-technical folks, and they can write reports and import the CSVs into Google Drive or whatever. One of the features there is exporting to an Excel spreadsheet, but we already have CSV, and I didn't think anyone was using the Excel path, so I added a bit of tracking for who downloaded what. And that does help: it takes you from the hand-wavy argument to concrete data.

[Audience] I was thinking of something even simpler. Take the text-messaging example: every month, just keep track of the oldest message that was accessed during that month. Really, the argument you want to make is about everything past a point, not about individual records. So you can keep it really simple, really small, and say, okay, everything past this point is garbage because it hasn't been accessed in two years. Done, and there's your proof. And at the same time, you're not increasing your update volume and creating bloat.

Right, yeah, that's a good strategy. Another thing I've found helps in these sorts of negotiations is that other stakeholders react poorly to "let's just delete it." If you can say, "let's archive this and put it in a system where we can get it later," even though in practice you'll never get it and it'll just sit there, that usually helps as a sort of consolation.

[Audience question, partly inaudible, about tables with very many columns.]

Unfortunately, I don't have a good answer for tables with lots of columns; I haven't run into that myself, so I'm sorry, I don't have an answer there. I can definitely appreciate that situation, but our customers usually start out as green-field apps, so while I know that situation exists, I just haven't been in a position to observe it. But I can definitely see how that's a problem. All right. Thank you.