Okay, so we are ready for the next talk. Thank you for being here. Our next speaker is Stephen Frost from Crunchy Data, and he will talk to us about finding and fixing slow queries. Give Stephen a warm welcome, please.

All right, so I'm Stephen Frost. I'm the Chief Technology Officer at Crunchy Data. I'm one of the Postgres committers and also a major contributor. I've hacked on a bunch of different stuff; the latest was working on GSSAPI encryption in v12, which is pretty awesome. But I also did row-level security, column-level privileges, the role system, and contributions to other stuff.

All right, a couple of things that I like to say about community. Join postgresql.eu. It's based in France, but they allow membership for anyone in the EU. I'm not sure what that means for that funny island over on the upper left, but whatever. Technically it's not limited to Europe, or it's not limited to the EU; I don't know. Anyway, also follow planet.postgresql.org. And if you don't want to join PostgreSQL Europe, or you're in the US, or even if you're not in the US, you can join postgresql.us; we don't have those restrictions, and I don't know why they do.

All right, so let's talk about finding slow queries. The first big thing I like to bring up when I talk about finding slow queries is logging. Logging is a really big deal, and doing log analysis is really important when you're working with Postgres. Then we're also going to look at viewing active queries with pg_stat_statements, which is a really, really helpful tool. So we're going to talk first about finding the queries that are slow in your system, and then about what you can do to fix them and improve the situation.

Inside your postgresql.conf you have a whole bunch of different configuration options, and we're going to work through each of these and talk about them. I'm happy to take questions as we go. If I feel like we're running out of time, I may be short in my answers, but please feel free to ask questions as we go.

log_min_duration_statement. The first thing is, let's just get these queries into the log file. If you set log_min_duration_statement to zero, you're going to log every single query that comes into the system, which is great for query logging. However, there's a downside: on a highly active system, the time taken to log a query can often be much longer than the time required to run the query itself, and that creates a backlog. So if you're doing this on a really, really high-rate system, don't set it to zero; set it to something above zero. Now, I've had people comment on numbers I've thrown out around this in the past. Apparently, one prior time I was giving this talk, I said 60 milliseconds is a good threshold. 60 milliseconds, in my opinion, is going to give you queries that are actually slow, whereas setting it to one or two milliseconds is usually way too much on a high-rate system. That said, I've been told that 60 milliseconds is apparently too aggressive for some people in some environments, because they have lots of queries that take 60 milliseconds or longer, so it really depends on your environment. One of the big things about log_min_duration_statement is that it puts the duration on the same line as the query, which makes parsing and analysis of the log file a lot easier.
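As a minimal sketch, that setting might look like this in postgresql.conf; the 60 ms threshold is just the illustrative starting point discussed above, not a universal recommendation:

```
# postgresql.conf -- thresholds are illustrative; tune for your workload
log_min_duration_statement = 60ms   # log any statement taking 60 ms or longer
#log_min_duration_statement = 0     # log everything; avoid on high-rate systems
```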
Related to that, I say: don't enable log_statement or log_duration, because if you enable those, the duration ends up on a different line from the actual query text, which makes it really obnoxious to try to figure things out. So this is what you'll see as a result here: a very simple pg_sleep statement, and you can see we have a duration of about 1,000 milliseconds.

The next thing is that every time Postgres logs a query, or logs any line at all, it includes what's called a log line prefix. The default log_line_prefix in Postgres is pretty terrible, so I encourage you to amp it up. This example, as I recall, comes from pgBadger, which is a tool we're going to look at in a little bit; it generates reports based on what goes into the Postgres log file. The whole point here is to give you things like a timestamp, a process ID, who the user was, what the database was, and so on, so that you can actually parse this information out and do analysis like queries per user or queries per client. And this is an example of what you'll see when you start adding a log_line_prefix like that into your log files.

One of the other things that Postgres does a lot of, that's really important to Postgres but can also impact query performance, is checkpoints. So you always want to be looking at why your checkpoints are happening, and when they're happening. You can do this by enabling log_checkpoints. This causes Postgres to log a whole bunch of information about each checkpoint, including when it started, when it finished, and a whole bunch of stats about it: how much it wrote, how many transaction log files were involved, how long the fsyncs took. One of the things that can be really interesting to look at is whether you're getting a lot of slow queries around the time of a checkpoint, in which case maybe there's a correlation there, or maybe there's no correlation at all and it doesn't matter. Checkpoints are going to be more meaningful on a high-write system than on a high-read system, because checkpoints are all about writing out data.

Connection logging is really useful to have. So log your connections and log your disconnections. Here you can see a connection received, connection authorized, and a disconnection, so you can see how long a connection has been around. One thing I want to mention: if your connection logging shows constant connect, disconnect, connect, disconnect, that's not a good sign. The actual startup cost for a Postgres connection is pretty high, so you really want to be using some kind of connection pooling to make sure you keep reusing the same connections to Postgres. One of the connection poolers that I really, really like and encourage is PgBouncer. If you're not familiar with it, I encourage you to look at PgBouncer as an option for your connection pooling. It has something called transaction-level connection pooling, which is really fantastic.
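Pulling those together, here's a sketch of the logging setup discussed so far; the prefix is the format I recall pgBadger's documentation suggesting, so treat it as illustrative:

```
# postgresql.conf -- logging setup sketch
log_line_prefix = '%t [%p]: user=%u,db=%d,app=%a,client=%h '
log_checkpoints = on
log_connections = on
log_disconnections = on
```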
One of the other things that can slow your queries down is lock waits. If you're blocked behind a lock because somebody else is holding a lock on something you need, that's going to make it look like your query is running slowly, even when there's no problem with the query itself; you're just blocked, waiting on a lock. Postgres has a really cool way of showing you this with an option called log_lock_waits. Now, this will not log every lock that's taken. What it does is, after one second of waiting on a lock, put into the log file all of this really useful information saying, hey, we're still waiting on this lock after a second. That one second comes from deadlock_timeout; that's the length of time we wait, because after waiting for a second we have to run what's called the deadlock detector to make sure there aren't any deadlocks in the system. When we do that, if you have log_lock_waits enabled, we also log a bunch of information about that wait. So this won't help with short-lived lock contention, but it will help with anything that waits more than a second. You could potentially reduce deadlock_timeout to less than a second. I don't encourage that, simply because running the deadlock detector is a little bit expensive in its own right; not super expensive once after a second of waiting, but be aware of it. And don't increase deadlock_timeout either: that won't make deadlocks stop, just to point that out. I have seen this. I've seen people with deadlock_timeout set to 60 seconds, and I'm like, that's not going to make them go away. That's not how that works.

All right, another thing that can happen while a query is running: temporary files get created. Logging these can be really useful because, guess what, when we start writing out temp files, in many cases that's an indication that work_mem isn't set high enough, or we aren't able to use as much memory as we'd like, so we're having to write to disk instead, and writing to disk is expensive. Setting log_temp_files to zero means you log every time Postgres uses a temp file for some kind of operation. So in this case, we were doing a sort; that sort, because the table was pretty big, ended up spilling out to disk, and here's the information about that.

log_autovacuum_min_duration gives you lots of information about autovacuum. If you set it to zero, it logs every time autovacuum ran and how much work it did. In general, running autovacuum can have some impact, but you need to run autovacuum. Don't not run autovacuum. People sometimes turn it off; they're like, well, I don't need this, it's just some background process. No, you need it. There are a number of things autovacuum does that actually help query performance, including dealing with dead tuples and bloat, but also things like updating what's called the visibility map, which is really important to keep up to date if you want index-only scans, for example. So I don't recommend disabling autovacuum. Instead, set the duration to zero and start looking and asking: is this really what's causing a problem in my environment? Most of the time, it isn't.
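A sketch of those three settings in postgresql.conf; the zero thresholds mean "log every occurrence," as discussed:

```
# postgresql.conf -- sketch
log_lock_waits = on               # log waits longer than deadlock_timeout
#deadlock_timeout = 1s            # the default; raising it won't stop deadlocks
log_temp_files = 0                # log every temp file created, with its size
log_autovacuum_min_duration = 0   # log every autovacuum run
```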
All right, so that's a quick rundown. Now I'm going to do a little thing that I like to do that's fun, and we'll see if it works; it depends on the internet, and no one ever knows if that works well. But here's an example of what you get from a pgBadger report. You get things like the number of unique queries, the number of overall queries, query duration, first query, last query, query peak. You get all of these really cool graphs, things like queries per second, and it lets you do some amount of drill-down, where you can start to look at specific queries, and you can also look at things like the number of connections into the database. All right, here's a... yup, look at that, it worked. The internet is not dead yet. So here you can see connections per second, for example, and simultaneous sessions. Checkpoint activity, like I was saying before, can be really interesting to look at, so I encourage you to look at the checkpoint information as well. Yeah, all right, I think it's not gonna work so well... okay, it did, that looked good. So here you can see how many buffers we wrote out from each checkpoint across these five different five-minute periods; the default checkpoint_timeout is five minutes, and I'll talk a little more about that later on. And here you can see WAL file usage. This kind of information is really, really helpful when you want to look at queries and try to correlate between these things.

Then you can come over here and look at different types of queries: queries per database, queries per user, queries per application. Lots of good information. And here you can see a histogram of your top queries, and you can get individual slow queries. So here are some examples of slow queries. These aren't that interesting, in my opinion, because they're COPY statements; COPY statements are just ways of doing bulk loads into Postgres, which means essentially we're just doing a whole lot of writes. But you can get down in here and look at time-consuming queries, where we can see this select count(*) from pg_class being run 11,000 times. That's quite a few times, right? The shortest run took 300 milliseconds and the max was 23 seconds. That is a long query, particularly for what it's doing. Now, the average was only a second or so, which is not too bad. But this is one where you go, okay, why is this running so often, and is there something I can do to fix it? In this particular case it's related to Pgpool, and if you're using that, maybe there's not a whole lot you can do, but it's really useful to be able to go into the system and see this kind of information.

So here are our update statements now. You'll note there are no values in them; that's because the queries have been normalized. The literal values have been taken out so that all the individual statements can be smushed together into a consolidated view. And you can see here another count(*), which ends up being pretty expensive. So pgBadger: a really, really useful tool. I encourage you to consider using it and working with it. I'm not going to do too much more with it, since we've got a lot of other things to chat about, but I wanted to show it off.
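For reference, generating a report like this is typically a one-liner from the shell; the log path here is hypothetical:

```
# generate an HTML report from the Postgres logs; the path is illustrative
pgbadger /var/log/postgresql/postgresql-*.log -o report.html
```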
Okay, all right. The other way you can look at slow queries inside of Postgres is with pg_stat_statements. pg_stat_statements is an extension that you add to Postgres that lets you look at all of your queries in a normalized fashion, similar to what pgBadger does, but through a view inside of Postgres. Installing it requires updating shared_preload_libraries to include pg_stat_statements, and then you have to run a CREATE EXTENSION command inside each database where you want access to the view. Now, the actual contents of the pg_stat_statements view cover the entire cluster, but you can only see them in a database where you've created the extension. Note that you're going to need a restart of Postgres, not a reload, because that's what's required when you update shared_preload_libraries.

So what do we have inside pg_stat_statements? We have a user ID, a database ID, a query ID, and the actual query text. That query text is also going to be normalized: instead of literals you'll have placeholders, because the actual literal values have to be taken out in order to do the aggregation. The query ID is an interesting thing. The query ID should be consistent for a given query within a Postgres major version, and in fact, I think it hasn't changed much between major versions either. This can be your identifier for tracking queries over time: you can export this information and pull it into an analysis tool outside of Postgres. I've seen people have good success with Prometheus and Grafana scraping this information and visualizing it, and the query ID can be a useful key for following a particular query across multiple snapshots, simpler than trying to compare query text.

Calls is the number of times the query has been run. Total time, of course, gives you the total amount of time. All of this information is since the last reset of pg_stat_statements. By default, pg_stat_statements saves its information across shutdowns and restarts of Postgres: when you shut Postgres down, it writes all that information out, and when Postgres starts up, it reads it back in. You can change that; there's an option for it if you want. So this is going to be the total time since, probably, whenever you installed pg_stat_statements. We then have min, max, mean, and standard deviation, plus the number of rows impacted or returned by the query. Whether that's rows updated or rows returned depends on whether it's an update or a select, but you get a row count associated with the query, which can be helpful as well.

You can then see block hits, reads, dirties, and writes, for shared, local, and temp blocks. A hit means the page was already inside shared buffers, which is great, because that means it was cheap. A read means we had to go read the page in. Dirtied means we actually dirtied a page inside our buffer cache, and written means we had to write that page out.
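As a minimal sketch of the setup just described; the ORDER BY query at the end is just one illustrative way to read the view, and note that on Postgres 13 and later the timing columns were renamed (total_exec_time and friends):

```sql
-- postgresql.conf (restart required):
--   shared_preload_libraries = 'pg_stat_statements'

-- then, in each database where you want the view:
CREATE EXTENSION pg_stat_statements;

-- e.g. the ten most expensive normalized queries by total time:
SELECT queryid, calls, total_time, mean_time, rows, query
  FROM pg_stat_statements
 ORDER BY total_time DESC
 LIMIT 10;
```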
Now, one thing I didn't mention earlier is that you should also enable track_io_timing. It's not enabled by default, because it means Postgres makes a timing call before and after each read and write. If your system has crappy, really slow timing calls, that can actually have an impact on performance; most systems these days don't have that problem. There's a tool called pg_test_timing that you can run to check and make sure your system is okay. If you enable this, the block read time and block write time columns in pg_stat_statements will actually be populated, and you'll be able to see how much time a particular query spent overall doing block reads and block writes, which can be really, really useful when you're doing query analysis and want to understand why a query was slow. (It shows up in the settings sketch below.)

So here's an example of what you might see inside pg_stat_statements. This is a simple update statement, from a utility called pgbench. pgbench is a pretty useful utility for running benchmarks against Postgres, for example when you want to compare different Postgres instances. Here we had 40,000 calls with a total time of 4.7 seconds. You can see our min time was extremely good, but we did have a pretty large max time of 142 milliseconds. If I were looking at this, I'd go, wow, that's a pretty large max time. And if I have log_min_duration_statement set to 100, or 60, or anything lower than that value, that exact query is going to be in my log, and I'll be able to go look at it and ask: why were you slow? It's possible that at that moment something else was going on in the system: maybe a checkpoint was happening, maybe there was a lock involved, it could be anything. So you can go back and ask, okay, why was this particular query slow? That can be really, really helpful. And here you can see we have 40,000 rows as a result, because each of these statements updates an individual row.

Here's another pg_stat_statements entry. This one is a select-based query, and here the total time is only 516 milliseconds across 40,000 calls, which is pretty darn good. This is what you like to see: your min and max times are well under a millisecond, which is typically very good, and your standard deviation is also quite low, which is awesome. That's the kind of query you'd like to see.

So now let's talk about why your query is slow. There are a lot of different potential factors, and I think it's really useful to go over them. We're going to talk about Postgres configuration, about dead tuples and bloat, and a little bit about query plans.

So these are a bunch of the different options you want to think about configuring in Postgres when you're trying to improve performance. The first one is work_mem. One thing to realize is that work_mem can be allocated many times over. It's used for bitmaps and hash builds; it's also used for sorting in memory. So these are all things you want to care about and think about. One of the big things is bitmaps: if you're using a bitmap heap scan, which I'll talk about a little later, and you overflow work_mem, the bitmap will start to reduce its accuracy, which can become a performance issue. maintenance_work_mem is allocated by autovacuum worker processes and also by parallel CREATE INDEX, or any kind of CREATE INDEX or ALTER TABLE. So, for example, if you have five autovacuum workers and maintenance_work_mem set to a gigabyte, you can potentially use up to a gigabyte for each one.
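A hedged sketch of the memory-related settings mentioned here, plus track_io_timing from a moment ago; every value is illustrative, not a recommendation:

```
# postgresql.conf -- illustrative values only
track_io_timing = on         # populates blk_read_time / blk_write_time
work_mem = 16MB              # per sort/hash/bitmap; allocated many times over
maintenance_work_mem = 1GB   # per autovacuum worker or CREATE INDEX
```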
effective_cache_size is a parameter not everybody really thinks about. It's never actually allocated; it's an estimate of the size of the disk cache. What Postgres uses it for is to decide: is this page likely to be in memory or not? Maybe it's not in shared buffers, but it might be in the operating system's disk cache, and if it's in the disk cache, there's a good chance we can get at it efficiently without going all the way to disk, pulling it back quicker. That can lead the planner to say: maybe it would be better to use an index scan rather than a sequential scan. shared_buffers is allocated at server start; this is our big shared buffer cache, usually around 25 to 50% of system memory. pg_buffercache is a really useful tool for analyzing the contents of shared buffers.

All right, checkpoints and WAL size. min_wal_size and max_wal_size are also things to think about. Basically, this is how much disk space the system can use for the write-ahead log, and if it's too low, checkpoints start happening before the actual checkpoint timeout. checkpoint_timeout defaults to five minutes, and what that implies is that when the system crashes, we potentially have to go back five minutes in time and replay through all of that write-ahead log, which can take time. So you don't necessarily want checkpoint_timeout set super high, but you probably also want to think about the fact that the first write to a page after a checkpoint requires a full-page write, which can be expensive. checkpoint_completion_target is basically how much of the checkpoint interval to use to perform the checkpoint. It defaults to 0.5, which seems kind of silly to me; I usually up it to 0.9. So that's something to think about as well. In particular, in your checkpoint logging, if you see that your checkpoints are happening due to xlog, you really want to think about increasing max_wal_size, because checkpointing more frequently than you'd like is basically going to slow down your writes to the database.

All right, let's talk about dead tuples and bloat. Vacuum goes through and marks dead records as reusable; that's one of the reasons autovacuum is really helpful. But if there are tuples out there that are dead, or there's empty space on a page, we still have to read that data when we're doing something like a sequential scan, and that can hurt sequential scan performance. Indexes can also have bloat, which can impact the running system as well. So you really want to be running vacuum. check_postgres is a really helpful tool for identifying tables with bloat; pgMonitor is also really helpful. Note that if you want to eliminate all bloat, you have to do a full rewrite of the table, which is typically not a good idea. Basically, some bloat is useful: if there's empty space in the table, then when we have to add a new tuple we don't have to extend the table, and extending a table is expensive. We've taken some steps to make things more performant around that, but it's still not cheap.
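Again as a sketch, with sizes that are purely illustrative, say for a machine with 32 GB of RAM:

```
# postgresql.conf -- purely illustrative sizing
shared_buffers = 8GB                # roughly 25% of RAM as a starting point
effective_cache_size = 24GB         # an estimate only; never actually allocated
max_wal_size = 16GB                 # raise if checkpoints trigger on xlog
checkpoint_timeout = 5min           # the default
checkpoint_completion_target = 0.9  # spread checkpoint writes out
```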
All right, so what are the different ways of getting data out of Postgres? The first is that you can step through essentially every record, using either a sequential scan or what's called a bitmap scan. This is typically a pretty bulky operation: it works well when you want to get everything back from a table, but it's not going to work very well when you only want specific records. In those cases, what you'd prefer is an index, and you'll see that as an index scan node. Obviously, you typically have to create these indexes in order to get them used, and an index scan will often require accessing both the index and the heap.

In some cases, we'll be able to perform what's called an index-only scan, which is really nice. What this means is that every column you're returning exists inside an index, one specific index. If all the columns are in the index and the visibility map is up to date (the visibility map is what's updated by vacuum and autovacuum), then we can avoid going to the heap to check visibility information, and that can be a huge win. So think about this when you're creating indexes: okay, I'm creating this index so I can find my rows really quickly, but if I add this one other column, the entire query could potentially be answered from the index itself, and I wouldn't have to go to the heap at all. What that's typically called is a covering index. Postgres now has support for doing this even with unique indexes: you can tell Postgres, for these three columns I want uniqueness enforced, but also include these other two columns in the index that's backing my uniqueness guarantee, purely to facilitate index-only scans (as sketched below). That can be a great way of speeding up queries that are able to leverage index-only scans.

When we're combining tables in Postgres, we do a nested loop join, a merge join, or a hash join. A nested loop join is basically stepping through one table and hitting the other table for every row, which is good for small sets but not great for bulk data sets. Much better for a bulk data set is a merge join, where we're basically walking two indexes, ideally, and interleaving them. If the data isn't in an index, though, and Postgres does a merge join, we're going to end up sorting one side, which can be very, very expensive; sorting is not cheap. Hash joins are the other way we can do it: we scan one table, build up a hash table in memory, and then bounce the other table against it. Obviously this has a slow start, but it's extremely fast once that hash table is built up in memory.

When we're talking about aggregates, we have GroupAggregate and HashAggregate. For a group aggregate, the input needs to be sorted; sorting is expensive, as I said, but it's very, very fast if it's index-based. If we can just traverse the table in index order and feed that into a GroupAggregate, that's very, very fast, and it doesn't require much memory at all. The other option is a HashAggregate, where we scan the table, build a hash table, and combine results as we find matching entries. This is clearly very memory-intensive, especially if you have a large grouping set.
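A covering unique index might look like this on Postgres 11 or later; the table and column names are hypothetical:

```sql
-- uniqueness enforced on (customer_id, order_no); status and total are
-- carried along purely so matching queries can use index-only scans
CREATE UNIQUE INDEX orders_customer_order_idx
    ON orders (customer_id, order_no)
    INCLUDE (status, total);
```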
Now, one thing I'll mention here is that there are a bunch of parallel variants of these nodes that you may also see in query plans when you're doing parallel operations inside of Postgres, but they amount to the same kinds of things: a sequential scan, an index scan, or some kind of aggregate, maybe a parallel HashAggregate or GroupAggregate.

So what's the best plan? Well, it depends. How does Postgres figure out what the best plan is? It gathers statistics, through ANALYZE and VACUUM ANALYZE. Autovacuum will do this for you automatically, so there's another reason to keep autovacuum in play: bad stats are going to give you bad plans. So how do you figure out whether you've got a bad plan or bad stats? Well, if you run EXPLAIN ANALYZE, it gives you the actual results next to the planner's estimates, which is really, really helpful for asking: are my statistics any good? What you're looking for here is order-of-magnitude differences. You don't need to concern yourself if it's off by a percent or two; it's more like, is it off by 10x or 100x? That's when you start to say, okay, maybe my stats need some adjustment. One of the ways you can adjust your statistics is by adjusting what's called the statistics target. In Postgres, you can set a statistics target that changes how much of the table we look at when doing an ANALYZE to gather statistics. By default it's not very high: 100. Increasing it can be helpful at times. What increasing it does is grow the histogram Postgres keeps, and also grow the list of most-common values we store. I've also seen times when decreasing the statistics target was helpful, so think about playing with it in both directions, potentially.

You can automate the collection of plans for queries as they run using a tool called auto_explain, which logs the EXPLAIN output for queries. This is also based on the length of time: basically, you put it into shared_preload_libraries, then you set its log_min_duration, and you can say you want to log nested statements too, which gives you the statements inside PL/pgSQL functions, for example. You can also have it log EXPLAIN ANALYZE output, but that's very expensive. EXPLAIN also gives you a few different output formats: XML, JSON, and YAML. And there are tools out there for analyzing this explain output: pgAdmin 3 or 4, as well as a really, really cool site called explain.depesz.com. That's a really useful URL to go to: you take your explain plan, paste it in, and it gives you a really pretty visual representation.
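As a hedged sketch of both knobs just mentioned; the table and column names are hypothetical, and the auto_explain threshold is illustrative:

```sql
-- raise the statistics target for one badly-estimated column, then re-analyze
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 1000;
ANALYZE orders;

-- auto_explain is configured in postgresql.conf (restart for the library):
--   shared_preload_libraries = 'auto_explain'
--   auto_explain.log_min_duration = '250ms'
--   auto_explain.log_nested_statements = on
--   auto_explain.log_analyze = off   -- turning this on is very expensive
```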
All right, so let's talk about specific slow queries. Here's one example: basically, I've got a sequential scan happening, but I'm only returning a single row, without any aggregation. Typically that means you need an index. You want to build an index that can find whatever that one row is you're returning, because that's going to be a lot more performant than sequentially scanning the entire table.

Another one: if you have a relatively small data set, and by small I mean not two rows, but maybe a million rows or so, still relatively small, and you're seeing a lot of sorting happening, or you're seeing a merge join, think about increasing work_mem. Why is this? Because Postgres reasons: if work_mem is only four megabytes, then I'm not going to build a hash table inside of Postgres that takes up more than four megabytes. And if it really does want to hash, which can happen, then when it hits that four-megabyte boundary it starts using the disk: it pushes things out to disk and does the hash join in batches. So when you start seeing a hash join that has batching in it, that's going to be pretty expensive, because again, it's writing out temp files to disk and reading them back in to perform the join (the sketch below shows what to look for).

Another area where work_mem is actually pretty important is bitmap heap scans. I mentioned this a little earlier, but when a bitmap heap scan happens, what we're doing is scanning the index and building a bitmap in memory. But where is that memory? That memory is work_mem. If we fill up work_mem with all of the pages and tuples we want to visit in the heap as part of the bitmap scan, we're going to start making that bitmap lossy. What that means is that rather than tracking the exact tuple on the exact page we want to visit, we roll it up to the page level: with this bitmap heap scan, we know which page we have to go to, but when we get to that page, we have to scan the entire page and compare every tuple on it against our conditions, and whatever doesn't match gets thrown away, so we've wasted a bunch of time. So when you start to think about work_mem, think about bitmap heap scans and bitmap index scans as well as sorts and hash joins.

Now, an area where maybe your statistics are looking a little off: for example, you have a large data set and you're getting a nested loop. This happens from time to time with Postgres when you've never analyzed a table at all, because by default, if Postgres doesn't see any stats for a table, it assumes the table is pretty small. That can be a problem. In other cases, the planner just doesn't expect much of a result set from this table for this particular query, and you can help that by increasing your statistics target and making sure your stats are current.
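One way to spot these spills is in EXPLAIN ANALYZE output, then bump work_mem for just that session and compare; the query and the setting here are illustrative:

```sql
-- a hypothetical join; look in the output for spill indicators such as
--   "Sort Method: external merge  Disk: ..."  or  "Batches: 16"
-- (more than 1 batch means the hash spilled to disk)
EXPLAIN (ANALYZE, BUFFERS)
SELECT c.region, sum(o.total)
  FROM orders o JOIN customers c ON c.id = o.customer_id
 GROUP BY c.region;

SET work_mem = '256MB';   -- per-session bump; re-run the EXPLAIN and compare
```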
Another one: foreign keys. When you do a delete, Postgres has to go verify that the key you're deleting isn't in any of the tables referencing that table, basically all of the referring tables. Now, Postgres doesn't require you to have an index on that column in the referring table. So if you do that delete and there's no index there, we're just going to sequentially scan the referring table, which means the delete is going to take forever if you're deleting a lot of keys. What you can do instead is create an index on that referring table, on whatever column points at the referred-to table.

Another good one is preparing your queries. If you prepare your queries, you can avoid repeating your planning costs. The plan cache has both generic and specific plans, which is kind of an interesting topic I'm not going to cover too much, but basically, once you've prepared a query, you can run EXPLAIN EXECUTE on it, and EXPLAIN ANALYZE EXECUTE as well. Preparing your queries ahead of time will save you planning time overall.

Some of the other things you might want to look at: if you have a big select count(*), an index can help in this case, because a select count(*) isn't returning anything from the actual heap, so as long as your visibility map is accurate, we can use an index-only scan here. And your visibility map should be accurate as long as you're running vacuum or autovacuum.

Another thing people commonly do is select * from a table. That returns all of the columns and all of the rows. Is every row needed? Is every column needed? In particular, if you have any value over 2K in a particular column, Postgres is going to do something called TOASTing, which means we shove that value out into another table. It's expensive to pull it back from the TOAST table, decompress it (we automatically compress it), and return it to the client, and if that column isn't needed, that's just a lot of wasted time and energy.

Another thing to watch out for is select distinct *. Often that's actually an indication that you forgot a join condition. For example, here we don't have a join condition for C, so we're going to create a Cartesian product and then remove the whole bunch of duplicates that came out of it. It's a lot better to use explicit JOIN syntax, like this down here, rather than a comma join; that can help you avoid those cases. Typically, select distinct is something that should not be needed very often, so definitely watch out for it.

Another case is a select with an IN clause over a subquery. This can be turned into a join, which typically gives the planner more options for how to execute the query. It's even more important with NOT IN: there, what you want is to turn it into a left join, what's also called an anti-join. What's usually a lot better is to use EXISTS, or NOT EXISTS, for these kinds of queries. For whatever reason that doesn't seem to be very common, but I definitely recommend using NOT EXISTS, because Postgres will recognize that more easily as an anti-join.
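Two quick sketches of those fixes, with hypothetical table names:

```sql
-- index the referring column so deletes on the referred-to table
-- don't have to sequentially scan this one
CREATE INDEX ON order_items (order_id);

-- prefer NOT EXISTS over NOT IN; Postgres readily plans it as an anti-join
SELECT c.*
  FROM customers c
 WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id);
```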
Now, CTEs: that's a WITH clause, and this is something that actually changed very recently in Postgres. On older versions, if you're using CTEs, what Postgres actually does is run the entire CTE, shove the result into a temp table essentially, and then use that afterwards. That's called materializing the CTE, and it can be very expensive. So what you really want to do is put your expensive stuff, all your filtering, inside the CTE, and do things outside of it later; you want to eliminate as much as possible in that first CTE. Now, this has changed with Postgres 12. Postgres 12 gives you the option of whether you want the CTE to be materialized or not, and by default we're not going to materialize it, meaning we take that CTE and optimize it as if it were part of the overall query. That's a big change. So if you're on an earlier version of Postgres and you're using CTEs, definitely test your applications out; make sure they work well with 12 and that you're not seeing issues here.

Something else you can do, if you want a really fast count(*), is to use the database statistics. There's a column called reltuples in pg_class. It only works for whole tables, though, and it's not exact; if you want something better, use a trigger-based approach.

So when you want to do some tuning of Postgres: definitely look at increasing work_mem, particularly for queries. If you want to improve performance when creating indexes or running vacuums, increase maintenance_work_mem. Set effective_cache_size. The other thing you can look at is increasing shared_buffers.

A couple of other things that are really, really good are partial indexes and functional indexes. A partial index is an index you create with a WHERE clause. So you can say: I only want to index these rows when the state, maybe I have some kind of state machine, when the state is open. What does that do? It means I can very quickly find all of the open records in that table. And of course that index can have other columns to let you constrain things further, but the index will be smaller to begin with, which is really, really nice. Functional indexes let you index the result of a function: if my queries say WHERE some_function(column) equals some value, and I create a functional index in Postgres on that function, we can use the index instead of having to scan the entire table and run the function on every row. A very, very useful capability. Do make sure you double-check that the query plan actually used is using your index; with functional indexes it can get a little confusing making sure you have the right data types.

Also, if you have performance issues where inserts and updates are going slowly, you may want to look for unused indexes. Unused indexes still have to be maintained: we still have to update them and write to them, and that can make a big difference to your performance when you're adding new rows in some way, either through an insert or an update. You can look at a view called pg_stat_user_indexes, which will tell you how often each index is being used. You can then look at an unused one and ask: is there some reason this index exists even though it's never being used?
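A few sketches of the above; the table names, the state value, and the expression are all hypothetical:

```sql
-- Postgres 12+: control CTE materialization explicitly
WITH recent AS NOT MATERIALIZED (
    SELECT * FROM events WHERE created_at > now() - interval '1 day'
)
SELECT count(*) FROM recent WHERE kind = 'error';

-- approximate row count straight from the statistics
SELECT reltuples::bigint FROM pg_class WHERE relname = 'events';

-- partial index: small, and very fast for the hot subset
CREATE INDEX ON tickets (created_at) WHERE state = 'open';

-- functional (expression) index
CREATE INDEX ON users (lower(email));

-- find indexes that are never scanned
SELECT schemaname, relname, indexrelname, idx_scan
  FROM pg_stat_user_indexes
 ORDER BY idx_scan;
```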
One thing to note about that: indexes are also used for backing constraints. Things like uniqueness constraints are backed by an index, and you'll see those indexes in pg_stat_user_indexes. So even if you don't see those indexes being scanned, don't go and remove them; you really want those constraints enforced. There's a reason you have them, and that's important.

All right, and that's what I've got, but I still have five minutes or so for questions. So, questions? Yeah, sure, whatever. Go ahead.

[A question about setting work_mem very high.] Yes. Now, one thing to note here is that Postgres will only allocate work_mem as needed; we're not just going to go allocate all of it up front. So you can set work_mem very high and it'll work, potentially pretty well, right up until you actually run the box out of memory, because that can happen. If you set work_mem very high and a lot of queries come in that all want to do big sorts at once, Postgres is going to end up allocating memory for all of those sorts, and you could run the box out of memory. Now, if the system is configured properly, that should just mean one of those processes gets a malloc error, gives up on the query, shuts the transaction down, and frees all the memory. But that can still be impactful on your system, especially if it starts swapping or something like that; that gets really bad and can really hurt performance a lot. So I don't recommend setting work_mem super high. What I tend to find is that your queries are either big analytical queries, and you know which ones those are, or relatively short OLTP-type queries. So what I'd say is: when you run those big analytical queries, and you know you won't have many of them running at once, set work_mem on that session right before the big analytical query and reset it afterwards (there's a sketch of this below). I'd recommend setting it like that, on a per-query basis, not just globally really high.

[A question about how high the statistics target can go.] Yes, it essentially represents a percentage; my recollection is that 1,000 is 1% of the table, which is pretty high, actually. You don't really want to set it super high like that. I think in general the default statistics target of 100 is quite good, and you want your ANALYZE runs to keep completing pretty efficiently, because if you start amping the statistics target up really high, you're collecting more statistics, but that also means the ANALYZE takes longer, and you're going to have more data for the planner to consider, which can have an impact on planning performance too. So there are trade-offs to setting it, which is why it's configurable.
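The per-session pattern from that first answer might look like this; the query and the size are purely illustrative:

```sql
SET work_mem = '1GB';   -- session-level bump, right before the big query
SELECT region, sum(total) FROM orders GROUP BY region;
RESET work_mem;         -- back to the server default afterwards
```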
Other questions? Yes? [A question about choosing a distribution key.] A distribution key, so you're asking about distributing across multiple different nodes, for example. That's kind of outside the scope of Postgres itself, but at the end of the day it's hard to figure out a good distribution key, and you're going to end up with hotspots no matter what you do, ultimately. There are a bunch of different ways to look at it: if you have a natural key, that's great; if you don't, sometimes a hash works, where you just hash some value across the nodes, but you're going to end up with hotspots because of that too. It's just not an easy problem to solve, so I don't think I have any generic recommendation there; unfortunately, it's just not trivial.

Other questions? [A question about whether creating an index requires re-analyzing or updating the visibility map.] Yeah, so when you create an index, that doesn't actually change anything when it comes to either the visibility map or the ANALYZE statistics. So, number one, you don't need to re-run ANALYZE. And number two, the visibility map is either going to be up to date or it's not; that's not related to creating the index. So if you create an index and you see it's not being used for an index-only scan where you thought it would be, it might be that your visibility map is out of date and you need to go update it with a vacuum. So yes, maybe that would be relevant, but if that's happening, you should be looking at how you make sure vacuum runs more frequently on that table, so that you consistently have an up-to-date visibility map. There are some options for autovacuum where you can reduce the thresholds for when autovacuum runs on a table, to make it run more often. But if you test it immediately afterwards, the visibility map should be updated, unless vacuum hasn't run on that table in a long time. And the ANALYZE stats, again: creating an index doesn't change the ANALYZE stats. I will caveat that; there is one exception, and that's functional indexes. If you create a new functional index, you will need to run ANALYZE to get Postgres to run that function against the values in the table, so that we can get statistics based on the results of the function instead of just the underlying column. But just creating a regular index on a regular column, there's nothing to update in the stats based on that.

Hmm, I don't think that's true; we have histograms on all of the columns. That's my recollection, but maybe I'm wrong. I'd be happy to go look at the source code if you want to take a peek later. [The questioner clarifies they were describing a proprietary database.] Oh, I don't know anything about other, proprietary databases, so I can't help there.

All right, well, I think that is it for me. Thank you all very much, and I'm happy to answer questions later.