Thanks. So today we'll be talking about Postgres query planning — the process of going from SQL to something the database can actually execute. This talk comes in two parts. First we'll cover some fundamentals and a bit of theory: we'll look at relational algebra, which is the theoretical foundation relational databases build on, and go over some basic ideas for optimizing queries. In the second part we'll look at how Postgres actually does it: we'll look at real Postgres query plans, how to read them, and how the planner works, at least in concept.

Okay, part one: relational algebra. That sounds a bit boring, but it's the theoretical foundation for all of this. I won't go super deep, but there are definitely some useful concepts in there. If there's anyone in the audience who's really into relational algebra, you'll probably be a bit disappointed by some of the simplifications I'll make, so bear with me.

For our examples we'll start with a simple database with two tables. There's one table called customer, with just two rows for two different people, and a second table called orders, which also has just a few example rows. The data types and all of that don't matter too much — the implied ones work fine. We're more interested in the higher-level concepts.

The first operator from relational algebra we'll look at is the projection. The symbol used for the projection is the uppercase pi, which looks kind of funky in some fonts — in some it looks like a square thing, in others more like the lowercase pi. The projection corresponds to the SELECT list of an SQL query. If we start with our table and apply a projection, we write the arguments in the subscript of the operator — in this case just the names of the columns we're interested in. So up here we start with all three columns, id, name, and registration_date, and we filter that down to just id and name.

The second operator is the selection, and the symbol used for it is the lowercase sigma. Here's where something unfortunate has happened: the selection does not correspond to the SELECT part — we've already covered that. It corresponds to the WHERE part. I blame the SQL people for this; I think the relational algebra people came up with the name "selection" first. So bear with me, this is a bit confusing, but I hope it's the only really confusing naming issue in this presentation. The selection filters rows based on a condition, and again we write the condition in the subscript of the operator. Here we can see the query on the left and what it corresponds to on the right: we filter down to only the rows where the registration date is greater than the 1st of August 2021, and we end up with just one of the two customers.
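Written out in standard notation — this is an editor's rendering of the slide examples, with the table and column names from the example schema above — the two operators look roughly like this:

$$\pi_{\text{id},\,\text{name}}(\text{customer})$$
$$\sigma_{\text{registration\_date} > \text{'2021-08-01'}}(\text{customer})$$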
The last operator we'll look at is the join — specifically not the plain join usually defined in relational algebra, but the theta join, the join with an explicit join condition. I think that's a bit easier to understand than the more general one. It works in a similar way, and of course it corresponds to the JOIN part of the query, which makes sense. In the query we have SELECT name, product_id, order_time FROM customer JOIN orders ON customer.id = orders.customer_id, and this last bit after the ON is our join condition. If we join two tables, it looks like this in relational algebra: we again write the condition in the subscript, in the arguments of the join operator. Here the only orders we have are for customer id 1, which corresponds to Eventropia, so we end up with just those two rows. Keep in mind there's also a SELECT list here, which corresponds to a projection — I didn't write down all the columns on this side, so that's also going on; I just left it off to keep the slides simple.

Okay, so far we've only looked at single parts of queries. Now let's look at entire queries and how to represent them in relational algebra. The way to do that is as expression trees that look like this. Again we have a query we've already seen: SELECT name FROM customer WHERE registration_date > '2021-08-01'. We read these trees with the data flowing from bottom to top: we start with all the rows from the customer table, then we do the selection, and then the projection. The root of the tree, at the very top, is always where the final result of the query comes out. The general order for these nodes, over here on the right, is: we start with the tables, then we do the joins, then the selection if we have one, and then the projection. I'll also refer to that as the naive order — the order that always works and always gives you the correct result.
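As nested expressions — reading inside-out, which matches reading the tree from bottom to top — the two examples so far would be written roughly as follows (again an editor's rendering, names as in the example schema):

$$\pi_{\text{name}}\bigl(\sigma_{\text{registration\_date} > \text{'2021-08-01'}}(\text{customer})\bigr)$$
$$\pi_{\text{name},\,\text{product\_id},\,\text{order\_time}}\bigl(\text{customer} \bowtie_{\text{customer.id}\,=\,\text{orders.customer\_id}} \text{orders}\bigr)$$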
Okay, let's look at some tricks we can do here, and at a first rule we can derive from relational algebra: splitting up a projection. If we start with a table R with the columns a, b, and c, then projecting R down to the column a is the same as projecting down to a the projection of R onto the columns a and b — in symbols, π_a(R) = π_a(π_{a,b}(R)). That sounds a bit confusing, but basically it says we're allowed to throw away some columns earlier if we want to. Visualized with the tables: if we start with columns a, b, and c, project down to just a and b, and then project down to just a, we of course end up with only column a. We can skip the step in the middle and project down to a right away, or we can add this extra step of dropping one column first and the others later, or drop all the unwanted columns in one go — it's the same result either way.

There's an optimization we can do with this in mind. This is the same query again — all customers with a registration date after the 1st of August 2021 — but the query tree on the right has changed slightly: there's an extra projection in there. We start with all three columns of the customer table down here, id, name, and registration_date, then we drop id because no part of our query is actually interested in it, then we do the selection, and then we project down to name.

So why is this useful? The idea is to reduce the size of the intermediate sets we have to deal with while processing the query. Right now we only have three columns, so it's easy, but if we have lots and lots of columns and carry them along through every step only to throw them away at the very end, that's wasted effort. Not carrying them around means less space used in the buffers between the different steps, and potentially faster processing as well.

So that's the first optimization: don't carry around columns you don't actually need. One thing to watch out for: we can't project down to name right away at the bottom, because we still need registration_date to be able to apply the selection further up. Selections, join conditions, things like that are what you need to keep in mind with this optimization.

Okay, let's look at a slightly more complex query and get a bit more used to the tree representation. The first part of the query we've already seen: name, product_id, order_time from customer joined with orders on customer.id = orders.customer_id. But now we're also filtering the result: WHERE name = 'Eventropia' AND order_time > '2021-08-01'. On the right we again have the naive order we talked about earlier: we start with the tables, then we do the join, then the selection, and then the projection.
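Written out as one nested expression, that naive plan reads roughly like this (editor's rendering, names as in the example schema):

$$\pi_{\text{name},\,\text{product\_id},\,\text{order\_time}}\Bigl(\sigma_{\text{name}=\text{'Eventropia'}\,\wedge\,\text{order\_time} > \text{'2021-08-01'}}\bigl(\text{customer} \bowtie_{\text{customer.id}\,=\,\text{orders.customer\_id}} \text{orders}\bigr)\Bigr)$$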
Okay, let's look at another rule we can use to play around with our queries. If we have a selection whose condition contains an AND — that's the little roof symbol, ∧, the logical and — then applying it is the same as applying a selection with the first condition to the result of a selection with the second condition, or the other way around: σ_{c1 ∧ c2}(R) = σ_{c1}(σ_{c2}(R)), and the order doesn't matter. So the takeaway is: conditions joined by AND can be split into two or more separate selections that run one after the other on the result of the previous one.

Let's see what we can do with that. Here's the same tree again, but we've split up the selection. Instead of one selection with the condition name = 'Eventropia' AND order_time > '2021-08-01', we now have two separate selection nodes. If we think about it, only rows with name = 'Eventropia' make it past the lower one, and only rows with an order time after the 1st of August make it past the upper one — and because everything that isn't Eventropia has already been thrown away below, only rows satisfying both conditions arrive at the top.

So far this hasn't really helped; we've just made the tree a bit more complicated. But if we take another look, and remember the table layout from earlier, each of these selections only refers to columns from a single table: the name column comes from the customer table and the order_time column comes from the orders table. So one thing we can now do is move around where in the query tree those selections happen. We push the selection on name down to right after the customer table, and the selection on order_time right after the orders table. What we've changed is that we no longer have to join every customer that has ever existed with every order we've ever had, ending up with a potentially gigantic set that we then filter down again — we can filter first and then join. That's a lot less data to carry around, a lot less work in the following nodes, and a lot less work in the join node itself, because there are simply fewer rows to deal with. And as we'll see later, having the conditions sit directly on top of the tables helps the database even more, because then it can, for example, use indices — we'll talk about that later.

Okay, so that's the second optimization: apply selections as early as we can, again to reduce the size of the data we have to move around between the different stages of the query.
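After pushing both conditions down below the join, the same query reads roughly as follows (editor's rendering of the optimized tree):

$$\pi_{\text{name},\,\text{product\_id},\,\text{order\_time}}\Bigl(\sigma_{\text{name}=\text{'Eventropia'}}(\text{customer}) \bowtie_{\text{customer.id}\,=\,\text{orders.customer\_id}} \sigma_{\text{order\_time} > \text{'2021-08-01'}}(\text{orders})\Bigr)$$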
Okay, let's look at a slightly more complex join. We're joining in a third table, product — I haven't introduced it yet, but imagine it exists — and we join orders.product_id against the id column of product. That makes sense, and this is one way to do it: we can first join customer and orders, then join the result of that with product, and then do the projection. But we could also do it the other way around: first join product and orders, and then do the second join on top. Both orders work and give the same result. But which one is better? Suddenly we have choices to make, which is kind of annoying, because which order is more efficient depends on a lot of things. It depends on the sizes of the tables — if one of the tables is really tiny, joining it first probably makes sense so we end up with smaller intermediate sets. It also depends on the conditions we have, because pushing down a condition changes the sizes of the sets we're dealing with. And how many rows a condition actually matches depends on the data in the table, which is also annoying. It gets even worse: there are different join implementations to choose from — a nested loop join, a hash join, and so on — and we need to pick the best one. We might be able to use indices; they might help, they might not. There are just a lot of decisions to make.

So where are we? There are a few optimizations that pretty much always make sense, like pushing down selections — filtering as early as possible — or projecting early, because those pretty much always decrease the set sizes we're dealing with. But there are other choices where it's much harder to say which option is better, and simple rules like "always push down selections as far as you can" aren't enough. Still, what we've looked at is useful in that context too, because the rules from relational algebra are a good basis for correctness. The worst thing we could do is shuffle things around in a query and end up with an incorrect result — if we get a different result than the naive plan, we've done something wrong. Looking at how relational algebra works is a good start for thinking about which changes to a query plan are safe and which ones would mess up the result.

Okay, short breather — I'll take a sip of water before we go into the second part.

Let's actually open the hood and look inside. The most useful tool for that is EXPLAIN. If you give Postgres EXPLAIN followed by a SELECT query, it asks the database for a query plan for that query. It doesn't actually execute the query if you just do an EXPLAIN, so it's pretty quick, and it's a super useful tool for figuring out what the database intends to do. These EXPLAIN outputs also have a tree structure, similar to what we've seen in relational algebra. The specific nodes look a bit different, but I hope some of it transfers, and the ideas are the same. Usually EXPLAIN gives a text output, which is kind of hard to read, at least on slides. There's also the option of JSON output, and there are a few tools that visualize the result of an EXPLAIN query — that's what I'll be using in this talk.
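As a minimal sketch — the exact query here is mine, reusing the customer example from part one, not a query from the slides — that looks like:

EXPLAIN SELECT name FROM customer WHERE registration_date > '2021-08-01';

-- same plan, but as JSON, which is what the visualization tools consume
EXPLAIN (FORMAT JSON) SELECT name FROM customer WHERE registration_date > '2021-08-01';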
Okay, let's take a closer look at the product table we saw earlier. Each product has an id, and there's a second column called number_of_foos — each product has zero, one, two, or three foos, whatever those might be. Now we can run a query where we're only interested in the products that have zero foos. Let's see what the query plan for that looks like.

There's just one node here, which is a bit surprising if you remember what we did earlier with the selection, the projection, and the table — it's all in one node. It's called a sequential scan, or seq scan, also sometimes known as a table scan, and it does exactly what it sounds like: it goes through all the rows in the table one by one, applies the condition, and spits out the results. If we open up this node, there's a short description of what it does, and we can get more details in the tabs down here. In the misc tab, for example, we can see the filter, product.number_of_foos = 0, which corresponds to the selection we saw earlier. The output tab gives us the names of the columns that will actually come out of this node, which corresponds to the projection. So it's all bunched together into one node — that's just how Postgres does it in this case. But when a query has multiple nodes, the rules from relational algebra about how selections and projections can be moved around still apply; they're still the same.

If we go back to the general tab, there's something interesting there as well. It says rows: not available — that's because we haven't actually executed the query; if we had, it would show the actual number of rows. And then it says planned: 24,623. Interesting — how does it know? We didn't execute the query, we just asked the database for the plan, and it tells us how many rows we're going to get.

The answer is statistics. Postgres collects statistics either when you explicitly ask it to, using the ANALYZE command — you give it a table and it goes through and collects statistics for that table — or automatically during the autovacuum process that periodically runs in the background. There are a few different kinds of statistics, but the one we're interested in here is the most common values statistic. We can actually ask the database for the statistics it has on a specific column by querying the special pg_stats view. We give it the table we're interested in, and the attname, which corresponds to the column we want the statistics for — in this case number_of_foos.
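A sketch of that lookup — pg_stats and these columns are standard, the filter values just match this example:

SELECT null_frac, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'product'
  AND attname = 'number_of_foos';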
The result of this query looks like this. So what does it mean? The first value, null_frac, is the fraction of rows that have the special value NULL in this column. We get zero here because none of our rows have NULL in number_of_foos. Then there's n_distinct, the number of distinct values the analyzer saw in this column, and that's four. That's to be expected — as I said earlier, each product has a number_of_foos of zero, one, two, or three, so four different values, which is what we see here.

The next two are the more interesting ones: the most common values and the most common frequencies. The curly braces that look a bit funky are just the array notation Postgres uses. The first one is an array of four integers, the second is an array of floats, and the first entry of one array corresponds to the first entry of the other, and so on. If we write it down in an easier-to-read way and round the numbers, what this says is: the value zero appears in 25 percent of the rows, and so do one, two, and three — each of them covers about a quarter of the products.

Okay, that's kind of cool. So how is it used? If we query for equality, Postgres can check the most common values and see whether the value we're querying for is one of them. If it is, it can read straight from the statistics which fraction of the rows the query will hit. And even if the value is not one of the most common values, that still tells us something, because then we know the value is uncommon, and the frequencies of the most common values give an upper bound on how many rows the query can still match.

That's all good, but is it actually right? We can check by using EXPLAIN ANALYZE — sorry, brief technical hiccup with my speaker view there; okay, we're back. So: is it right? We can find out with EXPLAIN ANALYZE. That does the same thing as EXPLAIN — it gives us the query plan — but it also actually executes the query. It doesn't give you the query results, which is sometimes a bit unfortunate, especially for queries that take a long time, but it collects additional statistics while actually running the query. If we open up the node again — it's the same sequential scan — there's more data in here. For example, we can now see the actual number of rows next to the planned number of rows, and they're pretty close. We're just dealing with statistics, so if they don't match exactly, that's fine — close enough works. Another piece of information collected is the timing: how long Postgres spent executing this specific node.

Okay, let's play around a bit more and add an index on this column, with a CREATE INDEX on product on the column number_of_foos. We're still running the same query, SELECT * FROM product WHERE number_of_foos = 0, and the idea is that maybe the database can use the index to find those rows more quickly. If we do an EXPLAIN ANALYZE again, we see a slightly different query plan — it's actually a bit more interesting now. There's a node down here called a bitmap index scan, and if we read its description, it says it builds a bitmap, with one bit per page, to find all the relevant pages.
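For reference, the commands behind this experiment look roughly like this (the index name is my own invention, everything else follows the talk):

-- index on the column we keep filtering by
CREATE INDEX product_number_of_foos_idx ON product (number_of_foos);

-- plan *and* execute, so we also get actual row counts and per-node timing
EXPLAIN ANALYZE SELECT * FROM product WHERE number_of_foos = 0;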
Okay, so what the heck is a page? For that we need to go a bit into how Postgres actually stores table data. On disk, Postgres uses so-called segment files, which are typically one gigabyte each and contain multiple pages. Pages are eight kilobytes each, and each page holds the data of multiple rows. A page has some header data, then pointers to the individual items, and the items are the rows — so this might be one row of our product table, this might be another, and there can be a bunch more in here.

Another thing we assume is that the order of the rows is essentially random, which sounds a bit weird. That has to do with how Postgres handles updates, deletes, and so on. If we update the row up here, Postgres does not overwrite the row in place. Instead it marks the old version as outdated and writes a new version of the row somewhere else — maybe in the same page, maybe in a different one, wherever there is space. So as the table gets updated, the rows end up wherever the database puts them, and we can't assume they're sorted by primary key or by any specific column; they're just in some order.

Knowing what a page is tells us more about the two nodes we just saw, the bitmap index scan and the bitmap heap scan. The bitmap index scan works by creating a bitmap — just an area in memory with one bit per page, and the database knows how many pages a given table has. It then goes through the index. This is a simplified picture of how indices work in general — Postgres uses B-trees, which look quite a bit different — but the basic idea is that the index gives us a quick way to find all the rows with, say, zero as the number_of_foos, and for each of those rows we know which page it lives in. So the bitmap index scan walks the index, applies the condition, and collects which pages contain rows we need to look at.

The bitmap heap scan then receives that bitmap and goes through all the pages flagged by the bitmap index scan. It doesn't know where within a page the matching row is — it only knows that one or more matching rows are somewhere in that page — so it goes through the entire page and filters the rows based on something called the recheck condition.
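To get a feeling for the sizes involved — this is just an editor's arithmetic on the numbers mentioned above, nothing Postgres-specific:

$$\frac{1\ \text{GiB segment file}}{8\ \text{KiB per page}} = 131{,}072\ \text{pages} \quad\Rightarrow\quad 131{,}072\ \text{bits} \approx 16\ \text{KiB of bitmap}$$

So even a bitmap covering a full one-gigabyte segment fits comfortably in memory.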
So if we look at the plan again — sorry, wrong button; okay, we're back — this is the same EXPLAIN ANALYZE plan, and here we can see the bitmap index scan that walks the index and builds the bitmap of pages we need to look at. Here in misc we can see the actual condition it checks the index for, plus some other information, for example, again derived from the statistics, how many rows this is expected to match. And as in the relational algebra part, the data flows from bottom to top: the result of the bitmap index scan goes into the bitmap heap scan, which then goes through the actual pages and, based on the recheck condition we can see here, picks out the rows we're interested in from each of the matched pages. Okay, that's pretty cool.

Now let's say we're interested in a higher number of foos: we want all the rows with number_of_foos greater than zero, so one, two, or three. Let's look at the query plan for that. We'd probably expect it to look similar — go through the index again and so on — but we're back to a sequential scan. Okay, why would it do that?

The answer is something called cost. You might have seen it in the plan nodes already; I haven't talked about what it means yet. If we go back to the sequential scan we got and open it up, we can see Postgres has estimated a cost of 1790 for this node. There isn't really a specific unit for this — it's made up — but lower cost is better and higher cost is worse. One thing we can do now is a bit of a trick: we can tell Postgres "please don't do a sequential scan", forcing it back to the index scan. If we look in there, the cost for the bitmap index scan is 830 — sounds pretty good, right? But if we then look further up, we see both the cost for that single node and the total, which adds up the costs of all the nodes underneath it — the whole subtree, or in this case the entire query. And the total is 2330, a lot bigger than what we saw for the sequential scan. Interesting. So why does it think the sequential scan is cheaper than the index scan?

For that we need to look at how the cost calculation is actually done. The first thing the planner does is determine the number of pages of the table we're interested in — that's a piece of data it simply has — and the number of rows, which it also at least vaguely knows. The total cost of a sequential scan is then the number of pages multiplied by the seq_page_cost, plus the number of rows multiplied by something called the cpu_tuple_cost. The first part of this equation models the I/O we need to do — we have to actually read those pages from storage to go through them — and the second part models the cost of pulling the individual rows out of the pages and at least moving them around and looking at them. It's interesting to have both parts in there: depending on how much data each row holds, a different number of rows fits into a page. If the rows carry lots of data, only a few fit per page; if they carry very little, a lot of them fit.
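Written out, the estimate just described is roughly the following; the factor values in parentheses are the stock Postgres defaults, not numbers from the talk:

$$\text{cost}_{\text{seq scan}} = N_{\text{pages}} \times \text{seq\_page\_cost}\ (1.0) \;+\; N_{\text{rows}} \times \text{cpu\_tuple\_cost}\ (0.01)$$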
The seq_page_cost is the cost factor applied to sequential page access — starting at the beginning of a segment file and just reading through all of its pages, and through all the rows in those pages, in order. The cpu_tuple_cost is the cost factor for processing one row, one tuple. Both of these are user-configurable, so you can set them to different values — and maybe you should, but we'll talk about that a bit later.

The cost calculation for the index scan, or rather the bitmap index scan, is unfortunately a lot more complicated, so we won't see a nice equation for it. The important difference is that it involves something called the random_page_cost: the cost of jumping to one specific page and reading it. That is typically set higher than the sequential page cost, the idea being that seeking around in a file and jumping to different offsets is less performant, at least per access, than reading the whole file straight through — which is especially true for hard drives. So the reasoning behind picking the sequential scan here is: Postgres knows we're hitting 75 percent of all rows, because we want number_of_foos greater than zero and each of those values covers about a quarter of the rows, and it decides that reading almost all the pages anyway, but jumping around to do so, will be slower than simply reading through all the pages sequentially.

But as I said, there are a lot of assumptions baked into these cost factors. They basically apply to hard drives. If you're on flash storage, the difference between sequentially reading lots and lots of pages in a segment file and doing a random access to a single page isn't nearly as big. And the factors are definitely not accurate if your entire database fits in RAM and can be cached, because then the difference between random and sequential access is basically zero. So depending on the hardware you're running on, tuning these can improve your performance quite a bit, because the query planner is able to make better decisions. And I don't know if anyone was attentive enough, but if you look at the actual timings of the two queries, you can see that on my machine the index scan would actually have been the faster way, even though Postgres opts for the sequential scan.

Okay, let's go through how the planner generally works. The first step, and the biggest input to its decisions, is retrieving the statistics for the tables and columns involved. Then it explores different plans — it has a lot to play with. It can use a sequential scan, an index scan, or a bitmap index scan, as we saw earlier. It can use different join orders, as I mentioned at the very beginning, and different join implementations, and which of those it can use may also depend on which indices are available. And there's one more trick we haven't seen at all: Postgres can parallelize parts of a query — spawn multiple worker processes, let each of them process part of the data, and then combine their results.
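As an aside, the two kinds of planner knobs that have come up so far — disabling a plan node type for an experiment, and the configurable cost factors — look roughly like this in a session. This is an editor's sketch; the 1.1 is a value often suggested for flash storage, not a number from the talk:

-- experiment only: discourage sequential scans for this session
SET enable_seqscan = off;
EXPLAIN SELECT * FROM product WHERE number_of_foos > 0;
RESET enable_seqscan;

-- inspect and adjust the cost factors the planner uses
SHOW seq_page_cost;
SHOW random_page_cost;
SET random_page_cost = 1.1;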
And then, for each of the plans it explores, it estimates the cost and the number of rows per node. For nodes that take an input, the cost might be based on the number of input rows; it might also be based on the statistics; and of course it uses those user-configurable cost factors — I've only talked about two, but there are plenty more for tuning the costs of different operations. Finally it chooses the plan with the lowest cost, which hopefully is also the fastest, or uses the least memory, or is the best in whatever other respect matters, I/O for example.

But it's not perfect; there are limitations. Statistics are only statistics. Usually they're pretty good, but with very big data sets there can be edge cases where the estimate isn't as good, and that can lead the query planner down the wrong path. And the more complex the query gets — we've only seen very simple queries here — the more different plans the planner could consider: more join orders, more combinations of join orders with index scans and sequential scans; there's just a lot to look at. The problem is that query planning has to be pretty quick: if a query is fast anyway, even done the naive way, then spending lots of time trying out different orders isn't worth it — it just adds latency to the query.

And there are a few common pitfalls I've definitely run into when doing this sort of thing. One we've already seen in this talk: the arguments you provide to a query can change the plan quite a bit. Something like number_of_foos > 0 will give us a different plan than, say, number_of_foos > 3, because there are no rows with number_of_foos greater than three.
So the planner will probably figure out it can use a different query plan there. And sometimes these arguments come from user input, or from the different IDs people can click on, so looking at one plan for one specific case might not give you the full story — change something slightly and it might tip over the edge and choose a different plan.

The other thing I've definitely run into: it's tempting to add a LIMIT, especially when you're doing EXPLAIN ANALYZE on a really slow query. But that changes your query plan, because the database knows that if you're only interested in the first 100 rows there's probably a quicker way to produce them — it may not have to join everything together; there might be an easier way to combine the data if it only has to give you part of it. So adding a LIMIT isn't very useful for debugging; it can sometimes help, but it can also really lead you down the wrong path.

The third thing is again a bit more involved, but good to know: the planner assumes the data in your columns is statistically independent. What does that mean? A simple example: say we have a table of cars, and 10 percent have brand equal to Ford and 1 percent have model equal to Focus. But all of the Focuses are going to be Fords, right? There's no other car maker that builds a Focus — I think, I hope. If we run a query with brand = 'Ford' AND model = 'Focus', the planner assumes those two conditions are independent of each other and simply multiplies the fractions to estimate how many rows will match: 10 percent times 1 percent gives 0.1 percent. But the actual number is 1 percent — as we said, if every Focus is a Ford, then 1 percent of the data matches this query. One way to work around this is multivariate statistics, where Postgres looks at combinations of values across different columns and tries to factor that in; that helps a bit. The other thing to consider — and I don't want to be rude here — is that something like this can indicate your schema isn't great, and maybe brand and model should be split out into a separate table rather than stored like this in the same table.
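A sketch of the multivariate-statistics workaround for the hypothetical cars example — CREATE STATISTICS is a real Postgres feature; the table and column names here are just the ones from the example:

-- tell the planner to track functional dependencies between the two columns
CREATE STATISTICS cars_brand_model (dependencies) ON brand, model FROM cars;
-- the extended statistics are populated on the next analyze
ANALYZE cars;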
And what I want to leave you with is a bit of a step-by-step recipe for what to do when you have a slow query and want to figure out what the issue is. The most important tool, again, is EXPLAIN ANALYZE. One really interesting thing to look for in its output is big mismatches between the estimated number of rows and the actual number of rows, because that tells you the planner has gone down a weird road — it has misestimated something, and that often leads to bad query plans. Another thing to look for: is it not using the indices you think it should be using? Is something wrong there, an index missing, something like that? Is there an index you could add to speed things up?

The next level up is to turn off certain plan node types — something I talked about earlier: telling the planner, for example, to never use a sequential scan. That's really good for playing around: you compare the costs of the default plan and the plan you get after turning certain nodes off, and that can give you a hint about why it isn't picking the plan you expected — then you have to dig into why it thinks that plan has a higher cost even though it might actually be faster. But please don't ever do this in prod. Turning off node types is pretty much bound to mess up some of your queries, or some parts of them. Again, the best query plan potentially depends on the input, so this is dangerous — good for experimenting, not a permanent solution.

The last step is tuning those configurable cost factors and checking whether they actually match the hardware you have. Be careful here too and go step by step, because there are cases where some queries benefit from a change while others actually get worse. And if you set the factors wildly wrong for your hardware, you'll of course end up with query plans that just aren't great.

At the end there's a list of a few other things to look at if you want to go deeper. There's a really good chapter in the Postgres documentation about this, which goes more into the different statistics types and how to read these query plans, and there's also a chapter on all of the tunable cost factors. There's more information on multi-version concurrency control — that's the business with rows moving around inside the pages that I mentioned in passing. And the last item is the plan visualizer I used in these slides; I'd recommend something like that instead of trying to work through the text representation, especially for complex queries — it's a lot easier to find your way around.

Thanks. Any questions?

Sorry — no, you go ahead.

So there's no way to tell Postgres "please dump all the options you considered for this query"? That would make it a bit easier.

Absolutely, yeah. That is unfortunately one of the limitations of EXPLAIN: it only shows you the query plan it chose, not the other ones it looked at and didn't pick, which would make debugging this kind of thing a lot easier. I don't know whether people have looked into patching that in; I haven't seen anything.

The statistics you can see in the plan, the timing and so on — is this collected in the background by Postgres, or only when you EXPLAIN something? Can I fetch it for monitoring, or is it only there when you run EXPLAIN?

Usually it's only collected when you do an EXPLAIN ANALYZE, but I think there are options to turn on logging of this, at least for slow queries — I think you can set a threshold so that if a query takes more than, say, one second, the plan gets dumped somewhere. I think there's an option like that.
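The speaker doesn't name the mechanism; one real feature that matches the description is the auto_explain module, which can log the plans of slow queries. A sketch, assuming that's what was meant:

-- load the contrib module for the current session (it can also be preloaded server-wide)
LOAD 'auto_explain';
-- log the plan of every query that runs longer than one second
SET auto_explain.log_min_duration = '1s';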
Thanks for the great talk, it's been amazing. You talked a lot about the software side of things; I'm wondering, since you mentioned there's obviously a huge difference between HDD and flash storage, whether you have experience with those — because we've seen (we're operating MariaDB, admittedly) huge differences between different flash storage devices, especially regarding fsync behaviour.

Unfortunately I don't really have much experience with that. But yes, that's another thing that gets sticky — how the database actually makes sure data has been written out. I don't have much insight there, sorry.

Thanks anyway.

All right, two questions. One is which tool you used to visualize the plans — is it listed somewhere?

For the queries? Yeah, it's this one down here, pev2.

And the other question: you mentioned how tables are stored and structured into pages and so on. But how does that work for subqueries, where you already have a result? Is that basically pages too, I assume?

Well, there are a lot of different ways. Conceptually a subquery is a completely separate query within your query, but the planner is actually able to optimize across that boundary — it might, for example, push a condition from your outer query down into the subquery if that's faster. So subqueries are more a concept that's useful for structuring queries and not necessarily that interesting to the query planner: it sort of treats the subquery as a box, but it's still able to move things into and out of the box. That said, there are cases where it does need to buffer the result of part of a query, whether that's a subplan or just an ordinary plan node, and — there are various limits you can set — it's able to spill that to disk if it can't, or doesn't want to, fit it into RAM. I don't know a hundred percent which storage format is used for that, but I think it's not pages, because some of the reasons for doing things with pages come from multi-version concurrency control, and intermediate results don't need that.

Hi — do database implementations look at the file system they're running on? Do they consider the difference between ext4 and Btrfs and so on?

That's a good question. I can only speak for Postgres — not really, I think. There are some options you can only set on certain file systems, for example around the fsync issues someone mentioned.
I think there are some file systems with specific optimizations you can enable if you're on that file system, but in general it's up to you to set these factors correctly — Postgres doesn't really do much in terms of inspecting the actual hardware and trying to figure it out by itself.

If you've got a query that's executed many times a day, or even many times an hour — does Postgres ever cache the query plan it generates?

Oh, that's a good question. It's probably difficult to do, because the data can change and it would need to re-plan each time — but yes, good question. I think it does that to a certain degree, and I think there's a mechanism where you can specifically ask for it: give it a query and say, I'm going to execute this query more often. One of the other upsides of that is that if you have a query text that's quite long and you run it a lot, you'd otherwise have to send the whole query to the database every time, so you can also get around some of that network latency. But I don't really know the details of when it caches and what it caches. The statistics can change too, of course — but they don't get updated every time the table changes, only when autovacuum comes along, and that might be fairly often on tables with a lot of activity, but it's not per query or anything like that.

Do you think the current Postgres defaults for the costs and everything are still accurate today? It's quite old software — have they changed much over the years, or are they fairly conservative on that front?

I haven't actually looked at the changelog for the default values. I don't think they've changed a lot, and yes, maybe some of them aren't great for current hardware. There are also factors for the CPU cost of different operations, and I think it's just hard to find good defaults there, because it depends a lot on the specific CPU you're using and so on — there are a lot of variables involved.

Since it depends so much on the hardware you're actually using — are there any recommended ways of finding good values? Is there a program I can run, say for a day, that exercises the system and at the end says "these are good values for you"?

I have to say I'm not super into the operational side of that, so I don't actually know. There are some web tools where you enter the number of CPUs you have, which kind of storage, and so on, and they give you some recommendations; that's probably a good starting point. But I don't know of any tool that actually benchmarks the hardware — it feels like it would make sense, but I don't know of one.

Is there a way to inform Postgres of any correlations in your data?
Well, multivariate statistics is one way to do it, and the other way is to choose a different table layout. For example, going back to the cars example, we could have a separate table that just has an id column and links to another table with all the different manufacturers and another with all the different models, and simply ties those together. The statistics for that table will be a lot simpler to work with, because the planner doesn't need to consider whether two columns are correlated, and that will generally give you better query plans.

But is there no way to give it custom hints — from me, the database owner, who knows what the users usually input?

I don't think there's a manual way to hint that two columns are highly correlated, other than actually adding extended statistics for them and letting it figure that out itself. And your mileage may vary on that as well.

Are there any other questions?

Not exactly a question, but maybe I can chip in regarding performance and default settings. In general Postgres — my knowledge might be a little outdated — tries to be very much on the conservative side. The default settings you typically get assume fairly restrictive hardware: that you're still on an HDD, that you don't have much memory, and so on. So it's definitely worth checking that. And something you might try is pgbench — in general it depends on what kind of data and what kind of queries you expect, but that could be a starting point if you want to inspect things manually.

Yeah. And even settings that don't affect how the planner works — like the limits for how much it will keep intermediate results in RAM before it starts spilling to temporary storage on disk — are pretty low by default, so if you have a lot of RAM, increase those values.

If there are no further questions, then I'd like to thank you for speaking here at GPN again. Thank you.