So before we really get into this: as Cynthia mentioned, I have updated this talk. I originally gave it two years ago at PGCon, for 9.4, so this is updated for 9.6. There are code references in the slide notes, and I'll have the URL where you can grab the presentation at the end. There are a lot of talks and blog posts out there about vacuum — how to tune it, cost delay, all that kind of stuff — but the one thing I haven't seen, and where I really wanted to go with this talk, is something that walks through what the heck vacuum actually does. We all know that it gets rid of dead tuples and puts space back in the database for us to make use of, but beyond that we don't really know what the internals are doing. So that's what I want to go through in this talk: what does the code of vacuum look like, and what is it doing? For those of you who find this stuff interesting or have additional questions — first of all, if anything's unclear, because some of this gets a little technical, please just ask away; no need to wait until the end. But I'd also highly recommend grabbing the presentation, since it has references to the source code. And for those of you who have never looked at the Postgres source: it is some of the best-commented code I've ever seen anywhere. So even if you're not familiar with C, there's no reason to be put off by that. Just go take a look at what the code actually says if there's anything you want to dig into further.

Here's the laundry list of what we'll be talking about. We'll go into a pretty fair amount of depth, mostly about the internals, and we'll touch a little bit on autovacuum as well.

So, MVCC — I think Bruce gave a talk on this yesterday.
I don't know how many of you saw it — you could easily spend four hours just discussing MVCC — but as a very brief overview: Postgres uses multi-version concurrency control. Most modern databases have shifted to some form of MVCC, because the alternative is that you have to lock rows every time you touch them, and that really, really hurts performance and makes those systems very difficult to administer. What this means in practice is that in Postgres, when you delete a row, the delete doesn't actually happen at that point. The row gets marked as having been deleted, and later on vacuum has to come in and actually clean it up. Updates essentially turn into a delete and an insert, so an update never happens in place; it's always creating a new tuple in the database. And here I'm differentiating between a row, which is something you'd see from a query, and a tuple, which is what's actually in the table itself. Eventually these old tuples do have to be removed from the table to make space for new tuples coming in, and that's what vacuum does.

Another piece that's somewhat related to MVCC is the commit log. The commit log tracks the status of every transaction: either the transaction is in process, or it committed, or it aborted. Multixacts store information about multiple transactions touching a row at the same time. The row header doesn't have the ability to handle that, so as soon as you have multiple backends touching a row simultaneously, we have to generate a multixact so we can track everybody that's touching the row. Vacuum is also responsible for removing old data from both the clog and the multixacts. This is something where people will sometimes see the pg_clog subdirectory — and by the way, the names of a lot of these directories are changing in 10; I'm not sure what the commit log directory is being renamed to, but I do know the pg_xlog directory —
That's going to be called pg_wal. People have occasionally come to the mailing list saying, "the pg_clog directory is huge — why is it so big?", and it's because they just hadn't been running vacuum. Because vacuum hadn't been running, transactions kept taking place and the commit log just kept growing and growing.

Before we can really discuss what vacuum does, we have to look at the question of what can actually be vacuumed. Vacuum can only remove a tuple from storage — from the table — if no other transaction running in the system can possibly see it. The database tracks, over time, the current oldest running transaction: which transaction has the XID — the transaction ID — that is farthest in the past. In addition to what's happening locally, if you have streaming replication set up, replicas can also affect the oldest running XID in the system, because a very old transaction running on a replica may need to access data that vacuum would like to remove. If you have hot_standby_feedback turned on, your replicas will be providing that transaction information back to the master, and the master will hold off on vacuuming as long as you have an old transaction either on the master or on a replica.

The second aspect of what can be vacuumed is that in order to perform a vacuum, the code has to obtain an exclusive lock on the page that contains the data. That's normally only necessary when an update or an insert or a delete is happening — something that's actually going to modify the table — but there are additional considerations for vacuuming. Normally, multiple backends can hold a reference to a buffer, and if you're doing something like a many-layers-deep nested loop join, the outer side of that nested loop — the outermost driver — may be sitting on a single page for a very long time: seconds,
maybe even minutes, potentially even hours. That is going to block the ability to do cleanup on that page. So in normal operation, if vacuum is not doing a forcible freeze, when it comes to one of those pages it's not going to attempt to clean it; it's going to skip that page. This was changed — I think maybe 9.2, somewhere in the version 9 series. Previously, you'd have a vacuum running, somebody would be sitting there holding a pin on the page, and vacuum would come along and say, "I need to grab the exclusive lock." Vacuum would sit there and wait, and anybody else who came in and tried to read the page would block behind the vacuum. So suddenly vacuum became very disruptive to the operation of the system. What happens now is that vacuum will just skip over the page, and if you run a VACUUM VERBOSE it will tell you in the output if it had to skip any pages because of the inability to acquire the cleanup lock.

No — right now all it does is keep a count. When I first came across this, looking at it in 9.4, it occurred to me that this could potentially be very problematic; it could really prevent vacuum from being able to do its job. So I started a discussion on the mailing list — honestly, at this point I don't remember whether it was "we should at least report it" or "we shouldn't do this at all, we should fix it." The discussion went back and forth, and ultimately we decided that right now we don't know if this is a big problem at all. Maybe it's an issue that should be addressed further.
Maybe it's not — so right now, we just don't really know. Certainly, anyone who runs VACUUM VERBOSE and sees that it had to wait on page locks — especially if it skipped a large number of pages — we would definitely like to hear a report from you on the mailing list, because that's why we added that tracking. At least now we've got some amount of visibility into how big an issue this is in the real world. If it turns out to be a pretty common occurrence, then we'd most likely look at finding some ways to do better in that area.

Between both of these — the oldest-running-transaction issue and the cleanup lock issue — what this boils down to is that long-running transactions will prevent vacuum from being effective. So if you have a system that's doing a lot of updates on some tables, you want to try to avoid having long-running queries in that same Postgres instance. And it's important to note that the XIDs and the MXIDs happen at the instance level. You could have one database with a really fast churn rate and a separate database in the same instance with long-running queries; the fact that they're in two different databases doesn't matter. The long-running query in database A is going to present problems to vacuum in database B.

Transaction IDs and multixact IDs are both limited in size: they're both 32-bit integers.
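As a quick aside on spotting those long-running transactions: pg_stat_activity exposes each backend's snapshot horizon via backend_xmin (available since 9.4), so a query along these lines shows who is holding vacuum back:

```sql
-- Backends holding back the oldest-visible horizon, oldest first.
-- Note: age() here counts transactions, not wall-clock time.
SELECT pid,
       datname,
       age(backend_xmin)  AS xmin_age,
       now() - xact_start AS xact_runtime,
       state,
       left(query, 40)    AS query
FROM pg_stat_activity
WHERE backend_xmin IS NOT NULL
ORDER BY age(backend_xmin) DESC;
```

A replica with hot_standby_feedback on will show up the same way, via its walsender's reported xmin.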
They're both 32-bit integers the X IDs Transaction IDs are created When a transaction modifies data or if it has an exception handler So if you create a save point inside of a transaction That's going to generate a new X ID Where that is very easy to get into is if you have a plpgsql function that has an exception handler Under the covers to make the exception handler work plpgsql creates a save point So you need to be careful with plpgsql functions that are executed very frequently We've I've had a customer that was using a trigger for partitioning and They had overflow partitions So if if they didn't have a partition created for a particular piece of data the That would go into a special overflow partition and the trigger that was handling this Was using an exception handler to detect that. Oh, we don't we don't have a valid table That meant that every single row that was inserted was creating its own Transaction ID for the save point for the exception handler So as they're loading a billion rows into the table all of a sudden their database is shutting down to prevent wraparound So how many of you have ever experienced the wraparound shut down not fun is it? And and it always happens Friday at five o'clock in the afternoon So yes, the the long and short of this is that When the database is getting close to those rolling over it needs to do an aggressive Vacuum and that is the term that is in the code. So that's why I'm using it You'll also hear this reference is a anti wraparound vacuum In fact auto vacuum when it's doing an anti wrap vacuum. 
It says so in PG stat activity and You might also hear this referenced as a vacuum freeze Well a vacuum freeze is a little different all that vacuum freeze does is it sets the freeze limits to zero So that vacuum will Be as aggressive as possible and freeze as much data as possible So Similarly similar to how long-running transactions prevent vacuum from being effective extremely high rates of Updates of XID consumption for Sherlock or concurrent foreign key checks The for Sherlock and the concurrent foreign key checks. Those are the common ways that you that you need to generate an MX ID Those can cause freeze problems Freezing I want to be careful because freezing is not quite the same thing is anti wraparound Freezing and by the way all that freezing means is that when you take a tuple and you freeze it the X-min The the inserting transaction that created that tuple that used to be a real transaction ID It's simply set to a special transaction ID, which I think is two And that that special transaction ID. That's the fruit. That's a frozen transaction ID That means that that data is visible no matter what your transaction ideas Yes, it will remain that way forever until you update or delete the tuple When you update or delete the X-min will stay marked as the frozen But now you have an X max that's saying. Oh, well as of this transaction this tuple is no longer valid And then at some point vacuum will come and remove that tuple So there's vacuum freeze min age and vacuum multi x-act freeze min age control how old a X ID or an MX ID has to be before they get Before they might be frozen before they may be set to that special frozen X ID The freeze table age gucks settings those control how often an aggressive vacuum happens so as vacuum is running if if we come across a tuple that that matches these conditions either of these conditions if Vacuum can it's going to go mark the tuple is frozen, but it's not going to be paranoid about. 
Oh, I have to go do this these limits These control when auto vacuum or regular vacuum is going to go into paranoia mode So once a table is old enough that it hits one of these limits That's what's going to trigger an aggressive vacuum And that vacuum is going to very assertively go through the entire table and Make sure that every tuple that needs to be frozen is frozen. Now when it does that it's When it does that it's going to be using these limits to determine which tuples to actually freeze So there's there's a difference between these two settings And and similar to how long-running transactions cause problems. Well, they also prevent freezing so sounding like a broken record, but heap-only tuples were added in I think it was 9.0 Because vacuum is is Really expensive. We wanted a way to try and avoid it and that's what heap-only tuples do So if you perform an update and the update does not change any index values on the row and in this case an Index value means obviously if the column is referenced in the index or If the column is referenced in a functional index for example, if you've said create index blah, blah, blah on Lower some text field and lower so that the index is actually on the lowercase value That text field is obviously referenced in the index Finally indexes you can have partial indexes where you have a column referenced in the where clause That is a reference column in the index. So if you are performing an update and The update does not touch any columns that have been That are referenced if it doesn't modify that the update statement can actually say oh Set this index column equal to this index column that doesn't matter what matters is that the value itself did not change So if the value doesn't change on any index columns, there's no need for the update statement to create rows new rows in the indexes Those new rows in the indexes are What makes vacuum so painful? 
So basically, because new rows didn't get put into the indexes: a new row version will be created in the heap, but no indexes will be pointing at it; they're only going to be pointing at the old row. That creates a HOT — a heap-only tuple. When those tuples become old enough that they can be vacuumed, that can actually be done by just a regular, plain old INSERT, UPDATE, or DELETE statement that hits the page and goes, "oh, I need more space." When those commands run, they first check: can I free up space on this page just by removing old HOT tuples? When that happens, vacuum just never gets involved — there's no need for it to be involved in that process at all. What that boils down to is: avoid referencing heavily updated columns in indexes.

So, vacuum itself. There are four major variations: there's autovacuum and manual vacuum, and then with manual vacuum you can do a VACUUM FULL, which completely rebuilds the table from scratch. It will actually just go and rewrite the table — it's not really a vacuum. It's basically CREATE TABLE AS SELECT from the old table, switch the names, and rebuild all the indexes; that's really what VACUUM FULL does, so we're not going to talk about it any more today. VACUUM is just a regular, manually run vacuum. And then VACUUM FREEZE — remember those age limits I was talking about a second ago? What VACUUM FREEZE does is set all those limits to zero. So it just says: be as aggressive as you possibly can about freezing this table, and forcibly read the entire table to make sure that everything is frozen. Yes. Yes.
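Going back to heap-only tuples for a second: the statistics views will tell you how often your updates actually take the HOT path, which is a good way to check whether your indexing choices are defeating it:

```sql
-- What fraction of updates were heap-only (no index entries created)?
SELECT relname,
       n_tup_upd,
       n_tup_hot_upd,
       round(100.0 * n_tup_hot_upd / nullif(n_tup_upd, 0), 1) AS hot_pct
FROM pg_stat_user_tables
WHERE n_tup_upd > 0
ORDER BY n_tup_upd DESC
LIMIT 10;
```

A heavily updated table with a low HOT percentage is a hint that an index on a frequently changed column is forcing full index maintenance on every update.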
Yes So auto vacuum generally doesn't need to be tweaked Nowadays It's It's pretty good on its own for a moderate size database If you have a database, however, that is very large has a large number of tables You should certainly look at Increasing the number of auto vacuum workers There's some others tweaks that you might have to do My colleague Scott me it actually has I don't think he's giving the talk at this conference But I'm pretty sure you could find it on Google He's got a presentation about how to go through and basically tune auto vacuum to be as optimal as possible But for a lot of medium small medium databases auto vacuums default settings are going to do pretty good One thing I want to specifically warn people about I've seen Many customers. They'll see an auto vacuum running. They're like, I don't want auto vacuum running in the middle of the day So they'll set auto vacuum nap time to 24 hours Well, that just means that auto vacuums not going to do anything except once every 24 hours But you have no way to control when that 24 hour clock starts and Generally speaking it's going to that timer starts when you restart the database, but over time. It's going to slowly shift So first of all, it's not really doing what you want it to do second of all, there's other problems that you're going to run into because that check if that's Only happening every 24 hours. 
there are other background processes that should be running more frequently that now really aren't running. So that's really just not a good thing to set up. If you do have a workload that has high points and low points — maybe every weekend is slower, or maybe overnight is slower — it is very, very helpful to run a manual vacuum during your slow period. Let it do everything it can, and then during the day, or whatever your heavy period is, autovacuum has less work to do. Autovacuum can focus on, "these are the tables that are really getting hit; I'm going to try to keep these clean," and it's not going to worry as much about the other tables.

The other place where manual vacuum can be very beneficial is if you have tables that are getting hit over and over and over again — tables you know have a high rate of churn. The problem you run into with just trusting autovacuum on those tables is this: if all three of your autovacuum workers end up vacuuming three large tables, they're going to be busy with those for hours or longer. And meanwhile, your tables that see this really high rate of churn?
Nothing's vacuuming them. So those tables are just going to bloat bigger and bigger and bigger — and what's worse, the indexes are going to bloat bigger. So if you have tables where you know you have this high churn rate, go ahead and just set up a cron job. I usually just run it once a minute, which may sound really aggressive, but the thing is, these are tables that are small. If you have a small table with a high churn rate but you vacuum it once a minute, the table is going to stay small, and as long as the table is small, the vacuum is going to be very, very fast.

VACUUM FULL, as I mentioned, basically just completely rewrites the table. One thing you can do if you find yourself in need of VACUUM FULL: some people have had luck with pg_repack. There have also certainly been some complaints about it in the past, so be careful, but what pg_repack does is rebuild your table for you in the background, and it only takes a bare minimum of locks where necessary. So it's kind of like a VACUUM FULL, but it doesn't need the exclusive lock.

So, vacuum. One thing to note is that you cannot run vacuum inside of a transaction. Vacuum starts a new transaction for every table that it vacuums, and it expects to be able to do that, so you cannot run it in a transaction — and with how Postgres works, that also means you cannot run it from inside a function. There is a vacuumdb shell command; that's really just a convenience wrapper around essentially doing psql -c "VACUUM". The one nice feature of vacuumdb is that you can tell it to fire up multiple parallel vacuum processes, so if you've got a database and you just want it vacuumed as fast as possible, vacuumdb is the best way to do that.

So, starting the actual code walkthrough. What vacuum does, for each table, is this:
It's going to scan the table itself — scan the heap — and as it's scanning, when it finds dead tuples it puts them into an array in memory. After scanning the heap, it scans all the indexes, and the indexes remove their references to those dead tuples. After scanning the indexes, it goes back to the heap and removes all the tuples that now no longer have any reference from the indexes. And then finally, if the ANALYZE option was specified, it will run analyze. Once that's done — once it's handled every table it's going to handle — it will take a look and see if it can update the columns in pg_database: datfrozenxid and datminmxid. Those are the columns that get checked for "are we in danger of a wraparound," so keeping them up to date is obviously important.

One big gotcha with vacuum: remember that it first scans the heap and makes this list of all the dead tuples. Well, if maintenance_work_mem is set small, or just isn't large enough, that means any vacuum will scan through the heap, and then once it runs out of space to remember dead tuples, it has to stop scanning the heap, scan through all the indexes, go back to the heap, remove the dead tuples, and only then can it start looking for more dead tuples. So if you have a really large table with a bunch of dead tuples and a really small maintenance_work_mem, you can end up scanning through the indexes over and over and over again just to get through this work. The only bright side is that once it's gone and removed a batch of dead tuples, at least it's gotten that far — but this is something to be careful of. It does not explicitly tell you this happened, but if you carefully read the VACUUM VERBOSE output, you'll see it. On a normal vacuum, it talks about each index during the scan, and then does a summary at the bottom of what was actually done on each index — so you see each index mentioned twice. If you see an index mentioned more than twice,
that means it had to do multiple passes.

So, the VACUUM command: when you actually call vacuum, it enters the vacuum() function — that's what I just described here — and for each relation it then calls vacuum_rel(). vacuum_rel() will vacuum a single relation, whether it's a materialized view, a regular table, or a toast table. You can manually vacuum a toast table; normally there's no reason to, but in any case, vacuum_rel() will handle it.

vacuum_rel() needs to get a lock on the relation itself. Normally it's just a share-update-exclusive lock, so it doesn't block normal reads and writes — unless it's a VACUUM FULL. If it's an autovacuum and it's non-aggressive, then if vacuum_rel() cannot get the lock on the table, it's just going to skip it. What that means is that if you're doing something like an ALTER TABLE command and autovacuum happens to fire up and decide to vacuum that table, autovacuum is not just going to sit there and wait. It'll say, "oh, I can't get the lock," and move on.

vacuum_rel() then does a bunch of mundane setup work, and it calls either cluster_rel() for VACUUM FULL, or lazy_vacuum_rel(). Once those calls return, vacuum_rel() will call itself to vacuum the toast table, unless you've told vacuum not to do that. It will not do that for autovacuum, because autovacuum always handles toast tables separately.

lazy_vacuum_rel() is where the work really starts happening. The first thing it needs to do is determine the XID and MXID limits. These are the limits that say, first of all, what is the newest XID that I can possibly consider visible everywhere — any dead rows that are older than that transaction can be removed. It also sets up the freeze limits, based on those settings that I mentioned earlier.
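Tying back to that maintenance_work_mem gotcha for a moment: vacuum needs roughly 6 bytes per dead tuple in that in-memory array (the size of a heap tuple pointer), so you can sanity-check the setting before a big manual vacuum. The table names here are hypothetical, and the estimate is only as fresh as the stats collector:

```sql
SHOW maintenance_work_mem;

-- Rough memory needed to remember all dead tuples in a single pass
-- (~6 bytes per dead tuple).
SELECT relname,
       n_dead_tup,
       pg_size_pretty(n_dead_tup * 6) AS approx_dead_tuple_array
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC
LIMIT 5;

-- Raise it for this session only, then vacuum the problem table.
SET maintenance_work_mem = '1GB';
VACUUM VERBOSE bloated_table;   -- hypothetical table name
```

If the estimated array size is bigger than maintenance_work_mem, expect multiple index passes in the verbose output.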
It's going to determine any Any tuple that's not dead, but it's insert XID is older than This limit it's it's going to get frozen Once it's determined those limits, it's going to scan through the heap Using lazy scan heap, which I'll get to in a second After lazy scan heap is done. It's going to If there's enough Possibility if there's a possibility of reducing the size the physical size of the table It will call lazy truncate heap to try and do that It will clean up the free space map It essentially vacuums the free space map And then finally it's going to update PG class and log statistics if necessary lazy scan heap so we've now gone from vacuum to vacuum rel to lazy vacuum rel now We're down to lazy scan heap This is where the the real meat and potatoes when it comes to the table scan comes Comes into play so for each block It's going to skip ahead if it can and if it makes sense to do so so The block skipping depends on whether the vacuum is aggressive if It's an aggressive vacuum meaning it's a vacuum that is trying to prevent That's trying to satisfy the table freeze age requirements Only blocks where all tuples are known to be frozen can be skipped So this is the vacuum freeze map Feature that was added in nine point six. This is brand new Previously before nine six. 
We didn't have a bitmap that tracked whether a block is entirely frozen. This is problematic for things like data warehouses, where you can have really, really large tables that are basically append-only: the only work vacuum will ever have in those tables is at the very end of the table, but every time an anti-wraparound vacuum comes along, it says, "oh, I need to scan the entire table." So now you've got a 10-terabyte table that vacuum is going to slowly chew its way through. Well, starting in 9.6, we record which blocks have nothing but frozen rows, and if a block is entirely frozen, then even an aggressive vacuum doesn't need to touch it — there's nothing for it to do. So that's an aggressive vacuum. If it's not an aggressive vacuum, then any blocks that are all-visible — where we know there are no dead tuples on the block — can be skipped as well. So a non-aggressive vacuum skips all-frozen as well as all-visible blocks; an aggressive vacuum only skips all-frozen.

Oh — the other part of the skip logic is that it will only skip if it's going to be able to skip ahead at least 32 blocks. That's basically an attempt to not fake out the read-ahead logic in the operating system: if it can only skip a few blocks ahead, it's just going to sequentially hit each block anyway.

So once it's skipped ahead, if warranted, it checks to see how much maintenance_work_mem is left — the slide says work_mem; that's inaccurate, it should say maintenance_work_mem. If the array of remembered dead tuples is getting close to full, it's going to stop the scan, hit the indexes, remove the tuples, and then come back to what it was doing. After that check, it attempts the cleanup lock; if it cannot get the cleanup lock and it's not an aggressive scan, it just skips the block.
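If you want to look at those all-visible and all-frozen bits yourself, 9.6 also added the pg_visibility contrib extension (table name here is just an example):

```sql
CREATE EXTENSION pg_visibility;

-- Per-block visibility map bits for a table.
SELECT blkno, all_visible, all_frozen
FROM pg_visibility_map('my_table')
LIMIT 5;

-- Summary: how much of the table can vacuum skip entirely?
SELECT * FROM pg_visibility_map_summary('my_table');
```

On an append-only table that's been aggressively vacuumed once, you'd expect almost every block to show all_frozen = true, which is exactly what lets 9.6 avoid rescanning those 10-terabyte tables.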
It'll just go on to the next block, and it will increment the counter that says, "here's how many blocks I skipped because I couldn't get the cleanup lock." Once it's got the cleanup lock, it's going to do page pruning. This is the exact same process as HOT pruning — the same function that's called by an insert, update, or delete when it gets a page and says, "oh, the page doesn't have enough space; let me go see if I can free some up." After that, it's going to record the dead tuples.

Now, there is one exception to this dead-tuple business I've been talking about — of remembering dead tuples in an array: if a table has no indexes, there's no reason to do that. There's no need to go remove tuples from an index, because you don't have any indexes. So in that case, as vacuum runs through the table, it just immediately removes the dead rows from the table and continues on.

If necessary, it's also going to do tuple freezing, and it's going to update the free space map and the visibility map. Yes, that actually does happen for each block: at the very end, it determines, "oh, if every tuple in this block is visible, mark it as such in the visibility map; if every tuple in here is frozen, mark it all-frozen in the visibility map." Finally, after it's scanned through the table, it updates the vacuum statistics, and it does the final pass through the indexes and the final pass of cleaning the heap.

lazy_vacuum_index(). So, lazy_scan_heap() — which is what we were just talking about — calls lazy_vacuum_index(); this is where the index vacuuming happens. Because Postgres is extensible and allows you to create new index access methods, theoretically without changing backend code —
I'm not sure anyone's ever actually pulled it off that way — but because we do have different index access methods, each index method has its own requirements for what has to happen during a vacuum. The one thing they have in common is that the indexes have to be scanned entirely. We don't currently have a method for, "I have a heap tuple — my row in the table — let me go find the stuff in this index that points at it." If we had that ability, vacuum would suddenly become a lot simpler. But because we don't, the different index access methods just have to scan through the index completely, and as they come across each index pointer, they have to check: does this index entry point at a heap row that's going to be deleted? The lookup against the list of dead tuples — I'm pretty sure it's a binary search, because the tuples are going to be in order in the list.

So the long and short of this is that it's very difficult to reduce the size of a bloated index. That's definitely true for B-trees; other index methods may not be quite as sensitive. But by and large, when it comes to index bloat: just don't let the index get bloated. If your indexes do end up bloated, your best bet is to do a REINDEX.

lazy_vacuum_heap(). After the call to lazy_vacuum_index() has happened, lazy_scan_heap() then calls lazy_vacuum_heap(). This is the function that actually goes through the heap table, takes the list of all the dead tuples, and one by one goes through the heap and removes the tuples themselves. As it's doing that, when it hits the end of each page, it will defragment the page and then update the free space map.

Finally, there's a last step for dealing with indexes: each index access method has the option of defining a lazy cleanup function. This is run after we know that we're done — we've finished vacuuming a
We've we've finished vacuuming a Particular table and now if the indexes want to do anything in particular at that point they can go do it For bee tree all it does is it just simply cleans up the free space map the brin index. That's when it does the It's it's renormalization Honestly, I didn't look at the other ones Finally Vacuum will Updates if it's necessary. It's going to update PG database that frozen X ID as well as the That min Mx ID What this function does is it looks to see Have I actually been able to advance? Have I been able to change either value if it has been able to change either one of the values? It goes and truncates the commit log files It updates the So there's a separate internal copy kept in shared memory that mirrors the information in PG database. That's updated and then PG the multi-exact files are Those will get truncated, but that happens during the next checkpoint. So the checkpointer is actually responsible for that So that's everything on vacuum itself Real briefly on auto vacuum. There are two parts to auto vacuum. There's launchers There is one launcher and then there's multiple auto vacuum workers The launcher every time that it wakes up and that's controlled by vacuum nap time It looks at the list of databases and it prioritizes them by whatever database is most in need of Of transaction freezing Usually there are none then it prioritizes in by most in need of Mx ID freezing and again, usually there are none and then after that it just looks at what was the last what was the Which database have I looked at least recently? So There is no prioritizing by the vacuum launcher of oh, this is a really big database I need more workers over here. It does not work that way Multiple workers can work on the same database at once. So if you have a cluster What's kind of a common pattern is if you have a database where you only have one real database There's a couple other databases. You'll have your postgres database. 
You'll have template1. Those will get visited too, and here's what happens. Say the launcher visits your large database and fires up a worker there. It wakes up again a minute later and says, okay, now let's look at postgres, and fires up a worker in the postgres database; well, that worker is going to return very quickly. Next time it says, I haven't looked at template1 yet, and fires up a worker there; again, that worker returns very quickly. So now, three minutes later, the launcher says, okay, let's fire up another worker for the big database, and that worker, if there's a bunch of work to do, is going to stay busy. Then, three minutes after that, it has again cycled through postgres and template1, gets back to your big database, and fires up a third worker. By default you have just three workers.

What this means is that if you are running an instance with a large number of databases, you really need to set the naptime to be small. In fact, if you're running more than, say, 50 or 100 databases, you probably want autovacuum_naptime down on the order of one or two seconds.

Unlike a regular vacuum, an autovacuum worker will self-terminate; actually, I'm pretty sure it just stops what it's doing on the current table. Autovacuum is intended not to interfere with normal operations.

The best way to tell whether autovacuum is doing its job, whether it's being effective, is to simply monitor how many autovacuum workers are running. You can see that in pg_stat_activity and compare it to autovacuum_max_workers.
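The cycling behavior just described implies some simple arithmetic: if the launcher starts at most one worker per wakeup and round-robins across databases, each database sees a fresh worker roughly every naptime times number-of-databases seconds. A quick back-of-the-envelope model (my numbers, purely illustrative):

```python
def revisit_interval_s(n_databases, naptime_s):
    """Roughly how often each database gets a fresh worker, assuming
    one worker is launched per wakeup, cycling round-robin."""
    return n_databases * naptime_s

# Three databases at the default 60s naptime: a worker for each roughly
# every three minutes, matching the walkthrough above.
three_db = revisit_interval_s(3, 60)            # 180 seconds
# A hundred databases at the default naptime: over an hour and a half
# between visits to any one database.
many_db_default = revisit_interval_s(100, 60)   # 6000 seconds
# Dropping naptime to 2s brings that back down to ~3 minutes.
many_db_tuned = revisit_interval_s(100, 2)      # 200 seconds
```

This is why the naptime advice above gets so aggressive once the database count climbs.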
When all of your vacuum workers are frequently busy doing something, that's a strong indication that autovacuum cannot keep up with what is happening in your system.

As for the worker: once it goes into a database, it gets a list of the heap tables and materialized views that need to be vacuumed or analyzed. After it goes through those, it gets a list of TOAST tables that need vacuuming. Temp tables are ignored. For each relation it attempts to get a lock, and if it can't get the lock it just skips the relation, unless it needs to be aggressive due to anti-wraparound. After it runs the vacuums themselves it updates datfrozenxid; sorry, it does that at the very end, right before it's ready to exit. Once it's exhausted its list of work, it exits.

The key point here is that autovacuum does not prioritize tables within a database. So if you've got a table that's desperately in need of vacuuming, it may end up way at the back of the list. This is part of why it's very helpful to run manual vacuums as well, to give autovacuum less work in some of these situations. Ultimately, autovacuum can become ineffective for high-demand tables if too many large tables need to be vacuumed at once: it essentially gets stuck on the really big tables and then cannot go visit the other tables.

Vacuum cost delay is pretty well documented; see section 19.4.4 of the docs. The critical idea is that once you hit the cost limit, the vacuum will sleep. So it's a little bit unintuitive: increasing the cost limit speeds vacuum up, while increasing the sleep slows vacuum down. That's the one thing that's a little bit tricky, and you definitely want to be careful. I've seen customers say, "oh, vacuum is creating too much load on the system," so they turn vacuum down.
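That throttling behavior is easy to model. The sketch below is a toy Python version of the cost accounting (the GUC names are real, but the accounting is simplified and the numbers are illustrative): each page touched adds to a running balance, and when the balance passes vacuum_cost_limit the worker sleeps for the delay and resets. You can see directly why raising the limit speeds vacuum up and raising the delay slows it down.

```python
# Default per-page costs, matching vacuum_cost_page_hit,
# vacuum_cost_page_miss, and vacuum_cost_page_dirty.
PAGE_HIT, PAGE_MISS, PAGE_DIRTY = 1, 10, 20

def total_sleep_ms(page_costs, cost_limit, cost_delay_ms):
    """Sum the sleep time a vacuum would accumulate over a workload."""
    balance = 0
    sleep_ms = 0
    for cost in page_costs:
        balance += cost
        if balance >= cost_limit:     # hit the limit: nap, then reset
            sleep_ms += cost_delay_ms
            balance = 0
    return sleep_ms

workload = [PAGE_MISS] * 100          # 100 pages read from disk

slow = total_sleep_ms(workload, cost_limit=200, cost_delay_ms=20)
fast = total_sleep_ms(workload, cost_limit=2000, cost_delay_ms=20)
# slow is 100ms of sleeping (five naps); with the 10x limit, fast is 0.
# A higher limit means fewer sleeps, so vacuum finishes sooner.
```

The same model shows the danger of cranking the delay up: every nap gets longer, and the total work rate drops in direct proportion.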
They slow vacuum down, and what happens now is your system gets bigger and bigger, more and more bloated, and then you get to a point where, oh, now I have to do an anti-wraparound vacuum, and I'm in trouble because I just don't have the speed to dedicate to vacuum. So be very careful not to slow it down, not to throttle it too much.

If you have a system with very fast writes, say a RAID controller with battery backup, you may well want to set vacuum_cost_page_dirty lower than vacuum_cost_page_miss, because on those systems writing a page is going to be cheaper than reading a page.

So, in closing: long-running transactions hurt vacuum. High transaction rates accelerate the need to do freezing. Try to avoid updating indexed columns so that you can get HOT updates. Definitely make sure maintenance_work_mem is large enough. Trying to reduce a bloated index is next to impossible, so just don't let them get bloated. And autovacuum can only do so much; you really want to give it a hand with some strategic manual vacuums.

Unfortunately, I am out of time, so I will be happy to stick around outside to answer any questions. That is my GitHub repository that has all my presentations; this one is just called vacuum. Thank you.