Okay, thank you. I'm glad so many of you are still here and haven't given up on this conference yet — though I can see why, most of the content has been pretty good. So I'm here to talk about, as Stephen said, the elephant's trunk: Postgres 9.5.

Before I even get started with introducing myself: who is already using Postgres 9.5 for anything real? Okay. Why am I not surprised that the two hands that go up belong to people who are actually committers on the project? I hope you guys are working on 9.5 — you've probably got 9.6 running already. Yeah, okay — sorry, you can branch in your own local repository. One of the powers of git.

So, my name is Magnus Hagander, as Stephen said. I've been around Postgres for a long time — a very long time at this point. I'm part of the Postgres core team, and I'm one of the committers, along with, for example, Stephen and Joe, and maybe I saw someone else in the back. So I work on the actual code, and feel free to heckle me for not committing enough things in this version — which I really haven't. I should have done more work, but you know, sometimes we leave it to somebody else. I also do a lot of work for PostgreSQL Europe, as part of the team that organizes the European version of this conference, and things like that.

When I'm not working on community stuff, I work for a company called Redpill Linpro, which I bet most of you haven't heard of. We're a local open source services company focused on the Scandinavian region — I'm out of Stockholm in Sweden myself. We do all things around open source products, Postgres being one of our focus products, where we do consulting, training, support, custom development — which I think we've done exactly once on the backend — things like that.

So let's go for another show of hands: how many of you read Planet PostgreSQL on a regular basis? For the rest of you, I suggest you take a look at it — it's an excellent resource. We have a number of people who keep writing blogs, and they've been writing about 9.5 for almost a year now: whenever someone commits a larger new feature, within a couple of days somebody else usually picks it up, tests it, and writes a post about how to actually use it and what it means for people who are not the developers. So if you haven't already, go read it, and I suggest sticking it in your RSS reader so you get it on a regular basis. It has a lot of very good content.

Postgres 9.5 development started officially on June 10th last year, which is when we branched off Postgres 9.4 and relabeled the master branch of the Postgres repository as 9.5devel. In the Postgres project we use a process we call CommitFests. The theory is that we take the development period and split it into pieces: one month of development, then one month of what we call the CommitFest, which is review of the patches written in the first month and then, hopefully, commit of them. Then we go back: one month development, one month review. That's the general idea behind the process — because, you know, we all do iterative development, but we all have to have our own name for it.
That's our name. So 9.5 is the first Postgres release to do five CommitFests — previously we've done four. We had one in June, one in August, one in October, one in December, and one in February; I'm getting to that. That's the plan, officially that's what we have, and if you go look in our databases, that's what you'll see. So the current status is that we are in the final CommitFest. Normally these CommitFests are one month, but we do allow the last one to be two months — otherwise we'd already be behind schedule. In reality, though, we've basically been in one continuous CommitFest since sometime around September, because we haven't been able to close out each CommitFest within its month. It has all just flowed together into continuous CommitFest mode, which isn't very good, and which is something we need to do something about.

One thing that you can help with is reviewing and testing the things that are in the CommitFest now — because if we don't get through the ones we have, they're going to get punted and not be in 9.5 — and also in the early CommitFests as we start the 9.6 cycle. A lot of the features in this version came in, you know, August last year. And it's not just code review, so you don't have to be a C coder to help with the CommitFest. Well, you need to be at the point where you know how to apply a patch and run make, but that's pretty much it. If you can download a patch, see that it applies, see that it compiles, and test the feature — run it and see what happens; if it adds a new SQL command, make sure that it works — and just report that, that is actually a great deal of help. We still need the people who review the code itself, but if you can find something that doesn't work, we can bounce the patch back and get it fixed before those people even start spending time on it. And if you do have C skills, we absolutely need more people helping with review at the C level as well. We really don't want to be stuck in this continuous CommitFest for another year — nobody is going to be happier for that. So the idea is to push the CommitFest through on the actual schedule, so we can then focus on the part everybody likes, which is writing new stuff for another month. We'd very much like your help with that.

So, some current statistics — and they're only almost current, because we're in the final CommitFest and things happen every day now, so these are about a week old. Compared to 9.4, we have modified about two and a half thousand files, added 222,000 new lines of code, and removed 135,000. That's almost double what we had in 9.4 — we'd been on a declining trend for a couple of versions, and now we're going back up to about double. Because we all know that lines of code is the best way to measure developer productivity, right? Such a relevant number. Yeah, of course it isn't — it could be a single patch that just touches a lot of code. And some of our most important patches are tiny: for example, when we implemented synchronous replication in Postgres — I think we can all agree that's a pretty major feature — the patch was tiny compared to the significance of the feature.
It was a small patch. So yeah, lines of code are an awesome metric, but let's not focus too much on them. What they do show is that there is a lot of work being done, a lot of churn in the Postgres code base, a lot of things going on at all times.

Now, some of this is visible to you and some of it isn't. Some of it is pure infrastructure, some is infrastructure you can build your own features on, and some is very obvious end-user functionality. So I've tried to split this into four areas: developer and SQL-level features, for those of you working with the database from an application perspective; DBA and administration, for those of you who are stuck with running it; and — this is a database, these are computers, we're always interested in performance — every Postgres release has performance improvements, but I've tried to single out a couple of them. And then, since we are in the final CommitFest, there are things still in the queue that we can still hope for, and that you can still help review to make sure those hopes actually come true.

Some of these come in semi-chronological order rather than anything else — the features that went in first — but let me start with the developer and SQL features. One feature that seems fairly small, but turns out to be really, really useful if you need it, is that you can now do a multi-column sub-select UPDATE. Previously, if you did an UPDATE and set a column to the return value of a sub-select, you could only do one column that way. If you wanted to update two columns, you had to run two updates, or write some really strange syntax, expanding and un-expanding row constructs and things like that. This just lets you say UPDATE, SET, list two columns, and then a sub-select of any kind that returns multiple columns. It's SQL-standard syntax, so if you have query generators or anything like that, using it should just be a flag. Now, if I'm not mistaken, the standard actually allows any row construct at that point, and we only allow a sub-select — you can also use VALUES, which we permitted before — so there are other kinds of weird row constructs we still can't put there. But a sub-select turns out to be what people actually want.
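Just so you can see the shape of it, here's a sketch — the tables and columns are made up, the point is the parenthesized column list on the left:

```sql
-- Update two columns from a single sub-select, new in 9.5:
UPDATE orders o
   SET (status, shipped_at) = (SELECT s.status, s.shipped
                                 FROM shipments s
                                WHERE s.order_id = o.id)
 WHERE o.id = 42;
```

Before 9.5 you would have needed two separate sub-selects, one per column, or one of those row-construct contortions.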
Another one of those features where we had it, but it was somewhat limited, is the function generate_series. Who in here has used generate_series? Okay, good — this is a crowd that knows Postgres. generate_series is one of those tiny functions that turns out to be fantastically useful: it's a set-returning function that just generates a list of values. Prior to 9.5 it could generate a list of integers or a list of timestamps. Now we can also do numeric, which means we can do decimal numbers — like in the example here, SELECT * FROM generate_series(0, 1, 0.1): start at 0, go to 1 in increments of 0.1. We couldn't do that before, because 0.1 is not an integer. It also means we can now reach numbers larger than an integer can hold. I'm not sure you really want your generate_series to be that big — a generate_series that returns more than two billion rows might not be very fast — but if you start at two billion and go up, it makes sense. So now we have it for numeric.

Continuing on the theme of things we had part of the way: who has ever used SELECT FOR UPDATE NOWAIT? Okay, a couple of you. With a regular SELECT ... FOR UPDATE, if someone else is locking a row, we block, wait until they're done, and then return the value. If you say SELECT ... FOR UPDATE NOWAIT and someone else is locking the row, we say "could not obtain lock" and abort your transaction — helpful in some cases. The new option, SELECT ... FOR UPDATE SKIP LOCKED, will instead skip the locked rows and return all the other rows in the table. In my example it returned two rows, two and three, because my other session had a lock on the row with value one. It's a fairly narrow use case, because you're intentionally not returning the complete result of your query — you probably don't want this for reporting — but it's very efficient when you're doing, for example, queue processing. If you have a worker queue somewhere, you can combine this with a LIMIT: take the first item off the queue, skipping anything somebody else is already working on. It's pretty convenient, and it becomes much, much easier to do than it used to be. Simple worker queues can now be implemented without a lot of the external support that we previously needed.
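For instance, a minimal grab-a-job pattern could look like this — the jobs table is hypothetical, the FOR UPDATE SKIP LOCKED plus LIMIT combination is the point:

```sql
BEGIN;

-- Grab the first job that no other worker currently holds a lock on:
SELECT id, payload
  FROM jobs
 ORDER BY id
 LIMIT 1
   FOR UPDATE SKIP LOCKED;

-- ...process the job, DELETE the row...

COMMIT;
```

Every worker runs the same query; locked rows are silently skipped, so workers never block on each other.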
One of the more major features at the SQL level that has actually been committed is row-level security. Of course, this is where you'd want a time machine: if you'd like to learn more about row-level security, go back about four hours in time and see the presentation just before lunch. I had hoped to be scheduled earlier than the last session so I could deflect any details to whoever came later, but I guess I'll now just have to tell you my version, and those of you who went to the row-level security talk will correct me when I'm wrong. No — you're not allowed to do that.

The idea behind row-level security is to apply access policies on a per-row basis; you can almost get that from the name. Previously we had table-level permissions — you can grant access on a table — and column-level permissions — you can grant access on a column. The idea here is to limit access to individual rows. This is something we could do before by creating views manually, remembering to make them security barrier views so people can't look through them, but with this feature it becomes a lot easier.

So what we do is create a policy for a table that says: this user is allowed to see rows that look like this. The regular ACLs still apply, so you can still also limit at the table level and the column level; we just add one more check, and as it should be with all permissions, you need all of them to actually see the data — it's not enough to have one. It's worth noting that, as usual, superusers and table owners bypass the security. So don't give people superuser, like, ever; keep that for yourself and your evil plans. There's also a new role attribute, BYPASSRLS, which says that this user, without being a superuser or the owner, can read or write everything in your table regardless of what the row-level security policies say.

Row-level security is enabled on a per-table basis with a simple ALTER TABLE command — I think I partially took the example from the documentation: ALTER TABLE companies ENABLE ROW LEVEL SECURITY. Okay, now it's enabled. Once you've enabled it, you need to create a policy; these are also created per table. In this case we say CREATE POLICY companies_manager ON companies FOR ALL — that means for all operations; you can also create separate policies for read and for write. You can say someone is allowed to write data but not to see it — which can lead to really interesting application behavior, by the way; that's not unique to row-level security, you can do it with views too, but applications tend to get confused. Then TO PUBLIC — you can limit a policy to an individual role, saying it only applies to users who are members of that role. And finally USING, where we give basically any expression: if it evaluates to true, the user is allowed to see the row; if it evaluates to false, the user is not.

In this case we're saying the user may view, update, or delete any row where the manager column contains the same value as current_user — the user that is logged in. And here's the difference when we read it. First I'm logged in as user postgres, doing SELECT * FROM companies: I've added three companies, I'm the manager of two of them, and test is the manager of the third. If I reconnect as user test, I can only see that one row, because it's the only row for which the expression evaluates to true. It's a very simple level of filtering, but note that it doesn't give you an error. If you lack table-level permissions on a table and try to select from it, you get "permission denied". Row-level security does not give you "permission denied" — it just takes away the data. So if you create a policy that is simply false, the table just looks empty, but your queries still work. And, you know, the colleagues you applied that policy to will be really annoyed.
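Pulled together, the example from the slides looks roughly like this — reconstructed from my description, with the companies table having a manager column comparable to current_user:

```sql
ALTER TABLE companies ENABLE ROW LEVEL SECURITY;

CREATE POLICY companies_manager ON companies
    FOR ALL                          -- applies to every operation
    TO PUBLIC                        -- applies to every role
    USING (manager = current_user);  -- row visible only to its manager

-- Connected as user "test", this now returns only the rows where
-- manager = 'test'; as the table owner you still see everything:
SELECT * FROM companies;
```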
You can use pretty much any expression in a policy, but there are a few things you can't do. You can't use aggregates — that's the big one — so you can't say "where some value equals SELECT count(*) from some table". But you can access other tables: you can use sub-queries and things like that to evaluate whether the expression is true. I've yet to write an expression complicated enough that it doesn't work — other than ones I've written wrong, but that's my fault, not Postgres's. It's the general expression machinery.

You can define multiple policies on a single table, and that works just fine. In that case the results are OR'ed together: access through any one of the policies gains you access to the row. It's also worth noting that this does not affect cascading referential-integrity operations. If you have a foreign key with a cascading action, that action bypasses RLS. If you have RLS on both sides, you won't be able to delete the referenced row in the first place, so the referencing row will still be there — but if it is the foreign key that deletes the row, it will be deleted. That obviously only applies to writes, not reads.

And yes, you can write policies that are really complicated. I had to test this just to make sure it worked — otherwise I would have bitched at Stephen — and you can, for example, put a recursive query into the expression. In my test I changed my managers into a hierarchy, and the policy traverses the whole hierarchy as it evaluates each row. It just works, because it's a Postgres expression. Now, if your table gets big, and your managers table gets big, there are no guarantees this will be fast — but it will work. Make it slow enough and that's another security policy: anyone who tries to read data they're not supposed to just doesn't have the patience. I'm not sure that would pass any audit, but you know.
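To make that concrete, here's a sketch of the kind of recursive policy I'm describing — the managers hierarchy table is hypothetical, but the structure matches what I tested:

```sql
-- Every manager and who they report to:
CREATE TABLE managers (
    name text PRIMARY KEY,
    boss text REFERENCES managers (name)
);

-- You may see a company if its manager is you or anyone under you:
CREATE POLICY companies_chain ON companies FOR ALL TO PUBLIC
USING (manager IN (
    WITH RECURSIVE chain AS (
        SELECT name FROM managers WHERE name = current_user
        UNION ALL
        SELECT m.name FROM managers m JOIN chain c ON m.boss = c.name
    )
    SELECT name FROM chain
));
```

Again, it works because USING takes a general expression — but nothing about it is guaranteed to be fast on a big table.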
But the general idea is that anything you put in that USING clause — anything Postgres can evaluate, except aggregates — will work. As with all these things, keep your policies reasonably simple. Otherwise you'll end up like I did when I tested this: I was initially convinced the feature didn't work, and it turned out to be entirely my own fault, because I had another policy on the table and had missed the fact that policies are OR'ed together. It's easy to get lost when you do these things, so as with all things security, a reasonable level of simplicity is usually a good idea.

So, moving out of the SQL layer a little bit, more into the DBA and administration side: who in here is running more than one instance of Postgres on the same server? Quite a few of you — you're going to like this one. It's a very simple thing: we have a new configuration parameter called cluster_name. If you set it, we print it in the process title. In this case I've named my cluster "my test cluster", and now every process associated with this cluster shows "my test cluster" in its title. If you don't set it, we just leave it out. It's convenient when you have multiple instances and want to figure out which one is which — previously you had to walk the process trees and look at command lines to work out where the data directory was; now you can grep your ps output directly. Very convenient in those cases, and very convenient for those of us who develop Postgres and have twenty different instances running at the same time. Who here has never killed the wrong one?

Another thing on the DBA side, part of a bunch of enhancements to foreign tables: you can now say IMPORT FOREIGN SCHEMA. Obviously it requires support in your FDW, but the one for Postgres, postgres_fdw, comes with the support. You create the local schema first; then you say IMPORT FOREIGN SCHEMA some_schema FROM SERVER some_server — a server you've already defined — INTO the local schema. The FDW reaches out to the Postgres server on the other side, copies every individual table definition, and creates those as foreign tables locally. And it doesn't just copy the table names — it copies the actual column definitions. If you have a lot of tables, it'll save you a lot of time.
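The whole sequence is short — "otherdb" here is a hypothetical, already-defined postgres_fdw server:

```sql
CREATE SCHEMA remote;

-- Copy every table definition from schema "public" on the far side,
-- creating matching foreign tables in the local schema "remote":
IMPORT FOREIGN SCHEMA public
    FROM SERVER otherdb
    INTO remote;

-- List what we got:
\det remote.*
```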
[Audience question: is it kept up to date?] It's a one-time thing; it does not update afterwards. I don't believe we have the ability to refresh it — not a bad idea to have, though. And actually, now that you mention it, what I don't know is: if I run the same command again, does it error because the tables are already there, or does it skip them? Does anyone know? Okay, Stephen thinks it will error. I think it will error too, but it would be kind of neat if it skipped them. I'll have to test that. So no, there is no refresh — it will definitely not update a table whose structure has changed. But you can just drop the local foreign table, since it holds no data, and run the import again.

[Audience question: can users see the policies that apply to them?] Well, you can try to read the data and see what comes back. Or you can use \d — policies show up directly in \d in psql, and presumably there will eventually be a pgAdmin release that can show them too; I don't think it's there yet. So you can see that the policy exists; you can't see the rows it's potentially hiding. The schema in Postgres is basically readable by everyone in pg_catalog — it's the same thing we sometimes hear about people not liking that you can view the source code of stored procedures. There's no preventing that.

Another thing that went in very recently, also related to foreign tables, is that foreign tables can now participate in an inheritance tree — you can create tables with inheritance where the tree includes foreign tables. Inheritance in Postgres today is used for partitioning. Now, there are drawbacks to this way of doing partitioning, but it's how we do it, and the advantage here is that you can have a partition that lives on a different machine — what some people would call sharding. It uses constraint exclusion the same way constraint exclusion works for local tables: you have a CHECK constraint that says "rows matching this go over there". CHECK constraints on foreign tables were added earlier in the cycle — they're not very useful without the inheritance part; with inheritance, they let you take a partition and put it somewhere else. There's obviously more work to be done on optimizing queries that use this, but basic sharding with it is certainly doable. There are things like the pg_shard extension from Citus Data that do sharding with rather more advanced functionality, but this gets you pretty far, and it gives you the same interface as local partitions, which has a lot of other advantages. And of course it will be improved on.
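As a sketch of what that could look like — reusing the hypothetical "otherdb" server, with made-up table names:

```sql
CREATE TABLE measurements (
    ts    timestamptz,
    value numeric
);

-- A partition that happens to live on another machine. The CHECK
-- constraint is what constraint exclusion uses to skip it:
CREATE FOREIGN TABLE measurements_2014 (
    CHECK (ts >= '2014-01-01' AND ts < '2015-01-01')
) INHERITS (measurements)
  SERVER otherdb
  OPTIONS (table_name 'measurements_2014');
```

Queries against measurements with a WHERE clause on ts should then only touch the remote partition when the date range requires it.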
For those of you who use unlogged tables, you've probably been annoyed that you couldn't change that afterwards. Now you can: you can change a table from unlogged to logged, or from logged to unlogged, with a simple ALTER TABLE ... SET LOGGED or ALTER TABLE ... SET UNLOGGED. Well, with a small caveat about indexes — there's a small bug related to indexes, but we're hoping, or rather assuming, that will get fixed; some interesting things can happen with indexes right now if you're using the unreleased version. Going from unlogged to logged has to WAL-log the entire content of the table. Yes — if it's a terabyte table, the operation to set it logged will log a terabyte of data, because that's the only way to do it: in a replication scenario, for example, the table suddenly needs to appear on the slave, so we need to log the data. Going from logged to unlogged, on the other hand, is just a simple metadata operation saying "anything in the log for this table can be ignored from now on". So again, if you use the feature, it's been really annoying that you couldn't do this; this is a nice improvement. [Audience question about the slave.] It does truncate it — like all unlogged tables, it's present on the slave but empty. So once you fail over, the table is there, it's just empty.

Is anyone using ALTER SYSTEM SET, which we released in the last version? Wow, no one. I think you can guess that Stephen didn't like that feature. It could be because you're not running 9.4 yet, in which case you can skip this. Actually, that's a good question: who is running 9.4? Okay, that's way more people than are using ALTER SYSTEM SET. But if you are using it, or will be, or your colleagues are, you can now also do ALTER SYSTEM RESET to change your mind. It resets the config variable back to whatever is in postgresql.conf — or, if there's nothing in postgresql.conf, all the way back to the default value. The implementation basically means the setting gets deleted from the postgresql.auto.conf file, which is where we put it when you said ALTER SYSTEM SET. And as with ALTER SYSTEM SET, yes, you still need to do a SELECT pg_reload_conf(), or a service postgresql reload, or whatever it is on your platform — otherwise the value has been reset but doesn't actually take effect. And if you do this with shared_buffers or some nice parameter like that, you have to restart the whole thing, which you can't do from inside SQL. But since nobody was using that, let's just skip ahead to the next one.

There's a bunch of interesting, more infrastructure-level functionality. One is that we now track commit timestamps, if you ask for it. It's a parameter in postgresql.conf — set track_commit_timestamp = on — and we start tracking them. That means that every time we write a commit record to the transaction log, we also record what time it is. There's an obvious small overhead, because we have to actually figure out what time it is, which can be surprisingly expensive. But then you get a function, pg_xact_commit_timestamp(), that you give an xid. If you know how Postgres MVCC works: a row's xmin is the xid of the transaction that wrote this version of the row. So you can say SELECT pg_xact_commit_timestamp(xmin) FROM a table, and it tells you: this table clearly has only one row, it was written by transaction 787, and that transaction committed at exactly this time. So you can figure out when a row was changed. There's also pg_last_committed_xact(), which tells you when a transaction last committed on the system — it's a function returning both the last committed transaction ID and the time it committed. I'm sure we'll see more things built on top of this; there are other use cases than just looking at your rows.

[Audience question about transaction ID wraparound.] No — the xid keeps its value, and the commit timestamp is a separate field; the timestamp isn't calculated from the xid. Wraparound affects the xid field, obviously, but it won't affect the timestamp. [Audience question: how far back can you go?] There's a separate set of data that stores this. Good question exactly how far back it goes — I think it's back to the last vacuum freeze; it's certainly well past the oldest currently running transaction. It's not related to how much WAL you keep — we don't actually read the WAL to get it. We write it to the WAL as we commit, obviously, but we also write it to a separate set of metadata; there's a commit-timestamp directory with files for it. So yes, if you're tracking this, your data directory is going to be bigger. The amount of data is about the same as the clog, and it goes the same distance back in time as the clog does.
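In use it looks something like this — the companies table is just the one from my earlier example:

```sql
-- postgresql.conf: track_commit_timestamp = on   (needs a restart)

-- When was each visible row version committed?
SELECT pg_xact_commit_timestamp(xmin) AS committed_at, *
  FROM companies;

-- The most recent commit on the whole system, as (xid, timestamp):
SELECT * FROM pg_last_committed_xact();
```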
If you're using the vacuumdb command — you know, for when you've really screwed up your database and need to vacuum everything at once — hopefully you never have to do that, but if you do, you often need it to happen quickly. This one seems surprisingly simple, though it took many, many rounds of new versions of the patch before it got in: vacuumdb now has a -j flag to run multiple vacuums in parallel. If you say vacuumdb -j 20, it runs twenty vacuums in parallel and most likely kills your I/O system, and everything else stops. But if you need to run vacuum really fast because your system is completely screwed up, you basically want to run as fast as your I/O system allows. It will affect other operations in your database — not necessarily lock-wise, depending on how you run it, but if it uses up all your I/O capacity, there's nothing left for everybody else. So be careful with the load. [Audience question about tablespaces.] I don't believe it tries to be smart about tablespaces. It tries to be reasonably smart about table sizes, but not tablespaces.

An important change that's probably going to affect every single one of you: the good old parameter checkpoint_segments doesn't exist anymore. It has been removed. We no longer have that awkward way of specifying sizes; instead we have two parameters, min_wal_size and max_wal_size, which simply give the minimum and maximum size of your pg_xlog directory. You get rid of the calculation of "three times the value times 16 megabytes is how much disk space you need" — you just say how big max_wal_size is allowed to be, and Postgres automatically tunes when it does its checkpoints to stay between those limits, using a moving average of previous checkpoints to figure out when to start the next one. Obviously we still have checkpoint_timeout — if you want a checkpoint every five minutes, or fifteen, or however you've configured it — this only controls the space side, but it's a lot more reasonable to configure it this way. We've also moved the defaults out of the '80s: the default is no longer three times 16 megabytes; it's now a minimum of 80 megabytes and a maximum of 1 gigabyte. Do you have a gigabyte of free space? You probably do, so I don't think that's a problem — but if you really are space-constrained, you'll need to bring max_wal_size back down. If you set the two to exactly the same value, you get the old behavior, because that basically turns off the auto-tuning: set them both to 48 megabytes and you get the previous default — but you don't want that. What this also means is that we only consume the space when we actually need it. Previously we could fill it up even when we didn't need to, given how checkpoint_segments worked; now segments are recycled as needed and the footprint is kept down when possible, using that moving average of what's been needed before.
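If you want to set them explicitly, it's just two settings — shown here via ALTER SYSTEM with the 9.5 defaults, though postgresql.conf works just as well:

```sql
ALTER SYSTEM SET min_wal_size = '80MB';
ALTER SYSTEM SET max_wal_size = '1GB';   -- the pg_xlog ceiling

SELECT pg_reload_conf();   -- both can be changed with a reload
```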
Another thing that will affect at least a number of you: we've gotten rid of pause_at_recovery_target for point-in-time recovery. Instead we have a parameter called recovery_target_action, which can be pause, promote, or shutdown. Pause still requires you to have hot standby enabled, and I think we're still discussing what happens if you put an incompatible value in there — but those are the values. So instead of adding yet another parameter, we added one new one and took the old one away. It controls what happens when your point-in-time restore reaches its target: do you want it to open up in read-only mode, open up for full usage, or just shut the system down — because you might be preparing the instance for something else and not want to open it at all. [Was that a question? Oh, okay, you were just joining us — that's good; you had your hand very high up.]

Some other things that we were hoping would carry us a bit further: the SSL code in Postgres has finally been completely refactored, the idea being that we can support implementations other than OpenSSL. We haven't actually written the code for any others yet, so we still only support OpenSSL, but the infrastructure is there to make other SSL implementations possible reasonably soon. We've also added support for subject alternative names in certificates: if your server has multiple different names, we won't error out under sslmode=verify-full as long as the name appears in the alternative names. But mostly this is something to build on top of the refactoring.

Who in here is using pg_xlogdump? Very few of you — well, good for you, because once you have to use pg_xlogdump, you're already in trouble, right? But there is something new and interesting about it: it now takes a --stats argument, so it can look through your WAL segments and figure out what kind of data is in them. What's using up the space in your xlog — is it index updates? Full-page writes? Heap updates? You can use that to figure out why your replication is taking so much bandwidth. So it's now useful for things other than "my database is in real trouble" — you can still use it for that, of course, but the stats mode gives you insight into what's happening in your system. If I'm not mistaken, you give it the xlog directory, the first segment file, and the last segment file, and it summarizes everything that happened between those two points in time, with a nice little output telling you what kind of data is in your log. That helps you figure out why things are running slow when you're generating too much log.
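Invocation is along these lines — the segment names are made up, and I'd double-check the exact flags against the docs, but -p points it at the directory and the two segment names bound the range:

```
$ pg_xlogdump --stats -p $PGDATA/pg_xlog \
      000000010000000000000010 00000001000000000000001A
```

The output is a per-record-type summary — counts and byte totals — rather than the usual record-by-record dump.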
So, moving quickly into a more performance-related area. Some of you may have been to Simon's talk — now you only need to go back about two hours in your little time machine — on BRIN indexes, our new index type: the block range index, formerly known as minmax. Instead of a regular B-tree index, which stores every value in sorted order, a BRIN index only stores bounds per block range. The default is to take 128 blocks of your data file and store the highest and lowest values present in that range — that's the minmax part. The design supports other operator classes — I don't think any others are implemented yet, but it would be interesting to see this applied to, say, geographical operations: "things in this area live over in those blocks". It's somewhat limited in what it can do: it figures out which block ranges might match and then does a sequential scan within those 128 blocks. So to be efficient it requires a table with some sort of natural ordering. The typical, simplest use case: a log table with a continuously incrementing timestamp or ID column — BRIN indexes are ideal for that. On a heavily updated table, a BRIN index will probably be mostly useless, because your data is randomly distributed throughout the table. The main win of BRIN is that the indexes are small and inserts into a table with a BRIN index are very cheap — the insert overhead is much, much lower than for a B-tree. It's a regular Postgres index type, so you just say CREATE INDEX ... USING brin, and you always get a bitmap index scan out of it. You can change the parameter pages_per_range, which controls the size of the index relative to the size of your data. We can't really give recommendations for it yet; we need many more real-world tests to learn what the good numbers are. So, you know, please run those tests and let us know.

And for those of you who've run into the problem of the GIN pending list: you can finally tune it. It used to be controlled by work_mem, which was really bad, because work_mem is completely unrelated. The GIN pending list is used for GIN's fast update mechanism: when you insert or update with a GIN index, we just post the change to a separate list on the side, because updating the index itself is expensive, and then we hope a vacuum comes around and merges it back into the main index structure. But if that list gets full, the merge happens while you're waiting — and you don't want that. This change gives you the ability to tune the size, either as a config parameter or individually on each index. You might need a bigger list or a smaller one; make it smaller and more of the cleanups happen while you wait, but each one goes faster. It's use-case dependent, but it takes care of a lot of problems, and work_mem no longer has anything to do with it.
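Here's what the two knobs look like in practice — table and index names invented, and treat the numbers as placeholders rather than recommendations:

```sql
-- BRIN on a naturally-ordered log table; pages_per_range defaults to 128:
CREATE INDEX logdata_ts_brin ON logdata USING brin (ts)
    WITH (pages_per_range = 64);

-- Size the GIN pending list per index (value in kilobytes) instead of
-- inheriting it from work_mem:
ALTER INDEX logdata_fts_gin SET (gin_pending_list_limit = 2048);
```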
We also have GiST index-only scans, which showed up just a couple of days ago. Currently they're only supported for box and point — but I see some PostGIS people in the room, hello: we want this for PostGIS, right? It needs opclass support, and more opclasses are being worked on, both internally and externally. I'm not actually sure it's even going to be possible in PostGIS, but index-only scans are a good thing if we can make them work, and the infrastructure is there now — and, as I said, box and point.

We have support for compression of the transaction log. We only compress the full-page images, but if you have full-page writes enabled, they're usually a fairly large portion of your transaction log, and basically you trade CPU for I/O: you pay a CPU cost, you get a lower I/O cost, and you also get less replication traffic — that's the part that affects you. It's still worthwhile to gzip your archive files, though, because all the data that is not a full-page image is still written uncompressed. We also have a new WAL format, which you probably don't care about unless you're writing a WAL parser, and we've changed the CRC algorithm — which really doesn't affect you at all; hopefully it'll just be faster, and it gives us the ability to build more hardware-optimized versions of it.

There have been several rounds of work to improve sorting performance of different kinds. We have abbreviated keys for text sorting — there's a patch in the queue for numeric, and more may be coming — which can speed up sorting a lot. We have a pre-check for equality in text sorting, because a memcmp is really fast while a proper string comparison is surprisingly slow, thanks to locales and, you know, everybody not being 7-bit US-ASCII — maybe you here are, but the rest of us aren't. And there's been a bunch of other work around this too. It's all transparent to you; the only thing you'll notice is that things go faster. That's the best kind of enhancement. And yes, I think I mentioned numeric — if I didn't: numeric abbreviated keys are in the queue and probably going to get in.

There's also been a whole bunch of changes to the low-level locking API. There are probably four or five people in here who have ever looked at that, but: we have a new internal atomics API, we have better, more scalable lightweight locks built on top of it, a number of internal operations that used to require locks are now lockless, and we have concurrent locking for the hash table that manages shared buffers. What this really means is that a lot of things go faster in high-concurrency environments. If you're a single user it doesn't matter much, but under high concurrency you get a speedup for free.

Then there's the stuff still in the queue — because we're in the final CommitFest. Last I checked there were still more than 70 patches in it, so again, please help. We're supposed to clear it out in three weeks; that's not going to happen, but with your help we might get closer. Some are big, some are small; let me mention a few. The first one is actually wrong on the slide, and that's Andrew's fault: we were hoping to get min, max, mean, and standard deviation into pg_stat_statements. It turns out that as of about an hour ago, we have them. I didn't bother updating the slides — I spent the time sending an email to Andrew instead.
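Assuming it lands the way it was committed, the new columns sit right next to the existing ones — something like this, once the pg_stat_statements extension is installed:

```sql
-- Timings are in milliseconds; the min/max/mean/stddev columns are new:
SELECT query, calls, min_time, max_time, mean_time, stddev_time
  FROM pg_stat_statements
 ORDER BY mean_time DESC
 LIMIT 5;
```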
We also have a view called pg_stat_ssl, which will show you a lot of information about SSL-based connections — if I ever get around to reacting to the review comments, which I got about three months ago and still haven't addressed. That's entirely my fault; we hope to get it in. And — you can tell I made these slides before I left Sweden — we have a tool called pg_rewind, which has also been committed; it was committed on Tuesday, so people are working while we're conferencing. It lets you rewind a replica, or an old master, so it can be repointed as a replica after a failover in asynchronous replication. We're also hoping for file-level incremental backup. I wouldn't put too much hope in that getting into 9.5 at this point — hopefully 9.6 then, but maybe we'll get it into 9.5; we don't know yet.

Some of you may have gone to Robert Haas's talk — otherwise, again, go back about five hours — to learn more about parallel sequential scan. It's a tiny, tiny part of the parallelization puzzle: it allows us to run a sequential scan on multiple CPUs at the same time. But what's really important about it is the whole lot of infrastructure underneath it that has been written, which we'll be able to use for operations that are much more interesting to parallelize — aggregates, CREATE INDEX, eventually joins and things like that. It's the first step on a long list of things that need doing. They're not going to be done for 9.5, and not for 9.6 or 9.7 either, but hopefully we can chip away at one per release or so.

We have the INSERT ON CONFLICT patch, which has been around for a long time; hopefully we can get it done this time. It's basically what most of you would refer to as upsert. The syntax we're looking at right now is INSERT INTO a table, and ON CONFLICT either UPDATE or IGNORE. It seems like it should be a simple problem — trust me, it's not. It's a very complicated problem once you start looking at concurrency, at multiple sessions accessing the same rows. It becomes really, really hard: part of it is just deciding how you want it to work, then you have to actually make it work that way, and then you have to find about a zillion corner cases. A lot of time has been spent on it; hopefully we can get it in.

Another big patch I'm hoping for is grouping sets. The patch is there, it sort of works, and there are some arguments about exactly how it should look, but I think it stands a good chance. That one, along with upsert, are the two big ones we're really waiting on. Grouping sets, for those who don't know: the forms most people use and are aware of are CUBE and ROLLUP, and this is the generalized version that follows the SQL standard. Basically, if you say SELECT a, b, count(*) ... GROUP BY ROLLUP (a, b), then on top of what a plain GROUP BY a, b gives you — a count for every combination of a and b — you also get, for each a, a subtotal across that a, with b set to NULL in that row, and finally a row with both columns NULL carrying the grand total. So it lets you compute the totals as well as the groups in one query, and with the generalized version you can do a lot of really funky things. I've never used the general form myself, but I've used a lot of CUBE and ROLLUP on other databases; it's really convenient for reporting queries and aggregating data.
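In query form, what I just described looks like this — assuming the patch lands with the standard syntax:

```sql
SELECT a, b, count(*)
  FROM t
 GROUP BY ROLLUP (a, b);

-- Returns: one row per (a, b) combination, a subtotal row per a
-- (with b NULL), and one grand-total row (a and b both NULL).
```

GROUPING SETS is the generalized form, where you list exactly which combinations you want instead of taking the ROLLUP shorthand.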
There are obviously always a lot more things — a whole bunch of other performance improvements I can't all mention. I know there are a couple of contributors in here: I am sorry if I missed your feature, but you know, I'm already running out of time. So I'm just going to mention one more, because I always like to bring these out. It's a really tiny feature, but it fixes a really big annoyance: if you point psql at a custom-format pg_dump file, it won't spew garbage at you anymore — instead it actually tells you that this is a custom-format dump and that you're using the wrong command: use pg_restore. It's really simple, but it's very nice to have. We shouldn't just build the super-advanced features; we need to build the user-friendly ones as well.

So with that, I say thank you for showing up. And again: if you have any time, any chance at all, go to the CommitFest app — it's commitfest.postgresql.org — pick a patch, take a look at it, build it, see if it works, and let us know how it goes. Just send in a report, even if the report is only "hey, I ran it, and it actually does what the contributor claims it does" — that's good to know, so help us verify that. And then, once we get to the beta version, download it, pound it with your application and your code, and get us the feedback as quickly as possible. Okay, thank you very much. I think I'm out of time for questions, but feel free to approach me afterwards with any questions you have.