Once you have a connection, you want to authenticate to the database, and you want to be able to lock that down. If you set that up with, instead of trust, something like password — which is great, you have some sort of password — it does it all in clear text. So when you make the connection, Postgres is going to say, okay, hi Jim, I need a password from you, and then the client is going to send that password back in clear text. Really easy for anybody looking on the network. If we're here on public Wi-Fi connecting over that, they're going to be able to see your password going back and forth in clear text. MD5 helps with that a bit. What that does is, instead of just sending the password back, the server is going to send back a salt. With that salt, the client appends it to your username and password, MD5s everything, and sends that back. So if somebody's snooping, it's going to be a lot harder. What would happen if somebody was snooping on that? They'd capture that MD5 hash as it comes back, and if they try to use it with a different salt, they're going to get an error, so it won't allow a connection. If you're in a trusted environment, where you're not as worried about the snooping, that might be okay. But by and large it's only a four-byte salt. So after enough logins — especially if you have one web user across an application farm of thousands of application servers all connecting to your database — you're going to cycle through those salts all the time, because they're all connecting with the same password and the same user ID across those thousand containers. You're going to cycle through them really quickly, especially if you have a server reboot or something and they all try to reconnect at the same time, and somebody watching is going to be able to find a repeat. So a better way of doing that is SCRAM. That's the original subject of this overall talk. SCRAM-SHA-256 started in Postgres 10 and is now the default in Postgres 14. It gives you a much more secure way of handling that authentication. So use it, right? It's the default. Don't change it unless you have a really old client that has to connect to the database server and doesn't support SCRAM — and I can't think of one off the top of my head that doesn't support it anymore. So use it. If you notice here, there are a lot more round trips in order to do this. The client asks for the connection. The server says, okay, we need SCRAM. The client sends the user name and some information, including the client nonce. Then the server sends back the server nonce and the salt as it goes back and forth. And then finally the client sends the client proof, saying, okay, everything's ready to go, authenticated. The difference here compared to the other methods, if you look, is that the password is never sent over the wire. Everything is calculated on either end. So you never have to send the password, which gives you a lot more security. If somebody grabs something off the wire, they're not going to be able to get that password and do anything with it. That doesn't prevent other password attacks, like if you set your password to "password" or "1234" — Postgres doesn't help you in that sort of scenario. There is a passwordcheck contrib module that will do different things.
But in order for that to work, you have to send the clear text password to the server, bypassing all of this. So just have good policies for setting passwords, or use other authentication mechanisms. If you have password complexity rules — the traditional ones of making sure you have all the weird characters in it — really, just have something long enough. A nice phrase is usually a better thing than putting exclamation points and dollar signs all over a password that you're going to forget. But if you're having Postgres handle your password authentication for you, it's really incumbent on the clients, as they're setting their passwords, to pick strong ones. You're trusting them with your database, so you have to trust them with the process of setting a good password. If you can't do that — if you can't trust the users that have direct access to your database — you have to push that to something else like LDAP or Active Directory. And something like Active Directory does the same thing: if you have password complexity checks there, you're going to send your password in clear text to the Active Directory server. That's how it works. There's no way around it if you want password complexity checked by a server without sending the password over the wire in clear text at some point in time. There's just no magic. Once you're authenticated, there's a whole bunch of really sensitive data going back and forth over that connection. SELECT star FROM customers — all of a sudden you have a whole customer list coming back over the wire in clear text. If you're doing something like that and somebody's snooping on your connection, they're going to see your entire customer list. Enabling SSL prevents that. Same as when you connect to a website, you want to see HTTPS instead of HTTP — you want the same thing with your database connections, encrypting all the traffic going over the wire. You don't want to set up pg_hba.conf with strong authentication just to have everything go across in clear text so they can see all the data anyway. Defense in depth: you need to do it at each level as it's going across. Now, ways of doing the snooping. On the local host, when the server is down, if somebody has access to that host, they can spin up something else listening on port 5432. There are a lot of things that listen on the Postgres port — probably 90% of the new databases coming out all look like they're Postgres on the wire. It's very easy to set up something that looks like Postgres, and a lot of you who are running bigger installs are probably doing that right now with something like PgBouncer or pgpool; there are poolers out there that do it. So setting up something that looks like a Postgres server on your Postgres host when it's down is not that hard to do. Somebody tries to connect, it says, okay, I need a password, just send it in plain text — got it, it's recording it, and all of a sudden it's collecting your passwords. That requires access to the host, so hopefully you have your database host locked down so somebody can't do that. But if somebody can SSH in, port 5432 is not a root-controlled port; they could go ahead and just spin something up. They can also do network spoofing.
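Before going further on the spoofing side, here is roughly what those authentication choices look like in pg_hba.conf — a sketch only, with made-up network addresses and user names:

```
# pg_hba.conf sketch, roughly weakest to strongest:

# trust: no password at all -- only for throwaway local development
local    all    all                         trust

# password: sent in clear text over the wire -- avoid
host     all    appuser    10.0.0.0/24      password

# md5: salted hash, but a 4-byte salt and an old digest -- legacy clients only
host     all    appuser    10.0.0.0/24      md5

# scram-sha-256: the default since Postgres 14 -- use this (hostssl also
# requires the connection itself to be SSL)
hostssl  all    appuser    10.0.0.0/24      scram-sha-256
```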
On the spoofing side, it's the same thing if somebody is able to hijack the host name — the domain name — so that you're pointing at their host running on port 5432. Maybe somebody has physical access to the router you're connected to; they could plug in there, disconnect the uplink, and lo and behold they can grab those packets and do various nefarious things. I was talking to one of my colleagues, Alex Rubin, who's a MySQL security guy. Security researchers are super clever, right? You'll never think of some of the things that they do, but they come up with these different clever ways to do it. If you think you have it controlled, they'll find ways around it. The simple thing is a pass-through, right? Like I said, a lot of you are using a pass-through today with PgBouncer or pgpool. Those are things you want for connection pooling, and they're transparent to the application, as they should be, as things connect through them. But if somebody has access to that network path and drops in a proxy of their own, it can pass all the right commands down to the server, return the results back, and capture everything along the way. Or even send additional queries: you might not have asked for the user table or the sales table, but lo and behold, that proxy could, running with the permissions of the user you just authenticated as. So without SSL properly set up, you're vulnerable to all these different things that could possibly happen. The default for Postgres, unfortunately, is prefer. Postgres is going to try to connect with SSL, and if the server is not ready for it, it says, okay, we'll just connect without SSL. That's the default if you just set up your database connection with libpq or the JDBC driver or whatnot: let's do SSL if we can. Setting sslmode to require means at least you'll have an SSL connection, but you have no guarantee of who you're talking to. It will be encrypted, but if it's an encrypted connection out to a rogue proxy, that proxy gets everything unencrypted, because it's decrypting everything itself anyway and then passing it along to the server. So that doesn't prevent spoofing. It prevents somebody listening in over the wire, but that's really not going to protect you the way you need if this is your sensitive data — or really any of your data, even your website; you want to protect it from this sort of thing. When you make your connection, set your sslmode to verify-ca at least, so you're sure the server certificate matches one signed by the certificate authority as it comes across. This way — go ahead — yeah, it should match the one that was created on the server. It doesn't have to... okay, okay, go ahead, Bruce. They can do a bit of a proxy via a CA, but that's fine as well. Sorry — so, yeah, don't stress about the private key versus the certificate distinction for this particular talk, for this aspect. The main thing is that the server has to have a private key that it can use to prove that it actually holds the certificate that it's presenting, right?
But at the end of the day, the issue with verify-ca is that it accepts any certificate signed, through any chain, by a CA trusted on the client. So it goes both ways. You have the client side that has a certificate authority that it trusts — it trusts CA number one, right? For the connection to work with verify-ca, the server must prove to the client that it has the private key for a certificate that CA number one signed. Okay, but CA number one might sign 10,000 certificates, and any of those certificates would be accepted by verify-ca. That's the unfortunate part about only checking that the certificate the server presented is signed by a trusted CA. What you really want in order to be secure — and it's probably the next slide — is verify-full, because with verify-full the client actually looks at the certificate and checks two things: that it's signed by a trusted CA, the CA the client trusts, and that the host name in the certificate — the actual name, the CN, the common name — matches what the client is attempting to connect to. The client takes the host name, say abc.com, resolves that to an IP address, and connects to the IP, but it doesn't know whether that server really is abc.com or not. If that server presents an abc.com certificate signed by a trusted CA, then with verify-full everything's fine. But if that server presents an xyz.com certificate, verify-ca would allow it, as long as the certificate is still signed by the CA, while verify-full wouldn't. So with verify-full, you really know that you're talking to the server you intended to. Yeah — I'm just curious, other than that it's larger — I mean, yeah, these days I would argue that thanks to subject alt names and wildcard certificates and whatnot, you really shouldn't use verify-ca, frankly. Historically there were more reasons, because subject alt names weren't as well supported and wildcards weren't very well supported, yada yada yada, but these days I would strongly recommend using verify-full, and doing that on both sides too. What we're talking about here is the client verifying that the server is who it claims to be, but you really want the other direction too, and you have similar options on the server side for saying, okay, what kind of certificate does the client have to present? Can it be any certificate signed by a CA, or does it have to have a particular common name? So you really want to verify as much as possible when we're talking about these things. It gets a little more complicated when you mix in high availability with failovers, and replication if you want to do disaster recovery out to a different site somewhere, where you're adding those things in. So verify-ca gives you a lot of security — though, as Stephen said, it doesn't give you everything — but if that certificate authority is designated just for, let's say, your database hosts, where you're running your own CA for that, you can be reasonably sure it's a certificate that you trust and not just a general one you got off the internet somewhere. So it's about balance and security.
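In libpq terms, those sslmode choices look roughly like this — a sketch, with a made-up host name and CA bundle path:

```
# Encrypts the session but accepts any certificate -- still spoofable:
psql "host=db.example.com dbname=app user=appuser sslmode=require"

# Requires a certificate signed by a CA we trust (root.crt is the CA bundle):
psql "host=db.example.com dbname=app user=appuser \
      sslmode=verify-ca sslrootcert=/etc/ssl/certs/root.crt"

# Also requires the certificate's name to match db.example.com:
psql "host=db.example.com dbname=app user=appuser \
      sslmode=verify-full sslrootcert=/etc/ssl/certs/root.crt"
```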
From a security standpoint, yes, you do want to do verify-full, and then maybe you're switching host names when you're failing over and doing all that stuff, so it adds complexity; when you're doing failovers you care about availability as things come across. So it adds some complexity as you think through this stuff, but have at least verify-ca. Better, use verify-full, but test your failovers and high availability to see how that's all going to work. So, channel binding, going back to SCRAM, which we talked about through the authentication. What this does, with channel binding, is prove to the client that the server is who it says it is as you're authenticating. You do the normal things of not passing the password back and forth and sending that client proof to the server, but then the server sends back a server proof to the client, to prove that the place you're authenticating to is the right one. Did I get that right, Bruce? Okay, right, right. Compared to the previous SCRAM slides, there's one other trip back to the client — the server doing a server proof, saying, I am who I say I am. Thanks, Bruce. So again, these are all different layers: you know who you're connecting to, and the database knows who's connecting to it. This is what Stephen alluded to: SSL certificates for authentication, being able to authenticate using certificates, with the common name defining who you are as it's sent over. Instead of going back and forth with something like SCRAM, you can do that all through certificates. If you have some sort of automated job, then instead of putting a password in something like a .pgpass file, certificates are a great way for automated jobs to authenticate. This is an Enigma device on the slide, by the way. Cool. So, this is more about preventing theft once you've actually saved data into the database: it's stored somewhere physically, and how do you make sure that if somebody gets hold of those volumes and walks away with them, they don't suddenly have all your data, bypassing everything you put in place? Or somebody walks off with a backup. Disk volume encryption is a simple way to do it, whether you're using LUKS or something like that to encrypt the entire volume. There are lots of tools out there to just encrypt the volume, so that if they walk away with the volume, they don't have the keys — don't store the keys right there on the same machine with it, otherwise they can walk away with the encrypted volume and the keys alongside it, and away they go. So, yeah, key management becomes a really important part of this. You can also do column encryption. If you want to encrypt a sensitive column, like that credit card number, Postgres has tools to do that with pgcrypto and things like that, but that's a trade-off too. If you're encrypting the column, how are you going to index it, if you want to be able to search by something like that? It becomes more of a challenge. So yes, you kind of want that encryption — everybody wants that encryption — but it becomes a question of performance versus security. Sometimes you do need the security on individual columns.
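As a rough sketch of what that looks like with pgcrypto — the table, the column names, and especially the key handling here are made up for illustration; in practice the key should not live in the SQL text or on the database host:

```sql
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- Encryption (two-way): you can get the value back if you have the key.
CREATE TABLE payment (
    id         bigserial PRIMARY KEY,
    card_last4 text,    -- kept in the clear so it can be searched
    card_enc   bytea    -- output of pgp_sym_encrypt()
);

INSERT INTO payment (card_last4, card_enc)
VALUES ('4242', pgp_sym_encrypt('4111111111114242', 'not-a-real-key'));

SELECT pgp_sym_decrypt(card_enc, 'not-a-real-key') FROM payment;

-- Hashing (one-way): good for things like passwords you only need to verify.
SELECT crypt('s3cret', gen_salt('bf')) AS stored_hash;
```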
There's a difference there between hashing that and encrypting it. Hashing is one-way, so passwords are a good example: you just want to verify them. Versus data you'll need again sometime, like that credit card number — you'll need to look it up and process the next transaction as it comes through, so you need to be able to decrypt it. More than likely you're not searching by the credit card number; it's usually something like the last four digits that finds the right record. You're not looking up the whole number — and if you are, you probably designed your application wrong. This is data you want to be able to use, not search by. But, as alluded to, where do you store that key? You can store it on the server. If it's on a separate volume or something along those lines, with its own security around it, that does give you some protection, but if they walk away with the server, they have everything. And you're sending that decrypted data over the wire. You can store it on an intermediate server that you connect through. There are some third-party applications that do this, layering over top of Postgres and adding that security layer: the application talks to it in clear text, and it handles the encryption as data goes back and forth from the database. So what's stored in the database is encrypted values, and the intermediate layer handles the key. It's separate from the database, so if somebody walks away with your database server, they've got a bunch of garbage that is hopefully impossible for them to crack. You can do it on the client: the client sends the key to the server and gets unencrypted data back over the wire — you say, hey, decrypt this for me with this key. That's one way of doing it, and from an application developer standpoint I see lots of web app developers doing something like this. It's not necessarily the most secure way, because you're distributing that key to all your application servers, but it really does prevent DBAs and the like from digging into that server data. You can use a client hardware device to store the key, putting one on each of your servers — though in a cloud world, where you can't plug things in, that doesn't really work, and container worlds make it even harder. There are key management services out there to do this, which can even audit whenever something goes and grabs that key. Things like that make it a lot simpler; there are a number of them out there for key management, if you want to do it on the client side and have the clients pull the key down from that service. The other thing, as I'm thinking through encryption: you want to encrypt your data directory, but you also want to worry about your logs. Postgres is very chatty to its logs. Things like ALTER USER to reset a password — if you have a typo in there, it's going to end up in your log. Or if you're logging all your statements, auditing everything, it's all going to end up in your logs.
So there's a lot of sensitive data in your logs, whether it's passwords or just user data — your customer IDs are going to be in your WHERE clauses if you're logging your queries, right? That's something you also want to protect from an encryption standpoint. If you're encrypting your data volumes and you're keeping your logs somewhere else, you want to encrypt those too. So when you're thinking about the various layers — if somebody has access to this, how could they get data they're not supposed to? — you want to do it at every one of those levels. Thank you, Bruce, for the slides, and thank you, Bruce and Stephen, for helping out with some questions. Appreciate it. Unfortunately, I couldn't go as deep on SCRAM as Jonathan would have; he does that talk pretty regularly, so if you attend another Postgres event I'm sure you'll have an opportunity to see the deep dive on SCRAM. But thank you, everybody. If you want to play with it, to understand the process on your laptop, that's cool — use one that you created. Yeah, you can create a CA and have it sign another certificate for you, and then you put that CA into a safe and a vault and lock it down really hard. CAs are a lot harder to get around — you can, but it's a lot more difficult to compromise a CA than to get hold of a certificate like that. But you don't want to use self-signed certificates directly as the certificates for your actual database service. You want a separate CA signing them, and yeah, that does the job for that part of it. And if you're only using that CA for a couple of database servers, then when you're doing something like verify-ca you can be reasonably sure what you're talking to — versus verify-full — because you have a lot of control over those certificates in a trusted environment. But you have to be able to protect that CA. So, we're talking about encryption here, and what it boils down to is what attack vector you're worried about. Very often people are concerned about DBAs, or the database being stolen, or something like that. If the attack vector you're worried about is the database server itself being compromised, then the only way to prevent that is to have the database server never see the unencrypted data — or the key. That's where the last setup here — either this one or the one before it — makes the most sense to address that attack vector, because here you see it's the encrypted data that's being sent. The database server never sees the unencrypted data; it only ever sees the encrypted data in this particular setup. And the database client is what actually takes that encrypted data and decrypts it — in this case using a hardware token, but it could also just be a key installed on that client system. Now, I would argue that's typically your best answer, because at the end of the day the client system is going to see the unencrypted data somehow; it has to. So it's a lot better to have the data decrypted as late as possible in the overall process, because then throughout much more of the system the data is encrypted and not visible. And you don't have any choice on the client, right?
It's going to see the unencrypted data because it's got to do something with it — unless it's just an intermediate proxy, right? You can imagine that you could theoretically have some JavaScript code running, and then there's a web server, and then there's a database server, and if you can get to the point where only the JavaScript on the client side knows the key and does the decryption, great. That's fantastic. It's not typically how the world works today. Unfortunately, most of the time the app server has to present the data to the web browser in decrypted form — maybe through a tunnel, but still ultimately in decrypted form. Actually, I take some of that back, because there are some JavaScript systems — if you use Braintree, or some of the other ones like PayPal — that have JavaScript code that will do the encryption and decryption of tokens for you inside the actual client before sending it through the credit card authorization system, I should say. But that's really your best answer: as close to the human as possible, I guess, is the short answer — that's where you really want to do the encryption and decryption. Everything else ends up being a compromise. So very frequently people do use what's called a KMS, a key management system. Rather than having the key stored on the app server, it's stored in a key management system. And yes, absolutely, that's a good solution, because that key management server can then be much more tightly secured. A lot of those key management systems have things around them like access times and frequency of access, and you can tune things so that if suddenly a whole bunch of keys are being requested by one client, the KMS can say, whoa, wait, something doesn't smell right here. Hmm? Yes — HashiCorp Vault is one of those KMSs. There are others, but yeah, there are definitely ways to do that, and using a KMS makes a lot of sense. But again, you really want the KMS communicating with the app server, so that the app server is where the decryption happens, again as late as possible. Any other questions? Thanks, everybody. It works. Okay, hello everyone. Great to be here. My name is Bruce Momjian. It's my second talk of the conference, and my eighth conference since COVID, so hey, hey, hey. Yeah, I've got one coming up in Brazil next month, and then Israel, Germany, and New York. Yeah, New York — New York at the end of September, so something to look forward to. What's in New York? The conference, what? All right, go on. What are the dates, somebody? September — I think it's the 22nd and 23rd, isn't it? A dedicated Postgres conference. And I'll be there, my wife will be there, and a lot of other Postgres people will be there. How many people are going to be there? Let's see. All right, cool. So maybe we'll get a couple more. I don't know what the next West Coast event is. Mr. Wong, can you help us? Just East Coast — well, it is a coast, I'm sorry. There you go, okay, good. So yeah, we're pretty busy now. We've got a lot of stuff coming. I missed a whole bunch of European one-day conferences.
The past couple of months — everyone was waiting for COVID to end, and then everyone just stuck their events in there, and I couldn't go to any of them, but we had very good participation, so that's great. So yeah, my name's Bruce Momjian. This is my website right here; the slides that you're looking at are on that website, so feel free to download them. I'd be curious if anybody could pull up that QR code from the back — I don't think you can; I think it has to be a lot bigger to actually scan that little square. Stephen, you should be good. Anyway, I've been with Postgres since 1996 and with EDB since 2006. My primary job is actually to work with the community, and I like doing that, and then I spend a little bit of time on EDB stuff like customers and strategy and all that fun stuff. But today we're here to talk about Postgres 15. This has been in development since July of last year, and we closed development sometime around April of this year. I wrote the release notes and put them online; I'll have a link to those release notes in a minute so you can see exactly what's going on. We're looking at about 180 features, and I picked just the top seven that I think are the most important. I'll give you some background on why I think they're important and some of the history of how we developed them, and then I'll obviously take your questions whenever — just feel free to ask them. But there's a lot packed in here. There isn't one single headline — this is generally either a problem with Postgres or a blessing. We're not a single-trick, single-focus group. We have a distributed team all over the globe, and they're all focused on different things. The things people focus on in Asia are different from Europe and Latin America; different countries have different focuses, different companies have different focuses. So there isn't one thing that usually distinguishes a release. It's basically a set of either initial features or enhancements to existing features. And because you're putting them out once a year, you get major momentum even over a couple of years. In fact, I see the gentleman here who asked a question about foreign data wrappers in Postgres nine-something — have they improved since, in 14? We had the panel discussion and we all laughed, because it's like night and day. He was saying it was too slow in 9.4 or 9.6, whatever he was running, and we're like, wow, it's night and day — what we have today for foreign data wrappers is vastly improved over what we had before. So instead of a big bang every three to five years, it's this cadence that keeps coming. Any questions first? Probably the big feature here, the most visible one, is MERGE. I don't know how many of you have heard of MERGE. It's an SQL-standard command, designed effectively to allow you to take data from one table and merge it — hey, see that word there? — merge it into another table. So imagine you have a batch job and you want to take new data and merge it into the existing stock data that you have. Maybe it's a data warehouse. Use the MERGE command, and it basically merges that data into your existing data. This has been often requested, and it's been in development for, I want to say, seven or eight years.
And there's a reason why it was seven or eight years. As we started to look at it — and I think we started to look at it about 10 years ago; I'm thinking back to some PGCon talks I heard in Ottawa — we started to look at the semantics of the MERGE command as defined by the SQL standard, which we would have to follow if we called it MERGE. If we call it something else, we don't have to follow that standard, but if we call it MERGE, we do. And there were some odd behaviors in how the MERGE command operated that we didn't really like. We thought it was suboptimal; it wasn't as clean. There were cases where the operation could abort unpredictably. So we looked at it and said, we understand the SQL standard has that, but this is really not the feature we want to give to our user base. So we started to look at upsert, which I think MySQL and a couple of other databases had: the concept of doing an insert and then an update, which, you can imagine, is kind of like a merge. But the big difference between the two is that while MERGE is effectively set-oriented — it thinks of a batch of rows coming into an existing table — INSERT ... ON CONFLICT is row-based. You define, for each row in my set — could be one row, could be a whole table — how I am going to merge that data in. And that's where we came up with the INSERT ... ON CONFLICT syntax. It's the insert that we've always had, plus this ON CONFLICT clause, which I believe we've also had for seven or eight years now. And the difference between the two is kind of interesting. INSERT ... ON CONFLICT really doesn't have abort behaviors that I can think of. It doesn't arbitrarily say, oh, I can't merge it and I've got to stop, which unfortunately the MERGE command can. And it's also fairly easy to define what happens when there's a conflict: it says ON CONFLICT, and you can say do nothing, you can say update — there's a whole nice syntax there. But still, even though we had INSERT ... ON CONFLICT, there was demand for MERGE, for two reasons. The first reason is that MERGE is SQL standard and some people had been using it. But there's actually a third reason, and that is that the MERGE command is join-oriented. Which means that if you're loading a whole batch of data into an existing set, you're not doing it row by row internally, the way INSERT ... ON CONFLICT does, and there are some behaviors that are a little different. When you start to see what MERGE does, it is a join-oriented operation: you basically perform a join between the original data and the data to be merged, and then you bring it in. Very different in behavior from INSERT ... ON CONFLICT, which is basically just saying, okay, I've got a pack of rows, let me bring them in one by one. So it is a little different. The other issue is that INSERT ... ON CONFLICT requires some kind of unique index. You need a unique index to perform INSERT ... ON CONFLICT, and most people didn't have trouble with that — they were merging data, so it wasn't a problem — but it is a limitation, and there are cases where you'd want to merge data in that don't necessarily involve a unique index. Again, when you're thinking row-oriented, it makes perfect sense that you would require a unique index.
But if you think of it as set-oriented, there could be cases where you want to join in data that may not have a conflict, and you still want to join it. So you basically join the data together and then define how it comes in. Again, one of the weird behaviors of MERGE is that if there's concurrent activity on that table while you're merging, you can get a failure. That's what I was talking about — the idea of unpredictable aborts, unpredictable failures of the query, which nobody likes — but because of the way MERGE is set-oriented, you get that behavior. Let's look at some examples; I'm talking too much. So here we have a table called test with two columns, x and y — I get an A for creativity there. And then I insert the numbers one, three, and five. Three odd numbers, okay? Then I'm going to show you the INSERT ... ON CONFLICT syntax, where we're now inserting one, two, three, four, five, six. As you can see, the one, the three, and the five are going to conflict, and the two, the four, and the six are not. Pretty easy. And you see the unique index requirement right away: you get an error right here, because there's no index to conflict on. It just says there is no unique or exclusion constraint matching the ON CONFLICT specification — we can't do it. So it just fails. Then we create a unique index on the first column, x, and we do it again. We say insert into test the values one, two, three, four, five, six, and if there's a conflict, update the second column to true. If there's no conflict, we insert the value; if we have a conflict, we're not going to make a second version of that row — we can't, because we have a unique index — so our alternate behavior (this is not MERGE, this is what we've already had with ON CONFLICT) is to basically just set the y field to true. And you can see the results right down here. For one, for three, for five, we have not added another one, three, and five here for the insert; we've merely set the second column to true. The nulls you see here are there because we effectively haven't changed those — remember, we didn't specify the y column, so there's nothing in that field, just a null. Those of you who were in my null talk yesterday know exactly what I'm talking about: when you don't specify a column, it just becomes null. Any questions? Okay, great. Okay, let's look at MERGE. So we delete what's in the table again and put one, three, five back in. Now we're going to use the MERGE command, and you can see how it's quite a bit different. We start with the word MERGE, not with the word INSERT, so it's a new command entirely. We name our table, test — I should probably use my little pointer here. Then we're going to do one, two, three, four, five, six, but I actually have to use a VALUES clause here, and we name the VALUES clause m and we name the column x. Because remember, we have to do a join — this is not a row operation, this is a join operation. So we don't have to have the unique index, because we're going to join on the columns we want to join. It's kind of cool, right? I'm actually creating a virtual table right here, and I'm calling it m, for merge, and I'm calling the column x. Pretty easy.
And then my next line is ON, and then test.x equals m.x. This is my join right here. If you've ever seen the ANSI join syntax — INNER JOIN ... ON whatever clause you put there — that's exactly what we're doing here. And then you've got these curious clauses down here. This one is very simple — you can do a lot more with it than I'm showing you, but I just want to get the concept across. We say: when there is not a match in the join between test and m, we are going to insert the value — very similar to what we did with the insert; remember, if there's no conflict, the insert happens. If there is a match, meaning there's a conflict, we're going to set the y column to true. And look at this: we get the same output. Isn't that cool? But you can see how different this is, how it's pulling this together in a join way, a different sort of concept. So I'm excited about this. Simon Riggs had a really good talk at Postgres Vision in June — you can watch the recording — where he just talked about MERGE, and he goes into a lot of detail. There are WHERE clauses you can put on the WHEN, and very sophisticated stuff that you really could not do with INSERT at all; INSERT ... ON CONFLICT doesn't have that smarts in it. So I'm kind of excited about it. It's going to take a while, I think, for people to really understand when to use each one. But again, internally what's happening is that INSERT ... ON CONFLICT is really doing an insert, and then if it conflicts it does something else. MERGE is not doing that. It's not starting out assuming an insert; it's basically doing a join and then processing the data. So this is, I think, a lot more applicable to something like a data warehouse case than to OLTP. In OLTP, you're probably just going to stick with INSERT ... ON CONFLICT, because you're probably just doing one row and you just want a very clean API. But this can be a very useful one, and in some ways more efficient, because you're not doing all these individual inserts; you're basically creating a join result and then handling it that way. Any questions? Was that confusing? Does it work? Okay, great, thanks. It took me a while to process: why do I want this thing? Why do people keep asking for it? It's kind of embarrassing, but I remember a similar case when we added the range types. I don't know how many of you know about Postgres range types. I remember talking to Jonathan Katz, who should have been speaking here — I'm replacing Jonathan, right? Or he was previous. Jim replaced Jonathan, I'm replacing Sarah... Mark? We don't know who I'm replacing. Anyway — yeah, I got you, okay. What's interesting is he's like, there are these range types, you can put two fields and combine them together, and I'm like, well, why would you do that? Just make two columns. And he's like, no, because you can create sophisticated indexes across the two of them in ways that you can't for a normal single column. I'm like, oh, that's really cool. So again, this is the kind of case where, when I started out, I thought, oh, this is just another INSERT ... ON CONFLICT with different syntax — and it's not, really; it's really different. It's going to take a while for people to understand when to use this and when to use the other. Our documentation is pretty good on that, but I'm always open to improving it, and I'm always looking for doc improvements. That seems to be my thing for a while.
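Put side by side, the two approaches from the example above look roughly like this — a sketch of what was walked through, with the column types assumed since they weren't spelled out on the slide here:

```sql
CREATE TABLE test (x integer, y boolean);
INSERT INTO test (x) VALUES (1), (3), (5);
CREATE UNIQUE INDEX ON test (x);

-- Row-oriented: try the insert, and on a conflict do something else instead.
INSERT INTO test (x) VALUES (1), (2), (3), (4), (5), (6)
ON CONFLICT (x) DO UPDATE SET y = true;

-- Set-oriented: join the incoming rows against the table, then act per match.
-- (In the talk the table is reset back to 1, 3, 5 before running this.)
MERGE INTO test
USING (VALUES (1), (2), (3), (4), (5), (6)) AS m (x)
ON test.x = m.x
WHEN MATCHED THEN UPDATE SET y = true
WHEN NOT MATCHED THEN INSERT (x) VALUES (m.x);
```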
Okay, second: SQL/JSON functions. We've been at this JSON thing for at least 10 years now — it started with standard JSON, then JSONB, then indexes, the GIN indexes, then SQL-standard syntax, which is what we're doing here. And we just keep enhancing it, to the point where the SQL standards committee liked what we did so much that they made an SQL standard for JSON, and this is us adding that, to some extent. So we've added a whole bunch of new SQL functions that make it cleaner to do JSON in SQL. I don't know how many of you have done JSON in SQL — I have some talks about it — but the API between SQL and JSON has not always been super clean. In the early years it was a lot of very sophisticated function syntax, and you'd have to put special stuff in quotes. It was just awkward; it didn't read cleanly. So one of the ideas the SQL standards committee came up with is a cleaner design for doing JSON, making it look a little more like SQL. We now have the ability to create JSON scalars and objects and arrays using a standard syntax instead of a function call with weird syntax. We have tests for the type of a JSON value and for the existence of certain pairs in JSON values. And probably the last one is the most interesting: you can take a JSON document and turn it into a table with columns and rows, which is kind of crazy, but you can understand why that would be really useful. If you're interested in more of this, Oleg Bartunov had a good talk in June at the Russian conference, and I think he also did it at Vision. So look for Oleg Bartunov — the Russians have been pretty instrumental in a lot of this JSON work and the non-traditional data types in general, and they continue to provide great solutions in that area for us. Any questions there? Okay, logical replication. I said earlier that we don't really have one huge feature; a lot of times it's a lot of incremental improvements, and because they come out every year, it's not so bad — you only have to wait a year or two and you get a really cool feature. Logical replication, again, we've had for 10-plus years and we continue to refine it, very similar to the way we continue to refine partitioning and, as I said, foreign data wrappers and JSON. There's a lot of getting the initial feature in and then continuing to enhance it as we go. Frankly, commercial databases have a tendency to plop in a big feature and then stop — you see that a lot — or they may plop in a big feature and improve it once, and then it just stagnates. Because we're an open source project, there's a lot more dynamism. That's why psql is so cool: every year it seems to get some new thing it can do, and logical replication is no exception. These are not super amazing improvements, but again, they're part of that path we continue on, and as we talked about in the ask-anything discussion yesterday in the last session, this logical replication work is part of a larger process of being able to handle upgrades more smoothly and to do distributed Postgres more smoothly. So it's part of that entire ecosystem we're building on.
It's not clear sometimes where we're going, but as we build it out, new opportunities appear, and then every year we add to that. So, one of the problems we used to have was that every individual table had to be designated as published, as part of a logical replication publication. You'd have to publish this table, and then this table, and then this table. As you can imagine, if you're creating and removing tables all the time, that's kind of a pain, because you're creating the table and then setting up a publication and then playing around with subscribing to it and all that other stuff. So one of the cool things in this release, I think, is that you can now publish an entire schema, and any future tables in it also get published. You just give the publication the schema name, and then anything that gets created in the future in that schema is also part of that publication. That obviously makes it easier for developers. Another one which I think is pretty cool is that you can now control which rows get sent, because previously you'd have to publish the whole table. If you squint, you can kind of see how these things allow you to build out more sophisticated systems — particularly the WHERE clause. You could effectively almost think of it leading in the direction of bidirectional replication for a single table: you could have some rows, with one WHERE clause, published this way, and others, with a different WHERE clause, published the other way. Now, we're not saying we're going there, but it's something that now becomes possible, because you're not necessarily publishing the whole table — you can control which rows get sent. You can control which columns get sent, too. So you could start to slice up a table this way and send different things, or maybe not send columns you want to keep secure — that's a great use for it. Yes sir, I'm sorry? The first one — publication of the entire schema, including future table additions — how would that work? Right, what do I mean by that? So, a schema — yeah, this is maybe confusing. Relational systems use the word schema in two ways. In one sense, a schema is the layout of a table: when I say create table test, x integer, y integer, then x and y are the schema of the table. But relational systems also have the concept of schemas as collections of tables; they're called schemas. We have a public schema by default in Postgres, and by default things get created in that public schema. In fact, I have a blog entry — my website has something like 650 blog entries — and one of them is called, basically, schema schema, cluster cluster, because we use the word schema two ways and we use the word cluster two ways, and it's very confusing. But that's the terminology we unfortunately have. When I say publish an entire schema, I mean: if we have the public schema as a collection of all the tables we've defined in that database, then instead of having to mention each table for publication, we can just say, let's publish the entire public schema to another server. And if somebody adds something to the public schema later, that new table will automatically get picked up as published to the other system. Yeah, the terminology is not good.
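In SQL, those publication options look roughly like this — a sketch; the schema, table, and column names are made up:

```sql
-- Publish every table in a schema, including tables created in it later:
CREATE PUBLICATION sales_pub FOR TABLES IN SCHEMA sales;

-- Or publish a single table, sending only some rows and some columns
-- (publishing inserts only, to keep the sketch simple -- with updates and
-- deletes the filter and column list have to cover the replica identity):
CREATE PUBLICATION eu_orders_pub
    FOR TABLE orders (order_id, customer_id, total) WHERE (region = 'eu')
    WITH (publish = 'insert');
```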
Also, there's a way for subscribers to skip specific transactions, so if there's a big data load or something you don't want to send over, you can specify — I believe it's a transaction identifier — and it won't apply it. There's some purpose for that; I'm not quite sure. Support for prepared transactions — prepared transactions meaning prepare and commit, if you've ever seen that two-phase system — that wasn't replicated before, and now it is. Suppressed replication of empty transactions: I thought this one was trivial, that it was just fixing somebody who does begin and commit. But no — it turns out we were replicating transactions where, even if you didn't touch any of the replicated tables, we still sent something to the other end, and we're like, why are we doing this? So that's gone. And another one: if there's a replication error, subscribers could potentially get into an error loop, because they keep trying to apply the same thing, so now you can error out of that particular problem. So you can see this is kind of an infrastructure play; you're basically building out your army toward a goal. I'm not quite sure which goal — I don't do a lot with logical replication — but I know that, as I said with foreign data wrappers, when we started out it was pretty limited, and as it grew it became more and more capable. In fact, the gentleman yesterday said, I tried foreign data wrappers in 9.6 and they were too slow and I stopped using them. If he tries them now, I'm sure he'll find it's much better. Same thing with logical replication: you may have tried it for some sophisticated case and it didn't work; now it will. It's that kind of thing. And then that builds the infrastructure, so that two or three years down the road you can start doing something like bidirectional replication, which we didn't have — but because we're building the underlying parts, that becomes an option you wouldn't have had without all this. There are much smarter people than me who understand where we're going; I'm just the guy who has to figure out what we've added for the release notes. But it is amazing how, when we need something, it's kind of already been there for us — somebody's already done it. I've been working on sharding, talking about it for a bunch of years, and it's a lot of incremental improvements that people have been making over the years — in partitioning, in foreign data wrappers, and in parallelism — that together have gotten us where we are, and I think that's going to take us along, to much larger sites and larger data options than we have now. It wasn't just one thing; it was a lot of these little things that smart people did. Questions? Great, okay. Compression — this is an acronym smorgasbord here. It took me a long time to get my head around exactly what was going on. As you can imagine, compression, particularly of backups, was not something we focused on as much before as we have in this release. We had some very rudimentary compression of pg_dump: when pg_dump wrote its results, you could compress that. But as you can see, a boatload of things have come in this release. The first one is LZ4 compression. We had gzip compression for pg_basebackup, which is what we use for taking a base backup of the directory for continuous archiving, but now we can use LZ4, which is far superior to gzip in a lot of ways for this particular case.
So the first one is talking about the compression of the protocol over the wire, okay? The second one is talking about the compression of the files we store on disk — not the wire protocol, but the disk storage. And there we've added the ability to do LZ4 and something called Zstandard, or ZSTD if you've seen that before. Zstandard is actually in many ways better than LZ4 — we've had some discussion over why anyone would want LZ4 if we have Zstandard, and I'm not going to go there. We could do gzip before, but now we can do LZ4 and Zstandard for the base backup files we store on disk. We can also, in pg_basebackup — which is for continuous archiving — control whether the compression happens on the database server or at the client. There's a new syntax for it: you specify a client- or server- prefix. So you can do the compression on the server and then send it over the wire, or you can wait for it to come to the client and compress it there, depending on where you want to use the CPU. pg_receivewal is another tool we have; we already had gzip there, but now we have LZ4. And, completely unrelated: full-page writes, which are something that allows Postgres to be reliable. We already had pglz there, a very old compression method — can somebody remind me what LZ stands for? Lempel-Ziv, thank you, I knew it was one of those things. So now we have LZ4, which is superior to pglz, and Zstandard for compressing those writes. So this is a lot of acronyms, and it took a while for me to get my head around exactly what we did. It's kind of interesting, I thought. When I started to actually work on this talk, I had to go back to the release notes to make sure I had those right, and then I wasn't sure those were right, so I had to ask the community members, and they had to tell me, no, it's this — and that's how it worked forward: I fixed the release notes, then I got it into my slides, and I'm like, all right, this is right. I'm not sure I'm explaining it well, but obviously I'm excited to see we've really gotten serious about this, because before we had barely touched it. Okay, granular permissions. I don't know what that stray character is doing there on the slide — I'm going to have to fix that, excuse me. Granular permissions basically allow things like a view to be run with the permissions of the user running the view, because up until now, if you created a view, anyone who accessed the view would be doing it as the owner of the view — which is good; it's like an escalation. You give people permission by creating a view, and even if they don't have permission on the table, if they have permission on the view, they can view the data. But some people wanted the ability to not have that happen. No problem — we allow that now. You can also grant the ability to change server-side variables. That's kind of interesting. And we've added a new predefined role for checkpoint permission. I know Stephen, in the previous talk, talked about the idea of not using superuser for everything but giving people individual permissions, right? So instead of having users become superuser to do everything, you can say: I'm going to create an ordinary user who has checkpoint permission, or an ordinary user who has the ability to write to the file system, or an ordinary user who has permission to view all tables to do backups, right?
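A quick sketch of what some of those look like in SQL — the view, table, and role names here are made up:

```sql
-- Run the view with the permissions of whoever queries it,
-- instead of the view owner's permissions (new in Postgres 15):
CREATE VIEW active_customers WITH (security_invoker = true) AS
    SELECT id, name FROM customers WHERE active;

-- Hand out narrow capabilities instead of superuser:
GRANT pg_checkpoint TO ops_user;                 -- may run CHECKPOINT (new in 15)
GRANT pg_read_all_data TO backup_user;           -- may read all tables for backups
GRANT SET ON PARAMETER work_mem TO app_admin;    -- may change this server variable
```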
You're then segregating ordinary users with these special permissions, and you're not having to give the overall permission to the person who logs in. That's really nice. Okay — any questions? Very good. Another huge grab bag for me is memory. We've gotten pretty serious about this. I thought this was something we really couldn't improve much on — that it is what it is, we've done the best we can — but no, there's a bunch of things we can do. First, our sorting algorithm goes back to Knuth, right, the guy who wrote all those computer science books. But the state of the art has actually improved; things are continuing to improve. So for example, I didn't think sorts could really be improved, but it turned out that sometimes, when we were spilling sorts to disk because they were too big for work_mem, we were doing it in an inefficient way. So we improved that. We also improved the performance and reduced the memory use of sorts that don't spill to disk. So the sorts that spill to disk are better, and the ones that don't spill are also better. Again, I didn't think we could do a lot — I thought this area was settled, that we'd done the best we could — but it turns out people continue to find things that improve it, and some of the numbers I saw were 20%. I was like, wow, I'll take that any day. Another thing — this third item is actually one of the biggest in my mind, though it's very hard to understand the value. I've been with Postgres since 1996, and I guess around 2002 or 2003 we created something called server statistics. It actually has a new name now — what is it, Stephen? What's the new name? Cumulative — the server statistics are now called cumulative statistics; there's a new term in the docs. I guess I know that because it's a docs issue. We've always had this server statistics concept. I'm not sure how many of you were in the talk yesterday where there was a question about how you would know when there's an index you should get rid of — does anyone remember that question? The answer was: you go to pg_stat_user_tables, or pg_stat_user_indexes, and you can see how often the index has been accessed. Well, those statistics have been around in Postgres for almost 20 years — 18 years, whatever. And the bizarre thing is that the infrastructure we used for that, up until this release, was effectively a statistics daemon. If you've ever looked at your processes, you'll see a statistics collector daemon. And we used UDP packets — if you're familiar with UDP, as opposed to TCP, it's lossy; packets can possibly get dropped. We would send UDP packets to this collector, the collector would accumulate the statistics, and then it would write a file to the file system. And if somebody asked for pg_stat_all_indexes or one of these views, the backends would read that file from the file system and return the result. Now, I've never really talked about that before, because I didn't think we could make it better. That's a really weird architecture, right? It's doing a whole bunch of things that just seem wrong. It's using UDP packets, which are lossy, and creating packets on the network to communicate every insert and every delete. And then this thing is writing it all to the file system so other processes can read back what it's just written.
Just really not a great architecture. So, thankfully, Andres Freund, who works for Microsoft, got a bee in his bonnet or whatever, and he just said, we're gonna fix it. And I remember the commits that came through for this thing were like — I don't know where this is going. Like, I think I know where it's going, but it's heady stuff. I mean, you can imagine: this whole thing is being re-architected, and I'm seeing these patches that are humongous coming in, and I'm like, more power to him. Like, go, please fix this. And he did it. He started, I think, in October and finished in like February or so. It was this long slog of just these massive commits. He's like, I'm gonna rewrite this thing and make it memory-based instead of network-slash-file-system-based. But he had to restructure a whole bunch of stuff. God bless him, he did it. And now we don't have that wacky thing anymore. We basically have runtime server statistics in shared memory, which is where they should have been all along. There are no more UDP packets, and there's no more writing to the file system. So the data's more accurate, there's less network I/O, there's less file system I/O — everything is much better. It's very hard for ordinary users to get excited about this, but as an architecture person, I'm like, you know, it's sort of like you've got a door that's been creaky and hard to open for like 10 years, and all of a sudden a carpenter comes and actually fixes it, and the next day you're like, oh my goodness, this door opens like normal. It's been 10 years. I wish I would have done this before, right? So anyway, that's how I feel about that one. Any questions about that? Great. Okay, this one is really interesting and kind of — I don't think it's controversial, but basically Postgres has various ways of doing in-memory operations, like sorting and hashing and nested loops and so forth. What we found over time is that hashing is quite a bit more sensitive to memory and, let's say, quite a bit better than sorting. If you spill to disk on a sort, you kind of can recover; it's not a huge problem, because all the data is ordered, so you can do one group, throw it away, then pull out the next group that you sorted, and so on. You can't do that for hashing, because hashing requires you to look at all the buckets at once. So what we did in this release is we actually allow hashing, by default, to use twice as much memory — twice as much work_mem — as other operations. So we've kind of put our hand on the scale for hashing and said we're willing to allocate twice as much memory to a hash, because hashes are in many ways faster and more sensitive to being spilled to disk than a sort would be. And we're gonna see how that plays out, but I think the initial results are good. We added a multiplier for hashing a release or two ago, and in this release we're actually using it and saying, okay, now we're gonna be more aggressive in giving hashes more memory. It's a difficult situation — you're having to judge which one is the better solution — but I think it's gonna turn out well; I'm optimistic. And we also have new server variables, again to make memory more visible: you can now see how much shared memory is being used and how many huge pages are needed, which was sort of an opaque thing before.
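A quick sketch of the two things just mentioned — the hash memory multiplier and the new memory-visibility variables (these are the variable names as I understand them from the release; treat the block as illustrative):

    -- Hash-based operations may use this multiple of work_mem (the default is 2.0 in 15):
    SHOW hash_mem_multiplier;

    -- New runtime-computed variables exposing shared memory sizing:
    SHOW shared_memory_size;                  -- total shared memory the server has allocated
    SHOW shared_memory_size_in_huge_pages;    -- huge pages needed at the current settings
    -- The same huge-pages figure can be asked for from the shell while the server is
    -- stopped:  postgres -D <datadir> -C shared_memory_size_in_huge_pages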
Before, I think we just had some command in the docs that you could run from the command line. Now these are SQL level — you can just query them and see right away, this is what I'm using — and that's gonna be useful if you're sizing shared memory for a system or if you're trying to allocate huge pages. In fact, oddly, before the server starts you can invoke Postgres and ask how many huge pages it would want, without actually starting it, and then you can go allocate those huge pages and bring it up. That was a tricky thing: we allowed a server variable to be queried before the server started, because we know the server would otherwise fail to start when there aren't enough huge pages, right? So you can ask, here's what you're going to need, then go allocate that, and try it again. So there's the usability stuff. Number seven — this one is kind of interesting to me and it's a fun one, so yeah, I always end with a fun one. It's not gonna set the world on fire, but I think it's a great example of where we're focusing on usability. As you may know, when we created the COPY command, we had a HEADER clause. You specify the HEADER clause kind of in parentheses like that, right? And up until now, the only COPY format that supported the HEADER clause was CSV, comma-separated values, okay? Because CSV by definition has this concept of a header, right? So we kind of went along with that, but we just didn't have the same thing for any of the other formats, like the text format, which we use very commonly. So here's an example of it being added. We basically create our table — again, fantastic column names. Then we insert our two values, term paper and crossword puzzle, and you can see if I just do a COPY to STDOUT, I get my normal output; but now, in non-CSV mode, in text mode, I can specify HEADER and I actually get the x and the y to come out, right? Which wasn't true before. Now that might be just mildly interesting, but it gets a little more interesting. For example, if I now copy my table to a temp file, delete everything, and try to load it back in, I actually get an error — and the reason is that the x and the y from the header line will not convert to integers, right? You can't do that, because x is an integer and y is an integer. So if you dumped it out with HEADER, you have to load it in with HEADER. That's exactly what I've done here, right? But what gets more interesting is this one: if I rename one of the columns so the column names don't match anymore — now instead of being x, y, it's x, z — and I try to copy it in, well, it actually works, and the reason it works is that I've just said HEADER. But if I say HEADER MATCH, it will actually take those x's and y's and match them to column names, and if they don't match, I'll get an error. So the real value to me is that it makes the whole COPY operation less error prone. If you're loading into the wrong table, or you somehow modified the table and forgot, since the time you dumped it, you don't get that kind of surprise. So is it a huge feature? No. Is it something I think is really cool, that will make COPY more useful and less likely to fail — less likely to surprise you? I think it will succeed in that. Yes, sir. Okay, the question is, will loading with HEADER MATCH handle a case where the columns move around? And the answer is I don't know. If somebody could test that, that would be great. And I had somebody...
Yeah, I think that was part of the point, but... It feels like it would be cool if it did, yeah. But I can't say that for sure. We have five minutes and I'm sure some people have questions, so while the questions are going on, somebody will hopefully find me that answer. I can tell you, as a speaker, being comfortable not knowing the answer is like half of the battle. Like — I don't know. Somebody's gonna find it for me, though. Usually if I have Magnus Hagander here, I make him do the leg work, but he's not here right now. Okay, that is the end of my talk. I have four minutes for questions. So if there's any questions — yes, sir. Are there any plans to support NUMA architecture? So that's kind of interesting. The way that Postgres is architected, we actually recommend you turn off any NUMA affinity in your operating system. I actually have a blog entry about that. I personally, on my server, turn off the NUMA capabilities — I have a dual-CPU system and it really wants to use NUMA, right? But I've turned that off, because when you run benchmarks, we found that it doesn't work well. To do a NUMA architecture, we would have to sort of bake that affinity into Postgres, even to the level of having the shared buffer cache have some affinity for individual CPUs. I don't know of anyone working on that. It would be interesting to have a proof of concept to see how much we could get out of it. Not currently, but about six or eight years ago we used to hear a lot more about NUMA architectures and massive numbers of CPUs and the benefits of that. But I don't know of anybody working on the NUMA case right now — I think partially because it's not as popular, but also because the shared buffer cache is this sort of generic piece of the puzzle and we assume every backend has very close connectivity to it. So no, I don't know of anybody working in that area, although we're open to it. I know we currently recommend you turn it off. Sorry about that answer, but — other questions. Well, great. The docs don't say. That could be a problem. Yeah, that doesn't help. Thank you. Yep. Oh, totally, yeah. I have a feeling it doesn't reorder. And I think the reason they didn't do that is that you could mask problems by reordering. I think if we were to allow reordering, my guess is we'd have a special option, like a match-reorder — you'd have to specifically say, I'm gonna allow reordering here — because I would feel like it might mask a configuration issue. Yeah, it's really a usability thing. Well, thanks very much. I know we'll get you your answer — have a great rest of the conference. I guess we're off for lunch now, right? Yay, all right. Thanks, folks.
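To make the COPY HEADER walkthrough from that talk concrete, here is a rough sketch of the kind of commands being described — the table name, column values, and file path are made up for illustration:

    CREATE TABLE t (x int, y int);
    INSERT INTO t VALUES (1, 2), (3, 4);

    -- In 15, HEADER also works for the text format, not just CSV:
    COPY t TO STDOUT (HEADER);

    -- A file dumped with HEADER must be loaded with HEADER, otherwise the header
    -- line fails to convert to the integer columns:
    COPY t TO '/tmp/t.txt' (HEADER);
    DELETE FROM t;
    COPY t FROM '/tmp/t.txt' (HEADER);

    -- HEADER MATCH additionally checks the names in the file against the table's
    -- column names and raises an error if they differ:
    COPY t FROM '/tmp/t.txt' (HEADER MATCH);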
We'll give a shout-out to all of y'all for day two of the PostgreSQL track at Southern Cal Linux Expo. Thank you all for being here today. We would like to give a shout-out to our title sponsor — and thank you for not having a lock screen on here, I found this. Our track sponsor is EnterpriseDB. EDB helps organizations get the most out of Postgres. Nearly 1,500 customers worldwide have chosen EDB software, services, and support. Our offices worldwide enable us to deploy our global expertise locally and support our customers more efficiently. Take it away. Okay, yes, let me find how I will... it's not recognizing this, it's not working... it's actually on a second screen... let's see, wait, it's the right one, so if we connect this... okay, don't know why, but it works — almost there. Okay, we are done now, and it can move. Yes. Well, it works. Okay, welcome. Thank you for coming. We are going to talk about upgrading a big database — a large, a very large database, I mean many terabytes — with little downtime. And when I say little downtime, I mean minutes, not hours. So first, let me introduce myself. My name is Jaime Casanova — Jaime, pronounced roughly "Hai-meh." I have been a PostgreSQL contributor since 18 years ago. Founder of the Ecuador PostgreSQL user group — it's a little inactive, but it's there. I provide community support in Spanish on that mailing list and via a Telegram group, which is that link. Board member of the PostgreSQL Community Association of Canada, which keeps the trademarks of Postgres. And CEO of System Guards, which is a Latin America-based PostgreSQL support company, providing especially Spanish-language support, but because we are inclusive, we can give English support, consulting, or training if required. Okay, so what are we going to talk about? First, why upgrade. Then we look at an overview of the standard ways of upgrading, and then a case study where none of those standard ways solved the problem. So first, why upgrade? It seems a curious question, because it's obvious you need to upgrade, but if you are running a mission-critical system, you don't want downtime, you don't want problems, and probably you live by that rule: if it ain't broke, don't fix it. Well, the problem is that it actually is broken, because all software has bugs. And of course, most database vendors are not going to tell you, oh, we found bugs. They are going to tell you, we have a new feature, which is that we don't have this bug anymore. That's the reason why Postgres does the minor releases. I don't like that word, minor — people think of it as a minor update, minor in importance; it's important. So I prefer to call them security releases, because they will probably be fixing some security bugs, some vulnerabilities. Now, Postgres doesn't do minor — or security — releases for all versions, only supported versions, and we always support the last five. So currently 10 to 14. If you are using 9.6 and we have found a vulnerability, we don't know whether 9.6 is exposed, because we don't look at it. So if you are using older versions, you are probably exposed. And in a couple of months we are going to release 15, so 10 will lose support, probably in November. If you are using 10, start planning your upgrade. Okay, let's look at an overview of the standard ways of upgrading. The first method, the oldest method, probably the safest one, is dump and restore. Obviously, this is only useful if you have a small database or a large maintenance window. You cannot use this for a very large database. Why not?
Because you must stop writing to the database before starting the dump, and that dump can take a lot of time. On a 20-terabyte database, which was mostly documents, it took around five days to take a complete dump. And when that dump finished, obviously it was useless, because there had been a lot of writes during that time. Okay, I'm not going to stop there. pg_upgrade — that's the other way, sorry, not the only way, the other way — to upgrade a database. It creates the structure of the tables, creates the schema of the database without any data, and then, because it knows there is no concurrency and it's completely safe, it copies the data at the operating-system level. So it's a lot faster. How much faster? On a 25-terabyte database, it took around six hours to copy the whole data set. Now, six hours is still a lot of time. We made a little patch to improve parallelism on the copy part, at least on the big tables, and it improved things about 20%, so it reduced the time to four hours forty minutes — which is still a lot. By the way, that patch is still pending because it lacks Windows support — not good, that patch — so it will sit there for some time until I have time to finish it or someone wants to contribute to it. Okay, pg_upgrade can be faster if you use the link option. What's the link option? Well, instead of copying the data, it will create a hard link, which — to simplify — is just another name for the same file, I mean the same space on disk. The bad thing there is that there is no rollback: once you start the new cluster, the old cluster cannot be started again. How fast can that be? On an 18-terabyte database, it took around 40 minutes. That's good. The bad thing: pg_dump and pg_upgrade invalidate all the physical standbys, so you must reconstruct them from scratch. Maybe your business allows you to do that, but if you are using those standbys heavily, probably you cannot. Now, the documentation of pg_upgrade says you can reconstruct those standbys quickly with an rsync command, which basically creates the same hard links on the other side. It works as long as all servers are shut down, and it will take about the same time that the original pg_upgrade took — so another 40 minutes for every server, and that multiplies things. Okay, that's a good option. But this is the better option; this is the new kid on the block: logical replication. And you probably understand what it does. You create the schema of the database without any data, then create a publication on the origin — on the primary — and subscribe to that publication from the replica. It will create an initial copy of the data; it will copy the whole data set. Okay, what's the good thing? The good thing is that you don't need to stop the service, not once, and it works between different versions. And all the standbys are still usable, because you still have your original cluster — primary and standbys — plus the new server, which is the new logical replica. What are the downsides? First, all replicated tables must have a primary key. That's a good thing to do anyway. But of course, the fact that the primary key must exist to replicate the data, even during the initial copy of the data, means this will take some time to complete. And while it takes a long time, it causes bloat on the primary and consumes space on the primary. Okay, I will return to those points later. Remember that, on paper, logical replication is the best thing you can do, and that's true for most cases.
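For reference, with the in-core feature that basic shape looks roughly like this — names and the connection string are made up, and, as the case study below explains, the talk actually ends up using pglogical rather than this built-in form:

    -- On the old primary (the publisher), after creating the empty schema on the new server:
    CREATE PUBLICATION upgrade_pub FOR ALL TABLES;

    -- On the new server (the subscriber); this kicks off the initial copy of the data:
    CREATE SUBSCRIPTION upgrade_sub
        CONNECTION 'host=old-primary dbname=appdb user=replicator'
        PUBLICATION upgrade_pub;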
Now, let's start with the case study. By the way, let me see how much time — okay, I'm fine here. This is the case study. It's a financial services company, Latin America-based, that processes transactions from North America, South America, and Europe. It works 24/7. All services must be online, or at least can't be down for more than four minutes — that's a difficult restriction. They have five read-only replicas. So if they used pg_upgrade — remember, they have 18 terabytes — it would take them around 40 minutes to convert the primary, and then another 40 minutes for every physical replica. That's a lot of time, a lot more than the few minutes we have. Okay, they were using 9.6, and that restriction is important for us. No server has more than one terabyte of free space — one terabyte is a lot, but still. And it generates an average of 10 WAL files per second, so around 160 megabytes of WAL per second. That's an average; sometimes it generates a lot more than that. Okay, so let's use that information. That's the cluster: the primary and the five standbys, 18 terabytes, okay. I will use pglogical instead of native logical replication. Why? Because 9.6 does not have native logical replication — that started in 10, if memory serves. pglogical was developed by a company that was then called 2ndQuadrant, but it has since been acquired by EnterpriseDB, so now it's an EDB product. And pglogical provides an API to implement logical replication on top of logical decoding, which was introduced in 9.4. So pglogical implements logical replication since 9.4, basically, up until 14, yeah, okay. And to use pglogical you will need to set that parameter, shared_preload_libraries, to pglogical, and probably make a few other changes in configuration. So you will need a restart there — that's a few minutes of downtime. Now, if you have problems restarting your database because it's too big and takes too much time to restart, you can set max_wal_size to a lower number, do a manual checkpoint, do the restart, and then restore max_wal_size to its original size; that will help with that. Okay, after pglogical has been installed on all servers, you need to create a replication set, which is basically a list of tables and sequences that will be replicated — it's the same thing that native logical replication calls a publication. And on the new server, on the replica, you will create a subscription, which basically asks the primary for the information in a replication set, in a publication. Okay, let's move to the next one. So those are the commands: create extension pglogical, you create the node, and you add all tables to the replication set — there's a rough sketch of them below. I'm not adding the sequences here, but this is basically what needs to be done. Okay, creating the replication set will not take too much time and will not affect other processes. Then, on the new replica, you would expect to do this, which is basically creating a subscription. But sadly, you cannot do that, so we will not do any of that. Why not? Well, we will return to this. Remember, a replicated table must have a primary key, which means that the copy of the data being applied to the replica has to maintain those indexes. That will take a long time, especially because it's a primary key and it needs to check uniqueness. The initial copy also opens a transaction on the primary, because it does a COPY, and that transaction is kept open all that time. Suppose we have one table of more than five terabytes — and yes, we have that table — and that's just the data of the table, without indexes.
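A rough sketch of that provider-side setup — the node name, schema, and connection string below are placeholders, and the exact pglogical function arguments may differ slightly between versions:

    -- On the 9.6 primary, in each database being replicated:
    CREATE EXTENSION pglogical;

    SELECT pglogical.create_node(
        node_name := 'provider',
        dsn       := 'host=old-primary dbname=appdb user=replicator'
    );

    -- Add every table to the default replication set (sequences would be a separate call):
    SELECT pglogical.replication_set_add_all_tables(
        set_name     := 'default',
        schema_names := ARRAY['public']
    );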
Back to that five-terabyte table: on a one-gigabit network, it took around 10 hours to copy the whole data of that table. What does it mean that it took 10 hours? Well, that the transaction was open all that time, causing bloat on the primary for all that time, and it was consuming space on the primary, because the primary needs to retain the WAL since the start of the process. Now, this is not like physical replication, where you can archive the WAL. Why not? Because the one responsible for decoding the data sent to the replica is the primary, so that information must remain on the primary. Okay — yes, a comment from the audience. Let me first say that Latin America is a nice place; come visit us in Ecuador. That comment was about Colombia — Colombia is a nice place too. But the technology there is not like here; that network was what we had, and nothing more. So yes, if you have a better network available, please use it, especially for this. Now, those 10 hours moving only that five terabytes — which is not the complete database size — consumed around five terabytes on the primary just retaining WAL. And if you remember, we don't have five terabytes of free space on those servers, just less than one terabyte. So the initial copy is a problem; we need to solve this first. What are we going to do? Well, we actually have all the data already on another server: the physical standby. So what we are going to do is take a physical standby and transform it into a logical replica. After that, of course, it will be on the same version, 9.6. Then we are going to upgrade it. And finally, we will continue the logical replication from the exact point it stopped, so no data is lost in the process. How do we transform the standby into a logical replica? Well, pglogical provides us this command, pglogical_create_subscriber, which does exactly that. And with those options, basically, you can make it work. If you have several databases, that's a list separated by commas. I'm using the default replication set, which is one of the replication sets pglogical provides by default. Okay. pglogical_create_subscriber does nothing special — it just does the same things Postgres can do by itself, but in the correct order. So please use it, or read the code, and if you want to do it manually, repeat the steps in the same order. Now, after that command finishes, we have a 9.6 logical replica. But we need it on a better version than 9.6. So the first thing we need to do is stop replication, so we disable the subscription. The primary will keep retaining WAL, but it will not send information to the replica. And after we have disabled the subscription, we copy that catalog, pg_replication_origin_status. Every row of that catalog keeps information about what we have received from the origin, and up to which position in the WAL we have applied it. What we actually need are two things: the external ID — which, if I remember correctly, is basically the slot name on the primary — and the remote LSN, which is the exact position in the WAL from the primary that we have received last. There will be as many rows there as databases we have replicating. Okay. Put that information somewhere safe — a file works.
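Capturing that bookkeeping is just a query against the view; roughly (one row per replicated database):

    -- On the logical replica, after disabling the subscription:
    SELECT external_id, remote_lsn
      FROM pg_replication_origin_status;
    -- Save these values somewhere that survives the upgrade; they are needed again
    -- to restart replication from the same point afterwards.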
In the real case, I actually put it in a table on the primary, so I could script it all. And after we have done that, we can pg_upgrade the standby. Okay, I put a lot of spaces and tabs there, but it's basically one line. Everyone here recognizes the command and knows how it works: the lower-case options are the old cluster, the upper-case ones the new cluster — port, binary path, and data directory. The link option is the one that creates the hard links. And we first check that both clusters are compatible; after we have checked that it will work, we can leave the check option out and actually do the upgrade. Now, we are still doing this without stopping services. Remember, the only downtime so far is when we installed pglogical. The primary is working fine; only this logical replica is being stopped, and it's a separate service, so it doesn't matter. Okay. After upgrading, you then need to inform the new logical replica — which is now on a better version; which version were we on here, 13? you can do this with 14, of course — that it was replicating from an origin, and exactly at which point it stopped. We do that with these two functions: pg_replication_origin_create and pg_replication_origin_advance. By the way, why do we need to do this? Because pg_replication_origin_status is a catalog, and catalogs won't survive pg_upgrade — well, the changes we have made to those catalogs won't survive pg_upgrade. And that's basically the reason you need to do this with pglogical, and why, while it probably could be done with native logical replication, it would be more complicated than this. Okay — and of course that's a fake LSN on the slide; take the real one from the file, okay. After you have filled in that catalog, you can re-enable the subscription. So what do you have there? Now you have a Postgres 13 logical replica of the primary on 9.6, starting from the same exact point. You have lost no information in the process. So far you have had downtime only for the initial restart. Okay, at this point you can do whatever you need to do on the logical replica. For example, they needed to change some integer fields to bigint fields — do that on the logical replica before doing anything else — and create the indexes you need to create. We haven't added sequences to the replication set; you can just set their values right before the switchover. This procedure has been tested with pglogical. It might be possible to do it with native logical replication, but the exact steps may differ. Bruce, please tell EDB that pglogical must keep existing for some time yet — and I mean the pglogical 2 series. Now, what are you going to do after this? Because currently we have just one server which is a logical replica on 13. We just start moving the load from one of the standbys to that logical replica. So the load is still being served, still running; take down that standby and recreate it from scratch as a standby of the new 13 server. In this case, it took around 15 days to recreate all the standbys. But it doesn't matter, because the load is the same and there are exactly the same number of servers working. Yes, we did it at a time when they had added two additional servers to cope with the load, so after that period, instead of returning the servers, they kept them for some extra time to do this. And at the end, everything is on 13 except the old primary — and yes, they were running like that for 15 days, even a little more than that.
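For reference, the origin bookkeeping just described boils down to roughly this on the upgraded replica — the origin name and LSN below are placeholders (use the external_id and remote_lsn saved earlier), and the pglogical function name is as I understand its API:

    -- Recreate the replication origin and advance it to the last applied position:
    SELECT pg_replication_origin_create('pgl_appdb_provider_subscription');   -- the saved external_id
    SELECT pg_replication_origin_advance('pgl_appdb_provider_subscription',
                                         '0/1A2B3C4D');                       -- the saved remote_lsn

    -- Then re-enable the subscription so replication resumes from that exact point:
    SELECT pglogical.alter_subscription_enable('subscription_name');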
During that time they were writing on 9.6, replicating to 13, and then replicating to the standbys on 13, and they were working without problems. The delay was almost zero, not noticeable for them. The last downtime happens when you stop writing there and move the load to the other server. You can do that, well, yes, with a maintenance window. Of course, what about the load on this? Because at some point you need to stop making queries here, right? Is that also downtime? Not necessarily — you can use PgBouncer, for example, to move the load gradually, and you are not going to notice any loss, any downtime; maybe only a glitch. So, summing up, we did this with around 10 minutes of downtime, distributed across a few maintenance windows. But of course, it took a lot of time to plan. We wrote the scripts for this, tested all the scripts, and did a lot of things over three months before doing it. But it worked very well. And we have a lot of time left, and I have finished everything I have. So please, if you have any questions, now is your turn. — Oh yeah, because they were out of support at that point. Yeah, it starts at 9.4 — pglogical 2, it's open source, the 2 series — up until 14. Yeah. And 3 was never released to the public; I think 3 is, well, something internal now. I had only heard the rumors, but yeah. And there is a website, by the way, ripman.org, you can check that. Okay, any other questions? So it was quite clear. Yeah, sorry — yes, that's how we know you cannot do the pg_upgrade, because we tested it. We also tried first to do the initial copy directly, and it was filling the disk, so we decided it was not a good idea. Yes, but on a different server, obviously, yeah. There was something — I know I am missing something here — but not about the process; the process itself works as it is there. We have 10 minutes, sir, if you — yeah? No, because it's just a restart, actually. We just lowered max_wal_size, did a manual checkpoint, stop, update, and start. It's actually very quick. Just remember to always do the minor updates on the standbys first; the primary goes at the end, not only because you need a larger maintenance window there, but also because you can have problems otherwise. Something I was forgetting: I said whatever you need to do, you do it on the replica. Well, you cannot replicate from a normal table to a partitioned table — not in pglogical, not in native logical replication. We tried that too, and the replica started crashing, for good reason. We ended up putting a trigger on the replica, moving everything that the normal table received on the logical replica into a partitioned table, so they could make that change. Okay, and as you know — okay then. If there are no more questions, that's all I have to say for now. I hope it has been useful for someone, and see you around at the next event. Thank you. Good afternoon, everyone, and welcome back to the 2:30 session, day two of the PostgreSQL track at Southern Cal Linux Expo. We'd like to thank our track sponsor, EnterpriseDB. EDB helps organizations get the most out of PostgreSQL. Nearly 1,500 customers worldwide have chosen EDB software, services, and support. Our offices worldwide enable us to deploy our global expertise locally and support our customers more efficiently. Thank you all again for coming out. Hello, everyone. Let me know if you can't hear me.
I think you can, but we're here to talk about some of the ten most common mistakes that we see over and over again in the PostgreSQL space. And these are things that we asked our services folks about: what are you seeing over and over again from a support perspective, from a consulting perspective? What are the common things that people get wrong, and how do they fix them? So, to introduce myself, I am the head of open source strategy at Percona. You can find me on Twitter at MYankovet, and I would appreciate someone taking a photo and tweeting that, just so my boss knows that I was here and not at the beach, okay? And you can check out the podcast, The HOSS Talks FOSS. You may have heard from some of the speakers who have been on the podcast before — I've had speakers like Bruce, I've talked with Jonathan Katz before. So if you're interested in things in the open source database space, whether it's PostgreSQL or something else, check it out. So when I sat down and started this presentation, the first thing I did was ask our support team, what kind of issues are we seeing on a regular basis? And they gave me this handy-dandy graph, okay? And what's interesting about this is that a lot of issues are really in the config, setup, install, tuning, monitoring, and HA/replication space. You have a little bit of security, a little bit of backup, a little bit of bugs and crashes. But a lot of times when you think of a support type of engagement, you're thinking the software doesn't work. Well, thankfully Postgres is very stable, so there aren't a ton of bugs. But we do see a lot of people who just don't understand the nuances — how to configure, how to tune, how to set these things up. So this is the overall graph, but I'm gonna go through and get a little more specific, okay? And I'm gonna count down David Letterman style. Hopefully people remember David Letterman — I'm old, I remember David Letterman, maybe some of you do as well. So we're gonna start with number 10, okay? And number 10 is selecting the right tool for the job. Now, like this gentleman here — I don't think you should use that person as your sawhorse for the saw. You don't want to use tools that are not designed for what you need them for. So, for instance, when you start to look at load balancing or connection pooling, there are several options out there, and understanding the nuances between these is very important, because if you install and use the wrong tool, you might not get exactly what you're expecting, okay? So PgBouncer, HAProxy, Pgpool — they do similar things, they just do them a little differently. Pgpool-II, for instance, gives you load balancing and connection pooling, but it's not as lightweight as the other two. So how you balance this is important, and how you set it up — it's just important to realize the nuances between these tools. Another good example of this is pg_repack. How many people have used pg_repack before? Okay, great. So there are locking issues that can occur if you use it. It's a great tool — we highly recommend it — but there are certain operations that will cause slowdowns. So just by deploying the tool and starting to use it, you might run into some problems. There is a blog post that we did specifically on the things that could go wrong, but you have to watch out, because some of the issues you might think you're fixing might actually come back to bite you.
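For anyone who hasn't seen it, pg_repack is driven by a small command-line client plus an extension; very roughly (the database and table names are placeholders — and note the locking caveats mentioned above before running it against production):

    -- pg_repack needs its extension installed in the target database:
    CREATE EXTENSION pg_repack;
    -- The actual repack is then run from the shell by the pg_repack client, e.g.:
    --   pg_repack --dbname=appdb --table=orders
    -- It rebuilds the table mostly online, but takes brief exclusive locks at the
    -- start and end of the run, which is where the slowdowns mentioned above can bite.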
Now, the number nine issue. Security was relatively small on that graph, but we tend to trust a lot when it comes to the security of our databases, right? Just like this — I love this picture — it is totally secure, you cannot get through my security system, right? But we did a survey the last couple of years, and almost 12% of those who responded — these are DBAs, developers, SREs — said that they had some sort of database security issue last year, okay? That's up from 9.5% the year before. And a lot of these are misconfigurations — things that we have missed, things that we could do better. There are other databases that have a more significant security issue than others; for instance, Mongo and Elastic don't make you set a default password, so a lot of people just leave it open. That's really bad, don't do that. But when we talk from a Postgres perspective: people not using SSL, for instance, to encrypt traffic, okay? Opening up the UNIX sockets to something that is overly permissive. Over-granting permissions — because we all know that when a developer wants access, you just give them everything and it all just works, right? Using — and not understanding — the trust and reject authentication methods, and then not auditing things, right? These are all important things that you should consider. Now, as of PG 14, there were some fixes and enhancements that went in to help with some of the security side of things. I think Jonathan Katz is talking about SCRAM authentication — do you have a presentation on that this week? Okay, well, anyways, he has a presentation. So there was a move to make that the default authentication method. There are also some new roles that can be used. So as always, as Postgres evolves, some of these issues start to get patched, fixed, made easier for users. Okay, now number eight. Let me guess what this one's all about, okay? Clean up after yourself — that's right. I don't know, sometimes my house looks like the hoarder house, you know, but at times it looks nice, I don't get it. But this is all about the proper use of vacuum — everyone's favorite topic. Now, there's tons and tons of information about how you should optimize vacuum, but for those of you who don't know what vacuuming is, it cleans up and reclaims the dead space in Postgres. So there are settings that you should be looking at. Autovacuum is on by default — if you turn it off, there could be some bad things that happen, so you want to be careful of that. But you can tune the vacuum threshold and the scale factor to try and optimize this process; there's a rough sketch of those settings below. Again, we go back to those tools — pg_repack is a popular tool that many people have used for this; you should check it out if you haven't. So consider how you're using autovacuum, because this can cause some issues. If you don't set it up properly, or you turn it off, then you end up with — no, that's no moon, it's your database — because of all the space that it has eaten up. Yes, it even got stickers made; come to the booth, you can get a sticker. So, vacuum enhancements. There are some things that are improving within Postgres — obviously this is 14; I have not looked at 15, but I'm sure we can get some information on that. But there are some enhancements that have come out to improve the performance of the vacuum process. There have been some tests that suggest up to a 25% improvement there. And we're also seeing that the vacuum process continues to evolve and get better over time.
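As a rough illustration of the knobs being referred to — the values here are only examples, not recommendations; they should be tuned per workload, and often per table:

    # postgresql.conf (global defaults)
    autovacuum = on                           # leave it on
    autovacuum_vacuum_scale_factor = 0.05     # vacuum after ~5% of a table changes (default 0.2)
    autovacuum_vacuum_threshold = 50          # plus this many changed rows

    -- or override per table for a large, busy one:
    ALTER TABLE orders SET (autovacuum_vacuum_scale_factor = 0.01);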
But if you are not cleaning up after yourself, you're going to have that giant, massive, wasted amount of space and it is going to slow things down. Now, speaking of slowing things down, how many people have really busy systems? Nobody has really busy systems? Okay, a couple. They don't want to admit it because they're scared, but that's okay. That's okay. Because the other thing that we see that is constantly causing issues is more connections because we all love to connect to our database. Even more, there's no such thing as too much of a good thing, right? And so poor connection management leads to quite a bit of problems because of how the connection management and the threading and process within Postgresworks. You're going to get potentially lots of duplicated memory so you're going to consume more memory for every connection. And so as you start to go above 100 connections or more, you're going to want to start to look into load balancing. And if you're not, then you are probably suffering some sort of performance penalty. Again, these things do get better over time and so we saw in Postgres 14, some of these enhancements helped the throughput in the concurrent workload get better in 14. So these are a little older slides, but you can see that there was some increase in the actual performance at higher connection settings. So again, you want to manage your connections, understand what sort of workload your database has, how things are connecting, what you can do differently. Now, I mean, how many people have heard this? Has anybody heard this from a developer perspective that they're smarter than me? And so what does everybody do when they hear that or see that? Yeah, no, they're not good enough. Are they? No, they're not. So again, if you want to stick here, that says this, they're on our booth, so stop by. But no, default settings, okay? So Postgres is awesome. Out of the box it's tuned for a very generic workload, which means that you need to start to set the parameters that are specific for that workload, right? So effective cache size, shared buffers, work memory, the auto vacuum thresholds, these are things that out of the box, the defaults are not great. They're going to work for a general purpose database. And so you're going to want to look at how to tune and set these out of the box. Similarly, if you start to have a high transaction throughput, you're going to want to look at wall settings to see how fast checkpointing is happening, and you're going to want to look at logging and debugging settings as well, right? So out of the box, hey, it works. That's great, but you're going to leave a lot of performance on the table if you are not tuning. Number five, okay? So I don't know, how many people are over people? Anybody, oh, oh, and the rest of you are unders, okay. Well, so I'm not actually talking about toilet paper, I'm talking about indexing, yes. Yes, oh, see, I heard that, oh yeah. So how many people have ever been in an environment where the developers thought it was a good idea to index every column? By the last, I assume that that has happened to a few people. Yes, it does, every index doesn't need a column, or every column doesn't need an index. This is true because guess what? There is overhead to every index you add, and so the overhead of managing those indexes, maintaining them every time you update, every time you make changes, it can add up. And so you need to understand that your indexes are there to help with performance, not hinder it. 
And so if you want to index everything, then maybe you don't really want a relational database — maybe you want something completely different. But there are specific index types for different types of workloads, okay? How many people are using different index types in Postgres? Not as many as I thought, okay. I mean, there's B-tree, there's hash, there's GIN, there's GiST — so depending on what you're trying to do, there's a different type of index that's optimized for that particular workload. See, Postgres gives you a wonderful setup, but you have to know how to tune it and use it to get the most out of it, and if you're not, then you're leaving performance on the table. Now, the good news is, if you are indexing every column — and please don't do that — or even if you're just indexing a few, there has been a lot of work over the last couple of years on reducing index bloat, right? So we talked about some vacuum changes that help there, but there are also other things that have been implemented to reduce the amount of bloat that indexes cause and the impact of having too many indexes on the system. So that's good. So, how many people use extensions, by the way? Only a few. How many people have installed extensions and never used them? Probably way more, because that is another thing that we continually see a lot of, which is: let's just add the extension, we might need it later, okay? Postgres has an awesome ecosystem of hundreds of different extensions that can do all kinds of cool things. But you know what? You install them and you never use them, and then they just sit there. And so typically what I have heard from our support team is they'll go back and say, well, what extensions are installed? Have you tried turning off the ones you don't need? And a lot of times that fixes problems, and they're just like, yeah, you were loading something you didn't need. So the extensibility of Postgres is awesome, but, like Spider-Man, with great power comes great responsibility — you need to understand how you are deploying and using your extensions, okay? There are a lot of extensions in the ecosystem that have zero support, which means that if you go download them and use them, they could work or they could not. And whether they work together with other extensions is a potential problem, and it's a rat's nest of issues. So you need to understand where you're getting your extensions: are they part of the official extensions that are released, or are they just someone's hobby project, and what sort of testing has gone into them? It's really, really important. Now, with extensions we've got that "hey, we've never used it, it's there, it's installed" — but you know what we never want to have to use? Our HA or our backup, right? A lot of people consider this a secondary issue — something they're sure they can outrun if they have to, something that will eventually work out. Maybe you've deployed to the cloud, right? How many people have deployed to the cloud? And how many of you have set up HA in the cloud? Correctly? Because here's the fun thing, right? Again, back to that survey we do on a regular basis: 22% of people who responded to the survey said that they had some sort of cloud outage that impacted their databases, okay? That's a lot — that's one fifth, okay?
That's one fifth of the people who thought that they were protected, and they weren't. And, you know, 33% said that they had an issue whether they were running in the cloud or on-prem. So this happens whether you're in the cloud or not. Just because you're using a database-as-a-service doesn't necessarily mean that you set it up correctly or that you have taken everything into account, okay? And as great as Postgres is, it's only great when it's up and running. If it's not running, then you have a completely different problem. And a lot of people are not prepared for that outage, okay? They're not. And so you have to ask yourself, from an application perspective: how much downtime can you withstand? How much impact will that have on your business? Do you have an RPO or an RTO? When was the last time you tested your backup and failover? Wait a minute — how many people have tested their backups in the last year? Oh, well, for the rest of you, this one's for you: what do you mean you never tested your backups? That's right. So that's something that is super critical. And part of this is also understanding the recovery process and the tools that are out there. Remember, I was talking to you a little bit earlier about making sure that you set the correct WAL settings — that's gonna be critical for recovery. How are you managing your HA? Are you using tools, or are you doing it by hand? It's surprising how many people still think that they can do things by hand. So are you using something like Patroni? Are you using something else? Automated failover is critical. From a backup perspective, how often are you testing them? Are you using tools, or just backing up with scripts that you built yourself? pgBackRest is a great tool, right? Are you using it? What sort of retention policy do you have for your backups? Okay, so these are things that you should be asking and understanding, because we all hope there isn't a rainy day, but when there is, you're certainly glad you have the umbrella of the backup. Wow, that was really cliché; I might not use that one again. But okay, so problem number two that we consistently see over and over again is this issue of which workload is the right one to tune, okay? And so remember, I said that Postgres is great out of the box as a very generic setup, but that means that you have to tune it to do what you want for specific workloads. And just because it is battle-tested and so widely used doesn't mean there aren't potential bottlenecks and slowdowns, okay? And so, again, survey data — woohoo, surveys, I love surveys. Slowdowns and bottlenecks still plague almost 50% of users out there. So, I mean, how many people here have experienced a slowdown in production with their database? Oh — and how many people are lying about it? Right? I mean, it happens over and over again; it's not something that is solved. And some of this comes down to a couple of different things, right? One is understanding what sort of workload you have. What tools do you have to look at and analyze the workload that's coming in? Have you considered what your P99 latency is? Anybody familiar with P99 latency? Okay, well, that is: you take the 1% of outliers, throw them away, and then ask what you are seeing in terms of latency, right? So if this system needs to respond in 10 milliseconds, okay, and we throw away the 1% and it's under 10 milliseconds, that's good; but if it's over, that's bad, right?
And so you should be thinking about what sort of response time you have. You know, when your workload does change, do you know? And you know, here's a classic example, right? There are industries that have a very cyclical business, for instance, accountants. You know what, they're really busy a couple months a year, the rest of the time they're eh, okay, right? And so do you understand how your systems are gonna be impact and what sort of changes you're gonna see on a monthly or daily basis? Do you have the capability to understand or read write split, right? So is this an opportunity to understand, are you writing a lot more data than you're reading? You know, optimizing one for the other is an important thing. And how are you, how is this data being used? Is this for every type of, you know, workload? Or is it reporting, or is it OLTP, ad hoc, you know, what sort of query workload is gonna happen? All of those are really, really important. And you can do this not only for the database, but you can do this for the systems itself, right? So when you look at the systems themselves, you've got things like the kernel parameters, right? So huge pages, swappiness, you know, so you can tune the Linux side of things to make Postgres run faster as well. And so have you done that? Is this something that you've been looking at? Very important that you understand what the workload you're seeing, what your expectations are, and then make adjustments based on that. Now, the number one issue, okay, is this a mistake or is it a feature? Right? So when we talk about where we are spending most of our time from a database administrator or an SRE perspective, what we're seeing is we're spending most of our time in a few key areas, okay? Let's take a look at what the survey says, because I want surveys. So 7% of folks maybe in this room spend their time on automation. Most of their time on automation. It's 4%, or I mean, number four is 9% moving data between environments. 11% say troubleshooting problems. Number two, query tuning, 21%. But the number one by far is working with developers on their applications. How many people are developers here? Oh, okay, okay. How many people are developers and DBAs at the same time and SREs and everything else under the sun? Okay, there you go. So when we talk about working with developers on their applications, understanding the application side, the number one issue that plagues most Postgres deployments has nothing to do with Postgres itself. It has everything to do with what we have put inside of Postgres, which is poor database design. Okay, now I have a whole talk on this, so if you're here on Sunday, you can listen to my, you know, lost art of database design talk. But, you know, when we talk about database design, there's a few different aspects to it, okay? We naturally flow to the schema design, which is an important aspect of this, right? So you wanna ensure that your schema is set up correctly, that you have the proper data types set up. You know, for instance, you can use ints or numerics, and do you understand the difference? Some people like to store everything as Varchar's, as opposed to ints. Some people store ints as Varchar's. There's been a lot of issues that have cropped up from a performance perspective, based on the number of data types that you choose. Varchar's probably the most overused data type out there. And part of this is, a lot of people like to use ORMs. People using ORMs in here, right? 
So how many people are using GUIDs, GUIDs for their primary keys? So are you familiar with that? Okay, it's like an automatic generated primary key. A lot of people do, but that is a really long Varchar, right? And it's something that could be replaced with a sequence number or something like that. And every time that you go to access the database by that, that's additional memory that has to be used. That's additional space that's wasted. And so choosing the right data type can not only slim down your database, it can also prevent a myriad of other issues from occurring. So you wanna make sure that you're looking at data types as well as schema design. And for those of you who are using ORMs, a lot of ORMs do a good enough job on writing SQL. Which means that they believe a lot to be desired and there's often a lot of opportunity to tune and optimize those ORMs and the code that it generates in order to get faster running and better performing SQL. This also goes along with designing the systems as well. So I've been at a lot of places where they'll say, we're in AWS, just get the largest instance size, we'll eventually grow into it. And then I've been at a lot of other places that say, use the smallest instance size until you can prove that you need the bigger one. So you get the extreme opposites of that, but a lot of people will over complicate things early on or they'll not think through the limitations or the potential outcomes in the future. And this goes hand in hand with not understanding those long-term potential impacts of what you're storing and how you're accessing that data. Right? We are digital pack rats now. I don't know if anybody's boss has ever come to them and said, delete that data. Let's get rid of some of that, we have too much data. Now everybody wants more data and they wanna store it for longer amounts of time. And that means that the systems that we're dealing with are getting larger and larger. We just had the talk about the 10 terabytes, was it? 10 terabytes worth of data. So that's a normal thing now. As opposed to an abnormal thing. So how do we handle that? And can we design around that? Should all that data be in a system that's handling most of your transactions or should that be in another database that's for long-term archiving or reporting? Should that be destaged? Should that be archived? Those are the types of questions that often come up. Now I mentioned this kind of over design and so this is kind of database design but I called it the bonus section. So it's the Rube Goldberg effect. Anybody familiar with Rube Goldberg? Yeah, so let's make the simple systems as complicated as possible, that's what we want. And we are seeing this that there is this desire for more, more, more and more. The number of databases being deployed in production is frightening. Not only the sizes, we mentioned the sizes but when you look at the number of systems, the number of databases in a single environment now, it has grown exponentially. How many people have more than 100 databases in their production infrastructure? Wow, nobody, wow, okay, well. So again, back to our survey here. Look at the data, 40% of those who responded had more than 100 databases that were running in production. And almost 20% had more than 1,000. And this is actually driven by kind of the cloud native microservice movement because every microservice wants its own database, right? Every subcomponent wants its own database. 
So you take something that before might have had a centralized database, and now all of a sudden we have 15 microservices, each with its own database. So now we have 15 databases. Oh, well, we also want replication and we want HA and all of that, so now we have, you know, 45 databases. Oh, and we're going to archive those — and now we're up to 60 databases, and it just compounds over and over again. And so that's something that we have to be careful of and watch out for. And you really have to realize that the complex system does not win. If we're talking winning and losing, you don't get an award for the most complex system — although it is fun to come to conferences and go to the bar and tell everybody how complex your system is; it's a fun after-party activity. So people implement some pretty wild things out there. Not all of them are good. Don't copy them. So that's the 10 most common mistakes that our support and services staff at Percona have seen over and over again. And wow, that only took 30 minutes, so we have plenty of time for questions. So — this is where there are data types that you should be using for specific, different kinds of data, but most people aren't. Most people aren't going to choose the appropriate data type; they're not going to understand the special data types; they're just going to choose the default. And a lot of applications, especially applications that work out of the box, are designed to run on multiple databases, and so varchar is universal. Yeah — no, I mean, it's a good point. This gets back to making sure you understand what data types are also available to you. Just like the indexes, there are specific data types, and specific indexes, that are meant for specific use cases. Cool, all right. All right, I am done. Thank you very much. Yes — downstairs in the booth, the Percona booth, there's a ton of stickers; all those things I was showing are stickers I made specifically to put in the session and then be able to distribute — my marketing. Someone left a phone back there, by the way.
[Between sessions there is some back-and-forth in the room about a phone someone left behind, the video recording of the talks, directions to the registration level, and mic checks.]

Hello, everybody, and welcome again to the final PostgreSQL session at the Southern California Linux Expo. We're so grateful that you've come out here, and we're also grateful for our track sponsor, EnterpriseDB. EDB helps organizations get the most out of PostgreSQL. Nearly 1,500 customers worldwide have chosen EDB software, services, and support, and our offices worldwide enable us to deploy our global expertise locally and support our customers more efficiently. Thanks, everybody, and welcome to Stephen Frost.

Yeah, hi. So, Stephen Frost — I'm going to be talking about identifying slow queries and some ideas about how you might be able to improve on them. As we go through, feel free to ask questions, just speak up; I'll definitely roll with it, I'm pretty good at that. Just to get into it, a little bit about me real quick: I'm Stephen Frost, I'm the chief technology officer at Crunchy Data. I'm also a committer and major contributor. I've worked on row-level security, column-level privileges, the role system, and contributions to various other things, so I've been around PostgreSQL for quite a while. Some of you might have seen us on the panel yesterday. But that's enough about me.

All right, so there are a few different ways to figure out what your slow queries are in your database. One of the ways I'm going to talk about a fair bit here is logging. One of the tools that's really good in Postgres is our logging system. If you're not monitoring your Postgres logs, you really should be, because not only are you going to find slow queries in there, if you go through some of the things I'm going to talk about here, but you're also going to see things like queries failing and queries having issues. You can also see things like when a lock is held for a long time — you can see that in the query log, which is really, really helpful. I'm going to talk a bit more about that as we get into it. So, I'm going to cover a number of the different options for postgresql.conf. These options can also all be set through the ALTER SYSTEM command. I'm kind of old school — I like just hacking up postgresql.conf — but you can also use ALTER SYSTEM to set those parameters. I'll talk a little bit about log analysis: pgBadger is a really nice tool for doing log analysis, and I may even pull up a little demo of it; we'll see how well the Wi-Fi wants to play with me today. And then we're also going to talk about another way to look for your slow queries, which is called pg_stat_statements.
pg_stat_statements is a module that's included with core Postgres, but you have to actually go enable it. You have to set it up in postgresql.conf, and then you have to create the actual extension. But then you'll be able to monitor all of your queries through pg_stat_statements. So that's the other really great way to find your slow queries in Postgres. Those are the two main approaches for finding slow queries.

So, when it comes to logging, I'm going to go through each of the logging parameters here that are particularly relevant to finding your slow queries. The first one is log_min_duration_statement — and I'm not going to read them all off the slide, I'll go through each one. But if you're looking for the short list of things to think about when you're setting up logging for Postgres, these are the ones I always go and make sure are set to reasonable values, so that I'm tracking the information I need to find my slow queries and work on them. I'll also make sure this slide deck is online somewhere for you all afterwards — and if I don't, you can yell at me.

So, the first one is a really great option that exists in all modern versions of Postgres these days: log_min_duration_statement. If you set this to zero, every single statement sent to Postgres is going to be logged. That can create some back pressure, so if you're not logging statements already today, I wouldn't just go and enable it at zero on a busy production system. The number is in milliseconds by default, although you can attach a unit if you want. I would say, if you're just starting out, set this to five seconds or ten seconds or something initially, and get a feel for which queries are taking longer than that amount of time. If you start in on a system that's really, really busy and you set it right to zero, it's going to suddenly start dumping tons and tons of entries into your log files, and that can create back pressure and might have an impact on production. But in general, if you set this to something where the amount of logging is pretty reasonable, I have found that it doesn't have very much of an impact on performance. As for what exactly is reasonable in your environment, that really depends — on how long your queries are taking and how many queries you're running. I like to try to make this pretty tight, so if you can set it to 10 milliseconds or 50 milliseconds, that's good stuff, because the lower this number, the more queries get logged, and the more information you have about what's going on in your system. The big thing about log_min_duration_statement, versus some of the other options like log_statement and log_duration, is that log_min_duration_statement actually includes the duration on the same line as the statement.
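A rough sketch of what that setup can look like — the values here are illustrative rather than recommendations from the talk, and the pg_stat_statements column names shown are the ones used on recent Postgres versions:

    -- Enable pg_stat_statements (the preload needs a server restart to take effect)
    ALTER SYSTEM SET shared_preload_libraries = 'pg_stat_statements';
    -- Log any statement that runs longer than 5 seconds (a reload is enough for this one)
    ALTER SYSTEM SET log_min_duration_statement = '5s';
    SELECT pg_reload_conf();

    -- After the restart, in the database you care about:
    CREATE EXTENSION pg_stat_statements;

    -- Then the heaviest statements by total time look something like this:
    SELECT calls, total_exec_time, mean_exec_time, query
      FROM pg_stat_statements
     ORDER BY total_exec_time DESC
     LIMIT 10;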
For reasons that boggle the mind — and this actually goes back to that idea from yesterday about maybe there being some options we should remove — I think one of the options we should remove is log_duration, because log_duration logs the duration on a separate log line, and when you have a hundred different backends all writing into the log, it's not clear which statement that duration came from. You have to go figure it out by the process ID and whatnot, so it's kind of painful. I don't think log_duration is terribly useful, frankly; I like log_min_duration_statement a great deal more.

Here's an example of what that looks like: you can see the duration logged was 1001 milliseconds from running select pg_sleep(1) — pg_sleep(1) is just a command that says sleep for one second, so this is obviously just a demonstration. On those really, really busy systems, sometimes you want to log queries even when they're running quickly, for various kinds of analysis, but if the system is too busy you can't do that, because logging every single query is going to back things up. Here also is log_min_duration_sample, and you can set a sample rate for it. What this will do is log a sample of the queries rather than every single query. By default the sample rate is 1, but you can change and adjust it if you need to. And if the query ends up exceeding the value set in log_min_duration_statement, that takes precedence. You can set those independently, so you can get a sampling of the queries that run quickly, and then have the longer ones always logged.

Another really important thing for logging is setting log_line_prefix appropriately. A lot of distributions do this for you already and set a pretty reasonable one, but if you're building from source, the default one is kind of garbage. So I definitely recommend setting it to a real value. This is one example — it's actually an example I pulled from pgBadger — of the different parameters to include, but you can adjust it depending on your interests. One thing I do want to point out is that you definitely want a timestamp and a process ID. The other one that's really, really handy is application_name. One of the things that's really nice in Postgres is that as part of your connection string you can set an application name. So if you've got a variety of different systems and applications connecting into Postgres, you can set an application name in the connection string from each of those applications, and then it will show up in the log, which is really nice. That's something I would certainly recommend you consider taking advantage of, because then you know what system, what application, that particular query was coming from. You may know that from other information, like the source IP, but you can get even more granular: you could have different parts of your Django app, different modules, using different application names if you want, because you can also set it on the fly — you don't have to set it only at connection time, you can change it during the session too.
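As a sketch of those two pieces — the prefix below is just one example along the lines of what pgBadger suggests, and the application names are hypothetical:

    -- A log_line_prefix with timestamp, pid, user, database, application name, and client host:
    ALTER SYSTEM SET log_line_prefix = '%t [%p]: user=%u,db=%d,app=%a,client=%h ';
    SELECT pg_reload_conf();

    -- application_name can come from the connection string, e.g. (hypothetical):
    --   postgresql://app@db.example.com/prod?application_name=billing-worker
    -- ...or be changed on the fly inside a session:
    SET application_name = 'billing-worker:invoice-batch';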
Another really important one is to log checkpoints. One of the things you can run into at times is that queries are slow, but it's not actually the query that's slow — it's other things happening on the system — and if you're logging things like checkpoints, you can see whether there's some correlation there. Logging checkpoints will log when and why the checkpoint started, when the checkpoint completed, and some general statistics about the checkpoint. Here you can see the number of transaction log files that were added and removed — things like WAL files added and removed give you some indication of how much write activity is happening on the system. You can also see how long it took Postgres to write and sync that information out, and the total amount of time overall for the checkpoint. That's all really useful information, and you can back it out and say, okay, a bunch of my queries were suddenly slow and this checkpoint was running at the time — that's really useful to know.

The other piece that's really important here is the "checkpoint starting" line. This "immediate force wait" is not something you should be seeing very often; that's usually somebody going in and explicitly running a CHECKPOINT command, or doing something like a CREATE DATABASE or one of the other operations that implicitly requires a checkpoint to happen. Those are specifically the cases where, if you see an immediate force wait and suddenly the load on the system went up quite a bit, it's because Postgres is doing a forced, immediate checkpoint: it's going through all of shared buffers and writing out all of the dirty buffers right then, as fast as possible. That's very expensive and can end up causing I/O congestion. So this is why it's really helpful to log checkpoints. The other reason to log checkpoints is that if you see checkpoints starting due to xlog or due to wal, what that's telling you is that your max_wal_size is not big enough. max_wal_size works such that if you generate that amount of write-ahead log within the checkpoint timeout, which defaults to five minutes, then Postgres has to immediately do a checkpoint — which again means running through all of shared buffers and writing everything out quickly in order to get back some space in the write-ahead log area so it can keep writing. I'm not going to go super deep into how WAL and all of that works — if you have questions about that, let me know — but if you see checkpoints starting due to immediate force wait or due to xlog, you want to look at adjusting your parameters, or talk to whomever is going around running CHECKPOINT, because that can end up being a problem.

I always recommend logging connections and disconnections to Postgres, so you have an idea of how frequently connections are coming in and how long they're lasting. One of the other things to be aware of is that new connections to Postgres are relatively expensive. Sometimes if your application isn't performing very well, it's not actually the queries at all — it's because you're connecting and disconnecting from the database for every single query, and that's very expensive in Postgres. So I definitely recommend using some kind of connection pooling. Typically my preference is PgBouncer; in fact, if you can get PgBouncer to work in transaction mode, that's awesome.
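As a quick aside before getting further into pooling, the checkpoint and connection logging just mentioned are simple on/off switches; a sketch:

    -- All of these only need a reload, not a restart.
    ALTER SYSTEM SET log_checkpoints = on;
    ALTER SYSTEM SET log_connections = on;
    ALTER SYSTEM SET log_disconnections = on;
    SELECT pg_reload_conf();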
Transaction mode does mean that there are some things you're not able to do through the pooler, but it's something I certainly recommend folks look at. Application-level connection poolers are also fine — they're generally okay. I find PgBouncer to be better personally, but there's nothing wrong with them in the grand scheme of things. But you definitely want to make sure that your connections last some amount of time; you don't want to be constantly reconnecting to the database. Part of the reason for that is that whenever you connect to a Postgres database, you spawn a new backend, and that backend doesn't have anything cached. Some of the stuff Postgres does as soon as you start issuing queries is go look things up in the catalogs, pull that information back off of disk, and then cache it for that backend, which means subsequent queries will be faster — which is great. But if you're constantly disconnecting and reconnecting, you're constantly losing that cached information on every reconnect. So that's a bit of background on why there's a performance hit from constantly disconnecting and reconnecting: it's because of those caches that get filled, which are called the syscache and the relation cache, if you're curious.

The next one is really, really good: log_lock_waits. In Postgres we have something called the deadlock detector. The deadlock detector runs after a backend has been waiting on a lock for one second, by default — whatever deadlock_timeout is set to, which defaults to one second, that's how long Postgres will wait after it has attempted to acquire a lock and had to wait. If you enable log_lock_waits, then after that one-second timeout Postgres will also write into the log all the information about the waiting it's doing. You'll see which process is waiting on the lock, what it's waiting on, and what kind of lock — here it's a share lock on a transaction — and it will also tell you which process is currently holding that lock and what other processes are currently in the wait queue for it. So it's possible that your queries aren't actually slow at all — they're just having to wait on locks — and that's where logging of lock waits is really helpful. If you're finding that your queries are running 300, 400, 500 milliseconds, you're not getting this information, and you think it might be a lock, you could reduce deadlock_timeout down from one second if you want to see these lock-wait messages firing. I do caution you, though: running the deadlock detector is relatively expensive from a CPU perspective, because it has to go figure out whether there are any deadlocks between any of the locks currently being held. That's a relatively expensive operation, so I would not set this down very low. And don't raise the value either. I've seen people raise deadlock_timeout, and I think they thought it would make the deadlocks they were seeing in their system go away. That's not how it works. If you're seeing deadlocks, you need to go fix those deadlocks — that's what you need to do. Raising this value is not going to fix it; it's just going to make it take longer before Postgres realizes the deadlock is there.
The whole thing about a deadlock is that you're in a deadly embrace and you can't make progress, so don't raise that value. But lowering it does make sense in some cases, particularly to see these lock-wait log entries.

Another one that's really, really useful is logging of temp files. Temp files are things Postgres will create during a query run in the event that we need to spill to disk for a hash join or a hash aggregate or a sorting operation or any of those. What that often means is that you have work_mem set lower than what would be helpful to Postgres — that's why it's spilling to disk. If work_mem were higher (and I'll talk a bit more about work_mem later), then you probably wouldn't see it spilling. There are some cases where we have to go to disk, things like a materialize node, but in other cases we don't need to spill — we're just doing so because we don't want to exceed the work_mem value. Bruce touched on this a little earlier: for things like hashing operations we'll actually default to using more memory now, because it makes such a difference whether a hash join spills to disk or can run entirely in memory. So logging of temp files is good. This recommendation of four megabytes isn't bad; I like to set it down to zero personally, because most systems don't write that many temp files and I'm always curious about what's writing them. But don't be too concerned if you see temp files being written — again, sometimes they're required for a query. If you see really large temp files being written and your query is going really slowly, though, that may be something you want to go look at and figure out what's going on. The other nice thing is that when Postgres logs a temp file, it also logs the statement that created it, so you'll know which statement it was — and again, if you've set application_name or something else, that can make it easier to figure out where that query is coming from. Yes, Bruce? Sorry, I didn't mean to present that as the default. There's a bit of a discussion between myself and Keith Fiske in particular about what value to set this to — the four megabytes is his recommended value; my recommended value is zero. But he's the one who's been maintaining these slide decks, so... anyway, sorry. There is no logging of temp files by default, which I don't agree with personally, but that's just me.

Another one that's really good is log_autovacuum_min_duration. This one tells you about when autovacuums happen. You can also go look in pg_stat_user_tables and see the last time an autovacuum-run vacuum or an auto-analyze happened on a table, but if you log it, you get all these other stats too. I like setting this to zero so you're always getting information about what vacuuming is happening on your tables, and while it's hopefully not too impactful on your system, it can be. Now, that said, vacuuming is really, really important and you should not turn it off. I know a lot of people see vacuum running all the time and get worried about it. That's not bad. The reason vacuum runs all the time is because it intentionally runs slowly — we don't want it to interfere with the foreground processes and the foreground queries that are running.
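Pulling together the lock-wait, temp-file, and autovacuum logging from this part of the talk, a sketch — the zero thresholds follow the "log everything" preference described above, so pick values that match your own log volume:

    ALTER SYSTEM SET log_lock_waits = on;                -- fires after deadlock_timeout (1s by default)
    ALTER SYSTEM SET log_temp_files = 0;                 -- 0 = log every temp file; '4MB' is the slide's value
    ALTER SYSTEM SET log_autovacuum_min_duration = 0;    -- log every autovacuum / auto-analyze run
    SELECT pg_reload_conf();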
So that's why it may look like it's always running. Of course, in some cases, if you've made vacuuming really aggressive, then you may see some correlation between a fast-running vacuum and queries being slower. It can happen. [Audience question about autovacuum failing without anything showing up in the logs.] There should already be log entries from autovacuum failing — if autovacuum is failing and it's not being logged, how do you even know it's failing? And it wasn't logging? Okay. Again: monitor your log files. I think I said that at the very beginning — you definitely want to monitor your log files for that. It was almost certainly logging an error in the Postgres logs, because our default is certainly to log any errors autovacuum hits while it runs. That's a bad situation, and I've seen it before, and it really, really sucks, because it was probably vacuuming that table because it was the oldest table, and then it keeps trying to vacuum that table and it keeps failing, which means the rest of your system isn't getting vacuumed — and that's bad. That's a problem. So monitor your logs. Sorry.

All right. pgBadger is really, really great. It's really simple to run: you just run it against your logs and it generates a whole bunch of HTML files. I'm running through my time a bit faster than I anticipated, so maybe I'll come back and pull up a pgBadger report if we want to look at it.

Let me talk a little bit, though, about understanding some of the different reasons why queries end up being slow. There are Postgres configuration issues, of course. Dead tuples and bloat can lead to queries being slower, though that's not actually very typical. Many, many times it's the query plan at the end of the day that ends up being the problem. So let's talk through some of these things. When it comes to thinking about how to make a query faster, one of the biggest options here is work_mem. work_mem is the amount of memory that Postgres believes it can use for a given node. When I say a node, I'm talking about something like a hash join: how much memory does Postgres think it can use to build a hash table in memory to run that hash join? That's work_mem — or in 15 it's work_mem times two for hash operations, but it's still related. So if work_mem is set to 8 megabytes, and when Postgres looks at the table sizes for this join it thinks the hash table is going to be 20 megabytes or something, it's not necessarily going to consider doing a hash join, or it might consider doing what's called a batched hash join. Batched hash joins are unfortunately pretty expensive, because what they're doing is spilling out to disk to do that hash join, and that ends up being quite costly. So if your hash joins are batching, or if you're not seeing hash joins where you're expecting a hash join would be best, it might be that work_mem is too small. Increasing work_mem helps a great deal with that. maintenance_work_mem is kind of related — maintenance_work_mem is for things like vacuum and some of the other background tasks. If you're seeing autovacuum have to do multiple passes through a table, that can be because maintenance_work_mem is set small relative to how big your data set is, how big a particular table is and how many dead tuples it's having to remember. So maintenance_work_mem can help with some of that.
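One way to check whether work_mem is the problem for a specific query is to look at its plan; a sketch with hypothetical tables:

    EXPLAIN (ANALYZE, BUFFERS)
    SELECT o.id, c.name
      FROM orders o
      JOIN customers c ON c.id = o.customer_id;
    -- In the output, a Hash node reporting "Batches:" greater than 1, or a Sort node
    -- reporting "Sort Method: external merge  Disk: ...", suggests that node wanted
    -- more memory than work_mem allowed and had to spill.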
effective_cache_size is also another really useful parameter when it comes to planning. What effective_cache_size means in Postgres is the amount of memory that Postgres can hope is being used for filesystem caching, essentially. effective_cache_size is going to increase the chances that Postgres will do an index-based lookup versus a sequential scan, for example. The reason is that if Postgres thinks more of the system is in cache, it's going to say, well, if it's already cached, then that index-based, random lookup is going to be pretty fast, so I can go ahead and do that. So effective_cache_size — and the other side of that coin, random_page_cost — are two parameters that are really useful to set and to think about. In particular, if you're seeing a lot of sequential scans where you're really expecting index-based lookups, then decreasing random_page_cost can encourage Postgres to do more index-based operations, because it thinks the cost of reading pages randomly is not as expensive as it assumes by default. The default is four to one — sequential page cost is one, random page cost is four — but if you're on SSDs, or the database is such that you can basically fit the whole thing in memory, then you may want to set random_page_cost down to two as a general rule.

shared_buffers is of course very, very important: shared_buffers is the amount of memory dedicated to Postgres's own cache. Postgres has a page cache where we load up all the pages we use, and then we have basically a clock-sweep type of algorithm that decreases usage counters to evict pages, but commonly used pages will stay in shared buffers — and if a page is in shared buffers, we don't have to go out to the kernel to read it. If you're operating today with the default shared_buffers, you're probably doing it wrong, unless you're running on a phone. It's a common joke around the Postgres community that our default parameters are really decent for today's iPhones; if you're running on a 64-gig or 128-gig server, you probably don't want our defaults for things like shared_buffers.

min_wal_size and max_wal_size — again, this is all related to checkpointing and whatnot. If you're seeing checkpoint log lines coming through saying "due to wal" or "due to xlog", you really want to increase max_wal_size. checkpoint_timeout can be a good one to configure; the default is five minutes. Increasing checkpoint_timeout means we go longer between checkpoints and spread each checkpoint's writes over a longer period, and it also means we write fewer full-page images into the write-ahead log — the first write to a page after every checkpoint is a full-page write, and a full page is 8K, which is quite a bit larger than a typical row update inside the write-ahead log. So increasing checkpoint_timeout can be valuable. I will say, on the flip side, that the larger your checkpoints, the longer it will take Postgres to recover from a crash. checkpoint_completion_target I wouldn't mess with anymore — we were actually able to change the default to 0.9, and I don't remember when we did that, 13, 14, maybe it was 14 — so thankfully the default is now better on more modern versions of Postgres. But if you still see it set to 0.5, I would recommend increasing it to 0.9.
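As a very rough sketch of the memory, planner, and checkpoint settings discussed above — these numbers are purely illustrative for a dedicated server with plenty of RAM, not recommendations:

    ALTER SYSTEM SET shared_buffers = '16GB';             -- restart required
    ALTER SYSTEM SET effective_cache_size = '48GB';       -- roughly what the OS can keep cached
    ALTER SYSTEM SET random_page_cost = 2;                -- lower on SSDs / mostly-cached data
    ALTER SYSTEM SET max_wal_size = '8GB';                -- avoid checkpoints starting "due to wal"
    ALTER SYSTEM SET checkpoint_timeout = '15min';
    ALTER SYSTEM SET checkpoint_completion_target = 0.9;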
All right, so let's talk about dead tuples and bloat. The way Postgres handles updates to records is that we mark the old version of the tuple dead, and then that space becomes available to be reused — that's what vacuum takes care of. We do still have to consider those dead tuples when we're doing sequential scans — and technically in index-based lookups too, though those are typically not an issue. So one of the things you can run into is that a table has lots and lots of dead tuples, and indexes can also have bloat. If the table is 3 gigabytes but there's only 1 gigabyte worth of live data and you're doing a sequential scan, then you're spending three times as long scanning through that table as you would if the table were perfectly packed. A perfectly packed table in that case would mean you're only scanning 1 gigabyte, whereas otherwise you might scan 3 gigabytes, because that's how big the table is. I will argue, though: don't run out and do a whole bunch of VACUUM FULLs or CLUSTERs or whatever to eliminate all bloat. Some bloat is good. pg_bloat_check is a useful tool for checking how much bloat you have. Eliminating all bloat does require a rewrite with either CLUSTER or VACUUM FULL, but the point I'm trying to make is that if you go and spend the time to fully rewrite the table, then the next time you do an insert or an update on that table, Postgres has to go and extend the table. Extending tables is relatively expensive. So if your insert queries are particularly slow compared to before you did that VACUUM FULL or CLUSTER, it's because those inserts are now having to extend the relation — that requires an extension lock, and it requires going out to the kernel and saying, hey, this file needs to be made bigger, and that ends up being more expensive than if we didn't have to extend it. So when it comes to bloat, don't fret about eliminating all of it. Even 50% bloat, if it's spread throughout the table and you're doing a lot of updates, is probably not that big of a deal — unless the table is terabytes and terabytes in size.

When it comes to analyzing plans, there are a few different output formats you can use, and you can pull those out and put them into different tools for analyzing plans. I think somebody already did an EXPLAIN talk at this conference, so I'm not going to go too in depth on EXPLAIN itself, but if you have any questions about EXPLAIN or explain plans, I'd be happy to answer them. explain.depesz.com is particularly cool — if I have a few minutes at the end I may pull that up and show you. pgAdmin 3 and pgAdmin 4 also have a pretty graphical thing for looking at explain plans.

So how do you fix these slow queries? There's some low-hanging fruit that I'm going to run through here. If you're seeing a lot of sequential scans but you're only getting a few rows back from them, and that's not because of an aggregation, look at creating an index. Indexes are really, really helpful in Postgres for that kind of lookup. That does mean you need to create the index on some kind of column, or something, that's included as a constraint in your query; that's typically not too much of an issue. I will also say about indexes that you really want to make sure your indexes are actually getting used — we talked about this a little in some of the other talks. pg_stat_user_indexes will tell you when your indexes are being used, and whether you have a lot of indexes that aren't getting used.
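A quick way to surface those never-used indexes is to query the statistics views; a sketch:

    -- Indexes with zero scans since the statistics were last reset.
    -- (Indexes backing primary keys and unique constraints still earn their keep
    --  even with idx_scan = 0, so don't drop those blindly.)
    SELECT schemaname, relname, indexrelname,
           pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
      FROM pg_stat_user_indexes
     WHERE idx_scan = 0
     ORDER BY pg_relation_size(indexrelid) DESC;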
Number one, that might be because you went and created a functional index and you thought it was going to help your query, but it ended up not helping — functional indexes can be a little tricky, often because of data type mismatches and whatnot. So if you go create a functional index, you really want to make sure it's actually getting used. Otherwise, if it's an index that you just created and it's not actually getting used, you probably want to go drop that index, because every single insert and update is having to go make changes to those indexes as part of the operation — well, I guess not deletes, but inserts and updates have to go update the indexes, and that's expensive; that's not cheap. So when you're trying to improve the performance of writes, you should be looking at your indexes and seeing which ones are really necessary and which ones are actually being used. A couple of other things about indexes: you can create multi-column indexes — not everybody is aware of that — and they're really, really helpful for queries that involve constraints on multiple columns.

Another one is work_mem. There are like a million different reasons to increase work_mem; the only reason not to is if you start running the system into swap or running it out of memory. I would caution you, though, because remember how I was talking about work_mem being the amount for each node? When you have really, really large queries — if you have five or six or seven different tables all getting joined — Postgres is going to consider using up to work_mem worth of memory for each one of those operations: each sort, each hash join, each hash agg, all of them. So a given query could use quite a few multiples of work_mem, and you do want to be careful. What I like to recommend is that you don't increase work_mem too much at the base level, in the actual postgresql.conf. Instead, what makes sense is to talk to your application folks and say, okay, for this query I found out that if I gave it a bit more work_mem it would do a hash join and that's way more performant — so what you want them to do is set work_mem right before they run that query, and reset it afterwards, so that you don't end up in the situation where some query that's already fast enough ends up using a bunch of work_mem and potentially hogging memory on the system.

[Audience question about how to actually set that.] Yeah — you just say SET work_mem equals whatever, and that's a SQL statement you send to the database before you send the next query. [And does it affect anyone else?] No. When you do a SET work_mem in a given connection, it only applies to that particular session; it won't impact the rest of the system. If you're going through PgBouncer — well, PgBouncer will actually issue a reset in between if it's serving two different clients that are connecting through it. You can change that, but it'll reset between two different client connections; it'll do a DISCARD ALL and a RESET ALL and all that, so I'm fairly sure that will actually reset it. Am I wrong, Bruce? No. Okay — oh, between your SET and your query, if you did them in different transactions — but I don't know why you would do that. So, yes.
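To make that concrete, a sketch of scoping work_mem to just the one query — the value and the query itself are hypothetical:

    SET work_mem = '256MB';                  -- session-local; nobody else is affected
    SELECT customer_id, sum(total)
      FROM orders
     GROUP BY customer_id;                   -- the expensive reporting query
    RESET work_mem;                          -- back to the server default

    -- Or scoped to a single transaction, which reverts automatically at COMMIT/ROLLBACK:
    BEGIN;
    SET LOCAL work_mem = '256MB';
    SELECT customer_id, sum(total) FROM orders GROUP BY customer_id;
    COMMIT;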
So PgBouncer has three different main modes. If you're using PgBouncer in the mode where every single query you send can go to a different server connection — statement pooling — you're probably running functions at that point, because that's certainly how the SkyTools folks did it; otherwise, how are you going to do that? It doesn't make any sense. So if you're doing that, you're probably using functions, and at that point you would just set work_mem on your function. That's another cute trick: if you have a lot of functions in the database and you want to change a value like work_mem, you can set it on the function itself, and it'll just change work_mem for that function whenever it gets called. That's how you would deal with this case in the super-aggressive PgBouncer mode. But in what I would consider the more typical PgBouncer setup, where you're doing transaction pooling, you would just need to make sure that you set work_mem inside the transaction: you start the transaction, then you set work_mem, then you run your queries for that transaction, and then you reset work_mem afterwards. If you're using transaction pooling mode, yeah — that may be correct, I'd have to go test it.

So that's another possibility. But generally, if you see a small data set and you're seeing a lot of sorting or a merge join or something, increasing work_mem is going to encourage Postgres to do things like hash joins or hash aggregates. In particular, if you see a merge join happening and you don't think it should be a merge join — you think it should be a hash join — it's very likely that Postgres looked at it and said, well, I'm going to run out of work_mem trying to do this as a hash join, so instead I'm going to fall back to a merge join, or possibly a nested loop, but more likely it's the difference between a merge join and a hash join where you'll see that.

The other thing to look at is statistics. If you have a large data set and you're seeing a nested loop, you may have an issue with your statistics. There are two different options for dealing with that. One is to run ANALYZE and make sure your statistics are recent. The other thing you can do, if you have correlated columns, is to create extended statistics for that particular set of correlated columns. The way Postgres gathers statistics, we do it on each individual column and look at them individually — trying to do every pairing of columns inside Postgres would create a huge amount of statistics that most of the time wouldn't be necessary. So that's where it's really helpful for the database administrator to say, oh, these are correlated columns — such as country, or state and zip code, or city and zip code, things like that, those are very correlated — and it's really helpful to have stats for both of those columns in one statistics object. You can create that today with extended statistics in Postgres. But that's why we don't automatically do it for everything.
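Here's roughly what creating extended statistics on a pair of correlated columns looks like — the table and column names are hypothetical:

    CREATE STATISTICS addr_city_zip_stats (dependencies, ndistinct)
        ON city, zip_code FROM addresses;
    ANALYZE addresses;    -- the extended statistics get populated by ANALYZE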
The other thing that you can do is increase your statistics target — or decrease it; I've seen it go both ways, and in some cases decreasing it can actually give you better plans. But more typically, you might consider increasing your statistics target to give Postgres a better idea of the data in that table. In all of these cases, though, something to be aware of is that you're going to end up trading some amount of planning time for that run time. You want to be careful: if you create a whole ton of extended statistics objects, the planner has to go consider all of those when it's doing its planning work, and you can see how that can increase your planning time. If your query itself is already pretty quick, that's probably not a good trade, because you can end up spending a meaningful chunk of the total time in planning — maybe not quite that much, but just be aware that it's something you potentially have to deal with.

Another one that's kind of a gotcha for folks is indexes with foreign keys. If you're seeing that your deletes are slow: Postgres does not require you to have an index on the referring column. You have to have an index on the referred-to column, but not on the referring column. What that means is that when you go and run a delete on the table that the foreign key points to, we may have to do a sequential scan through the referring table to make sure there aren't any rows there that would violate the foreign key constraint after that row is deleted. So you really want to check that you have indexes on your referring columns — and if you're doing cascaded deletes anyway, you definitely want to make sure you've got indexes on all of your referring columns.

Then Postgres has this thing that kind of seems like it keeps coming up: prepared queries. Prepared queries are certainly helpful in a lot of cases — if you prepare a query, we'll spend some of that planning time up front and generate what's called a generic plan, and then you can reuse that generic plan over and over again, avoiding the cost of replanning. Now, the Postgres planner is pretty quick, but even so, all that planning time does add up. Postgres will also do specific plans. What happens is that you prepare the query and then you start running it, and for the first five runs Postgres will actually generate a more specific plan for that query based on the constants you've passed in, and it'll say, okay, this ended up being costed at about the same as the generic plan. If that keeps happening, then we give up on the specific plans and just continue with the generic plan. Now, in some cases it turns out that the generic plan, while it came out costed the same or near what the specific plan was, is actually a much more expensive query plan in practice than the specific plan. In more modern versions of Postgres, you can actually tell Postgres not to do that anymore — to always use the specific plan. I'm trying to remember what the option for that is. Or am I not remembering correctly? Or maybe I just saw a patch for it. Okay — because it's definitely something that comes up.
In a lot of cases this is somebody's Java app, where the JDBC driver will automatically prepare the queries and then just rerun them, and the first five queries are really fast, and then suddenly it gets really slow and you're like, what happened? And that's what's happening: we planned it out five times, it seemed like the specific plan and the generic plan were costed about the same, so Postgres decides not to spend the effort of replanning every time. If you see that happening, the really short and sweet answer is: don't use a prepared plan for that particular query.

[Audience question about how prepared queries are identified.] When you prepare a query, you give it a name, and then you say execute that query — that's how it knows it's the same query over and over again, because when you go to execute it, you just pass in the parameters at that time and it executes the existing prepared query. So it's not a query cache or anything like that; we aren't doing that. You, explicitly, as the developer, as the user, actually do a prepare: you send a PREPARE statement to the database and then you send EXECUTE statements afterwards. You can also use the extended protocol to do the same basic thing. Plan cache mode — there you go, that's what it is: plan_cache_mode. So yes, you can set plan_cache_mode if you need to, if you run into this issue now. Right — and that's the default. That's how you do it if you're running the command yourself; again, the extended query protocol allows you to do the same thing, which I believe is what JDBC uses.

All right, so a couple of other things. SELECT count(*) from a table: indexes can actually help with this — you may not realize it, but Postgres can use an index-only scan to do a count(*). Although, really, if you're doing select count(*) from tables, you may not want to be doing that at all; it's just an expensive thing to do in general. Quite often people run queries like select * from a table. That's actually really expensive — you don't want to do that. You definitely don't want to return columns you're not using, and I've seen that happen; you don't want to return rows you don't need either. One of the things people don't realize is that Postgres has a TOAST system: in the event that a value is larger than the TOAST threshold, which is about 2K, we'll try to compress it, and if it still doesn't fit on the page, we're going to store it out of line — we're actually going to store it in a separate table. What that means, though, is that when you're querying that data out, if you've got an 8K or 10K JSON or JSONB value or whatever and you're just doing these select stars all the time, and you're not using that data when you get it back, Postgres is still having to do the initial lookup on the first table, find the TOAST pointer, go do a lookup in the TOAST table, pull that data from the TOAST table, decompress it, and send it back to the client. It's just expensive.
SELECT DISTINCT is something that, just in general, I recommend watching out for. But also, I would say that if you're doing this kind of comma join, you can very easily miss join conditions, and what that means is you'll end up with a Cartesian product being created and then the DISTINCT going through and removing the duplicates. So you really want to be using explicit JOIN syntax, which makes it much less likely that you'll end up in a situation where Postgres is generating a Cartesian product because a join condition wasn't included. I really strongly recommend you watch out for that — make sure you have join conditions for all of your joins.

A few other ones. Bruce's null talk is a great one for this particular slide. In particular, one thing you can run into is that due to the way nulls work — that three-valued logic that exists — it's very hard for Postgres to optimize something like a NOT IN query, and in some cases IN queries too. So what you really want to do is: if you're doing an IN, try to turn that into a join — typically that's going to give the planner more options for how to run the query, and typically it will be faster. And if you're doing a NOT IN, you really want to either consider using a LEFT JOIN to generate what we call an anti-join plan, or use a NOT EXISTS with a correlated subquery underneath. Using NOT EXISTS with a correlated subquery is almost certainly going to be better than using a NOT IN clause. So these are, again, cases to watch out for: you typically don't want to do an IN or a NOT IN if you can avoid it; NOT EXISTS or joins are typically better.

[Audience question about an application that fetches a big list of IDs and then sends them back in a second query.] Yes — that would be my short answer. So you're querying the table first, getting a big list of IDs back, and then passing that big list of IDs back to Postgres — that is horrible. That's an anti-pattern; the question would be, why is that not a join in the database? Why isn't it? Okay. There are actually two things I want to point out about that. The first is: yeah, absolutely use a join instead — find a way to push that whole thing to Postgres as one shot and not do that. I don't know why it was like that before, but I wouldn't recommend it. You definitely want to turn that into a join if you can.
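As a sketch of the NOT IN rewrite with hypothetical tables — note these variants are only strictly equivalent when the subquery column can't be NULL, which is exactly the wrinkle that makes NOT IN hard to optimize:

    -- Often slow: hard for the planner to turn into an anti-join.
    SELECT c.id, c.name
      FROM customers c
     WHERE c.id NOT IN (SELECT o.customer_id FROM orders o);

    -- Usually better: NOT EXISTS with a correlated subquery.
    SELECT c.id, c.name
      FROM customers c
     WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id);

    -- The same anti-join spelled as a LEFT JOIN.
    SELECT c.id, c.name
      FROM customers c
      LEFT JOIN orders o ON o.customer_id = c.id
     WHERE o.customer_id IS NULL;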
The other thing, though, is that if for whatever reason you end up with some really, really large list of values that you want to pass into Postgres — as a "please return all of these IDs" kind of thing — don't use an IN list for that. Using an array is actually better, and by array I mean an actual Postgres array. The reason is that when you use an IN list, it has to go through the grammar, and the grammar — okay, it's not terribly slow, but it's not as fast as our actual array-input code. When Postgres parses an array, that array-input code is faster; long story short, it just is. It's faster because the grammar is generic — it has to be able to handle anything thrown at it, so it does a whole lot of other setup work, and for each one of those values it has to construct a parse node and all of that kind of stuff. So it's a lot more efficient to pass arrays into Postgres when you're handing over a big list of items you want Postgres to go find. I mean, if it's a really small list, or the query is always fast, then who cares, right? This is all about: once you've identified that a query is slow and problematic, then if it's doing an IN, that's a trick — you say, oh, let me see if I can get rid of that IN clause and make it into a join, or if I really do need a big IN list, I can turn it into an array. Yes, Bruce? Yeah — but I mean, at least for the case you're talking about, I would still recommend just seeing if you can turn it into a regular join. Common table expressions are very helpful, though, certainly, and more modern versions of Postgres will do a full optimization including the common table expressions.

That's actually another trick, though. If you end up in cases where you have a CTE plus a regular query, and for whatever reason, once Postgres optimizes the whole thing, you're getting a plan you don't think is great, you can try passing the MATERIALIZED keyword to the common table expression. That will force Postgres to optimize the CTE independently of the rest of the query; what happens is it'll optimize and then run the CTE, cache that result, and then pull it into the regular plan. There are upsides and downsides to that: it costs something to cache that information, but if it's a relatively small amount, it can be beneficial, and it may avoid whatever issue you're having with the optimizer trying to do a full optimization of the entire query. So that MATERIALIZED keyword is another one that's really handy when you're dealing with query plans and fighting with them and whatnot. When you're doing CTEs, I'd typically argue to keep their results small — this is especially true when you're using MATERIALIZED, because when we cache that result we don't index it at all, and without an index, individual lookups into the result of that CTE have to scan it, so that can end up being kind of expensive.
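A sketch of both tricks — passing a big list as a single array and forcing a CTE to be materialized; names and values are hypothetical:

    -- A long list of IDs handed over as one array parameter instead of an IN list:
    SELECT id, status FROM orders WHERE id = ANY($1::bigint[]);
    -- e.g. interactively:  WHERE id = ANY(ARRAY[101, 102, 103]::bigint[])

    -- Forcing the CTE to be planned and executed on its own, with its result cached:
    WITH recent AS MATERIALIZED (
        SELECT id, customer_id
          FROM orders
         WHERE created_at > now() - interval '1 day'
    )
    SELECT c.name, r.id
      FROM recent r
      JOIN customers c ON c.id = r.customer_id;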
If you really, really want a super fast count(*), just go look at the database statistics: we already capture an idea of how many tuples are in a table, so you can just go look at pg_class.reltuples. Again, it's not super accurate — it's only updated when we do an ANALYZE and things like that — and it's only useful for the whole table, but you can use it if you want. The other thing you can do is, obviously we collect a bunch of statistics, so you can go look in the stats tables if you want some idea of what the most common values are in a particular column — you can just pull that from pg_stats, or pg_statistic I should say, instead of actually querying the database for it. But again, that information is only updated with every ANALYZE, so it may not be accurate enough for what you're doing; it really depends on your use case.

So, again: increasing work_mem and maintenance_work_mem, setting effective_cache_size, and increasing shared_buffers — those are all very memory-oriented things, but they're going to help you out with your queries quite often. Partial indexes and functional indexes — I mentioned them a little bit earlier — are really, really helpful to have. With a partial index, it will index only part of the table, which can be really handy, particularly if you have a very common value. A great example is an orders table: you have orders that go through a progression — a new order, an in-progress order, a completed order — and that orders table is going to end up with a whole bunch of rows in completed eventually. So what you may want to consider doing is creating a partial index on the orders table where you're only indexing the new and the in-progress ones. Because you're only indexing new and in-progress, any query that comes in and says "give me the new orders" or "give me the in-progress ones" can use that partial index, which is going to be way, way smaller than a complete index across the entire table that includes all of those entries for the completed orders. Now, I will caution you about this: just like with functional indexes, you need to make sure those indexes are actually getting used for your queries. One of the problems I've seen is that people will create an index and say, I want an index on "not completed". That's actually an index that's very hard for Postgres to use. We'll let you create it, but in terms of actually planning the query, planning against a negative like that is hard — we have to be sure that the index will have all of the rows that are valid for the query in order to be able to use it; we must know that, otherwise we don't think we can trust that index, because it's a partial index, that's the whole point. So that's why you don't want to use a negative in a partial index predicate. What you really want to do is create the partial index on the actual values that you know you're going to be looking for in your select queries. And of course, remove unused indexes: unused indexes, as I mentioned before and as we've talked about, still have to be maintained, and that means the more indexes you have, the slower your writes are going to be.
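To make that orders example concrete, a sketch with hypothetical status values:

    -- Index only the statuses queries actually look for; avoid the "negative" form
    -- (e.g. WHERE status <> 'completed'), which the planner has a hard time matching.
    CREATE INDEX orders_active_status_idx
        ON orders (status)
        WHERE status IN ('new', 'in_progress');

    -- This can use the partial index, because the predicate provably covers the rows:
    SELECT id, customer_id FROM orders WHERE status = 'new';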
All right, I've got a few minutes here for questions — I technically have four minutes according to my watch, or maybe I'm six minutes over, I don't know, one of the two. Any other questions? [Audience question about query hints.] So here's the thing around hints: we basically consider hints to be essentially giving up on the optimizer, saying it's not good enough or it's not doing its thing right — which, yeah, sure, sometimes it isn't — and so we don't have a hint system in Postgres. What you can do instead is play around with tweaking things like random_page_cost and cpu_tuple_cost; some of those other settings can be useful for tuning. I would be very cautious about those planner tunables, though — I wouldn't tune them in postgresql.conf. Just like with work_mem, if you go and set cpu_tuple_cost inside a session, it's only going to be set for that session, and if you do it right before you run a query and then reset it afterwards, it'll only apply to that query. So those are parameters you can tune if you need to, but I would do it on a query-by-query basis; I would strongly avoid changing things like cpu_tuple_cost in postgresql.conf. They can be helpful in other cases, of course, and if you're having to do that, maybe using hints would have been easier. The downside of hints, though, is that data changes — it's not always going to be the right answer to do a hash join for that part of the query — so having a hint that forces it means you're really tying the optimizer's hands at that point, and that's part of the reason why we don't have a hint system. Bruce, I don't know if you want to comment on why we don't have hints? [Bruce comments about setting random_page_cost in the configuration.] Yeah — just to repeat that for the recording: what Bruce is talking about is that rather than playing whack-a-mole fixing individual queries with hints, it's more effective to fix the model of the system by changing things like random_page_cost system-wide, or effective_cache_size system-wide, because then you're going to benefit all the queries — or changing the statistics target on a particular column, which benefits all the queries using that column. It's a broader answer that solves more problems, rather than having to play whack-a-mole with all those different queries.

Yeah — figuring out what the right indexes are can be challenging, but hints aren't going to help you with an index problem. If the index doesn't exist, you've got to go create it one way or the other; that's not a hint type of thing. And even in that case, it's very likely that the reason we're not using an index is that there's a reason why we can't use it — again, it's a partial index where we can't prove the partial index predicate actually matches the rows the query is matching, or it's a functional index and there's some kind of mismatch with the data type or something. If the index is actually valuable and exists and is something we can use, we're very likely going to use it. The only other case I would argue is when, for example, you create an index and Postgres says, well, we're going to end up touching every page on the table anyway, therefore the index is not going to be helpful. That's something that could be helped by increasing effective_cache_size or decreasing random_page_cost — either one — because that gives Postgres some way of saying, oh, maybe it does make sense to use this. And again, if you're finding that setting random_page_cost lower for that one query is helping, there's a good chance it's going to help more queries on the system, and that may be something you want to set across the board. But at the end of the day, if you're pulling out more than a few percent of a table, chances are Postgres is right to do a sequential scan. You really have to be searching for a very small subset of records for indexes to be better than a sequential scan — even on SSDs and stuff, those sequential scans can still go through data faster than a random scan.
[Audience question about how large shared_buffers can go.] Yes — I mean, these days I would say you can go up pretty large with shared_buffers; we've made a lot of improvements in dealing with that. On a 256-gig box, shared_buffers of 64 gigs and effective_cache_size covering the rest, something like that might be reasonable — or maybe even 128 gigs for shared_buffers, possibly. There are a few cases — although I think we fixed some of them, I'm sure a few remain — where for a certain operation we might have to actually scan all of shared buffers; I think TRUNCATE still requires that. So if you're doing certain operations that you want and expect to be really fast, and you go up really high in shared_buffers and suddenly they're slow, that might be a case where we still have to scan all of shared buffers. But that's pretty atypical; usually increasing shared_buffers is beneficial. It also depends on how large your queries are, but on an OLTP system your transactions and queries are typically going to be pretty small, so you don't have to worry too much about how much backend memory you have. Do realize, though, that the more backends you have running — the higher max_connections is and the higher the number of connections you have — each one of those has some amount of cache in it, as well as having to do processing, and that amount of memory can definitely add up over a long period of time. That's something else to just be aware of. So anyway, I'm losing people and I know we're past time, so I'm going to call it at this point, but feel free to come up and ask me any questions you have. Thank you all so much. I feel like this deck keeps getting