 Hi everyone. So as you said, I'm Joe Conway. Is this too loud? No. I've been in and around the community for almost as long as it's been around. I started using Postgres in '98, started becoming an active community member around 2000, and I'm one of the committers. I'm on the PG infrastructure team. I'm also on the board of PgUS, as Debbie is. So I'm going to give you kind of an overview of what it takes to get Postgres up and running. I'm not going to be able, in 50 minutes, to cover that and also a primer for starting SQL. So this is going to be mostly administrative, although I will go into a little bit at the end in terms of data types that are available and some hints about psql that maybe you'll find helpful. So first of all, how do you do the installation of Postgres? Most of you are probably going to install Postgres using some kind of a package that exists for your OS. So if you're running Windows, you'll go to the Postgres site and download the Windows installer. If you're using something like Red Hat, you'll probably get the RPM from them, or maybe you'll get that also from the Postgres site. There are Debian packages on the Postgres site as well as directly from Debian and Ubuntu, and you could also potentially install from source. And if you're doing development work, I would definitely encourage that, and it's not nearly as difficult as it might sound. Just to give you a very quick example — I know this is an introductory talk so I'm not going to go into it in much detail — right before we started, I compiled Postgres on my laptop, and it takes just under a minute to compile. I mean, there are some dependencies you need to install, and you need to get the source from somewhere. There are source tarballs available, including a daily snapshot of the development tree if you want to track that. 
You can download those or use git, and then you basically do a configure, a make, and a make install, and you've got Postgres. It does not take all that long to do. So once you've got Postgres installed, the first thing you're going to need to do is actually initialize the cluster, because after you've installed Postgres, there's basically still no database on your system. Depending on which OS you're on, there are different ways to initialize the cluster, and this talk is basically focused on Linux variants. I'm not going to really cover much in the way of Windows here, but there are Windows tools. If you download the installer for Windows, you get pgAdmin with it, and pgAdmin actually has menu options for doing things like initializing the database that make it fairly easy. If you installed Postgres yourself, you're probably going to just use the native tool that Postgres provides, which is called initdb. In the simplest form, to initialize Postgres you just do initdb -D $PGDATA, where PGDATA is just an environment variable that represents the directory where you want the cluster created. The important thing to note is that the directory needs to be owned by, and readable and writable by, the user that Postgres is going to run as, which is normally postgres. It doesn't have to be. You can actually install Postgres under your own user account on a Linux system, and if you do an initdb logged in as yourself, then Postgres is going to run as you. Again, that's not normally going to be the case if you're installing from a package for your OS, but if you're installing from source, it might be. If you install from Red Hat or CentOS 7, it comes with an actual setup script. So postgresql95-setup — that's the 9.5 version — and you tell it initdb, and that will create a cluster in a well-known location for a Red Hat system. Similarly on Debian, there's a pg_createcluster script. 
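As a sketch, the three initialization routes look like this (the PGDATA path and the postgres user account are assumptions about a typical setup, not fixed values):

```
# from-source install: run as the user Postgres will run as
export PGDATA=/var/lib/pgsql/9.5/data   # any directory owned by that user
mkdir -p "$PGDATA"
initdb -D "$PGDATA"

# Red Hat / CentOS 7 packaged equivalent:
postgresql95-setup initdb

# Debian / Ubuntu packaged equivalent:
pg_createcluster 9.5 main
```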
So you tell it pg_createcluster 9.5 main. The one difference to note here is that on a Red Hat variant, basically everything goes into a single directory, including the config files. But on a Debian variant, the data goes into your data directory — what people will typically talk about as PGDATA — but the config files actually go in the /etc directory, which they consider to be more standard across other types of servers. And there's a lot to be said for that, actually. But in any case, you should note that if you're working with two different types of systems, you're going to find these things in different places. And notwithstanding what Debbie said, I will probably stop and ask for questions as I go, to some extent. There's a lot of material, and I'm not sure I'm going to be able to get through the whole deck if we have a lot of questions in between. But if you have some burning thing you want to ask, go ahead. So is there anything about that part before I move on? The slides will be available. Every conference kind of does it differently, but there's a wiki at wiki.postgresql.org. On the wiki, there's a page that's called — I don't know, I always Google for "postgresql presentations conferences wiki". And usually you find there's one page that sort of aggregates all of the conferences. And then for each conference, there'll be a link into the wiki that has a page just for that conference, and all of the slides are normally linked up there. And I will definitely put mine on there. In fact, I often end up creating that page for conferences, so I will definitely do that. Okay, great. So once Postgres is actually initialized, now you can finally start it up. And again, there's a variety of ways to do this. Part of the reason for that is historical. I mean, Postgres has been around, as other people have alluded to. 
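As a rough map of those two layouts (exact paths vary by package and version; these are typical examples, not guarantees):

```
# Red Hat-ish: data and config live together
/var/lib/pgsql/9.5/data/postgresql.conf
/var/lib/pgsql/9.5/data/base/...

# Debian-ish: config split out under /etc
/etc/postgresql/9.5/main/postgresql.conf
/var/lib/postgresql/9.5/main/base/...
```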
Postgres actually started 30 years ago in Berkeley and has been open source for 20 years. So some of this is a consequence of the evolution of capabilities over the years. But one note here — you're probably, as a beginner, not going to be doing this — there's something called single-user mode, which can sometimes be needed to recover or do certain operations. And if you need to do that, you're probably already engaging with some Postgres consultant to help you do it. But in any case, this is the command: you use the postgres binary with --single, point it to the data directory, and give it a database name. It basically launches Postgres into single-user mode, which is sort of limited in capability, but it turns out to be important sometimes. If you want to do a manual start of Postgres — and again, this is typically going to be the case where you've installed from source, not using the OS packages — there's a binary provided with Postgres called pg_ctl. Again, use -D, give it the data directory you want to run. Optionally, you give it a path to a log file, and you just tell it to start. And there are lots and lots of other options for these command-line tools. I would encourage you, if you haven't already — let's see if I've already got it up — to go to the Postgres website; right in the middle there's the documentation, and then pick your version. I mean, it can be pretty daunting to find stuff in there, but all of the nitty-gritty details for those binaries are going to be available under either client applications or server applications. So pg_ctl is a server application, and if you went in there, you would see all the different options, which I'm not going to try to go through today. 
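In command form, a sketch (the log path is an example value, and "postgres" at the end of the first line is the database name to connect to):

```
# single-user mode (rare; recovery situations)
postgres --single -D "$PGDATA" postgres

# manual start via pg_ctl, logging to a file
pg_ctl -D "$PGDATA" -l /tmp/pg.log start
```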
So now, if you installed from a package and you're running this more or less on a production system, you're probably going to use the service facility of the OS. On CentOS 6, that's just service postgresql start. Red Hat 7 started using systemd — actually, some of the Debian versions also use systemd now. So if you're on a systemd system, it's going to be systemctl start postgresql-9.5 for version 9.5, postgresql-9.6 for version 9.6, and so on. On a Debian-based system, again, they provide a script called pg_ctlcluster: pg_ctlcluster 9.5 main start. So those are all different ways to accomplish the same thing; that's going to get the service up and running. Now, how do you stop it? If you're in single-user mode, you basically just hit Ctrl-D, which will exit out of the program that's running. If you've done a manual start, you can use pg_ctl again: you do -D, tell it which data directory. And as an aside, one of the reasons you're specifying the data directory all the time is because it's perfectly possible to run multiple Postgres clusters on one machine if you want. Each one has to be bound to a different port, but you could run multiple copies of the same version, or you could even run multiple copies of Postgres at different version levels on the same machine. For instance, I do Postgres development, and if I'm fixing a bug that applies back through Postgres 9.2, I'll often run Postgres 9.2 all the way through the current development tip all at the same time, so I can apply and test the patch. So here we're specifying the data directory, we're saying that we're going to stop, and then we're giving it -m fast, which is telling it how we want it to stop. And I'm going to go into that in a little bit more detail in the next slide. 
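Running two clusters side by side looks something like this, as a sketch (the data directories and ports here are made-up examples; pg_ctl's -o flag passes extra options through to the postgres binary):

```
pg_ctl -D /srv/pg92 -o "-p 5492" -l /tmp/pg92.log start
pg_ctl -D /srv/pg95 -o "-p 5495" -l /tmp/pg95.log start

# stop just the 9.5 cluster, gracefully
pg_ctl -D /srv/pg95 -m fast stop
```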
Suffice it to say at this point that probably what you want is this fast option, and that's what the operating system service control is going to use — I'm 99% sure. So again: Red Hat/CentOS 6, service postgresql stop; CentOS 7, systemctl stop postgresql-9.5. And by the way, the service file might not be called postgresql; it might be called postgresql-9.5. It kind of depends on where you got the package from. If you got the package from Red Hat itself, or if you got the RPM package from the Postgres development group repository, it might be slightly different. You can figure out what it is if you look in the /etc/init.d directory and see what the script is actually named. And then finally, on a Debian-based system, you would use pg_ctlcluster 9.5 main stop. So in terms of these shutdown modes, it is kind of important to understand this to a degree. There are three different modes that are supported: smart, fast, and immediate. Smart basically says, I want Postgres to shut down, but I want all of the existing sessions that might be running to finish first — finish what they're doing and then shut down. And that sounds like something you would often want, except the problem is some of those sessions might last for a very long time. And if you're really trying to get Postgres to shut down now, that can be kind of annoying. The fast option will terminate all the existing sessions, but it does it in a graceful way: it's doing a proper shutdown of all of those backends, not just cutting them off at the knees, right? The immediate option just summarily kills everything. That's essentially equivalent to pulling the plug on your server. When Postgres comes back up after an immediate shutdown, it's going to go into recovery mode, as if it had crashed. 
So as if Postgres had crashed, or the OS had crashed, or the server had lost power — that's what's going to happen if you do an immediate shutdown. And the problem with that is, number one, Postgres is going to take more time when it comes up, because it's got to roll forward through all of the transaction log that had accrued since the last checkpoint. And also you're going to lose some important information. There are some bits of information that are not maintained if Postgres is not shut down in a controlled way. In particular, statistics about when autovacuum has happened or is needed will be lost. And — I think it's unlogged tables — if you don't have a graceful shutdown, the data in those unlogged tables will be lost. Any questions about all that before I move on? So now let's talk about some other administrative things. What if I want to terminate just a particular session? Well, as with many, many things in Postgres, there's more than one way to do this. They all, in the end, boil down to the same kind of thing. But if you wanted to do this from the command line, there are a lot of ways to get this information; this is kind of my favorite command. If you do ps -fu and then the name of the user that Postgres is running under, that gives you just the processes running under that name. And on a typical system, the only processes running under postgres are the main Postgres server processes. I mean, if you've got someone logged in as postgres, they may have some other processes running, but typically this is going to give you just the processes that are involved in Postgres. And if you grep for, say, the name of the database of the session that is logged in, you can get the PID. And then you can simply say kill -SIGTERM. That's important: SIGTERM is a very specific signal that causes that backend to terminate in a graceful way, again. So you wouldn't want to do something like a kill -9. 
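Put together, a sketch (the database name "salesdb" and the PID are made-up examples):

```
ps -fu postgres | grep salesdb    # find the backend's PID
kill -SIGTERM 12345               # graceful termination of that one backend
# never: kill -9 12345            # Postgres treats that like a backend crash
```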
If you wanted to do that from inside Postgres, you can use this. Again, there are different ways to get this, but this query is one way. You can say SELECT pid, state, and clock_timestamp() minus state_change FROM pg_stat_activity WHERE the database name is equal to some database of interest. And the query column would give you the text of the SQL. So in this case, I'm seeing a PID, I'm seeing a state, and here you can see this one is idle in transaction, which, as an aside, you generally don't want to see. That means your application is probably misbehaving; you should never really see that for more than a transient period of time. You can see how long it's been in that state because of this calculation here. So this has been idle in transaction for 26 seconds, and the last query that ran was a BEGIN — it started the transaction. So that's the one I want to terminate. Well, now that I've got the PID, there's this function pg_terminate_backend, and that essentially does exactly what the kill did up here. Now, instead of terminating the backend, which will basically pull the rug out from under whoever is logged in, you could simply cancel a long-running query — cancel the query that's currently being executed. So in a similar way, I'm going to use ps to find the PID of interest, and then I'm going to send a different signal, SIGINT, which is basically like hitting Ctrl-C. In fact, when you hit Ctrl-C, that's the signal that gets sent to the process. So if you do kill -SIGINT to that backend, it will cancel the existing query that's running. Same thing from SQL: you can run that query against the pg_stat_activity view, find the PID that way, and then say pg_cancel_backend, giving it the PID number. Okay, any questions about that before I move on to some configuration stuff? So configuration of Postgres is, by and large, kept in two files called postgresql.conf and postgresql.auto.conf. 
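The SQL side of that, as a sketch you'd run through psql (the database name "salesdb" and PID 12345 are invented for illustration):

```
SELECT pid, state, clock_timestamp() - state_change AS time_in_state, query
  FROM pg_stat_activity
 WHERE datname = 'salesdb';

SELECT pg_terminate_backend(12345);  -- kick the whole session (like SIGTERM)
SELECT pg_cancel_backend(12345);     -- just cancel its current query (like SIGINT)
```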
And as I said earlier, those files on a kind of red-hattish system are going to be in the data directory, more than likely, and on a debian-ish system they're probably going to be under /etc/postgresql. You can use a hash to denote a comment in those files, and the format is typically name = value. Pretty straightforward stuff. Peter, somehow I don't think you need an intro to Postgres. Feel free to sit in, I don't care. So once you change a value — I mean, there are a bunch of ways to change configuration in Postgres, like everything else. If you change one of those files — and postgresql.auto.conf you're not going to change directly yourself; I'll cover that in a minute — but if you change postgresql.conf and you want to see the change actually take effect, you have to tell Postgres that you made a change. So that's what we're going to do right here. This is one way: you can use a query, SELECT pg_reload_conf(). And you can see I'm actually going to send that through psql, which is the command-line client, as a command-line parameter, and I want to do that as the postgres user. There are other ways to do that, which I'll show you later on, but you can specify the user on the command line for psql if that's not who you're logged in as. This will cause Postgres to reload its configuration. You can also use that pg_ctl program I was talking about earlier. Again, you have to tell it which data directory, so it knows which Postgres cluster to act on, and you tell it reload. The service control supports it as well: service postgresql reload, or systemctl reload postgresql. And then finally, just like the other options, you can send a specific signal, SIGHUP, to the postmaster PID. When you send a SIGHUP to the postmaster PID, Postgres will reload. And basically, all these other methods are just doing this under the covers. 
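The four reload routes side by side, as a sketch (pick whichever matches how you run Postgres; the postmaster PID placeholder is something you'd look up with ps):

```
psql -U postgres -c "SELECT pg_reload_conf();"
pg_ctl -D "$PGDATA" reload
service postgresql reload        # or: systemctl reload postgresql-9.5
kill -SIGHUP <postmaster-pid>    # what all of the above do underneath
```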
One note that I haven't made so far: when you do look at these slides later on, in a number of places I've put a link directly into the documentation that will show you the relevant information on the slide, if you want to go into more detail. So when you're modifying configuration, you can make persistent changes, and you can also make changes that are per-session. If you want them to be persistent, modifying postgresql.conf, as I was just talking about, is the most likely way you would do that. You can also actually modify the startup script for Postgres and pass in configuration options right on the command line. That was something that was done more often years ago; it's not very often done now, but it certainly still works. And then in the last few releases — I think since 9.4 — there's a new command called ALTER SYSTEM. Basically, ALTER SYSTEM is a way to make changes to postgresql.conf — well, technically not directly to postgresql.conf; that postgresql.auto.conf is what gets written by the ALTER SYSTEM SET command. So if you do that from within a SQL session, it'll write out to that auto.conf file, and the auto.conf file is, in turn, included by postgresql.conf at the end. So changes that you make through ALTER SYSTEM get persisted in that way. The other ways you can make persistent changes to configuration are through the ALTER DATABASE and ALTER ROLE commands. And these are a little bit different from the others in that if I do ALTER DATABASE, database name, and SET some configuration variable to some value, it's going to change that value just for that database. So now, whenever someone is logged into that database, that variable is going to have that value. Similarly, you can do that for a role. So in Postgres, users and groups still exist, because they've existed from the beginning, but at one point they were all reimplemented as a generic role. 
So a role that can log in is basically a user, and a role that cannot log in is considered to be a group. But in any case — a role that's going to log in is typically what you would do this for — I can set some variable equal to some value. Let's say, and I'm going to cover this later if you're not familiar with it, there's a setting called work_mem, which has to do with how much memory you have available for certain operations when you're running queries. If you have one power user and you want that power user to have a lot of work_mem, you can alter that user and set work_mem to some big value. Now only that power user gets the large value, and everyone else gets the standard value. You can also change some of these things per session. So there's a SET command, a RESET, and a SHOW. And you can also find out those values from a system view called pg_settings. You can directly update that view in order to change one of these values; you can select the setting using a function; and you can also change the setting using a function. So these are, again, multiple ways to do the same things. Now, one thing to note here: I've been talking about this as if any of the variables can be set using any one of these methods. The fact is that some variables require a postmaster restart in order to take effect, some require a reload in order to take effect, and there are others that are session-settable. The per-session approaches are really only going to work with session-settable variables. You can find out which ones those are with this pg_settings view: it has a column called context. So if you did SELECT name, setting, context FROM pg_settings, it would show you all of the configuration variables on the system and how they're set right now. 
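As a sketch of those different levels (the database, role, and values are invented for illustration):

```
-- persists cluster-wide; written to postgresql.auto.conf
ALTER SYSTEM SET work_mem = '64MB';

-- per-database and per-role overrides, applied at login
ALTER DATABASE salesdb SET work_mem = '32MB';
ALTER ROLE power_user SET work_mem = '512MB';

-- current session only
SET work_mem = '256MB';
SHOW work_mem;
RESET work_mem;
```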
And that context field would tell you — and you'd probably have to go into the documentation to figure out which word means which thing — but basically, that context will tell you which things are settable per session. Any questions about that before I move on to specific settings? Yes. So, like I said earlier, the question is: do some of the settings require Postgres itself to be restarted versus just reloaded? And that is definitely the case. There are certain settings — a perfect example is shared memory; another one is max_connections — where, if you want to change them, you have to actually bounce Postgres. Whereas when I was talking about reloads earlier, a reload doesn't actually restart Postgres; it just rereads the configuration files without stopping and starting Postgres. So there's no interruption of service if you can get away with a reload. But there are a handful of things that you can only change by restarting Postgres. This is in the documentation on the postgresql.org site — and, as I said, is this large enough to see in the back? So you can see some of these are postmaster context; this is what I was talking about just a minute ago — those will require a restart. The ones that say sighup, you can probably figure out from what I was saying a minute ago: sighup means a reload. If you see user, that's something that can be changed by a normal user on a per-session basis. But again, keep in mind that if you do that in the session, it's not persistent; it's only for that session. There are also some — let me see if I can find an example — yes, superuser. So there are some things that are per-session settable, but only by the superuser. And then you'll see some of these are called internal, and those are actually for information only; they can't be changed. This block_size, for instance, is a compile-time parameter. Anything else before we go on? 
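You can ask Postgres directly which settings fall in which bucket; a sketch:

```
-- which settings need a full restart to change?
SELECT name, setting, context
  FROM pg_settings
 WHERE context = 'postmaster';

-- swap 'postmaster' for 'sighup' (reload is enough), 'user'
-- (per-session), 'superuser' (per-session, superuser only), or
-- 'internal' (informational, can't be changed) to see the others
```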
Okay, so now I'm just going to cover a handful of the more common things you might need to think about. I don't know exactly how many configuration variables there are that can be changed — probably a couple hundred or so. Regina, do you know? There are a lot. You also don't need to be here. So I'm just going to cover some of the more common ones, basically. This listen_addresses controls what interface Postgres is listening on. The default values kind of depend on the distribution; different distributions default things differently. But if this is set, for instance, to localhost, that means no one will be able to connect to Postgres from another machine, because it's not listening on anything but localhost. You could change that to an asterisk, which just means listen on all interfaces, or you can change it to a specific IP address for one of your interfaces, and it'll only listen on that IP address. And actually, from a security best practices standpoint, you probably want to specify an IP address and not just put a star in there, because you may want to restrict connections to the database to one specific subnet. As I mentioned earlier, the Postgres listener runs on a particular port. The default port is 5432, but it can run on essentially any port that's not already occupied by some other service. And if you want to run multiple versions of Postgres or multiple Postgres clusters on a single machine, you have to change this port so that they differ. Now, whatever you change it to, you've also got to make sure the client knows that, and most of the clients have options to specify it. They all default to 5432 typically, but you can tell the client, I want to connect on 54432 or something. Then max_connections: this is the maximum number of connections the database is going to allow. 
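In postgresql.conf terms, a sketch (the address here is an example value; the point is a specific interface rather than '*'):

```
listen_addresses = '192.168.10.5'   # or 'localhost', or '*' for everything
port = 5432
max_connections = 100
```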
One thing I'll say here: if you find yourself wanting to set this to a thousand or more — really, even over 500 — you probably need a connection pooler. You don't want to set Postgres up to use thousands of connections; you really want to use connection pools. So when I go in somewhere and I see a very large value here, I immediately start asking why. And there are performance impacts if you have a huge number of connections, even if they're not all being used. For one thing, each Postgres backend consumes a fairly small but still significant amount of memory — let's say 10 megs. And let's say I set max_connections for 5,000 connections; just do the math, right? superuser_reserved_connections is kind of your emergency backstop. If you need to be able to log in as the superuser to fix some problem, you don't want to be locked out because all of your connections are used, right? So this allows you to specify how many of those connections should be reserved only for superusers. And then ssl — it typically defaults to off. SSL has to do with encryption between the client and the server. There is a performance penalty for that. However, you should be aware that with unencrypted Postgres connections, if someone is sniffing those packets, they could replay them, and potentially do other bad things, right? So if you're not using SSL between your client and your server, you'd better be on a trusted network, or maybe your client and your server are on the same machine. shared_buffers: this is the buffer cache. So when you do a SELECT and you pull data out of a table, Postgres basically reads it all up into the shared buffers. This is kind of a fixed-size pool of memory where that data will reside until it needs to get pushed out because other things are being used more often. The whole purpose of that is performance. 
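"Just do the math" works out like this, as a sketch (the ~10 MB per-backend figure from the talk is a rough assumption, not a measured constant):

```shell
PER_BACKEND_MB=10      # rough per-connection overhead (assumption)
MAX_CONNECTIONS=5000
TOTAL_MB=$((PER_BACKEND_MB * MAX_CONNECTIONS))
echo "${TOTAL_MB} MB (~$((TOTAL_MB / 1024)) GB) of overhead before any query runs"
```

That's on the order of 50 GB tied up in backend overhead alone, which is why a pooler funneling traffic through a few hundred real connections is the usual answer.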
If your data is all read into shared buffers, it's all in memory, and the next person who comes along and queries the same data, instead of going all the way to disk for it, just pulls it out of shared buffers. If you change that data, it gets changed in shared buffers, marked as dirty, and eventually gets pushed down to disk in a safe way. The general guidance here is that for shared_buffers, you typically hear people say use between 25 and 50 percent of RAM if it's a dedicated database server. In older versions of Postgres, Postgres didn't do so well if you went above about 8 gigs, but then again, in older versions that wasn't so typical; these days you find many database servers that have 256 gigs of RAM, or a terabyte of RAM, or something like that. On older versions, you probably still don't want to go above about 8 gigabytes, but newer versions of Postgres are more efficient and you can set it fairly large. The thing to understand, though, is that if your working set — the data that's constantly being read by the clients — can fit in it, then your performance is going to be much better than if it just barely doesn't fit. Because if it just barely doesn't fit, you're going to have a lot of turnover, and it can actually be slower to have large shared buffers if your working set doesn't quite fit. I would start with a quarter of RAM, but the point is you really need to experiment with your own queries and your own tables and your own working set and optimize that value, because there's no way for anyone to give you the perfect answer. In some workloads it'll turn out that setting shared_buffers to 2 gigs, even though your database is 20 terabytes, is the best-performing choice, and it'll perform better than something larger. In some cases, setting shared_buffers to 32 gigs might make a huge difference in performance. 
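The quarter-of-RAM starting point is just arithmetic; a sketch for a hypothetical 64 GB dedicated server:

```shell
RAM_GB=64
START_GB=$((RAM_GB / 4))   # the common 25%-of-RAM starting point
echo "try shared_buffers = ${START_GB}GB, then benchmark up and down from there"
```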
So there's really no way to say ahead of time what's going to be better. There is this general rule of thumb, 25 percent of RAM, but I would just point out that you need to be careful about just accepting that; you should really test a few different scenarios. Okay, I'm going to get into that right now. Yeah, so huge_pages: if you've got very large shared buffers, you'd want to use huge pages. This is an OS thing that I think was supported starting around Postgres 9.4. Basically, if you're using really big chunks of shared memory, you want those pages to be allocated in big chunks in order to be more efficient. work_mem is used by query nodes for certain operations. The two best examples: if you're doing aggregates, there's a type of aggregate called a hash aggregate. It builds a hash table in memory for each of the groups and then just updates it; that's a very efficient way to do an aggregate if you can. Sorts will also use work_mem, so if there's a sort node in your query and it can do its work without spilling to disk, that's more efficient. Basically, work_mem is what Postgres will try not to exceed. So if it does the calculation and says, in order to do this sort I need two gigabytes, and work_mem is set to four megabytes, then it's not going to try to do that in memory; it's going to spill to disk. 
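A back-of-the-envelope sketch of why work_mem needs care (all numbers invented; each sort or hash node in each running query can use up to work_mem on its own):

```shell
WORK_MEM_MB=128        # per sort/hash node, per session (example value)
CONNECTIONS=1000       # concurrent busy sessions (example value)
NODES_PER_QUERY=1      # assume just one memory-hungry node each
WORST_CASE_MB=$((WORK_MEM_MB * CONNECTIONS * NODES_PER_QUERY))
echo "worst case: ${WORST_CASE_MB} MB (~$((WORST_CASE_MB / 1024)) GB) on top of shared_buffers"
```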
Now, this value is settable per session, but you should understand that it's per node. A query is broken into a bunch of nodes, so when you're querying, say, five tables that are joined, and you're doing ORDER BY and you're doing GROUP BY, you might have multiple nodes as the data is gathered up to be returned, each of which uses work_mem. So the important thing here: shared_buffers is allocated once for the whole cluster, so you can set that to, say, a quarter of RAM and be reasonably certain you're not going to drive yourself into swap. But work_mem goes back to the number of client connections: if I've got a thousand client connections and they're all running queries and this is set to 128 megs, again, do the math — you might drive yourself into swap pretty quickly. So you have to be careful about setting this value too high. But setting it higher in some cases will really significantly improve a query, so tricks like having a power user with a higher value, or just setting it higher in a particular session prior to a query, just for that query, can be really useful. maintenance_work_mem is used, in older versions of Postgres, for vacuuming, and it's also used for index building. In newer versions of Postgres there's actually a separate setting, which I didn't list here, called autovacuum_work_mem, but it defaults to the value of maintenance_work_mem; you can actually split the two out. The other thing to know about older versions of Postgres: you could not usefully set this value to more than about one gigabyte, I think up until Postgres 9.4. After Postgres 9.4, you can actually set it to a very large value and it'll get used. So this is a case — and again, this is per-session settable — where, because it's potentially involved in autovacuum, you do have to be a little careful. When autovacuum kicks in, there's a setting for how many autovacuum workers you want to have, and that defaults to something like three. But let's say you set 
autovacuum workers to something like 10, and then you set this value to something like five gigabytes, and all your autovacuum workers kick in on large tables: now all of a sudden you're using 50 gigs of memory. So you've got to be a little careful about this one, though not quite as careful as you do with work_mem. But for building a specific index, if you can build it all in RAM, I've done testing and it'll build, at least in tests I've done, sometimes as much as three times as fast.

Shared_preload_libraries is something that's used for extensions to Postgres. In many cases it's not required to do this, but what happens is, if you've got a library listed here, it will get loaded when Postgres first starts, which means it's already there when the client connects. The way Postgres works is that when a client connects to the postmaster, a process is forked, and when it's forked it's just a copy, right? Anything that was there when the client connects goes with it when it forks. So if you preloaded libraries, they're already there as soon as the client session starts up and don't have to be reloaded. So that's kind of a performance tweak that you can do if you want: if you're using some extension like PostGIS, say, you could load it here and it'll be loaded already whenever your client connects, instead of having to load it per session. In some cases, because of the way they work, some extensions actually require you to load them at Postgres start time, and those will typically tell you that. And I'm probably going to run short on time here; how much time do I have?
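Pulling the memory and preload settings together, a postgresql.conf fragment might look like the sketch below. The numbers assume a machine with something like 16 GB of RAM and are illustrative, not recommendations; test them against your own workload.

```ini
# Illustrative postgresql.conf fragment -- assumed 16 GB machine, tune for yourself
shared_buffers = 4GB            # allocated once for the cluster (~25% rule of thumb)
huge_pages = try                # worth enabling with large shared_buffers (9.4+)
work_mem = 8MB                  # per query node, per session -- multiplies quickly
maintenance_work_mem = 1GB      # used by vacuum and index builds
autovacuum_max_workers = 3      # each worker can use up to maintenance_work_mem
shared_preload_libraries = 'pg_stat_statements'  # loaded once at server start
```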
Okay, all right, I guess I'm not doing too badly; I'm on slide 12 of 43. That's a fairly complex question. So the question was: are there statistics available that would tell you whether these things are set appropriately for your workload? There are statistics on the system, but they're not really going to tell you whether work_mem is set correctly. There's actually an extension that you can use that will give you more information about what's in shared buffers. I've only got five minutes? Yeah, that's what I thought. Ten minutes, okay. I'm going to try to speed up a little bit and get through more of this material.

Okay, well, let me just get through the configuration parameters anyway. So there are parameters for the write-ahead log, which is also known as the transaction log. You always want fsync on. Wal_level is set based on whether you want to do replication or not. Synchronous_commit is typically on, but that's one that you can consider turning off; the implication is that if there was a crash in the middle of a transaction, you might lose that transaction, but you won't actually end up with a corrupt database. There's checkpoint_timeout, and prior to 9.5 there was checkpoint_segments; in 9.5 and after there's max_wal_size. Basically this has to do with how many pre-allocated transaction logs there are. If you've got a very busy, write-heavy database, you would want that to be larger, you would want checkpoint_timeout to be larger, and you would want checkpoint_completion_target to be larger. These shown here are all kind of the defaults.

There are some planner parameters you should know about. Effective_cache_size: you typically want to set this to something like half, maybe three quarters, of available memory. Then there are seq_page_cost and random_page_cost: this is the relative cost of doing a sequential read on the disks versus doing a random read on the disks. This was set years and years and years ago based on some testing, often based on
real workloads and real hardware. Today I find that random_page_cost is too high and should be lowered to something like two, maybe even one point something. And if your whole workload fits in shared buffers, you would actually want to start approaching one, because if everything's getting read out of memory, the two costs are pretty much the same. But again, that's something that you might want to test on your actual data, your actual workloads, and your actual hardware.

There are all kinds of logging parameters. I'm not going to try to go through all of these, but you should know that what you can log, and where you can log to, is very flexible in Postgres, including using facilities like syslog and the event log on Windows; you can control many different things that get logged. Log_min_duration_statement is set to off by default, but this is something where, if you want to look for long-running queries, you can set it to, say, a thousand, which means one second (it's in milliseconds), and now any statement that takes more than one second is going to get put into your log. If you set this to zero, it's going to log literally everything. If you've got a busy system, that's probably going to cause a performance impact in and of itself, but it can be useful to do every once in a while, just for a specific operation: you set it to zero, which you can do on a per-session basis, you run some operation in your app, you turn it back, and now you've got a snapshot of every query that your app ran during that operation. Sometimes that can be really useful. I've seen applications that run hundreds of thousands of individual short queries, so every one runs very quickly, but there are so many of them that the whole operation is slow, and you can often rewrite those kinds of things to do a set operation instead of a bunch of individual rows.

So, in miscellaneous, there's a search path. Postgres supports schemas, and tables go in schemas.
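As a quick illustration of schemas (all names here are made up), two schemas can each hold a table with the same name, and the search path decides which one an unqualified reference finds:

```sql
-- Hypothetical schemas and table names, for illustration only.
CREATE SCHEMA app;
CREATE SCHEMA archive;
CREATE TABLE app.events     (id int, note text);
CREATE TABLE archive.events (id int, note text);

SET search_path = app, archive;
-- An unqualified "events" now resolves to app.events,
-- the first match found in the search path.
SELECT * FROM events;
```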
Basically, the search path controls which object is going to be seen by your client. It's similar to an executable program on your operating system: if you had two copies of the same program in different locations, the one that comes first in the path is the one that gets found and used. Same idea here. You can set a statement timeout, which will cause an error if a statement takes more than that time to run. You can set a date style and a time zone. So I guess that's all the time we have.

Question: are there some free and easy-to-use analytical tools that you can recommend for analyzing your system, whether your configuration is set right, and what the bottleneck is?

Well, there's a website called explain.depesz.com which will allow you to put an explain plan in, and it'll basically give you a more easy-to-read version of the explain plan. There's also a project out there called PgTune that some people use, which will help recommend settings. Pg_stat_statements is built in, and looking at that is very useful. Anyone else have a quick question? Okay. Joe, will you be available later for people to ask questions, in the hackers' lounge or maybe around the corner? Yeah, I'm actually room hosting quite a bit, but I'll be around the conference until Friday, and if you can corner me someplace, maybe tonight during the reception or something, I'd be happy to spend more time with you. Well, thank you, Joe, very much for your presentation. Let's give him a hand. Thank you so much.