Okay, after that short delay we'll get started. This is Shoaib, talking about tuning PostgreSQL.

Yeah, so before I start I just wanted to get some idea: how many of you are using PostgreSQL in production? Oh, that's great. That's great, because I don't get to see that kind of feedback anywhere else. And how many of you have tried tuning the PostgreSQL server and found it really hard to do? Cool, yeah. I'm sorry? Oh yeah, that's what I do. I'm a PostgreSQL consultant and I work for Fujitsu. We do the same thing as well: we help customers tune their PostgreSQL servers, and we provide support services too, 24/7 support and business support. So yeah, it's over there as well.

We'll basically be going through the few things we can cover in the next 20 minutes, because tuning a database server is a very big topic and we can't go through everything in that time. We'll be looking at procedures you can use for diagnosing performance problems on a PostgreSQL server, then how you benchmark your database server's performance, and then a little bit about database monitoring as well.

So let's start with the procedures you usually carry out to diagnose performance problems. First of all, you need to identify the problem areas. I basically break it down into: starting with application analysis, then looking at the SQL, looking at the memory usage, which involves the buffer cache PostgreSQL is using, and then looking at the storage setup you've got, the file system, and the very famous postgresql.conf file that people find really hard to tune.
So starting with application analysis: the first thing you look at is how the application is interacting with the database. Is it reading large amounts of data? Is it writing large amounts of data? Are you running ETL jobs, or analytical queries? These all really matter, because when you're tuning your postgresql.conf you need to think about all of these things before you set those values. So you have to be really careful about the way the application is working. Something I've often found is that sysadmins have a really hard time tuning these things, because they don't always know how the application interacts with the database server, and yet they've been given the job of tuning it; that's a very hard job for somebody who doesn't know how the application talks to the database.

The next thing is the SQL: finding the problematic queries, because that's the starting point, identifying the queries that are executing for long periods of time. The best way to do that is to install a query analysis tool on top of your database server. What those tools do is look at the database server log files and generate some very nice reports from the data they gather there. I've put a couple of them in the presentation as well. epqa is the one I really like, because it's written in Perl so you don't have to install anything else, whereas if you're using pgFouine you have to install PHP as well, because it only works with PHP. So I usually prefer epqa; it's much nicer and gives you a very nice report. Once you've found the problematic SQL queries, the next step is using EXPLAIN ANALYZE on those queries.
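As a quick sketch of that step (the table and column names here are made-up examples for illustration):

```sql
-- Plain EXPLAIN only shows the planner's estimates:
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- EXPLAIN ANALYZE actually runs the statement and reports real timings.
-- Because it really executes, wrap DML in a transaction and roll back
-- so the data changes don't stick:
BEGIN;
EXPLAIN ANALYZE
UPDATE orders SET status = 'shipped' WHERE customer_id = 42;
ROLLBACK;
```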
Has anybody here used EXPLAIN ANALYZE and found it really hard to understand the output? Yeah, that's exactly the next thing. There are some nice tools that can help you understand EXPLAIN ANALYZE output; I can tell you about those tools after the presentation if you want more detail, because we can't really cover everything about the EXPLAIN ANALYZE output in these 20 minutes. But the next step is to look at the plan, see exactly which parts of the plan are taking the most time, and then start debugging those areas.

Another thing to note about EXPLAIN ANALYZE: be careful using it with DML-type queries. A plain EXPLAIN just gives you the estimates, but EXPLAIN ANALYZE actually executes the query and then gives you the real timings. So if you run it on a DML query, an INSERT, UPDATE, or DELETE, it will in fact run that query, and the changes will all be made to the database. So just be a little bit careful when debugging DML queries.

The next thing to look at is memory usage. The first thing you need to gather is the size of the database: is it small enough to fit in memory or not? If you can't fit it in memory, say it's a one-terabyte database and you can't get that much RAM in the box, then you need to look at getting faster disks to offset that. Other than that, you should be looking at buffer cache usage. Postgres uses both the OS cache and its own buffer cache, which is a little different from other databases. Something I've noticed with Oracle DBAs is that they assume giving all the shared memory to the database will get better performance, but with Postgres it won't behave the same way, because Postgres uses the OS cache and its own buffer cache together. At times you will see double buffering as well, but Postgres manages it very intelligently.

So you basically have to look at how Postgres is using the buffer cache, and for that Postgres comes with the pg_buffercache contrib module. Has anybody here used contrib modules with Postgres? No? Contrib modules are the contribution modules that ship with Postgres; there are RPM and DEB packages available for them, and once installed they sit alongside the PostgreSQL installation, so you just need to run a few SQL scripts to get them going. pg_buffercache is one of them, and it lets you see exactly what is in the buffer cache right now and how each table is using the cache available to Postgres. You can relate that to your most-used tables and see whether they are being cached properly; if they're not, something is going wrong there. That's how you look at memory usage for a Postgres database server. If something is wrong, you need to look at increasing shared_buffers, or at how you're using the OS cache.

The next thing about memory is looking at the query plans for anywhere a sort is spilling to external disk. Postgres uses RAM for sorts during query execution, so a sort that goes to external disk is a terrible thing for the query. Keep a watch on the query plans and look for lines that say an external disk sort is being used; that's a bad sign, and it usually happens when there isn't enough memory available and Postgres has no option but to go to disk, which is rather like hitting swap.

The next thing is looking at storage: what type of storage setup you're using, direct-attached storage or a SAN.
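Going back to pg_buffercache for a moment, a typical inspection query (close to the example in the contrib documentation; assumes the module is already installed in the current database) shows which relations currently occupy the most shared buffers:

```sql
-- On 8.x, install by running the shipped SQL script
-- (e.g. share/contrib/pg_buffercache.sql); on 9.1+ you can use
-- CREATE EXTENSION pg_buffercache.
SELECT c.relname,
       count(*) AS buffers
FROM   pg_buffercache b
JOIN   pg_class c
       ON b.relfilenode = c.relfilenode
      AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                                WHERE datname = current_database()))
GROUP  BY c.relname
ORDER  BY buffers DESC
LIMIT  10;
```

If your most-used tables aren't near the top of this list, the cache may be mis-sized for the workload.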
I would always recommend direct-attached storage, because I've had some terrible experiences with SANs. We had a customer using a NetApp SAN, which is very good if you tune it properly, but they had one big aggregate using 48 disks, they didn't give the database a dedicated aggregate, and everything else, the Exchange server, the file server, was on that same aggregate. The database was always competing for resources and never getting what it needed. So we ended up going with a cheap direct-attached storage setup instead of a share of the SAN, and that usually gives you better performance, because you know the database server has dedicated use of that chunk of storage.

For the RAID setup, I usually recommend RAID 10. Doing some benchmarks comparing RAID 10 and RAID 5, we found that for a write-heavy database RAID 10 gives better performance, because with RAID 5 you're storing that extra parity information, which becomes a hit on write performance. So we usually recommend going with RAID 10.

Other than that, look at the iostat output from time to time. I have customers set up scripts in Nagios and do trend analysis on those iostat outputs, to spot any spikes while the database server is doing its normal activity. If you do see spikes, you need to find where the bottlenecks are. Look at the queue sizes as well: if the queue sizes look abnormal relative to the number of disks available, then something is wrong and there's an I/O bottleneck somewhere, so you need to identify those areas too.
And then the RAID controller settings. These days RAID controllers usually come with a write-back cache, but you need to verify that your controller is actually using it, and that the cache is battery-backed. Something I've seen people ignore is monitoring battery health: they think they have a battery-backed RAID controller, but they don't monitor the battery, so when they do have a crash the battery is not in good health and the controller can't do a proper crash recovery, even though it was nominally battery-backed.

Postgres also has tablespaces, which can be very useful if you're trying to distribute data across different storage locations. A good starting point is separating your indexes and your table data onto different tablespaces, which helps with the I/O load. Your transaction log files should also be on a separate storage area from your data directory; that helps I/O performance, especially with a write-heavy database.
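The tablespace separation described above can be sketched like this (the paths and names are hypothetical examples):

```sql
-- A tablespace on a separate disk or array for indexes; the directory
-- must already exist and be owned by the postgres OS user:
CREATE TABLESPACE index_space LOCATION '/mnt/disk2/pg_indexes';

-- Put a new index there explicitly...
CREATE INDEX orders_customer_idx ON orders (customer_id)
    TABLESPACE index_space;

-- ...or move an existing index:
ALTER INDEX orders_customer_idx SET TABLESPACE index_space;
```

The transaction log (pg_xlog) isn't managed through tablespaces; the usual approach to putting it on separate storage is to move the directory and symlink it back into the data directory.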
Everybody has their own preferred file system, but I usually go with XFS. That's what we settled on after doing some benchmarks against ext3: we found XFS had a better journaling system for this workload, since it journals only the metadata rather than everything else, so XFS came out much better than ext3 in our benchmarks against Postgres. XFS is also much better with bigger files, and each Postgres data file can be up to one gigabyte in size. I know everybody has their own preferences about file systems, and I know XFS doesn't have a lot of the tools that ext3 has, so it's your own decision, and you should do your own benchmarks, but I found XFS better than ext3. Something to note with XFS is that it comes with barrier support enabled; if you are using a battery-backed controller, the barriers aren't buying you anything, so mounting the file system with nobarrier can be a good option.

And then comes the famous postgresql.conf file. There are a lot of parameters you can tune in that file, and we can't cover every one of them in the time allocated, so I'll just cover a few of the very important ones.

The first one is shared_buffers. This is the buffer cache for the database: Postgres uses it to do reads and writes and stores important information in that amount of cache. It's a chunk of memory taken from shared memory, allocated as soon as you start the database server, so Postgres starts using it right from startup. A good starting point for this value is 25% of your available RAM. Don't give it 70 to 80% of your RAM the way you might with Oracle or other database servers, because with a really big buffer cache Postgres can get into buffer lock contention problems. Again, use the pg_buffercache contrib module to monitor your shared buffers and see whether you really need to raise the value or should cut it down.

The next thing is effective_cache_size. This is not memory that is allocated to the database server at startup; it's just a value used by the query planner when it is picking an optimal plan for query execution. The database server basically treats this value as the total amount of OS cache available to it, so with a bigger number it will be more willing to pick a plan that uses an index. A starting point is 75% of the available RAM; you take this number from the output of free or top, looking at the free and cached figures. Start with 75% and then monitor query performance.

The next thing, which is very important, is work_mem. This is the amount of memory used by sort operations during query execution, and you want to avoid disk sorts as much as possible. A good place to start is the EXPLAIN ANALYZE output: with Postgres 8.3 and above it shows you whenever the query is sorting on disk, so whenever you see something like an external merge sort or external sort in the query plan, you know you don't have enough work_mem and need to raise the value. The good thing about this setting is that it is session-based: you can set it for a specific session and that session can use that amount of work_mem. You don't have to change the server-wide value and restart the server to put it into effect; you can just set it in a session for a specific query. So for queries that work on large amounts of data, we set work_mem to a high value, multiple gigabytes of RAM, just for that session, and then run the query so it can use that amount of memory.

Next is maintenance_work_mem. That memory is used by DDL operations: CREATE INDEX, reindexing the database, altering a table to add a foreign key, and so on. Again, this is session-based, so when you are doing a bulk load or ETL job and at the end of it you want to create an index, you raise this value to a very high number for that session so the index build can use a lot of memory.

The next one is autovacuum, and that's a very important setting. On recent versions (8.3 and above) it is enabled by default, and it should be left enabled, because if you disable it you can end up with database bloat, which is not a good thing. What it does is watch threshold values for each table: once a specific number of inserts have happened on a table, the autovacuum process will go and analyze that table, and the same goes for vacuuming: as soon as autovacuum finds that a certain number of updates or deletes have happened on a table, it will go and vacuum it. You can work out sensible thresholds by looking at a very useful view in Postgres called pg_stat_user_tables, which gives you the number of inserts, updates, and deletes for each table, so you can see how many you're getting every minute or every hour. And with 8.3 a very nice feature came in: multiple worker processes for autovacuum. Before that, autovacuum was a single process, so only one table could be vacuumed at a time, but from 8.3 onwards you can have parallel workers vacuuming and analyzing different tables.

Next is default_statistics_target. This value sets the amount of sampling done on a table when ANALYZE runs; ANALYZE is the process of collecting statistics for a table, and those statistics are then used to produce optimized query plans. The default is 10 in 8.3, which is a bit low; it should be set to at least 100, but again it depends on the kind of queries you run against your database. If, looking at the query plans, you think a plan is not what it should be, you may need to raise the statistics target, because the Postgres server isn't gathering detailed enough statistics.

Next is checkpoint_segments. You usually set this to a higher number if you are hitting I/O bottlenecks, especially when doing large amounts of writes.

We're running a bit short on time, so I'll cover the next slides quickly. Next is benchmarking. I use Bonnie++ for benchmarking the storage under a Postgres database server, and the recent version, 1.96, is really useful; I guess they're going to release the next one, 2.0, very soon. It gives you latency figures as well, and it can do some tests that are quite database-oriented. I've used it in the past and it has really helped me find bottlenecks affecting Postgres performance. Keep doing these benchmarks every now and then to find out how the behavior is changing. For database-specific activity, you can use pgbench to measure tps, transactions per second, and see how the database behavior changes over time.

For monitoring, use the normal Unix commands like iostat, dstat, and top, keep putting their output into a database, and do trend analysis on that. And the last thing is the check_postgres.pl script, a Nagios-based plugin. It's a very useful plugin if you have a Postgres database server in production, because it can automatically check for bloat, check your checkpoints, check your free space map, and alert the admins about any kind of problem beforehand. You should use it in combination with Nagios or tools like Ganglia, so you can get proper graphs of the trends in your database server's performance.

That's about it. Any questions? We don't really have time for questions because we're running a little bit late, but perhaps you can talk to me after the session if you have any. Thank you.
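Pulling together the settings discussed in the talk, a rough starting-point postgresql.conf fragment for a hypothetical dedicated server with 8 GB of RAM might look like the following. All values here are illustrative sketches to be validated with benchmarks and monitoring, not firm recommendations for any particular workload:

```
# ~25% of RAM; too large a value can cause buffer lock contention
shared_buffers = 2GB

# ~75% of RAM; a planner hint about OS cache, not an allocation
effective_cache_size = 6GB

# per-sort memory; raise per session for big queries, e.g.
#   SET work_mem = '1GB';
work_mem = 16MB

# memory for CREATE INDEX, REINDEX, ALTER TABLE ... ADD FOREIGN KEY, vacuum;
# also raisable per session before a bulk index build
maintenance_work_mem = 256MB

# leave enabled to avoid table bloat
autovacuum = on

# the 8.3 default of 10 is low; 100 is a better starting point
default_statistics_target = 100

# raise if a write-heavy workload bottlenecks on checkpoint I/O
checkpoint_segments = 32
```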