Okay, so we are about to start. Welcome to the devroom, the PostgreSQL devroom. Our next speaker comes almost every year, all the way from a very faraway country. He not only comes to Belgium at FOSDEM time, he also comes for the Postgres conference in Belgium, which is in May. Because he loves cycling and he loves beer, which is something very typical of Belgium. That's why he comes here. He's going to talk about his expertise in Postgres anyway. And if you have a very good question — people are paying up to 10 euros for this ball, but if you ask Ilya a good question, you can get it for free after you ask your question. So please welcome Ilya Kosmodemiansky.

Thank you, Boris. It's very nice of Boris not to hand out the stress ball before the talk — it's only after the questions that I have a chance to leave. So, hi everyone. I'm Ilya. If you've been here before, you've probably seen me already. I work for Data Egret; we do PostgreSQL services. I try to do PostgreSQL services as well, but I regret to say I do more Excel nowadays. Still, we have some experience in upgrading PostgreSQL installations, and I'll share some of that with you.

Just to figure out: who already runs PostgreSQL 12? Oh, maybe someone runs 13? Cheater — that's why nobody runs 13 yet. So: 10, 9.6, 9.5, 9.4 — you should probably have upgraded quite a while ago already. Seven? Okay. Yeah. Just to figure out how many people actually need to upgrade and which version-specific things I need to concentrate on.

Anyway, why did I prepare this talk? Because upgrading your PostgreSQL is not rocket science. It's actually quite an easy job to do. But there are some small details which could completely ruin the whole process. If you take a look at the documentation on how to upgrade PostgreSQL and at pg_upgrade, it's simple, pretty much straightforward. But if you have a really heavily loaded database and you are not concentrating on some small details, you can ruin your database. And this is actually the key problem with upgrades. You can get downtime, which you don't want. You can ruin your database. You can ruin your data, which is not the same thing as ruining the database — it's even worse, because if you don't have a backup, for example, it could be a big problem. And you need to know your system, not only your database, but also how your application works, from which hosts it connects, which features of the database it uses, and so on.

So besides that: long preparation time, long planning time, maybe a night shift — and because of this, most DBAs do not like upgrades. Who here considers himself a DBA? Who would like to? Okay. Who likes to upgrade databases? Okay. Me too. Because I'm the person who normally works with Excel, so when I get a chance to look at Postgres, I'm really happy. But most DBAs are actually scared of upgrades, and they stay attached to outdated versions. And this is a sort of catch-22. You know which surgeon is the best surgeon: the one who does lots of hands-on work, lots of operations. A cardiac surgeon is supposed to do his job every day, and then he is fit enough to do the routine operations quite successfully. And that's the problem with upgrades: if you stick to an old version, then when it becomes necessary to upgrade, you do not have the required skill, because you do not do upgrades frequently enough.
And that could be a problem: you ruin things because you do them so rarely. So any upgrade becomes a major stress for you, and that's a bad thing for the database you are trying to keep safe and up and running. My suggestion: I can understand if your production environment has to stay on a certain version for more than one year. That's quite acceptable — maybe you need security audits, or you have other requirements which do not allow you to upgrade every year, as often as there is a new release of Postgres. But even if you do not upgrade your production environment, just set up a stage, put the same amount of data there, and try to upgrade. Basically, whenever you see Devrim push a new package, try to upgrade. You will get that practice into your fingers, and that will really help you do the real upgrade successfully.

Why do we need to upgrade? Obviously bug fixes, security fixes, things like this. You can only be safe if you are upgraded, because there are lots of issues that arise from time to time and need fixing. Recently — I would say over the last five years at least, maybe even the last ten — the tempo Postgres keeps is incredibly fast. People bring in new features that improve performance quite drastically. And many software developers are much more agile now than they used to be in a bank twenty years ago. They want the new JSON features, they want new features of this and that, and as a DBA you are supposed to help developers. If you're a developer yourself, or a DevOps person, you probably already know that you need to deliver new features fast. So in the Postgres world it's necessary to upgrade fast, and that's part of the deal. And upgrading on time makes it easier. If you run seven-point-something right now, I would say: call me. I cannot explain that case in one talk or even in a tutorial, because for such an upgrade, which is not on time, you will need a lot of preparation, a lot of investigation into what you are supposed to do, and so on. When you upgrade within the window — once a year, once in two years, once in three years — it's quite okay, but upgrading from a very old version is a complicated task. And well, I personally appreciate it if you run seven-something or eight-something, because as consultants we are happy to meet such customers — it's our favourite. One big corporation whose name starts with the letter O simply doubled its support prices for old versions. But as a community member, I always advise you to upgrade.

In the PostgreSQL world we have specific terms: major upgrades and minor upgrades. What is that all about? Previously we had a rather confusing version numbering: it was like 9.4.3, so we basically had two numbers for the major release and one for the minor release. It was very confusing, and that's why after 9.6 we moved to the modern numbering scheme, like 10.x, where one number is the major version and one is the minor. Basically, once a year there is a major release which includes new features. And because new features always bring not only fun — well, it's still fun — but also lots of bugs, since people rework the backend, we need to fix them, so during the year there are lots of minor releases and we need to perform minor upgrades. The definition: major upgrades bring new features; minor upgrades bring no new features, they just fix existing stuff.
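For example, a quick, hedged way to see which major and minor version you are actually on (a default local connection is assumed here, this is not from the slides):

    # Ask the running server which version it is on:
    psql -X -c "SHOW server_version;"       # e.g. 12.2   -> major 12, minor 2
    psql -X -c "SHOW server_version_num;"   # e.g. 120002; old scheme: 9.6.17 -> 90617
    psql --version                          # the client binaries, for comparison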
So it's really important to distinguish between them, and because of the nature of major upgrades, the methods of upgrading the database are different. There is a sort of rule of thumb. Right now we are probably supposed to upgrade to version 12. Why? Version 12 came out in the autumn as 12.0, and I generally do not recommend installing in mission-critical production any PostgreSQL release that has had no minor releases yet, because those new features bring in lots of bugs. From around version 8 onwards I have some — not very accurate — statistics, and I would say it's about half a year after the release. So you can start by upgrading some staging first, then a non-production environment; after that you can start upgrading your production environment, by which time it's something like 12.2 or 12.3 — usually about half a year from the initial release. That's a sort of catch-22 as well, because we need to fix the bugs, and if nobody upgrades, that is difficult. But there are people who upgrade immediately, because developers want new features. So the rule of thumb still works.

Before any upgrade — this is a very important rule — read the version-specific release notes carefully, because sometimes there are specific issues with the upgrade procedure, and they are always in the release notes. Take a look at them very carefully. If you want to be a successful PostgreSQL DBA, you should learn to read even the release notes on your smartphone when you update an application — it becomes an automatic sanity check. Play with the upgrade method you like several times before you perform the actual upgrade, because something can go wrong and there can be lots of problems. Talk to your development team: whether they want this major version upgrade, whether they have other concerns, whether they use something outdated that is mentioned in the release notes. A good idea is actually to sit down with the architect who is responsible for the application level and read the release notes carefully together, just to align. Of course nobody does this, but it's a good idea anyway. And of course make a backup and check it with a test recovery. The test recovery is the most important part of this bullet, because the only guarantee that you have a proper backup is that you checked that it actually recovers. Checksums and things like that do not give any guarantee. So test that you will not lose your data, because an upgrade can be a risky process.

Minor upgrades are theoretically easy. You simply install the new binaries and start them on the same data directory. There are no new features, the binary format is the same — completely easy. Still, there are some issues. The first one is that your data is supposed to be online, so you probably want to pause PgBouncer or something like that, and then you stop your database. It can take a long time to stop, because you have lots of dirty pages which need to be checkpointed to disk, and if you have not practiced this upgrade method enough, you will run into problems. For example: you are trying to stop, you have a technological window of several seconds to restart your database, it takes longer, you get nervous, somebody calls you and says "our database is offline", and then you try to stop the database with -m immediate and you ruin your database — well, it's bad practice, so don't do it like that.
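To make that concrete, a minimal sketch of such a minor-upgrade restart — the package names, service name and PgBouncer port here are assumptions for a RHEL-style install, not from the talk:

    # Flush dirty pages in advance so the actual shutdown is short.
    psql -X -c "CHECKPOINT;"

    # Optionally pause PgBouncer so application connections just queue up.
    psql -h 127.0.0.1 -p 6432 -U pgbouncer -d pgbouncer -c "PAUSE;"

    sudo systemctl stop postgresql-12        # clean shutdown; never -m immediate
    sudo yum update -y postgresql12 postgresql12-server postgresql12-contrib
    sudo systemctl start postgresql-12

    psql -h 127.0.0.1 -p 6432 -U pgbouncer -d pgbouncer -c "RESUME;"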
Issue CHECKPOINT several times before you start to stop the database; then it goes smoothly. So just prepare, play with all these combinations, so that you don't ruin what is actually a very easy procedure compared to the other one. Another thing: keep an eye on upgrading all the related packages. Of course do a backup and check recovery — that's the default point everywhere. Check all the packages which could contain anything related to Postgres. If you do not upgrade the Postgres client, that's probably okay — you will get a warning, your software developers will get a warning, and you upgrade it later. But if you use contrib modules, if you use debug symbols, things like this, you should make sure you upgraded everything, especially things like PgBouncer, application-level libraries, drivers and so on. Good practice is to have everything on the same version, because otherwise there can be issues.

Major upgrade methods are much more complicated, because the internal structures can be quite different between versions. For that purpose we have several methods, and we can choose which one is the least dangerous: good old dump/restore, pg_upgrade, and some replication-based methods. That's the set we have.

pg_dump is a good way to upgrade your PostgreSQL instance if you can afford it. If you have a large database, or the database is heavily loaded, it can be really tricky to upgrade with pg_dump; but if you can afford it — say you have a couple of hundred gigabytes of data and it's not really heavily loaded — it can be good. You are on the safe side, because it's a simple method, you can do it quickly, and you will be happy. It's actually a frequent question how to upgrade a database running inside Docker. Well, most of what I'm talking about here is for a normal server, not a container, because usually if you run a database in a container, it's a small database for some microservice, for some team that showed up and is doing some research or something like that. It's a small database inside the container, you have the data on an outside disk, it's not big, it's not heavily loaded — because if it were, you would run into problems long before you start thinking about upgrades. So pg_dump is a good way to do the upgrade with Docker. There is a project for running pg_upgrade inside Docker — I think it's called docker-pg-upgrade — which basically runs pg_upgrade inside the container. I tried to play with it. I don't claim I've seen it in production, but when I played with it on my computer I ran into problems several times, even doing it the way it was advised, so I'd try to stick to pg_dump.

The problem is that if you have requirements on downtime, it can be hard to meet them this way. Another problem is disk space, which can be quite expensive: for an upgrade using dump/restore you most likely need extra disk space, or you have to work with some tricky procedure which can ruin your data. The good thing is that it works with basically any PostgreSQL version, and theoretically you can upgrade even from very old versions, because it has nothing to do with the binary format — it only deals with your schema, with the logical structure, not with the data files, so nothing changed there. But, for example, if you have a really old database on an old version and you use stored procedures intensively, things have changed over that time and you need to check the compatibility of your server-side logic, and that can be tricky.
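As a rough, hedged sketch of that dump/restore path — paths, ports and the database name are placeholders, with the old cluster on port 5432 and the new one on 5433, using the new version's binaries:

    # Roles and tablespaces are not included in pg_dump; move the globals first.
    /usr/pgsql-12/bin/pg_dumpall -p 5432 --globals-only -f /backup/globals.sql
    /usr/pgsql-12/bin/psql       -p 5433 -d postgres -f /backup/globals.sql

    # Dump the database with the NEW pg_dump, custom format (compressed).
    /usr/pgsql-12/bin/pg_dump    -p 5432 -Fc -f /backup/mydb.dump mydb

    # Restore into the new cluster, using a few parallel jobs.
    /usr/pgsql-12/bin/createdb   -p 5433 mydb
    /usr/pgsql-12/bin/pg_restore -p 5433 -d mydb -j 4 /backup/mydb.dump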
pg_dump actually has the custom format option; you can compress it, and it's the proper way to do a pg_dump — it can save you some disk space. pg_dump can also run several jobs, so it can do things in parallel, and in some cases you just need to experiment with this; it can be good for you because it's easier and faster. But the problem is, if you do tricky things like piping pg_dump from one port straight into a restore on another port to save some time — I do not recommend doing this; don't do it at home, but you can try — you cannot do that with multiple jobs. So figure out what you actually need and try it with jobs or without. And if your installation can be upgraded like this, that's good — congratulations, it's not tricky anymore, and you will probably be happy about it.

A small notice, which will actually be quite useful for the following slides as well: don't forget about locales. Especially if you are not only using English — if you are using French, German, Russian, Dutch, anything that requires a specific locale — you can actually run into problems if you install your new database with a different locale, or just forget to set it. With the default locale everything normally goes okay, so you dump and restore, and at some point you will figure out that sorts do not work properly, things like this, and you can spend lots of time on those problems. So put it on your checklist: double-check the locale; it is supposed to be exactly the same.

A major upgrade using pg_dump: the procedure is quite easy. You install the new binaries, initialize the new cluster — don't forget about the locale — and change the config files appropriately. That can be a problem, just like with failover and switchover: don't forget that you need pg_hba.conf the same as on your old database, because otherwise you let the application loose on the new database and it doesn't work — that's the common problem, so keep an eye on that and synchronize those things. It can be a very good idea to use the newer version's pg_dump, though if you are upgrading from a really old version there can be issues with really old things — again, read the release notes, use the newer one, and if you are working with something very old just test it ten times. Then restore your dump and check that everything is okay; test it, see that the application works and nothing looks strange; then switch your application to the new cluster. And — because we have run into such problems many times — do not delete your old cluster immediately, even if disk space is tight; keep it for some time, because something can still go wrong.

pg_upgrade: for big upgrades, I would say that's the main method we use. It's tricky, but it's actually the most useful one. The procedure is simple as well, but to figure out how to do it we need to understand how it works, and based on that we choose the way we upgrade. It will be slightly different for a standalone server, for a primary/hot-standby configuration, and for some tricky situations when you have a really, really small technological window to perform the upgrade, when you cannot allow downtime — and of course, details, details. pg_upgrade is a very smart idea: you have the old and the new database cluster, and there is a mechanism which takes the old binary files, detaches them from the old cluster somehow, and moves over all the information Postgres needs for normal work — like the transaction ID counter, like the pg_catalog contents (statistics are a tricky thing) — to the new cluster; and then you take the user data and add it to the new database.
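On that locale point from a moment ago — a small, hedged way to double-check it before and after creating the new cluster (ports, paths and the de_DE.UTF-8 locale are just placeholders):

    # What the existing databases on the old cluster use:
    psql -X -p 5432 -c "SELECT datname, datcollate, datctype FROM pg_database;"

    # Initialize the new cluster with exactly the same locale and encoding:
    /usr/pgsql-12/bin/initdb -D /var/lib/pgsql/12/data --locale=de_DE.UTF-8 --encoding=UTF8

    # Verify on the new side before restoring anything:
    psql -X -p 5433 -c "SELECT datname, datcollate, datctype FROM pg_database;"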
And it works like this. Theoretically the basic idea is very simple — well, Bruce Momjian wrote this tool many years ago, and since that time it has been around and it works. But what are the tricks? The idea is: first copy pg_clog (pg_xact in newer versions) from the old cluster to the new cluster, which allows us to start the database from the same point, from the same transaction; then we restore the schema and everything we need to operate the database normally. Once we have done that, if the data files with the user data are the same, everything is supposed to be the same, because we have the same schema, the same transaction visibility, and things like this. So we basically copy them, link them or clone them to the new cluster, and it looks like the old one. It sounds slightly tricky, and it is.

The procedure is like this. You create an empty cluster for the new version of PostgreSQL — don't forget about the locale and things like this. You stop the database with the old PostgreSQL version; of course you need to prevent the application from connecting, because that can ruin things, so you have to afford some sort of downtime. If you are using PgBouncer, you can put PgBouncer on pause, and the connections from the application will wait — so theoretically it minimizes downtime. Then you start the upgrade procedure with the pg_upgrade command; it's just a command-line application, and its usage is quite obvious in this case. After it successfully finishes, you start the database on the new version. Then you start to collect statistics, because after this procedure the statistics are gone: PostgreSQL does not bring the old statistics over to the new cluster, pg_upgrade doesn't do this, and you need to collect fresh statistics — and if it is a heavily loaded database, that can be an issue. Once the statistics collection has started, at some point you can open your database for new connections. If you do not have a high transaction rate, you can open it immediately and collect statistics; but if you have lots of transactions, most likely you start collecting statistics, wait until it says 10% of your statistics are collected, then open the database while it continues to collect statistics — already some plans will be quite okay. This process is tricky, so while you do it, keep an eye on the logs, and based on that maybe let the database collect statistics slightly longer. After several upgrades you will figure out that you need, for example, a slightly longer technological window — it's the kind of thing that comes with experience.

Minimizing downtime: PgBouncer has a very good feature, PAUSE/RESUME. Again, your intention is to pause PgBouncer and then, while it is paused, quickly restart the database — and it's a tricky thing, many things can go wrong. Issue a checkpoint to let your server actually stop quicker, and then you can use -k (--link) to do the pg_upgrade very fast. pg_upgrade has two major modes. It can copy files, and that is slow, because like pg_dump you need extra time to copy things from one location to another, and it has the problem that you again need double the disk space. That's why for a heavily loaded database we use the hard-link mode, and that makes pg_upgrade extremely dangerous, because if something goes wrong with the linking you can lose your data. So always think twice: if your database is not that big and you can afford, say, a minute more of downtime, most likely you can do it with copying — if you can afford this, do this. Use --link only if you have very, very strong requirements on downtime.
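As a hedged illustration of what such a pg_upgrade invocation can look like — the binary and data directory paths are assumptions; always run the --check dry run first:

    # Dry run: checks cluster compatibility without touching any data.
    /usr/pgsql-12/bin/pg_upgrade --check \
        -b /usr/pgsql-9.6/bin -B /usr/pgsql-12/bin \
        -d /var/lib/pgsql/9.6/data -D /var/lib/pgsql/12/data

    # The real run, here in hard-link mode (-k/--link): fast and needs no extra
    # disk space, but the old cluster must never be started again afterwards.
    /usr/pgsql-12/bin/pg_upgrade -k \
        -b /usr/pgsql-9.6/bin -B /usr/pgsql-12/bin \
        -d /var/lib/pgsql/9.6/data -D /var/lib/pgsql/12/data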
Because it is much better to have a downtime of several minutes than a couple of hours restoring from backup and things like that — but everything depends on planning.

A common question is what to upgrade first: the hot standby, the primary, or whatever. My suggestion is to first upgrade the primary as if it were a standalone server. Just upgrade it and leave your replicas alone. If something goes wrong, for example with the hard-linking, you can just promote a replica and repeat the process from the beginning. That way you are on the safe side, because switching to a replica is a smooth process — you most likely do that frequently anyway. Don't try to upgrade everything simultaneously; that almost never ends well. Then, when you see that everything is upgraded, it works, and it looks like your database and not like some Frankenstein-style thing, just re-instantiate your replicas one by one, and you will have your replicas, your whole cluster, upgraded. So: you pause PgBouncer on the standby and clone the replica from the upgraded primary with pg_basebackup. The documentation probably still says to use rsync; I never advise using rsync, because even experienced people make mistakes when they write backup scripts manually. pg_basebackup is a good tool — use it, or use pgBackRest, whatever you prefer. Then resume PgBouncer, or start PgBouncer, and allow your application back in.

So that's roughly it; now come the details which can ruin everything. There are no optimizer statistics, and the script for that is not really part of pg_upgrade — pg_upgrade generates a script which you need to run manually; double-check the documentation about this, you need to be prepared for such things. Basically it does vacuumdb --all --analyze-in-stages (there is a small sketch of this below). Analyze-in-stages is what I told you about: it starts analyzing in parts, it doesn't try to do the whole analyze for the entire database at once, which can actually be quite difficult if you have a heavy workload. Since 9.5 you can also run vacuumdb in parallel, which can be extremely good if you have a large database. So don't hesitate to open this script, read what is in it, and maybe edit something according to your requirements — you can use different things. We actually just run vacuumdb --all manually, not using that script but a sort of our own. And again keep an eye on the logs — you may need to stop the script, which is relatively safe because it's just statistics — and block users from connecting, or maybe block only the users that are issuing heavy updates and things like that, while the statistics are collected. So be creative, experiment on stage, not on production, and you will be on the safe side. In this regard the documentation suggests rsync; I myself would say use pg_basebackup.

There are lots of possibilities to shoot yourself in the foot, and maybe not only the foot. Some distributions like Debian and Ubuntu have wrappers which allow you to do things semi-automatically. Never, ever do that for a mission-critical database. I like Debian and Ubuntu, they are nice people with a beautiful product and a very good attitude to building distributions, but the problem is that all those wrappers increase the number of bugs you can hit, and pg_upgrade is complex enough. So don't do it; try to do it as manually as you can — like with backups, it should be as simple as possible, so do it manually.
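A minimal sketch of that statistics step, roughly what the generated script does (the job count is just an example, not from the talk):

    # Rough statistics quickly, refined in later passes:
    vacuumdb --all --analyze-in-stages

    # Or, since 9.5, a plain analyze run with several parallel jobs:
    vacuumdb --all --analyze-only --jobs=4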
Extensions are an important topic. Extensions are a beautiful mechanism, but still slightly a work in progress, so some aspects are not quite good. One of the problems is that the extension author is supposed to take care of the upgrade scripts, and you can never know whether that job was done properly and on time. When you decide to upgrade to a new version: is this extension up to date or not? Most likely, if it is an extension that ships together with Postgres, it is, because people do check those things; but for third-party extensions that's sometimes not true. pg_upgrade keeps the old versions of extensions, which can be a problem because they may not work with the new database. So the default prescription is to run a loop over all extensions and perform ALTER EXTENSION name UPDATE. At the point when you experiment on your test database, do that, take a look at what is going on there, whether you get warnings and things like this, and if something doesn't work, go and deal with it. Some extensions are known to require special care — for example, PostGIS should actually be updated before the upgrade. So after that it can get tricky; but once again, don't take my word for it, just take a photo of this slide and work through it. For every upgrade you need to go through the release notes, keep in mind which extensions you have, and manually scrutinize what you are supposed to do in order to upgrade them.

There are lots of improvements in pg_upgrade — I would say pg_upgrades are getting better and faster now. For example, cloning is supported with newer Linux kernels; it's safer than the old hard-linking thing, and that's good, but be careful: you need a proper file system. Some people try to perform pg_upgrade with cloning or hard-linking while having the database and the write-ahead log, for example, on different file systems — it never ends well. So check twice — on every slide I can repeat this — and see if it works.

Using a replication method to upgrade PostgreSQL: I would say it's not the thing I would advise, but you can try it. The point is that, in spite of all the problems, unlike pg_upgrade this approach promises you zero downtime; but you need to do lots of things manually, so it's very labour-intensive, and you can miss something, forget to move some table. Still, sometimes it's useful. The idea is: set up a new database cluster, set up whichever kind of replication you prefer, and then just perform a failover — because those types of replication mostly work between different versions of Postgres, while normal binary replication doesn't. Logical replication is a relatively new, promising method, and from 9.4 you can upgrade to a newer version using it. The problem is that you need to check the documentation carefully for each release, as always, and then figure out how to actually migrate the entire database manually. Logical replication was designed for partial replication, so if you need to migrate the entire thing, it's a lot of work for you — just double-check. There are some tools which allow you to automate the process and check that you replicate everything you need. Don't forget about sequences and things like that, and keep in mind that on your upgraded database you also need users, roles and permissions. So this way of replicating and upgrading data is more difficult; use it only if you really need zero downtime.
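As a hedged sketch of what such a logical-replication upgrade can look like with the built-in publication/subscription mechanism — this needs version 10 or later on the publisher (for 9.4–9.6 the pglogical extension is the usual route); the ports, database and user names are placeholders:

    # The old cluster must have wal_level = logical.
    # Carry the schema over first: logical replication does not replicate DDL.
    /usr/pgsql-12/bin/createdb -p 5433 mydb
    /usr/pgsql-12/bin/pg_dump  -p 5432 --schema-only -f /tmp/schema.sql mydb
    /usr/pgsql-12/bin/psql     -p 5433 -d mydb -f /tmp/schema.sql

    # On the old cluster: publish all tables.
    psql -p 5432 -d mydb -c "CREATE PUBLICATION upgrade_pub FOR ALL TABLES;"

    # On the new cluster: subscribe; this also performs the initial data copy.
    psql -p 5433 -d mydb -c "CREATE SUBSCRIPTION upgrade_sub CONNECTION 'host=localhost port=5432 dbname=mydb user=replicator' PUBLICATION upgrade_pub;"

    # Sequences, roles and any later DDL are NOT replicated; carry them over
    # manually before the switchover.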
Slony-I is an old, really old method, and it's still compatible with Postgres, but I would not bet it stays compatible for a long time. You can migrate from really old versions of PostgreSQL, and the failover can be done very fast. It's trigger-based, and even under heavy load it works. It's a nightmare to operate it for a long time, but if you only need it for an upgrade, it can be a nice thing. It's complicated, though, and if you're not a very experienced person in terms of Slony, measure twice: read the documentation, everything as usual. You need extra disk space, and because it's trigger-based it can be really CPU-intensive on both the old server and the new server. Londiste — I never actually recommend using it. Some people prefer it to Slony: Slony is a complicated nightmare, Londiste is the same kind of thing, but not that old and not that bulletproof — so what can possibly go wrong? The point is it's not under heavy development the way Slony used to be before binary replication and before logical replication, so it's really buggy, and it was designed for different things, like transferring data partially. So I do not recommend using it.

Conclusion: this is a sort of matrix of what we have — the methods against downtime, extra disk space, complexity and risk. pg_upgrade with --link: low downtime, low disk space, complexity is high, risk is very high. Still, if you want to be on the safe side, use this one or this one; if you have some special requirements, go for, I would say, logical replication — or Slony if that doesn't work for you for some reason, but most likely logical replication, because it's a new method under heavy development, and nobody can guarantee that Slony will still be that good in three or five years. So that's the conclusion. There is a good reading list on some of those methods — maybe not a very fresh one; for example this talk by Bruce is quite old, but he wrote pg_upgrade and there are lots of good explanations of how it works — and there are some blog posts, so enjoy, keep an eye on that. The slides will be online, the video will be online, so you can come back to this. And if you have questions, you can email them to me later or just ask them right now. Thank you.

So the question is how to do smooth migrations when you upgrade your application. There are several tools for doing this, but I would not say there is a silver bullet. Some ORM frameworks, like Ruby Active Record, suggest doing those migrations from the application side. I would say many people who try to use that end up creating files on disk with ALTER commands and basically preparing them manually from whatever internal logic they used to create the application. So I would not say the automatic tools work well in this regard. If you keep track of everything in your database — for example, to perform an upgrade using logical replication — that's basically some script which goes through the whole database, keeps track of all the tables, and prepares the commands you need to set up logical replication. If you are interested particularly in that part, I would actually suggest asking Boris about it, because he is much more experienced in logical replication, working in a company which builds it — so yes, I refer you to him.

You have a question here — why don't I recommend pg_restore in a pipe, pg_dump piped into pg_restore? Well, you can do this, but it's not a trivial thing. You can do it, but something can go wrong if you combine different things like that; if you know what you're doing, welcome, why not. We actually used to pipe pg_dump into psql
before pg_restore was even written, so it's a really old practice. Yes, it makes sense. Good question — it's a good question. So the question was how to estimate downtime, how long it takes to upgrade from one server to another. I would say one of the best estimates is previous upgrades: if you know your database, you can figure it out. And if even that doesn't work, there is a sort of rule of thumb: if you have a certain amount of data, and you are sure that you can actually disconnect the heavy updates from your database, you can probably guess something — if you know your database and how long it takes, on those disks, to copy things from one location to another. It's good if you have a test server with the same configuration, or you can detach one of your replicas, which is supposed to have the same configuration, and just try copying the files to see how long it takes. It's not a proper estimate, but at least you will figure out that, for example, on such disks just copying the data files would take too long and you need to find another method. It's an experience thing — that's why I suggest doing upgrades regularly: take your production data if possible, try it without any heavy load, upgrade, and figure out how long it would take in production. And it's a good thing with upgrades that most likely you need to close the database to heavy changes anyway, so it's much easier than, for example, estimating performance on a detached replica or standby. So that's good news in this regard.

[Question:] ...install packages and then restart Postgres, or update packages and then start Postgres? Good question — I'll repeat the question, but don't give him a ball. So, what's the order, what's the sequence: for a minor upgrade, do we install the packages first, then stop Postgres and then start Postgres, or do we first stop Postgres and then install the packages? It actually depends on the distribution, but I would suggest, if you can afford the downtime, be on the safe side: stop Postgres, install the new binaries, then do the rest. In the real world, every time we install new binaries we first check that, for example, Debian does not start it and does not spoil things for us, and only after that do we actually stop Postgres, move to the new binaries, and so on. [From the audience: only the first one will start, the main one will just restart, the other one will just stay there — but we will fix this problem in version 13.] Yeah, well — so there was input from Devrim that RPMs also actually restart your database when you try to do it the way I described. But my point is that you should do those things manually, because RPMs, as well as other package managers, give you a lot of automation to make things simpler for the DBA, and that automation is unpredictable. So disable those things and try to do it manually; when Devrim fixes them, you will enjoy it, and I will put on the slide that Devrim fixed things, go on and use the automatic way.

So the question is: if you are using not pg_basebackup but pgBackRest, does everything I said still apply, or do you need to make a pg_basebackup? Well, what do you mean by an immediate backup, actually? Theoretically the answer is yes, because you need your backup, and you cannot restore the old one into the new version, because it's binary-incompatible. But I would not suggest making a backup immediately if you have a heavily loaded database which is right now collecting statistics — there is a certain vulnerability period — because making a backup at the same time
might be too heavy for your database. So wait several minutes and then start. But basically you need to understand that once you have upgraded to the new binaries, you do not have a backup anymore, because all your backups are backups of the previous version — you would need to install the previous version and then upgrade again, things like that. So as soon as you can, do the new binary backup, or the backup with pgBackRest. You can catch me outside. Thank you.