Okay, so now we have Nick, who's going to talk to us about PostgreSQL's archiving, backup, and replication tools. Hello, Martin Kestel here. Well, yeah, we do have a little bit of redundancy, but redundancy is good, especially when you're talking about backups, so it's okay. Who was in Gülçin's talk earlier, like two hours ago? Half of you, okay. So, well, we will have some redundancy, I'm afraid, but still. Anyway, my name is Nick. I'm currently working as a full-time DBA at Elster, which is the German system for the Elektronische Steuererklärung — the electronic tax return. So, if you're German and you did your tax declaration earlier this year, then it's in my database. Fully encrypted, no worries. So, I did think about doing some kind of decision tree. The documentation about backup, archiving, restore, and replication covers about 50 pages of A4, which is a lot, but considering that the whole documentation is more than 3,000 pages, it's just enough, right? At the end of the day I said no, let's just go on, mention the other stuff, but tell them what they probably want to do. So, for 90% of you, or even more, what we will have at the end of this talk will do, one way or the other. There are still some misconceptions, so the idea for this talk arose from a discussion we had on the mailing list, because I ran into an issue and I didn't find anything in the docs, and, well, it ended up in, like: yeah, you can't write everything down, right? Some things people just have to know. Anyway, let's see. Some assertions. You're here because you value your data, otherwise you would not be in the Postgres room but in some other one. You have decent hardware and equipment and data centers, so you will have at least two servers of the same size. You will have ECC memory, you will have a battery-backed hard disk controller, or you're using a SAN. You will have an uninterruptible power supply, and so on — enterprise hardware, enterprise.
These are separated as far as feasible. So, when you only have a room with one square meter, they're probably in the same rack. If you have a larger room, they are not in the same rack. When you have more than one room, they are not in the same room. When you have more than one building, they are not in the same building. I could go on — with continents; yeah, we'll get to that. You have at least an idea what your recovery time objective and your recovery point objective are, or what your requirements are. You don't have to have them fixed, but you should have an idea. Everybody knows what that means? RPO, RTO? Okay. So, your recovery time objective is the time that you need to recover from a fatal crash or server loss. Something like that. Building loss. Country loss. The recovery point objective is the point to which you have to be able to recover. So, like, you are allowed to lose the last six hours of users putting data into your database. Or maybe it's two minutes, or maybe it's one transaction, and so on and so on. Does that make sense? Okay, cool. So, there are some natural laws that are always valid: gravity, speed of light, Murphy. You all know what Murphy's law is. Okay. Disaster does strike. Unlike lightning, disaster does strike twice in the same place. You see that red point up here? It's the first and last time I mention the GitLab guys, but when I read that I thought: how do you fuck that up? Right. Okay. So, then there's Einstein. I'm not going to read it because, yeah, you probably know all that. There are two stars. Okay. Evolution. Gülçin covered that extensively. What I do think is that amending the docs whenever you add a new feature is maybe not the right way. Maybe not. You still should read it all if you have the time, but it's almost 50 pages — 49 when I counted yesterday. So, that's a lot of stuff to read just to do some backups, right? You should still read it. Certainly should. So, with 7.0, this is what you had.
You had a chapter, Backup and Restore, and you had a chapter, Database Recovery. Backup and Restore was basically logical backup; we'll come to that in a second. Database Recovery was this. Totally love it. Just imagine Oracle would put something like that in the docs. Logical backup — what does it mean? Basically, you use pg_dump or pg_dumpall to get a textual representation of your data. So it's really CREATE DATABASE, CREATE TABLE, INSERT — actually not INSERT, but whatever. You get a snapshot of your data from the second that pg_dump started, because it opens a transaction. MVCC will take care that you get a consistent view of your data, and you can restore exactly that state, by the way, but we'll mention that later as well. You can dump the whole cluster with pg_dumpall, you can dump single databases, you can dump single tables, et cetera pp. You can have a text file — really readable SQL, INSERT INTO, et cetera pp — or you can get some proprietary formats, which file(1) recognizes as a PostgreSQL custom dump, whatever. You know what I mean. So, plain text: it's plain SQL. It uses COPY unless you tell it to use INSERT because you want to port to MySQL. It can be read by humans, which is a big advantage. Then there's the custom format, which is basically compressed text. I think. I don't know. Yeah. You cannot pipe that into psql as you can with the text format, so you have to use pg_restore and pipe that into psql, because it will, again, give you the text. It can restore single tables, and it is compressed. Then you have the custom — no, the directory format, which has an advantage because it can run in parallel. How's that light working? There. With -j. It will create a directory with a different file, or different files, for every table that you have. So you can also restore them in parallel, which makes it faster. Anyway, jumping forward, because, well, yeah — you would say pg_dumpall.
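A quick sketch of those dump variants as commands. The database name "mydb" and the file names are made up, and the guard in the second line turns the whole thing into a no-op wherever no such database is reachable:

```shell
#!/bin/sh
# Hypothetical database name; the guard below skips everything
# where no server with such a database is reachable.
DB=mydb
psql --dbname="$DB" --command='SELECT 1' >/dev/null 2>&1 || exit 0

# plain format: readable SQL, restored by piping into psql
pg_dump --format=plain --file="$DB.sql" "$DB"

# custom format: compressed single file, restored with pg_restore
pg_dump --format=custom --file="$DB.dump" "$DB"

# directory format: one file per table, dump and restore can run in parallel
rm -rf "$DB.dir"
pg_dump --format=directory --jobs=4 --file="$DB.dir" "$DB"
# parallel restore would be (destructive, hence commented out):
#   pg_restore --jobs=4 --clean --if-exists --dbname="$DB" "$DB.dir"

# roles and tablespaces are cluster-wide, not inside any database:
pg_dumpall --globals-only > globals.sql
```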
Never forget pg_dumpall if you do a textual dump with pg_dump, because pg_dump only reads from databases. Your schemas — sorry, your roles, your tablespaces — are cluster-wide objects, not in a database. So you have to dump them separately. And you should probably do that anyway; we'll come to that. RTO and RPO. The recovery time objective is basically: yeah, you can restore, and it will take between minutes and days, depending on the size. That's simple. RPO: you can recover to the point in time when you started your backup. So, if you do your backup on Sunday night at 8, and Monday morning somebody says DELETE FROM customers, then you can restore pretty well back to Sunday night. If that happens on Friday night, you're not happy. Mind you: disaster strikes. Pros and cons. It's readable by humans. You can, with pg_dumpall, dump your schemas and roles on a regular basis to put them into Subversion or Git or whatever, so that you have it documented, right? A pg_dump file can be read by newer versions of Postgres. So if you have a pg_dump backup of your Postgres 8.4, you can easily read that into a 9.6 or 10 and go on. You can back up and restore single entities like tables, schemas, databases, and restore them separately. Another thing is — and that's why we mention this at all — doing a pg_dumpall will read all of your database. So if you have checksums on your PGDATA and you have, yeah, that table that you look into once a year, and you have a corruption in that, it will be revealed by pg_dumpall or pg_dump. So it's not that bad after all. On the other hand, you can only back up, and thus restore, a single point in time. And it's rather slow. So, RTO, RPO: not what you want. It's like commuting with this: it will get you there, but when it starts raining it might be uncomfortable, and you will not be fast and comfy and so on. It's like writing your text with this, or doing your calculations with this. It still works when you have no power.
You will need light for this one, but it will still work, right? After that we got the WAL, back in 7.1. I thought it was an 8, but no: 7.1. 8.0 added the ability to use the WAL for backups; we'll see that. We got PITR — which is not "pain in the rear": point-in-time recovery. We'll come to that as well. 9.1 added pg_basebackup, which is a very comfy method to use that stuff, and we'll have a look at that later. Online binary backup. Yeah, I would like to just say "do it this way", but you have to see something first. Postgres is using the WAL extensively for stuff — you can still go back and watch the stream — but the WAL is basically a logbook. A redo log, as other databases would call it. Every change, everything that goes to disk, first goes to the WAL — except for the log files. At a checkpoint — and a checkpoint can basically be spread over hours, so it's not really a point — the content of the WAL is written to the actual heap, so your actual data files, right? And that creates new row versions. Simplified. This is the WAL, split into 16-megabyte segments. At a checkpoint it goes to the heap. It goes to the shared buffers straight away, so that you can actually use the data from other sessions. SELECTs go to the shared buffers; INSERT, UPDATE, DELETE go to the WAL. And the people who actually know the code will want to kill me after this, but still, it's simplified, right? So you get an idea. 16-megabyte segments. It's actually not a chain of segments; it's more like a ring that can also grow and shrink again. So whenever a checkpoint happens — actually, when two checkpoints have happened — the database will start recycling WAL segments. So they get overwritten — actually, they only get renamed, and then the checkpointer will write to them. Checkpointer. So that's in pg_xlog, as I learned on Friday: pg_xlog got renamed to pg_wal, because "oh, file system is full, let's delete some log files". I'm not kidding.
So in version 10 it will be called pg_wal, but I think the tools will not be renamed, so you will still have pg_receivexlog. Anyway, the size is determined by min_wal_size and max_wal_size. By default that's one gigabyte or two gigabytes, but those are soft limits. We'll come to that. The archiver. The WAL writer writes WAL segments, and they are then read by the checkpointer process. And in between, if you have it, the archiver will run and do something with them — we'll come to that as well. pg_start_backup. The simplest way to get a binary backup — just to have it mentioned, right — is: you call SELECT pg_start_backup(...), you take a snapshot, and then you release your backup again with pg_stop_backup(). Since you don't want to do that manually, we'll go on. That snapshot goes to some area down here, which eventually goes to tape, hopefully. Or to something really durable, right? I was googling for tape. Do I need anything more? Like, yeah, okay, cool, I've got a copy of my database now, I can use that, right? Nope. You need all the WAL segments that have been created since you started your backup. Why? Among other things, reading from your database can create write traffic. I'm not going to explain that here — anyway, it does. So, hopefully, your WAL segments are still there. If you have a busy database doing a lot of I/O, they can have been recycled in between. Disaster strikes. If that ever happened to you: rinse, repeat. You don't want that. So what you want is something that will put the WAL segments next to your one-to-one copy, and that would be the archive_command. Then you can take this copy — your file-system copy of your database — to some new server, your restore target, and you will put a restore_command in a file called recovery.conf, telling it to fetch the WAL segments, and start that database. It will see: oh, I'm not complete, I need some WAL segments.
It will fetch them from there, and then you're ready to go. And you're ready to go rather fast if your database is small; rather slowly if your database is large — you still need to copy the data, et cetera, and it also depends on the amount of WAL files, in other words, on the activity while you did your backup. RPO: similar to a pg_dump, but this time it's the end of your backup — the moment that you call pg_stop_backup(). Or, well, maybe it's the moment when you started the backup before. That should have been said for pg_dump as well, right, that last one. It's a one-to-one copy of your database. Rather fail-safe, rather fast. The RPO is fine, actually. Never mind — so that's the RTO. You can still, in this case, only back up, and thus also restore, a single point in time. And you can back up and restore only a whole database cluster — so, an instance of Postgres. Yeah, the recovery point objective — this time it should say RPO — yeah, still not great. Better than the abacus, but it's a little bit old school, right? And maybe a little bit too much effort: these are not really faster than a capable guy with an abacus. So, options to get that snapshot. You can, for example, do a file-system snapshot, an LVM snapshot — I know a company who does it like that. Oh, I've heard of one. Stephen is not here, right? Stephen Frost. Or you can use rsync, or you can use pg_basebackup, which is what you want to use. Right? Everything before is still in the documentation, and you should know that you can use it, but you shouldn't. Options to get the WAL segments: you basically have two choices. The archive_command — well, you should have one anyway, but we'll get there — or you can tell pg_basebackup to just fetch the WAL segments that appear during its runtime. That would be with --xlog-method; you probably want to use stream. Okay. Use both. Or actually, get used to both. You will have a WAL archive. We'll come to that.
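The pg_basebackup route just mentioned might look like this as a sketch. Host, user, and target directory are placeholders, and the guard makes it a no-op where no such server is reachable:

```shell
#!/bin/sh
# Placeholder connection details.
MASTER=db1
REPL_USER=replicator
TARGET=/tmp/base_snapshot

# Skip where the master is not reachable (e.g. on this laptop).
psql "host=$MASTER user=$REPL_USER dbname=postgres" -c 'SELECT 1' >/dev/null 2>&1 || exit 0

# --xlog-method=stream opens a second connection and streams the WAL
# created while the copy runs, so no segments are missing afterwards.
pg_basebackup \
    --host="$MASTER" --username="$REPL_USER" \
    --pgdata="$TARGET" \
    --xlog-method=stream \
    --checkpoint=fast \
    --progress
```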
pg_basebackup, on the other hand, is also handy to set up a new replication slave, whatever — we'll also get there. So when you have a WAL archive filled by your archive_command all the time, you can just skip the separate backup of the WAL segments during your backup. Okay? Why would I want to have a WAL archive? That's a waste of space. Oh, there are two red stars here: DELETE FROM customers. Among other things, the WAL archive, together with that snapshot down here, will allow you to do a point-in-time recovery. A PITR. Pain in the rear? Not. And for that, obviously, you need the snapshot and all WAL segments — so, everything that happened inside the database since you last took that backup. Okay? Makes sense? Good. So your restore target will now not access only the single or a few WAL files down here; you will just have a generic restore_command that will fetch them from the WAL archive anyway. And you can have that running all the time. That would be a warm standby server. So if you have one of these, say, as a poor man's replication slave, then it can also replay every WAL file. That means you are 16 megabytes behind, but you are still quite close to your master database in replaying here. So your RTO goes down massively, massively. If you have a cold standby — like, you have a server ready that can take over or can replay from the WAL archive — it can still take hours, days, whatever. A warm standby will be up and running in seconds, which is good. Our RPO would be the last archived WAL segment. The master will only send finished WAL segments towards the archive, so you will lose everything that happened after the last one was sent. That's why I would call a warm standby a poor man's replication. It's better than nothing. It will do the job: you can do text, do calculations, even play games, all fine, right? Works. Maybe a little bit outdated. Maybe you want more. Maybe you want binary streaming replication.
That's a warm standby server on steroids. So the WAL segments are actually not the WAL segments: everything that is written to the WAL gets sent over the network straight away. The transactions that you have in your WAL segments, or in that stream, will be replayed immediately — that means ASAP; it can fall behind. So you skip all this way down here; you have a direct connection. Vastly simplified again, of course. But you have some options there. You can have synchronous or asynchronous replication, and you can choose on a transaction level whether your current transaction should be committed synchronously or asynchronously. Usually, for your forum entries or whatnot, you would say: okay, I think it's okay. But when you have a payment, you just set that transaction to be synchronous, and it will only commit when it has reached the slave. And you can still choose between on, remote_write, and remote_apply: do I want my transaction to commit here at the moment the packet has reached the slave, when the packet has been written to disk persistently on the slave, or when the slave has actually replayed it and will serve the same information as the master? You can use replication slots. For now, I would say: yeah, know that they exist, but you don't actually need them for what we're doing here. Replication can be cascaded — we'll have a nice picture there. Slaves can serve read-only requests. You can even take your backup from a slave — but, oh, I think I didn't finish there; there are some implications with feedback, et cetera pp. So if you don't really, really, really need that, never mind. Streaming slaves can be delayed. So if you have your analyst doing DELETE FROM customers, you can still hit the big red button on the slave. Say you have a 30-minute lag: you hit the big red button, configure your slave to just replay to the moment in time just before the DELETE FROM customers happened, and then become the master. So that's quite handy.
I've seen investment banking systems that had that in Sybase, so I have no idea why other software shouldn't have it. So that's quite comfy and nice and so on. Still not exactly the full thing, but yeah, it's nice. Like, you cannot put a PCI Express card in here, but it's good. It's nice. Works well. So, options. You could have a master and one, two, three slaves that receive your replication stream — that's actually what I use in my day job. You could cascade them: this master replicates to that slave, and that slave replicates towards that slave. And that could go on almost infinitely, only limited by your network connections, by your delays, et cetera pp. You get the idea. If you really want to do something like that, read the docs first — all 50 pages, please. And more. If you have synchronous replication like this — if you configure all three of these slaves to be synchronous replication slaves — there are some implications. Like, when one falls down, your DB will still work, and it will still serve read-only requests, but every commit will hang until you have enough synchronous slaves again. So if you have n synchronous slaves configured, have one spare server that is async, right? Because that one can still take over and be declared a synchronous slave. It just has to catch up a few seconds, and then you will still be fine. And then restore that n-plus-first server as an async slave, or whatever. But you get the idea: you still want to have your system running. And remember: disaster strikes. It does. Network latency. If you have a synchronous slave in America and you're in Belgium — I don't know what exactly the round trip would be from here to New York, but half a second, something like that. Every commit will travel half a second, and it will wait for another half a second until it gets a reply back. Do that with 10,000 TPS. You're fine? You're not. Well. So choose wisely, right?
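One escape hatch for that latency problem is the per-transaction setting mentioned a moment ago. A sketch — the table names are invented, and the psql call degrades to a no-op where no reachable "mydb" exists:

```shell
#!/bin/sh
# SET LOCAL only affects the current transaction; forum_posts and
# payments are made-up tables.
psql --dbname=mydb >/dev/null 2>&1 <<'SQL' || exit 0
BEGIN;
SET LOCAL synchronous_commit TO off;           -- cheap: this row may be lost on a crash
INSERT INTO forum_posts (body) VALUES ('hello');
COMMIT;

BEGIN;
SET LOCAL synchronous_commit TO remote_apply;  -- commit only once the standby has replayed it
INSERT INTO payments (amount) VALUES (42.00);
COMMIT;
SQL
```

Note that remote_apply only changes anything once synchronous_standby_names actually lists a standby; without one it behaves like local.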
So, again, in Postgres you can choose whether your specific transaction should be synchronous. So you can have an async slave in New York, no problem, and whenever you have some payment going over, make that transaction synchronous. It can wait for a second; that's fine. So choose where to put your sync slave, obviously. Same rack: fine. Same building: of course fine. Same city: still okay, I guess. Same country. Different continent: not a good plan. Pros and cons. Okay. Fail. You still have a one-to-one copy of your database, and it's online. It's there all the time. You can just switch over, use that, and carry on, right? It's reliable. It's battle-proven. It's been in Postgres since 8.4, 9.0, whatever — for years. The RTO and RPO are obviously very good, right? You can be up and running in seconds, and you lose the last transaction in the worst case. So 99.999% availability is possible. On the downside, it works on whole DB clusters only, again. And, yeah, there are implications on network connection loss. So if your sync slave goes down and you only have one, you have serious problems. So, hey, replication! That sounds good. So I don't need backup anymore, right? And this WAL archive — nah, I don't need that anymore, right? Replication does not replace backup. You need backup. RAID does not replace backups. "Yeah, we have RAID controllers, we don't need backups." Nah. Nothing goes wrong here. Okay, fine. The SAN does not replace backups — even if it's a mirrored, NetApp-based, whatever SAN. No. The cloud — the rest of that sentence is actually "there is no cloud". Ask the FSFE guys. Okay, you get the idea. So then let's combine it. You want a WAL archive. You want a replication server, right? Maybe you want more than one replication server. Maybe you want one of them, or n of them, sync. Maybe you want a replicated server that is delayed by 30 minutes, or maybe even a day, or whatever. Maybe you even want to be cascading.
Like, have a sync slave in the same data center and have two more slaves in the other one. And since bandwidth is not free sometimes, you might want to have those cascaded, so you only send the data over the countryside once. Then you have a minimal RTO, and you have the closest possible RPO if you have one sync slave, or the closest feasible one if you are not using a sync slave. And you probably have some protection — oh, that's missing two stars — against the DELETE FROM customers statement. And on top you can allow read-only queries — reporting, analytic queries, et cetera pp — on one of your slaves, and you're not interfering with your OLTP stuff. That has some implications again, but that's not covered here. So this is basically what you want: have a master, have at least one slave, a stream that's feeding the slave. You have an archive_command that will put all your WAL to the WAL archive, and you have a restore_command that the slave can still use as a fallback option to get missing WAL segments from the WAL archive. You should always configure that fallback. I've heard war stories of customers that were running with replication for years, and the replication actually did not work, but they were happy, because they still had the restore_command, which was fetching WAL segments. They never realized they were 16 megabytes behind, but it worked. Fail again. So you get all of the replication pros and all of the WAL archive pros. That's good; we want that. Still, the major version between these has to be the same. So if you're running 9.6.0 here and 9.6.1 there and you copy the stuff back, you can still run on 9.6.0; you don't have to upgrade the binaries first. So that's more like what you want. The current system works like a charm, and you can still run Haiku on it if you want. Or a decent operating system, a nice one. Or it's more like this, to come back to the commute stuff.
It does most of what you want, but you can still change to manual drive and take over control. So, what do you have to do there? How much time, by the way? Three? Twenty minutes? Oh, cool. Fine. After all, we're the last talk; I only have to get back to Münster in Germany today. I'm not going through all the details, but this I would recommend you just set up in general, even if you don't have replication or a WAL archive in place yet. Put this in Puppet, or whatever your configuration management is, or something similar at least. So, set these options in postgresql.conf. Set wal_level to replica. It does add a little bit on top of minimal, but it's not that much. And if you can afford a few megabytes more, just set logical straight away; that adds a little more overhead as well, a few percent. Set archive_mode to on — that just means: call the archive_command whenever you finish a WAL segment. And as archive_command — if you read the docs, it says the minimum feasible command is something like cp — put a script name in there. Like, really put your archive script in there, with %p and %f. That script could just do an exit 0 for now, but have it there, right? And put this into your postgresql.conf. Set max_wal_senders while you're there; PostgreSQL 10 will have 10 as a default, as far as I remember — right now it's zero. max_replication_slots just the same. Just put them in there; they don't hurt you. And synchronous_commit. Up to here, everything except the archive_command is a start-time parameter: wal_level is not changeable at runtime, archive_mode is not, max_wal_senders and max_replication_slots all require a restart, and we don't like restarts. Not at all. Set synchronous_commit on a new server to local, just for now — like, until you have everything set up, you just say: okay, I want to commit synchronously, locally. You do want that, most probably. And set synchronous_standby_names to an empty string, or even put in your actual synchronous standby names.
That's basically the name with which a slave comes to the master and says: hello, I'm slave Bob, do you have anything for me? And if you then have Bob in there, it will take Bob as a synchronous slave. If you put that in, it will not hurt you right now, because you have synchronous_commit set to local. That's why I recommend setting local for starters. Also: set hot_standby to on on your master straight away, because you will be syncing your postgresql.conf between the slave and the master, right? So configure your slave on the master, straight away. There are some more settings that you will probably want to touch, but anyway, this one is important. And set logging_collector to on — we'll come to that. Then go and set up your WAL archive, and don't roll your own. There are a lot of smart people out there who have done that for you. Use Barman, use pgBackRest, use WAL-E — and there are a couple more; I don't remember most of them. I'm personally a fanboy of Barman, basically because I use it. It's okay. I mean, pgBackRest has more or less the same feature set; WAL-E will send your WAL files to Amazon, to the cloud — oh, there are two stars missing there. Whatever. Follow the setup instructions; most of them are documented quite okay. Don't roll your own. Just don't. Take that time and invest it in thinking about how much redundancy you have and want to have, about persistence in general — we'll come to that — and about data safety. Your DB server is certainly not the right place to keep your WAL archive. That should have one star and two stars, whatever. Well, the example in the docs says cp. If that's not an NFS share, it's on the same server. The docs also say the same server is not a good place, but still — yeah, it should be stressed. Even the same data center is a bad choice.
Like, if a plane crashes or the power plant blows up or whatever — disaster strikes; missing a red star — unless you have a mirror, like maybe a SAN that mirrors by itself, whatever. So what should your archive script look like? I will not give you one. One of the reasons why the archive_command and the restore_command have no default — not even in the docs — is that everybody's environment is so different that you cannot cover it all. You can't even try. I think some commercial vendors do try, and then you have some different backup solution for that database, and so on and so on, and that's not how Postgres works. archive_command: you could write rsync, blah, whatever, directly into your postgresql.conf, SIGHUP it, and it will run that from then on. But even if you have some error checking in there, it will not fit, right? So put in a script. The script can just be changed; the next time the archiver runs it, it will be running the new version. Purpose: somehow get the file at %p to your WAL archive. %p is the complete path to the WAL file, and %f is the file name on its own — comfy, you don't have to split it yourself, so it gives you both, right? And then get it off your server. rsync is actually not a bad choice, in my personal opinion. But you have to make sure that %f — the file name of the segment that you want to archive — does not exist yet on the archive server. Otherwise — well, two stars: you clone your slave by file-system copy or whatever, and it starts up, runs the same script, has the same destination server and the same destination directory, and it will receive different updates — and then you have conflicting writes on the same WAL segment. Nah, not what you want. So make sure that it does not exist yet. Call sync — we'll get back to that. Or maybe you want to have that path on a file system on a remote server that is mounted sync.
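Stepping back for a second: the postgresql.conf starting points from a couple of minutes ago, collected into one fragment (the script path is a placeholder for your own archive script):

```ini
# postgresql.conf -- starting points (9.6-era parameter names)
wal_level = replica                  # or 'logical' if you can afford a bit more WAL
archive_mode = on                    # restart required
archive_command = '/usr/local/bin/archive_wal.sh %p %f'   # your script; may just exit 0 for now
max_wal_senders = 10                 # restart required; PostgreSQL 10 defaults to 10
max_replication_slots = 10           # restart required
synchronous_commit = local           # until the standbys actually exist
synchronous_standby_names = ''       # harmless while synchronous_commit = local
hot_standby = on                     # so the same file works on a promoted slave
logging_collector = on               # archive_command output ends up in the server log
```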
And what I actually ran into, which led to this talk, is that rsync tends to give you return codes of 255 when it cannot resolve the name. So if your DNS is being restarted right at that moment, you will get a FATAL error in your logs — like, ooh: archiver: archive command returned 255, exiting. FATAL, FATAL, FATAL. It's the highest level before PANIC that you can get. Most probably all the red lights in your monitoring will go off blasting, and yeah, that's not nice. So filter those, right? Don't return anything larger than 127, straight away, because otherwise you will have FATAL log entries. That's it. At the end of the day, the archiver will try and try again until you return zero. When you return 128, it will exit, start again, and try again — so no real difference, except that you have a fork in between and the log entry. So make sure that you never, ever return zero before your script has done the job properly. Sounds simple? Actually, it is simple. Unless, of course — yeah, well, when you're still setting everything up and so on, you have an exit 0 in the first line. set -e, et cetera; you get the idea. Why we turned the logging collector on: if you have logging_collector on in postgresql.conf, anything that the archive_command sends to standard out and standard error will end up in your PostgreSQL log. Otherwise, it will end up in /dev/zero — sorry, /dev/null. That's not where it belongs. The log collector will collect that and put it into your log. You want to have that. Okay, fail again. But we can start with that, right? It's got two stars. Remember that whatever you do — cp, scp, rsync, copy via NFS — you are probably writing into some page cache. So when rsync finishes, that does not mean that your WAL segment has been persistently written to disk on the remote server. It just means that it's in the Linux page cache, and it can stay there for the next 30 seconds, or five minutes, or whatever.
So that's why I said: always call sync. NFS is usually async; that's even worse — it's somewhere in a network buffer, unless you mount it sync, and then it gets terribly slow. So when you mount sync, NFS is not really an option anymore, in my opinion. But your backup and your WAL archive are not safe until they have been flushed to persistent storage in a safe location. So: you scp to your remote server, the data center goes down, and you have lost data at that particular moment, because it is still in the page cache of your WAL archive server. That's why you sync. And the same goes for the WAL segments. I wanted to really, really, really make sure you got that — it's the same sentence. What's the last one? That is in a perfect world; but if your personal RTO and RPO requirements are not that high, mounting something sync is usually not that attractive, so you will probably make some compromises. But know about the implications. Remember: you might still lose data. Okay? That's why there are two stars. Humans. Now go on, activate your archiving. Activate your Barman cron job, whatever. No: remove the exit 0. And watch it. Make sure that you really, really, really write your WAL segments to the archive. pg_switch_xlog() is a function that you can call; it will start a new WAL segment and archive the old one, so you can check whether it works. If it does not work, you might run into trouble after a while, because Postgres will not throw away any WAL segment it has not archived. So if you have a script that does the job right but does not return zero, Postgres will try to run it over and over again and not throw away the WAL segments, and then you get called out of bed in the middle of the night because the disk space is gone. Replication slots, by the way, work just the same way: they will not throw away any WAL segment that has not been sent through all replication slots. Keep that in mind if you want to use replication slots.
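Putting the last few minutes together, an archive script along those lines might be sketched like this. ARCHIVE_DIR stands in for wherever your real archive is mounted, and a production script would push to a remote host and do more checking:

```shell
#!/bin/sh
# Called from postgresql.conf as:
#   archive_command = '/usr/local/bin/archive_wal.sh %p %f'
# ARCHIVE_DIR is a placeholder for wherever your WAL archive is mounted.

archive_wal() {
    src=$1                      # %p: full path of the finished WAL segment
    fname=$2                    # %f: the file name on its own
    dest="$ARCHIVE_DIR/$fname"

    # Never overwrite: a cloned slave running the same script must not
    # clobber a segment the master already archived.
    if [ -e "$dest" ]; then
        return 3
    fi

    cp "$src" "$dest" || return 4

    # cp may only have reached the page cache; force it to stable storage.
    sync || return 5

    return 0                    # only now may PostgreSQL recycle the segment
}

# Anything above 127 (e.g. rsync's 255 on a DNS hiccup) would be logged
# as FATAL by PostgreSQL, so clamp it to an ordinary retryable failure.
run_clamped() {
    "$@"
    rc=$?
    if [ "$rc" -gt 127 ]; then
        rc=1
    fi
    return "$rc"
}
```

A real archive_command line would then be something like `run_clamped archive_wal "$1" "$2"` at the end of the script.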
So now try a full backup. You are using a tool — promise? You're using a tool. So if you're using barman, for example, run `barman backup all` and just watch and check that everything works, right?

And then you set up your replication slave — five minutes, roughly. Cool? You have to do something in pg_hba.conf to allow for replication, but you probably did that already when you configured barman or your WAL archive, because many of them use the replication protocol now as well. You prepare the new PGDATA. Assuming you're running Debian or Ubuntu, you will probably call the wrapper script and just have it create a new instance, and then go there and delete it — stop it first, maybe. Really do `rm -rf` in your new /var/lib/postgresql/9.6/main, or whatever it is. You don't need the data — you scratch it — but this way you have postgresql.conf, pg_hba.conf, et cetera, already set up in the right locations by the packaging scripts. So make sure that you have copied your postgresql.conf, pg_hba.conf, et cetera over, so that your slave will actually work if you promote it to be the new master.

And then run it — no, run pg_basebackup, actually; first you have to do the copy. So you run pg_basebackup on your new slave, tell it to fetch the xlog segments via stream, give it your host, your user, and so on, and give it the new PGDATA directory — /var/lib/postgresql/, whatever. It will just copy everything over and provide you with a runnable instance, and because you have given -R, it will also create a recovery.conf that will basically work — except that you will probably want to add a restore_command, which enables the new slave to fetch missing WAL segments later from your WAL archive. Start, enjoy. Rinse, repeat, next slave. Maybe you want to have that async or sync or delayed or whatever. It's not really that hard.

Now start looking for software to manage your replication. Again, a lot of smart people have done that job already.
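The replica bootstrap described above, as a sketch. Host, user and data directory are example placeholders, and the commands that would touch a live cluster are left commented out, so this reads as a recipe rather than something to run blindly.

```shell
#!/bin/sh
# Hypothetical replica bootstrap for a Debian/Ubuntu-packaged 9.6 cluster.
PGDATA=/var/lib/postgresql/9.6/main
PRIMARY=db-master.example.com
REPLUSER=replicator

# 1. Let the packaging wrapper create the instance (config files land in
#    the right places), then stop it and scrap only the data directory:
#      pg_ctlcluster 9.6 main stop
#      rm -rf "$PGDATA"

# 2. Clone the primary. -X stream fetches the WAL segments over the same
#    connection; -R writes a basically working recovery.conf for you:
BASEBACKUP="pg_basebackup -D $PGDATA -h $PRIMARY -U $REPLUSER -X stream -R"
echo "$BASEBACKUP"

# 3. Add a restore_command to recovery.conf so the slave can fetch
#    missing WAL segments from the archive, then:
#      pg_ctlcluster 9.6 main start
```

The same invocation works for each further standby; only the recovery settings (sync, async, delayed) differ.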
Don't roll your own — it's not necessary. repmgr is, I think, five years old or so by now, an open source project, actively maintained. There's PAF by Dalibo, a resource agent for — what's its name again, that Linux HA manager? Pacemaker. It's for Pacemaker, hence "PostgreSQL Automatic Failover". That looks promising. Pacemaker itself ships a PostgreSQL resource agent, but that will mangle your postgresql.conf — you probably don't want that. pglookout I actually don't know; I've got that from the wiki. There are wiki pages for the archiving part as well, so that's a good place to start, probably. Or, if you are like me, you do not rely on some piece of software to decide about your high availability, but do it yourself. We don't have that many instances. If you're running around 8,000 PostgreSQL instances, you probably want something like that — otherwise you get calls every day.

Logical replication. I'm only mentioning it because it's not yet done, and the guy who wrote it is sitting over there, so this could be awkward, right? But still: logical replication in the not-so-recent past would have meant that you were using Slony, Bucardo, SkyTools, and what they have in common is "we're all gonna die". It's not that bad, actually, but still — since they are based on triggers, have a lot of options, and just keeping track that you replicate everything that you want to replicate is the painful part. So yeah, I don't know — wait for 10. PostgreSQL 10 will have logical replication, or use pglogical.
Now, logical replication, as mentioned: if you ever, ever plan to use it, set your wal_level to logical right away, because then you can use it later — maybe do a point-in-time recovery and start your logical replication from that, and so on. It allows, or will allow, nice things like rather painless, low- or zero-downtime version upgrades, like switching from 9.6 to 10, or 10 to 11 — it goes to 11. Things like sharding come to mind, collecting data from different OLTP databases into your data warehouse, multi-master replication, et cetera. Gülçin has covered that part quite extensively. When it is in core, it will be like that: you still have to know that you are in a three-dimensional world and watch out for trees and so on, but it's much, much more comfortable than before, right? Okay. Hopefully better.

Famous last words. Don't reinvent the wheel, don't roll your own stuff. There are so many smart people who have done all that before, and you can only fail where they failed already, right? So test your backup procedure, and — even more important, because we're great at backup but bad at restore — test your restore procedure, on a regular basis. And make sure everybody in your team knows how to restore, maybe even knows how to do a point-in-time recovery. Don't let that one guy who is now on holiday in Northern India for four weeks be the only one in your team who knows how to do a point-in-time recovery, right? Monitor your logs and your lags — monitor your replication lag, that's what I mean here. So if you fall six hours behind, maybe check earlier. And make sure your configs are in sync. The worst case is that you build up your master, configure stuff, and the slave is sitting there replaying; then you promote it, and your shared_buffers, your max_connections, all these limits are not configured — so you need to restart that server, which means a cold start.
Not good. Okay, in case of disaster: keep calm and just carry on — do not start improvising with LVM snapshots, ad-hoc restores, whatever. That's it. Fits quite well — only five minutes over.