Thank you very much for staying so long. So yeah, I'm going to talk a bit about Postgres. Migration may or may not be a problem for you, but I'm presenting something here which we believe is quite new, quite innovative, in the way we can handle downtime. Very shortly about me: I've been working for three years now at Audubet in Berlin, where we do some pretty nice stuff for restaurants and bars with live events. I have quite a fuzzy stack. I'm also one of the core maintainers, and recently an owner, of MongoEngine. It may seem strange that a guy doing so much Mongo presents something about Postgres, but it's possible to use both, and maybe we'll see that using other database technologies can actually help bring new concepts to Postgres. To start: who here is using Postgres in production? Okay. Who is making migrations? Okay, the rest of you are lying. And who has already had problems with migrations, or sometimes has problems with downtime? Yeah, okay, nice. Maybe we'll find something for you. So, basically, there are two main cases with migrations. There is the quite simple kind: you extend things, add a new table or a new column, alter it, and generally that's the kind of stuff Postgres is quite good at. It goes through quite fast, you don't have big problems, very small downtime, sometimes even hot migrations. There are also possibilities to combine two versions of the code to realize a migration in stages: one week deploying something, the next week rolling back what you made, et cetera. But here I'm after something quite different, because there are still some cases, which the people who raised their hands last know about, where you have really very different business logic.
You may also have, say, a strategic change in the company which requires a huge migration, one which unfortunately needs a lot of data handling and several, eventually dozens, of minutes of downtime; the kind of stuff nobody really likes to do. Even more so if you are in a non-tech-oriented company. For this we're going to use Postgres 9.4, and strictly 9.4; 9.5 will bring some other things as well. Who here knows about logical decoding? Okay. It's unfortunately not so well known, but it's actually really great. Basically, logical decoding is a level of replication. One last question: who is using replication in Postgres, the one built into Postgres, not from an extra provider or extra drivers? Okay, not so many people. Basically, when you're using replication, there is the write-ahead log (WAL), which is the mechanism Postgres uses to make replication work, and there are different levels of replication; hot standby is the kind of thing you may already have heard about. Logical decoding is basically the next level, introduced in 9.4, which allows you to describe what is happening on a database while doing the replication in a logical way. Meaning, precisely: here is an insert, here is an update, rather than some blobs and diffs. This enables quite a bit of stuff, because you can extend it with output plugins. What is also very nice in Postgres is replication slots. Other databases just do replication, and you are speaking to another database which handles the replication, maybe with some parameters on its side. But in Postgres you really have slots, which means different slaves can read from different slots, which you can start at different times and which can have different levels and different outputs. So you can do a lot of stuff. You may all be a bit lost now, but slowly it will start to make some sense, I hope. Basically, this is what is possible.
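To make "logical" concrete: an output plugin like wal2json emits each transaction as a JSON document listing inserts, updates, and deletes. The payload below is hand-written for illustration (the field names follow wal2json's output format, but this sample was not captured from a real server), and the parsing is plain Python:

```python
import json

# Illustrative wal2json-style message: one transaction containing one insert.
payload = '''
{
  "change": [
    {
      "kind": "insert",
      "schema": "public",
      "table": "data",
      "columnnames": ["id", "info"],
      "columnvalues": [1, "hello"]
    }
  ]
}
'''

def describe_changes(raw):
    """Turn a wal2json message into human-readable one-liners."""
    out = []
    for change in json.loads(raw)["change"]:
        # Pair up column names and values into a row dict.
        row = dict(zip(change["columnnames"], change["columnvalues"]))
        out.append("%s on %s.%s: %r" % (
            change["kind"], change["schema"], change["table"], row))
    return out

print(describe_changes(payload))
```

This is exactly the property the rest of the talk relies on: the stream says "insert this row into this table", so arbitrary code can sit in the middle and rewrite it.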
Our toolbox: we're going to use wal2json, which is a great tool. It was not originally written by us; we use Ethan Snyder's fork because it adds some help for the compilation. It's something which was proposed to go into the core of 9.4; unfortunately it didn't make it, but it's actually really great. WAL, as I said, is the write-ahead log, this internal Postgres replication language, and once you expose it as JSON, you understand that you can do some stuff with it. PgMigrator is a toolbox, or a kind of module, we built in pure Python and psycopg2, which is why we're here, and it allows you to do some fancy stuff. There are two main modules. The first one switches from a traditional, physically replicating slave to a logically replicating second master, because once you start doing logical replication, you no longer have to be, or you can choose not to be, in a master-slave setup. We're going to have two masters, and here comes the trick that we will see later: still replicating, but two masters. The second is the JSON receiver, which basically parses the JSON, as you can read, and applies some transformations. So what we're going to do, to have a migration which for whatever data size will in the end cost just a few seconds of downtime, is the following. We start with classical replication. We switch to logical replication. We pause it. Then, on the newly switched second database, formerly a slave, now a logically replicating master, we apply the migration locally. Then we replay: since we paused, we replay what we missed in the meantime, transforming the data on the fly. Then we deploy the application, the Python backend, and we write directly to the new master. This is the concept; I hope you follow a bit. There are going to be some diagrams and then a live demo. So, this is the starting point. I guess this is fine. Then the second part, as I said: we change a bit the way the two communicate.
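The steps above can be sketched as a tiny driver. Everything here is hypothetical — the real PgMigrator modules are not shown in the talk — the sketch only pins down the ordering just described:

```python
# The migration plan from the talk, in order. A real driver would replace
# `execute` with psycopg2 calls; here it is any callable, so the plan can
# be dry-run and inspected.

MIGRATION_PLAN = [
    "start with classical (physical) replication",
    "switch the slave to a logically replicating second master",
    "pause logical replication",
    "apply the schema migration locally on the new master",
    "replay the missed changes, transforming data on the fly",
    "deploy the application and write to the new master",
]

def run_plan(steps, execute):
    """Run each step through `execute`, stopping at the first failure,
    and return the list of steps that completed."""
    done = []
    for step in steps:
        execute(step)
        done.append(step)
    return done

# Dry run: "executing" a step just prints it.
log = run_plan(MIGRATION_PLAN, execute=print)
```

The point of keeping the plan as data is that the pause/replay pair in the middle is what buys the near-zero downtime: everything before it runs while the old master serves traffic.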
The second master, the old slave, becomes the new master once the migration is there. Then we enable wal2json; as I said, this is actually a C plugin communicating directly with Postgres. We replay that through our toolbox, PgMigrator, which updates what needs to be updated — add, transform, delete, remove, filter, everything is possible — and writes what we want to be written into the new database that has the new schema. And basically, when you are at that point and your data has been transferred, your old backend, your old logic, still writes to the old side; you take a small downtime and switch. You can even do blue-green deployment, but in the easy case you take a downtime and switch. Your new code base writes directly to the new master, and with the old one you can do whatever you want. Actually, in production you won't have just two databases but three or four; you can chain replication in Postgres very easily, so that the new master directly has a slave of its own, because yes, security matters, as we all know. I guess I'll continue directly with the live demo, unless there is some very urgent question; and as we have slightly more time, there will be a Q&A session. Okay. There are a couple of terminals; those who know me know that I like this. I tried to make it somehow visible and to play a bit with the colors. We are on the first one, on the left, which has a blue color; blue is when I'm talking to the master. Basically I'm on Vagrant here and I have two Docker containers, for master and slave. They start as master and slave, then they switch, but let's keep it easy. The green one is the slave, as usual, and the last two are going to be for some fun and tests. All right, so here we're going to run the master. Okay, it starts the database; this is some classical output. I could also have fewer terminals, but this way we can keep seeing the output of the database.
This is quite nice. Here we can start the slave. Ah, interesting; let me just retry it this way. Yeah, this was my mistake: I had indeed deleted the test setup before and forgotten the most important part, excuse me. Right, here we continue. So we can see these, the write-ahead log, the kind of thing used for the replication. Okay, this is basically the setup. Here we're just going to log into it, so we have a small shell on it. All right, so we can poke around a bit. What do we have here? A flower table, another one. All right, here we can check insane_pain; nice name, no? You'd like it. Basically our slave — well, it's not really going to stay a slave, so we need some random name, because you never know about its state. We can do the same thing here. Ah, excuse me, I added this to avoid some disruption when changing windows, and also so that you believe I'm not cheating and have the same thing in the back. All right, so as you see — I don't just want to claim replication works; replication works and you can see it. Just to make it even more obvious, we can do something like this. Oh, it's gone; and what about the slave? Oh, it's gone there too. Now we can do something else, where basically we insert some things into the database. Let me — I'm still not used to moving between the windows here. All right, just fire this first. I don't think I need to explain what it is, but what it's doing is basically creating a table `data` and inserting some values after a small sleep. Nothing extremely fancy so far. We can check that again, and we can also do this, all right. This is still quite trivial, but now this tab is basically your customers, who are still putting in new stuff, and you want, in parallel, to do your deployment and start your migration.
So now you're going to use the tools we've been working on in PgMigrator; we'll use the first component. Python and psycopg2, to switch. I see so badly from here; I should have mirrored the screens. All right, excuse me. So we can define a source database, a source user, definitely a source password, et cetera. And we can define some slot for the testing; let's take this third one. Okay, we created the slot. The slave is in a good state. We use the trigger file: this is the classical way in Postgres, when you have replication, to make the slave become a master. It's also what you would use if you want to have some fallback strategy. So now we have a second master, and we are eating the changes, which is what we want. Okay, now I'm going to go back to the shell here. This is the slave shell; maybe that was not very obvious before. The master shell, excuse me — the master shell is still eating data, and the slave has stalled. That is as planned, because it became a master, so we no longer have exactly the same replication. Now what we want to do is the fancy part, which is our migration. For example, we're going to alter the table and add a new column; this is the schema migration. And then we can also run an update, where we basically set some `new_info` by taking the old `info`, and you see how it works. Logically, 224 rows updated; this sounds quite good. Now we can go to the next step. I'm going to paste it because it's too long, and continue here. For the slot, let's take the third slot; all the rest should be fine. Ooh, so this is the JSON receiver. Basically, at this point we are still not replicating directly from Postgres: we have applied the migration, but the new master is receiving nothing.
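The migration in the demo boils down to one schema statement plus a backfill. A minimal sketch, assuming a `data(id, info)` table as in the demo; the upper-casing transform is an invented stand-in for whatever your real business logic needs:

```python
# Schema migration: add the column the new code base expects.
SCHEMA_MIGRATION = "ALTER TABLE data ADD COLUMN new_info text;"

def transform_value(old_info):
    """How new_info is derived from info. Any Python logic can go here;
    upper-casing is just a placeholder transform."""
    return old_info.upper()

# Parameterized backfill statement, executed once per existing row.
BACKFILL = "UPDATE data SET new_info = %s WHERE id = %s;"

def backfill_statements(rows):
    """Yield (sql, params) pairs for rows already in the table,
    e.g. rows fetched with psycopg2's cursor.fetchall()."""
    for row_id, info in rows:
        yield BACKFILL, (transform_value(info), row_id)

# Dry run against two fake rows instead of a live cursor.
stmts = list(backfill_statements([(1, "foo"), (2, "bar")]))
```

In the demo the backfill is a single `UPDATE ... SET new_info = f(info)` over all 224 rows; the per-row form above is only to make the transform function visible, since the same function is reused later by the receiver for rows replayed from the slot.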
So now we start the JSON receiver, which will eat up what was accumulated in the slot. It goes quite fast for a while, because it is catching up on everything we missed; it replays what we were missing, and then it slows down because, as you can see, it is filling in `info` and `new_info` with the values on the fly. What is done here is exactly what is done over there: we can see the max ID going up, because it's the max ID from here. So now, live, we have two masters, and the second one is working. You could already write some other data. If you really want to have two systems writing in parallel and also synchronizing with each other, you can try; it can be tough, but you can try. But basically, this is the moment where you usually switch: we take the small downtime to be really sure there are no parallel writes, and you switch your system, you switch the code behind it. So now we can kill that; we saw that it works. We can make sure that here it's 623, and on the old master also 623. And on the new master, which is this green one, we can also check whether there is any row where `new_info` is null, and there are zero, which is quite good news. From now on, only the new code base writes to this new database, the new master. It worked. And now, oops, this part is done. Just some words about the future. For sure we'd like to develop and open-source the package a bit more. Last time we had a bit less time to develop it, so we implemented, really in an alpha/beta style, some components for adding a column, for doing the updates, et cetera. But it would need a lot more work for really complex stuff: some serious testing, some serious documentation, these kinds of things.
Later we could integrate it with Python frameworks; if there is interest in that and we see a community willing to work on this, I guess it's possible to have a Django package which would basically read the migration file and automatically create the transform objects, giving you all of this at a very cheap developer cost. That was it. I would like to thank my colleague Ethan Snyder, not here today, who also worked a lot on that project. And now it's time for questions, I guess. Thanks for the talk. Are you able to show the code that does the transformation? Sure, I kind of prepared that; wrong screen. The structure you see, the test, is about that. There are two main components. In the migrator, as implemented for now, we have a transformer class which registers against one or the other operation and defines how you transform the value. This is the base class for it, and here, for example — you cannot see it very well, but this is basically how you would add new data, which you could do this way. And basically filter out all "foos", delete some stuff. And this is a very simple main. As I said, it's really alpha; that was for prototyping. But basically you have a transformer, you register whatever you want, and you have the receiver, which connects directly to the output plugin and writes new JSON, which is then taken up on the other side. Is that okay? Okay. Some more questions? We'll be happy. Thanks for the talk, very good. Did you test the solution with more advanced structures like hstore or full-text search fields? I mean, with full-text search I don't think anything can go wrong, but hstore must be more complicated. I guess the problem would basically be to write the transformer here, but since this is a quite flexible structure, I guess everything is possible; but I didn't test it.
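As a toy version of that transformer idea: register a class per (table, operation) pair and let the receiver route each decoded row through it. The class and function names here are guesses at the shape of the alpha code described above, not its real API:

```python
# Registry mapping (table, operation kind) -> transformer instance.
REGISTRY = {}

def register(table, kind):
    """Class decorator: register a transformer for one (table, kind) pair."""
    def deco(cls):
        REGISTRY[(table, kind)] = cls()
        return cls
    return deco

class Transformer:
    """Base class: by default, pass the row through unchanged."""
    def transform(self, row):
        return row

@register("data", "insert")
class AddNewInfo(Transformer):
    """Apply the same rule as the backfill to rows replayed from the slot."""
    def transform(self, row):
        row = dict(row)                       # don't mutate the caller's dict
        row["new_info"] = row["info"].upper() # invented stand-in transform
        return row

def apply_change(table, kind, row):
    """What a receiver loop would call for each decoded change."""
    t = REGISTRY.get((table, kind), Transformer())
    return t.transform(row)

out = apply_change("data", "insert", {"id": 7, "info": "foo"})
```

Filtering or deleting could fit the same shape, for instance by letting `transform` return `None` to drop a row; that is speculation on my part, not something shown in the talk.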
So I can't say for sure, but JSON, hstore, having a list — I guess it sounds doable. Could a hot-standby setup of Postgres be implemented in a similar way? So, hot standby is a level of replication, and logical decoding is basically the level above it. I will also put the presentation up on the EuroPython site, and I will publish it and add some links to documentation. But basically, these are just levels of replication, and when you do logical replication, you get the full hot standby included; it's just something on top, you lose nothing. It's one more step, and you can take it further. Thanks for a great talk. One thing I haven't quite understood: when do you know that you can switch your application? When do you know the replay is done? This is something which, for now, is not automated, and I don't know if it's completely automatable. Basically, you could listen to what is written on the master and detect the moment when nothing more is written; it depends what kind of fade-out you have. You could also use some kind of caching mechanism for the new one. If you have a blue-green deployment system, you could even do a really hot switch. The main problem there is actually the sequences: except for the sequences, there is nothing which really prevents you completely from doing a fully hot switch. With sequences, you would have to trick a little bit, by resetting them on the slave to leave a gap big enough for whatever you expect to still come in from the master. It's doable, but it's kind of tricky. Otherwise, you basically shut down your data input, and if you have a system that you monitor well enough, that you know well enough, you then know the right point to start writing to the new master.
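The sequence trick can be sketched like this; the headroom size is an assumption you would pick from your expected write rate, and the sequence name is just the Postgres default for a serial `id` column:

```python
def sequence_restart_sql(seq_name, current_value, headroom=100000):
    """SQL to restart a sequence on the new master, leaving `headroom`
    ids free for rows that still arrive from the old master via replay.
    The headroom must exceed the number of rows you expect in transit."""
    return "ALTER SEQUENCE %s RESTART WITH %d;" % (
        seq_name, current_value + headroom)

# E.g. with 623 rows as in the demo and the default headroom:
sql = sequence_restart_sql("data_id_seq", 623)
```

New rows written directly to the new master then get ids above the gap, so replayed rows from the old master can never clash with them; that is the only coordination the hot switch needs for serial keys.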
The only problem could be a clash between the sequences, and this is why you should make sure nothing else needs to be written afterwards on the old master. That's basically it, if you want to have it at moderate cost. Is it possible to have this interplay between two different database systems? I mean, here you synchronize Postgres with Postgres, but is it possible to think about synchronizing Postgres with something else, like, I don't know, MySQL or whatever? MySQL, don't ask me, I'm sorry; but with Mongo, for example, since you have JSON afterwards, I think it's quite obvious that you can do it. It's basically completely out of scope here, nothing to do with the migration itself, but since you have this flexibility with the replication slots, the replication flows, and even the output possibilities, you can definitely use something similar to write output to, for example, a NoSQL database like Mongo by taking the JSON. I didn't try it, but it must be interesting, and I'm quite sure it's possible. Even more: in Mongo, for example, you generally have documents which are more complex and contain data from several tables, so you could have some intelligent listener over several tables to aggregate and write nicely prepared output, even transforming on the fly. Also possible. Then I have a follow-up question: have you looked into SymmetricDS? No, I don't know about that one. I have to admit there are a lot of other possibilities for replication and master-master schemes in Postgres, and we were basically a bit wary of all these tools, which all claim to be the best and to be somehow sponsored and supported by Postgres. So we wanted to work with the real internals of Postgres, checking also what the upcoming developments are, and build our system on top of that. For other purposes there might be other, better, or different solutions. Any more questions? No? Okay, so.