 Good day, everyone. Sorry for another set of technical legal faculties, and I'm going to talk about my skill replication advanced features in 20 minutes, and I know it doesn't make sense because you cannot cover my school all advanced features in 20 minutes, so it will be some advanced features and hopefully 20 minutes. So first, let me ask you how many of you are using some kind of my skill replication here? Wonderful, we can't do without that. Now, let's see what kind of replication guys are you using, who here use the classical MySQL replication here? Okay. Now, what about MySQL group replication? What about Galera replication? Oh, wonderful. We have all kind of MySQL replication being presented here. Okay. So let's look at MySQL replication first and their history of MySQL replication, really how that happened and evolved over time, because I think that is a very significant here, and that is always, I think historical reason is why MySQL replication is so, what's the world messy, complicated? Oh, you have a bad word, Simon? No? No? Yes, yeah. Is it started fucked and ended it up? Okay. No, no, no. I shouldn't say bad word here, right? Oh, that's actually Belgium, right? You guys are kind of okay, right? Yes, yes. You know, that's when I'm speaking to you as I have to make sure it's all the rules to be politically correct, so that's such a relief, I only have to follow less rules here. But anyway, MySQL replication, it started back in 3.23, right? And that's the first statement trial replication was replicated, which I think first was similar to like weekend project, right? So it was very similar to cover basic things. And then you can see through versions, each or almost each version would be adding a lot of new features, right? And for zero, we had a split Iof thread and SQL thread, then in MySQL, phi one, in addition to statement replication, we added the row replication, mixed replication mode. In five six, we added peer database parallel replication, GTID in five seven. Now we have other kind of parallel replication, multi-source replication, the group replication was added. I also could add MySQL eight here, but actually now the most improvements that happen in the group replication, which is designed as a plugin. So most changes are also available in both MySQL five seven and MySQL eight, right? So that doesn't mean what nothing great and wonderful happened in MySQL replication since five seven. Well, it just, it's in a plugin. Now besides MySQL replication, it had sort of a alternative track which exists now also for what, probably close to decade, is there Galera-based replication, right? Which is kind of similar to MySQL replication, but that's a lot all the code base. And I would call it more mature. And that's available as a patches for MySQL. It's also available for, from Purconis, Purconis 3D cluster, and also include in MariaDB. Now when I speak about replication in MariaDB, I think it's important to note what in MariaDB, there are a lot of differences, right? So MySQL, it's not 100% the same as in MySQL, right? For example, GTID replication is quite different, right? It has a different features for parallel replication, right? And we are not going to cover MariaDB features in this presentation, right? Just want to get it out. Okay, so what are we going to call Advanced MySQL replication, right? Well, actually I'll call Advanced with everything which is outside of pretty basic MySQL replication, which means, hey, master and a few slaves, and which covers all of those, you know, standard basic features and improvements. Okay, let's look into what kind of choices do you really have with MySQL replication and you can make. The first one is you can choose what kind of, how the data is replicated, right? That can be statement replication, then you actually have your queries. You can have a raw replication in actual rows. I've been replicated, right? And then you have a mixed replication, which MySQL, it will make its choice how best to replicate this query and it will use raw replications when a query looks as a potentially unsafe to be replicated in a statement mode, right? In reality, both of those approaches have some performance benefits and other consideration, and that is something what you may want to make a choice for your application. Now here is another interesting thing is if you're looking for a raw replication, you also can have a choice to figure out how those row changes are going to be logged. If you look at the best approach, right, really from your data, is when you have a full before image of a row and a full after image of a row, right? Why is that wonderful? Because you have a best opportunity to spot the conflicts or data inconsistency. You can also actually go back then, right? When you have both before and after image, then you can replay the logs in reverse order and restore your data in many cases. But the downside, of course, is what your binary logs and the amount of data you have to replicate is huge. So you can have also two additional options. You can say, hey, I would have a minimal row image, in which case only change its columns, right, and primary key to find them, store in a binary log, or you can use no blob, in which case blobs, which are typically the largest parts of your data, are not logged. Now, if you use your row replication, you can say, well, you know what? My replication got broken and all I get is this kind of strange error. And I don't know what query calls it, so I don't know what application corresponds to and so on and so forth. Then you can also log the query for informational purposes with this bin log row query log event, which is quite helpful feature for troubleshooting. Okay. Now, what can you replicate? Now, in your case, I would say the most simple practical way to replicate is a full database, right? When you have full database replicated, it is wonderful from troubleshooting standpoint, from operational standpoint, you can take it back up from one of the nodes in the store to any nodes, but at the same time, MySQL is much more flexible with that. You can both replicate only part of the data, as well, you can choose to add additional data to the slaves if you choose to. What kind of filtering do we have? We have two ways to deal with that, actually. There are options either on the master itself. You can select what options you can, what data you're going to write in the binary log, right? And you can also filter on the slave and you can choose what data you want to filter. Now, there is some relatively complicated interaction of how those options worked exactly depending on where your statement replication or row replication, right? So, at least mind what the replication filter in MySQL is complicated and I would think where you actually really need to use it. Again, full database replication typically is the most practical, especially be mindful about filtering on the master. Because remember, binary logs are not only used for applications, they're also used for your point in time recovery from backups, right? So, if you don't try to back data in your binary logs, you also use opportunity to do point in time recovery using the binary log backups. Okay, the next question is how the position is identified in the MySQL binary log, right? There are two choices. One is your binary log position when you pretty much identify that by the file and by position. The challenge with simplicity, right? Also comes the challenge that that position is going to be different on your master's one position and on the different slaves, it will correspond to potentially very different positions. So, failover is complicated and so on and so forth. You can also identify that by a GTID, right? Which kind of handles all of that position in much more fancy, but it's also much more complicated. So, what is GTID in MySQL? Again, in MariaDB, GTID is different. That's one of the biggest differences in the replication. GTID looks this way, right? You would have the source ID and some transaction ID information. And this value of a GTID for transaction is always preserved on the slave, right? Even if you have a multi-tiered, very complicated MySQL hierarchy the GTID of a transaction will be preserved if this transaction goes for the slave. Now, benefits and drawbacks, of course, is what you have an automatic position discovery. You can say, hey, you know what, start replicating, right? Or on the slave after you restore backup, for example, it will be able to figure out GTID. You can simplify slave promotion. You can also spot and discover if there are some transactions which have been missing in some cases. But it's also pending the ask for manual troubleshooting, right? If your position-based stuff, you also can, you have a simple, hey, you know what, there are a few transactions which you can skip. You have some rows missing. You can always just manually apply the updates. With GTID, it becomes substantially more complicated. Replication topologies. MySQL is now super powerful in terms of what kind of replication topologies you can use, right? You can have a single tier master's ladychill already displayed. You can have a replication which is set up directional. You can have any kind of tree. You can have a ring, also called circle replication. You can, but you shouldn't, you know? MySQL replication is a very, very powerful food gun. It allows you to do many, many things which you should not do, right? I remember that. And you can also, especially with multi-source replication, you can pretty much set up very fancy directed graphs, right? You can have a fun in and fun out replication. So, oh my gosh, you can, you know, have become very creative for your doom. Okay, so what kind of options do we have for dealing with those more complicated advances topologies? Logslay updates, that is a most important one which is also on by default in MySQL 8, right? Which allows you actually to have slaves of other slaves. Parallel replication, that is a very important feature to keep in mind. The replication is often limited performance factor if you don't enable parallel replication. It's available since 5.6, but only in 5.7 you have this logical clock kind of parallel replication which really allows you a truly parallel replication even if you have a rights going in the single database. Okay, standard MySQL replication is asynchronous, right? What that means is what when you do the rights, data persistent on the market and then it's eventually transferred to the slave and even more eventually is applied to that slave, right? And eventually, hopefully means a few milliseconds, but hey, you know, I have seen the MySQL replication log of being more in a month in some cases. Anybody else? Yes, yes, so eventually may mean many things, right? And that replication log is unbound in MySQL replication. There is no, unless you have some extra monitoring MySQL will not complain about having any MySQL replication log out there, right? And what you also have to know in this case is what asynchronous replication if a master is crashed and lost, some data loss may occur because even that network transfer happens in the background. Now, there is an option for MySQL replication called semi-synchronous replication which requires a plugin. And then if you use that replication, replication doesn't become synchronous, right? But at least the data would be transferred to the slave synchronously. So if master crashes, then the data loss can be prevented, right? There are a bunch of semi-synchronous replication options which you may want to review, but in the interest of time and keeping Fred happy, if you're not going to go in more details, we'll talk in, and now I think you need to know about MySQL replication. MySQL replication allow you to set active, active master replication, right, in this case. But again, because it doesn't have any kind of conflict detection and resolution, you should not use active, active master, right? You also should use active, passive master, right? If you use MySQL replication or use Galera-based solution, it has had it much better in terms of, you know, preventing or avoiding conflict. Now, there are some people who still absolutely want to use active, active, you know, because they thought they are, have designed the application to avoid content, right? Or they want to make sure they can write to it, you know, two-sided at the same time for performance reason and they don't care about their data. These are probably the same kind of people who still run MySQL tables, right? But anyway, right? Now, if you want to fill on the risky side, there are some features you should be aware. To reduce amount of conflicts, right, reduce kind of errors, you can configure those different sides with different after-increment offset and after-increment increment. And also, there is this fancy slave execution mode, which is important, which allows MySQL replication to ignore a lot of data, oh, you know, just overwrite conflict and data with no warnings, right? So if you don't really care about data consistency, you know, then here is a recipe for you. Okay. So in terms of replication position information, that is another thing that MySQL gives you a lot of choices, right? You can figure out where the data will be stored in terms of position information, a repository where it's file or table. If you store that in file, then it typically gets very hard to synchronize, right? In case of power loss, right? So we can see in general trend, moving that information to the table with MySQL 8. Though for the years, that also would come with some performance issues. So the MySQL, MySQL slave replication is the most commonly used variant, right? As we saw, and we had a pool here with a raise of hands, right? So you don't have to just believe me, you can believe yourself. MySQL group replication I think is another, you know, up and coming option, which in my opinion is technologically is much more superior, right? Conceptually to the standard replication is essentially in our age of cloud, when you want to spend less and less time to manually, you know, maintain the system. So what is a group of replication overview? As I mentioned, that came after Galera replication. Well, that's not official, that's my belief, but that was inspired by Galera ideas and its success in the market. The great thing about that, the group replication, I think, is what it is built on the top of the MySQL standard replication. So you still bin log, GTIDs, right? It doesn't leave here kind of on the side. It's available as a plugin for MySQL 5.7 and above, and that is considered GA at this point, and that is very actively developed. We can see even in the new releases of MySQL 8, every release gets a lot of, you know, super cool improvements in the group replication. In terms of difference from MySQL replication, there is no more this kind of master-slave relationship. You can have a group of nodes, right? And all the groups kind of know about each other, right? And manage as a set, while there is, in the full configuration, one of the nodes, which is primary state in rights, another can be used for reads and for failure. There are a lot of consistency checks in this case involved, right? So for example, if nodes lost network connection, right? It will be removed from a cluster, cluster itself promotes other nodes to primary, right? So you don't have to have external, the external tools to do that and so on and so forth. So, and there is a lot of work for simplifying failover and so on and so forth. As I mentioned for rights, group replication configures by default a single primary. You can also configure that in a multi-primary setup, but it's not recommended because there are some features which don't quite work with it yet. Now, what I can see as the limitations in the MySQL group location, then there is no automatic node provisioning hopefully yet, right? So you still have to make sure the data is somehow magically restored to the same on all nodes before you kick it off. And there is still, you have a manual node recovery in a network failure, right? So if you had some network connection which was kind of disconnected for a few hours, right? A few hours, the node would not rejoin and re-synchronize that. But I am very happy to say what in the time I did this replication last time, there was also no way to prevent a stale read some nodes. Now it was fixed, right, in the recent release. So I'm glad what the list of my complaints is shrinking. Also with MySQL group replication, we have MySQL in ADB cluster, right? That is kind of more high-level package which also includes MySQL shell and the MySQL router for traffic, for traffic management. Finally, if you are not using MySQL cluster, right? And still, well, you know what? If I just still use classical MySQL replication, how to make it kind of more robust and less painful, right? There are a few tools come from community which should be aware about. If you are looking to automate or failover the leading solution right now I think is an orchestrator, should know that. If you're looking for read write splitting, traffic management, if there are lots of features, you know, again coming from community or open source, ProxySQL. And if you are looking to get, let's say, some sharding for MySQL, right? The VTS at this point is the leading solution. And that's all I have. Did I keep you happy, Fred? Of course. Wonderful. Yes. Okay. The questions. Any questions? Okay, you know, oh yeah, go ahead. Yeah, the question is in which case you should use Galera replication, should not go to the group replication. Well, I think what you may want to evaluate, right? Those solutions, what works for you or what doesn't, there are some other differences as well. In my opinion, in generally the Galera still has an edge in terms of its robustness, right? Obviously it was around for much more years, right? But I think the group replication has been catching up in this regard. How many nodes you can have if group replication versus with Galera? Well, honestly, I don't know quite if there is a specific limit, right? I mean, typically it goes into a practical limit. In group replication, that's nine. In group replication, that's nine, right? But I think it's unlikely you need much more than that, right? Well, thank you.