Welcome to FOSDEM again. I hope you had a great day yesterday and have enjoyed the talks so far. My name is Lenz Grimmer, I belong to Sun Microsystems; I joined the company last year through the acquisition of MySQL, and I've been with MySQL since April 2002, so I'm one of the people that have been around for a while in the company. This talk will be about MySQL high availability solutions.

I'd like to make a service announcement right away. We do have a devroom in building AW today, where we're having a number of MySQL-related talks and sessions, and due to some unfortunate timing we have a bit of a scheduling conflict right now: there is a talk about MySQL Cluster in particular which has just started. So if you're deeply interested in learning more about MySQL Cluster, I encourage you to move over to that talk right away, because I will just be scratching the surface of Cluster and not going too deeply into the technical details.

Also, compared to the schedule printed in the FOSDEM brochure and the schedules posted out here, we had to make some changes. I tried to update the printed schedule in some places, but it's probably best if you check the wiki page for the latest changes. We had one speaker who had to return home due to a family emergency, and, especially due to the recent happenings in the MySQL world last week, we decided to have a Q&A session where Kaj Arnö is going to talk about the things that happened at MySQL last week and what they mean. We want to get into a discussion with you: what do you think Sun should do with MySQL? We are going to explain what the happenings last week mean for MySQL and how we plan to go forward. So, if you're interested in MySQL as a project,
I encourage you to be in our devroom at a quarter past one today. Okay, that's it for the announcements; let's get going.

So, high availability solutions for MySQL. What I'd like to do is give you a general overview of the typical configurations: how the people we work with make their MySQL installations highly available. I'll start with a few basics, just to give an introduction to high availability, what the concepts behind it are, and some practical advice. Then I'm going to explain the various features that MySQL provides by itself, and the additional tools and technology available in open source to make a single MySQL server highly available, so you can really rely on it in case one of your systems goes down. I'll talk in a bit more detail about DRBD, which is a tool that is frequently used in these configurations; I'll mention MySQL Cluster, what it does and how it differs from the regular MySQL server; and I'll go over some other tools and applications that are useful in this context.

Okay, why does high availability matter at all? I mean, I learned it the hard way last week.
I have a single PC that serves my personal home page, blog and everything. It died, so I'm pretty much offline right now, and I don't have any high availability solution at home. I should have listened to my own advice. But of course it's also a financial issue.

The thing is, in IT something can break, and usually it will, at some point in time. There are various components in today's computers that do wear out over time, accidents can happen, disasters; there are various things that can really disrupt your service operations. Or, simply put: from time to time you may even want to update a running system to a newer version because there's a security flaw, because the server you're running is known to be vulnerable to a buffer overflow. How can you update the server without disrupting the ongoing services? So even for those cases it may make sense to consider having some kind of HA configuration that allows you to perform such tasks.

Downtime is expensive, especially in today's world where everything is connected. Websites serve customers who want to place their orders; once the site is down and people can't place or dispatch their orders, they look elsewhere.
There are plenty of alternatives, and, well, your boss won't be happy about that. So if you are in charge of maintaining some kind of infrastructure that serves a public service, make sure that high availability is high on your agenda; otherwise visitors won't come back.

Plan for this in advance. Really: when you start configuring and designing your infrastructure, make sure that the high availability aspect is considered from the beginning, because once your system is up and running it will be much harder to implement such changes and make it highly available without, again, disrupting service in order to make those changes.

In particular, we will be talking about high availability clustering, which means we provide redundant components that make sure that, in case one of the components or services goes down, there's another one that can take over the same services immediately, so that the user or the application does not really notice what's going on in the background. The typical scenario is that you have multiple boxes, configured so that they can switch IP addresses quickly. A service that used to point to the IP address of machine A can simply be redirected to machine B or machine C and get the information from there if needed. And it's possible not only to switch the IP address: you can take down an application on one machine and start it up on another, taking over the data that was being accessed on the other system just before.

There are a few terms you should be aware of: failover versus failback, for example. Failover is what happens when one machine goes down and the services fail over to the backup system. Failback means that once the crashed system is up again, the services move back to the original system. In some HA scenarios, the backup machine might be a bit lower in performance and is just there for the eventual case that something goes down, so that the service doesn't get disrupted completely but can still be served, maybe at a lower performance level. Once the original, main system is back up, you fail back to it, so you have full performance again; the backup is just helping out in case it's needed. The term switchover is usually used in cases where the administrator knowingly and willingly switches the services over to another machine, for example because he has to take one down for maintenance of some kind.

Clustering is also a term used in high-performance computing, where you put thousands of boxes in a rack and they all crunch on the same problem in parallel. That has nothing to do with high availability; it's really about scaling a job across many nodes, and high availability clustering is not particularly designed or aimed at providing that kind of service. It's also not designed to give you additional throughput. So even though it is possible in some scenarios to use an HA cluster also as a load-balancing cluster, you should really consider keeping these two functions separate, because high availability and high performance are two different things. If you depend on your cluster to deliver the services because all machines are active, and all of a sudden one of the systems goes down, you can't deliver the data at the performance level you still need. There's really no point in misusing your HA cluster for high performance, or your high-performance cluster for high availability. So there are a few things you should keep in mind about that.

This is a very popular slide; it shows what high availability means in terms of downtime. Usually people want to get as many nines as possible. People say: yeah, we need high availability with five nines, which would mean that you have about five minutes of downtime in a year. So the service is running nonstop, all the time, except for an accumulated downtime of about five minutes. You can get
there, but it gets very, very expensive. So you really have to consider: how much downtime can you afford, and how much availability do you want to achieve? This slide gives you a few examples of what people usually aim at. Data centers can live with roughly eight hours of downtime within a year; banks usually have harder requirements; and telecommunications and military users want even higher availability. (Five nines, by the way, works out to about 5.26 minutes of downtime per year.)

SPOF, the single point of failure, is the single component in your configuration that causes the entire system to go down if it fails. What you want to make sure is that you don't have any single point of failure which would disrupt the entire system. So you go through your systems and make an analysis. Some of the components that usually should be on your list are hard disks: since they are mechanical devices they can wear out, and vibration can kill them. But you should also consider what other components of your systems may be vulnerable to breakdown. Network cables: maybe someone tripped over the cable, the connection is down, and the system isn't reachable anymore. If you have an application with a memory leak, it starts to eat up all the memory and your system comes to a grinding halt because it starts swapping like mad; in that case you might want to consider switching to the other system until you have resolved the problem. Power supplies are also pretty notorious for blowing up at some point, and usually, if you buy servers nowadays, they have redundant power supplies that can be swapped on the fly; the same is true for more and more components in today's PC hardware. So make sure that your systems provide the redundancy you require and that all the components you depend on are available in at least two, if not more, copies.

The software that provides high availability usually consists of a few components that take care of certain tasks. Usually what you have is some kind of heartbeat application. This tool basically makes sure that all the nodes of your cluster, of your high availability system, are alive: it makes sure that the applications are up and running, giving the correct answers, and that the system is running properly, and the nodes check each other to see whether the other partner in the configuration is alive as well. Then, depending on the outcome of what the heartbeat application discovers, you usually have one controlling application, called the HA monitor, that performs particular tasks or actions. So if the heartbeat realizes that one of the nodes is down, the HA monitor performs the steps required to migrate the application and the IP address to the remaining nodes and makes sure that everything is under control again.

In most HA scenarios you have a shared storage device, like a storage area network, where all your data is stored. The application servers in front are redundant, but they all connect to a central storage where they access the data that was just operated on by the previous node. An alternative is to add even more redundancy: you use replication of some kind to make sure the data is copied instantaneously to another storage subsystem, so that in case your central storage goes down you still have a replicated copy somewhere else. I'm going to discuss this a bit more. A shared storage resource like a SAN can of course also become a single point of failure. Usually these SAN devices provide lots of high availability functionality by themselves, but in case your whole data center runs out of power, or something like that happens,
This doesn't help you very much. So depending on your requirements, you may want to consider having backup storage somewhere else that really makes sure you're not getting into trouble if your primary storage system fails.

Split brain is a term I'm going to talk about a bit more in a moment. This is a situation that can be dangerous if you have a centralized storage device, because all of a sudden it may happen that several applications try to access your data at the same time, and depending on how your application is written, it may not be able to cope with that; it can of course also cause significant trouble on the central storage system. There are techniques that help to avoid such situations, but it's something you should keep in mind: split brain is something that can happen in certain configurations, and your system should be able to cope with it. And of course, if your storage is somewhere on a network, you have some overhead when it comes to reading and sending data over there, versus having it on local disk drives on your node.

So what you can do is replicate your data and copy it to a separate storage subsystem somewhere else. Here you have to decide whether you want to make this a synchronous operation, which means that every change, every file stored on that storage subsystem, is synchronously copied over to the other side, and you really know that the copy is complete before you continue with your operations. On the other hand, you can use asynchronous replication, which means that the data is copied in the background and you don't get confirmation that the operation has completed while your application proceeds. This is a bit easier to handle, provides lower overhead and usually gives you higher performance, because you don't have to wait for the synchronization to happen. But in case of a crash of your primary system, you can't really be sure that every piece of data that was modified had reached the other side before the system crashed, so you may face some data loss, depending on how fast the replication works.

Some more about the split brain I just mentioned. A classical scenario: you have a high availability configuration with two systems, and they heartbeat each other. They are both alive and happy and think everything is good, and all of a sudden the connection used for the heartbeating dies. Both are still alive, but they can't reach each other anymore, and each might conclude: okay, the other one is not reachable anymore, I suppose it's now my task to take over all the jobs. Each starts to assume it is now the only remaining node in the cluster and starts to activate the applications, and if you have shared storage, the node on the other side may try to do the same thing at the same time. This can really lead to bad things. What you need to do is make sure you have some kind of so-called fencing system, or a third system that acts as a moderator and can tell the two nodes: hey guys, even though you two can't talk to each other anymore, I know you're both alive, so keep doing what you're doing right now. In layman's terms, fencing does it even harder:
It simply turns off the other system, so that it really can't do any damage anymore until the administrator has resolved the problem.

Some general rules. Well, prepare for failure: we simply have to assume it's going to happen at some point, so you'd better make sure you know how to handle it. These preparations can start with simple rules for the administrators, which switches to use or which cables to plug in somewhere else if you have a manual failover system, or with how to define your automated high availability solution. And of course, consider what data is important to you and what part of your system should be highly available. Is it necessary to have a redundant copy of the operating system that is always in sync? Is it enough to just synchronize the data directories that are constantly changing? There are a few aspects you should keep in mind and consider here.

Very important: don't over-engineer the whole thing. Keep it simple, so it's easy to grasp and to see what it does. Stick to established components. I know it's tempting to start writing your own HA system, creating cool scripts that do these things, but I'm not sure that makes sense when there are so many existing solutions available. So try to lower complexity as much as you can, try to automate it, and of course: test it. Don't wait for the disaster to happen before you realize that your failover solution is not performing the task it should. Every change should be followed by a repeated test to make sure the modifications don't cause any mayhem. If your HA system is properly designed, this usually shouldn't cause any service disruption, or at least only very little, compared to what would happen if you don't test your changes and the system doesn't fail over when a crash occurs.

Okay, let's get started with MySQL-related things. MySQL provides a built-in mechanism that allows you to replicate data from one MySQL server to another; it's called MySQL replication. It's a one-way replication system: you have one server, and another one is able to get all the changes from it and apply them locally. So you have one master, but it's possible to distribute these changes to many slave servers in parallel; you're not limited to a one-to-one setup here. You can use what we also call fan-out replication, where one single system fans out the changes to many others.

The whole system is asynchronous, which means you can't be sure that all the updates, deletes and so on which happened on the MySQL master server have also been applied on the slave at the same moment. Depending on your network connection and the amount of changes being replicated, the slave can lag behind, and that's something you should keep in mind.

The master records all the changes that are happening: all the SQL statements that modify data in any way are stored in a binary log file, a special-format file which usually contains the SQL statements themselves. That's why we call this statement-based replication: the MySQL server really records the SQL statements that were issued to modify data and puts them in its log file. The slave takes these statements and simply repeats the commands that were performed on the master, to update its local copy of the data accordingly. Starting with MySQL 5.1, we have also introduced something we call row-based replication. In this case it's not the statements that are replicated, but rather the data that has changed: the actual rows that changed in the tables are copied over to the slave. This has some advantages and some disadvantages that I'm going to talk about on a later slide.

The cool thing about MySQL replication is that it's really very easy to set up; we have it documented in the manual. You just need a second MySQL server to act as the slave: configure it, point it to the master, and tell it, okay,
this is where the master's data is right now, start replicating. And this usually works pretty reliably.

The thing you should keep in mind is that the replication slave performs all the updates using a single thread. So in comparison to the master, which can process many queries in parallel, the slave replicates these changes in a serialized fashion, and it can take more time to process all the events that are happening on the master. As funny as it may sound, in some cases it may make sense for your slave server to be a bit more performant than your master server, because of this particular problem.

However, all the MySQL server really provides is the replication. We don't take care of automating the failover; there is not really much of a high availability solution built into the MySQL server. It just replicates, assuming the master is alive. So if you want to use this system for a high availability configuration, you have to perform some additional steps and make use of some additional tools that I'm going to cover now.

This is just a graphical overview of how the replication system works. On the left-hand side you have the MySQL master, the primary server receiving all the data-modifying statements like UPDATE, INSERT and DELETE. It applies these changes to its local database and also stores all the changes in its binary log file. On the right-hand side, the MySQL slave simply connects to the master, polls it for all the changes that have happened since the last time it connected, and stores all these changes in its local relay log file before it starts applying them using the SQL thread. And as you can see here, the MySQL slave itself can also store these changes in its own binary log again, which allows you to daisy-chain servers, for example.

Some more words about statement-based replication. This is the system that has been included with MySQL for a very long time; I think it was MySQL 3.23, or even earlier. The log files are quite small, since the only thing we're logging is the SQL statement itself. That also allows you to use the binary log file for other purposes, like auditing: to make sure that the application isn't performing queries it's not allowed to, or to check what kind of data-modifying statements your users have issued. There's a tool that is part of the MySQL distribution, called mysqlbinlog, which converts the binary-encoded log file back into plain-text SQL, so you can look at it. And the replicated tables don't really have any requirements when it comes to having primary keys or anything like that; it just replicates the data as it is.

Unfortunately, this approach has a few downsides. Most importantly, if you use SQL functions that are non-deterministic, or if you use user-defined functions, you may have to watch carefully whether these are replicated properly. For example UUID(), which creates an ID unique to that particular SQL node, can't be replicated safely: the UUID would be different on the replicated side. And there are a few other functions like that.
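As an aside, the "point it at the master and start replicating" setup mentioned a moment ago might look roughly like this in practice. This is only a sketch; the host names, password and log coordinates below are invented for illustration:

```sql
-- On the master (its my.cnf needs: server-id=1 and log-bin=mysql-bin):
CREATE USER 'repl'@'slave.example.com' IDENTIFIED BY 'secret';
GRANT REPLICATION SLAVE ON *.* TO 'repl'@'slave.example.com';
SHOW MASTER STATUS;   -- note the current binary log file name and position

-- On the slave (its my.cnf needs a distinct server-id, e.g. server-id=2):
CHANGE MASTER TO
    MASTER_HOST='master.example.com',
    MASTER_USER='repl',
    MASTER_PASSWORD='secret',
    MASTER_LOG_FILE='mysql-bin.000001',   -- values taken from SHOW MASTER STATUS
    MASTER_LOG_POS=98;
START SLAVE;
SHOW SLAVE STATUS\G   -- Seconds_Behind_Master shows the replication lag
```

If this slave should itself feed further slaves in a daisy chain, it additionally needs log-bin and log-slave-updates in its own my.cnf, so that it writes the replicated changes to its own binary log.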
We have a whole section in the manual that describes which functions are replication-safe and which you should handle with care. But in general, most users still prefer statement-based replication, because it's proven and easier, or more transparent, to work with.

On the other hand we have row-based replication (in 5.1 you choose between the two with the binlog_format server variable), which means that the master server performs the SQL statements and then logs all the rows that have been affected: every row that changed is included in the binary log file. This is the technique that most other database management systems use as well. It makes things easier on the slave side, because the slave can just take the rows as they are and pipe them into its local tables without much consideration or processing work. It does of course mean that the binary log file can grow significantly, depending on how many rows are changing. If a statement changes just a single row, this doesn't really affect you much; but if you have massive updates that touch entire tables, every single row needs to be replicated, so in some cases this really increases the size of the log files. That's something to keep in mind if you use the binary log files for performing backups and want to restore from them: you have to consider the space that is needed. There is also a restriction that with row-based replication, tables need to have a primary key, but depending on your database design this might actually not be much of a problem.

And the replication slave may return different results for queries. If an ongoing row-based replication is updating a lot of rows on the slave and you run the same SELECT statement there, then, since the slave hasn't fully caught up with all the rows yet, it may return just a subset of the results that would be returned on the master server.
So in that case, this is something you should keep in mind if you're using replication to scale reads, for example if you have many MySQL slave servers serving your application's read statements: a slave may be behind, so with row-based replication the results may differ if it hasn't fully caught up yet.

This is an overview of common topologies for setting up master-slave replication. On the top left you have one master, one slave; that's the standard configuration. On the top right you see one master that replicates to two slaves in parallel. These slaves are really independent of each other: if they are built from different components and have different performance characteristics, one slave may be further behind than the other, and they don't care about synchronization between themselves. As I said, you can daisy-chain them: one master replicates to one slave, which in turn replicates to two other slaves. Something that unfortunately isn't possible is having two masters replicate to a single slave; such a fan-in configuration is currently not supported by MySQL replication. It's also possible to do circular replication, so that you have two MySQL servers that are each both a master and a slave of the other. In theory you can then perform inserts and updates on both sides and they are replicated to the other side, but this configuration has a few things you have to keep in mind; I think I have a slide about this later on. And you can also configure more than two servers in a ring, where statements are replicated in round-robin fashion, but that has the same caveats that master-master replication has. Okay, more about master-master.
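One practical precaution worth mentioning here, although it isn't on the slide: if both nodes of a master-master pair ever generate auto-increment keys, two server settings can keep the generated values from colliding. The file below is only an illustrative sketch; the server IDs and file layout are assumptions:

```
# my.cnf on node A
server-id                = 1
auto_increment_increment = 2   # both nodes count in steps of 2
auto_increment_offset    = 1   # node A hands out 1, 3, 5, ...

# my.cnf on node B
server-id                = 2
auto_increment_increment = 2
auto_increment_offset    = 2   # node B hands out 2, 4, 6, ...
```

Even with this in place, the general advice still applies: direct all writes to one node at a time.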
So as I said, you have two MySQL servers, each both a master and a slave of the other at the same time, which is a very popular configuration for MySQL high availability setups. Something you should keep in mind here is that this setup does not really help you share your load, even though it is possible to issue writes on either side. Since they have to replicate each other's changes, both nodes are doing both jobs: every update happening on the one side, plus the updates happening on the other side, has to be processed on each node. So this doesn't really give you any improvement in distributing the load.

In practice, you usually configure them so that both could accept writes, but the application only performs writes and data modification operations on one of the two nodes, and in case of an emergency you can fail over to the other one. Because if you start sending writes to both systems, it's really hard to tell which of them is the authority, especially if they are lagging behind and replication hasn't finished yet; if one of the nodes then goes down, you may end up with data in an inconsistent state. So if you're using this configuration, writes and updates should only happen on one node at any given time. If you want to distribute write load with MySQL, what you should look at is what we call sharding, where you have several MySQL servers which each serve a different part of your data, and you use either MySQL Proxy or some logic in your application to distribute the load, based on a key or a hash, say, across the different MySQL servers.

Okay, what happens if the master fails? Nothing. The slave simply stops receiving updates from it; it tries to connect, and if it can't, it just waits and sits there, while happily answering queries at the same time. The replication just stops. If the slave fails: again, nothing will really happen.
The master will happily continue updating its binary log; it's just that the slave isn't following anymore. And when the slave comes back up again at a later point in time, it will simply replicate all the changes that have happened on the master in the meanwhile.

So this is not an HA solution yet, simply because we're just doing replication; we're not taking care of failing over automatically or anything like that. What you want to do is combine MySQL replication with some other solution. Usually, in the Linux world, you would use Heartbeat, which is part of the Linux-HA project. It allows you to move IP addresses between systems, and it also allows you to start or stop the MySQL server using scripts that are executed in a failover situation.

Load balancing in this context means that you can use the replication slave for distributing read load: if you have lots of SELECTs, the slave can happily answer these as well. A replication slave is also very useful if you want to perform backups without disrupting or disturbing the master's operation. Depending on which backup technology you use, it can affect the performance of the MySQL server; mysqldump, for example, if you're using MyISAM tables, will lock the tables until the backup has finished, so your application may not be able to proceed until the backup is done. If you do that on a replication slave instead, the slave can even be shut down for the duration of the backup, and it will just catch up with the master once it's back up and running again.

Failing back is tricky, though. You have to make sure that your tables are in sync once you've had a failover and are ready to fail back to the now-restored master server, so take a look at whether your tables are consistent and intact. We don't have anything built in that resolves conflicts that may have occurred in your data after a failover, but there are lots of scripts that take care of that, and I'm going to give you a few links about this.

Some more about Linux-HA, also called Heartbeat (oh, there's a typo on the slide). The common configuration is two nodes, but you can also set up more; I think the Linux-HA project refers to configurations with 16 nodes in a monitored HA cluster. The fencing mechanism used by Heartbeat is called STONITH: shoot the other node in the head. It basically means you have a device, controlled by the cluster, that can switch off the power of a certain machine, so if one of the nodes is acting up, it is simply switched off. Linux Heartbeat allows you to create policies that, depending on the current situation, may result in different actions. These can also depend on the time: for example, a failover may look different during the weekend, or during daytime when the system is very busy. All this can be configured. It ships with support for quite a lot of applications, including MySQL; it has a GUI tool and very low dependencies on external applications and libraries, and it can perform failovers very quickly. And this is how it usually looks: you have a MySQL master that replicates to a MySQL slave, you may have additional slaves also taking part in the replication, and Heartbeat makes sure that the applications only access one of the systems, via the virtual IP address that is moved over in a failover situation.

Another system that is commonly used to make MySQL more highly available is DRBD, the Distributed Replicated Block Device, which basically says it all. Others refer to it as RAID-1 over the network: you have two systems which both have local hard disks, and one of the systems is the primary server.
The other one is configured as the secondary. Every change to the blocks on the primary's disk is replicated to the secondary system, so you always have a one-to-one copy of the hard disks that are in the primary system, so to say. Recent versions of DRBD can be configured to do this replication either synchronously, meaning that once you commit and save your data you can be sure the copy on the secondary has been updated as well, or, if you have lower requirements for your system, asynchronously, so that your local application doesn't have to wait for the commit to complete on the other side and can proceed. But then, of course, you again have to make sure you can live with some data loss in case the replication hasn't finished. If one of the nodes goes down and is later brought back, it automatically resyncs all the changes that happened on the other side in the meanwhile. And since it operates on the block level of your operating system, it doesn't care about the application on top of it.
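As a rough sketch of what such a two-node setup looks like in DRBD's own configuration file: the synchronous mode just described corresponds to DRBD's protocol C. The host names, devices and addresses below are invented for illustration:

```
# /etc/drbd.conf -- illustrative two-node resource
resource r0 {
  protocol C;                 # C = synchronous; A and B are the asynchronous variants

  on alpha {                  # the primary node
    device    /dev/drbd0;     # the virtual block device applications use
    disk      /dev/sda7;      # the backing local partition
    address   10.0.0.1:7788;
    meta-disk internal;
  }
  on bravo {                  # the secondary, the hot standby
    device    /dev/drbd0;
    disk      /dev/sda7;
    address   10.0.0.2:7788;
    meta-disk internal;
  }
}
```

Applications on the primary then use /dev/drbd0 like any ordinary block device, and DRBD mirrors its blocks to the peer.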
The application writes its data into the file system, and DRBD takes care of replicating just the blocks that have changed, regardless of what the application is doing. It can also mask local I/O errors: in case the local hard disk on the primary fails, it is still capable of getting those blocks from the secondary if needed, so you don't have to fail over immediately in the case of a disk corruption. The common configuration is that you have one primary, which has mounted the file system where all the changes happen, and the secondary is a so-called hot standby that just performs the replication and does nothing else in the meanwhile. The problem is that, since the changes are being made on the block level of the disk, you can't really mount a file system whose blocks constantly change underneath it. There are only a very few file systems, like GFS or the Oracle Cluster File System, that are capable of handling this kind of situation; your usual file system like ext3 or XFS would become severely corrupted if you tried to mount the secondary file system while it is still being replicated. DRBD is open source; it's part of the Linux kernel, and you can get it from the link down there. In the MySQL scenario, what you would do is configure the MySQL server to put its data directory on a device that is being replicated using DRBD.
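Pointing the MySQL server at the replicated device is mostly a matter of the datadir setting. A hypothetical my.cnf fragment (the mount point is an assumption for illustration, not from the talk):

```
# /etc/my.cnf on both nodes; /drbd/mysql is a hypothetical mount
# point for the DRBD-backed file system
[mysqld]
datadir = /drbd/mysql/data
```

On a failover, the standby would then roughly promote itself with `drbdadm primary r0` (assuming a resource named r0), check and mount the file system, and start mysqld; cluster managers like Heartbeat can automate exactly this sequence.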
So MySQL updates its local tables and stores all the data in its file system, and DRBD takes care of synchronizing every change to the passive node, as you can see over here. In the case of a failover, the first thing that has to be done is to mount the file system and perform a file system check; then mysqld can be started, access the data on the replicated copy, and continue as normal. So the MySQL server on the standby is usually only started in the case of a failover; you don't use MySQL replication in that case. One downside of DRBD is that it's available on Linux only; there are solutions on other operating systems that perform similar tasks. OK, MySQL Cluster is a completely different beast. I have to speed up a bit, I'm running out of time again. MySQL Cluster was added to the MySQL server, I think, two or three years ago. We acquired a company, a spin-off from Ericsson, that created this cluster system as a telecommunications solution, so MySQL Cluster isn't really an integral part of the MySQL server. What it does is create so-called data nodes, which are processes that run on separate systems. They take care of distributing data equally between the nodes, of keeping the replicas consistent, and of making sure that you have redundant copies in the various nodes. They synchronize those changes synchronously between the data nodes, and the MySQL Cluster nodes take care of the failover situation by themselves.
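The data node layout just described is defined in MySQL Cluster's config.ini, which is read by the management server. A minimal hypothetical sketch (host names are placeholders, not from the talk):

```
# Hypothetical config.ini for the ndb_mgmd management server
[ndbd default]
NoOfReplicas = 2          # keep two copies of every table fragment

[ndb_mgmd]
HostName = mgm.example    # management node

[ndbd]
HostName = data1.example  # first data node

[ndbd]
HostName = data2.example  # second data node

[mysqld]
HostName = sql1.example   # MySQL server using the cluster as storage engine
```

With NoOfReplicas = 2, each piece of data lives on two data nodes, which is what lets the cluster survive the loss of a node and resynchronize it automatically when it comes back.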
So that's already built into the system. Once a node has gone down and you bring it up again, it automatically integrates into the cluster again and synchronizes with the other nodes, so the redundancy is restored as fast as possible. It supports transactions, and there are various ways to connect to it. Some more about Cluster here. Up until recently, MySQL Cluster had one severe limitation, in the sense that all the data you wanted to store in the cluster needed to fit into the main memory of your systems; it wasn't capable of storing data on disk. This has been fixed in the meanwhile, with the only remaining requirement that your indexes have to be stored in memory; all other data of your tables can be stored on disk. It's not suitable for all applications. If you are designing a system using MySQL Cluster, you have to look at what kinds of queries you are issuing. Since all the data nodes have to constantly communicate with each other, there are a few query patterns that aren't really suitable for MySQL Cluster, especially if you have complex joins that involve fetching data from lots of nodes, or if you have table scans or range scans that scan a large part of your tables. It doesn't support foreign keys yet. So it depends on your data; it's a very specialized solution, and you should consult with us on whether MySQL Cluster would be an option for you. But it can also be combined with, for example, the MySQL replication system to make it even more redundant. Just to give you an overview: the MySQL server itself connects to the MySQL cluster and uses it as a storage engine. You may know that MySQL is capable of storing tables in different storage engines; the two most commonly used ones are MyISAM and InnoDB, which store tables locally on disk. In this scenario, the MySQL server simply hands the tables over to the MySQL Cluster processes, which run on separate machines and take care of distributing and storing the data. The interesting
thing here is that you can have several MySQL servers all accessing the same cluster nodes and all seeing a consistent picture of the data. So if you perform updates on the server on the left and then perform a SELECT on the MySQL server on the right, you will immediately see the changes that have happened in the meanwhile. In addition to using the MySQL server, MySQL Cluster also provides an API to talk with the cluster directly, so you don't have to use SQL or the MySQL client/server protocol; your application can talk with the cluster directly using our API. I'm going to skip that one. MMM is also a tool that is commonly used. It's particularly suited if you have two MySQL servers configured in a master-master configuration. It's an alternative to Heartbeat, a bit simpler: it simply changes the IP address and updates a few settings on the MySQL server, so it's really focused on this particular configuration. But it's pretty widely used, and for people who don't want to set up Linux Heartbeat it's a good alternative. Flipper is also a little script that performs a similar task, but it's not doing anything in an automatic fashion; this is really low-tech, so to say. Flipper just automates the process of switching the roles in a master-master configuration, and you have to do this consciously by issuing a certain command. But it's still quite helpful if you're an administrator and you just want to switch your application to a different MySQL server; Flipper may come in handy. The Red Hat Cluster Suite is also very popular in the meanwhile; many people use it for their applications. It has similar features to Linux-HA but can do a bit more, and it also supports some load-balancing functionality. It also has good support for the MySQL server.
So that might be something if you're using CentOS or Red Hat Enterprise Linux; it's very well integrated into them. Solaris Cluster, or Open HA Cluster, is a solution made by Sun, primarily for the Solaris operating system. We recently open-sourced the entire suite and are working on improving it. It also has excellent support for MySQL, and it provides replication to remote facilities and can be used in combination with storage area networks, for example. So if you are running on Solaris, the Solaris Cluster suite is something that you should investigate closely. And some more links; I think I mentioned many of these before, you can take them from the slides. Yep, we had these before. Oh yes, Maatkit is also a very popular collection of scripts and tools that you as a MySQL administrator should be aware of; they really simplify a lot of the processes of working with MySQL. And Continuent has just recently open-sourced their Replicator, which is another way of replicating data between databases. I haven't really looked deeply into it yet, but it sounds promising. OK, five more minutes left for questions. Any questions? Yes. So the question was: which of the limitations of replication are just a problem of the system itself and how it's configured, and which are limitations of the MySQL server that could be improved by fixing the MySQL server? I honestly can't tell you off the top of my head right now. There may be some limitations that we're working on fixing, and some are simply inherent to the system and how it works. If there is a particular one that you're interested in, I can try to find out, but right off the top of my head I don't have an answer for you, sorry. Yes. Right.
So the question was: MySQL master-master replication has a few issues that you should be aware of, and one of them, for example, is that if you issue updates to both servers at the same time, you have a consistency problem. Since it's asynchronous replication, you can't be a hundred percent sure that both servers are always in the same consistent state at the same time. So if your application distributes the writes to both nodes and one of them goes down, it will be very difficult for you to find out which of them has the real data. Yes, as long as you write to only one, it's pretty safe, considering that, you know, this is the main master server; it has the authoritative information, and the replica may be behind, but that's something you can check from the logs. No, it does not. The question was if it supports synchronous replication, and no, unfortunately MySQL replication is always asynchronous. If you need synchronous replication, you should take a look at DRBD and do it on the file system level. Any more questions? Yes. The question was: what is the practical use case for replicating from one master to one slave and then from there to several more slaves? Read scaling, right. So really, if you have more slaves, you can have an application that distributes the SELECTs to all the various slaves instead of just putting the load on the master. But I'm really missing the key point of why you would want to do it master to slave and then from there to several more slaves. If you have multiple data centers? Yeah, that would be one example. If you have a distributed system, say a company with several branches that need copies of the data, you can then distribute these into separate locations, for example. All right, then, no more questions. I hope to see you at the MySQL devroom for some more additional MySQL-related information, and thank you very much.