So the topic here is the myth of MySQL HA, which today means InnoDB Cluster, because this is the number one question: how do we tackle high availability and make sure the database is always online, the easy way? High availability in MySQL has been around for many years. By the way, do you know the age of MySQL? Twenty? Very close. Twenty-five? Twenty-seven? Too much. Yes, who said that? Twenty-four years old. So over all these years we have actually been doing very well at maintaining redundancy. We do replication: I have the data, I pass it to you; I have the data, I pass it to you. But a lot of this had to be done manually. Passing the data to the other server is fine, but what if I fail? I am the master, with two slaves. I crash, and replication fails, because I can no longer send data to the other two. Very often, in this case, one of the other servers has to be promoted to become the master, and the remaining server is told: talk to this new master and grab the data from it. But who does this? Somebody, you! It is not automatic. That is why we came up with InnoDB Cluster, to build this in. When I crash, someone is promoted automatically, and the other server handshakes with the new primary to get the data directly, automatically. When I recover, I rejoin automatically. You have more data than me, because you stayed live while I was dead; I come back with less data, so I ask for the missing transactions, and once I have caught up, I become online and rejoin the cluster. All of this is automatic. So what are the pieces? Before going into that, let me show you one demo. Basically, this here is just a screen.
To show you: I am accessing the database. I can select a value here, select 1 or whatever; it does not matter, it is just to show the connection is working. Then I select something else: which server am I connected to? It returns the hostname, my notebook, and the port number, 3310. The text is small, so I will read it to you: the port number is 3310. Now, what happens if that server crashes? Instead of crashing it, I take the server on 3310 offline and promote another server, on port 3320, to be the primary. Then I go back and run the same select again. Can you read it? Your eyes are good, I believe. What is the port number now? 3320. What does that mean? It switched. When the server went away, the switchover was automatic. You see that? That is the idea. If this were your application, would you even care that a database went down? It is all automatic. That is why we promote this as the number one option. And I can switch it back to 3310 the same way, instead of crashing the server; I just bring it back, run the select again, and the number returned is 3310 again. So let us look at this again. What I am going to tell you is: what are the secrets? Now we are in section two. In the beginner section we just talked about what it is; now we go into the HA models and how to make them work better. The next session after my talk will be about the NoSQL trend and what MySQL is doing there. And the last session, once everything is running, looks at performance: how we monitor and how we troubleshoot. So back to this section: we look at the basics, how we deploy, and what the magic is.
We will go over the basics with a first example. Things are not always a single data center. Sometimes we need production plus DR, what we call site failover: an earthquake, the building collapses, the machine room fails, all the power is gone, so another site, the DR site, has to come up. And without the data, the DR data center is useless. Data is a must. Without the data you know nothing about your customers; without the data your bank account has zero dollars. We would all be very poor, or rather all equal: no more rich and poor, because everybody has the same, no money. So this is the InnoDB Cluster vision: a single product, MySQL with HA and scaling features, where we can add more nodes, one, two, three, all automatic and easy to use. You saw my demo; it is easy to use. And it includes these components. For sure, multiple servers: one server cannot be HA, one machine cannot be HA, so we need multiple redundant servers, and the data has to be transported and exchanged among them automatically. We have the MySQL Shell, which I used in the demo to switch the primary. And there is the MySQL Router, which sits between the application and all the backend servers, like a proxy. The application always connects to the router, and the router knows which server it has to talk to. Now the basics. We talked about master-slave replication at the beginning, and we are talking about InnoDB Cluster in this talk. So what is replication? By default, people use asynchronous replication. It is like posting a letter. The data is written on its own server, and there is the binary log, a staging area for data that is supposed to be sent out. The data is committed only within that single server.
Then a messenger comes to pick up the data from the binary log and delivers it to the inbox of the other office, which we call the relay log. The relay log is just messages in a box: someone has to pick them up, transform them into SQL statements, and apply the data back to the database. Only after that, when our application reads from the slave server, can it see the data. That is why, when we write to server A, the master, and then try to read the same data right away on server B, server B may not have it yet. And there is another point: when server A crashes, the last minute of data may never get across, because it is still sitting in the binary log and the messenger has not come yet. So there is a risk that data is lost. There is also so-called semi-synchronous replication, half-half: when the data is written on my server and goes into my binary log, it is also shipped to the relay log on the other side, and only when the data has reached that relay log is the whole transaction acknowledged and committed. So when we commit, the data has to be here as well as over there, which means we do not have the data loss I described for asynchronous replication. Why call it half-half? Because I only deliver the data to the edge of the other box; it still has to be applied to that server later on. That is the first half; the other half is the apply step, after which people can see the data on that server. As I said, InnoDB Cluster is the magic that automates this whole process: sending the data, catching up on missed data, and promoting a new primary when there is a crash. For this we have Group Replication, which exchanges data among the members of the group.
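Semi-synchronous replication as described above is enabled through the semi-sync plugins; this is a minimal sketch of the classic setup (plugin and variable names per the MySQL 5.7/8.0 documentation; the timeout value is illustrative):

```sql
-- On the master: load the semi-sync plugin and turn it on
INSTALL PLUGIN rpl_semi_sync_master SONAME 'semisync_master.so';
SET GLOBAL rpl_semi_sync_master_enabled = 1;
-- Fall back to asynchronous if no slave acknowledges within 10s
SET GLOBAL rpl_semi_sync_master_timeout = 10000;  -- milliseconds

-- On each slave: load the counterpart plugin
INSTALL PLUGIN rpl_semi_sync_slave SONAME 'semisync_slave.so';
SET GLOBAL rpl_semi_sync_slave_enabled = 1;
```

With this in place a commit on the master only returns once at least one slave has the change safely in its relay log, exactly the "first half" of the half-half model.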
There is also the application connecting through the router to the backends, without needing to care which server is online or down, because it is all automatic. And there is the Shell underneath to maintain how it all works. And within the Enterprise Edition we have the graphical GUI to monitor it. No matter what, let us look at a deployment example: how this can help a company deploy these HA models so the database is always online and always there. Here is the example: at least three servers. Why do we need three? One, two, three, at least. Think of just you and me. If I crash, for sure you serve the data; if you crash, I serve the data. But what if the network in the middle crashes? Are you alive? If I can talk to you, the network is not down. But now the network is down: I cannot talk to you, yet somebody else can. They ask you: are you alive? You say yes. They ask me: am I alive? I say yes. So both of us are alive, and the application writes part of the data to me and part to you. Now the data is broken. That split-brain effect is why two servers are no good. With three servers, when the network really goes down between the pair over here and me, we have a majority: two is bigger than one. The two will say, we are alive, and keep serving; the one, the minority, will say, I am gone, and stop. That is why this is essential: majority voting, and three servers for deployment within one data center. The application connects through the so-called router, for transparent access: when one server is down, the router knows it is gone and connects to another server for the data. And what if we have another data center? One, two, three, four servers? That is where we use replication to connect the sites together.
So here you see that we put a router in the box, in the DR data center, DC2. The router takes care of the case where database one over here goes down: when one is down, the router knows, connects to another server, and keeps pulling the data. So it is intelligent enough to maintain the data stream from the production site into the DR site at all times, without anyone having to notice the failure and reconfigure another channel to pass the data into the DR. Again, automatic. Now let us dive a little deeper into the details of these configurations. There are quite a few things to look at; the slide is text-heavy, but it is just to give you the idea. There are settings for consistency. There are settings for the network: how we make sure the members can interconnect for the data exchange, separately from the people and applications connecting to us. A box may have two network cards, one internal and one external; that is why there is the local address and the so-called IP allowlist (ipWhitelist), which I will explain on a later slide. There is also network reliability: three options called expel timeout, auto-rejoin tries, and unreachable majority timeout, which together govern how we handle an unreliable network; they are on a coming slide. And there is member priority, which we call member weight. I am server one, with servers two and three. How can we say: I will always take up the work, people always write to me, and only when I fail do the writes go to the others? That is the priority, the member weight. With three servers, the third server can be very relaxed. In fact, some jobs are very heavy, like reporting, and we never want such a heavy job to run concurrently with the OLTP workload.
So we can put that heavy-duty job on the third server. Because we can assign priorities, the member weight, we can direct the main workload to the prioritized servers and put the heavy job on the lowest-priority one. There is also the exit state action: what should a server do if the network is cut and it finds itself isolated? Sit there idle, abort and shut down, or behave as if offline? We can set it to ABORT_SERVER or OFFLINE_MODE. Then there is the cluster's internal network. The application connects, as you know from MySQL, on a port like 3306, but there is an underlying network on which the members exchange data, and we do not want to expose that traffic on the application network. So we expose it on an internal network, and we need to define what that network is: the local address, plus the allowed subnets, the ipWhitelist. This tells the cluster where its interconnect talks. Do not just take the default values; the default is essentially everywhere. I also mentioned ABORT_SERVER. The red box is the primary; when one server is cut off, it leaves the group, and with ABORT_SERVER it eventually times out and shuts down. Is that good or not? I believe it may not always be good: you see the server is gone, and then you have the problem of deciding whether to restart it, or whether there was some other issue that caused the shutdown. If instead it just goes into offline mode, it stays up, and you can look at the logs and the warnings and tell what happened and why it went offline. That can be better than shutting down. That is why we have the exit state action. So what is the exit state?
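The member options above can be adjusted per cluster or per instance through the Shell's AdminAPI; a sketch in mysqlsh (JS mode), where the host names, ports and weight values are assumptions:

```js
// Connected to any cluster member inside mysqlsh
var cluster = dba.getCluster();

// Higher memberWeight makes a member more likely to be elected primary
cluster.setInstanceOption('root@node1:3310', 'memberWeight', 80);
cluster.setInstanceOption('root@node3:3330', 'memberWeight', 20);  // the relaxed reporting server

// Prefer staying up in offline mode over aborting when expelled from the group
cluster.setOption('exitStateAction', 'OFFLINE_MODE');
```

Routing the reporting job at the low-weight member is then just a matter of pointing that job at the read-only port of the router.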
When you leave the cluster, what state should you go into? That is the exit state action. There is also consistency, which matters when failover is automatic. How is that done? One server, the red one, is the primary. We write data on the primary, but we read on the secondaries. When we write on the primary, the data is passed to the secondaries, the other two boxes, where there is a backlog, a queue of transactions still waiting to be applied to the secondary server. I finish my write and pass the data to you, but you still have to write it back to your database. So what happens during this apply stage if the primary crashes and a secondary is promoted? Somebody connects to the newly promoted server and reads. The last two pieces of data, say the yellow and the green, were written on the old primary but are still in the queue; when I read, I may not be able to see them, because they have not been applied yet. That means I am not reading up-to-date data. What we want is for that backlog to be applied before the data is read, whenever it is read, and that is where the consistency level BEFORE_ON_PRIMARY_FAILOVER comes in. With this setting, whenever we fail over, new reads wait, automatically, until the promoted primary has applied its backlog: I get the data with consistency when I fail over. The setting exists at two scopes, session and global; put it into the global variables and it becomes the default value.
Set it up as the global default, and on server startup every session takes over this value; MySQL Router will take it as the default too, and when it fails over to another server it makes sure it connects to a server with all the data applied before passing the connection through. So BEFORE_ON_PRIMARY_FAILOVER is quite an important variable. And there are other options: BEFORE, AFTER, and BEFORE_AND_AFTER. What is BEFORE? BEFORE means that when my application queries the data, whatever was updated anywhere in this cluster has to be applied before I read it. So we can set a session to this consistency level when we need the actual, consistent data for our selects. And when I write data, I can set AFTER: I write, the write goes to server A, server B and server C, and only then does the commit come back. That behaves like synchronous replication. So MySQL has all these options; it is not simply async or sync, it is very flexible, per application: you design what to do. Then there is network reliability. The group members have a kind of heartbeat: are you here? Are you here? Five seconds with no reply, and I kick you out. Five seconds is the default. Is that long or short? It really all depends. Sometimes the network is interrupted and may take even 30 seconds to resume, and it happens often. Then I would set this expel timeout, the time before kicking you out, to 30 seconds, so that network interruptions do not expel members. Because within a company, an organization may say: our network is not as stable as others; we have some kind of interruption every week.
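In MySQL 8.0.14+ these consistency levels live in the group_replication_consistency variable; a sketch of the scopes and levels just described (the option names are the documented ones):

```sql
-- Global default: after a failover, the new primary holds new reads
-- until it has applied its backlog
SET PERSIST group_replication_consistency = 'BEFORE_ON_PRIMARY_FAILOVER';

-- Per session: this session's reads wait until all preceding group
-- transactions are applied locally (read the actual, consistent data)
SET SESSION group_replication_consistency = 'BEFORE';

-- Per session: writes do not return until applied on all members,
-- behaving like synchronous replication
SET SESSION group_replication_consistency = 'AFTER';
```

Because the scope can be per session, one reporting session can pay for strong consistency while the OLTP sessions keep the cheaper default.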
We do not know when, but we do not want disruption to our business, and every week we have this. So we may set the expel timeout to, say, 30 seconds; then within those 30 seconds the member still hangs around, because the heartbeat is still waiting for it. Could this timeout be set to one hour? One hour means: are you here? I can wait a whole hour before I know you are expelled. If you really, really crashed, the system waits for one hour before kicking you out so work can continue. So the expel timeout cannot be too long. I have to decide: maybe two minutes of downtime is acceptable, or 30 seconds. It is a judgment call. What is the limit? I believe 30 seconds to two minutes is a reasonable range, maybe. Some people keep the default of 5 seconds; on a good network, 5 seconds should be enough: no response in 5 seconds, kick you out. Good. But certain networks are not like that, right? So most likely it should be less than two minutes: if you are not responding within two minutes, go away. That is the expel timeout. Then there is auto-rejoin. Consider one server going down because of the network during the weekend. During the weekend, in fact, you do not care, because two servers are still running; one is offline, two are running, and you do not want to come back to the office to fix it, you will wait until Monday. But at some point during the weekend somebody fixes the network. If that server is good and intelligent enough to reconnect by itself, then I do not need to do anything. That is the rejoin retry. The default is zero: it never retries to rejoin.
If we set it to, say, 12, it means: every five minutes, it comes back and tries to rejoin. If the network has recovered, the member returns to the cluster. Twelve tries means one hour. So if we are talking about three days, say a long weekend, what value do we set? Three times 24 hours times 12 tries per hour, and that is the value: for three days the server keeps retrying every five minutes. Those are the values, okay? And there is also one called the unreachable majority timeout. Imagine you are part of the group, responsible for agreeing on the data when a transaction comes to you, and at that point the network fails. The commit is trying to ask: are you there? Do you agree on this data? But your network is down: the others cannot hear you, and you cannot hear them. So the others form the majority, and you are the minority. What happens to the minority? By default this value is zero, which means hang: you just hang there until somebody looks at you. Is that good? Meanwhile the majority side, after the expel timeout of five seconds, kicks you out and promotes a new primary. So the majority has a primary, but you also still think you are a primary: two primaries, except yours is hung. That is what the unreachable majority timeout is for. We might set it to, say, two minutes: after two minutes the hung transactions come back with an error and roll back, and the minority member, knowing the others are not coming back, turns its state to ERROR.
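The three network-reliability settings discussed above correspond to Group Replication system variables; a sketch with the illustrative values from this section (30 s of tolerance, retrying every five minutes for roughly three days, and a two-minute minority timeout):

```sql
-- Tolerate up to 30s of silence before expelling a member
SET PERSIST group_replication_member_expel_timeout = 30;

-- Retry rejoining every ~5 minutes; 3 days * 24 h * 12 tries/h = 864 tries
SET PERSIST group_replication_autorejoin_tries = 864;

-- A member stuck in a minority partition errors out (rolling back
-- pending transactions) after 2 minutes instead of hanging forever
SET PERSIST group_replication_unreachable_majority_timeout = 120;
```

SET PERSIST also writes the values to mysqld-auto.cnf, so they survive a restart, which matters for exactly the unattended-weekend scenario described here.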
Once it turns to the ERROR state, it retries: after five minutes, am I good to go back? And so it goes back into the rejoin process. These settings are a good lens on how to protect the servers and automate the process: how to retry, cope with a network that may or may not be stable, and still keep our servers reliable. They are very important. And in InnoDB Cluster we add new features from time to time. Just this July we added one more feature called clone. It is quite easy: when we have one or two servers and a third server comes in as an empty database, with nothing, what clone does is copy my data to you, just like a backup and restore. So in the MySQL Shell we add the instance with the recovery method set to clone, and all the data comes back across, everything, even the user accounts. By default your root user has an empty password; when I clone my database to you, my user passwords become yours, and the empty password is no longer valid on that server. So that is the demo side. The other topic is performance: how we handle the data being written back onto the slaves. Here we have the master; data comes in, I pass it to you and to you, and on your side it has to be applied back to the database by what we call the SQL applier. We have algorithms to make the applier work in parallel; this is the parallel type. The default is by database, and we can change it to logical clock, which decides who can run first and who can run second by timing. By default, what can run in parallel is decided by database, and the first line on the slide says just that: database here means schema.
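Clone-based provisioning is exposed through the AdminAPI's addInstance; a minimal sketch in mysqlsh (JS mode), where the host name and port are assumptions:

```js
// Connected to an existing cluster member inside mysqlsh
var cluster = dba.getCluster();

// The new, empty server receives a full physical copy of the data
// (schemas, users, passwords and all) via the clone plugin
cluster.addInstance('root@node4:3340', {recoveryMethod: 'clone'});
```

With recoveryMethod left at its default of 'auto', the Shell prompts or decides between clone and incremental (binlog-based) recovery depending on how far behind the joining server is.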
So basically, each database can run on a different thread. If we have applications writing to databases DB1, DB2 and DB3, then the transactions of DB1 are independent of DB2 and DB3, so they can be executed concurrently, parallelized by database. That is the default. But in many cases we do not run multiple databases that parallelize this way; very often the application is written against one single database, or multiple databases that depend on each other. So what we do is change the parallel type to logical clock and set a parameter for the number of workers applying the data: two, three, four, depending on your I/O and your CPU. The more workers you put in, the more incoming data can be executed at the same time and written back to the database. That is how we make it work faster: logical clock, more threads, and the commit order must be preserved; that part is handled nicely. And for the automated InnoDB Cluster, MySQL has prerequisites: GTIDs, and binary logs in a certain format meeting certain criteria. They are the binlog format, the binlog checksum, GTID mode, the replication repositories being stored in tables, and the write-set extraction, where we define which hash algorithm is used when we write. So I am telling you there are a lot of things that make it more reliable and make it run better and faster, right? It is heavy, which is why we have the automated MySQL InnoDB Cluster to get automatic failover and make it all work. And there is also the router configuration: besides the servers we have the router, and we need to take care of how many connections it handles and where we put its logs.
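The parallel-applier switch described above maps to a few replica settings (variable names per MySQL 5.7/8.0; the worker count is an assumption to size against your CPU and I/O):

```sql
-- Parallelize the applier by transaction timing instead of per-schema
SET PERSIST slave_parallel_type = 'LOGICAL_CLOCK';

-- Number of applier worker threads; size to your hardware
SET PERSIST slave_parallel_workers = 4;

-- Transactions must still become visible in the original commit order
SET PERSIST slave_preserve_commit_order = ON;
```

Note that the applier threads must be restarted (STOP SLAVE / START SLAVE) for a changed worker count to take effect.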
The log level can be INFO, and on Windows we can write to the event log, the Windows events; if you know Windows applications, you know the event log is the place people look. There are also the notifications. Say server A, B or C goes down, or someone switches over through a manual operation: things change, and the servers need to proactively notify the router: I have changed; next time a connection comes, talk to this one, talk to that one. This is the so-called use_gr_notifications option; it enables notifications for changes made in the server set, so the router knows right away, and whatever connections come in will know where to connect. Now here is another myth. Somebody will try to put the application on a thumb drive, a portable app; everything today is about portable apps. A thumb drive is most likely formatted as FAT, am I correct? And FAT has a characteristic: it has no file privileges; everyone can read and write everything. But MySQL Router has a key file that has to be kept and protected properly. So here is the error: when I run this, it complains that everyone has full access rights. Because there are passwords involved, and the key is stored by the router, we do not allow this key to be stored on a FAT volume. If you put it there, you will never be able to start the router. So this is one of the tricks: make it a portable app and you may see issues. Better to use NTFS if it is Windows, or on Linux ext2, ext3 or ext4, it does not matter. What is the most common file system on Linux? ext, yes, by default, correct. But on a thumb drive people do not use it.
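A sketch of the router options just mentioned, as a mysqlrouter.conf fragment (the cluster name is an assumption; option names per the MySQL Router documentation):

```ini
[logger]
level = INFO
# On Windows, log output can alternatively be directed to the
# Windows event log via the router's eventlog sink.

[metadata_cache:myCluster]
# Let the servers push Group Replication membership changes to the
# router instead of waiting for the next metadata refresh
use_gr_notifications = 1
```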
It stays FAT because Windows uses it, yes, correct. So that is why portable apps come with this kind of trick. And do you know what log rotation is? On Linux, and this is open source, people use Linux a lot, the logs have to be maintained. The application is always active and running, always writing to a certain file, the log file, and it keeps that file open. The file grows and grows; the application runs a month, two months, a year, non-stop, and you can never clean it up. So we need a way to make it stop, rotate the file, and clean up. Log rotation is how we trigger that. On Linux it usually uses a signal, the so-called hang-up signal, HUP. You send the signal with kill -1, and since this is open source and you are technical, I will give you the detail: sending kill -1 (SIGHUP) to the process tells it to close its log file and reopen it. That is the trick. On Linux, man logrotate and you will see how it is done; the router can work together with logrotate, so the logs can be renamed and then closed and reopened. All of this we drive with the Shell: we set options and maintain the cluster through what we call the AdminAPI, with the dba commands and the cluster commands, all working nicely, and it is a command line. Why a command line? Because a lot of the time we need to integrate this administration with other tools: we may use something else to monitor, a graphical GUI or otherwise, but we still need the admin CLI to make it interoperate with other products. So the AdminAPI is how we drive things.
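A sketch of how router log rotation might be wired into logrotate using SIGHUP as described; the paths and schedule are assumptions for a typical Linux install:

```conf
# /etc/logrotate.d/mysqlrouter (illustrative)
/var/log/mysqlrouter/mysqlrouter.log {
    weekly
    rotate 8
    compress
    missingok
    notifempty
    postrotate
        # SIGHUP tells the router to close and reopen its log file
        kill -HUP "$(cat /var/run/mysqlrouter/mysqlrouter.pid)"
    endscript
}
```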
We can set options, see options, set options on a specific server, consistency, failover, all of it within the Shell. With only four lines of this Shell AdminAPI you can create the cluster: bring it up, up, up, and join the three servers into a cluster. Four lines of code and they form a cluster. It is very easy. Okay, then there is also backup and recovery, and the question of efficiency. We have servers A, B and C: where do we back up? Do we always back up the very active, very busy server? No. We may prefer the third server, which is fairly idle, and back up from the idle, relaxed server, am I correct? The busy server keeps running the database and its other activity; we separate the load across servers, so the backup can be executed on the lower-priority server. The backup also has to run fast. Now, we have one writable server, read-write, and two read-only servers. When I back up from a read-only server, the backup still records its bookkeeping, and that write goes to the read-write server, telling the cluster the backup is done; the three servers stay consistent and all know the backup has finished. This information is maintained in the database, but the read-only server cannot be written to. So when I back up there, the backup software automatically knows this is a cluster and knows which server can be written: the metadata write goes to the writable server and is replicated back over here. It is all automatic. So that is backup. And then there are the configurations. There are three configuration files found in the MySQL world, and one is the general configuration where we define all the variable values.
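The "four lines" could look roughly like this in mysqlsh (JS mode); the host names, ports and cluster name are assumptions:

```js
shell.connect('root@node1:3310');               // connect to the first server
var cluster = dba.createCluster('prodCluster'); // bootstrap the cluster on it
cluster.addInstance('root@node2:3320');         // join the second server
cluster.addInstance('root@node3:3330');         // join the third server
```

From here, `cluster.status()` shows the topology, and bootstrapping a router against it is a single `mysqlrouter --bootstrap` invocation.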
That is my.cnf or my.ini. Then there are two other files. One holds the server name, the server UUID, and is stored in auto.cnf within the data directory. And in MySQL 8.0 we now have persisted variables, which are stored in mysqld-auto.cnf. So these two files indicate the actual server identity. With these three configuration files, if I have the data and you have the data, then when I restore onto a new machine and put back the three .cnf files under the same names, I just need to make sure the GTID set is the same. When we back up using MySQL Enterprise Backup, we know how to provision another new instance from that backup and bring it in sync with the cluster as a new member. The backup carries GTID information about the server. A GTID is like the SCN number in Oracle: it tells what is stored in the database and up to which point we can bring the data back, by SCN or, here, by GTID. This information, the GTID metadata, is kept within the Enterprise Backup. When we recover it onto the new member, the data is there, the GTID is there, and the three configuration files are there: this is a new server with the same set of data as the existing cluster members, and they can form the cluster again.
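When provisioning a member this way, it can be worth checking the identity and GTID position by hand; two illustrative queries:

```sql
-- The server identity that auto.cnf stores
SELECT @@global.server_uuid;

-- The set of transactions this server has applied; a freshly restored
-- member should report a subset of the cluster's executed GTIDs
SELECT @@global.gtid_executed;
```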
Okay, so if you want the new server to keep the same identity as the existing one, put back the three configuration files. If you think of it as a brand-new server, remove auto.cnf: auto.cnf holds the UUID that is the server's name, and if you just delete it, a new one is generated on startup. "Oh, you are a new member, I don't know you, because your name is ABC and my previous name was XYZ." Delete auto.cnf and you get a new name.

So these are the things we need to protect: the three configuration files, and also the GTID state.

Sometimes when I work with customers, people try to use RESET MASTER. Have you heard of RESET MASTER and RESET SLAVE? They throw away the tracking of all the transaction IDs. "Forget it, everybody is the same. Clear your mind, clear your mind, clear your mind. You're empty, you're empty, you're empty." And when everybody is empty, they are the same, right? Empty database, empty database, empty database: all equal. But RESET MASTER only logically tells each server "you know nothing"; the data is still stored in the database. Sometimes I visit a customer and they tell me that a few weeks ago there was some problem, on this server and on that server. They don't know what the problem was, and the way they want to fix it is: clear your mind, clear your mind, clear your mind. You know nothing, you know nothing, I know nothing, and now we are all equal, because we all know nothing, so replication can be brought back up. But in fact each server still has the problem internally, and when we hit the problem again, it is still there. Superficially, on the surface, the servers are equal; at the bottom, they are not. Many times we go out to customers and see the problem "fixed" by covering up the facts, by removing all the tracking details: you know nothing, you know nothing, I know nothing. So just be aware: don't do it.
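The "clear your mind" trap can be shown with a small Python sketch. This is a hypothetical model of my own, not server code: RESET MASTER clears only the GTID bookkeeping, never the rows, so two servers can look identical at the GTID level while their data still diverges.

```python
# Toy model: a server is its stored rows plus its gtid_executed tracking.
# RESET MASTER wipes the tracking; it does not touch the rows.

class Server:
    def __init__(self, rows, gtids):
        self.rows = set(rows)             # actual stored data
        self.gtid_executed = set(gtids)   # logical transaction tracking

    def reset_master(self):
        self.gtid_executed.clear()        # rows are untouched!

a = Server({"A", "B", "C", "D", "E"}, {1, 2, 3, 4, 5})
b = Server({"A", "B", "C"},           {1, 2, 3})

a.reset_master()
b.reset_master()

# After the reset, both servers *look* identical at the GTID level...
print(a.gtid_executed == b.gtid_executed)  # True
# ...but the underlying data still diverges:
print(a.rows == b.rows)                    # False
print("E" in a.rows)                       # True: a duplicate waiting to happen
```

Server `a` still holds row "E", so when an update of "E" later replicates over from the other side, it collides with the copy that was never really forgotten.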
Because down at the bottom each server still has its data: this one has rows A, B, C, that one has A, B, C, D, E. When somebody later updates E over there and the change replicates back to here, I already have the E data. What problem is that? Duplicate data, and the transaction cannot apply, because we are not the same even though we told each other we are. This is a very popular way for people to try to fix things, by covering up the facts with resets. So make sure you understand exactly what it is doing.

Also, people usually schedule backups with a job, a cron job on Linux, writing the backup to a separate folder. Actually I mean a separate volume: backups always grow, and you should never expect that somebody will come along and clean them up. The backup volume should be untied from the data volume; the data volume should be a standalone volume for data only. Otherwise backups and growing logs eat into the data volume and the data can no longer be written. Better that the backup fails while the application keeps running properly.

Okay, so I think that covers this section on how we maintain a reliable database and keep it always online. Questions, any questions?

"Actually, I have a question. About write latencies: how do the various configurations or strategies affect write latency? For example, does it block the writes, or does the write happen but not return until all the data is consistent? How does it work?"

Okay, thanks for this question. Latency depends on whether the operation writes data or reads data, so let's take the two cases separately. First, reads. We read on server A, server B or server C, and they are independent: a read does not require me to fetch data from the other servers. So reads are fast, in a very isolated
environment. In fact it is a good environment for scaling reads, because all the members are independent: if three are not enough, add a fourth, a fifth, a sixth, and more users can come and read independently and concurrently. For reads, the latency is just local I/O and memory.

Now writes. A MySQL InnoDB Cluster can run single-primary (a single writer) or multi-primary (multiple writers). Let's take the single-writer case first. With a single writer, the write lands on one server, and that server has to pass the data to the other members, A, B, C: pass it to me, pass it to the third one. The latency comes from the acknowledgement: "I have the data", "I have the data". First the transaction is written on the server in memory, and then the data is passed, still from memory, to the other servers. So at this point the latency is memory and network, not disk I/O; not I/O at this point, and this is important. This is what we call the certification process, and it follows majority rule. With servers one, two, three, the majority is two: when a majority of the members, the writer plus any one of the others, says yes, the transaction passes, and a member that cannot respond is left out. Why does this happen? Because when the writer commits and one other member acknowledges, that is already the majority; a member that has a problem and cannot keep up simply drops out of the group.

So per operation the latency is small. But then the question is: could it end up as effectively two servers running, because "I'm fast, you're fast, we always respond", while the third one falls behind and never catches up? That would be a problem. No, because there is flow control. Any one individual statement can be fast, but over the overall lifetime of our operations we cannot leave the other server behind and simply not care. So there is flow control to make sure we do not build up too much of a gap under concurrency; we ran stress tests on this.
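The majority rule just described is simple arithmetic. Here is a quick Python sketch of quorum size and fault tolerance for a group of n members (my own illustration of the rule, not MySQL code):

```python
# Majority rule: the group makes progress only while a majority of its
# members can acknowledge; the rest is how many failures that tolerates.

def majority(n):
    """Smallest number of members that forms a majority of n."""
    return n // 2 + 1

def tolerated_failures(n):
    """How many members can fail while a majority still survives."""
    return (n - 1) // 2

for n in (3, 5, 7):
    print(f"{n} members: majority {majority(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```

This is why 3 is the usual minimum cluster size: with 3 members the majority is 2, so one member can crash (or fall out of the group) and the remaining two keep accepting writes, exactly as in the failover demo earlier.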
And with flow control, when this one is left behind too much, the group says "hang on, wait", so it can catch up. That is flow control, and those are deeper-dive parameters.

So, to answer you: it is a memory copy. The server writes the binlog and the relay logs and sends the data to servers A, B and C concurrently. From the latency point of view it is the network round trip plus the acknowledgement coming back; the members apply independently, and only the majority is needed. So a single operation is fast, while flow control keeps the overall inflow and outflow in balance. I hope you can see why this is actually the best option.

And latency is more than just passing the data; it is also about the bulk of the data. I send a transaction, and a transaction can be kilobytes or it can be megabytes. One or two kilobytes is small, but one megabyte passed over the stream in a single piece would mean nothing in between can get through. That is why there is also a mechanism to break the big chunk into smaller chunks, so that other messages, like the heartbeat, can still flow through the network, and the acknowledgement comes back to you asynchronously. This is all built into InnoDB Cluster already. I hope this answers your question.

"Yes, thank you very much."

Any other questions? Any other questions? Okay, so I think we can take a break now. If anyone has any questions, please feel free to step up to Iven and Ryan during the break and ask all the questions you want. We will be back here at 4 p.m. sharp for more MySQL: it will be "Unleashing the Power of NoSQL Using MySQL". Fascinating; looking forward to it. Thank you very much.
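The fragmentation mechanism from the latency answer can be sketched in a few lines of Python. This is an illustration only; in MySQL Group Replication the real setting that bounds message size is group_replication_communication_max_message_size.

```python
# Toy sketch: fragmenting a large transaction payload into bounded chunks,
# so that small messages (e.g. heartbeats) are not stuck behind one big send.

def fragment(payload: bytes, max_size: int):
    """Split payload into chunks of at most max_size bytes each."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    return [payload[i:i + max_size] for i in range(0, len(payload), max_size)]

one_megabyte = b"x" * (1024 * 1024)
chunks = fragment(one_megabyte, 64 * 1024)       # 64 KiB fragments
print(len(chunks))                               # 16
print(all(len(c) <= 64 * 1024 for c in chunks))  # True
print(b"".join(chunks) == one_megabyte)          # True: lossless reassembly
```

Between any two fragments the group communication layer gets a chance to interleave heartbeats and acknowledgements, which is the point the speaker makes about a one-megabyte transaction not blocking the stream.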