Hello everyone, my name is Narendra Singh Chauhan. I am part of the MySQL replication team and I do testing on replication. Today I am going to talk about what is new in replication in MySQL 8.0. On today's agenda, I will talk about replication, the different variants of replication in MySQL and why we need it; then the main part of the presentation, which is the enhancements that have been made to replication in MySQL 8.0; and then a quick roadmap for replication in MySQL.

In today's world it is a technology mesh: everything is distributed, and there is a large amount of data that we need to handle, transform and store. For example, with smartphones coming into our lives, we have loads of data that we want to store and post everywhere, and it should be quickly available to everyone. We cannot afford any offline duration; it is simply unaffordable. I want to keep more and more data on social media, I want to look at all the pictures, we want to keep data for monitoring purposes for any number of years, and with IoT there are many more things that we want to store. So extract, transform and load, and that as soon as possible: this is a key component for us to progress in the future. Distributed coordination and monitoring are the key here.

So, what is replication? Replication is the process of generating and reproducing multiple copies of data at one or more sites. For example, if I do an insert on a primary, it should go to the replica without any data loss. In MySQL replication we have three different types of coordination between the servers. The very basic ones are asynchronous replication and semi-synchronous replication. MySQL replication is a process in which we write everything to a log file and then it gets replicated.
So, we are using logical replication. Here, when we insert some data, the server writes it to the binary log and also commits it in the server. It then gets dumped over the network and lands in the relay log of the secondary member, and from there the coordinator thread assigns it to workers. If there are multiple transactions that can be parallelized, the workers apply them in parallel, which improves performance. Once it is applied, if binary logging is enabled on that server, it will be written to the binary log there as well.

In semi-synchronous replication, the secondary sends an acknowledgement back to the primary, so that we know whether the slave, or secondary, has received the data or not. So it becomes a relatively synchronous kind of thing.

Now, coming to the third and most important kind, group replication, which has improved a lot. In the earlier replication setups, we needed different scripts and utilities to provide failover and a lot of other activities. Group replication is a highly available solution. It is a very good infrastructure where the membership can grow and shrink dynamically, and whatever consistent view is available, the same view is available to all the members. There are two modes available here. One is single-primary mode, where only one member does read-write operations and all the other members are read-only. The other is multi-primary mode, where all the members do write operations, and conflict detection and everything else is handled automatically. And the information about which member is a primary and which is a secondary is available across the group.

So, let us move on to the enhancements section. What we have seen so far is that in replication we transfer some data, and through logical replication it becomes available on the secondary.
But assume the data is in motion and has not yet been received by the secondary. If someone queries it during that time, what will they get? For example, in this case I am updating some data on server B, where the value was 1, and changing it to 5. It is in progress; at this point it has not reached A yet. So if someone queries A during that time, what will they get? They will get the value 1. This is what we call eventual consistency: it will be consistent, but eventually, not exactly at the moment we are querying.

So we have added some consistency levels here. The first is enforced on read. For example, if the value has been updated on B and someone queries server A, the query will wait until the update has been executed on A. This can be set through group_replication_consistency = BEFORE. So when we say BEFORE, we are enforcing consistency on read. Similarly, assume I am writing something on server B, and I do not want to give control back to the client until the write has been executed on all the members. For example, here on B, when I execute some DML, I will not get the commit OK until it has been applied on member A and member C. So once the commit returns to the client, whenever we query we will get the value 5 across the group. And we also have the BEFORE_AND_AFTER consistency level.

Moving on, there is one more scenario. Assume I am in single-primary mode, where I am executing something on member B, and during that time this member goes away. If meanwhile some client tries to query member A, it should not read stale data. In this case, when I set the consistency level to BEFORE_ON_PRIMARY_FAILOVER, the query will be held until the pending transactions have been executed on A. Once they have been executed, the read is fine.
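
As a rough sketch of how the consistency levels described above might be selected, the statements below set group_replication_consistency for the current session. The use of session scope and the order of the statements are illustrative assumptions, not something shown in the talk; the variable and these values are available from MySQL 8.0.14.

```sql
-- Default: a read on another member may briefly return the old value
-- (the value 1 in the example above).
SET @@SESSION.group_replication_consistency = 'EVENTUAL';

-- Consistency enforced on read: a query on this member waits until all
-- preceding group transactions have been applied here.
SET @@SESSION.group_replication_consistency = 'BEFORE';

-- Consistency enforced on write: the commit returns to the client only
-- after the transaction has been applied on the other members.
SET @@SESSION.group_replication_consistency = 'AFTER';

-- Both guarantees combined.
SET @@SESSION.group_replication_consistency = 'BEFORE_AND_AFTER';

-- After a primary failover, hold queries on the new primary until it has
-- applied the backlog from the old primary, so no stale data is read.
SET @@SESSION.group_replication_consistency = 'BEFORE_ON_PRIMARY_FAILOVER';
```

The stricter levels trade latency for fresher reads, so it may make sense to set them only on the sessions that actually need them.
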
Now, the operations part. What used to happen earlier in group replication is this. Take a member that was part of the group; in single-primary mode, say member A was in read-only mode. The moment we issued STOP GROUP_REPLICATION, that member went offline, and since it was no longer part of the group, we were able to do both reads and writes on it. That was not good, because when it tried to join back it might have some extra data which is not present in the group. So now, when we say STOP GROUP_REPLICATION, the member remains read-only. Only reads will happen; there will not be any writes. After we make any changes needed to rectify it, it can join back to the group.

Now, one step further: assume network partitioning happens. The network is quite unreliable. Suppose there is a network glitch due to which A gets separated from the group and becomes unreachable. At that point we have two mechanisms. One, I can say that this member should become read-only: only read operations are allowed on it and no writes should happen. For that I can set READ_ONLY, which is the default, as the exit state action. And if I want to be more secure, I can set this value to ABORT_SERVER, so that the member shuts down automatically. Then there will not be any reads or writes from the router to this particular member.

I can also increase the priority of a particular member, so that when the primary goes down, the member I have selected gets picked. For example, in this case I have three members: member A, member B and member C. Member B is the primary, and members A and C have different weights. Based on these I have decided that if member B goes down, C should automatically become the primary. So when B goes out of the group, C will be elected as the new primary.
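
A minimal sketch of the settings described above, assuming a three-member group as in the example. The weights 40 and 80 are made-up values chosen only so that C outranks A; they do not come from the talk.

```sql
-- On each member: what the server should do if it leaves the group
-- involuntarily, for example after a network partition.
SET GLOBAL group_replication_exit_state_action = 'READ_ONLY';
-- Stricter alternative: shut the server down instead.
-- SET GLOBAL group_replication_exit_state_action = 'ABORT_SERVER';

-- On member A (hypothetical weight; the valid range is 0 to 100).
SET GLOBAL group_replication_member_weight = 40;

-- On member C (hypothetical weight; higher, so C is preferred over A
-- when a new primary has to be elected).
SET GLOBAL group_replication_member_weight = 80;

-- On a member where STOP GROUP_REPLICATION has been issued, this should
-- report 1, meaning the member still rejects writes.
SELECT @@GLOBAL.super_read_only;
```

The weight only influences the next election; changing it does not by itself move the primary role.
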
For that I can use the variable group_replication_member_weight and set the values accordingly. By default it is set to 50 throughout the group.

This next one is one of the most important features that has gone into 8.0.13, and it gives us the power of selecting our primaries while the members are online. Here the group is in single-primary mode, where B is the primary at this point, and as an operations person I decide: let us make A the primary. I call this UDF with a particular member ID, so I pass the server UUID of server A, and once that is done A automatically becomes the primary. Note that this UDF can be called from any member, and the action will be carried out in the group.

Similarly, if I am in single-primary mode and as a DBA I decide that single-primary mode is not what I want, I want multi-primary mode, then I can execute this UDF and automatically all the members will become read-write; they will accept read-write operations. This is known as single-primary to multi-primary switching. Similarly, from multi-primary mode we can switch to single-primary mode. If I do not pass a server UUID, the new primary is elected automatically based on the lexicographical order of the server UUIDs. Suppose I have A, B and C, and I want B to become the primary: then I pass its server UUID, and B is automatically elected as the new primary. So we can switch from multi-primary to single-primary, and everything is online; we do not have to take anything offline.

Relaxed member eviction. Network partitioning and network glitches are very common in the real world. Assume I have this group, and due to such a situation server A drops out. But a network partition can take different forms: it can be a longer outage, or it can be a small outage that happens momentarily.
So, in this case, if I know that such situations are very common, I can set group_replication_member_expel_timeout to, let us say, 5 seconds. If I say 5 seconds, the members of the group will wait 5 seconds for it before expelling it, and if it comes back within that duration it will still be part of the group, so I do not have to make the extra effort of bringing it back.

Next, monitoring: how long it takes for data to travel from one server to another. In this case, assume I have server C, and some data which originated on server A, has to be replicated to server B, and then from server B to server C. I can find out at what time the data originated on A, at what time it was committed on the immediate master, which is B, and I will also get the information of when it was committed on C. Apart from this, I also get information about when a transaction was received, which transaction I am executing, at what time I received it, and, if there are any transient errors, I get that information along with the timestamp. So I have more information about what data is travelling at what time and how much time we take to execute it, and it is fine-grained per receiver thread, coordinator thread and worker thread. We get all this information from the performance schema tables.

In group replication, whether we are using single-primary mode or multi-primary mode, A, B and C can each have the role of a primary or a secondary, and they can be on different versions. So now I have the status of the complete group, with information such as who is the primary. In this case A is the primary, B and C are secondaries, A is on version 5.7.20, whereas B and C are on version 8.0.3. I have this information and it is consistent across the group: whichever server I query, I get the same information.

From the security perspective, what used to happen, and still happens, is that if I am writing
something on the server, then, since we are using logical replication, the data is written to the log file. If a person gets unauthorized access to this machine, he might be able to read some information. To overcome that, we have enabled encryption: there is an encryption mechanism for binary logs and relay logs. So if I am writing something on the server, the binary log will be encrypted, and when the same information goes to the secondary, the relay logs will be encrypted too. There are two tiers of protection: whatever data is in the file is protected by the file password, and this file password is in turn protected by the binary log encryption key. So this is two tiers of encryption on disk, and it protects us from unauthorized access; people will not be able to extract that information.

There are many more features that have gone into MySQL 8, and if I start talking about everything it would be a very lengthy session. So what I would encourage is that we discuss all these things at the MySQL booth. I am just touching upon a couple of them here. We have changed the default options so that highly available replication is available to you by default. Binary logging is enabled by default. log_slave_updates is enabled by default, so that what arrives through the relay log on the slave is also written to its binary log. Replication metadata is stored in tables by default, so that it is crash-safe: even after a crash, the metadata is persisted in your database. Apart from that, row-based replication now uses hash scans when applying updates. Write set extraction is on by default, which group replication requires. Binary log expiration is set to 30 days. And server_id is set to 1 by default, and it can be changed as we
set up. So all these default values have been changed so that we get highly available replication out of the box.

Talking about the other enhancements: for monitoring purposes, threads, condition variables and mutexes have been instrumented, and all this information is visible in the performance schema, which helps a lot with monitoring. Apart from that, on the operations side, savepoint support has been added for group replication in write set extraction. Host names are supported in the group replication whitelist. A few variables have been added for flow control; with them we can do throttling, set some thresholds, and use them for flow control of the transactions in the group. IPv6 support has been added for group replication. And from the performance point of view, the code path between the network layer and the replication layer has been improved.

Apart from this, write-set-based transaction dependency tracking has gone in. In the earlier slide I showed that some transactions can be applied in parallel, so a good number of non-conflicting transactions can be applied together. Earlier we parallelized by database, then we had the logical clock, and now we have write sets. What the write set approach says is: using the hashes from write set extraction, if I have transactions whose hashes do not conflict, I can execute all of them in parallel. This has improved performance a lot.

Apart from that, we have partial JSON updates. JSON, like BLOB and TEXT, is a big data type, and if I am changing only a particular part of such a value, it is wasteful to transfer the whole value. With partial JSON updates I can transfer only the changed data, which helps reduce storage as well as improve performance.

So that is it on the enhancements, and if you want to talk at
length on this, maybe we can talk at the MySQL booth, and I will be really happy to explain more.

Talking about the roadmap: these are the dates when the important releases were launched. On 21st January 2019, 8.0.14 went GA, and with it you get all the cool features I have explained so far. Just download it and try it out.

The end goal. Think about replication, semi-synchronous replication and group replication: people have different use cases. They can have simple master-slave replication; linear replication, where A, B and C pass data along to each other; multiple masters sending data to one slave; or a master with multiple slaves for read scalability. You can think of any topology, and you can even mix and match all of these with group replication. If you know InnoDB Cluster, which will be covered in a later session, you will get the complete picture there, so I will just give the basic idea here. In this picture, group replication provides high availability, and for read scalability asynchronous or semi-synchronous replication can be used. MySQL Shell can be used for provisioning, management and removal of any particular server in the cluster, and MySQL Router helps in redirecting reads and writes to the particular server we want. There can be one replica set, or it can be extended to any number of replica sets. So this gives a complete picture for high availability and disaster recovery. Feel free to try it.

For the package downloads we can go to this link. The MySQL documentation has been improved a lot, and the replication-related manuals in particular have been improved consistently; you can always go and read them from here. And there are blogs from the engineers: if you want to see even more technical
depth, then we can refer to the blogs written by the engineers on mysqlhighavailability.com. Thank you, and thank you for listening today.