At the beginning of this year, our 10.11 series went long-term support, with five years of support, and we're continuing the same pattern. Colin mentioned the version-numbering jump earlier. What we've decided as our process now is that after we release a long-term support release, we bump the major version, so the long-term support release is always the major.minor version just below the big numbers. To be a bit different from usual, I'm actually doing this presentation in a Jupyter Notebook, which is one of the features developed by one of our former staff members. If you want to follow along, the tiny.cc MariaDB FOSSASIA link will bring up an instance where you can play with a running server. As I said before, GA started back in 2011, and if we run that, this is what I'm running in this test instance on this little cloud service. So, new features; I'll run through them fairly quickly. UUID is now a data type, added in 10.7. Strangely enough, it looks like a UUID. You can refer to the column just by name, you can insert values into it, and they come back as the text form of a UUID. You can convert binary strings into a UUID, and if you pass something invalid, you can expect it to fail and give a warning. Strangely enough, just what a normal data type is meant to do. So 10.7 is when we added the UUID data type. The other data type that was added is INET4. You may ask what happened to INET6. Well, INET6 was actually added all the way back in 10.5. A quick recap: INET6 can take IPv4-mapped addresses as well as IPv6 addresses, as you'd expect. What we added with INET4 is a data type that, strangely enough, stores and validates IPv4 addresses exactly as you'd expect. And in MariaDB 10.10, I think, we added the new UCA collations.
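A minimal sketch of what the new types look like in practice (table and column names are my own, assuming a MariaDB 10.10 or later server):

```sql
-- UUID data type (MariaDB 10.7+)
CREATE TABLE t1 (id UUID);
INSERT INTO t1 VALUES ('123e4567-e89b-12d3-a456-426614174000');
INSERT INTO t1 VALUES (UUID());   -- generate a fresh one
SELECT id FROM t1;                -- values come back as UUID text

-- INET6 (10.5+) and INET4 (10.10+) data types
CREATE TABLE addrs (a6 INET6, a4 INET4);
INSERT INTO addrs VALUES ('2001:db8::1', '192.168.1.1');
INSERT INTO addrs VALUES ('::ffff:10.0.0.1', '10.0.0.1');  -- IPv4-mapped is valid INET6
-- inserting something invalid fails with a warning, like any proper data type
```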
Historically, MariaDB has been on UCA 5.2.0 as its collation standard. We've finally updated to the latest collation standard, 14.0.0. Being a standards body, they seem to move quickly; it's now up to 15. Sorry, we can't quite move that quickly, but it was a major effort to get this collation work in. So these are all the new collations that are available. For those who have dealt with MySQL collations before, there's a case-insensitive and a case-sensitive version of each. What's new is the AI and AS variants: accent-insensitive and accent-sensitive. And like the previous ones, there's a NOPAD and a PAD collation, depending on how you want to treat trailing whitespace in your collations. These are there for all the languages of the standard; yes, there are a lot of them. And what you do is just attach one of these to a UTF-8 type and you've got your collation. Since 10.7 we've also been catching up with the world on JSON functions. JSON_EQUALS returns true if two JSON objects are equal and false if they're not, which is kind of what you'd expect. To make things a little easier, we've also added a JSON_NORMALIZE function to bring the text of a JSON object down to a normalized form. Where that's useful is if you want unique JSON objects in your tables: you create a generated column, jnorm here, holding the normalized value; it's virtual, and we add a unique key on it. What that means is that if we insert Alice with blue that way, then rearrange the keys and insert it again, we get a duplicate-key error. So it's one way to normalize it down. JSON_PRETTY is pretty much the same as JSON_DETAILED, just under the MySQL name; it creates a pretty-printed version. That was a quick community contribution we received; we probably should credit them a little more.
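The generated-column trick the speaker describes looks roughly like this (a sketch; the table and column names are my own):

```sql
-- Enforce uniqueness of JSON documents regardless of key order
CREATE TABLE t2 (
  j JSON,
  jnorm VARCHAR(512) GENERATED ALWAYS AS (JSON_NORMALIZE(j)) VIRTUAL,
  UNIQUE KEY (jnorm)
);

INSERT INTO t2 (j) VALUES ('{"name": "alice", "color": "blue"}');

-- The same document with the keys rearranged normalizes to the
-- same string, so this fails with a duplicate-key error:
INSERT INTO t2 (j) VALUES ('{"color": "blue", "name": "alice"}');
```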
We got a CRC function; it was mainly added for a test case, but it's there for everyone else. RANDOM_BYTES was also a contribution from our community; it works the same as the MySQL function, so that one's there now. JSON histograms: for those who remember back to the MariaDB 10.1 days, we had histograms that represent the distribution of the data over a table. What JSON histograms do is extend that, in syntax and in the way it's stored in the table, to give better selectivity, particularly around those edge cases where data changes from one string to another. You get better granularity that way, and so you'll get better query plans. So here we create a table, push a bunch of values into it, and run ANALYZE TABLE with histograms using the JSON_HB histogram type. What we see in the column stats at the end, all the way at the end, is a histogram in JSON format that describes the layout of the table: where things start and where things end. Obviously, as you move to a bigger data set, this becomes a lot more significant. Another function: NATURAL_SORT_KEY. I was too deep in the JSON mindset there. Natural sort is for when you've got numbers and letters in the same value and you want the numbers sorted in numerical order and the alphabetic parts in alphabetical order. So we create a table and insert a bunch of things. If we didn't have NATURAL_SORT_KEY, what we'd see here is that A11 comes before A2, which is not the natural order people expect to see things in. But if we use NATURAL_SORT_KEY in the ORDER BY — obviously not the most efficient way to do it — what we see is A1, A2, A11 coming out in that order. There are applications of this you may not immediately think of: if you want, say, IP addresses sorted, or anything with numbers separated by other characters, you can use this to make sure the addresses come out in order.
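The ordering difference the speaker demonstrates can be sketched like this (table and column names are my own):

```sql
-- Natural vs. lexicographic ordering (MariaDB 10.7+)
CREATE TABLE files (name VARCHAR(20));
INSERT INTO files VALUES ('a1'), ('a11'), ('a2');

SELECT name FROM files ORDER BY name;
-- a1, a11, a2   (lexicographic: compared character by character)

SELECT name FROM files ORDER BY NATURAL_SORT_KEY(name);
-- a1, a2, a11   (embedded numbers compared numerically)
```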
Another useful application is if you've got version numbers in a table and you want them to come out in order: just ORDER BY NATURAL_SORT_KEY and they will. Not sure if I ran that twice; once it's out of the gun. For those who have played with large amounts of concatenation, LPAD, RPAD, alignment, and trying to do arguably silly things like formatting inside the database, we've made the SFORMAT function to make things a little easier. It introduces a pretty decent formatting function into the database. It's based on the {fmt} library, whose formatting has been adopted into the C++20 standard, but for the most part right now it's an external library. It allows you to do things like use two placeholders here and just pass the arguments at the end. If you did that previously, you'd end up with a bunch of string CONCAT functions and possibly a bunch of CAST functions; now you just get it all pretty easily. And there's a whole lot more in the format spec, like left alignment — the things you'd normally see in printf anyway. Descending indexes, something MySQL has had for a while, we finally got. If you're doing a sort where you order by one field in one direction and by another field in the other direction, now you can declare an index that maps to that correctly, so the entire index can be used for the sorting, rather than just part of it with a sort afterwards. And yeah, a couple of query plans will get through that fine. Convert partition: for those who have worked with partitions and tried to get a partition out into a table, or a table into a partition, you may have seen these steps before. From a table, you do an ALTER, create a new partition, remove a partition, exchange it, and drop the other one.
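A sketch of both features (names and values are my own examples, assuming MariaDB 10.8 or later):

```sql
-- SFORMAT (10.7+): {fmt}-style formatting inside SQL
SELECT SFORMAT('Population of {} is about {}', 'Singapore', 5900000);
-- previously this meant CONCAT('Population of ', name, ' is about ',
--                              CAST(pop AS CHAR))

-- Descending indexes (10.8+): match a mixed-direction ORDER BY
CREATE TABLE events (
  ts DATETIME,
  priority INT,
  INDEX ix (ts DESC, priority ASC)
);
-- this ORDER BY can now use the whole index instead of a partial
-- index scan plus a filesort:
--   SELECT * FROM events ORDER BY ts DESC, priority ASC;
```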
Or, going the other way, moving a normal table into a partition, you work out the values and then try to swap and exchange things. What we've done now is create a simplified syntax: one ALTER TABLE variant to move a partition out into a normal table, and another ALTER TABLE variant to convert a normal table straight into a partition, with the constraint around the values in that partition. So that's, I guess, one simplification. System-versioned tables have been there since 10.3, if I remember, so they're a rather old feature. What's new here is that, with a system variable enabled, you can actually insert history into such a table, so you're able to migrate data that might be in ordinary tables into system-versioned tables. A quick recap on what system versioning is: we've got a table, it has only one column, and it has system versioning enabled. When we insert into it, we get these two magic columns, row_start and row_end, which are timestamps of when that value was valid in the table. So here it was valid from the beginning of 1980, and one second later it ceased to exist. What you can do is query it AS OF one second into the interval, and the value is there; any time before that, it didn't exist. So with system-versioned tables you've got the concept of a current table plus what was in the past, and this extended AS OF syntax to query the past. Password-reuse plugin: a very simple plugin, it means you just can't set the same plain password twice; it keeps a history, for that sort of compliance. Just keep in mind that while we add things like the password-reuse plugin, there's actually a plugin API for that, so you can develop your own password-validation plugins. And that goes all the way back to even data types.
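The recap above can be sketched in a few statements (table and column names are my own):

```sql
-- System-versioned table (MariaDB 10.3+)
CREATE TABLE prices (
  item VARCHAR(20),
  price INT
) WITH SYSTEM VERSIONING;

INSERT INTO prices VALUES ('widget', 10);
UPDATE prices SET price = 12 WHERE item = 'widget';

-- Current data only:
SELECT price FROM prices;                                     -- 12

-- Point-in-time query with the extended AS OF syntax:
SELECT price FROM prices FOR SYSTEM_TIME AS OF '2000-01-01';  -- empty: not valid then

-- Full history, exposing the invisible row_start / row_end columns:
SELECT item, price, row_start, row_end
FROM prices FOR SYSTEM_TIME ALL;
```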
Data types are actually a plugin in MariaDB, so you can write your own data types if you're so inclined. There's community work going on to make some of these plugin APIs more accessible by writing wrappers in Rust, so you can write your plugins in Rust and get them into the server. There's a HashiCorp Vault key-management plugin, much like Percona has, to integrate with key managers, and this integrates with our freely available transparent data encryption that's been there since 10.0, 10.1 — a long time. If it were installed on this machine, it would show up here. GET DIAGNOSTICS is a mechanism that's not commonly seen. Say you've got one value in a table, you insert another, and you get a duplicate entry for the primary key. In this case it's fairly obvious it was the second row, because that's the only value; but with more complex unique-key constraints, or otherwise, it's not really obvious which row was at fault. So if you want to do some error handling, you can say GET DIAGNOSTICS, and we get that it was row number two that caused the error. Function attributes: IN and OUT are now function attributes, so you can make sure you're using your functions the right way; in most cases there's a constraint that using one in a SELECT isn't valid. ALTER: yep, Pandis is going to do a talk later, so I'm not actually going to cover this; he'll cover it a lot better. But in replication, ALTER TABLEs are pushed through a lot quicker. mariadb-binlog — I think this was a community contribution as well — finally got all the arguments supporting the GTID data types; that was added in 10.10, so you can do a start position and a stop position using GTIDs in the same way you would have used binlog positions before. On scenarios: you're sometimes frequently transferring a master to a replica, or slave, and back, with GTIDs.
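A sketch of the GET DIAGNOSTICS error handling described above, inside a handler (procedure, table, and variable names are my own; ROW_NUMBER as a condition property is the 10.7+ addition):

```sql
DELIMITER //
CREATE PROCEDURE bulk_insert()
BEGIN
  DECLARE CONTINUE HANDLER FOR SQLEXCEPTION
  BEGIN
    -- Pull out which row of the multi-row statement failed, and why
    GET DIAGNOSTICS CONDITION 1
      @errno = MYSQL_ERRNO, @row = ROW_NUMBER, @msg = MESSAGE_TEXT;
    SELECT @row AS failed_row, @msg AS reason;
  END;

  -- With pk = 1 already present, the second value here violates the
  -- primary key, and @row reports it was row 2
  INSERT INTO t (pk) VALUES (2), (1);
END //
DELIMITER ;
```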
If you didn't know GTIDs really, really well, it was easy to get them mixed up. So what we've added is a single CHANGE MASTER instruction to demote a master to a slave; hard to get wrong. It just swaps the positions around and enables it to become a replica. GRANT ... TO PUBLIC: it's not the same as an anonymous user; it means every SQL user has access to, say, the tables in test, which is fine. Read-only: we've split the read-only administration out from the super user, so it's a separate grant, and there are reasons you may want that around orchestration and management. In InnoDB and the underlying layers, we've got faster bulk insertion, and you can change the number of undo tablespaces on a restart. And that's about all I've got time for. So yeah, there are a few more things: some full-text search work, a bit better memory management, and a bit of auto-versioning. Apologies for running late; there are also some other UI improvements. Any questions or feature requests?

Are old UUID types supported, or is there a generator for them? If you use them in place of an auto-increment, can you just drop that in and let them generate?

Yeah, the UUID function generates a version 4 UUID. The thing I didn't actually mention is that the way we store UUIDs rearranges the fields so that all the time-based parts come at the end, which means you get faster bulk insertions with the time-based, MAC-based type. Yes, that deviates from MySQL, and it's not available in MySQL mainline.

How much of this is available in MySQL? I think I'll defer to some of the MySQL people here. I think RANDOM_BYTES is there, and maybe some of the JSON functions, but all the system versioning is MariaDB-only, correct me if I'm wrong. The partition swapping is MariaDB-only. Not sure.
Not sure how much of this is historically offered. Yeah, especially the HashiCorp plugin and all this, not available in MySQL, as well as the system-versioned tables; that is also not available in MySQL. But most of the other stuff is maybe 70 to 80 percent similar to what we see in MySQL. I guess it's also important that things like system versioning are an SQL standard that we just adopted and implemented; there's a lot more in the standards that doesn't always make it to the implementation stage.

Is it still possible to migrate from MySQL to MariaDB, or from MariaDB to MySQL? I mean compatibility with the new versions. Okay, it's possible to migrate MySQL 5.7 to MariaDB easily enough; there's probably a small number of caveats, a small list of functions that aren't there, and we'll try to correct you if anyone points them out. Going from MySQL 8 to MariaDB, at the moment you'll have to do an SQL export and re-import, because the InnoDB implementations underlying both have diverged enormously, such that you can't just move the data directory over and expect MariaDB to run on MySQL 8 files at the moment. Any more questions? Well, thank you very much for this wonderful presentation.

Next, Kwan is going to tell us about making predictions using HeatWave and AutoML; at least that's my understanding. So, welcome.

Yeah, thank you. Thanks for joining us today. This is the third — no, the fourth — session of the MySQL track. My name is Ryan; I'm based out of Malaysia, so it's absolutely good to be back here in Singapore, and at FOSSASIA. My last FOSSASIA session was in Bangkok, like three or four years ago, so it's good to see some familiar faces. Okay, I'm going to talk about prediction, but first, a quick overview of MySQL.
You've probably seen this in the keynote and in my colleague's presentation. MySQL has been around for 27 years now; it's a very solid transactional database, and because of all these users, MySQL has become really solid and stable. And from the MySQL team, we've just added a great innovation into MySQL: the cloud-native, built-for-the-cloud HeatWave database engine. So this is what I'm going to talk about today, the cloud-based MySQL HeatWave engine. And not forgetting the lots of open-source applications supporting MySQL that make it really, really popular. Okay, so MySQL HeatWave: it's one of the great innovations from the Oracle research labs, a special plugin we have added to the InnoDB engine. Bear in mind that this is a cloud-only, managed database, in Oracle Cloud as well as on AWS; and if you're running Azure, you can also take advantage of HeatWave using the fast interconnect from Azure to Oracle Cloud. Essentially, HeatWave is a special plugin to the InnoDB engine. As you know, InnoDB is very, very good at transactional workloads; it's really stable. Customers have been telling us that the replication is rock solid; without the replication, you can't really do a massive hyperscale application. But there are certain things we still want to improve, especially on the OLAP workload, which traditionally doesn't work really well in InnoDB because of its row-based design. So what we have done is design the HeatWave engine to automatically convert the row-based format into a hybrid columnar format, and then make use of the elasticity of the cloud to scale the database engine for you. This was introduced in Oracle Cloud in 2020. Today, we can scale up to 64 nodes, which means you can create a HeatWave cluster with 64 nodes in the cloud.
It's capable of building a data warehouse of 60 terabytes, so you can move 60 terabytes of data into the cluster. Essentially, when you have your transactional data — product and sales information — sitting in InnoDB, you can push that data automatically to the HeatWave cluster. The data will be split, sharded, and distributed across the HeatWave cluster in columnar format, and when you run your analytic query, it will be automatically pushed down to the HeatWave cluster. The data is stored in memory, so you get distributed processing as well as in-memory processing to give you the power; you can take advantage of the multiple nodes in the cloud. One of the reasons we chose to do this in the cloud is that you can scale the resources up and down, as opposed to running on-premises, where you would need to procure and buy servers. That gives us a lot of flexibility in adding new features and functions to the HeatWave cluster. For the interest of today's topic, I'll talk about machine learning. Machine learning is not something new; every day we use apps like Facebook, Grab, or Uber, and you can see that all these apps really make our lives easier. For example, go to Facebook: you see the news feed from your favorite friends, and things that are important to you pop up in your feed. That's all because of machine learning models. In the McCurry talk yesterday, they talked about how they use machine learning — learning to rank — to move the content that matters to the user to the top. That's essentially what people are doing today: using machine learning to help build intelligent systems. Booking.com has a recommendation system so that people can quickly book a package; they want to turn clicks into sales with that recommendation system.
Also Uber Eats, to be able to send the right notification at the right time to the right user. All these are done using machine learning. And if you're in a different industry, there are many different cases — banking fraud, retail, telco, and other industries; I'm just going to fly by this. Just last week, coming here, I received this SMS. I'm sure you receive these as well. This is obviously spam. Can you see? It says thank you for your payment, Tommy, for 2,000 ringgit. Very quickly, just by looking at it, you know this is spam, right? Because it's from a personal mobile number, the language is not properly structured, and the contact number is really not the bank's official contact number. So this can be a good use case for a telco. Every day we see all this, and machine learning is going to help us. But it's not that easy to implement machine learning, because it takes a highly skilled person to do it: a data scientist, a person who's really good at Python, really good at machine learning algorithms, able to get the right data and structure it in a way that's good for learning. These are the four machine learning pipeline stages that are common to all machine learning tasks, and in order to get a tuned model, you really need to iterate to get the model right. That's not very productive, especially if you have a lot of data and you need to come up with a model in good time. Okay, I have a quick demo here to show you how a typical machine learning process looks. You have data in data frames, and then you build a model; here you need to choose the right algorithm for the data set, so that you come up with the model that you like.
If you go into it, you see that in the model there are lots of parameters you need to tune in order to generate an accurate model. That's typically how you'd approach machine learning with a traditional pipeline. So in HeatWave, we are automating this tedious machine learning work into AutoML. This is Oracle research lab technology: instead of you doing all these tasks, you automate them in the AutoML engine. With that, you don't really need very strong Python skills; you leverage the SQL knowledge you have, and use SQL to invoke AutoML to create a model. These are the four things we innovated and put into AutoML. The first is that in preprocessing we make sure the data set is complete. The second step is to sample the data down to a good size for training; the goal is really to come up with an accurate model in a short amount of time. The third is that we have a proxy model to start with: instead of iteratively deriving the model from scratch, we start from a proxy model and fine-tune it along the way to get to the tuned model. With that, we have the engine within the HeatWave database to let you do machine learning. These are the three categories of algorithms we support right now: classification, regression, and forecasting. Soon to be available is anomaly detection; it's coming very soon. So you can tackle different problems using these algorithms. And if you work with scikit-learn, these are the algorithms we support; when you run AutoML, the engine will pick, depending on the category, whichever algorithm gives you the most accuracy in the shortest amount of time. So, putting this together, doing machine learning in HeatWave is really just having a set of data loaded into the HeatWave engine.
And essentially four SQL calls to implement machine learning in HeatWave. You can see there's an ML train, there's scoring for you to test the accuracy of the model, and then loading the model into the HeatWave engine. And depending on whether you want to predict in batch or in real time, there are two different functions you can use. Essentially, these are stored procedures that you invoke on HeatWave. To be more specific, these are some of the functions you can use; I'm not going to go through each one, but if you're familiar with SQL, it's just fitting values into the parameters: predict table, explain row, and so on. I'm going to talk about explain a little, because the explain function is very important for a machine learning model. Okay, I'm just going to run through a few illustrations to show you what happens in the back end. When you have the data loaded into InnoDB and you execute the ML train, the data is loaded into the HeatWave nodes in the cluster, the training happens on the HeatWave nodes, and once the model is trained, it's stored in the model catalog; then you can start using the model to predict by feeding it a new data set. When you run the predict table, it gets the model from the model catalog, loads it into the HeatWave nodes, and returns the result to your application. Same goes for the explanation: the model is loaded, and it explains why a prediction was made a certain way, telling you which significant features were used in making that prediction. And because this is SQL-based, you can reuse any notebook — Jupyter, Zeppelin — or any SQL tool to work with HeatWave machine learning. So I have an application developed here to show you how you can use this.
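The four calls the speaker outlines look roughly like this (a sketch only; the schema, table, and session-variable names are my own, and the signatures follow HeatWave's `sys` stored procedures as I understand them):

```sql
-- 1. Train: AutoML picks and tunes the algorithm
CALL sys.ML_TRAIN('bank.train_data', 'subscribed',
                  JSON_OBJECT('task', 'classification'), @model);

-- 2. Load the trained model from the model catalog into the cluster
CALL sys.ML_MODEL_LOAD(@model, NULL);

-- 3. Score: check accuracy against a labelled test set
CALL sys.ML_SCORE('bank.test_data', 'subscribed', @model,
                  'accuracy', @score);

-- 4. Predict: in batch over a whole table (ML_PREDICT_ROW is the
--    real-time, single-row counterpart)
CALL sys.ML_PREDICT_TABLE('bank.new_leads', @model, 'bank.predictions');
```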
So there's a bank marketing data set, and the model is trained to predict which customers are likely to subscribe to a new product. That gives the call center a good list of potential customers likely to sign up, so they can turn prospects into customers and get the maximum return out of the call center. The model has been trained, and all I need to do is load the model into the cluster; then I can show you the explanation capability in HeatWave. When you do a show-model explanation, it tells you, for this particular customer, why they would sign up for the product: for instance, the call duration with this customer. There are other parameters, such as the employment rate, the Euribor rate, and the number of years this person has been employed, that are important in making the prediction that this customer will sign up. With this information, the call center can pick and choose the customers most likely to subscribe. In terms of scoring, you can check the accuracy of the model, and there are many different types of metrics you can use to check it. And here you can run the prediction in batch: for example, I create a random table, run the model on that table, and it gives me a prediction of which customers are going to sign up. Okay, so putting this all together using actual SQL, you get an idea of how easy it is to use this in HeatWave. I'm just connecting to the HeatWave cluster — this is the same data set that I showed you, actually connected — and these are all the models in the model catalog. I'm just going to load the bank marketing model, and this is the prediction on the table: you specify the test data set, the model which is loaded in the HeatWave cluster, and the table that stores the result. That's how easy it is to use the model with SQL. Okay, so we also did a benchmark against Redshift.
You can see that, because of the in-memory cluster and the AutoML technology, we are way faster than Redshift ML, and in terms of cost, it's just 1% of the cost of running on Redshift ML. So with that, very quickly, I'll give you an overview of what you can do with HeatWave: besides being a very fast OLAP engine, you can also run machine learning on HeatWave. And if you want to try it out — because this is on the cloud — you can actually sign up on Oracle Cloud for a trial account; you get a few hundred US dollars of credit to spend within 30 days, whichever comes first, to try out HeatWave and machine learning. So with that, thanks for your time, and I'm here to answer questions if you have any. I see a bunch of questions.

Hi, cool technology. How much influence do I have on the architecture of the model itself? If I say I want multiple layers and skip layers and so on — do we have any more influence than just saying, okay, I want to classify?

So the question is how many different types of algorithms we support. Today we support classification, regression, and forecasting, and anomaly detection is coming. So yeah, the list of algorithm categories is pretty much those three currently.

You're talking about model training, but how about model serving? I see this is a very notebook-driven environment, which is great for prototyping, but for actually putting the model into production, how does that work?

Okay, so it's all SQL-driven. Once you have the model, it's stored in the model catalog. To consume the model, you load it into the HeatWave cluster, and in the application, it's just simple SQL to invoke the model for prediction: either a batch predict on a table, or a row-based prediction. It's all through SQL.
So in a PHP or Node.js application, you use a MySQL connector, connect to the HeatWave cluster, load the model, and then run the stored procedures for prediction.

Sorry, I have an architecture question regarding how this works. Are you doing the querying and the model training separately, or are you able to do some form of distributed training on a cluster?

Right. So HeatWave is a cluster system, so when you run ML train, the work is actually distributed to the cluster nodes, and training happens there. Because the data is distributed, you do distributed training. It's all done by HeatWave automatically.

Time for one last question. Will this be published as open source, or how much of it already is? Good question. This is not open source; this is only in the cloud. That's why you can try out HeatWave with a trial account.

Well, thank you very much for this wonderful talk and for answering all these great questions. Thank you. Next, we're going to talk about MySQL architecture and how to get the best performance you can out of MySQL.

Okay, so I'll talk about MySQL architecture and my experiences, because I work as a support engineer and I've been through a lot of cases where customers faced different kinds of challenges migrating from 5.7 to 8.0. When we moved from 5.6 to 5.7, I would say it was not a major change compared to moving from 5.7 to 8.0, because there are a lot of changes in 8.0, and even changes shipped in its minor releases can break things, I would say, if you do not pay attention. So in this talk, if you are using MySQL 8, or if you are planning to use MySQL 8, I'll talk about how you can upgrade, how you can optimize it, and a few ideas on how you can scale using the features available in MySQL 8.
Okay, so I'll start with the MySQL architecture and then focus on the areas that changed in MySQL 8, but let me go through the architecture overview first. When I say MySQL, InnoDB is the main storage engine, so I'll start with InnoDB. In the memory area, we have the buffer pool and the change buffer; these take care of your inserts, updates, whatever changes are happening — they happen here in the memory area. Then we have the adaptive hash index, used for in-memory hash indexing, and we have the log buffer, which buffers entries for the redo log and then writes them to the redo log on disk. At the disk level, we have the system tablespace, and here we have major changes compared to 5.7 that I'll discuss; in 8.0, the only thing left in it is the change buffer, if you don't choose to store your data and indexes in the system tablespace. By default, InnoDB has a setting called innodb_file_per_table, which is enabled, meaning that whenever you create a table, its data and indexes are stored in an individual tablespace. So if you have a million tables, you will have a million tablespaces, and it works fine. If you're coming from Oracle, then yeah, it sounds weird, but it works fine here. Then we have the doublewrite buffer, which was part of the system tablespace earlier but has now moved out. Then we have general tablespaces and file-per-table tablespaces; those were already in 5.7, no change there. But there are major changes when we talk about the doublewrite buffer, the undo logs, and the temporary tablespaces. These are the major changes in the architecture; these are the components of the InnoDB storage engine. So let's see what my data directory looks like if I install MySQL 8.
The first change you will see is the binary logs, which were not enabled by default in 5.7: in 8.0, binary logs are enabled by default. They are supposed to be on in production, but if you are using a replica, in some cases you don't need binary logs there. Then we have the doublewrite buffer files, which are created separately now; earlier they were part of the system tablespace. Why do we use these files? For crash recovery: for example, if MySQL crashes, then during recovery MySQL needs a good copy of each page for whatever data was committed, and that copy is kept in the doublewrite buffer as a safety mechanism. If you are using storage like Fusion-io with atomic writes, you can skip it, but by default these files will be generated. Then we have the new redo log directory, introduced in 8.0.30, which is very recent. Earlier we used to have two redo log files by default, but now we have a separate directory, and I'll talk about it more in the next slide. Then we have the new session temporary tablespaces: for example, if I am running a query, the optimizer may need to create temporary tables for some kinds of operations, sorting and things like that.
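The doublewrite setup above can be inspected like this (a sketch; the variable names are from the MySQL 8.0 manual, and disabling doublewrite is only safe on storage that guarantees atomic writes):

```sql
-- Doublewrite is enabled by default; since 8.0.20 its pages live in
-- dedicated #ib_*.dblwr files, optionally in their own directory:
SELECT @@innodb_doublewrite, @@innodb_doublewrite_dir;

-- To skip it on atomic-write storage, set this in my.cnf under [mysqld]:
--   innodb_doublewrite = OFF
```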
So those kinds of operations will be managed by these tablespaces. And the very interesting change is the data dictionary; again, I'll talk in the next slide about what it is, but it is now kept entirely out of the old file-based layout, and this is the major change, I would say, in 8.0. Then we have our undo logs, which are separate from the system tablespace. Again, in 5.7 the undo log lived in the system tablespace unless you explicitly created it separately, and once you started MySQL you could not change it; that was the condition in 5.7, meaning you could not move the undo logs. But in 8.0 they are separate by default. Let's talk about the data dictionary, because this is the major change. The situation in 5.7 was that your metadata was a mix: you had metadata files, MyISAM tables, and InnoDB tables. In 8.0 we have a transactional data dictionary to store the system tables, and that's the major change for the dictionary. Then for the undo logs there is a feature for automatic and manual truncation, and you can create and drop undo tablespaces, with some variables you can explore. For temporary tablespaces, earlier in 5.7 there was only one tablespace, ibtmp1, but in 8.0 there is a temp directory which creates temporary tablespaces like this for each session and for the optimizer. For the redo log, in 8.0.30 a new variable was introduced, innodb_redo_log_capacity, which supersedes innodb_log_files_in_group and innodb_log_file_size. Because of this you have the redo log directory, and inside that directory there are 32 files, some ordinary and some spare: ordinary means they are currently in use, and spare means they will be rotated in as the data fills up. Along with that, there are some new
features like redo log archiving, disabling redo logging, and redo log encryption. Disabling redo logging is very dangerous; you should not do it in production. The one use case I can give is, for example, if you are loading a MySQL dump: at that time you do not want those writes going through the redo log, so you can disable it and your data loading will be faster. That could be the use case. Binary logs, yes, these are enabled by default, and server_id now defaults to 1 if you do not set it for your server. So those are the default changes. The term "slave" changed in 8.0.26; they now say source and replica, and I think every organization is aligning with that. Replication was single-threaded by default until 8.0.26; from 8.0.27, replication creates four worker threads by default, so your replication will be multi-threaded. Why is this important? Because I have seen a customer who experienced a problem because of this setting; it was buggy behavior, but they were confused about why it suddenly happened, and then we found out that after they upgraded, their replication was multi-threaded by default. So it is very important to know. It is a positive change, and that particular bug is fixed, so I would say having multi-threaded replication is a positive. The tools we have in MySQL 8 are MySQL Shell and the MySQL upgrade checker. MySQL Shell is, if you are using InnoDB Cluster, part of that tooling, and you can also use it to query your database, so I would say it is kind of an all-in-one tool rather than using only the mysql client. So yes, you can use this.
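The undo, redo, and replication changes above can be sketched as follows (assuming MySQL 8.0.30+; the tablespace name undo_003 is made up for illustration, and disabling redo logging is strictly for throwaway bulk loads, as the speaker warns):

```sql
-- Undo tablespaces can now be created, deactivated, and dropped online:
CREATE UNDO TABLESPACE undo_003 ADD DATAFILE 'undo_003.ibu';
ALTER UNDO TABLESPACE undo_003 SET INACTIVE;  -- must be inactive before dropping
DROP UNDO TABLESPACE undo_003;

-- 8.0.30: one variable supersedes innodb_log_file_size and
-- innodb_log_files_in_group, and can be changed at runtime:
SET GLOBAL innodb_redo_log_capacity = 8589934592;  -- 8 GiB

-- Disable redo logging around a bulk data load, then re-enable it:
ALTER INSTANCE DISABLE INNODB REDO_LOG;
-- ... load the dump here ...
ALTER INSTANCE ENABLE INNODB REDO_LOG;

-- Multi-threaded replication: four applier threads by default on newer 8.0:
SELECT @@replica_parallel_workers;
```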
The MySQL upgrade checker: if you are upgrading MySQL and would like to see what needs changing in your database, for example if you are upgrading from 5.7 to 8.0, you would like to see whether you have any deprecated options or any character set issues. It performs 20-plus checks and gives you a summary, so before the upgrade you know what the problems are and what issues you should address before moving to 8.0. If you are using Percona Server, we have add-ons like these, which are completely free: audit log, thread pool, data-at-rest encryption with the keyring_vault plugin (you can use HashiCorp Vault with it), data masking to mask your data, pluggable authentication plugins, and some extended features, for example more verbose information in the slow query log. These are the kinds of features you get if you are using Percona Server for MySQL. The server-level features are the same; we are always compatible with upstream MySQL, so you can have your source as Percona Server and your replica as MySQL, no problem; it is fully compatible. Things you should look out for when you upgrade: read the documentation. This is very, very important. I know it is boring, but even if you are doing a minor version upgrade, please do it. Watch the default changes and dependencies in your environment, for example the character set. This is a problem I have seen many times: a customer has a latin1 character set in 5.7, they move to 8.0, and while querying they get different output because of a client and server character set mismatch. You should look at that, and you can catch this problem using the upgrade checker, because it checks that. There are a lot of other things too: removed and deprecated variables, and the internal temporary table storage engine, which earlier was MyISAM and is now InnoDB. So, as I said, use this utility; it comes with MySQL Shell. Also look for the changed terminology, and finally, once you have done these types of checks, do the
performance benchmarking, and you can use a monitoring tool to look at the time-series data, for example how much I/O a given setting was using, or the MySQL-related stats, so you can compare. Now, the important configurations, let's talk about them. There are 600-plus configuration options, so it's difficult to say exactly what you should configure, but there is a reason I divided this section-wise: so at least you know which direction to look in if you want to optimize. For example, use tablespaces to store data efficiently: you can size your undo tablespaces depending on the amount of undo operations in your environment, and you can use general tablespaces. For example, let's say you have 100 clients; rather than storing each table in an individual tablespace, you can create 100 tablespaces, one per client, and store all of that client's tables in it. That is the kind of thing you can do. Binary log location: rather than keeping it on the same mount point as the data directory, you can keep it on a separate mount point; that's how you get good I/O performance. Use encryption to secure your data, because all these tablespaces and files can be encrypted. And OS resources: for example, you can choose which malloc library you want to use, disable transparent huge pages, and set the correct NUMA policy and swappiness. This kind of configuration you should also do at the OS level, because when we have optimized the database-side configuration and are still facing a problem, then you should look at this side as well. On the InnoDB side, from the list of variables, I would like to focus on the important ones, which are innodb_flush_method and innodb_flush_log_at_trx_commit, because these variables can
trigger performance issues, and they can also solve performance problems. So those are some important variables you should look at. The third part of configuration is that whenever you face a problem, you should know how to find it, and for that MySQL comes with a very good enhancement in the performance schema; over time I have seen lots of improvements in the performance schema, and it is very useful for troubleshooting problems. Log verbosity: this is very important. In 5.7 you used to see errors, warnings, and also notes, the normal informational messages; in 8.0 you will see only warnings and errors, no notes, so if you want a verbose log you need to set that option explicitly. As debug and monitoring tips, you can use the slow query log and the audit log for troubleshooting. Now let's talk about the final thing: how you can scale, what scaling solutions you have in 8.0, and what you can do to scale your workload. Earlier we talked about vertical scaling, configuring and optimizing variables on the same machine, operating system configuration and all that; now, once you are done with vertical scaling, how do you scale horizontally? Let me talk about read scaling: you can use traditional replication, that is, source and replica; another good option is Group Replication, and on top of that you can use HAProxy or ProxySQL to load balance and route your write load. Then there is InnoDB Cluster, the complete package, which is a set of Group Replication, MySQL Shell, and MySQL Router. And finally PXC (Percona XtraDB Cluster), which you can again use with HAProxy and ProxySQL. For write scaling, I would say none of these is a pure solution; they are somewhat primitive in the sense that you cannot write
to multiple nodes at a time without seeing a lot of conflict issues. But there is a workaround, I would not call it a hack: you can do it using ProxySQL query rules. You can use ProxySQL and divert your load depending on which table and which host you would like to write to. Geographically distributed setups are also possible: for example, let's say we have a dating app; I am in Singapore, so I can have one server in Singapore and one server in, say, Europe, maybe in London. I can be fairly sure that local people will search locally, so I can have one server for writes there and one server here in Singapore, and that's how I can write in multiple places; it's more application-based, but that is also possible. Then there are a few pure sharding options, for example NDB Cluster, which can scale your writes through horizontal scaling, and there are other solutions, like Vitess, that provide write scaling. I added some references; I will share these slides so you can refer to them. And if you want to analyze performance once you test or benchmark, we have the tool called PMM; tomorrow there is a talk on that. Using it you get different kinds of graphs for MySQL (it also supports MongoDB, plus operating system monitoring), and you can use those graphs to analyze the performance of your database as you change various options. So that's all, thank you, and now questions for me. The first question here, a general question: is Percona now just providing patches to upstream MySQL, or do you still do your own engine as well? Good question. We are fully compatible with upstream, so whatever bug any community user reports, either we fix it or upstream fixes it, and we offer them the patches; it is up to them whether they accept that
or not, but I have seen that a lot of the time they accept it, and we exchange thoughts; it is fully compatible. Are there other questions? Otherwise I have one. Okay, I'll ask my question and then bring the mic over to you. Could you describe for me, you mentioned tuning huge pages: how does MySQL use huge pages, and how does this affect performance? I don't know that much about it, but usually the recommendation is to disable them. I have some good reference blogs that I can share, and I can discuss it with you, but that's the thing. Hi, is the replication synchronous? Okay, so there is asynchronous and there is virtually synchronous; it is not fully synchronous. That means, for example, if I have two nodes, one node will write locally and ask the other node, "is it okay if I commit?" The other node says yes, it's okay, but it will actually write that data maybe a second later or so. So it will write for sure, but it is virtually synchronous, not at the same time. Next question. Okay, great talk, I enjoyed it. I think we all did. And I would love, I mean, I just shared what I know, but if you know of any use cases you would like to discuss, please let me know; I would be happy to talk. Okay, one second. So next we have Han Yu Chung; he's going to tell us about his roadmap towards building an independent open source distributed database, from the OceanBase team. Thank you very much. So today we're actually talking about our open source distributed database platform, but before we start that story, we really want to look at why we decided to develop this database. In fact, necessity is the mother of all inventions. We were looking at explosive data growth: when we say explosive data growth, you're looking at e-commerce and social; in the next two years we're going to reach about 132 zettabytes of data. And the other thing is, if you are an e-commerce website, you're
running something like Taobao, Lazada, Shopify, or Shopee, and one of the challenges is that a massive amount of data volume comes in from many different directions: product data, metadata, customer feedback, and then you've also got to handle the transactions. Nowadays, if you use a platform like Lazada, you realize that every month or so they come up with some promotional offers, and of course the most popular one is Double 11 (11.11). In fact, the origin of this database, when the company finally decided to say okay, enough is enough, we need to find some way to do this, was actually Double 11, twelve years ago, when one of the commercial databases we were using buckled under the load and caused a lot of pain. We wanted a stronger capability to handle the kind of data growth we had. Traditionally we call it the monolithic database, but it's just how databases were developed: when we started to have the concept of a database, we started with mainframes, and then the mid-sized systems like the AS/400 and so on, and then we moved to the PC, machines more like commodity hardware running Linux. Linux was really the start of the revolution, and that's why I'm very happy to be at an open source event, because that's where I started my IT journey; I actually started with Red Hat Linux version 1. In that era, the database model was "I want to find some way to store data", and it wasn't that much data we wanted to store, because the computers were not that capable; in terms of memory and storage, we're talking about 40 or 80 megabyte hard disks, which is what I remember. That was a really long time ago. From that perspective, the traditional solution is that I just get bigger, more performant systems, but eventually you hit roadblocks where bigger is not always better. And then you have another
solution: I used to work for Oracle as well, and Oracle says, okay, I can make it scale, but in some sense it's still monolithic, because only the compute is separated. What Oracle Real Application Clusters (RAC) means is that you can have a cluster of systems, but your storage is still shared; your session information and your memory state are still shared, so you still have some limitations. Of course you have machines like Exadata, which is very capable and can scale to handle petabytes of data, but it's a really expensive, customized hardware solution. And then you have sharding: a very good example of sharding would be the likes of DynamoDB from AWS (I used to work for them too). It's a kind of sharding where you take a hash key and specifically use algorithms to shard your data across a number of systems, and you can split and partition them as much as you want, but in order to do that there's a bit of complexity that you have to manage yourself. From that perspective, we think of this more as a distributed requirement: eventually you need to distribute your data, you need to expand the way you store your data. And not only that: we're still doing partitioning, sharding of data, and so on, but you want to automate it, and you want the ability to do more than read/write on one particular machine. Some of the systems in the market can do scaling, horizontal scaling; take something like Aurora, for example: you can scale it horizontally, it has a shared storage system, but there's only one read-write instance, so you have one massive read-writer and a lot of readers you can use to scale your read-only requirements to quite a size. But for the transactions that you're writing into the database, you still
have one single instance doing it. Of course there are other systems; you start to have something like Google Spanner, which is a very similar concept where you can spread the data across multiple systems automatically and expand into different regions as well. Sorry about that. Okay, so the pain point for us was performance: we really wanted to scale, and the challenge is that because we wanted to do the scaling, we needed to be distributed, we needed to scale very fast, and on commodity hardware. Another one is that customers eventually say, and this is the chicken-and-egg issue all the time, "I want to do transactional", and then suddenly they say, "we also want to do analytics". At the moment you do that in two places, but for us we try to make it so that on the same instance of schema and data tables you can do both. Now I have to go faster. Very quickly, on our evolution: we started about 12 years ago with Taobao, where we decided to build this self-developed distributed architecture; initially it supported Taobao, and eventually we also supported Alipay, so now if you use Alipay or Taobao, the database back end is actually OceanBase. Looking further forward, our financial partners said, you know, this is great, you can do massive distributed financial transactions, we really want to use it in our own business as well, so then the banks, the brokers, and even insurance companies started to use us. Moving forward, there are more and more commercial partners, and overseas we now have payment partners such as GCash and Dana using our database platform. Very quickly, the way it works is that it's a distributed architecture, so every time you deploy an OceanBase database, even for a
Community Edition deployment, we can now do it in one instance, but basically you need three nodes in a cluster. We can have separate tenants, and each tenant is an individual database; you can decide for each of these tenants which node will be the leader. The leader is the one that handles the read and write requirements: when you want to write to the database, when you transact, you always hit the leader, but if you don't have strong consistency requirements, you can always ask the system to direct reads to the followers, and you can decide how you want to run it. Another really unique thing for us is that we are able to do real-time, adaptive compression as well: when you're doing transactions and storing data, the database looks at what kind of data type it is and applies a different kind of encoding or compression. So it's very common that my customers say that with MySQL they have a 10 terabyte database, and when they go to OceanBase it's only 3 terabytes, because it's fully compressed. One of the reasons we decided to go open source is that this is really where the world is heading: more and more open source databases are getting adoption, and open source databases have proven to be a model where the more the community uses the database, the more stable and capable the platform becomes. The best examples for me on the open source side are Postgres and MySQL: Postgres is now at version 15, 16, and MySQL is now on 8.0, because the community is pretty strong, increasing the usage and also the ability of the community to point out where to improve the database. We want that capability as well, so we are at the start of a journey as part of the open source
community. We started our project in 2021, and what we wanted to do here is adopt the best practices of an open source foundation; we are taking elements from, for example, the Apache Foundation: we are going to have a technical oversight committee, a development group, and a user group. At the moment there is a really strong open source community for OceanBase in China, but we are also expanding overseas. In terms of where we are as an open source platform: we started with 3.1, where we open-sourced 3 million lines of code and put it on GitHub; knock yourselves out, the source code is there, you can compile it, run it, play with it. Then slowly we are opening up the ecosystem surrounding the database as well. Just to highlight some of it: there's data migration, which we call OceanBase Migration Service; there's also the OceanBase management platform, which is unique in the sense that I have never seen any other management platform come with an open source database platform; there's the IDE, and a few of the other tools we are open-sourcing as well. Most importantly, we are also integrating with the open source ecosystem: for some of the open source ETL tools, we are contributing code into them, and for an IDE you can also use DBeaver; if you download the latest version of DBeaver, you will find OceanBase inside. So that's the ecosystem. In later versions of version 3 we also looked at how to handle slow SQL and top SQL monitoring, and basically the ease of use of the open source platform. Version 4.1 is really exciting: I talked about initially needing three different nodes in one cluster, and our open source users were saying it's a bit difficult to have that much resource for initial testing and development, so we re-architected the platform
so that it can run on a single instance of compute; it's still a distributed architecture, it still retains our design, but we are able to run it on a single piece of hardware now. Another thing we want to do next is to eventually also take on analytical performance requirements, where there are platforms such as Redshift that are really strong on columnar analytics; we are going to put that into our platform as well. Most important is that we continue to improve our capability to automatically distribute the transactions and the data query requirements towards the nodes themselves, so that you don't have to do it yourself. In essence, what we want to do is really share, really go for standardization; we want to be like the MySQL of the distributed database world. There's no such thing at the moment; there are still quite a lot of platforms available on the distributed relational database side, but I think we are kind of unique that way. We also want to scale our user base, and of course in order to get adoption we are MySQL compatible: we are fully compatible with MySQL 5.7, and we are getting more and more compatibility with 8.0 as well, so by the end of the year we expect to have full compatibility. That is our GitHub, and most important is that this is not a new database platform; this is a platform that has been in production for quite a while now. To start with OceanBase Community Edition 4.1, here is the documentation; you can come in and take a look at what OceanBase is about, what we are doing in the open source space, and our Community Edition. We apologize that at the moment it's still in Chinese; however, if you go here and find the Community Edition, you can download the x86 version. The ARM version is currently validated mainly for China, but I'm going to ask them to also validate ARM for the international
market as well, for example AWS Graviton; we want to validate that. So I'm going to do this, I think, okay, great, so thanks for that; I'm just going to go back here so that you understand the ease of use, and in case you have any questions so far. What consistency guarantees does it provide? Ah, consistency guarantees: by default it's eventual consistency for follower reads. There's a leader and two followers, and we use Paxos to ensure that the data is highly available and reliably replicated, but there are two modes. One is eventual, so that you can use the followers for reading; but if you want strong consistency you can define that: either you go directly to the leader, which of course gives you strong consistency, or you actually delay the reads a little bit to make sure everything is consistent. I have a related question. Suppose I want to use this for financial transactions and I need strong consistency; how consistent are the backups? If I take a backup of the whole distributed cluster, is there the possibility that different parts of the cluster will be at effectively different points in time when they get backed up, or do I have a completely guaranteed stable snapshot of all nodes? It's going to be a completely stable snapshot, because we only take the snapshot of completely stable and consistent data as a backup. Okay, that's cool. More questions? Yes, I just want to ask how suitable it is for something like a CRM, a customer relationship platform. We can definitely power a CRM; are you asking how easy it is to run a CRM platform on it? Oh, very easy: if you can run it on MySQL, you can run it on OceanBase, including distributed financial transactions, and it's not very common for a relational database to be distributed, scalable, and handle financial transactions. Yeah, well, for financial transactions I would commonly use Postgres, because
it's pretty good, or I would use Oracle for that. But those are monolithic databases; they are not designed to be distributed. Next question. Well, you can follow me, or download it and give it a try. Oh, there's one more question. I have a question: I think I read somewhere that you have OceanBase, and then there's also PolarDB from Alibaba, and I think there's also an in-memory something, I forgot the name; what are the differences between them? So first off, we are part of Ant Group; we are kind of a sister company of Alibaba, and Alibaba itself also has its own database technology. Think of it like AWS: AWS also has its own managed database services. On the commercial side we do sell our services on Alibaba Cloud, and we also sell on AWS, so we are multi-cloud. The difference is that all these different services, whether it's PolarDB or Aurora, are always specific to a particular cloud: you can only use them on that cloud, you can't really use them anywhere else. We are different: we want to be a standard for distributed databases, so you can use us anywhere you want. We are more like co-packagers, if you will, not really competitors, but at the same time we are, because we are competing in the same space. I have one more question and then I'll hand it over here. Let's talk about rebalancing data: with some distributed systems, things tend to be a little static; if you're writing a lot of data, it may almost all end up going to one node for one reason or another. How are you able to rebalance things if one node becomes a lot more full than another? Could you explain that a little bit? So it depends on how you want the database to behave. If you let the database do the automatic partitioning and balancing, the database will constantly look at the different nodes and ask, am I hitting one of the nodes really
hard? Because sometimes you get a hot table problem, where one particular node, because of a kind of transaction or a region, let's say Orchard in Singapore is suddenly really hot, gets really heavy usage. You can manually say, okay, for that node I want to split the data, or it can be done automatically by the system: the system monitors it and says, ah, there's a hot table issue here, we're going to split the data. That can be done by the database, and that's why we don't have that problem, whereas if you use something like sharding, if you use something like DynamoDB, that's exactly the problem with DynamoDB: if you hit a table really hard, that partition dies, which is a common problem. The next question is around edge computing: OceanBase and edge computing, do you have any use cases from your experience? Actually, most of my customers want it in-country: if you go to Malaysia or Indonesia, they don't even want to talk about edge computing; you need to be there, either in the data center or in a public cloud that's in the country. And if I have customers that say they need to deploy on edge services, and I'm talking to edge computing service providers as well, you can deploy the database at the edge; it's not an issue at all. Do we have more? I love the question.
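For the manual split just mentioned, OceanBase's MySQL mode aims to accept MySQL-compatible partition DDL, so splitting a hot range partition could look roughly like the standard MySQL syntax below; the table and partition names are hypothetical, and the exact supported form should be checked against the OceanBase manual:

```sql
-- Hypothetical: split the hot partition p_hot into two smaller ranges.
ALTER TABLE orders
  REORGANIZE PARTITION p_hot INTO (
    PARTITION p_hot_a VALUES LESS THAN (500000),
    PARTITION p_hot_b VALUES LESS THAN (1000000)
  );
```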
I'm sorry if it sounds like two questions squeezed into one, but first of all, may I ask about the requirements for latencies between different availability zones, and do you support multi-regional installations? When I say region, I mean one in Singapore, one in Japan, one in Europe, and so on, different continents. Personally I think that's an impossible task, that's one. Second, we've been trying to do that: we've been trying to run across different places like Beijing, Shanghai and Guangzhou, which is pretty far, anything that's on fibre and under 50 ms; the latency must be under 50 ms. I know where you're going with that. A lot of these distributed systems rely on synchronization; in fact our clocks must be fully synchronized. Even Spanner needs to do full synchronization, and they have one of the most accurate clocks on earth. We have the same requirement, because we are a distributed system. We have time for one more question. I'm looking at the licensing; did you mention the licensing? Because this is Mulan 2.0, with Chinese versions. Great question, I was expecting that one. I asked the same question myself, because I started three months ago. The Mulan license is very similar to GPL version 3: it's an open license, you can use it, and your own code needs to be open as well, it's as simple as that. And you can include the Mulan license in your own licensing as well; it's kind of like what Spark is doing. If you look at Spark's licensing, it says, I'm licensed under Apache License version 2, but these are all the open-source licenses that I include as well. Mulan can be included that way too, so no issues. Thank you very much for this wonderful talk. We have, I think, a brief break, and talks will resume in about 15 minutes, thanks. All right, so for our next presentation we have Pandey Krishnan, and he's going to explain the improvements that have happened in MariaDB around ALTER TABLE, and how this has been improved in relation to replication, if I
understand correctly, for MariaDB Corporation. So, in the legacy ALTER, this is how ALTERs are currently handled: whenever we execute an ALTER on a master, or source, it is executed fully on the master first. Only once the execution has completed is it written to the binlogs; it then flows to the relay logs and is applied on the slave, executed again through the slave SQL thread. During that period the replication SQL thread blocks replication, so we face replication lag and delay. When you have multiple slaves, multiple replicas for your master, you will be facing replication lag on most of your replicas; you don't get real-time data. To avoid this, today we mostly use tools like pt-online-schema-change from Percona, gh-ost from GitHub, and Facebook's own tool; there are multiple tools available in the community. So, as always, MariaDB comes up with an innovative method to handle this and reduce the replication lag. How is MariaDB trying to solve it? Whenever we start executing an ALTER TABLE on the source, it is divided into two binlog events, two steps, much like how we handle XA transactions in replication, with XA PREPARE and then XA COMMIT or ROLLBACK. The same method is used here: every ALTER has two events in the binlog, a START ALTER and then a COMMIT or ROLLBACK ALTER. When the ALTER starts executing on the master, a START ALTER event is immediately written to the binary log and sent to the slave. The requirement on the replica side is that you need parallel replication, so that once the event reaches the replica, immediately there is one more
SQL thread that starts executing the START ALTER command, so it doesn't block replication during that time. Once the ALTER has completed on the master side, a COMMIT ALTER event is logged in the binlog and sent to the replica. As soon as the COMMIT ALTER is received on the replica, it is applied and the table structure changes on the replica; until you receive the commit, you cannot see the structural change there. If your ALTER fails, you get a ROLLBACK event in the binlog and the ALTER command is rolled back on the replica. In the next few slides I am going to show the implementation detail, how this was implemented in MariaDB. For this we have a variable called binlog_alter_two_phase, introduced from MariaDB 10.8.1 onwards. On the master end we just need to set this variable before running the ALTER. Once we set it and execute the ALTER, nothing changes on the master itself; it is only in the binlog that we can see the START ALTER and COMMIT ALTER events. For example, here is an event for a successful ALTER and commit in MariaDB: the binlog immediately gets a START ALTER event, which is sent to the replica, and on the replica the ALTER starts executing without blocking the SQL threads. For those not familiar with the GTID implementation in MariaDB, it is a little different: we use a domain ID, a server ID and a sequence number, the three values you are seeing in GTID 1-1000-784, where 1 is the domain ID, 1000 is the server ID and 784 is the sequence. For the START ALTER you get an event with the GTID plus START ALTER in the binlogs, and the real ALTER command is logged in the binary log when you get the COMMIT ALTER.
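As a rough sketch of what was just described (the table, column and binlog file names here are invented for illustration), enabling the two-phase behaviour on the source looks like this:

```sql
-- Requires MariaDB 10.8.1+. With this set, the ALTER is written to the
-- binlog as two events: START ALTER when it begins executing, and
-- COMMIT ALTER (or ROLLBACK ALTER) when it finishes.
SET SESSION binlog_alter_two_phase = ON;

ALTER TABLE orders ADD COLUMN note VARCHAR(255);

-- Inspect the resulting pair of events (file name is a placeholder):
SHOW BINLOG EVENTS IN 'mysqld-bin.000001';
```

On a replica with parallel replication enabled, the START ALTER event begins executing as soon as it arrives, instead of waiting for the source to finish first.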
In the same binary log you get a new GTID sequence, and you can also see the ID of the sequence it applies to. For example, in this case the GTID sequence above is 784, so in the COMMIT ALTER you can see commit alter id=784: this commit is for the event that was executed as 784. Only once the commit event is executed from the binary log can you see the changed table structure on the replicas. What happens if, while you are executing the ALTER command on the master, you cancel it or it fails? In that case there is a ROLLBACK event. You get a START ALTER command, which starts the ALTER on the slave, and once the master signals that it failed or was cancelled for some reason, a ROLLBACK event is written to the master's binary log; that ROLLBACK event is sent to the slave and the ALTER is rolled back there. All of these steps happen in the background, without the user needing to know. So these are the implementation steps for lag-free ALTERs in MariaDB, starting with 10.8. It is still in development, so improvements and suggestions from the community are always welcome. And then, as Daniel already mentioned, from 10.8 onwards the mariadb-binlog utility is aware of GTIDs. The start-position and stop-position options were already available in the utility, but previously we could only pass log positions; now you can pass GTIDs as start and stop positions as well. The utility is also more aware of GTID strict mode: when you are applying your binary logs on another server you can still enable GTID strict mode, and if you use the verbose option you get a warning for out-of-order transactions.
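For example (the binlog file name and GTIDs below are invented, this obviously needs a real binlog to run against, and the option names are as I recall them from the 10.8 documentation):

```shell
# Dump only the events between two MariaDB GTIDs; in 10.8+ the
# --start-position / --stop-position options accept GTIDs as well
# as byte offsets.
mariadb-binlog --start-position=1-1000-780 \
               --stop-position=1-1000-784 \
               --gtid-strict-mode -vv \
               mysqld-bin.000001
```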
Out-of-order transactions are what we usually encounter when some writes have happened on the replica; with strict mode enabled, replication usually breaks with that out-of-order-sequence error. Then there are a few features of MariaDB compared with MySQL, what makes MariaDB different. We have an audit plugin, and the system-versioned tables that Daniel already showed, and we have a compatibility mode as well, so you can create your stored procedures, functions and triggers with Oracle-like syntax. Then there is the Spider storage engine, which is very useful when you are looking at sharding: you can shard your tables using Spider. Sequences are also one of the options available in MariaDB and not in MySQL. And then hot backup with Mariabackup, which is a fork of Percona XtraBackup. These are a few of the additional features available in MariaDB compared to MySQL. That's it from me, thank you. So, any questions? A couple of them... sorry, okay. Hi, can you elaborate a little bit more on the Spider storage engine? For example, how do you partition: is it based on hash, or on some other method? Yeah, you can choose your partitioning method; it's custom. So I have a quick question: with this two-phase ALTER TABLE, it seems like you're moving away from the MySQL and Oracle atomic DDL. Does this mean there's an effort to support transactional DDL, the case where you could maybe have multiple DDLs, and even DMLs, that could get rolled back together? Or do you see that never happening? It's just a question of your opinion, I understand.
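To illustrate one of the features just listed, MariaDB's native sequences (object names invented for illustration) take only a couple of lines:

```sql
-- MariaDB 10.3+ has SQL-standard sequence objects; MySQL does not.
CREATE SEQUENCE order_seq START WITH 1 INCREMENT BY 1;

SELECT NEXT VALUE FOR order_seq;   -- returns 1
SELECT NEXT VALUE FOR order_seq;   -- returns 2

-- A sequence can also back an auto-numbered column:
CREATE TABLE orders (
  id   BIGINT DEFAULT (NEXT VALUE FOR order_seq),
  item VARCHAR(50)
);
```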
Transactional DDLs? No, not for now; I think even in our community there is one MDEV for it. Yeah, I think it's a little bit too far off. You know, this is a start, to make things smoother, but it's a big effort; you know how it's implemented in Postgres and that kind of thing, in the tuples, but yeah, it's a big effort. This is the beginning for transactional DDLs. My question: if you start an ALTER transaction on the master in this two-phase thing, and the master just totally gets killed and you restart it, does that mean the slave thread continues on, just waiting for a commit that never happens? Yeah, that is the current behaviour. So you just need to manually kill off the thread on the slave? So it will dangerously wait for the commit. Next question? Is there any more? So I have one more question that I've written down. I noticed in your diagram that before this change, you'd run the ALTER TABLE, it would run on the source, then it would run on the replica, and it would block replication. Does that mean that ALTER TABLEs are logically replicated, that you don't have some sort of physical replication of them? Oh, it's sort of a statement. Okay. Yeah, it is a statement; the DDLs are replicated statement-based. Okay, I was just curious. Any other questions? Is there a way to pause and resume the ALTER operation? No. Any other questions? Well, in that case, I want to thank you very much, an interesting presentation, I enjoyed it very much. The powerful thing today is cloud-native apps, together with Kubernetes and with MySQL. So today we are going to go through how we run cloud-native apps with MySQL, running MySQL InnoDB Cluster on Kubernetes.
The first topic is a quick overview, and the second topic is how we deploy MySQL and MySQL InnoDB Cluster on Kubernetes. We will show demos of how to create an InnoDB Cluster, how to create and deploy it using a Helm chart or a YAML manifest, and, working closely with that, how to use ExternalName services with MySQL, with Grafana as the example application. First, the cloud-native overview. What people do today is break the system down into pieces that work closely together. So this is microservices: small, small pieces working closely together, communicating over the wire with protocols like REST. In the old days people used to do SOA or other corporate architectures, but today people talk about managing the system in pieces. With microservices, each service owns its own subsystem, and where in the old days there was one big centralized database that everything connected to, today people start to have a small-footprint database per service, and just scale that. As for the characteristics of these applications on the cloud: people run them against other services on the cloud, things like AI services, machine learning, other intelligent services, event queues and orchestration. For deployment, easy, continuous deployment is the favoured approach, and all these microservices are loosely coupled; API-based services are what make them useful. So on the cloud, how do we manage all of that? Here is what we are doing.
On Oracle Cloud, that is, Oracle Cloud Infrastructure, OCI, we have a fully managed MySQL service, which we call MySQL Database Service, MDS. It is very easy to use: you can bring up MySQL in different shapes, depending on what you need in CPU and memory, and it's handy because things like backups and security are taken care of. The most important thing here is the extension we have, HeatWave, which does acceleration: high performance for big data. What it does is provision a set of cluster machines at the back, so the data is not just stored in the MySQL DB at the front end, it is also pushed to the back-end cluster. That gives you an in-memory columnar engine, and it is very, very fast. And there is one more thing: it now runs machine learning. When we talk about machine learning, you train the data into a model, and where does the data live? In the database, right. So it is in-database ML; we provide these capabilities all in one. As one set of solutions, MDS provides OLTP, OLAP, as well as machine-learning capability, and you create your apps on top. For today, though, we will talk more about deployments using containers, on Kubernetes. One possibility is standalone MySQL running as a server using the Docker image; that's simple. And there is also MySQL InnoDB Cluster, which can be deployed using the MySQL Operator for Kubernetes. So, what we are going to show you is the operator, and how we use it. First of all, the basic MySQL InnoDB Cluster is nothing new; what we do here is deploy it on Kubernetes, where each instance runs in its own container, managed and balanced by the operator. The three StatefulSet components, down at the bottom, are mysql-0, mysql-1 and mysql-2.
A three-node cluster can be deployed like this, and typically the servers can be three and the routers can be one. If you do all of this on your own, you have to manage each of the components yourself, as well as how apps connect to the routers. With the MySQL Operator it is basically one single command: you deploy the whole set, the three nodes, the routers and the connectivity, and it manages nodes going up and down, switching over to another node, bringing the data back, all automatically. That's the operator. So how do we do this? Basically, the documentation covers it, and you can do it with the open-source MySQL Operator. For the manifest installation, we apply the CRDs, the custom resource definitions, which describe the InnoDBCluster resources, and then we deploy the MySQL Operator itself, which runs as an app that facilitates these operations. To create the cluster, it's as simple as creating a YAML manifest, or using Helm. Here, as a manifest, we create a YAML file, and that YAML creates the cluster with its name and settings. People may also like to use Helm; that's also possible with the MySQL Operator. There is a repository; you install the operator as the engine that handles the background jobs and coordination, and then install the MySQL InnoDB Cluster through Helm. And the red text here, tlsUseSelfSigned set to true, is an extra possibility people may miss, so I put it down here: when we create the cluster we can set this, because we might not want to provide a certificate. Otherwise there are some more steps to provide the certificate in order to create the cluster.
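A minimal InnoDBCluster manifest along the lines described (the names and the secret are placeholders; the fields follow the operator's mysql.oracle.com/v2 API as documented) might look like:

```yaml
apiVersion: mysql.oracle.com/v2
kind: InnoDBCluster
metadata:
  name: mycluster
  namespace: myic01
spec:
  secretName: mypwds      # Secret containing rootUser / rootPassword
  instances: 3            # three MySQL server pods (a StatefulSet)
  router:
    instances: 1          # one MySQL Router pod in front
  tlsUseSelfSigned: true  # the "red text": skip providing certificates
```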
So, as simply as this, just set it to true. Now let's look at the demo, where we show how to create the InnoDB Cluster using the MySQL Operator. We are going to deploy using the latest version, 8.0.32, of the operator. I have already added the repository, so I just search it to show what the MySQL Operator charts provide: there are two charts in there, one for the operator and one for the InnoDB cluster. First, we install the operator engine: we helm-install the operator, give it the name mysql-operator, and create a namespace. Just done. We can look at what has been deployed: we check the mysql-operator deployment and see whether it is up and running. It is. Then, from the GitHub website, we can copy the commands, and that is how we do the Helm installation for the InnoDB cluster. It is really simple: we install with a name, using the mysql-innodbcluster chart, in a namespace of its own, myic01, with a user and a password (super secret, and it can log in from anywhere), three instances for the servers, one router instance, and, as mentioned earlier, self-signed TLS. Once we have done this, the servers are created under the myic01 namespace. You can see how much this simplifies: you do not need to create a specific server one, server two, server three. With this one single command we create all these servers together, as well as the routers, and then we can look up the services and IP addresses to see how it is all done. You can see it is still provisioning; it is at stage 2 of 2, which is the initialization of the DB.
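The Helm steps just described look roughly like this (release names, namespace and password are the demo's placeholders; the chart and value names follow the MySQL Operator's published Helm charts):

```shell
# Install the operator itself into its own namespace.
helm repo add mysql-operator https://mysql.github.io/mysql-operator/
helm install mysql-operator mysql-operator/mysql-operator \
     --namespace mysql-operator --create-namespace

# Create a three-server, one-router InnoDB Cluster with self-signed TLS.
helm install myic01 mysql-operator/mysql-innodbcluster \
     --namespace myic01 --create-namespace \
     --set credentials.root.user=root \
     --set credentials.root.password='SuperSecret' \
     --set credentials.root.host='%' \
     --set serverInstances=3 \
     --set routerInstances=1 \
     --set tls.useSelfSigned=true
```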
After that, in around two minutes, you can see that we have three servers up and running; it was all created entirely automatically, and you can also see the services that were created, with the routers finished as well. There are four ports running: the three server ports and the router ports. What we have here is the new deployment, the InnoDB cluster, with three MySQL pods running and the routers up and running. Then we bring up another pod running MySQL Shell, connect to the server, and look at what is happening: we authenticate with the secret for the cluster and look up the cluster status. They are all up and running, on the latest version, 8.0.32, and the name is right there, mycluster. Now let's look at the routers, because that is what we are talking about: how applications reach this deployment. For that we will use Grafana, quite a popular, famous dashboard. People may like to deploy it with manifests or Helm charts; we will use manifests this time, and then add the data source, and the data source is MySQL. But one thing that is important: the application runs in a separate namespace, so it may not know the cluster's IP address. Should we just use the IP address, or use a more independent name, what Kubernetes calls an ExternalName? So here is the deployment of Grafana using the YAML files. If we look at the YAML file, it has the Grafana image and port number 3000, we may configure some resource requests for CPU and memory, and we also create a LoadBalancer service, so that I can log in from an external IP address through the Internet and access it via the load balancer.
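The Grafana manifest being described might be sketched like this (image tag and resource numbers are invented; port 3000 is Grafana's default):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: grafana
  namespace: grafana
spec:
  replicas: 1
  selector:
    matchLabels: {app: grafana}
  template:
    metadata:
      labels: {app: grafana}
    spec:
      containers:
      - name: grafana
        image: grafana/grafana:9.5.1   # tag chosen for illustration
        ports:
        - containerPort: 3000
        resources:
          requests: {cpu: 250m, memory: 256Mi}
---
apiVersion: v1
kind: Service
metadata:
  name: grafana
  namespace: grafana
spec:
  type: LoadBalancer          # gives Grafana an external IP
  selector: {app: grafana}
  ports:
  - port: 3000
    targetPort: 3000
```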
So what actually happens: we just create a namespace for Grafana and apply the YAML file within that namespace. You can see I just apply this file, and the YAML creates the different levels of services, including Grafana itself as well as the LoadBalancer service. Once that is done we can look at the container; it is still provisioning, and we can check the resources until it is up and running. Once we get there, the question is how we connect to the server: we look up the external IP address of the load balancer and just log in. (Sometimes the session times out and the external IP address changes, but anyway, here we can just use the IP address.) You can see it working: the data source we have just created, and then we can test the connection. But here's the thing: we do not want to use the IP address, we want a name, something like mycluster, as the service to connect to; yet we are in two separate namespaces, so how can we do this? What we want is to keep them isolated and loosely coupled by using a Service of type ExternalName, and have connections to that name resolve to the service in the myic01 namespace. So we define this for the cluster, in the Grafana namespace. With that we have a name we can use: the name is mycluster, and mycluster points to the MySQL Routers, meaning when there is a failover and the primary switches up and down, it will still point to the right server. That works. You can then hook up queries against a specific schema, performance_schema, specific tables regarding memory usage, that kind of thing.
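The loose coupling just described can be written as a small Service of type ExternalName in the application's namespace (names assume the demo's mycluster router service in myic01):

```yaml
# In the grafana namespace, "mycluster" resolves through cluster DNS
# to the MySQL Router service living in the myic01 namespace.
apiVersion: v1
kind: Service
metadata:
  name: mycluster
  namespace: grafana
spec:
  type: ExternalName
  externalName: mycluster.myic01.svc.cluster.local
```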
Just as an example, we add two columns, allocations and how much memory is freed, for specific events as well. With this we can see a report and it is getting data. Then we can create another panel that selects the version, so the dashboard picks up the server name, host name, port and version. Once we have this, it is pointing at mycluster as the server name. So what we do now is change the primary instance of the cluster: we use the shell to connect and set the primary instance to another server, say from mycluster-0 to mycluster-1, just as an example. Once we do this, the primary switches, and if we refresh the page, the dashboard shows the server has changed: because it connects through mycluster, the router, it always connects to the right server, the primary. We can change it another time, and from the application's point of view it is seamless; the switch just disappears into automatic routing. The last piece I want to show is using phpMyAdmin. Here is a quick demo where we deploy phpMyAdmin against the MySQL InnoDB cluster. Last time, Grafana was deployed using YAML; this time, as a service, we will use a Helm chart instead. We take phpMyAdmin from the Bitnami repository, and in the Helm chart values we just set the db host to mycluster; again, mycluster is the ExternalName. So basically we provision using the Helm chart values, and I also put an edit into the Helm values to provide the certificate for the connections, so that phpMyAdmin connects to the server directly over SSL. I also enable the load balancer, so that browsers from my machine can connect to it, and we can see the IP address.
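That phpMyAdmin step can be sketched as follows (the value keys follow the Bitnami chart's db.* and service.* parameters; the host name assumes the ExternalName Service set up for the cluster):

```shell
helm repo add bitnami https://charts.bitnami.com/bitnami

# Point phpMyAdmin at the ExternalName "mycluster" and expose it.
helm install phpmyadmin bitnami/phpmyadmin \
     --namespace phpmyadmin --create-namespace \
     --set db.host=mycluster \
     --set db.port=3306 \
     --set service.type=LoadBalancer
```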
Finally we hook up to the servers. Now we have the service up and running, so here is the IP address from the load balancer, 1.2.9-something, and we just copy and paste it onto the page. We find it is working, but the thing is, we haven't put in the ExternalName yet, which is why the connection fails at first. We go back, and here we need to provide the extra cross-namespace reference to the mycluster service: an ExternalName in the phpMyAdmin namespace that resolves to the servers. Once all of that is done, we can run phpMyAdmin again; it is just easy and handy. It is all created; we log in again, and done, over an SSL connection. I hope this was helpful; let's see if anybody has any questions for us. So for our final presentation today we have Colin Charles, who is going to talk about deploying Galera clusters, and various aspects of his experience with this. Go ahead and take it away. Last talk, we're just waiting... we should all go for that. So yeah, I do stuff with Codership, do stuff elsewhere, and I've done a lot of MySQL for a long time. So what is Galera, really? It is virtually synchronous replication; some of you managed to stick around for the earlier talk about group replication. It is also high availability with consistent data across all nodes, so copies of your data across your three different nodes, for example, with no single point of failure. I think it was the OceanBase talk that said you could reduce the size by a third even though it is distributed, and there is a reason why they do that; whereas we follow REPEATABLE READ, very consistent, three copies of the data, very much like NDB Cluster does. Quorum-based failure handling: again, you need three nodes, because if one node fails you still have two-thirds. Commits use optimistic concurrency control, so whatever is written to one node is shipped and certified before
the OK is sent back to the client. It is multi-primary, or multi-master: all your nodes are equal, and that is a core feature of the product by design. It does transaction conflict detection, and you can issue your transaction to any node, though many people do have a proxy in front of it. It works in clouds without issue, obviously supports SSL, so in transport as well; no framework is required for automatic failover; it does parallel apply; and it has literally thousands of users, it's still doing extremely well. Of course you have some trade-offs: it will never be as fast as asynchronous replication, which is just committing on the primary, or semi-synchronous, which is the primary plus at least one secondary; we will never beat the laws of physics. And of course storage is going to be tripled. We have many instances, over more than a decade now, of people deploying Galera across various companies, and you realize that sometimes people want five nodes, seven nodes, but then you are really replicating the data five times, seven times, which is not the most practical thing to do. You should really aim for a three-node Galera cluster, not a two-node Galera cluster, which a lot of people also think they can do and get away with. There are multiple distributions of Galera, and I just want to highlight a few. The Codership upstream variant is the one that comes with clone SST as a plugin option, so your SSTs don't have to happen via XtraBackup, and I'm sure you've seen some nice presentations about why the clone plugin is much faster. It's essentially like an rsync with the right amount of locking, so it's not just rsync on a live data directory; clone is much faster than XtraBackup, which I'm sure anyone from Oracle MySQL will tell you. That is also true when it comes to full state snapshot transfers. For the non-Galera folk in the room, Galera has two forms of state transfer. One is the incremental state transfer, which is what
you get if you already have an existing Galera cluster and a node goes away for a period of time and comes back: it gets an incremental set of data, like a diff. The full state snapshot transfer, by contrast, is for provisioning new nodes, because it can literally either rsync a copy over or use this clone method. MariaDB was actually the first to ship Galera Cluster in a distribution, and it's included: MariaDB by default can do async, semi-sync (not as a plugin) and Galera (also not as a plugin), and you obviously get all those amazing MariaDB features that multiple people talked about earlier today. And then there's Percona as well: based on Percona Server, it comes with ProxySQL, and now, with the operators, it comes with HAProxy. The strict mode is kind of nice: it disallows MyISAM tables and tables without primary keys, forces you to ensure the binlog format is set to ROW, and it has automatic configuration of SSL. This is maybe the most interesting thing in Percona XtraDB Cluster 8: the fact is, you probably run a database configured in the cloud now, and I hate to break this news to you, but all MySQL replication happens in plain text unless you turn on SSL. So if you're not turning it on, I don't know what you're doing. Some feature highlights, obviously for Galera 4: things like intelligent donor selection for an incremental state transfer, so it will find the node most compatible to give you the information, and with less load. pc.recovery=ON has been turned on, so in the event that a cluster node crashes, or even the whole cluster crashes, persistent cluster information is maintained without requiring a bootstrap, which is much easier. This is actually how many tools nowadays manage to do an automatic recovery of a cluster: they pick up the grastate.dat file information fairly simply. GTIDs are different in MySQL and MariaDB, and I think that was a good illustration earlier of how you could
design the MariaDB GTIDs. Now, the good news about Galera is that whether you use MySQL or MariaDB, it follows the native GTID of the database server. Pro tip: Galera actually came with its own GTID. Galera was the first to come up with a GTID implementation, followed by Tungsten Replicator, which had this in a transaction history log, then it was actually MariaDB, and then MySQL. Improved foreign key support: when I say improved, all you were seeing in the error logs were lots of errors when foreign keys were encountered. It turns out they were just warnings, not errors; your foreign keys were actually working, we were just stuffing stuff into the error log. By improving it, we've made it less verbose, so that's good. There are new tables in the mysql schema: wsrep_cluster, wsrep_cluster_members and wsrep_streaming_log. wsrep_cluster and wsrep_cluster_members will tell you useful things about your cluster and the members in it, whether it's 3 nodes or 9 nodes, and wsrep_streaming_log relates to streaming replication, which I'll talk about now. Streaming replication lets you replicate transactions of any size. Before Galera 4, your maximum transaction size was 2 gigabytes, so if you were doing a LOAD DATA INFILE and exceeded 2 gigabytes, it would fail; typically people told you to always keep it much smaller, but with Galera 4 you control it via streaming replication. What that means is you can ship 10,000 or 20,000 rows across the network even before the transaction is actually committed; the rows are already there, just waiting to be committed. That is much better than shipping gigabytes of data and then finding you have to do a rollback later. For networks, this is something I think Galera has benefited a lot from by learning from group replication, because by default the Paxos-based protocol that group replication uses handles poor or slow networks reasonably well.
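The streaming-replication control described here is per session; a sketch (fragment size chosen arbitrarily, table and file names invented):

```sql
-- Galera 4: replicate a large transaction in 10,000-row fragments
-- instead of shipping one multi-gigabyte write set at commit time.
SET SESSION wsrep_trx_fragment_unit = 'rows';
SET SESSION wsrep_trx_fragment_size = 10000;

LOAD DATA INFILE '/tmp/big.csv' INTO TABLE big_table;

-- Setting the size back to 0 disables streaming for later transactions.
SET SESSION wsrep_trx_fragment_size = 0;
```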
Galera had to learn, going from Galera 3 to Galera 4, about making this better, so it now copes much better with networks that drop packets or perform poorly, without sacrificing your data consistency. This is another benefit of having open source out there: it lets everyone benefit from each other. The reason I put this slide here is that you can actually get some of these features in open source: the black-box error logging is kept open source, and non-blocking operations for online schema changes are also open source inside Percona Server, and I believe it won't take much longer for Percona to also port gcache encryption, to ensure that your data directory is fully encrypted.

Nobody here has talked about the fact that MySQL supports at-rest data encryption, and it really is full at-rest encryption: your entire data directory is encrypted. In the early morning talk today we alluded to the fact that you can use HashiCorp Vault to do key management, or a solution like Amazon's KMS, for which MariaDB ships a plugin. So there are multiple ways for you to do key management, plus the ability to encrypt your entire data directory. But if you happen to use Galera, the gcache is the only thing not fully encrypted, and I honestly believe a port is coming fairly soon.

The biggest hurdle I've heard from people about upgrading to MySQL 8 is that most want to stick with 5.7, or they're MariaDB users who say "oh, we don't want to migrate to MySQL 8". If you're still using MySQL 5.7, you probably shouldn't be sitting here any longer: you should literally go out and plan your migration to MySQL 8, because not migrating is no longer an option.

The most common setup that I've seen, and the minimum viable setup we would recommend, is 3 cluster nodes in 1 data center. The other common setup you see is 9 cluster nodes across 3 data centers.
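As an illustration of the at-rest-encryption side, here is a hypothetical MariaDB my.cnf fragment using the simple file_key_management plugin (the Vault and AWS KMS plugins are drop-in alternatives for the key store; the key-file path is made up):

```ini
[mysqld]
# load the simplest key-management plugin; Vault/KMS plugins are alternatives
plugin_load_add              = file_key_management
file_key_management_filename = /etc/mysql/encryption/keyfile

# encrypt tablespaces, the redo log, and the binary log
innodb_encrypt_tables = ON
innodb_encrypt_log    = ON
encrypt_binlog        = ON
```

Note that, as mentioned above, the Galera gcache on disk is the piece this does not yet cover.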
These 3 data centers can be Singapore, Frankfurt and New York, no problem; presumably you're running this inside Amazon or Google Cloud. If you're telling me you want to run across clouds, the latency will hurt your application eventually, especially if it's insert-heavy. Remember that database operations are actually local, and you can segment them. What happens with a 9-node Galera cluster is that if you write to Singapore, it writes to the 3 nodes in Singapore (segment 0), then to 1 elected node in Frankfurt (segment 1) and 1 other elected node in New York (segment 2). That means it doesn't send the same transaction across the network 9 times, but only 5 times, with just 2 of those going over the wide-area links, so it's efficient in that sense, and the latency penalty, as you know, is minimal, because it's just certifying before anything else.

Of course, some people also use this with asynchronous replication, and here I like to reference Marco Tusa's blog post and his YouTube video about why running Galera across multiple data centers may not work out well for you, and why you may prefer asynchronous replication between 2 data centers. For one, we find not everyone really has 3 data centers; many people are happy with disaster recovery across 2, so telling them to find another data center is an expensive proposition, as is telling them to stand up another 3 nodes, so sometimes people do async across 2 as well.

Please remember the weighted quorums. This is why you don't run a 2-node Galera cluster, you don't run a 4-node Galera cluster, you don't run a 6-node Galera cluster: odd numbers are good.

So, realistic common setups I've seen over the years. Yes, there is the 2-node cluster; no, you should not run it. The 3-node cluster across 2 data centers usually comes from the telcos: they say "we have a very fast interconnect between the two, we can do this", and you can, as long as you know what happens when it fails. Then there are 3-node Galera clusters across 3 data centers that aren't even segmented, so it's one big Galera cluster with latency across 3 data centers: not the best idea.
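The odd-numbers rule falls straight out of the majority arithmetic. This is only a toy sketch of the quorum idea, not Galera's actual implementation (which also supports per-node weights and handles graceful leaves differently from crashes):

```python
def has_quorum(partition, membership):
    """True if the reachable partition holds a strict majority of the
    weighted votes of the last known cluster membership."""
    return 2 * sum(partition) > sum(membership)

# A 3-node cluster survives losing one node: 2 of 3 votes is a majority.
print(has_quorum([1, 1], [1, 1, 1]))   # True
# A 2-node cluster split down the middle loses quorum on BOTH sides,
# so an even-sized cluster buys you no extra availability.
print(has_quorum([1], [1, 1]))         # False
```

The same arithmetic shows why a 4-node cluster split 2/2 freezes entirely, and why the arbitrator daemon discussed later is a cheap way to restore an odd vote count.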
5 nodes spread across 2 data centers: you've got the 5-node part correct, but the 2-data-center part not quite, so I'll introduce you to the arbitrator daemon after this. And this one is real production: a 7-node cluster in 1 data center, with an async secondary hanging off one of the nodes as well; it's a big e-commerce site.

Typical things you probably need in your my.cnf: these are pretty basic, and of course you'd have lots and lots of nodes. Remember the segments, which is what most people forget. I actually have a few more tips for you. Make sure you have a segment for each data center. You can increase the replication windows: make the timeouts longer, even above the maximum round-trip time between nodes. You must monitor flow control, which is everything prefixed with FC (the wsrep_flow_control_* status variables); that's what you look at. You can also run Galera Cluster on a dedicated network: because Galera Cluster is port-based, you can route traffic to separate network interface cards, so port 3306 goes through eth0 and the Galera ports go through eth1. It's doable; in fact, NDB Cluster does this quite often. Then you need to pay attention to flow control; the flow-control option with "master/slave" in its name is being renamed, by the way, because we can't call it master/slave any longer in 2023, we must be politically correct. Causal read timeouts, EVS auto-eviction. Your gcache size needs to be set, otherwise you will not get an IST, an incremental state transfer, and you need to open up the port for your incremental state transfer, otherwise you'll always get an SST. Make sure that if your SST method is something like rsync, SELinux is not going to block rsync. You should probably have wsrep_retry_autocommit set to something like 5 or 8. This is maybe the most basic thing I see people not do: they don't give the application the ability to retry autocommits, so that when there's a cluster conflict, with a victim transaction and a priority transaction, this will at least help you retry up to 5 times.
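Pulling those tips together into a hypothetical my.cnf sketch; the option names are standard wsrep/Galera ones, but every value here is illustrative and needs tuning against your own round-trip times and write volume:

```ini
[mysqld]
# one segment number per data center, so WAN links are crossed only once;
# gcache must be big enough to serve an IST after a node outage
wsrep_provider_options = "gmcast.segment=0; gcache.size=2G; evs.suspect_timeout=PT15S; evs.auto_evict=5"

# causal reads: wait until the node has caught up before answering
wsrep_sync_wait        = 1

# retry autocommit statements that lose a certification conflict
wsrep_retry_autocommit = 5
```

Flow control is then watched at runtime with `SHOW GLOBAL STATUS LIKE 'wsrep_flow_control%';`, and the IST port (4568 by default) has to be open in the firewall, per the tips above.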
5 minutes left; I'll be on time, with questions even. You know, MySQL likes to give you options to shoot yourself in the foot, like wsrep_certify_nonPK=1, for example, and there are other options I didn't put here, like the ignore-split-brain ones, that you should really never use. If you don't use primary keys with InnoDB, guess what happens: you get one anyway, yes, an internal one, and it'll actually cause problems with cascading deletes and so forth. So really, use primary keys. In fact, MariaDB is a little smarter here: it has this force-primary-key option (innodb_force_primary_key) that you can put inside your my.cnf. It's kind of nice; it forces the developers to be smart. Then there's wsrep_replicate_myisam: it should really always be 0, because you should not be using MyISAM. However, if you use MariaDB, you may find a related option exists for Aria, because MariaDB's temporary tables and system tables are in Aria and need to replicate too. I'd say it's more or less experimental, even though the documentation may say it's usable; the real answer is to switch everything to InnoDB.

So, if you have two data centers and you still want to run this way, the best approach is a Galera Arbitrator node. It can sit in Amazon, DigitalOcean, wherever, if you can't afford more than two nodes in one data center; you can even have the arbitrator daemon running on your application server. All it does is look at the traffic and act as a voting mechanism; it doesn't have to store any data, so it's a reasonable way to fake another node.

You probably want to use some proxies; I think we talked about proxies quite extensively this morning. There's Galera Load Balancer, a level-4 load balancer, and I highly recommend ProxySQL, or MaxScale if you can stomach the license fees. So really, just go use ProxySQL.
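Here is a hypothetical invocation of the arbitrator daemon, garbd, on a machine in a third location; the cluster name and node addresses are made up:

```shell
# Join the cluster as a vote-only member: garbd relays and certifies
# nothing locally, it only participates in quorum decisions
garbd --group my_cluster \
      --address "gcomm://10.0.1.10:4567,10.0.2.10:4567" \
      --daemon
```

Because garbd counts as a full member for voting, a 2+2 split across two data centers becomes 2+2+1, and whichever side can still reach the arbitrator keeps a primary component.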
Use ProxySQL: it's pretty good, it's open source, GPLv3 actually, and it supports native Galera and Group Replication hostgroup types, so you can reconfigure things inside ProxySQL using what looks like SQL.

In terms of backups: backups are good to have. You should use XtraBackup or mariabackup, but if you want to provision new nodes, the clone SST is actually really good. I'm surprised Percona hasn't managed to port it yet; I mean, it is open source, it is a plugin, so I'm sure it will get ported soon. It cannot run in MariaDB, because MariaDB's architecture is completely different; that argument has already been had with Monty before, and it would take a lot of work.

Common setup and runtime issues: SELinux, firewalls, and DNS might be a problem; everything could be a DNS problem, really, so maybe you should just use IPs. And this one: don't start or restart two nodes at the same time. People do this more often than you think, and it actually brings your cluster down. Also, if you have really long-running queries, maybe that's what an asynchronous replica is for; MySQL has this thing called max_execution_time you can use, and in MariaDB you can just kill queries.

These functions: not that important. Lots of adoption: not so important. Plenty of things to improve on: probably not so important, but you can look at the slides later. And lots of reading: I've given you plenty, plenty to read. I'm open for about one minute of questions, for whoever is still here. So, any questions? Otherwise I can ask one.

"I have a quick question, actually... sorry, I've managed to forget my question. I'll take it over to..." Maybe another quick question while he's forgotten his: "I mean, the thing about shooting yourself in the foot, that's in the name of convenience, just making it easy." Yeah, that's true; MySQL has always been about making things very easy for the end user. I think, to be fair to MySQL, we've suffered a lot less recently compared to, say, MongoDB, with people going "oh, it's web-scale, blah blah blah".
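As a sketch of the native Galera support mentioned above: in ProxySQL 2.x you describe your cluster to the admin interface with what looks like SQL. The table and column names are ProxySQL's own, but the hostgroup numbers and values here are invented for illustration:

```sql
-- Declare how ProxySQL should map Galera node states to hostgroups
INSERT INTO mysql_galera_hostgroups
    (writer_hostgroup, backup_writer_hostgroup, reader_hostgroup,
     offline_hostgroup, active, max_writers, writer_is_also_reader)
VALUES (10, 20, 30, 40, 1, 1, 1);

-- Apply the new configuration and persist it
LOAD MYSQL SERVERS TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
```

ProxySQL then monitors wsrep state on each backend and moves nodes between the writer, reader, and offline hostgroups automatically.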
They got web-scale eventually. I mean, if you've been around long enough, you'd remember that people used to laugh at the MySQL documentation, saying "hey, you don't need transactions, you can work around it".

"So, I do remember my question, let me just turn that off. You mentioned telcos sometimes running 3-node clusters in different data centers. The question I have about this is that just because you have fast interconnects doesn't necessarily mean they're low latency, so have you actually had problems with long-fat-pipe situations?" In those cases: slow response times, and we go "well, obviously it's your setup, what do you expect?" The laws of physics do apply; we can't fix those. Obviously, for some people, asynchronous replication is probably the better choice, or even semi-sync, though you can run into long-fat-pipe problems even there. I think Facebook was a large proponent of semi-sync, as was Google, and even Facebook realized at some stage that semi-sync wasn't enough for them: let's go Raft. But you were not here in the morning when we talked about that.

Okay, well, thank you very much for a fascinating talk, and thank you.