Hello everybody, so let's start. I have only 20 minutes to tell you all the fun stuff. About me, very short: I just work full-time on MySQL performance, extremely fun stuff, and I could tell you much more, but I have only 20 minutes, so I'll try to do it in 15.

So, what are my main problems? In fact, if your problems are things like how to tune your query or how to use MySQL in the right way, everything you can read in the manual; that's not my problem. My problem, you see, is about what you cannot fix: what is broken by design, or will not work by design, and so on. Many problems are known, many problems we just discovered, and it's work in progress. So this is the contents, the progress here.

Historically, starting from 5.5: who here has already tested MySQL 8.0? Who has already moved to 5.7 at least? So in 5.5, you know, we delivered just some fixes which were already known. In 5.6 we started to do some deep changes, which brought huge pain because there were many regressions, many differences. The most painful point in 5.6 was that writes were faster than reads. In fact, if you wanted to read faster, you needed to send some writes, and they would unblock your reads and you would read faster. It was completely odd, not what you want. In 5.7 we finally fixed reads, so reads became faster, but we started to lose in efficiency. There is always a cost for something; we are always balancing the code to get the most optimal result. So in 8.0 we are fighting, and I think it will remain the main fight for many years, for efficiency: to get the best possible performance on the same hardware. We are not racing for bigger hardware; you should be more efficient on the same hardware. You will see some flame graphs.
So there was huge progress in 5.7; you can see here, we delivered that monstrous read-only result, and we were so happy. Since then this big machine was upgraded from 72 cores to 96, so now we are happy to get more than 2 million queries per second. These are real numbers, it's possible, but the difference between 8.0 and 5.7 here is not really big, and in fact we don't care about this anymore, because we already saw that we can do over 2 million on read-only; it's fine, we don't touch this. There will probably be some regressions because we added new stuff, okay, but our main problems are everything which remained from 5.7, so I will just point to potential fixes we have here and speak about the problems.

On read-only there are still remaining row locks, for example. If you constantly read the same rows, you will have contention, so you will just go slower. The workaround here is to use a query cache, and ProxySQL has a query cache; there is no query cache in the server at all anymore, so ProxySQL is the best solution. Lookups on secondary indexes in InnoDB can be many times slower than on the primary key, so the main workaround is just to use the primary key; well, we are working on another solution to speed all this up. There is also the adaptive hash index, if you enable it, but the problem is that as soon as you have writes, it can slow things down. These are hot topics for us. And UTF-8 is extremely much faster now in 8.0: before, for example in 5.7, you could be 10 times slower if you used UTF-8; in 8.0 it's only 10-20 percent slower than latin1, huge progress here.

As for the doublewrite, these are all read-write problems; the doublewrite is expected to be fixed in 8.0, but we'll not speak about it here. I will tell you about the redo log changes we are doing, and about transaction locking and lock management, so I will speak about CATS, coming with 8.0. Transaction isolation is still work in progress.
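As an illustration of the workarounds mentioned above, here is a minimal sketch; the variable and collation names are from the MySQL 8.0 manual, so check the defaults of your own version:

```sql
-- Disable the adaptive hash index when a write-heavy workload
-- turns it into a contention point (dynamic, no restart needed):
SET GLOBAL innodb_adaptive_hash_index = OFF;

-- utf8mb4 with the new 0900 collations is the fast UTF-8 path in 8.0:
CREATE TABLE t (s VARCHAR(100))
  CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
```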
There is a huge potential fix here on update performance, directly related to the redo log, and we are working on insert performance too. Well, as long as you keep the B-tree cached, for example if you can use partitions or different tables, you can go very fast on inserts; otherwise the B-tree is what impacts you. And for purge: if purge is lagging in your production, at least with 8.0 you can truncate undo, so you can truncate the tablespace which holds your garbage collection.

To be short, we touched all of it, so let me pick the most killing problems that we have. One of the most killing ones is the redo log: as soon as you have optimized everything else, if you cannot write to the redo log as fast as you want, you're blocked. This is the final bottleneck that you hit, well, except if you hit some other problems first; if you don't hit them, this is the one. So this one we attacked to fix in 8.0. Another one is every IO-bound workload: in InnoDB we currently have a global lock, and as soon as we start to read something from disk, every IO operation takes that global lock. Of course it cannot scale, and even as you get faster and faster storage, you cannot go faster with faster storage. This is related to the fil_system (file-system) locking. And also, for row locking, we took a contribution from the University of Michigan; we'll speak about this later. So now, about the redo log: what happens inside?
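A sketch of the undo truncation setup mentioned above; the variables exist in MySQL 8.0 (some also in 5.7.5+), but the values here are only illustrative:

```ini
# my.cnf sketch: let InnoDB truncate oversized undo tablespaces
# so a lagging purge does not let them grow without bound.
[mysqld]
innodb_undo_log_truncate = ON
innodb_max_undo_log_size = 1G    # truncate an undo tablespace once it exceeds this
```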
So, well, I suppose you know that for transaction commit, you know how to use the flush setting: you flush on every commit, or not, or once per second. Well, it was three years ago that we discovered we can go faster even if we flush on every commit. In fact the storage was not the problem; the whole problem was the locking inside. All the user threads were fighting to write to the redo log.

This was the old model: you see the global lock which blocks everybody. And the new model: we have a dedicated log writer thread, a log flusher thread, and notification threads around, just a simplified structure. User threads are not blocked anymore; they write directly to the log buffer, and in parallel we flush all this data to disk, so in fact we are limited only by your storage performance. The faster you can write, the faster your redo log will work. There is no more explicit grouping; we are not waiting, it's natural grouping by your flush speed on the disk. That's all, so you will go as fast as your storage can go.

This code is extremely well instrumented, so you can know exactly how many waits you have, what your threads are doing, what happens inside, and almost all of the configuration is dynamic. You can change whatever you want inside and see the effect, even resize your buffers, the log buffer for example, live, or even stop the redo log entirely if you want.

This is a multi-threaded model now, but you have a trade-off: with multiple threads you can never be faster than a single thread just doing it all alone. A single thread writes without any waits, without any synchronization, so of course it goes faster than the same writes spread across threads. So there is a trade-off, and you may be disappointed that an event-driven system cannot be as fast as spinning.
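To make the "dynamic and instrumented" point concrete, a sketch using names from the 8.0 manual (note that `ALTER INSTANCE DISABLE INNODB REDO_LOG` only exists from 8.0.21 and is meant for initial bulk loads, never for production):

```sql
-- Resize the log buffer live (innodb_log_buffer_size is dynamic in 8.0):
SET GLOBAL innodb_log_buffer_size = 64 * 1024 * 1024;

-- Inspect the redo instrumentation counters:
SELECT name, count FROM information_schema.innodb_metrics
 WHERE name LIKE 'log_%';

-- Stop redo logging entirely (8.0.21+, crash-unsafe, bulk loads only):
ALTER INSTANCE DISABLE INNODB REDO_LOG;
```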
So spinning is the most efficient way anyway, so we stay with spinning. Probably later we will reinvent something more efficient, but currently only spinning helps to go as fast as a single user doing the same actions. You can imagine what happens here; well, just to give you an example: if we do nothing, if we don't use the redo log at all, you can see where this is going. The red line is the current 8.0: if we don't do these changes to the log, you see we don't reach those levels. This is the highest level that we get with spinning, but spinning, as soon as it eats a lot of CPU, then you go down. In fact you need a balance between spinning and event-driven waits, and currently what we decided is adaptive spinning. You can tune its limits to say: as soon as I reach this level of CPU usage I don't want to spin anymore, or as soon as my disk becomes slower than a given response time I don't spin anymore. In the end it will be fully auto-tuned, it will auto-discover what happens on your system and you won't touch anything, but for the first release, well, we preferred to give you some tuning points.

And what we have is, for the first time, results with flush-on-every-commit enabled, because before it was never possible: as soon as you used it, everything was slower. Now it becomes faster, so 8.0 becomes faster than 5.7 and 5.6 in the pure OLTP read-write test, and much faster yet in pure update. When you are bombarding with a heavy update workload, you will see the difference. What is amazing here is that in 5.7 we had a huge regression, 5.7 is slower than 5.6; so in fact we are two times faster than 5.7, fine, but we are only getting back the regression we picked up since 5.6. This was related to all that work on read-only improvements and so on. And we are still not scaling. Why? Because the redo log is just the first step, and we have the next layer: locks.
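The tuning points for the adaptive spinning described above look roughly like this; these are real MySQL 8.0 system variables, with their documented defaults shown as example values:

```ini
[mysqld]
# Spin only while enough CPU headroom remains:
innodb_log_spin_cpu_abs_lwm = 80        # min CPU% (summed over cores) to allow spinning
innodb_log_spin_cpu_pct_hwm = 50        # stop spinning above this % of total CPU capacity
# Stop spinning for flush completion when the redo flush gets slow:
innodb_log_wait_for_flush_spin_hwm = 400  # max avg flush time (microseconds) to keep spinning
```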
So, all this transaction locking, lock management and so on. Just to give you an example of how it should be, this is our prototype in the current development version. While 8.0 is not scaling past one CPU socket, on the left side this is one CPU socket, and here in fact it's two CPU sockets: we can reach 400,000 updates per second, which is an enormous number, never seen until now. This work is in progress, and I hope it will be fixed soon, once we deliver it.

Now, what about IO-bound workloads? Well, any IO-bound workload was limited by storage until now, but now you have a huge game changer, flash storage, which is coming and becoming faster and faster. In reality you have a max throughput on your flash storage, but your real performance today is driven by IO operations per second. The throughput is limited, but you can divide it up, a bit like memory access. It's not like before, you know, when we tried to do bigger IOs to read more per operation: now you can read smaller, and you will still reach the same throughput, but you will get more operations per second, especially if you only need a few records from a big block. Smaller blocks will give you better performance.

But what happens in InnoDB by default? We have a 16K page, okay, and the compression topic is very popular, so we say: okay, let's compress. Imagine we can compress four times, so we will be able to read four times more, because a 16K page compressed to 4K, with the same throughput, means we can read four more pages. The problem is you have exactly the same buffer pool, so your memory is not four times bigger: once you uncompress your data, it's still the same useful data set. In fact you can read faster, but you cannot use your data faster, because before reading more you need to process what you already have; otherwise, why did you read this data at all?
And what happens if, instead of a 16K page, you just use a 4K page? In reality, in the same memory you will have four times more useful data, and here you really will go fast. Of course, this works if you only need a few rows from each page, not if you need the whole page; otherwise, okay, this should be okay as well. But this whole story was not possible, just because we had this global mutex, a global lock for every IO operation: as soon as you start to read faster, this global lock blocks you, so you cannot read faster anymore.

The good news is that in 8.0 this is fixed, and to validate these changes we got a chance to use the latest Intel Optane drive, which alone is able to deliver, you can imagine, one single thread doing pure IO reading at one gig per second, 1000 megabytes per second, one single thread. So in fact, using two drives like this, in theory, with a 4K page we should be able to do 1 million reads per second. But is it true? And in fact, yes: we are doing more than 1 million real IO-bound selects per second. This is huge; it's pure IO-bound point selects here. And the same huge jump in updates, and an update is much more expensive, because you have to read the page, update the page, write the page, and so on, so you are constantly doing IO operations. Well, this is the progress we expected for a long time, and finally everything is coming together at the same time: we have storage solutions coming with flash from many vendors, and we have code which works with them.

For the last point, I wanted to tell you about CATS. In MySQL it's called CATS; initially it was called VATS. CATS is about contention-aware transaction scheduling. It was invented at the University of Michigan and adopted in MySQL, available since MySQL 8.0. The idea looks pretty simple.
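The smaller-page setup described above can be sketched like this; note that the InnoDB page size can only be chosen when the data directory is first initialized, never changed afterwards:

```ini
# my.cnf sketch: must be in place BEFORE `mysqld --initialize`
# creates the data directory; an existing instance cannot be converted.
[mysqld]
innodb_page_size = 4k
```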
In fact, not all transactions are equal: some transactions lock more data, some less, and some objects are more important, some less important. This is a simple schema; I also put the links here so you can read more about it. It's a very long scientific paper explaining all the logic with many examples and so on. So, how to decide who gets a lock? Traditionally we just use FIFO: first come, first served. But if you do it in a smarter way, you don't wake the transaction which came first, but the transaction which is blocking the most others, and you get better performance.

But in fact the whole story was much more fun than this, because it turned into a real detective story. There was a claim about a huge performance improvement, but as soon as I started to test it, none of my tests showed any difference: no gain, zero, or you just go slower. So, well, we started a long investigation and discussion with the guys from Michigan, because the first impression was that they were just kidding, you know, that there was nothing real in it. But they knew MariaDB had already applied the patch, so everything should be working, yet we did not see any results.
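The scheduling idea can be sketched in a few lines. This is a toy illustration of the principle only, not InnoDB's actual implementation: when a lock is released, grant it to the waiter that transitively blocks the most other transactions, instead of the one that has waited longest.

```python
def blocked_count(waits_for, txn):
    """Count transactions directly or transitively waiting on `txn`.

    `waits_for` maps a transaction to the list of transactions
    that are waiting for a lock it holds.
    """
    seen, stack = set(), [txn]
    while stack:
        t = stack.pop()
        for w in waits_for.get(t, []):  # w is waiting on t
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen)


def pick_next(waiters, waits_for):
    """FIFO would return waiters[0]; the CATS idea picks the
    waiter whose wake-up unblocks the largest subtree.
    (max() keeps the first on ties, so it degrades to FIFO.)"""
    return max(waiters, key=lambda t: blocked_count(waits_for, t))


if __name__ == "__main__":
    # T2 blocks T3 and T4, and T3 in turn blocks T5; T1 blocks nobody.
    waits_for = {"T2": ["T3", "T4"], "T3": ["T5"]}
    print(pick_next(["T1", "T2"], waits_for))  # T2 wins over the older waiter T1
```

With FIFO, T1 (the oldest waiter) would be granted the lock first; weighting by blocked transactions wakes T2, which transitively frees three other transactions.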
Okay, so we started to understand what they wanted to solve. Finally I found a way to build a scenario which shows a difference, and then it was big: at last we could see what could be improved. And as soon as we started looking at the workloads, we soon started discovering bug after bug in the patch, and then it became a loop: fixing, remastering, retesting again; do we have a gain, we lost the gain, we go again, then another bug is opened, and so on. In fact we probably spent nine months on this, in many loops, before fixing everything, and at the end it was still unclear in which case you should use which algorithm. And a DBA cannot sit down, look at the workload and say: okay, let's run five minutes with this one, and now I switch to the other model again. So finally we came up with a solution which auto-detects the problem and then switches between FIFO and CATS according to what is better for your current workload. It just discovers how many lock waits you have and decides what is better.

It helps everywhere you have row lock contention, and you can recognize that just by following your SHOW ENGINE INNODB MUTEX output: you will see whether you have lock waits in your current workload or not. In fact, you need to poll it. Does anybody here never monitor their production? Well, bad question: everybody monitors their production, right? So you watch this over time, and of course you will see your spikes, and you will see whether it's your case, and then you will be happy with 8.0.

Here is an example with an OLTP workload using a Pareto distribution, which means artificial contention on data: many threads, many users fighting for the same rows. With a growing load you can see that without this algorithm you lose performance as the load grows; with it, at least you save something. Okay, but you should also be realistic.
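A sketch of the monitoring just described; `SHOW ENGINE INNODB MUTEX` is the output named in the talk, and the Performance Schema and sys views are 8.0-era alternatives worth knowing:

```sql
-- Spot mutex/rw-lock waits inside InnoDB (poll this over time):
SHOW ENGINE INNODB MUTEX;

-- Or inspect current row-lock waits directly (MySQL 8.0):
SELECT * FROM performance_schema.data_lock_waits;
SELECT * FROM sys.innodb_lock_waits;
```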
Well, if you write your application so that it locks everything all the time, you're just a bad programmer, right? You try to avoid locking; you fix it by designing your application so that you don't create the problem in the first place. And the main problems come mostly because in InnoDB the default transaction isolation is REPEATABLE READ. Okay, we want to move to READ COMMITTED; the problem is that currently this creates even more problems because of transaction locking, but this is the direction we will go. And in fact, even today, in the cases where you can use READ COMMITTED, you often don't see any difference, so you use whatever and everything is fine. Well, I think once we deliver all the fixes, everything will be transparent, and you will be much happier, even more than before, with this solution as well.

So, call to action: download the stuff, and most important, have fun, because otherwise our job is stupid. If you don't have fun, what's the point? Thank you, I'm on time.

[Q&A] Is there time for questions? Yes? So, any questions? Yep. Well, in fact the main problem with the doublewrite, you see, is that currently you just have a small doublewrite buffer, which is very small, and especially there is locking inside. So what is the problem with the doublewrite? It is the only protection we have today against corrupted (partially written) pages, and what's amazing is that no storage vendor, nothing on Linux, can guarantee you that you will not have corrupted pages; there is no such support. So we need to write every page twice, and the problem is that you need to write the two copies one after the other, so your write time becomes two times bigger. But as soon as you can do two times more writes in parallel, everything is fine. So the new solution just allows you to write faster, in parallel: you will have many concurrent writes in flight together, and you will hide this latency, which is doubled per page but not bigger overall.
It will be unblocked, in fact: you are mostly limited only by storage, instead of being blocked by design. So that was the problem. But as soon as NVRAM comes, you know, you don't even need this on storage anymore, because the problem with writing twice is that you will wear out your flash device, for example, two times faster. As soon as you have NVRAM, which you can rewrite any time as you want and which is battery-protected, you don't care anymore: the doublewrite will just go to memory and it's fine. Yes, with this the doublewrite goes first to memory, where it's protected, and then you write to disk, so at least one copy will always be safe; it's fine.

Other questions? Yeah. Well, we are working on a full redesign of transaction management. In fact all this locking which happens today, which may kill you; I have a detailed blog post, for example, explaining why it can be dangerous. As soon as this locking goes away, you will be just happy with READ COMMITTED. So, just be patient. It seems there are no more questions.
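For reference, the READ COMMITTED switch discussed above is a one-liner today; these statements are standard MySQL syntax, though as the talk notes, whether they help depends on the workload:

```sql
-- Session-level switch to the less locking-prone isolation level:
SET SESSION transaction_isolation = 'READ-COMMITTED';

-- Or for a single transaction only:
SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
```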