Okay, I'll start. I was told to keep moving, so I'll skip the intro and the agenda. Instead I'll take a small step back so you understand my mentality and how I try to approach things; then why we use hash join, what kind of test baseline I built, the tests between the different distributions for hash join, and finally my personal considerations. Can you hear me in the back? Perfect.

The numbers come from a city on an island that is not part of Europe anymore: London. I was trying to dig up some statistics, and only one source gave me numbers quickly: roughly 10 million people, 60,000 different streets reported so far, and 30,000 different professions or work categorizations.

As a human being, if I want to know who is a physician in that specific building in front of me, I just picture the people living there and pick out the physician. A relational database works in a totally different way: it tries to optimize the lookup based on the information it has about each table and its number of rows. So one possible plan says: start with the streets, end up with one address, identify all the people living there, say 40 people in that specific building, and at the end find how many physicians there are: just one. Another plan looks very similar but starts from the other end: begin with the profession categories, filter for the profession I'm looking for, physician, one row, then filter the 10 million people, and at the end filter by address.

Those two plans look close to the same, but they are not. If you use an index on the different tables and do the calculation, the first plan results in more or less 10,000 row lookups, while the second looks up around 600,000 rows. Remove the indexes and the scans cost us 10 million rows in the first plan and 17 million in the second. That's a huge difference, don't you think? They were not the same at all. Notice that the second plan looked like the best option, because it starts with the smaller table, yet it actually works worse. If I showed you all the calculations we would be here all day, so trust me, that's what comes out.

All the tests come from DBT-3, and wherever there is an exception or something worth noting I report the query number, so you can go back, check, and verify what I'm saying. For instance: with an index, the data read for a specific nested loop with index lookups was 12 megabytes. Remove the index, and the same join, now a block nested loop doing full scans, reads 1 gigabyte. That's a huge increase in the volume of data that needs to be read and scanned. When scans take place we get a block nested loop, and this is why we have hash join: to help us prevent that.
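To make the mechanism concrete, here is a minimal sketch of the textbook build/probe hash join: a toy in Python, not the MariaDB or MySQL implementation, with table names echoing the London example. The point is that the small side is hashed once and the big side is scanned once, instead of rescanning the inner table for every outer row as a block nested loop must.

```python
# Toy illustration (not server code): the classic in-memory hash join.
from collections import defaultdict

def hash_join(build_rows, probe_rows, build_key, probe_key):
    table = defaultdict(list)          # build phase: small side in memory
    for row in build_rows:
        table[row[build_key]].append(row)
    out = []
    for row in probe_rows:             # probe phase: one pass over big side
        for match in table.get(row[probe_key], []):
            out.append({**match, **row})
    return out

people = [{"person": "ann", "street_id": 1},
          {"person": "bob", "street_id": 2},
          {"person": "eve", "street_id": 1}]
streets = [{"street_id": 1, "street": "Baker St"}]

# Build on the small side (streets), probe with the big side (people).
joined = hash_join(streets, people, "street_id", "street_id")
```

Each probe is a constant-time hash lookup, which is why removing an index hurts a hash join far less than it hurts a nested loop.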
Now, with the new MySQL EXPLAIN FORMAT=TREE you can easily see when there is a hash join and what it is doing. Eric already explained everything about hash join, so I'm not going to repeat it; and with so many MariaDB people here, Monty himself, I'm not going to explain hash join in MariaDB either. Just a brief distinction: I call the MySQL one the "iterator" hash join to differentiate it, because it lives in the new iterator executor, while MariaDB's is the classic block nested loop hash approach, restricted to equi-join conditions, with a different algorithm underneath. Of the block nested loop family, we are interested only in the hash join side.

I wasn't sure whether to show this, but I will: an image coming from MySQL showing the performance difference between block nested loop and hash join. This also uses DBT-3 as the test, and you can see the query at the bottom. To me it is misleading, because it gives you the idea that everything works dramatically better with hash join, but that is true only when you remove all the indexes. We know the DBT-3 queries use multiple joins, and I don't think there is a single DBA here who would remove all the indexes in production just to see whether this is more efficient. In my opinion, when we analyze this kind of thing we should remove maybe just one index involved in the join and see what happens with a mix of nested loop on the indexes and hash join. To me that is far more realistic.

For the baseline I used DBT-3 with scale factor 1 and scale factor 4: scale factor 1 fits in the buffer pool, scale factor 4 doesn't. The servers were MySQL 8.0.19, Percona Server 8.0.18, and MariaDB 10.4.12, everything configured more or less as default except the buffer pool size, with the adaptive hash index disabled and the join buffer bumped to 1 gigabyte, for one specific reason: I didn't want to spill to disk any more than I could help. Everything should stay in memory.

From the first round of tests, against standard DBT-3 with all the indexes, lower is better: scale factor 1 and scale factor 4 show more or less the same trend, and it looks like MariaDB is taking longer (Monty, stay there, don't worry). The sum of the times is indeed higher. But if we instead count how many times each server was genuinely more efficient than the others, how many times MariaDB or MySQL beat Percona Server and so on, we see an almost equal distribution; only one or two queries differ. So what's going on? The distributions do more or less the same thing except in a few cases. That's DBT-3 with the indexes.

The hash join runs I did only with scale factor 1, because I wanted everything in the buffer pool, again minimizing disk, and only a few queries are relevant, the ones with really significant hash joins. I also didn't keep Percona Server and MySQL separate here, since they use the same algorithm. Interestingly, some queries hung unless I made a few changes. The tests were done with the buffer pool empty and then redone with the buffer pool filled, and you'll see why. The first thing that comes up is exactly the contrary of before: MariaDB is much more efficient than MySQL, and the total time spent executing the joins is lower in MariaDB.
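Alongside the summed totals, the talk keeps returning to a per-query "who won" count. As a toy illustration of that counting, here is a tiny Python sketch; the query timings below are loosely based on figures quoted later in the talk (Q5 and Q19 with hash join) plus one invented entry, not the full DBT-3 result set.

```python
# Toy version of the per-query win count: instead of summing total
# runtimes, count for each query which server was faster.
def count_wins(times_a, times_b):
    wins_a = sum(1 for q in times_a if times_a[q] < times_b[q])
    wins_b = sum(1 for q in times_b if times_b[q] < times_a[q])
    return wins_a, wins_b

# Seconds per query; Q2 is invented, Q5/Q19 echo numbers from the talk.
mariadb = {"Q2": 9.8, "Q5": 955.0, "Q19": 137.0}
mysql   = {"Q2": 12.1, "Q5": 447.0, "Q19": 657.0}

print(count_wins(mariadb, mysql))   # -> (2, 1)
```

Summed totals and win counts can easily disagree, which is exactly the speaker's point: a couple of pathological queries can dominate the total while the win count stays nearly even.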
That is interesting, because this is the total time for hash joins, while with the indexes we saw the differences go the other way. Keep in mind that here I'm summing not all the queries, only the ones I also ran with hash join, so the results differ a little from the previous graph. But again, if I count how many times MySQL was better than MariaDB and vice versa: with the buffer pool empty, MariaDB won nine to six, and with the buffer pool filled it was eight to seven. So, guys, we're about even, right?

And again, why? The per-query execution times with the buffer pool empty tell us once more that the two distros are very close to each other in how they perform, except in a few cases where the MySQL run was dramatically worse than the MariaDB one: query 19 takes much longer, query 2 takes longer as well, and then there is a special case I'll describe as an interesting one, the infamous query 5. Eric and I have been working on that. Interestingly, when the buffer pool is actually filled, MariaDB is in some cases less efficient than before and MySQL manages to gain a little back, but the queries that were problematic stay problematic.

Here's the example. Query 5 with the index takes 71 seconds in MySQL with the buffer pool empty and 5 seconds with it filled; in MariaDB it takes 70 seconds and 4.88 seconds. So MariaDB is a bit better and MySQL takes a little longer, but it's a matter of small differences, nothing dramatic. With hash join, the first run, buffer pool empty, takes 447 seconds in MySQL and 955 seconds in MariaDB, so MySQL is much better. But on the second run, MySQL takes — Eric, how many hours? — 47 hours. Forty-seven hours; I actually killed the query and reported it to our guys (I love you guys, you know I love you, but it was 47 hours). The problem was that the plan was changed by the optimizer: it misjudged a few things and executed a crazy query. In the first run, the one with the buffer pool empty, that specific join read 129 gigabytes; in the second run, with the buffer pool filled, it was reading 4 terabytes of data. Wrong plan, plain and simple.

Of course there are ways of fixing this. The trick that made it work better was to change the costs in the MySQL engine cost table, telling the optimizer that a read costs the same whether it comes from memory or from disk. And it works fine, and the final time was acceptable, but obviously that cannot be used in production: you cannot change the cost model in production without screwing up access for all your applications.

The other case I mentioned is query 19. With hash join it was taking 10 minutes 57 seconds in MySQL and around 2 minutes 17 seconds in MariaDB. The interesting part is that if you check the plans, they are totally different: MariaDB's plan starts with the small table using the index, while MySQL's starts with the part table using a filter, and that does not go well. I don't fully get why — maybe you're right, I have the notes, and if something needs fixing I'll fix it — but it doesn't change the result.
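The query 19 case, where two plans differ mainly in which table they start from, can be made concrete with a toy Python model. The data and numbers below are synthetic; only the mechanism matters: the same answer can cost wildly different amounts of work depending on the join order.

```python
# Toy demonstration that join order alone changes the work done.
# Synthetic data: 10,000 people spread over 100 professions.
people = [{"id": i, "prof_id": i % 100} for i in range(10_000)]
prof_index = {p: "physician" if p == 7 else "other" for p in range(100)}
people_by_prof = {}
for row in people:
    people_by_prof.setdefault(row["prof_id"], []).append(row)

# Plan 1: scan all people, probe the profession index for each one.
probes_plan1 = 0
hits1 = []
for row in people:
    probes_plan1 += 1
    if prof_index[row["prof_id"]] == "physician":
        hits1.append(row)

# Plan 2: start from the one matching profession, fetch its people.
probes_plan2 = 1                      # one index lookup on professions
hits2 = people_by_prof.get(7, [])

assert len(hits1) == len(hits2)       # same answer...
print(probes_plan1, probes_plan2)     # ...very different work: 10000 vs 1
```

Both plans return the identical 100 rows; only the number of rows examined differs, which is what the optimizer's cost estimates are supposed to capture.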
The point was the order. MySQL was effectively falling back to nested loop behavior because, in the end, it was not doing what it should. In fact, if you force exactly the same join order as MariaDB — yes, EXPLAIN still reports a block nested loop, but if you look at the plan in tree format you see the hash join — that plan brings you to 2 minutes 18 seconds, exactly the same as what MariaDB was doing. So it was a plan problem. And that, in the end, is the real question: we keep talking about one second here, one second there, but what actually makes the difference? The optimizer.

Am I okay on time? I'm at 23 minutes, so I should wrap up and leave room for questions — I know there will be questions.

So, my considerations. The two implementations perform more or less the same — similar, okay — but they are absolutely not the same thing: one uses one algorithm, the other a different one, and one can spill to disk. In these tests I tried not to use the spill to disk, keeping the join buffer quite large, even though spilling could help: sometimes the join was bigger than 1 gigabyte, so it did not fit in memory, right, Eric? But I was trying to keep that out of the picture. The point is that the differences at the implementation level are minimal; what makes the huge difference is the optimizer. And the problem is that I don't think the hash join implementation is talking correctly with the optimizer. My perception is that whatever we see here comes down to the optimizer needing a bit more information in order to be more precise in choosing the right plan.
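The engine-cost workaround described for query 5 can be sketched as a toy as well. This is not MySQL's actual costing code; it is a minimal Python model of the mechanism: the optimizer picks the plan with the lowest estimated cost, so pricing memory page reads far cheaper than disk reads can flip it onto a plan that touches much more data. (In real MySQL the knobs are rows like io_block_read_cost and memory_block_read_cost in the mysql.engine_cost table; all numbers below are invented.)

```python
# Toy model of cost-based plan choice and the "price memory like disk"
# workaround. Invented numbers; only the mechanism is the point.
def estimated_cost(pages, read_cost):
    return pages * read_cost

def choose(plans, mem_cost, io_cost):
    # plans: name -> (pages, source); pick the cheapest estimate.
    def cost(name):
        pages, source = plans[name]
        return estimated_cost(pages,
                              mem_cost if source == "memory" else io_cost)
    return min(plans, key=cost)

plans = {"good": (129, "disk"),      # modest read, priced as disk
         "bad": (400, "memory")}     # huge read, priced as memory

print(choose(plans, mem_cost=0.25, io_cost=1.0))  # "bad" wins: 100 < 129
print(choose(plans, mem_cost=1.0, io_cost=1.0))   # equal prices: "good" wins
```

Equalizing the two read costs makes the estimate track the data volume again, which is exactly why the trick worked — and also why it is too blunt for production, where it reprices every query at once.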
How you guys can do that, I have no idea, but my point is this: as it stands right now, we may hit similar issues in production, and they can hurt production much worse than a block nested loop. The block nested loop is known: it's already implemented, it's there, it will take what it will take, but at least it's more or less predictable and already accounted for. I'm not saying hash join is bad — hash join is good. I'm saying we need a better mechanism, a better way to provide feedback to the optimizer. And the MySQL hash join with spill to disk is a totally different conversation: comparisons need to be apples with apples, not with something different.

What else? Instrumentation. MariaDB, please give us somewhat better instrumentation to analyze the hash join itself, because right now the only way we have is EXPLAIN FORMAT=JSON, which is not enough: we don't have all the numbers we may need. I don't care whether it's the tree format or the JSON format, just give us more detail.

And the other thing: given that this feature is still new, please give us the option to play with it — enable one, disable the other, and the reverse — for debugging purposes, on the session level of course, and with a real way to verify that what we asked for is what actually happens. If you just force all of us in one direction, that would be the wrong thing to do.

That's it, thank you. Do we have time for questions? I'm not sure I can answer everything, by the way.

The question was what I meant when I said "buffer pool filled": I meant filled by the query, meaning the query had already executed once, had already read all the data from disk and put it as pages in the buffer pool, so the next run doesn't have to go to disk — it reads the data directly from memory. Yes, the one that went slow did so because the plan was changed and it was effectively doing the wrong thing. Don't kill yourself over it.

No, there were no concurrent joins. But wait a second: if I set the join buffer to 1 gigabyte, I'm arguably tilting things, because MySQL can spill to disk anyway, while with MariaDB — if I understand the implementation correctly, and please correct me if I'm wrong — a smaller join buffer, together with the other setting related to it whose name I never remember, would hurt it. So I gave MariaDB the space to work as close to fully in-memory as the hash join allows. Yes, that was on purpose, to compare apples with apples, because the implementations are different. No, that one was a specific case; for the others, if you want, I can share my text file with all the notes and you will see.

Any other questions? No? Yes, immediately after one run. Fix it — good, fix it. Okay, thank you very much, guys.
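A closing aside on the spill-to-disk point raised in the Q&A: the standard technique when the build side exceeds the join buffer is a partitioned (Grace-style) hash join. The sketch below is plain toy Python, not MySQL's spill implementation; it keeps the partitions in lists rather than on disk, but the structure is the same: hash-partition both inputs with the same function, then join each partition pair independently with an in-memory table.

```python
# Toy Grace-style hash join: partition both inputs so each build
# partition fits "in memory", then hash-join partition pairs.
from collections import defaultdict

def partition(rows, key, n):
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(row[key]) % n].append(row)   # same hash on both sides
    return parts

def grace_hash_join(build, probe, key, n_partitions=4):
    out = []
    for b_part, p_part in zip(partition(build, key, n_partitions),
                              partition(probe, key, n_partitions)):
        table = defaultdict(list)
        for row in b_part:                      # build per partition
            table[row[key]].append(row)
        for row in p_part:                      # probe per partition
            for match in table.get(row[key], []):
                out.append({**match, **row})
    return out

left  = [{"k": i, "a": i * 2} for i in range(10)]
right = [{"k": i % 5, "b": i} for i in range(20)]
result = grace_hash_join(left, right, "k")
```

Because matching keys always hash into the same partition, joining the partition pairs separately produces exactly the same rows as one big hash join, which is what makes spilling correct.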