 My name is Oli and I am my SQL consultant at a small Swiss company called from Duel and what they do is basically anything about my SQL I have no clue about next cloud I have no clue about development I only have a little clue about my SQL so that's our business so I'm here to talk about my SQL or whatever data store you use for next cloud I had a little problem I don't know if there is a developer here or users so who is developers of next cloud in this corner okay so a few people who is user okay the majority so I'm sorry for the majority of you maybe it's not so much for you but let's have a look so what I want to talk to you about in the next few minutes is about database scalability I've heard that's database scalability is a problem for next cloud is this true or is this a rumor or not true okay I don't know okay so it depends you are a consultant right no not really okay let's have a look so what I want to talk about is critical resources so performance and performance patterns higher high availability and performance and what is evil so part of this you as a user can influence but not much and most of that you as a developer can influence so we have to have a talk or have a look about it critical resources typically in hardware there are four of those this is RAM the IO system CPU and the network so let's start with the RAM databases really love RAM memory and here we can typically say much helps more so as more memory you have in the database server as better the performance can become that's not a my escrow advice every database consultant will tell you this okay so ideally hold your database set shoot or could fit into the memory so that leads to the question how much RAM do you have for your for your next cloud database server and how big is your next cloud database please let me know I have no clue about next cloud to be honest so how much RAM do you have for your database service please okay sounds not too bad one on the gigabyte of database is that a big setup or small set above no clue medium not thousand 250 not 250 thousand right okay okay but the files hopefully are not stored in the database right okay so but 100 gigabyte next cloud is metadata for 250 users okay so sounds okay okay so all your data sets should fit in the RAM it's not possible it still works but as worse this relation becomes a slower the database is so he ideally all the hot data should fit in the RAM and then the database is fast and what the hot data set is the developer should tell you or can or whatever for you as a next cloud administrator what is relevant here is you know DB buffer pool size this is the my escrow variable you have to configure properly according to your RAM rule of thumb is 80% of your RAM on a dedicated database server should go into go into this variable do you have some kind of installer for next cloud yes no how is it installed have no clue do you have an installer yes okay so it does not configure so who should do the database stuff the users okay okay feature request you should have a proper installer and then the installer should configure this variable properly and until we have this feature request implemented your administrators this variable should be set to 80% of your RAM on the dedicated mysql only database server the slides are online already you don't have to make a snapshot of the slides question no okay why Ram we don't want to read from flow disks we want to read from fast memory memories roughly 100,000 times faster than this so we want to have our hot data set in RAM and then the database become fast the next thing in databases typically we care is the IO system because of the data don't fit in RAM they are on disk and then it depends on the disk performance but typically databases do is they do as in chronos random writes and fast synchronous sequential writes on the redo logs we don't want to read from the disk anymore so your disk should be good in this that means for us in hardware we want to have dedicated disks direct attached fast means expensive rate 10 or better SSDs and this is not what the trend goes with modern hardware and infrastructure because we want to have this cloud stuff and design stuff and this whatever this is not what we want to have for best database performance now cloud stuff and next cloud I don't know how it matches because you do the infrastructure son is not a good idea right for the database not for the file trash the reason we don't want to share we don't want to wait for slow and far away far away disks so IO latency matters sand and cluster file system stuff for database not a good idea the next thing which is relevant is CPUs fast course process slow queries faster so that means faster CPUs make your slow non-tune queries faster and also the good tune queries make them faster so that means we don't want to have green IT for fast databases so just for your imagination for course can run for queries at the same time and not more and now my question is about next cloud and concurrency I have seen an advice on your website that you we need multi core machine and multi socket machines right or okay so that means you have many threads which process many queries okay so what is the concurrency active running queries do we have such numbers running not connected yes running no no not really anymore no no it's what you do what the application does it depends on the developers what they do there I will I will do an example later okay network nowadays from the data's database side of you we don't see any limits anymore with 1 gigabit and 10 gigabit networks if you have other experience please let me know so that are the critical resources so the presentation was about scalability and performance so what does it mean because we found out that a lot of people have different understandings of performance and scalability so when we talk about performance we have their two different meanings the one is how fast is it that's latency and the other one is how many per time that's the throughput so it comes to your point so when we talk about database performance how fast is my query and how many queries do I run or how many queries do I run per time scalability means to what point we will run at these two numbers to what point what amount of data what amount of transactions uses time and so on and so forth when is our system overloaded that's the question of scalability so if you have such a number let's say concurrency and response time and you increase the concurrency concurrency at a certain point you get this exponential curve going up and that's the saturation point and if you go behind this point performance will always be worse so you have to know where this point is and never go ahead this point now for you as a user isn't a little bit difficult I don't know do you have such benchmarks that you can say 10,000 users with this and this are you can go that meant that far not yet okay so you have to do that you have to find out where is your saturation point is it at 400 users is a 10,000 or 250,000 users you have to find it out do you have some benchmark tools for that but you provide them to the users okay then they can find out that point okay performance patterns this is something you cannot do that's what the developers you have to recognize the performance patterns you have read is it read problem is it the right problem is it the latency or is it the throughput problem this is what you have to figure out you think about when you develop stuff in right is it the latency or throughput problem the same so where do we suffer what you also have to recognize is do we have random or sequential patterns maybe you know that sequential access typically is faster than random access at least on the disk but sometimes also in memory it has to do with internal structures in databases in MySQL so do we random fetch rows or your queries do they fetch random rows do they random writes on disk that are things you as a developer should consider I show you a few more detail points later do we sequential read write sequential full table scans for example do we log writes to disk so this is something you should keep in mind caching effects we see very often users doing micro benchmarks they run a query it's low they run it again and it's fast oh now it's fixed or we don't understand that that's typically an effect of caches databases has a lot of caches we have a file system cache we have a cache on the IO system we have a cache in the database and then we have the query cache which you probably advise to disable I don't know or is it left to the users that sounds good so you know already about the patterns that's not too bad that's good so that I really love to hear that that's good so that means next cloud has a chance to scale huh okay so what influences read so what influences read latency indexes so if your application has good indexes and that's not you can influence that's task of them so kick them make bigger feature requests etc etc etc if a database is indexed well then it should be performed fast then RAM access versus disk access influences read performance as well RAM is roughly 100 times 100,000 times faster than this so if you have your data in RAM it will be faster and complexity of query complex queries perform worse and simple queries so the fastest query is a primary key access and the slowest query is some used joints with soup queries and unions and so on and so forth so what are you doing okay wonderful so that is where Maya school really can scale good throughput you told it already or you told you were speaking already about it one connection to my SQL can process one query that means one threads on the operating system that means it uses one core up to now my SQL one query will only use one core in my SQL 8 which came out this week maybe this will change we don't know yet but up to now the developers have to spawn multi threads which do multi connections which do run multi queries in parallel to really use the full power of your CPU and your hardware so and you say yes you're doing that already so okay so maybe you will use the hardware just keep in mind four queries can run a four cores can run four queries at the same time a little bit more complicated four cores can process four second of queries in one second real time or if you think about that 10,000 queries running each query one millisecond will run about two and a half seconds on the four core machine probably more because inside effects there are some frameworks I think Java people like to use that so no fear here right they do one click and then the framework does 10,000 queries on the database every query runs one millisecond and the user is wondering why is it taking 10 seconds or something like that so be careful if you are using frameworks what is it come on yeah what you can do in these cases either less queries or faster queries or faster cores or more cores so variables influencing read performance that's for you or for administrators we are talking on one one more question to developers we are talking about InnoDB only right no more anything else only on InnoDB okay perfect so what can we influence from the variable side query cache this typically improves read performance dramatically if you have always the same and the same and the same queries again so your advice is enable the query cache right pay attention MySQL 5.6 the query cache is disabled by default that means you have to enable it if you migrate from MySQL 5.5 to 5.6 yeah that's sad but Oracle has decided to do that pay attention the query cache can be a bottleneck at high throughput and if you have a high write rate so the query cache is good always repeating queries and less writes but it can be bad in the other way so you have to know your pattern and you have to find out if query cache helps or makes you pain you see that in the process list there is a state called waiting for query cache mutex or something like that if you see that very often switch off query cache then table open cache table definition cache the new defaults are good that means you should not misconfigure it the new defaults means in MySQL 5.6 and later table definition cache should be as big as table you have in your own next cloud instance how many tables do you have 200 300 400 500 thousand not thousand 20 okay so the default are good okay so a two-digit number okay so the defaults are good for you do you have some multi how do you say multi tenant setups and you shim up a tenant and you will have these multi tenant setups or do you have such customers because multi tenant setups that can become an issue if you have 10,000 tenants and every tenant has 50 tables you have 50 500 thousand tables something okay so several tenants in one server okay so you do sharding okay very good well done table open cache the defaults are good so let's stop with that influencing read I already told about inodb buffer pool size should be 75 to 80 percent of your RAM okay and if you have a lot of concurrency inodb buffer pool instances should be age 16 depends on the number of cores the defaults are good in five six and later so my school variables influencing right we are just talking about inodb nothing else right query cache can have a negative impact if you have a high right on a rate on right table open cache table definition cache the same rules as before and now let's have a look at inodb writing inodb flush like a transaction commits influences the behavior of inodb how the data are written to the disk that means either safe or fast but not both so you have to decide as somebody if you was about safety and security and no loss of data was it you what is your advice here okay no advice good let's discuss what about the users safety that means slower for a faster faster safe you cannot have both you have to decide both okay depends the default is safe so you are on the good side if you have right performance problem the defaults are not optimal set it to zero or two for getting better performance with the impact you can lose one two let's say 15 seconds of data in case of a crash is that acceptable depends on your business how critical is it if you lose one two 15 seconds of data it's not corrupted it's just less lost is that the problem yes no yes okay so you have to invest more in hardware then you can work around that okay inodb lock files files before my school 56 the defaults were horribly bad okay in 56 the defaults are much better here we recommend the value between 64 max to 256 max do you huge data loads in to next cloud if you do this merging and federating stuff will that impact or will that have an impact on huge data loads small chunks okay so you huge imports so he should know what he's doing that so well what's the typical my school version they are using 55 older or 56 in your 55 something like that yeah okay so the defaults are bad change the defaults to have bigger values here pay attention with my school 55 and older it's a little bit tricky you have to shut down the database move the files change config start again 56 it's more relaxed you have just to change the value and then restart the database in this good okay inodb buffer pool size also influences right primarily it influences reads but also writes if the data are on this then we have first to read it and then to write it and so on so big buffer pool also helps to helps to improve the reads and is somebody using really huge ios systems or fast ios systems like SSDs okay okay SSD sounds good inodb IO capacity increase that the default is 200 IO per seconds this is much too small for huge SSD disk SSD disk can do 10,000 to 100,000 IO okay that's our my school variables influencing writes what influencing the access patterns so sequential inodb primary key is fundamentally important for the access pattern why because inodb tables are clustered by the primary key that means means the selection of the primary key influences how the rows are sorted in the tables okay so you when you create a table you decide how the rows are physically sorted on disk and in memory yes that's true only by inodb huh that means out to increment is typically good the rows are sorted by time increasing that's typically what we want the highest numbers is in cash and the highest numbers are the news data are the hot data we want to have them in RAM that's good what also influences sequential access pattern is covering indexes if you know we want to have a select on ABCDE and not from the whole table you can create an index on ABCDE and the rows in the index always sorted according to this pattern to get sequential scan through the index on the cost of more disk on the cost of more maintenance etc etc etc random patterns inodb primary key who was the guy talking about merging and federation and he left the room okay here you are but I've seen with ERP systems they are made for that they want to merge two companies and they like to use UU IDs from the inodb point of view this is really really bad what are you using other increment okay so bash bat primary keys would be hash but why any idea anybody it's completely random that means our data are spread over the 100 gigabyte data so we don't have really a hot data set because everything is hot okay a little bit less words is UU ID this depends on the version of the implementation of UU ID it changes on the left side and it hops around after I don't know how many numbers it starts again from the beginning so you have a chunk of rows and then suddenly a chunk of rows and the chunk of rows which does not really mean we have a really concentration of hot data so pay pay attention with such kind of primary keys non-covering indexes means we can make a range scan on the index which leads then later to random access to the table and as bigger the random access on the table is as more expensive it becomes just imagine an index of gender average distribution 49 49 2 okay so if you say go to the index of gender and select all females it will have a range on the index but then 50% of the rows from the tables in random access which is really really bad so pay attention on this kind of huge ranges on indexes non-covering indexes so let's talk about high availability and performance as understood next cloud is critical infrastructure right so we need some kind of high availability yes okay what do you recommend for high availability database solution okay you are aware that Mexico is not open source anymore right okay okay you know that read right splitting only works for how to commit okay yeah okay okay so a general misunderstanding of the concept of clusters is I see every time that the customers clusters are typically not for performance they are for high availability and this does not match to each other so it's either performance or a high availability but typically not both or mostly not both I don't know why people our things a cluster is for high performance maybe because of the top 500 where we always are talking about high performance but that's not what we do we have primarily high availability clusters so what we have is master slave oh that should be a mm master master so it's asynchronous and that's fast that's good and then as you said we also have Galera and this is synchronous that means it's slower synchronous always means I have to wait so if I have to wait I'm not as fast if compared to if I do not have to wait so it's always a question between synchronous and asynchronous so it's a question between Galera and master slave sharding is another thing to get high availability and more throughput you said if you have multi-tenant installation you do some kind of sharding so you have to set up multi-next cloud instances as well so there is no next cloud knows that part okay okay feature request for the future in xbox 13 this will be included okay great okay so sharding will be next next cloud otherwise my school has fabrics but yeah okay if you do it yourself it's better so do it yourself so you have it completely under control another thing that's very problematic high availability is not kiss kisser keep it stupid simple so keep it easy if you have a very experienced DBA or Unix administrators in straighter group then it's okay but we very often have customers ten thousand employees and three people in IT department they have to take care of Microsoft exchange server Microsoft how this is called this mail stuff and then I said there is a Galera cluster on Linux you can imagine how fast it will break it has nothing to do with Galera but clusters are complicated last week on our way it was this week I had a very clever customer a Windows only company they have to install Linux but they say we don't want to have a cluster because classes breaks more often than single instances so that's depends a little bit on your philosophy just keep in mind high availability is not case and with case you typically don't get high availability so you have to decide again and then question to your developers maybe you can say yes your class your software must be cluster aware this is not a Galera or my school specific problem this is always a problem with clustered setups the software must also be capable to deal with the cluster this is very often a problem with Galera they say all we use a Galera cluster and then Galera has some disadvantages or has special functionality and the software kind of cope with that but you say yes you support it so next cloud is cluster where good so Galera and performance I compare here single instance and Galera what do we expect from different patterns so read latency we typically expect the same latency than a single node instance or master slave right performance you say you don't do a lot of right so it's not an issue but right performance or right latency is worse because if I write to this node I have to wait to the group on this until they say yes we have it okay so right latency in Galera is worse throughput read is bigger because you can spread all you read too many Galera nodes what you do according to you say okay so what about right throughput this is equal to a MySQL demon or maybe a little bit bigger but we don't have a big issue with writing according to you okay then mistake we see very often with Galera setup that's for you as a user again don't set inodb flush log a transaction commit to one on the Galera cluster it will become heavily and horribly slow okay in Galera we can say if we get the okay from the other nodes it's on other machines the data so if we get a power off if we crash we can assure the data over there so we don't have to wait for the low slow disk so that's the reason why in Galera it's okay to set this value non equal one and that's now a question to you again do you know this variable in Galera we have some wait no Galera okay Galera claims to be a virtual synchronous cluster so your neck here should stand up why okay so first of all virtual synchronous means it's not synchronous otherwise I don't need this word virtual so what does it mean virtual synchronous virtual synchronous means they have the data over there but it's not visible yet that means you write to one Galera node by default and if you really read immediately after another node it's not arrived yet there so it's some kind of semi synchronous or as in chronos you are aware of that okay that's good so what do you recommend full synchronous or virtual synchronous okay so then the read latency will become longer it has an impact on the read so it's a question do you really need it or don't you need it to make it a little bit more interesting for you you can set a per session so if you know in your session oh this is really time critical you can say set session real synchronous and all the others is semi okay so what is evil missing indexes I hope we don't have anymore otherwise I will tell you how to find bad filters this happens very often with very flexible user forms I can filter by this and this and this and gender and date and etc etc etc so you developers have to make sure that the that the user forms are not too flexible and then we have a competition with the sales guy where is it he has a user request and then technical and doesn't matter match bad indexes too many indexes can also hurt not that much than missing indexes but you should not overdo with indexing then complex queries again frameworks select star is mostly not that evil we hear very often our select star is really really really evil that's only partly true so don't focus on select star focus on the filters focus on the indexes that's much more important sub queries for my school up to my school 5 5 sub queries was always a problem also for the databases in my school 5 6 it became a little bit better but it's still a problem so whenever possible you as a developer don't use sub queries write the join if possible the my school users manual there are some hints what you can what sub queries you can rewrite the joints then too many table joints so if you join 30 tables together it will not perform some frameworks do that so please avoid it will give slow queries long primary keys that means the joint feels makes it slower you have more CPU cycles so keep the primary keys short avoid hash you you ID and similar stuff on primary keys auto increment is mostly good except if you have time serious then auto increment is suboptimal blob and text and other trash should not be in the hops tables it spoils your ram your buffer pool so move them to another table which is lost less important to a cold or warm table so split a table into hot and warm part to keep the blobs pictures whatever you store somewhere else and then such things like lock files mouse click tracking monitoring and something you told me before in the in the last presentation somebody such patterns you should not store in the database there are better stores for it for locks trash and stuff are you doing that wait wait the documents and timestamps do you do you store timestamps and document last thread and such stuff okay yeah if it happens in huge amounts that could become a problem okay yeah yeah mail mail huh every read I write to the database yeah so every mail it touched and then it has an impact yeah yeah such things is not really good for the database so I'm optimistic that it should perform yeah okay then if not we can discuss about it where we should do something so that's more or less I have to talk about database performance and scalability stuff any questions up to you now come on do you want to go for lunch or what okay so if there are no more questions I will be here all the day I will leave around six o'clock so you can ask me later or you also the others will know where you can find me and okay thank you very much for listening and enjoy lunch