Peter is not here yet. MariaDB is basically six years old, launched in February of 2010 with our first GA release. Drizzle never really made many releases; they never really got too much adoption, and we'll go into why they kind of didn't quite work out, I guess. Anyone use Drizzle before? Okay. And WebScaleSQL was announced in March of 2014 — has anyone heard of or used WebScaleSQL? Interesting. WebScaleSQL started off as a consortium of companies deciding to actually make MySQL better. And these companies, it turns out, have a lot of MySQL hackers. Facebook probably easily has 30-odd, maybe even 40 MySQL hackers. Google probably has close to 20. The folk at Twitter have at least two or three. So it turns out that the MySQL hacker community outside of MySQL is also pretty vibrant. So we'll go through a little bit of the history of MySQL, just because you kind of have to know your past before you know your present and future. I'm going to take a wild guess that many people didn't use Unireg. Did anyone here use Unireg? Some of you weren't even born then. Yeah. Monty actually started by making a text user interface to an ISAM data store, where the rows were written to disk and the indexes were kept on top of that. And basically he had this setup where you would actually have forms to enter data. That is where the FRM files you see inside your MySQL data directory come from: they're actually forms to enter data. So think of it in the way that Access is a database: you enter stuff into a form and it stores it in a database. Then somewhere along the line, mSQL came along, freely available but not open source. So Monty basically decided to plug the ISAM data store into mSQL. And then in '95, when the internet was starting to get popular — I'm going to take a wild guess here that most people got their first Internet connections around '94, '95; that's when it started becoming pretty available to commercial users, AOL started sending out free CDs, Prodigy, et cetera. I know this because they'd send the CDs to Malaysia too, for some absurd reason. So Monty, alongside Allan Larsson and David Axmark, founded MySQL AB in '95, and '96 brought one of the first releases that one would say was very usable. It powered things like Slashdot. People still read Slashdot, yes, maybe. Well, Slashdot was all the rage; maybe today it's Hacker News, I don't know. And then in 2000, Monty and the rest of MySQL AB made a pretty phenomenal decision to actually make it GPL — to make a database GPL. It was also shortly thereafter — and I guess many of you probably also use Linux; using Linux in 1995 or 2000 versus 2016 is vastly different. Like, in 1995 you'd probably have to work out how to write a modem driver; today you just plug it in and eth0 just works magically, and who uses a modem anymore. So shortly thereafter, they decided that a good way to make money would be to actually create a dual license, especially around the client library, because it's hard to make money from, well, it's hard to get GPL software embedded, I guess. And then the holy grail was statement-based replication. So you didn't have to scale up, you could scale out. It's not the best form of replication, as you'd see later on — we introduced things like row-based replication — but statement-based replication meant you could buy many, many cheap white boxes and actually have reads scaled out.
And if you think about the evolution, May 2000 is when you'd have statement-based replication, and in the Postgres world, replication out of the box basically only came around September 2010. That's a huge delta before out-of-the-box Postgres became as usable as MySQL from a replication standpoint. 3.23 was probably very popular and was at one stage one of the last client libraries that actually got shipped, because of the license change, and it took a lot of convincing until maybe around 2004 or 2005 before other Linux distributions started shipping the GPL client library with the FOSS exception. 2001 also brought InnoDB, which presumably all of you use today. Who here does not use InnoDB? That's what I thought. And InnoDB brought transactions and MVCC. It wasn't MyISAM any longer: if it crashed, it could recover. And many people were excited by this. It made MySQL a "real" database — real; it was always a real database. And then in 2001 you'd have VC dollars, and Mårten Mickos comes over from Solid to actually become the CEO. He was the CEO until he led the company to an exit, actually. And in 2002, MySQL also did something quite interesting. The NuSphere case, with the Gemini storage engine, was basically the first test of the GPL in the courts. It showed that the GPL is actually enforceable. Gemini was eventually open sourced, but then abandoned. And if you're still interested in finding the Gemini engine, Anthony Curtis, who occasionally comes to these SCALE events, actually has a tree sitting on Launchpad, possibly now GitHub, that you could take a look at. 2003, with Series B money, was also the time that engineering was invested in really heavily. We had a huge growth spike in terms of engineers and everything else that came along with it. And it was pretty interesting that when 4.0 became GA, there were two alphas at the same time as well. It turns out that shortly thereafter, SAP was also realizing that it made sense to make MySQL a real database. They wanted views, they wanted triggers, they wanted stored procedures, et cetera. And at the same time, they had this idea that it might make sense to also have another SAP-compatible database, called MaxDB, written in Pascal. You can still probably download the sources to MaxDB and get it running with Free Pascal today — I haven't done it in quite some time, but Free Pascal ships at least in Debian. 2003 also meant that Alzato got acquired, and that brought in NDB Cluster. And pretty much every other cell phone home location register runs NDB Cluster today, I would say. It gives you five nines uptime, and it has improved tremendously: previously it was only in-memory storage, then it gave you on-disk storage. Oracle has put huge amounts of effort into making NDB Cluster better today. In fact, deploying NDB Cluster and upgrading it used to take 54 commands or something, but now Oracle has actually, at least from OpenWorld last year to now, improved tremendously how you'd manage NDB Cluster, thus making it a possibly much more usable database. You don't have to be spending millions of dollars and figuring out how to actually do stuff. So I think this is going to be really interesting going forward as well. So kudos to Oracle. In 2004, 4.1 became GA, and then you also have NDB Cluster. And it turns out that because of all that money we'd raised, pretty much all the external developer community got hired. If you wrote a patch, you probably got hired. If you hung out on the IRC channel, you probably got hired.
You probably went through five or seven interviews, but getting hired was really, really easy, because if you were passionate, the company had the money; they would pay you to work anywhere. And then 2005 brought us 5.0. And frankly, that was one of the first releases where I had noted the ABI got broken when it became GA. It was broken, actually, by Monty, and one of the first things I had to do was tell Monty, hey, you broke the ABI. It was kind of a terrible release, but it had lots of enterprise features: it had the views, triggers, stored procedures. It was definitely driven by marketing. And I would say that even internally, to get it running on Eventum, which was a support database, took close to, like, a year. So it really only became GA-quality quite a bit later. Now, something else happened. Oracle decided to acquire Innobase, the makers of InnoDB. So Oracle has had its foot in the MySQL world since 2005. And we were all terribly worried, like, oh my, we need a transactional storage engine — how is this possible if our competitor owns a transactional storage engine? So Monty started, with a few engineers, the Maria project, to make a crash-safe and transactional version of MyISAM. When times of crisis hit, the next best thing to do is raise more money. So we took even more investment. And that meant we could now acquire Netfrastructure and give you the Falcon engine, which was supposed to be able to store blobs — was good at blobs — was a Jim Starkey engine, had MVCC, and was meant to solve the InnoDB problem. And at the same time, we said, look, we're not going to just be dependent on MyISAM and InnoDB, or NDB and Falcon. We want everyone to write engines. We'll make a pluggable storage interface so that anyone can write an engine. And it just exploded: there are probably like 40 pluggable storage engines out there — you could stream HTTP inside one, you could write to S3 buckets, and so forth. As for the quality of the engines, they varied tremendously, so not all were fully supported and so forth. So we were the only database manufacturer with a pluggable storage engine architecture. This is not true anymore. I mean, today you have MongoDB with storage engines for about a year now, plus you have Postgres, I think, talking about it as well. And it turns out that MySQL Cluster development was starting to happen at a rapid pace. They said the server is being developed too slowly, so we're going to fork internally. So MySQL Cluster actually had its own fork — it might have been the first MySQL fork sanctioned by the company. And they put cool stuff in, like row-based replication, circular replication, online DDL, all this in the cluster fork. And NDB, because it was bought and then tacked on to the SQL interface, was actually probably the first NoSQL engine out there, able to have other interfaces to NDB without a SQL interface. You could use it as a key-value store from the start. You could use an LDAP front end with an NDB back end, and that worked as well. So when people think that forking of MySQL only happened shortly before Oracle picked it up, NDB had already forked years ago. 2006 also brought along another engine, PBXT, which was, again, in some instances performing better than InnoDB, especially right before they kind of killed it. It had a log-based design. It had pretty good performance. But sadly, you don't really hear of PBXT much anymore. It only got shipped in a few releases of MariaDB before it died. And you think, right, what makes engines different?
And there are a whole bunch of things, like how it's stored — maybe on disk — the type of indexes it uses, whether it supports transactions or not, locking levels, caching, backups, foreign keys; all of this was basically given to the engine to handle. And this is different if you look at the MongoDB world, where they're actually giving you things like GIS in the server. They're giving you many other things in the server, and the engine doesn't have to provide as much, so it's much easier to write an engine. You know, anecdotally, you can see that the RocksDB engine that Facebook had is basically running in production with something called MongoRocks for the Parse servers today. And it took them less than a year from the time the engine interface was announced to the time that this was integrated and running in production, which is phenomenal, because getting an engine in the MySQL world stable and into production in less than a year is a pretty hard task. So November 2007, the MySQL 6.0 alpha came out as well, and we were regularly working on the 5.1 tree. We spent most of 2007 being extremely quiet; we were planning to IPO the company. 6.0 brought interesting things like online backups — so in-server online backups. And there's not much of this that's still available on the internet today; I think all the trees have been taken down, except if people still have copies. But there is at least one video on YouTube that talks about online backup. History gets erased pretty quickly on the internet. So obviously we didn't IPO. Sun bought us. And shortly thereafter, the Drizzle fork started. They realized that they should fork MySQL 6.0, make it more modular, make it more microkernel in terms of architecture, have UTF-8 out of the box. And this was a special team within Sun Labs that could work on this. November 2008 also brought the 5.1 GA. And there's a theme building up here: when 5.0 was GA, it was clearly not a GA release; when 5.1 was GA, it was clearly also not a GA release. But Monty discovered how to blog. So he actually wrote this really long blog post about bugs in 5.1. So you can kind of tell that Monty was getting ready to exit the company sooner rather than later. Now, a lot of people don't give much thought to OurDelta binaries and the work that OurDelta did. Anybody remember OurDelta? Okay. OurDelta actually decided that they would take these patches that were coming from the community and actually make binaries, make supported binaries. And their claim to fame was: look, we give you these binaries and RPM repositories and Debian repositories — back then no one else provided this. The only thing they expected in return was for you to actually file bug reports, or if you felt like you needed to buy services, you could visit Open Query. That was actually pretty interesting. And they built this system called the Bakery, which survives till today, because MariaDB makes use of the Bakery to some extent. It's a pity that they stopped building binaries, and I guess it's largely because resources were crunched, and they were building largely on Percona's patch sets as well, and Percona decided that they would stop making a patch set and actually release — I think it was 1.0.2. So around December 2008 is when Percona basically said, here is Percona Server 1.0.2, enjoy. And it wasn't called Percona Server; it was probably called Percona SQL. They had a whole bunch of names before they actually settled on Percona Server. They even had these builds called the highperf builds.
And that graph I show you there is picked out from 2009, when Flickr themselves said they had migrated to the highperf build and made it public that they had done that. So Flickr was one of the early adopters of the Percona Server highperf build. And you would see that Wikipedia had a patch set that came out from Domas — there was an InnoDB plugin for AIO read improvements, and a lot of other Percona patches overall. So Percona, Google, and Facebook were the people who were actually making MySQL better even then. And OurDelta, again, were taking the Percona patch set and adding things like the microslow patch for slow queries. They had faster master promotion, which they'd taken out of the Google patch. So there were all these little patches that they were actually building binaries for. So that was an active ecosystem then, too. I should probably also mention at some stage that there was Proven Scaling in between, who were republishing MySQL Enterprise binaries, because they had bought a subscription and then they'd give it to you, binaries and source. In February 2009, Monty basically decides to leave Sun. And you will note very carefully that it was only in April 2009 that Oracle proposed to acquire Sun. Monty had already left Sun before Oracle had proposed to acquire Sun; this part of history occasionally seems to get muddled. There was also a final 6.0 alpha release, which was pretty much dead — this was the last of the 6.0s. Basically, at this stage, you would see that Falcon was dead and Drizzle was not looking very promising. But what did happen after Oracle proposed the purchase of Sun was that Monty Program was started shortly thereafter, and there was a long battle in the EU. And all this happened at the MySQL conference in Santa Clara. It didn't take very long before there was a beta release of MariaDB 5.1. There were a whole bunch of MariaDB releases in between, which is why you'd understand eventually this numbering scheme where MariaDB goes from 5.5 to 10.0, because it took time for Oracle to actually buy Sun, and it also took time for Oracle to actually make a GA release, from '08 to 2010. But when 5.5 came out, it was heralded as a pretty good release. People really enjoyed 5.5. In fact, MariaDB rebased on 5.5 as well. So you'd realize that the MariaDB 5.2 and 5.3 trees were built on 5.1, but MariaDB 5.5 was built on MySQL 5.5. And it didn't take very long for Percona to release a Percona Server 5.5 stable as well. And there are obviously various things being peppered into releases: 5.2 had the idea of virtual columns; it was a release that was filled with community features. 5.5 had a lot of performance improvements, from Oracle as well as MariaDB and Percona. You'll now note that there was a pretty huge delta to get MariaDB 5.5 out. It was 16 months after MySQL 5.5, and that was largely because of all the merging and the code changes that had gone in. It became clear that this model of merging with such huge changes, especially in the optimizer, was probably quite unsustainable. The MariaDB Foundation — anybody heard of this one? Okay. So, yeah, they announced its creation in London, November 2012. And then it wasn't much longer after that that MySQL 5.6 became GA. So here we were with 5.5 and 5.6, and we thought, sometime in late 2012, that 10.0 was going to be the next number, because we knew 5.6 was going to be worked on, and we didn't expect 10.0 to take such a long time to make as well.
Around that time, SkySQL picked up Monty Program — SkySQL being this venture-backed company, Monty Program being a band of hackers just working. And Percona Server 5.6 became GA as well. And you'll notice that they also had a delta, because Percona believes in doing heavy QA, heavy testing, and they also wait for all the bugs in bugs.mysql.com to get fixed. And this is no testament to the quality of 5.6 — 5.6 maybe at GA wasn't so hot, but after a couple of releases it got really, really, really rock solid. I mean, 5.6 is what powers a lot of the internet that you see today. So, again, a testament to Oracle learning and fixing things and being responsive to their users extremely quickly. And then 2014 brought us MariaDB 10 GA. So, kind of a little delay there. And around the same time, the consortium of web companies realized that there would be a WebScaleSQL branch which they would use as a base. So Facebook uses it as a base to actually add more features inside their tree, and other companies are more than welcome to do that as well. It's probably also worth noting that a lot of Linux distributions, by the time 10.0 came along, also paid close attention to MariaDB. And Percona also started paying close attention to Linux distributions and started working closely with the Debian and Ubuntu teams, I guess, including with Galera Cluster, because Galera Cluster and Percona XtraDB Cluster become tremendously important when it comes to the OpenStack world. Anybody use OpenStack? Okay, brilliant. So, ah, before we go on to talk about — I believe the next slide should be the one where I talk about Codership — I'm going to tell you that there is something else worth mentioning as a milestone: in October of 2015, separated by two days, MariaDB 10.1 became GA, followed by MySQL 5.7 becoming GA. So, today, you have two fairly new GA releases that you can kick the tires on. Codership. They aim for at least three releases a year, and just a couple of weeks ago in January they did make a release as well. Their upstream is galeracluster.com. They've been open source for probably in excess of six or seven years. There are two major distributions of Codership's Galera Cluster: there is Percona XtraDB Cluster, PXC, and MariaDB Galera Cluster, MGC. And PXC is right up to MySQL 5.6 plus Galera, because Percona has not released Percona Server 5.7 yet. And MariaDB Galera Cluster is right up to 10.1. And there are plenty of users of Galera Cluster out there, including people like GREE, HP (DNS as a service, OpenStack), MercadoLibre, PagerDuty, AVG — the list goes on. This is something that OpenStack people love as well. Now, when it comes to the open source community, I'd say that, generally speaking, MariaDB not only takes external contributions, they are also very, very kind to external committers. So, after enough contributions, you can get commit rights relatively quickly. MySQL has also been very good with taking community contributions — they're definitely welcome. But so far, external commits are not happening, and we hope that will change at some stage. MySQL has already moved to GitHub; it's so much easier to just submit a pull request — it's easy to fork, make a change, and submit a pull request. Maybe that will change in the near future. One of these features that I think was pretty good would be generated columns. That was a patch that sat in bugs for maybe 10 years. It doesn't matter that MariaDB took it five years earlier; it's still a case of, it sat there for 10 years.
If you're a contributor and you wait a long time before your feature gets accepted, you get pretty annoyed pretty quickly and move on to something else. MySQL is much quicker when it comes to accepting patches these days, even if you don't work at Alibaba, Facebook, Google, LinkedIn, or Twitter. Now, Google is an interesting contributor to MySQL. They review pretty much every feature and every patch that goes in there, despite not running MySQL in production. The major concern with MySQL is that everybody there doesn't want to sign the Oracle contributor license agreement, with the exception of Twitter: they have made it clear in public, probably last year, April, that they will give everything to Oracle if it were given under the Apache CCLA, but they just won't sign the regular CCLA. I hear lawyers are working on this, but we don't see any progress just yet. Percona will accept your bug reports, and they will probably accept your payment for features, but they do not accept commits into their tree either. I think the other interesting way to get contributors is Google Summer of Code. We ran this at MySQL for some time as well, and shortly after, I'd say, probably the Sun acquisition, Google stopped giving us space in Summer of Code. Then, again, MariaDB managed to get in, in 2013. It turns out that we've obviously improved the way development happens, because we are shipping their code within a year — as opposed to not shipping it. A good example would be the per-query variable settings that you see today in MySQL, Percona Server, WebScaleSQL, and MariaDB. All of this basically came around the 5.6/5.7 mark, but that feature was written by a Google Summer of Code student back in 2008, I think, or 2009. So that student never saw his code shipping for a very, very, very long time. So a great way to grow the younger community would be through things like Summer of Code. I guess the other good thing that we're seeing from Oracle is that they're also hiring a lot more younger developers. There are a lot of developer names you can see in the bugs and MySQL commits of people that I obviously have no idea who they are, and they're obviously new. That's really good. Now, when it comes to security, it's always worth watching for what is known as a Critical Patch Update, and Oracle just released one probably two days ago. The real problem with Critical Patch Updates is that Oracle is very obscure about their CVEs. They give you a CVE number saying that they fixed an unspecified vulnerability. This is a problem. So this then gets tagged, and when MariaDB reviews the changes — and Percona does this as well — one has to go through the changes and then figure out which CVE it is and what was actually being fixed. So it's a lot of extra work, and I guess this is what's making some distributions like Debian get annoyed lately. It's a lot of extra work. A little annoyed, yeah. So with MariaDB and Percona Server, you don't have to wait for CPUs (Critical Patch Updates) or the next release. Like, if the security fix warrants it being fixed in less than 48 hours, we inform the packagers first, and then we make a release so that everyone is on board. And for one issue, I guess, MariaDB was very quick to get patched. This is the one where if you tried up to 300 times, you would definitely get a root login to MySQL. It was an error with memcmp.
So, yeah, I think when it comes to security, this is something you need to think about as a DBA, as a user, because security matters, and making sure that things work out of the box and that people can't crack you also matters. Another favorite question I get is: is MySQL dying? Frankly, no. I think it is actually at its most vibrant now. You know, you should see the chatter: just yesterday, many people got their acceptance and rejection notices for the famous MySQL conference that happens in April. And there are people going on Facebook going, I wish I was there; why wasn't this talk accepted? I am developing this on Alibaba's cloud, and it's running in production — don't you think people would love to hear this talk? And: I was going to have a one-week holiday with my family, and now my plans are shattered. MySQL is far from dying. There were like 400 talks submitted. There are at least maybe 100 hackers working on MySQL who are not working at Oracle. If we add the packagers — and the Debian MySQL team does a wonderful job; I think they have maybe at least eight or nine active contributors — there's a huge chunk of people working on making MySQL better. I cannot imagine why people would actually say it's dying, or the ecosystem is dying. How many people here are not using MySQL anymore, or something in the ecosystem? See? Though you are in the MySQL track, so maybe I should have asked this somewhere else. So, you know, I think the external contributions, the people using it, the shared knowledge — and Oracle's actually, frankly, doing a pretty good job of running MySQL. I think it's getting better and better with MySQL releases, and you benefit. And when, say, MariaDB does something, and Oracle does something similar in the next release, that means you benefit as well. And if WebScaleSQL does something today, and you pressure Oracle to have such features, or you pressure MariaDB to have such features, again, you benefit. The ecosystem benefits by having many players — I mean, look at the Linux kernel, right, as an example. And if you think it's dying — oh, that looks pretty bad; if I could turn the lights off, which they said I couldn't do, then it would look much better. Anyway, for what you can't see in blue, you'll just have to trust my word that they run MariaDB. For what you can't see in pink, they run some form of WebScaleSQL and are including their own trees. For those that you can see in white, they're most likely just running stock MySQL, which is brilliant — with the exception of Amazon: Amazon has got stock MySQL, they have got this thing called Aurora, they have got MariaDB, they've got lots of things, and they're not really stock; they have different backends and stuff. And for that one thing you can probably see in red, that one doesn't run MySQL, and I think you can probably guess why: it's a Microsoft service. So that's the Alexa rankings of the global top 20 websites, and they're all powered by something in the MySQL ecosystem. So really, I can't imagine why you'd think MySQL is dying and/or dead. And people like the Alibaba Group — you'll see Taobao there somewhere — they gave MariaDB multi-source replication, and multi-source replication went on to being used in production at Tumblr even, and so forth. And it didn't take much longer, I guess, from the 5.6 to the 5.7 release, that now MySQL also has multi-source replication. So you can see features in one tree first.
And today, like you see in 5.7, there's wonderful JSON functionality, which you could see in an earlier, vaguer form in the Facebook tree, and the MariaDB tree will probably get that in the 10.2 release. So I took a look at the people that actually do share stats, and I guess popcon is not the best way to actually track people, but in the absence of any other statistics, it should just give you some vague stats that do exist. You'll actually realize that defaults matter tremendously. For example, MariaDB 10 has probably about 10% of the installed user base compared to MySQL 5.5 in popcon, and if I'm not mistaken, MySQL 5.6 has an even lower, around 2%, install base — so there are more MariaDB 10 users than there are 5.6 users, and that's largely because the default is MySQL 5.5. Ubuntu, surprisingly, has people using 5.1 — I'm not sure why — and then 5.0. Ah, it's an LTS release. Okay, so I guess the Ubuntu LTS releases are really popular; I mean, the margin from 5.1 to 5.0 is huge, so I guess there are a lot of LTS users. And then you see people using 5.5 and then 5.6, and then you've got 10x fewer users of MariaDB, and 4x fewer than MariaDB for Percona. So the numbers — you can go check them out at popcon.debian.org or popcon.ubuntu.com, and you'll be able to see this yourself. But today you also get stats from things like Docker Hub and the Juju charms; they all export stats to you, so you can actually start gaining a better understanding of who's using what today. And then we're not only talking about the top 1% of the websites out there, like the Alexa top 20: OpenStack ran a user survey of about 1,300 users. And I was surprised, because in the Galera cluster space, Percona XtraDB Cluster came out first, and it genuinely used to have, I would say, way more users than MariaDB Galera Cluster — but this survey suggests otherwise. So that's, I guess, good for us, but we don't know how many people are using this and so forth. So it's well worth taking a look at, and we have to rely on these user surveys because we don't really have feedback or update plugins — we do have a feedback plugin, but people don't turn it on. Also, I think this kind of survey will change the next time the OpenStack user survey is run, because, as I said, now that MySQL is investing heavily in NDB Cluster and Docker containers and running all this in OpenStack and making it easier for people, the next time they run this survey — probably, you know, maybe this year in October or something — you may see a marked increase in NDB Cluster as well. This was October 2015, so the latest that I could get. They may have another survey before their next conference, I don't know. Okay. So yeah, you can get this from openstack.org, the public user survey report. And then you think further about the other interesting companies — and there are also so many others, like Yahoo, Yelp, Dropbox, Booking, Pinterest. If you go to GitHub and you look for MySQL-related utilities, you will actually be surprised at the amount of utilities that are available today and the vibrancy around it. I mean, there is a great tool called Orchestrator that allows you to visualize your replication topologies, and that's being used at places like Etsy and Booking.com. You know, it's written in Go, by one chap, and it's amazing. So there are huge chunks of technology being developed for this ecosystem. So again, I cannot imagine why people are worried about MySQL overall. You know, 5.7 has a whole bunch of interesting features. Some of them include turning on GTID online. There is obviously multi-source replication. There's GIS functionality, which I think is cool. JSON functionality. And there's a website called www.thecompletelistoffeatures.com that will list them all out for you. Or you could read the wonderful manual — and truly, the MySQL manual is wonderful; search is not so wonderful, but the manual is wonderful. I think there will be a 5.7 talk by Dave Stokes, right, Dave? Just JSON. You want it? Exactly. It will definitely take more than an hour. So 5.7 is really rocking, so I suggest you check it out.
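Since the 5.7 JSON functionality keeps coming up, here is a minimal sketch of what it looks like from application code. This is an editorial illustration, not something from the talk: it assumes a locally running MySQL 5.7 server, the mysql-connector-python package, and entirely made-up credentials, schema, and table names.

```python
# A tiny, hypothetical look at the 5.7 JSON column type and JSON_EXTRACT().
# Assumes a local MySQL 5.7 server, the mysql-connector-python package,
# and made-up credentials ("demo"/"demo") and a made-up "demo" database.
import mysql.connector

cnx = mysql.connector.connect(host="127.0.0.1", user="demo", password="demo", database="demo")
cur = cnx.cursor()

cur.execute("CREATE TABLE IF NOT EXISTS events (id INT AUTO_INCREMENT PRIMARY KEY, doc JSON)")
cur.execute("INSERT INTO events (doc) VALUES (%s)", ('{"user": "alice", "action": "login"}',))
cnx.commit()

# JSON_EXTRACT() is one of the JSON functions that shipped with 5.7.
cur.execute("SELECT JSON_EXTRACT(doc, '$.user') FROM events")
for (user,) in cur:
    print(user)  # -> "alice" (returned as a JSON-quoted string)

cur.close()
cnx.close()
```

This kind of document-in-a-column access is, roughly speaking, the territory that Facebook's DocStore work and the planned MariaDB 10.2 JSON functions also aim at, each with their own syntax.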
Now, I just picked on some of the features that MariaDB 10.1 still has that 5.7 doesn't. And MariaDB 10.1 is rocking, and we unfortunately don't have a MariaDB 10.1 talk, so you'll have to go to the knowledge base and take a look at it. Then there's Percona Server 5.6, and we're still eagerly awaiting the release of 5.7, but there are cool things there, like proxy protocol support, so you can use things like HAProxy; per-query variable settings in a 5.6 release; the UDFs are great; fast index creation. WebScaleSQL. WebScaleSQL is interesting because there's a lot of hype around it. There are a lot of large orgs behind it, but the funny thing about WebScaleSQL is that one can't actually get builds — production-ready builds — from them, except from this one fairly small company called PSCE. I'm going to guess they're a bunch of ex-Percona consultants who formed this company, and they make production-ready builds, but I don't actually know if people are using them. But there have been some interesting changes, like you can specify in milliseconds, as opposed to seconds, the read, write and connect timeouts from the client side. There are also things like a partial implementation of an asynchronous MySQL client library, with the full implementation in the Facebook tree. super_read_only, to prevent writes by SUPER users — this also made it into 5.7. The option of an InnoDB idle flush percent, so you can tune page flushing: when the system is relatively idle you don't flush so much, because you don't want unnecessary writes, to extend the life of your flash storage. So some interesting changes as a base, and I believe that most of these merges are now actually being done by Laurynas at Percona, as opposed to being done by Facebook themselves, because the guy who was really into this decided he should move to another part of Facebook to make those cool and seamless VR things. Facebook has a wonderful, wonderful tree, and I highly recommend that you look at that tree. It has pretty horrible documentation, but every commit tends to equate to a feature, and in the commit message you get a pretty good understanding of what's going on. And something that was just added recently was the option of max running queries and max waiting queries, so you can control the maximum number of running queries on the database and the maximum number of waiting queries as well; if the wait limit is crossed, the queries will simply fail. That is something you don't see in other shipping distributions. Facebook then created DocStore, and they spent a lot of time speaking to, I guess, the MySQL folk, who then implemented it in kind of a different way, and I'm not sure which one Facebook ends up using. And for you to end up using it, I'll give you a hint: if you download the tree, you need to actually start the server up with --allow_document_type=1, because otherwise it will not work.
Again, it goes back to very, very bad documentation: it's not really made for end users, it's made for Facebook. They also work heavily on RocksDB, and they also include MyRocks, so they back RocksDB both from the MySQL side as well as the Mongo side — and I guess on the Mongo side, if you're interested, you know, go download Percona's MongoDB server. There's the full asynchronous MySQL client support and a whole bunch of other things, so again, I'd highly recommend you check out the tree. You know, I initially said that we're going to talk about Twitter, and Twitter has contributed features into the WebScaleSQL branch, including the InnoDB idle flush percent which I talked about earlier. But I believe that when Jack Dorsey became the CEO, he fired a lot of people; he decided that open source wasn't so important anymore. They still do have MySQL engineers, but they haven't actually made any updates since about mid-June 2015. To the best of my knowledge, they still run MySQL. They also made a project called Apache Cotton, for Mesos, so you could run MySQL in a clustered environment, and it turns out that, I believe, the Apache Cotton folk were also fired, so I don't know the status of these things — and I could not predict this when I submitted the talk, because they were still employed then. So, AliSQL. This one's interesting, and there's not much documentation out there, except for the fact that they work in private and they actually send features either to WebScaleSQL or to MariaDB. But they're going to talk about this later in April and also put the tree out there in public for you to use. And AliSQL is what powers Aliyun, which is their cloud offering. So, you guys here have this thing called Cyber Monday, yes? It's the Monday after Thanksgiving; everybody goes crazy buying stuff online. They have the same thing in China; it's called Singles' Day — it's nothing to do with Thanksgiving; I guess if you're single, you buy stuff. Anyhow, they have this idea of very hot SKUs, and they have made improvements so that you can do inventory deductions on hot SKUs, and they're going to be sharing more about that soon. They've given us stuff like logging shutdown information, as well as the InnoDB auto-increment work — I've put an MDEV there because that one is still being worked on — and they're also working on column-level compression, as opposed to just row-based. So there's a whole bunch of interesting things that are running in production for Alibaba's entire systems, as well as being offered to you via Aliyun. And Aliyun is possibly going to become something, as they've decided to have another data center here in the US, so people are going to start using Aliyun. The non-server ecosystem is actually also quite expansive. You'd realize that MySQL is pushing amazing things out there: they've got Router, Fabric, group replication. You see that VMware picked up Continuent, and Percona has obviously still got the Toolkit and XtraBackup. ScaleDB is out there; there are still things like MMM and Severalnines ClusterControl — lots of interesting stuff. GNU/Linux distributions, with the exception of Debian and Ubuntu, have mostly given you MariaDB, so beware, because when you ask for MySQL and go, hey, I'm trying to use this JSON stuff, hmm, wonder why it doesn't work — well, it's probably because you're using MariaDB and not MySQL, so keep that in mind. I guess Debian is having discussions now — they're actively talking about whether they should drop MySQL or not, and I think they have precedent in doing this once with Elasticsearch — which is kind of scary, because actually MySQL shouldn't be dropped. I think variety is a good
thing, not a bad thing. So you obviously have the choice of MariaDB Server and Percona Server in Debian. And then we should talk about where some of these folk are now, right? So, Drizzle is basically dead, and a large part of the problem is that it was a single-company open source project. So there should probably be a list of things we write down that you should evaluate before you actually back an open source project or product, and one of them is: never, ever go for a single-company open source project — if the company decides they're not interested in developing it anymore, there goes your project. Most of them went to work on OpenStack. PBXT is actually doing relatively well inside this product called TeamDrive — it's like Dropbox, but used heavily in Germany amongst the media industry — but they're out of the engine business. Infobright, last I heard, was pivoting to make a Postgres engine. Calpont went bankrupt, so MariaDB Corporation picked up some of the assets and Oracle picked up most of the crucial assets. And then Tokutek — Percona picked them up last April as well. And then there are a whole bunch of other dead engines, or machines that had engines, like Kickfire and Schooner, and all of that went through acquisitions. As for the future: you know, 5.8 is definitely being worked on. I suggest you take a look at the worklogs, follow the blogs, and look out for the interesting work around cluster and group replication. Group replication's interesting feature that came out this week was that it also runs on Windows — this is one bonus it has over Galera Cluster, because Galera Cluster basically only runs on 64-bit Linux. Percona has said that they never aim to become a fork, and they will work extremely closely with Oracle. Look forward to their 5.7: they've made one alpha release, and I guess you can see the documentation of what's being ported from 5.6, as well as a rough estimate. 5.6 for Percona Server, I think, lagged by about 8 months; I guess 5.7 will be a lot less than 8 months, probably a lot quicker. MariaDB 10.2 — the planning meeting was last quarter, and it's still extremely open for you to join in and plan, but there will be things like window functions, common table expressions, making it 5.7 compatible. WebScaleSQL has made it quite clear that they're going to skip MySQL 5.7; they will maybe backport interesting features, and then they will look at 5.8 closely. It's also worth noting that there are plenty of these cloud services available: Amazon now gives you MySQL in various versions, as well as MariaDB. Rackspace Cloud gives you MySQL, MariaDB Server, and Percona Server, and they have got two versions, including a high-availability version. Google Cloud SQL gives you MySQL. There's Aliyun with MySQL and TokuDB. A whole bunch of others are, like, PaaS players. Then there are some, like Pivotal Cloud Foundry, which will sell you a MySQL PaaS, but if you look at the fine print, you're really getting MariaDB Galera Cluster 10. So, I don't know, you have to be really careful as to what MySQL actually means to vendors, and to you, at the end. There are plenty of places to actually discuss this and follow development, and I think one of the better ones is Planet MySQL. Then the maria-developers and maria-discuss lists are pretty active. On IRC, #maria is pretty active, #mysql is pretty active. The Percona discussion list is very active on Google Groups, so you don't have to start a Launchpad account, which I hear is a bonus sometimes. MaxScale also has a group, if you're interested in that. If you're interested in Galera, the Codership team
is awesome. It's a pity that the internals list is kind of dead, and a lot of discussion is happening today on Facebook and Twitter, of all places — Facebook being a completely walled garden, which makes it really hard for you to remember or search or figure out what was said before, because it was on someone's wall, and Twitter because you have 140 characters, and how much discussion can you get there without most of it being snark? And if you are interested in Google's Vitess — or should I say YouTube's Vitess, which powers YouTube today — the Vitess discussion list is also extremely lively, and it's also a Google group. When it comes to bugs, good old bugs.mysql.com works. You do need a Launchpad account for Percona bugs, then you need a Jira account for MariaDB bugs, and just to add to your fun, if you're following WebScaleSQL or the Facebook tree, you need a Phabricator login, and for this it's preferred that you have either a Facebook account or a GitHub account. You're starting to realize that keeping track of all of this is not as easy or one-track as one would assume — it's no Bugzilla. You know, I could add the distributions as well, and that's actually quite hard too, because they all have different ones; bugs.debian.org and bugzilla.redhat.com are actually two of the better ones as well. And then there are all these conferences and events that have a dedicated MySQL track — and wow, I have four minutes left. You are actually at one of them. FOSDEM is the other one, which happens next week in Brussels, so if you like beer, get a flight. Then there's the Percona Live Data Performance Conference in Santa Clara in April; this one I highly recommend you go to if you're interested in MySQL, because this is the old MySQL users conference — everyone goes there to learn new things. Then they have one in Europe. Oracle OpenWorld is great; it's not as big as I expected it to be, but it's a great place to go learn new things, and they have awesome concerts — I think Elton John and someone else played last year; it was a thrill. Then they have all these db tech showcases in Japan, so if you fancy visiting Japan at some stage, they have them every couple of months. And if you like Buenos Aires — the steaks are huge — there's a database ops event in Latin America that happens around November, highly recommended. These are all, like, MySQL focused. I don't know if you care about Gartner, but if you do, you'll realize that there's MariaDB, there's Percona, then there's Amazon Web Services and Oracle; if this sort of thing makes a difference for you, you're more than welcome to read it. What should you use today? I think you really want to think about the innovation that's happening today. Avoid vendor lock-in like the plague, because you want freedom and you want vendor independence. You want to ensure you're well supported. Remember, you don't go for a company because they change the website every year — think about it, that's not their product. There are plenty of support vendors, some of them are here, available for you. You can also look at funding, revenue, et cetera. Stonebraker — famous guy in the database world, I'm sure all of you have heard of him — once said that for Facebook, using MySQL is "a fate worse than death", so the nice folk doing MySQL at Facebook decided to print that, and I took a photo. I reckon that mastering the ecosystem takes a lot of time, because MySQL has been around for a long time, so there are lots of wide and varied solutions. Some good books that you may want to read: MySQL High Availability, I think it's in its second edition; High Performance MySQL, a great book co-written by Peter, totally outdated — Peter, when
are you updating it? Sometime? Yeah — it's up to 5.5, so you need 5.6 and 5.7. And then there's Learning MySQL and MariaDB, which may be interesting to you. A project that I'm currently working on is Features.today — it's coming soon, so you can see all the features from all the varying trees; it's hard. And then a word from your sponsors — two minutes. MariaDB Corporation sponsored me to come here, so maybe visit them at booth 511, and apparently you can win a BB-8 Sphero, which costs 150 bucks. I would never buy that toy, but if I got it for free, maybe, yeah. And you can buy support, training, services, et cetera. So with that, I'd like to say thank you very much for listening. You're more than welcome to check out the slides on SlideShare, which will be uploaded right after this talk. You're welcome to pop by the booth, and we have approximately less than a minute for questions while Peter comes up here and sets up. Any questions? Yes, there will be — I guess once I upload it and then put it on the website there will be a link. Any other questions? Well, thank you. Regular expressions: all of them use the standard regular expression library, except MariaDB, which uses the PCRE regular expression library. So I guess if you fancy PCRE, then MariaDB wins — available in 10.0, by the way, and no, the others don't have it. Peter, are you planning to do PCRE regular expressions? No. So basically only 10.0 and 10.1 have it, both of them. Well, thank you for listening. I need to put it in your back pocket, and then you need to stuff this over your face. Oh, okay, I thought you just wanted to touch my butt. It's not Russia. Okay, so how do you get it out? Just put it over here, yeah, this way. Okay, let me see. And they are recording, so don't move, don't move. Okay, don't turn the lights off and don't joke, right? Okay. Oh, that's the camera out there. What are you putting out there? Okay, so you're not calling the names out here right now. Okay, that's fine. Okay, do you guys hear me? And I see you can see my slides — that is wonderful. The only thing I'm missing is water, but okay, I guess we'll have to do without that. So we'll talk about MySQL sharding today. So, anybody know what sharding is here? Okay, some of you do, right. Anybody think sharding is painful? Oh, okay, that's good. So first let me mention a few words about Percona and why really we are speaking about that. We as a company focus on really helping our customers to succeed with MySQL and, more recently, MariaDB. We provide a whole bunch of services, ranging from support to managed services, and we write a lot of software for MySQL — Colin has mentioned some, right: we have Percona Server, Percona XtraDB Cluster, as well as tools for backups, Percona XtraBackup, and the Percona Toolkit. There are, I think, two things which are important about us. One is that all our software is open source; we don't have some enterprise version which is closed source, or some other way to require you to pay if you want to use it. And we are also really vendor neutral. It is not our goal to really make you use Percona software; it is our goal as a company to really provide you the best solution for yourself, and frankly, that may be MySQL, that may be Percona Server, that may be MariaDB or Amazon RDS — if you are happy, we are happy. So let me start with a story about when I talk about sharding. A couple of years back, a customer comes to me and says, oh my gosh, you know, I read all these wonderful stories about Facebook sharding, and there is Dropbox sharding, and everybody in the world is sharding, so can you guys please advise me how to shard? And I say, hmm, okay, let's look
at your system first. And I would ask: what exactly is your database size? Well, it's about two gigs in total. Interesting, right? Do you guys have a lot of traffic? Actually, no, it's about 100 queries a second. Well, maybe you guys just launched and you are planning to grow approximately 1,000 times in the next three months? Well, no, actually, we've been in business for about five years, we are growing about 7% a year, and we are expecting that for next year. Right — wonderful story, right? And the case here is, obviously, with such a small system — oh, thank you — you don't really need to shard, right? Because even if Moore's law is kind of, you know, getting dead, you can still expect performance improvements in the systems of much more than 7% a year, right? So the outcome of that is: before you really decide how you're going to shard, you really need to decide whether you need to shard at all. And I think that is a very important question, because if you guys go to different conferences or read a lot of famous bloggers, a lot of them work for really large and famous companies, and they are often going to talk about how to operate MySQL at extreme scale. That may not be exactly your case, and in your case different principles may apply. Now, what is interesting is, with modern technology you can really go quite far without sharding, and I think, with our semi-obsession with distributed systems in recent years, we may forget how much you can actually get from a single, simple MySQL box. And let me throw out some numbers: from a single MySQL server you can get more than 100K queries per second; you can get more than 100K rows updated or inserted per second; if you are doing some kind of data-scanning queries, you can probably get more than 5 million rows per second in memory; MySQL can deal with 10K concurrent connections; and if you have some good box with good storage and network, you can actually have 10 terabytes or so on a single MySQL instance. And this is not marketing data, and it's not extreme. The storage one — that's one of the things where I hear a lot of people tell me, really, do you really have anybody on the planet having 10 terabytes on a single MySQL instance? And I'll tell you: yes, I've seen many people doing that, and I have seen people up to about 50 terabytes on a single MySQL instance. That is kind of extreme, and I wouldn't quite go that way, but I think that gives you a good perspective. Now, guys, these are not some extreme marketing numbers; these are something you can get from a pretty mid-range system, even in the cloud. And for a change, I'll show you some marketing numbers: I stole that from Oracle — there is Dave here, right? Dave? He ran away. Anyway, I hope nobody is watching, but anyway, this is the marketing slide, which is actually also true, but it is on a larger box with a relatively simple query; you can see we are getting well over half a million queries per second with MySQL 5.7 on the high end. So 100,000 queries is quite conservative. Now let's do some math. Let's say we have some sort of system and we are planning for maybe 3 million active users, and those users have 30 interactions per day, and with all our kind of projections, for a single interaction you are unlikely to have more than 10 queries, right? So with all this math, and accounting for a peak factor of maybe 3x between your night hours and your peak hours — we came to... oh shit, I was talking to the wrong slide, okay; anyway, that's the math I was talking about; can you actually say "shit" on camera here? Oh, I did, twice, oh shit — no, okay. Anyway, so if we do this whole math, what do we come out to? It is about 30,000 queries a second. So really, it is not a lot, even with those quite large numbers of users, interactions, and so on and so forth, right?
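To make that back-of-the-envelope math concrete, here is the same calculation written out as a tiny sketch. The user, interaction, and query numbers are the ones quoted above; the 3x peak factor is an assumption, but it lines up with the roughly 30,000 figure in the talk.

```python
# Back-of-the-envelope capacity math from the example above.
active_users = 3_000_000        # planned active users
interactions_per_day = 30       # interactions per user per day
queries_per_interaction = 10    # generous upper bound per interaction
peak_factor = 3                 # assumed ratio of peak hour to the daily average

queries_per_day = active_users * interactions_per_day * queries_per_interaction
average_qps = queries_per_day / 86_400   # seconds in a day
peak_qps = average_qps * peak_factor

print(f"average: {average_qps:,.0f} qps, peak: {peak_qps:,.0f} qps")
# -> roughly 10,400 qps on average and ~31,000 qps at peak,
#    well within what a single modern MySQL server can handle.
```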
So if you look at examples, the companies we worked with that avoided sharding — that's quite a range, right. We had worked with a pretty large enterprise, a household name here in the US — though the more of a household name it is, the more they don't like you talking about them and naming them, and I know why — but anyway, those guys are using Drupal for the everyday workflow for everybody, for 200,000 employees. They can use Drupal, no sharding required, with the MySQL server loaded at about, I think, 3% on average. Of course they would use some replicas for high-availability purposes, but from performance alone you don't need that. We also worked with a number of e-commerce merchants, and some of them are selling well over ten million dollars per month, so those are pretty serious e-commerce businesses — they don't need any kind of sharding. So, as you guys already told me at the start of this talk, sharding is a pain, right? Yes. Well, I know some of us like pain — well, maybe many of us like pain, I'm not quite sure what kind of audience this is — but, well, pain is not always good, right? Let's all agree on that: when it becomes too much pain, we stop liking it. So in a lot of cases, because sharding is a pain, in a lot of cases we want to delay it or avoid it. So what kind of pains does sharding bring us? Well, a lot of that has to do with complexity. Complexity for developers: now, instead of querying one system, I have to figure out where my data is — and again, even if the sharding is done automatically for you, if you're a good developer you still have to think about that, because your performance profile is really going to depend on the data distribution. There is more complexity from an operational standpoint. There is more complexity in the technology, because you have more bits and pieces: you have to deal with much more complicated failure scenarios, compared to just a single box which essentially works or doesn't, and a much more complicated performance profile, with the network involved as a bottleneck in many more cases, and so on and so forth. Now, MySQL sharding is frankly especially painful, because compared to many other systems — MongoDB is a good example — MySQL sharding is still very manual. MySQL came from an era when this idea of many servers operating together as a single entity wasn't really a thought; it was an afterthought, it was bolted on, and in many cases it was bolted on by the users: let's say Facebook would bolt on their own sharding, or Vitess, which was mentioned earlier — YouTube's own. There are guys like Oracle or MariaDB working on MySQL Fabric, MySQL Router, MaxScale; all those things came much, much later. So if you can't avoid the sharding, you can often delay it. And why can't you avoid the sharding? Well, frankly, even though modern systems have become much, much more powerful and a single node can handle a lot of beating, you can't build Facebook on a single node, right? Or even a system which is probably 1% of Facebook's size. But in those cases, such systems are typically not born overnight, and it
can take quite a while to get there. And you may not need to shard, and waste a lot of your development and operational resources on dealing with those issues, until you have much larger resources, much larger development and operational teams. So what strategies do we have for avoiding the sharding, or delaying the sharding? They go across several dimensions: architecture, functional partitioning, replication, caching, queuing, and using what I will call supplemental technologies. Let's go through them one after another. So, when you look at the architecture — we'll talk about a lot of architecture questions in other topic areas as well — what I think is one of the changes in architecture recently is how people have started to move to microservices, each owning kind of their own data, and maybe actually using different data stores on the back end. Some people would call it not quite microservices but a services architecture, but in any case you essentially have a certain number of blocks, rather than one monolithic system, which have different data stores, so the load and the amount of data each of them has to store is different. So, for example, you can say, hey, this is my interface which provides, let's say, user profile services, and this is maybe my billing system; if they are designed as different services, they probably don't have to live in the same database. Now, this can also be seen through the lens of functional partitioning, and what that essentially means is that we want to keep our separate data separate. Think about, I don't know, even such a basic site as Percona's: we have a lot of different bits and pieces out there. We have our blog, we have our Drupal-based website, we have, you know, some other components; really, each of them is quite independent, and they live on separate databases to start with. I think it's important in this case to really keep them independent and not accidentally mix them, because then, if you need to scale and separate them onto different systems later on, that may be hard. To give you one piece of simple practical advice: if you have systems which operate on different databases, give them different MySQL users with different permissions, which only have access to the databases they are designed to have access to. That means, for example, that if your main website needs access to the content in Drupal, that wouldn't happen by accident, and then, when you say, oh, let me just move the blog and the Drupal website onto different systems, you don't have things break down because there are some joins or anything else you didn't expect — a rough sketch of what I mean follows below.
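This is a minimal, hypothetical sketch of that per-service account idea, added for illustration: the service names, schemas, host pattern, and admin credentials are all made up, and the CREATE USER IF NOT EXISTS form assumes a 5.7-era server (MariaDB 10.1+ also accepts it). The point is only that each application account is granted privileges on its own schema and nothing else.

```python
# Sketch: one MySQL account per service, each limited to its own database,
# so cross-database joins can't creep in by accident. All names are hypothetical.
import mysql.connector

admin = mysql.connector.connect(host="127.0.0.1", user="root", password="set-a-real-password")
cur = admin.cursor()

services = {
    "blog_app":   "blog_db",    # the blog only ever sees blog_db
    "drupal_app": "drupal_db",  # the Drupal site only ever sees drupal_db
}

for user, db in services.items():
    cur.execute(
        f"CREATE USER IF NOT EXISTS '{user}'@'10.0.0.%' IDENTIFIED BY 'another-real-password'"
    )
    # No global privileges: just DML on the one schema the service owns.
    cur.execute(f"GRANT SELECT, INSERT, UPDATE, DELETE ON {db}.* TO '{user}'@'10.0.0.%'")

cur.close()
admin.close()
```

With accounts scoped like this, an accidental cross-schema join fails immediately, instead of silently becoming a dependency you only discover on the day you try to split the databases onto separate servers.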
Okay, replication. Anybody here use MySQL replication? MySQL replication is probably by far the most popular strategy, both for MySQL high availability and for MySQL scaling. Replication mainly lets you scale reads, and the good thing is that in web applications reads are typically a very large portion of the workload — the majority, for many applications. What you need to be aware of is that MySQL's built-in replication is asynchronous. And don't let the words trick you: semi-synchronous replication is not synchronous either; it is asynchronous with somewhat better recovery properties. If you're looking for other MySQL replication options, I would consider Percona XtraDB Cluster or another Galera-based solution, and if you're in the Amazon cloud, Amazon Aurora does its own unique replication approach for MySQL.

The next trick, which is very powerful, is using some sort of caching with MySQL, which also lets you scale reads. Anybody here use things like memcached or Redis for caching? Or if your application is written in Java, you may actually cache at the application level and not go to the database all the time. Caching is really, really powerful; I've seen applications that, through caching alone, were able to reduce traffic to MySQL by more than 95%, so that can be pretty cool. MySQL also has a built-in cache, the query cache, but the thing we know about the query cache in MySQL is that it sucks. It was designed in MySQL 4.0 and hasn't had much attention since; it doesn't scale well with many CPU cores or with large cache sizes. If you have some really bad queries and low concurrency, you may actually benefit from the MySQL query cache, but most high-end installations have to disable it. The interesting thing about the query cache is that it's one of the areas where Amazon Aurora seems to have implemented a number of changes: the Aurora query cache is much, much better than the MySQL query cache, and frankly I believe that's the real cause behind a lot of the benchmarks they publish — just a much better query cache. Maybe this competition between the giants, Amazon and Oracle, will make Oracle interested in building a better query-cache-like solution for MySQL too.

Besides the application server cache, memcached and Redis, two other things are worth knowing. One is that you can often cache within MySQL itself by building summary tables for some very heavy queries instead of running them all the time. You can call them materialized views if you like — glorified summary tables — and they can be a very, very powerful optimization technique.
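A minimal sketch of that idea — the table and column names are invented for illustration — refreshed periodically instead of aggregating on every page view:

-- hypothetical rollup table, rebuilt by a cron job or event every few minutes
CREATE TABLE daily_sales_summary (
  sales_day DATE NOT NULL PRIMARY KEY,
  orders    INT UNSIGNED NOT NULL,
  revenue   DECIMAL(12,2) NOT NULL
);

REPLACE INTO daily_sales_summary (sales_day, orders, revenue)
SELECT DATE(created_at), COUNT(*), SUM(total)
FROM orders
GROUP BY DATE(created_at);
-- reads then hit the small summary table instead of scanning the orders table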
The other one, which I find a lot of DBAs and developers don't think about, is HTTP caching. In a lot of cases, caching at the HTTP level is absolutely the most powerful, because frankly the best way to give your customer the best performance ever is for them not to have to request or refresh an object at all: it's cached right there in browser memory or on disk. You can often achieve that for a lot of data by implementing proper HTTP caching policies, so read up on that.

Queuing — that is another big secret. Any large-scale system out there uses some queuing, at least for some things, and for good reasons. Queuing helps us scale writes, but it also lets us absorb spikes. Think about it: if something fun happens and I get a spike of maybe 10x more YouTube video uploads than average that need to be transcoded, trying to transcode them all at the same time in real time would overload the system. By just putting them in a queue and then getting to them as your resources allow — maybe with some dynamic scaling in the background — you have a much, much better architecture. That also applies to a lot of database things. Say you're building something like Twitter and you have a lot of followers to deal with: it's a much better idea not to try to push all those notifications in real time, but to put them in a queue and do whatever you have to do in the background. There are a number of systems people use for queuing: RabbitMQ, ActiveMQ, Redis is very popular, and what we see getting a lot of traction these days, especially for certain use cases, is Kafka. Anybody use Kafka here? Okay, great to see a fair number of hands.

The other thing I mentioned is using different technologies. I hope it's not a secret that MySQL is not a silver bullet — nothing is — and as great as it is for certain things, it absolutely sucks for a lot of other things. Frankly, it's interesting for me, having been involved with MySQL for so many years, to look back: say eight years ago we would write Perl scripts that went through, say, 10 slaves or 100 slaves, ran queries in parallel and then aggregated the results of GROUP BYs and all the other crazy stuff. And what happened? We decided those things are much better done in Hadoop or Spark. When you need to crunch through a lot of data in parallel, just don't do it in MySQL; it will be a lot of pain and probably little gain.

There are also technologies for full-text search. MySQL 5.6 and 5.7 have full-text search, and thankfully it works in InnoDB now, so we don't have to pick between transactions and full-text search capability, but it's not really a scalable solution. It's what you could call low-duty: if you're implementing search for your home DVD collection, or something else that's not all that large, you can use MySQL full-text search. If you're building search over many hundreds of gigabytes or terabytes of data, you will be much happier using Elasticsearch, Sphinx, Solr or some other purpose-built full-text search technology that works in parallel, and you'll also get much more flexible search of much better quality.
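For the low-duty case, a minimal sketch of what InnoDB full-text search looks like (table and data invented for illustration):

CREATE TABLE dvd_collection (
  id    INT UNSIGNED AUTO_INCREMENT PRIMARY KEY,
  title VARCHAR(200),
  notes TEXT,
  FULLTEXT KEY ft_title_notes (title, notes)
) ENGINE=InnoDB;

SELECT id, title
FROM dvd_collection
WHERE MATCH(title, notes) AGAINST ('skeleton swordfight' IN NATURAL LANGUAGE MODE);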
There is also a selection of document stores you can use, and there is Cassandra, which we see used as a very simple data store, but a very scalable and reliable one; it has been amazing.

The next thing to consider is that you actually want to optimize your system before sharding. I don't know why, but I see a lot of people being lazy about this: they'd rather go to the boss, ask for a lot of hardware, and play with distributed systems instead of just going and freaking tuning the indexes. That's what we see time and time again. So what can you do, and what should you do, in terms of optimization?

Hardware. Fast CPUs, especially because MySQL executes a single query on one core. Don't buy into the advertisement offering a CPU with 14 cores that all run at less than 2 GHz or so; for MySQL that doesn't work well. MySQL likes faster cores, and for modern CPUs you want to look at the turbo boost frequency, not the nominal one, because that's what your CPU is going to run at when you're executing one very heavy query. You want plenty of memory, because accessing data in memory is much, much faster than going to disk — and please understand that this is true even if you're using very fast storage: the fastest Fusion-io or NVMe device is still orders of magnitude slower than going to memory to fetch the 10 bytes you need from a field in a row. Fast flash storage, solid state drives: for operational databases you should not be running anything else. Anybody here still suffering with spinning disks? Why? Okay — I know in some cases there are issues; maybe it's politics, maybe somebody doesn't believe SSDs can be reliable even 10 years later, maybe the storage budget for the next 10 years was spent buying EMC for 10 million dollars five years ago, or some story like that. But if you're being practical, you want flash storage. And you also want a good network, of course. A lot of modern applications talking to MySQL are dominated by network latency. I recently looked at a gigabit network: the round trip within a data center is probably about 300 microseconds, and MySQL can process simple queries in typically less than 40 microseconds. So for a simple SELECT, latency is going to be dominated by the network, at least on a one-gigabit network. Another important thing about networks is distance and the number of hops. I've seen so many people make the mistake of placing the MySQL server and the app server very far from each other, either distance-wise or with a lot of network equipment in between; a few routers and so on can add up to a lot of latency.

Environment: I'm probably preaching to the choir here, but Linux is a pretty good operating system to run MySQL on, would you agree? And it's also the most popular one; most large-scale deployments are done on Linux these days. I've seen a few more FreeBSD deployments start popping up over the last couple of years, but Linux is still absolutely dominating. Newer MySQL versions generally scale better — no surprise — with many connections, with many CPUs, and with more complicated queries, because of the improved optimizer. But for simple queries on a single thread, MySQL has actually been getting slower with each release: in the last benchmarks I've seen, MySQL 5.7 is somewhere around 10% slower than MySQL 5.0 on a single-threaded benchmark like sql-bench. You can look it up; Mark Callaghan has written about that. You've heard about MySQL 5.7, right? It's a pretty interesting GA release. I see there are still a lot of things in flux with the changes that were implemented, so maybe I would not just move to 5.7 today, but wait for the dust to settle; if you're starting new development and plan to go into production in a few months, though, 5.7 is a great choice. We're also working on Percona Server 5.7, which should be out sometime in February, and we have a release candidate available already, so you can check it out and give us some feedback.

Configuration. Some people say the MySQL default configuration sucks, and it is by design, because MySQL was really designed so that you install it on a server, it sits out in the corner not taking a lot of resources, and it can run some applications; it was not designed to go in, take all the resources of the box, and push everything else aside. What that means is that you want to tune the MySQL server — here's a link to a webinar I did on this topic. It's not rocket science: typically you tune maybe 10 options and you get to 95% of where it needs to be compared to the defaults.
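As an illustrative sketch only — these are the kinds of options that usually matter, but the values below are made up and have to be sized to your own hardware and workload, not copied:

[mysqld]
innodb_buffer_pool_size        = 48G      # the big one: most of RAM on a dedicated box
innodb_log_file_size           = 2G       # larger redo logs smooth out write-heavy workloads
innodb_flush_log_at_trx_commit = 1        # durability vs. speed trade-off; 1 is the safe default
innodb_flush_method            = O_DIRECT
max_connections                = 500
query_cache_type               = 0        # the query cache rarely helps at high concurrency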
Storage engine. When it comes to storage engines, InnoDB is the default storage engine and it is absolutely fantastic for a lot of workloads. In some cases you may want to check out TokuDB, for very high insert rates or if you want a much higher level of compression than InnoDB; you can explore that. Anybody try TokuDB here? I see some hands — that's good to hear.

So when do you want to shard? Here's the challenge: if we shard too early, we're likely to waste resources, because development and operations are going to be more complicated after we shard; but if we shard too late, we can run into a wall, because the application is simply unable to deliver the performance required. What I like to think about here is a term called architecture runway. Sharding is one of the architecture considerations you have; think about it that way and take it as part of your architecture runway planning, along with all the other things you consider as part of architecture — cleaning up technical debt, other optimizations, and so on. What you need to understand is how long it's going to take you to implement sharding. Depending on your team's pace and your application's complexity, it may vary a lot. I can honestly tell you, people have come to us asking for sharding advice and told me they were going to implement it all over a weekend, and I said: are you crazy?
You're not going to implement it all over a weekend — but they actually did. They had an application that was simple enough, and a small, very agile team that could go and implement sharding in the application over a weekend. On the other hand, I've seen legacy applications written over 10 years, with a ton of queries and complexity, where it may take a year or more to really implement sharding. So I think it's important to make an honest assessment and balance how long you can keep running your current application within its capacity against how long it will take you to implement sharding.

That brings us to capacity planning. For us not to run into the wall, we need to know where that wall is — makes sense? So you want to do some estimates, maybe run some benchmarks, and be conservative, because it's much better to predict the wall 20% too early than 20% too late, when you run into it and you're still not ready. And one very important thing: do not ever plan for linear scalability. If you look at your system and say, "well, currently I only see 10% of the CPU used, so that means I can handle 10x the queries on this system" — that is very, very bad math, and it's not going to happen, for many reasons.

Now, I've been quite critical of sharding, but does it have benefits? Well, as with almost everything in this world, there are always positives and negatives if you look hard enough. So what are the positives? One: sharding is really your road to ultimate scalability. If you're building something at Facebook scale, you will need to shard sooner or later. Two: sharding may allow you to reduce or even eliminate some other complexity. For example, I've seen people say: yes, we do some caching, but we want to avoid an extensively complicated caching layer with all of that properly invalidating the cache — our developers don't like it, it's complicated — so we'd rather bite the bullet once, shard, keep most of our data in MySQL's memory, and accessing data in MySQL memory is pretty fast as well; you've seen numbers of more than half a million queries per second. Asynchronous replication is another one. I've seen a number of customers say: our developers absolutely hate dealing with asynchronous replication, because it has this property of working most of the time — in most cases I can go to my slave and get data that's reasonably accurate, but sometimes I just get silly stale data — and developers don't like that, because replication gives no built-in guarantee about how stale the data will be. So in many cases we see people shard and then say: we read from the masters and write to the masters, and the slaves are used for availability purposes and for some other applications, for example reporting, because for reporting, reading slightly stale data is not a big deal. Sharding also helps with security and compliance, as well as possibly keeping data closer to users. For example, a lot of software-as-a-service applications shard specifically to separate the data between customers: completely different customers live in different MySQL instances, so even if developers make mistakes, it's very hard for the data to get intermixed.
And frankly, some customers may even require that kind of isolation as part of their contract, so that is a benefit. You may also hear about sharding on the global scale: some of you may have heard about recent legal and political changes in various countries, where a country may require you to keep its users' data inside that country. It's much easier to move to that kind of environment if you're already sharded — makes sense? Often that is not a purely technical consideration but something beyond it.

Cost may also be an important factor here. If you look at the majority of managed hosting providers and cloud providers, they really want to take advantage of your pain. What I mean by that is you'll find that the cost escalates phenomenally for very powerful systems: you can see a high-end system being charged 10x as much as a low-end system, even though if you went to the store to buy one, the difference in price would only be about 2x. What that means in practice is that you often find — with cloud providers, and even with the systems you buy — that certain system sizes have the optimal price/performance ratio. To give an example, you can get four-socket systems with Intel CPUs, but those are very, very expensive these days — much more expensive than two-socket systems — even though they typically don't even give you 2x the performance. If we don't have to go for the most expensive hardware, we can have a much more optimal system, and sharding lets us pick systems of pretty much any size: if you've implemented sharding well, you can shard over 10 systems or 20 systems and it really shouldn't matter. As I mentioned, that's especially important in the cloud, because the cloud may have a relatively low maximum instance size compared to the physical hardware you can buy, in terms of the top performance accessible to you.

So when do we shard? As a summary: first, sharding can be a good idea when it's easy in your case — some applications are easier to shard than others, both in development and in operations. Second, when scaling up is impossible or too expensive. I mentioned the cloud, but enterprises can also have silly politics: I remember folks who had to shard because there was the MySQL DBA team and the ops team, and the ops team believed 16 gigabytes ought to be enough for everybody and would not provide a virtual instance with more than 16 gigs of memory, so they had to shard. And third, when your application is growing fast enough that sharding is imminent anyway; in that case you may want to invest in sharding, which will give you huge performance gains, rather than in small optimizations trying to pick up 5% here and 10% there.

So we have a few sharding questions to talk about: the sharding level, the key, the unit, how we handle high availability, and the sharding technology. Let's look at those in more detail and it will make sense. What is the sharding level? There are two ways to think about it: one is that you shard only at the MySQL level; the other is that you shard by deployment unit, at the full-stack level — you say, hey, I take my MySQL, my web server and my cache, and I shard all of that completely.
You often do the second one, for example, if you want to place those things in different regions, or to get better security isolation — or isolation in general. If you have a very thin load balancer that splits your customers across different pods, then the pods are completely independent, and a failure affecting one of them is probably not going to impact everybody else; you may have, say, 5% unhappy customers, which is much easier to deal with in all kinds of ways, including providing them good customer service.

Sharding keys. When you pick keys for sharding a database, there are a couple of things to look at. One: most accesses — the simple accesses — should go to a single shard; if most of your accesses hit all the shards, that is bad design. Two: no shard should be too large, in terms of data size or the load it generates. For example, sharding by country may look like a good idea on the first criterion, because for many applications most accesses go to a single country, but countries are way too large for most applications and very unbalanced: you have countries like China or the US with a ton of population, and then somebody like the Czech Republic, which is much, much smaller. Another thing to consider is that when you shard, you don't have to pick just one key. In certain cases, for certain data access patterns, you may want to pick a couple of sharding keys and double-store the data one way and the other. To give you an example, a system like Flixster — the social network around movies — has two very different but equally important data access patterns: everything about a movie, with all the comments about it, and everything about you as a user, what you have said, and so on. In such a case it can make sense to double-store some of the data so it's very fast to access from either dimension, because, well, storage is cheap.

Sharding unit. What do we shard by: a physical MySQL instance, a schema, or something more logical, where you can have multiple shard objects based on the same set of schemas and tables? Well, it depends. Typically, if you're dealing with something like software-as-a-service, where you have a small number of customers that pay you a lot of money — maybe a thousand, ten thousand, a hundred thousand — you can actually set up small physical MySQL instances, especially these days with containers, where you can run a pretty large number of them effectively. But if you have a ton of users, something like Facebook, then you typically end up having multiple shard keys sharded across the same set of tables or schemas.
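However you slice it, the application needs a way to find the right shard for a given key. One common approach — not something specific to any product mentioned here, and the names below are invented for illustration — is a directory table; in practice this mapping is often cached heavily or replaced by hashing the key:

-- hypothetical lookup table, kept on a small, well-replicated instance
CREATE TABLE shard_directory (
  customer_id BIGINT UNSIGNED NOT NULL PRIMARY KEY,
  shard_id    SMALLINT UNSIGNED NOT NULL   -- which MySQL instance or schema holds this customer
);

-- the application looks up the shard once, then routes all queries for that customer there
SELECT shard_id FROM shard_directory WHERE customer_id = 42;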
The other thing to consider is high availability, because the more servers you have, the greater the chance that one of them fails. So pretty much, if you've rolled out a large sharded deployment, you will most likely need to implement some high availability as well; it becomes even more important than with a single server. If you just have one physical server, I've seen cases where people had no downtime in many years — they just rely on backups and it works for them. But if you think about somebody like Facebook or Google, they have many, many failures every day, probably every hour, and if the technology doesn't compensate for that, it would be a disaster. There are two common approaches: you can shard over master-slave clusters with some failover arrangement, or use PXC or some other Galera-related technology.

Let's look at some of the sharding approaches that exist these days. Roll your own: probably the most popular one, especially at very large scale — all the big guys typically have something tightly tied into everything else they're doing, and in the MySQL world it's heavily custom. Vitess: in my opinion, the open source technology to watch, because it has been proven in production at YouTube, and those guys are very serious about making it a real open source project, with a lot of very good documentation and so on; I'm very excited to see where it goes. There are some other technologies, like Jetpants and Shard-Query, which exist as open source but don't have nearly as much momentum. Other things: Clustrix — proprietary technology, MySQL-compatible, with built-in sharding; from a technological standpoint it's actually quite interesting, but because it's closed source with a small community, I haven't seen it get a lot of traction. MySQL Cluster is another interesting technology, which does a certain level of sharding automatically for you. Anybody run MySQL Cluster here? Anybody tried running MySQL Cluster? You know how it is — there are many more people who have tried running MySQL Cluster than are still running it. It's a rather complicated technology with a lot of tricks; it's absolutely fantastic for certain workloads, but not as great for many other things, and it's relatively complicated. I will also mention that there's actually been a lot of interesting work done on MySQL Cluster over the last two or three years, so if you last looked at it five years ago, you may want to check it out again. But I'd also say MySQL Cluster gives you a medium level of sharding: if you need to shard across, say, 10 nodes with all the data fitting in memory, that may be an application for MySQL Cluster; if you're doing the math and saying you'll have a thousand nodes in production in a year or two, that is nowhere near what MySQL Cluster can support, and it's not going to be good enough. A few others: MySQL Fabric is the sharding effort by Oracle, which I think is interesting — and by interesting I mean I would really like to see it more in production; so far there's a lot of developer talk from Oracle about it, but I haven't seen it deployed at large scale. Tesora's Database Virtualization Engine was also a pretty nice open source project, but as the company moved on and focused on different things, I don't think it's well maintained anymore. ScaleArc has a rule-based sharding engine, which is commercial, but I actually see a lot of ScaleArc deployed, so it's a pretty common solution, especially for enterprise customers who don't mind using this kind of proprietary software. There was an alternative to ScaleArc called ScaleBase, which I keep on the slide as a zombie, as a reminder that with proprietary technology, when the company dies, the technology you relied on can simply disappear and stop being available.
A few things in summary. You can see there are multiple technologies available for sharding in the MySQL space; unlike in some other areas, there's no standard way to shard, no sharding solution that 99% of people out there are using. You can see that as a downside, or you can see it as an opportunity to get creative — that's up to you. So if you're looking at sharding, what can we at Percona do? Quite a lot: we've helped a number of companies implement sharding, and we support sharded environments through support and managed services. And if you're interested in learning more about MySQL — if MySQL is a big part of your life — consider coming to Percona Live in Santa Clara in April. We'll have a huge amount of MySQL content, and this year we'll also have content about a lot of other open source and analytical databases: MongoDB, Cassandra and others. Well, that's it for me, and I'd be happy to answer your questions.

Oh — okay, yes. Well, I mentioned PXC and didn't clarify the abbreviation: PXC is Percona XtraDB Cluster. Yes — that's a good catch; so that's what happened, you searched for PXC and there was no PXC to be found? Okay, can you send me a screenshot of that for our marketing, and ask them how come we can't even get onto the front page? Any other questions? And if you want to troll me, that's fine.

Okay, I'll get to two things, one of them with another plug, actually. One of the talks I'm very interested in at Percona Live is Facebook talking about how they do backups at scale, and for me it will be very interesting to see exactly what strategies they use. Now, backups — what do you need for a sharded setup? What I see a lot is that people have some sort of fuzzy backups. For example, they say: we take per-shard backups plus we back up the binary logs, so if we ever need to restore a shard to a given point in time, approximately, we can do it. Doing some kind of global checkpoint on a large sharded system is possible, but it's typically a bit painful. I've seen people do it by, for example, pausing replication on all the slaves — so as not to impact the masters — making a note of the binary log positions that correspond to that moment, and then taking a consistent backup of everything, if consistency is needed. But if you think about shards, consistency between different shards is typically a little bit loose anyway; it's not like you have foreign keys across them, so it's going to be somewhat out of sync regardless. As far as failure scenarios go, if there is certain data consistency that has to be maintained between shards, then it's good to have scripts to check it and fix it, because data drift will happen, and you'd better have tools to fix it. That's what people can do: in the worst case, if something happens, I restore a backup — maybe it's not exactly consistent with some other shard — but I know where my source of truth is for the data; say, this user lives on this shard, and then I go and check things to ensure the relationships with the other shards are consistent. Does that make sense? Okay — well, I think you always have to look at exactly what problem you're solving.
You can really get 100% consistent backups across shards, but that's going to cost you, and in a lot of cases people just don't pay that cost, because — among other things — here's the thing: backups in a scenario where there is redundancy and where there is no redundancy are two different things. If I just have my laptop and I lose my hard drive or something like that, then in a lot of cases I'll do a full restore; that's my use case. Now think about a master and a slave which is constantly replicating: if I lose the hard drive on one, I just fail over to the slave; I don't do a backup restore in that case. So when does a backup restore happen? Well, when you have some developer with dirty fingers — anybody seen those? — who ruins some part of the database. What typically happens in that case is they trash one of the tables; some developers are very gifted and will trash all of your tables, but typically it's one or a few. So you have to restore the tables that were trashed, or even just the data that was trashed, which is a very different restore scenario than the one you might be thinking about. That's why, when you have replication, you often want to optimize for how you can restore pieces of the data, and maybe do partial binary log recovery to catch up those pieces of data you restore, and fun stuff like that. Make sense?

Okay. Well, yes, in some cases you will see that done as a consolidation — it's possible. Things have actually become a little bit better in this regard recently: MariaDB, and now MySQL 5.7, offer multi-source replication, which lets you fan in replication, so if you want to pull all the data back together, you may be able to do that. We've also done some work with Tungsten Replicator to do that. It is possible, but I'd also say it's not totally easy, because in a lot of cases people will have created IDs that conflict, and so on, so it's going to depend a lot on the application and exactly what you're going to do.

Well, yes — and I think in a lot of cases it's also a question of the process during the sharding. Because when you talk about undoing sharding: is it the consequence of a failed sharding project, where the application version that supports sharding just blew up and you want to go back, or was it in production for five years and then you say, hey, the application isn't growing, and hardware is so powerful now that we don't need sharding anymore? Those are two very different cases, because in the first case you'll have to temporarily maintain some scaffolding so you can always go back to the unsharded system.

Okay, any other questions? Well, I think it depends a lot on the scale. Linode I see used quite a bit at the smaller scale; if you look at large-scale deployments, AWS is probably the most popular one we see. We see some more Rackspace — people who have been using Rackspace started to use the Rackspace cloud a lot; they like the support approach, so that has been working. Google Cloud also looks interesting. But I think, when it comes down to it, people don't really choose a cloud for their database: they choose a cloud for their applications, their database is going to live in that cloud, and that's what we have to deal with.
Okay, anything else? Yes, go ahead — talk a little bit more about balancing demand spikes? Oh, okay. So, balancing the spikes: when I spoke about spikes, I was talking about queuing, and here's an interesting thing. If you think about reads and writes, which of those is really hard to deal with? A lot of people say writes are hard — you actually have to modify the data — but actually reads are much harder in many cases, because when you're doing reads, you have to give the user the data right now; until that user has the data, they don't get their value. With writes, in a lot of cases we can just put them in some sort of queue and apply them later, and frankly MySQL is already doing some of that for you: think about what happens when you update data in an InnoDB table — it essentially goes into a queue, the log, and it gets flushed from the buffer pool to disk some time later; if your log files are large, it may be much, much later. That is essentially the idea for writes: if you have writes that are very expensive and you can afford a certain delay, you put them in a queue and have background workers apply them at a given pace, instead of slowing down all your users. That, I think, is the main thing.

The other load management technique I'd mention is that not all features are created equal from a user-value standpoint, so make sure you can disable them during load spikes, if you can't bring up more capacity or don't have any other tools in the toolbox. For years, for example, you could occasionally see Wikipedia's search being down — hey, you didn't donate enough money, we don't have enough servers, so search isn't working, come back later — but if you just used Google as the search, it would still be working and performing very, very fast. I think it's very important to prioritize the user experience that way: rather than giving all your users bad performance across all features, it's better to have certain features not work, or even certain users see downtime, while everything else stays absolutely fantastic. That makes sense in terms of load management, and what it means is that you want to design features so you can disable them easily in production: if a feature becomes broken, or creates too much unanticipated load, you can just disable it. And yes, you're pointing at another interesting thing, which is good development practice: if you're developing new features at scale, a gradual rollout is often great — you roll the feature out to 5%, you see that it works, and then you gradually roll it out to more. Yeah, absolutely — and it can go back and forth; it doesn't have to be black and white.

Well, yes — I think that's a very good question, because MySQL is not really a great analytical database. These days you have a lot of data, and if you want to analyze it, you typically want engines that can crunch through a lot of data in parallel and use all the CPU cores.
So what we usually have in this case is some sort of ETL process, and there's a lot of integration available these days: you can use MySQL to get data into Hadoop, there are multiple tools for getting from the binary logs into Hadoop, you can use Tungsten Replicator, and there's work being done on integrating Kafka with MySQL, so you get the data out and put it into whatever system you'd like. Lots of choices these days. And I'd also mention that some people we see who really love MySQL — and who like pain — use multi-source replication to try to get everything onto one Frankenstein box and run the queries there.

Yes — well, Opteron is dead, if you haven't noticed. Unfortunately, with CPUs we pretty much don't have a choice these days: in server-grade CPUs, Intel owns the market. We may see something come out of POWER, or ARM picking up on the server, or there's RISC-V, but that's all probably five years out. For now you'd typically look at the Xeons, and the choices you have are pretty much defined by how many sockets you want and how much memory the CPU supports, because low-end Xeons only support something like 32 or 64 gigs of RAM, which is not enough. That's pretty much the main direction to look in, and as I mentioned, for most MySQL workloads you want faster cores, not necessarily many of them — though that varies; in some cases you may want more cores, if your workload has enough concurrency. Any other particulars you're looking for? Well, I'd also say a lot of memory — as I mentioned, if your workload fits in memory, that's fantastic. And for solid state drives, avoid SATA and SAS if you can, because the bandwidth those interfaces provide is just way too narrow; you want NVMe or other PCI Express-based storage if you're looking for the best performance. Fusion-io is not quite NVMe, because NVMe is a standard, but it's also very fast, because it goes over the PCI Express bus rather than some funky interface that was designed for spinning disks. So, for flash: if you have the chance, avoid SAS and SATA drives, which will limit your bandwidth compared to PCI Express.

Well, I think this one is a bit apples and oranges. If you're talking about Galera, that is replication technology, whichever flavor you use, and Galera doesn't necessarily help you scale; it helps in the sense that you get somewhat more predictable replication latency, but you can't scale writes 10x with Galera, so whether you use one or the other, you still need to think about sharding. The other thing I'd say is that, because of how Galera works, the limit on the data size per instance you'd want to keep is often smaller than for conventional MySQL; I wouldn't go to, say, a 10-terabyte Galera cluster, because if you ever need to do an SST, that's going to be very painful.

Okay, any other questions? Okay, guys — I'll be hanging out today and tomorrow at the Percona booth in the expo hall, so if you want to stop by, please do; we should also have some toys out there if you'd like. Thank you.
Okay — no, Lisa is not here. Yeah, the battery level is great; I think we're all good to go, I'll be floating around. Okay. I also have an old Marine Corps command voice that I can revert to if I have to. Well, folks, I'm going to do what they call in the restaurant industry a soft opening and start a little bit before time. Once again, if you just walked in, be warned: I throw these dolphins out to folks who ask questions and participate in the talk, and my aim is horrible, so if you really want to make sure you get one, come closer to the front. Yes, sir — that's a little better. And she wants one already; did it bounce the wrong way?

For those of you who don't know me and haven't suffered through me in the past six years, my name is Dave Stokes. I'm a community manager for MySQL products, which means Oracle pays me to go around the world — and we do have an opening in the community management department, so talk to me over at booth 527 or 327 if you're interested. This is my third time giving this talk. The first time I gave it in 25 minutes, which is pretty great for 57 slides; I gave the same talk to the San Diego PHP group last Tuesday night and went much slower, about two hours, so today I'll be closer to the first one than the second one. The slides are up at slideshare.net/David M. Stokes, or see me at the booth, or send me an email; my Twitter handle is Stoker. By the way, if you go back through old tweets: two years ago there was a Nicole Kidman movie called Stoker, so if you see stuff mentioning red hair and nudity, it's not me.

MySQL 5.7 — let me get rid of that resume-slideshow thing up there — MySQL 5.7 has a lot of really neat features. If you made it to Colin Charles' talk earlier, I mentioned thecompletelistoffeatures.com, which covers all 150-some-odd changes we made in 5.7. 5.7 came out in October, and one of the features customers are most interested in is the JSON data type. We have better security, better replication, group replication — where you update one master and it writes to the other masters, and that happens at the server level, not in a floating layer above it — and a whole bunch of other really neat features, but so far the biggest interest I've seen is in the JSON data type. Now, for those of you who haven't been paying attention, JSON is JavaScript Object Notation. They picked the worst word possible by using "object"; that means different things to different programmers and different languages. If you saw the JSON talk yesterday by Christophe Pettus on how Postgres does its JSON, you'll see we do things just a little bit differently, and I'm sure, as this becomes more and more of a feature in all the relational databases — because all the relational databases are adding it — things will shake out. Now, if you're really old-time Los Angeles, you might remember this interesting movie; it comes on Turner Classic Movies about twice a year. It's one of the last of the old Ray Harryhausen movies, where they did stop-motion photography and actually had skeletons fighting with swords against an actor — you might notice the skeletons have no muscle — so it's really an interesting movie if you can suspend disbelief.

So what does JSON look like? Well, it has a key and a value. The value can be a number, a string, a real, an array of strings — it can be just about anything. The JSON spec is rather loosey-goosey — I think it's five or six pages — and there are a lot of things it doesn't specify, which is great for programmers, because if they don't specify it, you're going to do it your own damn way anyway. So how does all this work?
Well, before we go into that, let me just say there's more than one way to skin a cat — and if you've been programming long enough, you know the corollary: how many skinless cats do you want running around? In the old days you could already store JSON data in a MySQL database, going back to the earliest beginnings. The trouble was you stored it as a character column: in this case we have a table that has an ID number and some data, and into that data we shove some JSON. It works; people have been doing it for a long time. So what's wrong with that? It's not sexy, and it takes a bit of work to dig into that JSON data to get the stuff you want, so we end up doing nasty, nasty things — there are regular expressions. How many of you really love regular expressions? Okay, those who didn't raise your hands: stay away from these folks. The other problem is that you're breaking the first rule of data normalization, which is what relational databases are built on: you cut the data up into the smallest usable bits, like zip codes, states, street addresses, first name, last name. With JSON, one of the problems is that you're shoving everything into one bucket, and going through that bucket gets kind of nasty. It's also slow: if you're going to go through all of that, you're going to do a full table scan — for those who aren't DBAs, that means reading everything in the book from the first page to the last to make sure you got it all. Slow and inefficient, and once again, regexes are just nasty for most developers.

With 5.7 we now have a JSON data type, like we have an integer or a real or a double or all the character data types. When you deal with JSON documents, the default for everything is utf8mb4 — once again, the default for everything is utf8mb4; it's part of the spec. For those of you who are lucky enough to deal only with good old USA latin1 data sets: no matter what you do, if you're doing it in JSON, it's going to be stored as utf8mb4. When you go to put a JSON document into MySQL, it's going to make sure it's a valid JSON document; if it's not valid, it gets kicked out — we'll come back to that a little later. When the MySQL server gets your JSON document, it puts it into a binary format and sorts some things out for easy searching. So if you're expecting something like the JSON data type in Postgres, where they take everything as a straight dump and whatever you put in is exactly what you get out — that's not exactly what's going to happen here. Column limits: the size of your JSON document is going to be about one gigabyte, so if you're doing documents bigger than that, you're going to have to figure out a way to parse them up, cut them down, or do something with them. By the way, on the server the value that matters is the max_allowed_packet system variable, and it can change per session.

So how does this work, Dave? Well, let's create a table called t1 with a column called jdoc of type JSON, and into that document we insert the values key1/value1 and key2/value2. And what does it look like when you pull it back out? You get key1 value1, key2 value2.
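In SQL, that example looks roughly like this (a minimal sketch of what's described above):

CREATE TABLE t1 (jdoc JSON);

INSERT INTO t1 VALUES ('{"key1": "value1", "key2": "value2"}');

-- an invalid document is rejected, e.g.:
-- INSERT INTO t1 VALUES ('[1, 2,');   -- fails: Invalid JSON text

SELECT jdoc FROM t1;
-- {"key1": "value1", "key2": "value2"}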
Now, how efficient is this storage-wise? Well, in memory it's going to take about 4.5% more space to hold all the overhead for easy indexing of the JSON document. How many here really have to worry about that kind of space 100% of the time, like we used to 30 years ago? Nobody? So you're the one guy — okay, Facebook. Something else that happens when you put a JSON document into MySQL: if you're reusing keys — remember, it's a key and a value — you're going to lose the second through nth ones; it keeps the first one. So here we're building a JSON object of key1/value1, key2/abc, key1/def: the server is going to give you key1 with value1 and key2 with abc, and that key1/def goes off into the bit bucket. This is part of the behind-the-scenes optimization. It will also internally sort the keys to make lookups easier.

So let's say you want to look into a JSON document and get a value. Here's a case where we have a JSON string where the id is 14 and the name is "Aztalan", and we're extracting whatever is at $.name — that's what the dollar-dot stands for — and it pulls out "Aztalan". There are a couple of other notations coming up that I'll explain. The trouble is, when you throw a whole document into a column, we need special notations and special functions to dig down into that document and pull out the values, and as you'll see it can get rather confusing, but it works fairly easily. The path syntax uses a leading dollar sign to represent the document you're currently working on; think of it, in PHP terms, as your $this arrow — whatever object you're looking at. By the way, I know these slides are a wall of text, so please download them and go through them; I'm not going to read everything off them to you — I can do that at the booth if you want — because I know we're going to be limited on time.

Here's an example with an array. The dollar refers to this entire array, so $[0] evaluates to the string over here — the array starts counting from zero — $[1] is this one, $[2] equals this, and there's no fourth element, so $[3] gives you back NULL if you try to pull it out. And because $[1] and $[2] are non-scalar values, you can actually dig down further: $[1].a equals this, $[1].a[1] equals that, and so forth. Have I lost anyone yet? Also, if you're coding this stuff and you have a space in your key, double-check your work and put double quotes around it — unquoted it's not legal and it's going to get kicked out. In this example, "a fish" has a value of shark and "a bird" has a value of sparrow; the path $."a fish" evaluates to shark. You also have wildcards you can use: if you know there's something out there like an address, made up of the various parts of an address, you can just ask for all of address and it will pull back the various parts, if you have them separated in your JSON document. In recent versions of 5.7 you'll also see the column->path operator, which is a synonym for JSON_EXTRACT; I'll go into that in detail in a minute. The idea is that putting JSON_ underscore in front of everything gets real, real cumbersome. Also, if you go back through some of the earlier blog posts on this, we were calling the functions JSN_-whatever, and that got kind of confusing for some folks, so we renamed them to JSON_; if you go to Planet MySQL and look at old JSON information and you see JSN_, please expand it mentally to JSON_. Right now you can use equal, less than, less than or equal, greater than, greater than or equal, not equal, and a couple of other operators, but you can't yet use BETWEEN, IN, GREATEST or LEAST with JSON comparators. And if you're doing ORDER BY and GROUP BY, be careful: SQL will order the NULLs before everything else, so be careful there.
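A small sketch of the extraction syntax described above — the documents are made up, but the functions and operators are the real 5.7 ones:

SELECT JSON_EXTRACT('{"id": 14, "name": "Aztalan"}', '$.name');
-- "Aztalan"

SELECT JSON_EXTRACT('["x", {"a": ["y", "z"]}, [5, 6]]', '$[1].a[1]');
-- "z"

-- the -> operator is shorthand for JSON_EXTRACT on a JSON column
SELECT jdoc->'$.key1' FROM t1;
-- "value1"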
Also, sometimes you're going to want to cast values. Here's an example where we're pulling a b field out of a JSON document and we want to return it as an unsigned integer: just put the cast in your SQL statement and you'll get what you want. It depends on your programming language and how you're handling it, but if you want to make sure — if you're an anal-retentive programmer, which all programmers should be — cast it the way you want it.

Functions. Since you're breaking the first rule of data normalization by filing everything into one column, you need ways to dig down to that information, so we have a whole slew of functions for you to use: some for creating JSON documents, some for searching, a whole bunch for modifying, and a whole bunch for getting information about the JSON, like how deep the document is, how many items are in it, and that sort of stuff. A quick way to create an array: SELECT JSON_ARRAY, pass it your values, and it creates an array for you. JSON_OBJECT: you can pass it a possibly empty list of key/value pairs and it gives you things back in proper JSON format. Once again, if you put invalid JSON into a JSON column, the server will kick it out, so this is a good way to make sure, before you send it to the server, that it's quoted properly — I'll talk a little later about why, if you're a PHP programmer (and the same goes for Python and a couple of other languages), you'd want to do this rather than rely only on the native functions out there. Here's one quoting NULL — how many folks here use NULLs all the time in their data? So you folks are probably the old-time DBAs; we're not teaching that in schools anymore, unfortunately. JSON_CONTAINS: so if you're looking for an address — you're searching someone's records, which are a JSON document, and you're looking for their zip code — if there's a zip code in there matching the target you give it, you'll get it back. JSON_CONTAINS_PATH is similar: does the JSON document have something at the given path or paths. JSON_EXTRACT is how you pull a value out; we'll go over this a bit more with generated columns — I have a great example there, so if this is still nebulous, don't worry, it will soon be clear. Once again, remember the shortcut with the arrow operator: here we're using the id value inside a column called c, where the value equals 4, to do an UPDATE. So if you're used to writing SQL, you have to get used to the arrow operator and a bit more coding, but it's nothing outrageous and it doesn't massively change your SQL code. JSON_KEYS returns the keys from the top level of a JSON object; if you want to go down deeper, you have to do some tunneling, but for the top-level keys, here you go. JSON_SEARCH returns the path to a given string — it returns NULL if any of the document, search-string or path arguments are NULL, or if there's no match — and you can ask for either the first match or all of them, so if your document has things that repeat, you can find them all. JSON_ARRAY_APPEND does what it sounds like: it appends things to a JSON array. For modifying you also have JSON_INSERT: if a member is not present in the existing object, it adds it; if it's already there, the existing value is left alone (changing existing values is what JSON_SET and JSON_REPLACE are for, which we'll get to in a second).
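Before moving on, here's a small sketch pulling a few of those creation and search functions together (the documents are invented; the function names are the real ones):

SELECT JSON_ARRAY(1, 'two', TRUE);
-- [1, "two", true]

SELECT JSON_OBJECT('zip', '90001', 'city', 'Los Angeles');
-- {"zip": "90001", "city": "Los Angeles"}

SELECT JSON_CONTAINS('{"zip": "90001"}', '"90001"', '$.zip');
-- 1 (found)

SELECT JSON_SEARCH('["abc", {"k": "abc"}]', 'all', 'abc');
-- ["$[0]", "$[1].k"]

SELECT JSON_KEYS('{"id": 14, "name": "Aztalan"}');
-- ["id", "name"]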
Yes, ma'am? I don't play enough with SQLite — I don't know if they have a JSON data type yet. I'll run into the guy who's the main guy for SQLite later this year and I'll ask him then, but I don't know; I'm guessing they'll probably have one of these days soon. JSON_MERGE: merge will do that for you. JSON_REMOVE: if you have something you need to get rid of, this is the way to do it. JSON_REPLACE: find an existing value and change it in place. And there's also JSON_SET. Now, somewhere you're going to ask: what's the difference between set, remove, insert and replace? Well, set replaces existing values and adds non-existing ones; insert inserts values without replacing existing ones; and replace replaces only existing values. You're probably going to use these functions most of all. What if you have a plain string and you want it to be valid JSON? You can use JSON_QUOTE, and it handles embedding all sorts of things: form feeds, tab characters, backslashes, UTF-8 multi-byte Unicode values. Something else you might be wondering about: the depth of your document, how many items you actually have in it, how many keys are out there — there are functions for that, too. And remember, everything you get back is going to be utf8mb4, so if you're not used to playing with Unicode character sets, please be advised they do take up extra space. Before you throw something into the database, you might want to run it through JSON_VALID to make sure it is valid JSON; once again, if it's not valid JSON, the server is going to kick it back, and if your program isn't designed to handle that properly, it's just going to frustrate you.

Path syntax once again, remember: the dollar sign is the synonym for the document you're currently working on, which, if you're a DBA, translates to the column. There are wildcards: .* represents the values of all members of an object, and [*] the cells in an array. Simple, right? You've got all that? You're going to pass the quiz when I give it to you in five minutes, right? This is going to take a bit of working through, even for folks who have been DBAs a long time; once again, we're breaking the first law of data normalization, so you have to go through these various hurdles to get to the data.
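To make the set/insert/replace distinction concrete, a small sketch — one made-up document, real functions:

SET @j = '{"a": 1, "b": 2}';

SELECT JSON_INSERT(@j, '$.a', 10, '$.c', 30);   -- {"a": 1, "b": 2, "c": 30}   (existing $.a untouched)
SELECT JSON_REPLACE(@j, '$.a', 10, '$.c', 30);  -- {"a": 10, "b": 2}           (only existing paths changed)
SELECT JSON_SET(@j, '$.a', 10, '$.c', 30);      -- {"a": 10, "b": 2, "c": 30}  (does both)
SELECT JSON_REMOVE(@j, '$.b');                  -- {"a": 1}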
Okay, I predominantly speak to PHP audiences, so I apologize if you're not a PHP coder, and I highly encourage you to become one. PHP by default has four JSON functions: json_decode, json_encode, json_last_error_msg, and json_last_error. You saw the list of functions we provide just to get at the data; these four are what PHP supplies by default. A simple example taken right from the php.net manual: if you json_encode an array it will give you a JSON doc. The only trouble is that about 70% of the time it's okay when you're putting it into MySQL, and the other 30% of the time, no, so please go back and use the JSON_QUOTE function that I showed you earlier. Here's another example for those of you who have never played with MySQL: this is a standard MySQL query, we're going out and pulling everything from a table called simple. It doesn't matter whether it's JSON or not, you're going to get everything back from that table because you're using the star wildcard, so this part of the MySQL development process is not going to change for you. However, the insert will change: in this case we have INSERT INTO the table a value of a one, a quoted "abc", a null, and a true. Once again, you've got to make sure it's valid JSON or the server is going to kick it out. Yes sir? Seventy percent of the time, yeah; there are some things that just don't quote right at the wrong time, so I'm warning people: use the JSON_QUOTE function to make sure MySQL is going to escape it the way you want. In engineering terms they call it a mismatch, where the person who built part A is expecting part B to work another way and it doesn't quite do it, and unfortunately that's the case at this point. So one of the problems we have with the JSON data type is that you cannot index that column with a normal SQL index; since we're breaking the first rule of database normalization, you just can't index all the junk in there. So we have an option called a generated column. Generated columns come in two types: virtual, which means that when you do the read it goes out and does the math, which I'll show you in a minute, to get the value you want; or stored, where it does the math and stores it into the table, it materializes that value into the data. Here's an example: a table called t1 with an integer field called f1 and an integer field called gc, and gc is STORED, which means it's actually stored in the table, and gc is defined as the value of whatever f1 is, plus 1. An example of using this: if you know your sales tax is 6.25% (0.0625) and you're selling something, you can have the item price and a generated sales-price-with-tax column in the table and it will generate it for you. So here we have a table called jemp, for JSON employee: we have a column that's the JSON type, and another one that's an integer type that we're calling g, and it's generated and stored, which means it's materialized in the table. To get that value, we pull out whatever the id value is from the JSON document and store it in g, and then we index g, so if you need to go right to that id number, it's there. Yes sir? It should be able to, I'll have to play with it; usually it's going to be a secondary key, and in this case it will be a secondary index, I believe. When a secondary index is created on a virtual generated column, the values are materialized, actually written down, in the index records, and you can use the arrow operator as a shortcut. So generated columns solve the problem of not being able to index into the JSON document by itself.
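A minimal sketch of the two generated-column examples just described; the column names follow the description above and should be treated as illustrative rather than the exact slides:

CREATE TABLE t1 (
  f1 INT,
  gc INT GENERATED ALWAYS AS (f1 + 1) STORED        -- materialized in the table
);
CREATE TABLE jemp (
  doc JSON,
  g INT GENERATED ALWAYS AS (doc->'$.id') STORED,   -- pull the id out of the document
  INDEX idx_g (g)                                   -- now that id is indexable
);
INSERT INTO jemp (doc) VALUES ('{"id": 4, "name": "Betty"}'), ('{"id": 7, "name": "Wilma"}');
SELECT doc->'$.name' FROM jemp WHERE g = 7;         -- goes straight to the row via the index on g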
Another neat feature with generated columns, and there's a great example of it in a blog post: say we need to do case-insensitive searches on last names or whatever. Indexes on a column are always created with the collation of that column, so if you just index the last name, you get that column's collation. But if you want, you can create a new generated column with the same data, stored with a case-insensitive collation. I'm saying this because I know that somewhere in the next six weeks one of you will have to do this, and hopefully that little kernel of information will pop back up; there's a gentleman over here shaking his head. This is a neat trick, and I've used it twice since that blog post came out. Syntax for generated columns: column name, data type, VIRTUAL or STORED; remember, for the utf8mb4 JSON data we want it STORED. It's nullable, and you can actually make it a primary key. And to go back to that question from a moment ago: subqueries, variables, stored functions, and user-defined functions (UDFs) are not supported in the definition. If you really, really, really need that, let me know; I have a couple of engineers who would like to talk to you and use you as a test case. Also, auto-increment cannot be used in a generated column definition. So everything is just wonderful, but you're either on the JSON bandwagon or you're not: is this the best thing since sliced bread, or is it just another XML? JSON is great, but once again you're breaking the first rule of data normalization. This adds an extra layer of complexity, which means that if you have a bunch of folks who don't deal with SQL and JSON documents in MySQL, they're going to look at this, their eyes are going to roll back in their heads, and they're going to go take a three-hour coffee break. So you are adding an extra layer of confusion. But what if you really need to store JSON-formatted documents, because that's the way everything is going, all the APIs are going JSON? This is the way to store it: it's easy, it's fast, it goes with just about everything else you're used to doing with MySQL, and it's been available since October in MySQL 5.7. I had a gentleman at a talk in New York City, I had the before-lunch speaker slot, and he got very excited when we had the beta version of this code. He immediately went home, took all the multiple-listing-service real estate data for the state of New Jersey that his company had, moved it out of Mongo, put it into this, and put it in production. Yeah, that's one of those things where, with beta software, please go out and break it, but don't put it in production. Well, he did, and it's been running solidly since last July. So you might find some holes, and please file a bug report if you do, but it is fairly solid. I haven't done any direct one-to-one Mongo versus MySQL 5.7 comparisons; it depends what you're doing. One of the things Mongo does when it writes a record is pad it, so if you do a replace or an insert, the space is already carved out. InnoDB doesn't do that: it wants to take that old record, that's part of the MVCC stuff, give you a new one, and keep the old one around in case you roll back. So in some cases Mongo is going to be a little faster, in other cases we're going to be a little faster. By the way, I apologize, I'm probably going to bonk you trying to get this to him. Solomon, no questions? That's what you're here for. I was hoping he would have done a START TRANSACTION and a COMMIT at the end; I have no idea what he did. It's listing data, so chances are by the time you read it, some of those houses would have already sold.
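Circling back to the case-insensitive collation trick mentioned at the start of this passage, here is one way it can look; the column names and collations are assumptions for illustration:

CREATE TABLE people (
  last_name VARCHAR(50) COLLATE utf8mb4_bin,           -- case-sensitive original
  last_name_ci VARCHAR(50) COLLATE utf8mb4_general_ci
    GENERATED ALWAYS AS (last_name) STORED,            -- same data, case-insensitive collation
  INDEX idx_ci (last_name_ci)
);
SELECT * FROM people WHERE last_name_ci = 'smith';     -- matches 'smith', 'Smith', 'SMITH' via the index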
That's what I'm saying. Yes sir? I believe if it's a numeric value it's going to return you a numeric value, and if it's a string value it's going to return a string, but let me go back and read the man page to double-check; come by the table and I'll confirm. You should be able to do that with a simple cast, cast it back as an integer, or pull the items out of the array. Yes sir? Right now the limit is about one gigabyte. That's writing it out to disk and getting it solidified; while you're manipulating stuff in memory, if there's one gigabyte and you're pulling out 5K here and adding another 5K there, it's going to be bigger than one gigabyte in memory, but once it gets written out to storage it has to be no bigger than a gigabyte. Yes sir? It should; it's going to complain loudly about it. That's something I haven't done and I need to try. Yes sir? Well, it has internal indexes for retrieving the various items in the document, but it's not an index you can touch, it's a generated index. So think of the indexes within the JSON document as being only for that document and only for the server to use; you can piggyback off them through generated columns, but they're for the server. Yes sir? Yes, part of the work in 5.7 was that we increased it; I'd have to double-check, it's been a month since I looked at it, but I think we're up to 32 characters. We had so many people asking us to increase it from the default 16 that we've done that. Yes ma'am, way back there? It depends on how quickly you fill up that gigabyte of data; if you're going to have one key, one value, one key, one value, it's going to be up to you. Yes sir? Yes, you might want to play with JSON_EXTRACT to make sure you're getting the exact values that you want. Yes sir? How many data types are in JSON? I'd have to go back and look at the spec, but they have all the normal things like numbers, strings, arrays, all that. It's fairly loosely typed; it's a simple spec that came out of the JavaScript world, so it's kind of loosey-goosey. You have a JSON document that's changed, do you replace it or do you use the functions we provide? Yeah, I'm not really sure; that's probably going to depend on how you're handling the data programmatically. I'm guessing that if you want to do it in place, with the functions I just listed it's going to be fairly simple; if you want to pull out the entire thing, change it, and write it back, your mileage may vary. Yes sir? You can do multi-column documents, but it gets messy, all the quotes, commas, and brackets. It's designed to be easily human readable, and of course once you give it to a human the first thing they're going to do is try to obfuscate what they're doing, so yes, but be warned, it gets messy. Yes sir? It shouldn't be; if you're using generated columns that's probably going to be the best way to do that, and I do that all the time. Yes sir? Variables within JSON docs: unfortunately that's not part of the JSON standard. The JSON standard is, I won't say a final copy of the data, but it's all hard values, nothing changes on the fly, it's all determinate data. Oh, by the way, if you didn't get one of these, I have more swag at the booth at two o'clock: I have hats, I have some shirts, I have squeezable dolphins, I have boogie bots, if you don't have a boogie bot you need to get a boogie bot, and I have some other wind-up dancing robots. So, any... here you go, Solomon. For those who don't know, this is Solomon Chang; he's been around the MySQL universe forever and ever.
He also likes plastic bags for some reason. But if you have any other questions and you're too shy to ask in front of this group of nice, friendly people, see me later, see me at the booth; I'm here till Sunday midday at least. Yes sir? Well, people just want to store stuff in JSON format, they don't want to break the schema down into its component parts. There are a lot of folks out there who don't want to analyze their data or do anything, they just want to throw it out there, which has its benefit: you don't have to plan ahead. The problem is the speed is going to suck and it gets confusing. If you're doing a composite index, like you have the zip code and something from a JSON document, you just create the index with the column name and then the generated column name and away you go; yeah, you can do that, and there's a small sketch of that below. Any other questions about JSON or MySQL? Yes sir? Not yet, I'm on the road for the next month and that's on my to-do list; unfortunately I was a JavaScript programmer a long, long time ago and I've fallen off the wagon, and I need to get back into Node.js to figure out how this impacts it, but I can tell you that with Python and PHP this works beautifully. Yes sir? I know the engineers are tweaking some stuff, but I really don't have enough information to give you. If you have a feature you want, please propose it to us. We just came out with 5.7 in October and we're on a roughly 24-month release cycle, so if there's something we don't have and it's not on the drawing board, we can put it on the drawing board for 5.8. You may not actually see it come out in a test version for a while, but if you are serious about it and can tell the engineers what you want, they will try to get it in there. Yes sir? A layer or level of arrays inside of that, you're going to have a different table for that; I mean, I see it more as something where you'd be converting relational table data into JSON rather than having a need to do it the other way around. Yeah, there's a cartoon strip called Pickles, I don't know if you've ever seen it: the grandfather is showing his grandson an old rotary phone, and the kid sees the phone and goes, wonderful looking phone, how many pictures does it hold? The thing is, you and I have gray hair, we're old dinosaurs, we want to break everything up and normalize it. A lot of folks don't want to do that, they want to store stuff in JSON, so a lot of folks are doing APIs or audit trails and everything in JSON and being happy. Well, any time you break the rules of data normalization, things tend to bite you in the rear later. Like you'll hear people say, oh yeah, I pulled this column from this table and put it over here, this way I only have to do one select, and then they never go back to the parent table and update stuff. That's why your customers don't have the current phone number: it's out here in the sales record but it's not in the customer record, so when their credit card goes bad you have no way to track them. If you're Sabre and you're doing 30 million transactions a minute, you're not going to do it with JSON, you're going to do good old standard SQL. If you're a mom-and-pop real estate shop or a store, this might work for some of the documents you need to store; it's just another option for you. Yes sir? It's an option for a cache where you're just throwing stuff you don't know much about out there, but the dinosaurs like Solomon over here and myself are going to say: you should have put that in your data spec.
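Here is that composite-index sketch mentioned above; the listings table and its fields are made up for illustration:

CREATE TABLE listings (
  zip CHAR(5),
  doc JSON,
  beds INT GENERATED ALWAYS AS (doc->'$.beds') STORED,
  INDEX idx_zip_beds (zip, beds)     -- a normal column plus a generated column in one index
);
SELECT * FROM listings WHERE zip = '90210' AND beds = 3;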
There is a standards committee and things are kind of in flux, which means sometime in the near future they'll probably come out with a recommendation and an RFC and a voting round, so two or three years from now there might be something a little more solid. But right now you probably can't take our JSON and plug it into Postgres, SQL Server, anyone else's. I'm hoping that in a couple of years the entire NoSQL versus SQL stuff goes away and it becomes like the old ASCII versus EBCDIC wars; ask him about that. Yes sir? Ooh, I haven't played with that yet. I don't see why you couldn't do that with a generated column; I'll have to play with it, that goes on the to-do list and I'll blog about it. Yes sir? I haven't played with that either. I'm sure that's a column-storage question and I'd have to look it up and backtrack from there, but yeah, there is going to be a wall that you hit, and of course you never hit a wall gracefully. Any other questions? Yes sir? Well, the full-text search isn't designed to be something like Solr or Lucene or Elasticsearch; it has its limits. It's for looking for certain key phrases within a document, it's not as robust as Elasticsearch for the type of stuff you're doing. Will JSON get there? Hopefully one of these days, but if you're doing a lot of corpus searches for certain keywords and such, this will work. What's really funny is we're doing a lot of work with MeCab parsers for various non-Western languages, searching for patterns, and a lot of that I think is going to be retrofitted back in. I know the folks doing the Korean and Japanese pattern searches are finding some really interesting stuff and it's beginning to filter back into the full-text world, so hopefully with 5.8 you're going to see that vastly improved, but I don't think it's ever going to be something where everyone throws out Elasticsearch and switches to this. Yeah, Uncle Larry Ellison is my big boss; hopefully we'll get there one of these days, but it's not anytime soon. By the way, for those of you wondering how we're doing at Oracle: we're number eight in popularity in all of Oracle, our renewal rates for support are amazing, much higher than the industry standard, and we're hiring, so if you have a resume and want me to tell you how to get it through the Oracle system, let me know. We're working on plans for 5.8, and there are some things we're looking at that, on that roughly 24-month release cycle, may not make it into 5.8 but into the one after, so we have plans that are four, five, six years out. If you have suggestions, concerns, comments, or gripes, let us know; always hit me up with concerns, comments, gripes, or just odd questions. Once again, thank you all for coming out, be sure to stop by the booth, I have lots of swag, and be sure to thank the volunteers who work here. Thank you all. Hello, do you hear me? Does this work? Thank you guys, sorry for the delay, let's get started. My name is Alex Kosminski, and I will be talking today about MySQL indexes. We were supposed to present together with my colleague, but he had emergency data recovery work, so yeah, this is what we do in our normal life; sorry for that too. This talk is about MySQL indexes, and you may wonder why yet another talk about MySQL indexes, but I'm glad you came, you probably have your reasons. What I wanted to do here is give you some insight into how indexes in MySQL are built and how MySQL uses them to produce query results, and the point is, if you know that, if you can picture what's going on underneath,
you can write efficient queries straight away and make your MySQL database more performant. I'm Alex Kosminski; I used to work a while ago as a consultant at Percona, as did my colleague, and now I'm a MySQL SRE at Dropbox. Together with my colleague we also run TwinDB, which does data recovery, backups, disaster recovery, all that kind of work for MySQL. So here we go. I highly encourage you to ask questions as we go, feel free to ask, although there is also a Q&A at the end, so it's up to you. There are two big parts to this talk. In the first part I will explain B+ tree indexes: what they are and why this design was chosen for MySQL, for InnoDB, in 1994. In the second part I will illustrate, on the Sakila actor table, exactly how MySQL executes different kinds of queries, so you have a better understanding of what's going on. Many times, back in our consulting days, we heard these stories where people had hit the limits of MySQL and were being advised to migrate to something else, but after a careful look we found out that, okay, you're just not using MySQL the right way, you have to tweak something and there you go. So let's see how data is organized in MySQL, and I will be talking about InnoDB only; at some level it applies to MyISAM, but in my opinion MyISAM is not relevant now, so let's just drop it. The foundation of InnoDB data organization is the B+ tree structure. It's a balanced tree; this picture is from Wikipedia, maybe it's familiar. The B+ tree has multiple properties and features that are useful for data organization, and it was a good choice when InnoDB was designed. The key points: as with any other tree, it starts from the root, and the data is clustered together with the key values in the leaf nodes of the tree. This tree is quite optimal for searches, its cost is O(log n), and it's the same cost for insert, delete, and update. In a B+ tree all data is clustered together with the keys in the leaf nodes; this is one of the reasons why InnoDB requires you to define a primary key. The primary key, the primary index, is exactly the data structure where InnoDB keeps the table. InnoDB's implementation of the B+ tree is a little different from the classic one, and one of the differences is the doubly linked list of leaf nodes: in the classic B+ tree there are pointers to the next leaf page; in InnoDB the list is doubly linked, so InnoDB can perform queries like ORDER BY id and ORDER BY id DESC equally optimally; any order will work fine because of that. It's also important that the keys, the primary key values, are sorted inside the tree and inside each node; this matters for queries like range scans and so on. And we have the first question, please, sir. Yeah, let's say you have a query like SELECT stuff FROM table ORDER BY id DESC. In ascending order you can access a node, get some values for the result, and output them in order, but if you want descending order you need a link to the previous node. This is to highlight the difference between the classical B+ tree and the InnoDB implementation; it is different indeed. The B+ tree also has the property that the depth is the same for all leaf nodes, which means that no matter where our key is, at the beginning of the table or at the end, we can access the value in the same number of steps. Yes, question? Yes, and this is what happens when you insert new
values, new records, or delete some records from the table: yes, you have to rebalance the tree, because the B+ tree is a balanced tree. But what's good about the B+ tree is that you don't have to do that on every insert, because one leaf node actually contains many records, many rows, and you don't have to rebalance until the page is full; after it's full you have to split the page and rebalance, but that's optimal. I'd also like to note that the B+ tree was chosen for InnoDB in 1994, when InnoDB started. That was the era of spinning disks, and the B+ tree is good for accessing data on those disks, because the structure is very shallow, the height is very small even for large data sets, and if you need to read through the table it's all sequential access, which is what rotating disks are better at. Oops, sorry. Yeah, so we talked about tree rebalancing. The B+ tree is efficient for range scans because all the data is in the leaf nodes, the nodes are connected to each other, and when you do a full table scan or full index scan it's sequential reads from the disk, and the data is sorted by the key; MySQL benefits from that as well for all kinds of queries where you need to do sorting or GROUP BY and so on. And I noticed there was a question, some other, yes please? I'm not sure I understood the question, what do you mean? This is just a few of them; comparing the B+ tree with other data structures, to give you an example not covered here, for some trees the insert cost changes, the worst case and the average case are different; for the B+ tree they are the same, which means you have predictable performance, which is good because you can expect to meet an SLA, for example, for query response time. Okay, as I said, B+ trees are very shallow, and that means that to reach a leaf node MySQL has to do fewer comparisons; to get to the leaf node InnoDB has to do random lookups inside the tree, and because the height is small, we have to do fewer of them. This is the formula to calculate the exact height of the tree, and I would highlight here, first of all, that the height grows slowly with the growing number of records. Another thing is that it's important to keep the key as small as possible, because then we fit more keys per intermediate node, which means our branching factor is higher, which means the height of the tree is smaller, which is good: the smaller the tree, the fewer random lookups. To illustrate the formula from the slide, I calculated the height for a few record counts, and here's what happens: for 1 billion records, where the primary key is a 4-byte integer, the height is only 3. That means you have a huge table of 1 billion records, and to access any value you have to do just 3 random lookups, which is very good. Compare that with longer primary keys; for example, some people use long values for the ID, and that's okay, but to understand the implications, you can see that on a 1-million-row table InnoDB would have to do 50% more random lookups, which may be critical if you are I/O bound and have to do a lot of work. Now I'm slowly moving to the second part, where I will explain all the access methods InnoDB uses, based on this table. This is a table from the Sakila database, the actor reference table, and it has a primary key and one secondary key on last_name. What does that mean? When you create this table you get an actor.ibd file, if innodb_file_per_table is on, and inside this .ibd file you have two B+ tree structures. One B+ tree structure is the primary index, and this is where you have all your data.
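The height formula itself didn't survive into the transcript; a commonly used approximation (an editorial reconstruction, not necessarily the exact formula on the slide) is height ≈ ceil( log(N) / log(fan-out) ), where the fan-out of an internal page is roughly page size / (key size + child pointer size). With InnoDB's 16 KB pages and a 4-byte integer key plus a few bytes of page pointer, the fan-out is on the order of a thousand, and 1000^3 = 10^9, which is why a billion-row table still comes out at a height of about 3.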
The key there is actor_id, and all the other fields, first_name, last_name, and last_update, are the values in this tree. There is a secondary key as well, and that means a separate B+ tree structure: the key in that B+ tree, in that index, is last_name, and the value is actor_id. So when you create the table and populate it with data, in the .ibd file you have these two structures, and based on this example I will review all the access patterns. Here I suggest a rule: if you can access the data and produce the result, well, if you can produce the result while you are accessing the data, then it will be fast. In other words, if you don't have to do anything additional, like temporary tables or extra sorting, if the client requires a query result and you can produce that result just by accessing the index, it will be fast, because in that case you spend the least resources to produce the result. I hope it's not too confusing; in the next slides I will illustrate this rule, so feel free to ask questions and don't sleep through my talk, please. So, the simplest case: point lookups. We need to get a record by its primary key, our actor_id is 3, and that happens very quickly because the B+ tree performs searches fast; we quickly search from the root node down to the leaf node and get our record. If we need to do a range scan, a range query, that's also optimal: InnoDB finds the record with ID 3 and it doesn't have to do additional lookups, it just reads all the following records and that is the result of the query; we don't do any extra data processing, we just read records from the index and send them to the client, that's it. Now let's see what happens when we query the table by the secondary index, say we need to find all the 'LN' last names. You remember we have these two indexes: this is our primary index, this is our secondary index. We can quickly find the LN keys in the secondary index, and the value there is the primary key into the primary index. So if we need to output first_name, last_name, and last_update, we have to take the actor_id, go to the primary index, find this record, 118, and output it; the same for the next record, 775, we find it and output it. So see what happens: we quickly access the leaf node, we do a range read, but we also do random lookups into the primary index, and sometimes that can make a huge difference, because if this table is huge, that means reading a lot of data from the disk. Now let's see for what kinds of queries we can use the secondary index only, so we do it in the most optimal way. One example is SELECT COUNT: if we need to count the LNs, we don't really need the primary index, we have the data in the secondary index, we read from it and we can count it, that's it. This is what's called the "using index" optimization, and if you run EXPLAIN against this query you will see that it shows "Using index"; that means exactly that: in order to produce the result we are just using the secondary index, we don't touch the primary index. This is a very good optimization because we avoid all the random lookups into the primary index.
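The access patterns just described look like this against the standard Sakila actor table (standard schema assumed; the last name is just an example value):

SELECT * FROM actor WHERE actor_id = 3;                 -- point lookup: one descent of the primary B+ tree
SELECT * FROM actor WHERE actor_id BETWEEN 3 AND 10;    -- range scan: find 3, then read the leaves sequentially
SELECT first_name, last_name, last_update
  FROM actor WHERE last_name = 'ALLEN';                 -- secondary-index lookup, then one primary-key
                                                        -- lookup per match to fetch the other columns
SELECT COUNT(*) FROM actor WHERE last_name = 'ALLEN';   -- answered from the secondary index alone ("Using index")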
Covering indexes help us use the same optimization. Let's say we have a query like SELECT first_name WHERE last_name = 'LN'. With our old secondary index we would have to do the additional primary key lookups, because we need to know first_name, and in the secondary index we only have actor_id. So we do a little trick: we create another secondary index and include first_name in it, and although we do not include first_name in the WHERE clause, we still benefit from it, because we can use this first_name to produce the result of the query, and again, if you run EXPLAIN you will see "Using index". This is good: you sacrifice disk space, because you need either a bigger index or an additional index, but you don't do random lookups, and that's good. You cannot use both indexes at the same time, so the MySQL optimizer will decide, okay, I will use the secondary index on last_name to access the data, and it will have to do random lookups into the primary index to get first_name for each record; the secondary index on first_name will not be used at all here. I see, yes, actually you can benefit from that index: if you search for Kim Allen you can use this index efficiently. If you have two indexes and you have a query like WHERE last_name = 'Allen' AND first_name = 'Kim', then MySQL has to decide which index to use, because, again, it cannot use two indexes at the same time for the same query. It will choose by index cardinality, whichever index is more selective, and use that index to access the data, but then it has to filter out all the extra records where first_name is not Kim, and you will see "Using where" in that case; that means MySQL has to do extra filtering to remove all the unnecessary records. Make sense? Let's go over another kind of query that can benefit from secondary indexes: SELECT DISTINCT. If we need to find the distinct, unique last names from the table, we can use the secondary index for that. It's very easy: we read many Allens but we just throw them away until we find some other last name, and because the records are sorted, the keys are sorted, we know that we are not going to meet Allen any further on. Make sense?
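A sketch of the covering-index trick and the DISTINCT case on the same Sakila table; the added index name is an assumption:

EXPLAIN SELECT first_name FROM actor WHERE last_name = 'ALLEN';    -- needs primary-key lookups for first_name
ALTER TABLE actor ADD INDEX idx_last_first (last_name, first_name);
EXPLAIN SELECT first_name FROM actor WHERE last_name = 'ALLEN';    -- now the index covers it: "Using index"
EXPLAIN SELECT DISTINCT last_name FROM actor;                      -- can also be answered from a secondary index alone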
Again, "using index" for GROUP BY: see what happens if we have a grouping function, COUNT. It's similar to the previous query, but we don't just output the list of unique last names, we also count them. We can use the secondary index for that too, because we do an index scan and count while we read; it could also be MAX here, or MIN, or whatever, and it will still benefit from the index. And what I like most about this concept, this way of picturing indexes, how they work, how they're accessed, is that if you picture indexes in this form you can even understand complex things like the loose index scan. Yes, please, question? For maximum? Ah, max, now I understand what you mean. Max of what? If it's MAX(actor_id), then yes; if it's something else, then no. But if you have an index like this and you take, per last name, MAX(first_name), it can use it, because for any given last_name the first_names are sorted as well, so the last record will be the maximum and the first record will be the minimum. Yeah, good catch, thank you. So, the loose index scan, what is it all about? Let's add another column, rank, and fill it with some values, so we have actors with their ranks, and we build this index on last_name and rank, and this is our primary key. The question now is: let's find all the losers. We find records by their last name, and for any given last name we find the person with the minimum rank. We can use the secondary index for that, and this kind of optimization is called a loose index scan, because see what happens: for any given last name we have to find the value with the minimum rank, so let's say AKROYD; because the entries are sorted by last_name, and for every last_name by rank, we know that the first AKROYD record has the minimum rank, so we know this is a record for our result, and then MySQL can choose either to do a range scan or to do the loose index scan, which means it does another lookup for the next key, and that helps avoid large index scans. You just read this row, this row, and maybe some other row from the secondary index, and that's it; you don't scan the index, you just pull selected records. That is the loose index scan. MySQL also benefits from indexes being sorted by their keys for queries like this: say, again, you have this secondary index and you need to output all the AKROYDs ordered by rank. You can do it cheaply because you already have the order, the index is sorted by rank, so you don't do additional sorting. Yes sir? MySQL, well, InnoDB, tries to reduce the impact. What you said is right: when you update the rank, this index has to be updated, and that can shuffle things around, it can change the tree. To reduce the impact of that, InnoDB uses the... wait, not the buffer pool, no, the InnoDB buffer pool is the whole thing where all the I/O goes; the insert buffer, sorry, although they changed the name in 5.6 to the change buffer. The same thing happens when you update or delete records from the table: the secondary indexes have to be updated as well, so it uses that change buffer to reduce random writes to the disk, because the secondary index changes are buffered there and then flushed later in batches. It doesn't work for unique indexes, though: the change buffer doesn't work for unique secondary indexes because you have to ensure the uniqueness of the index. The B+ tree itself doesn't require uniqueness, keys can repeat each other, so uniqueness is a feature on top of the B+ tree.
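To make the loose index scan example reproducible, here is a sketch; the rank column and the index are the speaker's additions to the table, recreated here as assumptions (the column is named actor_rank to stay clear of reserved words in later versions):

ALTER TABLE actor ADD COLUMN actor_rank INT;
ALTER TABLE actor ADD INDEX idx_last_rank (last_name, actor_rank);
-- Minimum rank per last name: EXPLAIN should show "Using index for group-by" (the loose index scan),
-- because for each last_name the first index entry already holds the minimum rank.
EXPLAIN SELECT last_name, MIN(actor_rank) FROM actor GROUP BY last_name;
-- Ordering by rank within one last name is also free, since the index is already in that order.
SELECT actor_id, actor_rank FROM actor WHERE last_name = 'AKROYD' ORDER BY actor_rank;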
This slide can also illustrate, maybe you remember, the old advice that in the WHERE clause you have to put the columns compared with an inequality, the less-than sign, on the rightmost side of the index. So a clause WHERE x = a AND y > b will use the index, while a clause WHERE x > a AND y = b is not going to use the index well. Look at the index: if we have rank = 40 and ORDER BY last_name, can you use the secondary index for that? No, you cannot; it just doesn't make sense, because the index is sorted by last_name, and then for each last_name it's sorted by rank. Secondary index, a question please? Exactly, this is why you cannot do LIKE '%value%': if you have an index on that field, the keys are sorted, say they are strings, and you are missing the beginning of the string, so you cannot find the records, you have to scan all of them. So, joining tables. In MySQL tables are joined in a nested loop: if you need to join three tables, MySQL takes one table, for each record in the first table it finds the records in the second table that satisfy the join clause, and then for each record of that result it searches for records from the third table, and to do these searches MySQL uses secondary indexes if it can. In this case we want to output all films with the actors starring in them, and we want to order by title; I'm not sure you can see it well, this is the best I could get in terms of formatting. The thing is, MySQL decided to take the actor table, join it with film_actor, then join film, then put the result into a temporary table and do additional sorting, because we need ORDER BY title and title is a column of film, not of actor, so it has to do additional sorting. Remember that rule: this query breaks it, because it has to do additional sorting, it cannot just access the table and produce the result. The MySQL optimizer decided to do it that way, most probably because it thinks the actor table is small, so, no big deal, I'll do the additional sorting but I'll read fewer records. Let's force MySQL to do what we want. We know that title is a column of film, and we know we want to access records in sorted order, so we force MySQL to use the index on title to produce the result, and then for each title we join the other tables. That gives us the result with no additional sorting, and yeah, it looks worse because it has to read more records, but it doesn't create a temporary table and it doesn't do a filesort. And the response time of this query, by the way, you can easily reproduce this because the Sakila database, the data set, is freely available, and I managed to reproduce it on 5.6.20-something, 5.6.25: this query is five times slower than this one, just because here we used the right index, and there the MySQL optimizer chose a bad index and a bad way to join the tables. So yeah, you can check that by response time, but you also want to check the handler counters; this is the API between the storage engine and the MySQL server part, and if you run SHOW STATUS after the query it will show you, for example, how many times the server asked the storage engine to read a key. Let's compare them. First of all, you see that in the first case MySQL accessed indexes about 25,000 times, and on top of that it did writes during a SELECT, which is because of our temporary table; in this case it did only about 13,000 lookups. Twenty-five versus thirteen: thirteen wins, right? And the response times show that the optimized query took five times less time.
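Here is a reproducible sketch of that experiment against Sakila; STRAIGHT_JOIN plus FORCE INDEX is one way to express the forced plan (the speaker's exact hint isn't in the transcript), and the Handler status variables are the counters just mentioned:

-- Optimizer's own plan: start from actor, join film_actor and film, then sort by title
-- (EXPLAIN will typically show "Using temporary; Using filesort" for a plan like this).
FLUSH STATUS;
SELECT f.title, a.first_name, a.last_name
  FROM actor a
  JOIN film_actor fa ON fa.actor_id = a.actor_id
  JOIN film f        ON f.film_id   = fa.film_id
 ORDER BY f.title;
SHOW SESSION STATUS LIKE 'Handler%';   -- note Handler_read_key and Handler_write (the temporary table)

-- Forced plan: read film through its title index in order, then join the other tables,
-- so no temporary table and no filesort are needed.
FLUSH STATUS;
SELECT STRAIGHT_JOIN f.title, a.first_name, a.last_name
  FROM film f FORCE INDEX (idx_title)
  JOIN film_actor fa ON fa.film_id = f.film_id
  JOIN actor a       ON a.actor_id = fa.actor_id
 ORDER BY f.title;
SHOW SESSION STATUS LIKE 'Handler%';   -- compare the counters between the two runs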
Any questions? Well, it's a big story with the MySQL optimizer; this is what they do, and both the Oracle team and the MariaDB team work hard to improve the optimizer, but at the end of the day the optimizer is a stupid computer, and we humans know how to access the data in the most optimal way. We cannot exclude the optimizer, because MySQL has to decide how to access tables, that's the optimizer's job; sometimes it makes mistakes, and as you see, here it makes mistakes even on small reference tables that have been around for many years. This is life, nothing is perfect. More questions? Yes, please? Exactly, you have to understand that the index is not sorted by your second column; it's sorted by the first column, and then for each value of the first column it's sorted by the second column. So you're right, you cannot search by the second column using that index, because the index is not sorted by that column. What you can do, how you can work around it, is to have two indexes, one on each column: if you query by the first column it uses the first index, if you query by the second column it uses the second index, and if you do an OR, first column is this or second column is that, MySQL can even do a union: it gets all the records for the first part of the clause, gets all the records from the second index, merges the results, and gives them to you. So it can do that. What is important in this talk, I believe, is to understand what's possible, because if you know for sure that something is possible, then most probably it's implemented in MySQL, and if not, you can give a hint to the MySQL optimizer, or you can file a bug: hey, why is this query not optimal even though I have these indexes? You have to make sure that MySQL can still use the columns, you have to define proper secondary indexes for that. Sometimes, yes, so you have to have the indexes; even if you force the join order, MySQL still needs indexes it can use, because, as you said, if you change the order, some indexes may no longer be usable, so you have to have the indexes, or maybe change the join clause, but to join tables you need secondary indexes anyway. Question, please? The thing is, it really doesn't make much sense to increase the join buffer, for several reasons. First of all, if you access tables in a non-optimal way, you will end up with more I/O anyway. Another reason is that if the buffer pool is large enough that your data set fits in it, then you don't need the join buffer, because the data is already in memory, in the buffer pool. So I cannot remember cases where changing the join buffer improved performance significantly; maybe there are rare cases, when you join large parts of the second table it might help, but I'm not ready to give you an example. And this is the danger with per-session buffers, the sort buffer, the join buffer, some other buffers: they are allocated per session, and if you have many sessions, multiply that amount by the number of connections. And MySQL, and InnoDB in particular, when it allocates memory it doesn't always check whether the allocation succeeded, so if you run out of memory, most probably MySQL will just crash, because at some point it will dereference a null pointer and there you go, crash. Thank you, and one more question please? Yes, you can: if the join clause is right, it can use a secondary index on the same table, why not. I'm not sure what kind of... yeah, I wasn't 100% accurate when I said that a query cannot use more than one index; that's not true, and I gave another example with index merge, where for the same query MySQL uses two indexes, getting part
of the result from one index and part of the result from another index and then merging the results; that's another example of MySQL using multiple indexes to produce the result. Joining tables: again, you use at least three indexes to join three tables. If you join a table to itself, exactly, that's what happens: to join the table the second time you can use any index defined on the table, there are no limitations, including the foreign key index, because a foreign key forces you to create a secondary index, so effectively it is a secondary index anyway, and yeah, you can use it. Thanks. Question? For low-cardinality columns, well, it depends on the situation, because even if an index is not very selective you can still benefit from the LIMIT optimization. For example, if you need just 10 men, right, you have an index on sex, which is the classical example of a low-cardinality index, but you just need 10, so it accesses the index, reads 10 records, and stops reading; that query will be efficient even though the index is not very selective. And yeah, it depends on the situation, I don't know a general solution, you have to be as specific as possible. Yeah, it's good to double-check anyway, but this kind of situation it handles; the LIMIT optimization is well known to the MySQL optimizer, so it will probably use that low-cardinality index to access the data. Any more questions? No? Okay, then thank you very much, you guys, for the questions. Alright, can you hear me? Perfect. Can you see the slides? Because I can't. Alright, so welcome to this session about scaling MySQL and MariaDB. I guess I have both here because the things I'm going to talk about apply pretty much equally well to both, but there are some differences, and when they come up I will mention them. So, who of you was at Peter's talk this morning? Okay, not a whole lot. Peter talked about sharding, when to shard and when not to, so this topic is very similar, except that I'll talk more about the technologies, how it's actually done with MySQL and/or MariaDB, and not so much about the holistic picture Peter was covering, so they do complement each other quite a bit. A few words about myself: my name is Max, I'm from Finland and I live in the US, in Georgia; I don't have a southern accent yet, but maybe one day. I work at MariaDB, so obviously I'm not totally objective in all of these things, and I used to work for MySQL AB back in the day as well, so I've been doing this way too long, and I have no comment about this picture, it's the marketing team that put it there. So what we're going to talk about today is scaling, or scalability. I looked it up on Wikipedia, and this is the phrase I found that best describes what we're talking about: scalability is the ability of a system or network to handle a growing amount of work in a capable manner. I guess it means that as load increases on our machines or on our system, we're able to cope with it in some form,
and what that means, of course, depends on your definition of scalability and what the limits are: should you be able to cope with 5x the traffic without users noticing anything, or do you have other constraints? That's up to the service provider to define. It's a bit similar to high availability, where the definition is pretty loose; it depends a lot on your requirements. Do you really need the system to always be fully available and ready to be used at any point in time? Because typically, the harder your requirements, the more you have to pay, sometimes in money, but not only money, also time and admin work and so forth. And it's the same, of course, with scalability: the more you want to be able to scale, the more you have to invest in time, money, hardware, and so forth. Typically, when we talk about scaling, we distinguish between horizontal and vertical, where horizontal just means you add more hardware to your machine and vertical means you, sorry, the other way around: vertical means you add more stuff to your machine, and horizontal means you add more machines. Typically in the MySQL world we've been doing a lot of horizontal scaling, where we add more machines, and that's pretty much what I'm going to focus on here; it's a lot less interesting to just add stuff to one box. But for those of you who were at Peter's talk: typically, when you want to do horizontal scaling, it adds pain, especially if you do sharding, and when you add more nodes it becomes more complex to manage, so adding more resources to one machine is usually a much easier solution; it's just less interesting, which is why I'm not going to talk about it. So when do you typically need to scale, in particular horizontally? Of course, when the resources of your machine are not enough. It could be that your data set is very large, you have several terabytes of data, and performance is not as good with several terabytes; InnoDB is typically very good in scenarios where your memory size is in the same range as your data size, so if you have, say, 256 gigs of memory and your data is about 300 gigs, InnoDB still performs pretty well, but when you start getting to 10x the data compared to your memory size, InnoDB's performance typically drops, so it does impact your performance, and then the question is what performance degradation you can live with and when you need to do something. That's typically when you need to scale, and there are a few examples here of what that could be. Then, when you look at the technologies available in MySQL and MariaDB, there's a difference between doing the sharding in a separate layer and doing the sharding in the application, but the technologies available to do some kind of scaling are replication, which most of you probably know, Galera clustering, and then you get into the sharding technologies, and that's pretty much what I'm going to cover. So let's start with replication. Who has used replication? Actually, who has used MySQL and MariaDB? Who hasn't? That's a better question, everyone has used it, you're in the right talk. So, replication: most of you have used it. Replication was added to MySQL a long, long time ago, almost 15 years ago; it's been there for quite a while, it's been upgraded a bit and changed a bit throughout the years, but the basic technology is the same. You have a master, and everything you ever write, every query that changes data, is logged to a binary log, and then you have slaves that are connected, and they basically ship these log events to their own relay logs and then apply the changes as well. It's totally slave-driven: the master doesn't force the slaves to do anything, the slaves connect and keep track of their own position and so forth. But from a scaling point of view, what's interesting here is, first of all, that this is not synchronous, so the slaves are never, well, there's no way of guaranteeing that a slave is in sync
with the master, so basically you don't know; there's a delay, it might be large, it's kind of undefined. And the second thing to notice is that all nodes have to perform all writes: if you do an update on the master, every single slave will also do the same update, so you cannot scale writes with replication, because every node has to do every write; replication can only be used for read scaling. So that's the basics. If you look at the phases of replication, it's basically three phases for standard replication. When you commit a transaction, at the same time it's written to the binary log on the master; this is actually optional, you can do it in two phases, but if you care about your data integrity you don't want it to be optional, you want it in one phase. There's an option called sync_binlog in MySQL. Who knows that option? No one. So sync_binlog is the option that forces MySQL to synchronize your binary log events with your commits; if this option is not enabled, your binary log is not necessarily synchronized with your transaction log, which means that if your master crashes, you might have transactions in the binary log that were never actually applied on the master. So if you care about your data integrity you should always have sync_binlog on. I'm wondering if the default has changed, Colin, do you know? It's on now, in 5.6 or 5.7? So before 5.7, by default it's off; by default your binary log and your transaction log are not synchronized. So that's phase one, phase two is shipping the changes to your slave, and phase three is applying the changes on your slave, so it's a three-phase thing. Yes, it's the redo log; I'm talking about the InnoDB redo log, which is the crash-recovery log. I mean, this is not a problem if your master never crashes, but if your master crashes, your binary log and your redo log might not be synchronized, which means you have a problem, your data is not the same, basically. If you're using something other than InnoDB, well, typically you already have a problem then; if you use MyISAM there's no integrity at the table level to begin with, so it doesn't make a difference anymore. Alright, so that's replication. There's a thing called semi-synchronous replication, which basically ensures that one slave has received the changes, so phases one and two are made into one phase: when you commit on the master, it waits for a slave to acknowledge that it has received the changes, so it adds a small wait here, but it doesn't change the nature of replication at all. So what about using replication for scaling? You can use replication for scaling; it was actually originally developed for scaling. Replication exists in MySQL because MySQL used to be used heavily on websites, and on websites you typically have a lot more reads than writes, and that's why we developed replication; we had a customer who asked for this feature to be able to scale out read traffic. The idea is that when you have a replication setup like this, you direct all your writes to the master, but the reads can go to any of the slaves. Now, the problem is that the slaves are not necessarily in sync, so you don't necessarily read the latest data, so you need an application where it's okay to read slightly stale data. For example, if you store blogs, it might not be so important that every user gets the latest comments on each and every blog post, so there you can easily read from the slaves even if there's a delay.
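Going back to the durability settings mentioned a moment ago, a minimal my.cnf sketch for the master could look like this (the values shown are the safe ones, not necessarily your defaults):

[mysqld]
log_bin                        = mysql-bin
sync_binlog                    = 1    # sync the binary log with every commit group
innodb_flush_log_at_trx_commit = 1    # flush the InnoDB redo log at every commit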
But let's say you're dealing with monetary transactions; there you don't want delays, you don't want someone to see that their account has more money than it actually has, or something like that. So, good. I calculated some numbers with regard to using replication for read scaling. If you have an application where your read/write ratio is 95% reads, and you then add four slaves and compare that to having just one machine, basically your load goes down to 24% of having just one machine. So it's a fairly good distribution: you went from 100% on one machine to 24% on each machine by adding four slaves. But now the read/write ratio on each machine has dropped to about 80%, so if you add four more slaves, you don't get the same benefit anymore, because the write share is heavier; basically, replication gives incrementally less value per node as you add more nodes. And if you start with something like a 50% read/write ratio and you add four slaves, you still have 60% load on each machine, because you can't scale the writes, so here the benefit is much, much less, and after doing this you're at about 16% reads, so it's useless to add anything more after that. So from a scaling point of view, replication is great if you have a lot of reads; if you don't, it doesn't help much. You can use replication for other things, like having a hot standby, but for scaling it doesn't help. No, it goes down for all of them, because the slaves have to apply all the writes, so that's the problem. And it's not one-to-one, because in the beginning what you shipped was the actual statement: in the binary log, if someone said insert this, you shipped "insert this" and re-executed the statement, so it was exactly the same work on the slave as on the master. Now there's a row-based format for the binary log, so it's a row operation, not statements anymore, so it's not exactly the same operation on the slave, but the difference in time is very small; you still have to execute every single operation on the slaves, and that's the problem. Alright, so that's replication. Any other questions? I mean, 50 is definitely, I would say, a lot, so you probably pay more; you probably want to look into, I mean, first doing some kind of vertical scaling, trying to get as much as you can out of the one box, and after that you have to look at sharding options, because some kind of sharding is pretty much the only way you can scale writes. I think it depends on your use case, and again, you can use the slaves for other things as well: if you have two slaves and you also use them as a backup or something, then it makes more sense too. But one thing to bear in mind here is that with this setup you need a proxy, or your application somehow needs to be aware of the setup, because you need to be able to direct all the writes to the master and all the reads to the slaves; so either you have a middle layer, or your application is aware of the underlying topology.
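To spell out the arithmetic behind those percentages (a back-of-the-envelope model, assuming every node applies every write and reads spread evenly): with a 95/5 read/write mix and one master plus four slaves, each node still does all 5 units of writes plus 95 / 5 = 19 units of reads, that is 24 of the original 100 units of work, and its local mix becomes 19 / 24 ≈ 80% reads. With a 50/50 mix, each node does 50 + 50 / 5 = 60 units, and its local read share is only 10 / 60 ≈ 16%, which is why adding further slaves barely helps.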
And that's kind of the same with Galera cluster. Who here has used Galera cluster? Okay, only a couple of guys. Okay, who has heard of Galera cluster? Yeah, I mean, they mention it in every talk, many of the talks today have covered it too. So Galera is the technology, and you can get it in Percona XtraDB Cluster or MariaDB Galera Cluster, and it's basically a clustering technology where the nodes are kept in sync: when you commit a transaction, before it's actually committed locally, it's certified by all nodes in the cluster, and they apply the transaction at the same time as well, so you get a cluster with all nodes in sync. It's mainly an HA technology, so that you can have several copies of the data. The main difference from replication, besides the fact that it's built in below InnoDB, so it doesn't use the binary log anymore, it uses the wsrep library that basically does the Paxos-style group communication, the main difference is that it's synchronous replication, so you don't have to worry about slave lag, or there are options you can choose so that slave lag doesn't happen, so basically your slave will always be in the same state as the master. That also means that from the cluster's point of view there is no master; all nodes are equal in a Galera cluster, so you can write to any node and you can read from any node. But similarly to replication, every node has all of the data, so every node has to apply every write, so again you don't really get any write scaling; same as with standard replication, it can be used for read scaling, because if you have three copies of the data you can read from any of them. The difference is that you can actually use this for reading any kind of data; you don't have to worry about slave lag anymore. Now, you can actually get a little bit of write scaling from Galera, just because you can relax some of the requirements on your disk: you don't have to worry about durability on a single node, because you have durability across the cluster, so you don't have to care about making sure that the redo log is always flushed on a specific node, because as long as you commit your transaction, it's already in the cluster, it exists on another node. You can do small optimizations like that, but it doesn't provide true write scaling. So it doesn't work for write scaling, but it can give you some benefits, because you can relax some of the durability requirements you'd normally have locally, since you now have multiple nodes. But, same as with replication, you typically need some kind of load balancer in front of Galera to make this work. Alright, any questions about how Galera works? I explained it, but very quickly: mainly an HA technology, but it can be used for read scaling. So if we then look at adding a load balancer: both replication and Galera clustering don't provide a load balancer by themselves, so you need to put something in front, which could be the application, your application could be aware of this, but typically you don't want that in the application, because you want to be able to change the underlying topology of your database cluster without having to change the application every time. What if you add more nodes, what if you remove nodes, do you have to change the application? So typically you want a middle layer that takes care of this, and here's a list of proxies or similar technologies that you can use. I'm going to show a few slides about MaxScale, partly because it's our product, so of course, marketing, and partly because it's fairly good as well, but again, I'm not objective on this. So MaxScale is a proxy we built, which has great features and is very modular and all these things, but from this point of view, the main positive feature of MaxScale
MaxScale is a proxy we built which has great features and is very modular and all those things, but from this point of view the main positive feature is that it's a proxy built specifically for MySQL and MariaDB, so it has built-in monitoring that's specific to the replication and Galera technologies. One thing we have in MaxScale is what we call the read-write split router, which means you don't have to worry about splitting reads from writes in the application — MaxScale does it for you. If you have a replication setup with a master and slaves, all you have to do is send your queries to MaxScale, and based on whether each one is a write or a read it will send it either to the master or to the slaves. That's great, because you don't have to do it in the application. MaxScale also monitors all of the servers: it checks that they're there, but it also checks replication — how far behind the slaves are — and you can create a rule saying "don't use a slave that's too far behind"; because MaxScale monitors the lag, it simply stops sending queries to a slave that lags too far.

MaxScale works with both setups — it's just a proxy, but it has built-in functionality for both. That was the replication part; you can also use it with Galera, and there it's the same: it monitors the state of the servers, because in Galera a node can be up but not synced — if a node crashes and comes back, it needs to synchronize itself from another node, so it can be running without being in sync — and MaxScale monitors the state of the Galera nodes and only sends queries to the ones that are actually there and synced. These are the benefits of having a proxy: the application doesn't have to think about any of this. And MaxScale is fully open source; it's on GitHub.

There's an interesting thing with MaxScale and Galera, though. You can use it to just round-robin queries across the nodes, but one potential drawback of the way Galera is architected is that you can get deadlocks at commit time. I talked a bit about how Galera works: when you send a transaction to a Galera node, all of the transaction processing is local, and only when you commit is it sent over the network. That's great, because there are no network hops between your operations, but the drawback is that the same row might have been changed on a different node at the same time, so you can have a conflict. The good news is that Galera is aware of this: it detects the conflict, and its conflict resolution is to roll back the transaction. So when you issue the commit you might get an error saying the transaction was rolled back due to a conflict, and then you have to redo it or do something else. This only happens if you write to multiple nodes and touch the same rows at the same time, and an easy way to avoid it is to direct all writes to one node — you don't really gain anything from writing to multiple nodes anyway, because every node still has to apply everything. So you can combine Galera Cluster with the read-write split router and just write to one node. That node doesn't really become a master; it's only the "master" because we say so, and it doesn't affect anything.
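To make the setup more concrete, here is roughly what a minimal MaxScale configuration with the read-write split router looks like. This is a sketch from memory: the server names, addresses and credentials are made up, and the exact option spellings have changed a bit between MaxScale versions (for example, newer versions accept password= instead of passwd=, and there is a galeramon monitor module for Galera setups), so check the documentation for the version you run.

    [maxscale]
    threads=4

    [server1]
    type=server
    address=10.0.0.11
    port=3306
    protocol=MySQLBackend

    [server2]
    type=server
    address=10.0.0.12
    port=3306
    protocol=MySQLBackend

    # Replication monitor; for Galera you would use module=galeramon instead
    [Replication-Monitor]
    type=monitor
    module=mysqlmon
    servers=server1,server2
    user=maxmon
    passwd=maxmon_pw

    [RW-Split-Service]
    type=service
    router=readwritesplit
    servers=server1,server2
    user=maxuser
    passwd=maxuser_pw

    [RW-Split-Listener]
    type=listener
    service=RW-Split-Service
    protocol=MySQLClient
    port=4006

The application then connects to port 4006 as if it were a single MySQL server, and MaxScale routes writes to the master and reads to the slaves.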
By using the read-write split router here, you avoid all of that: there can't be any deadlocks or conflicts between the different nodes, because you only send writes to one node, so everything is handled locally — a conflict is handled by local locks, not by aborted transactions. So that's also a good use case for MaxScale. And because Galera is still a synchronous cluster, it doesn't matter which node you write to: if your transaction is committed here, it's also committed on the others, because they're all in sync. You send your transaction here, you commit, and if you get an OK it means it's also on the other servers; so if this node crashes, you just use another one and it's already synced. You don't have to do any failover. With replication, if your master crashes you promote one of the slaves to be the new master, and then all the other slaves have to fail over to the new master, so you have a failover process. In Galera you don't, because they're all in sync all the time.

Can you scale MaxScale — you mean, can you have multiple MaxScales? The answer is yes. The limitation is that they don't communicate with each other, but you can have multiple MaxScales in front of the same underlying database cluster. If a MaxScale crashes, you basically have to start all of those sessions over; there's no session recovery at the moment. So you can put some kind of virtual IP between two MaxScales for HA, but it doesn't really help for scaling: for HA you can have two MaxScales running, one crashes, you use the other, but all the open connections on the crashed MaxScale are lost. All right, so that's basically what you can do for read scaling with replication or Galera clustering.

Yes, that's totally different — I'll get to MySQL Cluster in a moment, because MySQL Cluster is based on sharding, so it's completely different. In Galera Cluster every node has all of the data, which is more like what you'd normally think a cluster is, whereas what used to be called MySQL Cluster is, by the way, the worst product name in the history of MySQL, because no one ever understood what it actually was. I used to be on the team implementing and selling MySQL Cluster, and we always had to explain what it actually was.

As for split brain in Galera: you have multiple copies of the data and no central system, and the problem with any shared-nothing architecture is how to distinguish network failures from node failures. There's no good way of doing it — if the other node is not communicating, you can't tell whether it's because you can't reach it or because it isn't running. So Galera uses a majority vote, and that's why you need three nodes: if you have only two nodes and they can't communicate, both will stop, because neither has a majority. You can have an external arbitrator node in Galera, but that still means a third machine, so you might as well make it a cluster of three; two-node clusters just aren't good, so three nodes is really the minimum. Maybe 80 or even 90 percent of our customers use three nodes; I've seen a few larger. And that's a good point about load as well: to be able to scale the load, if you have only two nodes and each is running above 50%, one node can't cope with all the traffic on its own any more.
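If you want to see that quorum and sync state for yourself, Galera exposes it through status variables. This is a sketch of the handful I'd look at on each node; the variable names are the standard wsrep status variables:

    SHOW GLOBAL STATUS LIKE 'wsrep_cluster_size';         -- how many nodes this node currently sees
    SHOW GLOBAL STATUS LIKE 'wsrep_cluster_status';       -- 'Primary' means the node is in the quorum component
    SHOW GLOBAL STATUS LIKE 'wsrep_local_state_comment';  -- Synced / Donor / Joining, etc.
    SHOW GLOBAL STATUS LIKE 'wsrep_ready';                -- whether the node is accepting queries

A node that has lost quorum reports a non-Primary cluster status and stops accepting writes, which is exactly the behaviour the majority vote is meant to produce.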
All right, so that's read scaling. But if your problem is writes — if you need to scale the writes — neither replication nor Galera helps you, because every node has to apply all of the writes. Then you have to go into sharding, which means you somehow partition your database across multiple instances. As a high-level overview of the options: the best way, in a sense, is to do it in the application logic — have some clear way of dividing your data or your use cases so that different servers serve different parts of the application. If you can do it in the application, that's typically best, because then you control exactly how it's sharded. But in some cases you just can't, for example when you have one big table; there, whether you shard in the application or lower down doesn't really matter, because you still need to be able to put results together from queries against that big table. There are some connectors that offer sharding functionality, like Connector/J together with MySQL Fabric. You can do it at the proxy level: MaxScale has some fairly primitive schema-based sharding, so if you have multiple schemas with the same structure, each server holds only one of the schemas but, from the application's point of view, it looks like one server has all of them — though for the one-big-table case that doesn't help either, because you need to split one table into multiple pieces. You can also use external tools for that, like ScaleArc or something. And there are storage engines that offer sharding built in, like NDB — which is MySQL Cluster; I prefer calling it NDB because it's less confusing — or Spider, and some others.

Peter kept mentioning that sharding equals pain, and there are a lot of reasons not to shard, a lot of disadvantages. Your queries become more complex: if you want to aggregate data across shards, you typically can't do it in one query unless you use a sharding technology at the storage-engine level, like Spider or NDB; at any higher level there's no single query that can give you results across all the shards, so you have to do it in the application layer, and you're no longer solving your problems with SQL. It's harder to manage, because you have multiple servers. High availability is much harder, because you need to make sure every node is highly available. Backups were also mentioned: if you want to back things up, you somehow have to keep them all synchronized. From a data integrity point of view, if a query touches multiple shards and one shard crashes during the operation, how do you make sure you don't end up with three shards that have the change and one that doesn't? You have to deal with all of that too, so pretty much everything becomes more complex. Peter had a 50-minute talk about sharding, and basically all he talked about was how not to shard, because of all this.
He has good points, but I'm not going to go into that here; if you want to know more, look at Peter's talk. So, looking at where you can shard: I mentioned the database level, because there are storage engines that provide sharding. Someone asked about MySQL Cluster, or NDB, and NDB is basically a sharded storage engine — originally it was a sharded key-value store. In NDB you have the MySQL or MariaDB servers on top, and the lower level is basically a sharded key-value store with functionality for fetching values. NDB has sharding built in, based on a hash — well, a mod of a hash — of the primary key, so any table with a primary key is by definition sharded; it's all based on a hash of your primary key, you can't change it, and you basically can't run MySQL Cluster without sharding. It also has built-in HA and all that kind of thing, so it's fairly fancy.

Who here has used MySQL Cluster or NDB? A couple of people. Was it easy to use? No — exactly, it's a lot of trouble. We had customers who tried using it, and for a generic OLTP use case there are a lot of drawbacks, many of them simply because it's sharded — you can't do the same things when everything is sharded. But because it provided a full SQL interface, people would try joins and aggregations and so on, and because everything was sharded those queries took a really long time. I remember a customer with a fairly small table — a couple of gigabytes, not even ten — they put it in MySQL Cluster as the big table, plus a few small tables, on I think four nodes. We ran a query, a join, and it took eight hours to get the results. Since the big table was only a couple of gigs and the others were smaller, we put everything on a single local server, and the same query came back in seven seconds. Eight hours versus seven seconds. At the time MySQL had no join algorithm other than nested loop joins, which work like a for loop: give me the first row of the first table, then the matching rows of the second table, and so on — and every single step is a network hop, because the data is distributed across the network. If you have a table with a few billion rows, that's a few billion network hops, and that's for every table in the join; join seven tables and it's the latency times a few billion, and it starts to show. It was all latency — none of the machines was running at high CPU load. That's the problem with sharding. Some of these issues have been addressed in Cluster since then, so it's not as bad any more, but complex queries simply can't be as efficient if you shard; it's out of the question. So that was NDB; I'm not going to talk much more about it. We still get customer calls — "we have this MySQL Cluster, we have data on it, we set it up four years ago, nobody remembers the commands any more, it just runs" — and typically those are customers trying to figure out how to get rid of it, because nobody knows how it works any more. I guess Oracle gets more of those calls than we do.

Another sharding technology I wanted to mention is Spider. Has anyone heard of Spider? No one.
Spider is built into MariaDB, so if you download MariaDB you actually get Spider. It looks like a storage engine, but it's not really one; it's more of a layer on top of where the data is actually stored. What Spider does is use the partitioning interface. Who knows that MySQL and MariaDB have partitioning? One person. There's a PARTITION BY syntax where you decide how a table should be divided into partitions — by a date value, by lists, whatever; basically how you want to split the table into smaller pieces — and that's been in MySQL since 5.1. Spider lets you use that same partitioning syntax, but instead of local pieces of the table, the partitions live on remote servers. The Spider node itself stores no data; it's basically a relay to where the data is actually stored. This makes it syntactically very easy to split a table up, and you decide completely how it's split, because you use PARTITION BY with any of the built-in partitioning mechanisms, and Spider takes care of distributing the data.

Now, Spider is not exactly an enterprise-grade feature: a lot has to be done manually. For example, you have to make sure the tables exist on the other nodes, you can't reshard easily — it has to be done by hand — and so on. There's a lot that could be improved, but the basic concept works, and in general it gives you write scalability. If I have a Spider table split into three pieces based on customer values, and I send a write to the Spider node, it doesn't do anything itself; it just relays the query to where it should go. With that setup I can do three times as many inserts per second as with one node, or store three times as much data — or both, really — and the overhead of the Spider node is fairly negligible. Spider is also transactional, which is great: it uses XA transactions, so when a transaction touches multiple nodes it's wrapped so it can be rolled back if something goes wrong. Technically the remote table can be anything, so you could use different table types depending on what you're doing with them, and the remote servers aren't Spider-specific at all — everything is done by the Spider node — and you can have multiple Spider nodes attached to the same set of backends, similar to having multiple MaxScale nodes in front of one cluster. So it's an easy way to do sharding, and it's built into MariaDB.

Here's how you do it: on the Spider node I create the table, say PARTITION BY, give the partitioning scheme, and for each partition I say where that partition should live. Then on each of the backend nodes I create the same table, but with no mention of Spider, because those nodes aren't Spider-aware. If I then do an insert on the Spider node, it looks like one insert, but it actually becomes three inserts, and each node gets only its own rows. As for performance: key lookups are really fast, but if you do anything more complex — similar to NDB, once you start doing joins and such — it won't be as fast any more.
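Before we get to the optimizations, here is roughly what the example I just described looks like in SQL. The table, column, server names and credentials are made up for illustration, and the exact COMMENT syntax should be checked against the Spider documentation for your MariaDB version; treat this as a sketch of the shape, not a copy-paste setup.

    -- On the Spider node: define the backends once...
    CREATE SERVER backend1 FOREIGN DATA WRAPPER mysql
      OPTIONS (HOST '10.0.0.21', DATABASE 'shop', USER 'spider', PASSWORD 'secret', PORT 3306);
    -- (repeat for backend2, backend3)

    -- ...then create the Spider table, split by customer value
    CREATE TABLE orders (
      customer_id INT NOT NULL,
      order_id    INT NOT NULL,
      amount      DECIMAL(10,2),
      PRIMARY KEY (customer_id, order_id)
    ) ENGINE=SPIDER COMMENT='wrapper "mysql", table "orders"'
    PARTITION BY RANGE (customer_id) (
      PARTITION p1 VALUES LESS THAN (100000)  COMMENT='srv "backend1"',
      PARTITION p2 VALUES LESS THAN (200000)  COMMENT='srv "backend2"',
      PARTITION p3 VALUES LESS THAN MAXVALUE  COMMENT='srv "backend3"'
    );

    -- On each backend node: the same table, with no mention of Spider
    CREATE TABLE orders (
      customer_id INT NOT NULL,
      order_id    INT NOT NULL,
      amount      DECIMAL(10,2),
      PRIMARY KEY (customer_id, order_id)
    ) ENGINE=InnoDB;

An INSERT against the Spider table is then relayed to whichever backend owns that customer_id range, which is where the write scaling comes from.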
However, Spider does have some optimizations for certain cases. If you have small tables — say an auxiliary table you often need for joins — Spider has built-in functionality for doing those joins locally on the data nodes, instead of on the Spider node, which would mean fetching all the rows first and then joining; it pushes the work down and ships back the result. So there are some optimizations, and writes — inserts — are generally faster because each node is independent. You can scale writes, which is pretty much why you'd use sharding, or Spider, in the first place. Any questions about Spider or sharding?

Spider node failover: basically you have another Spider node with the same setup — similar to MaxScale, two Spider nodes, some kind of VIP in front, and if one fails you switch to the other. But again, I'd stress that Spider is not enterprise-grade. It works, we have customers using it, but we acknowledge it's not an enterprise-grade tool, and I wouldn't recommend it to anyone who doesn't know what they're doing; the customers we have using it have been told this and have a use case where they genuinely need something like it. Is there anything else like it? I don't think so — this is low-level sharding, below the database layer, and typically sharding solutions sit above it; I can't think of another one like this.

All right, to summarize scaling in general: I largely agree with Peter that sharding is a pain and you should avoid it at almost all costs, but of course there are reasons you might need it. Peter had an example of a customer who wanted to shard, and when he started asking questions there was no real reason — they just wanted to shard because they'd heard it was a cool thing to do. We've had similar, even worse: a customer with performance issues, and when we looked at what they were doing, they were sharding across 16 nodes with an external sharding tool, about 10 gigs on each node, and a couple of hundred queries per node. We asked why they were sharding, and the answer was "because we contacted the sharding company and they told us to shard" — well, yes, they'd sold them their sharding technology. We told them the best solution to their problem was actually one node: get rid of all the sharding and put everything on one server, because their performance problems were complex queries that had to aggregate data from all of those nodes. We have these conversations a lot. So you should only shard when you really have to, when there's a really good reason — typically one table, maybe a few, that genuinely demands it. And if you have a high read ratio, remember that replication was built for read scaling, so just use read scaling; don't start with sharding. Any questions? Yes — is Galera a commercial product, or is it free?
Galera is open source, so in that sense it's completely free. The company behind Galera is called Codership — I think they're going to change their name to Galera — and they build just the replication part of it, an underlying library. What you want is a binary with it built in, so you either get Percona XtraDB Cluster or MariaDB Galera Cluster. In MariaDB 10.1, Galera is built into MariaDB itself and is no longer a separate binary; before 10.1 it was a separate binary, but from 10.1 on you just get MariaDB, enable Galera, and it's there. As for Galera versus replication: Galera is great in the sense that it's synchronous, and it's a very good HA technology, but it depends on your load, because it's more complex. Standard replication has a lot of drawbacks — it's more complicated to manage, you have slave lag and all that — but if you have huge amounts of reads and need to scale out to 30 or 40 servers, you wouldn't use Galera. Galera is great for a smaller use case where three nodes are enough, or maybe four or five; if you need to scale massively, it's probably not the right tool. Another thing is that because Galera is synchronous, you don't want the nodes far apart on the network — you want them all in the same data center. Even if you use Galera for HA, I wouldn't stretch it to a remote data center; I'd use standard replication from one Galera cluster to the remote data center instead. So: standard replication for a larger number of nodes, Galera more for HA. Any other questions? If not, thank you very much for listening this long, and have a good conference.

Can everybody hear me? Hopefully. Okay, cool. For those of you who just came in, there are some sheets going around — we're raffling off a little performance database monster at the end, so if you're interested, fill one out, put your name on it, and drop it in the brown bag going around; I'll mention it again at the end and we'll quickly raffle them off. This presentation is "Need for Speed: Best Practices for MySQL Performance Tuning". Before we get started, a little about myself: I've been a DBA for, oh gosh, over 25 years now. I started out with Oracle version 3, if you can imagine that, and I've worked with many other databases throughout my career, most notably MySQL right now — I actually worked with versions 3 and 4 in the early 2000s. SolarWinds makes monitoring products, and last September we came out with a product that now monitors MySQL. This talk isn't about SolarWinds, it's about performance tuning, but I wanted to tell you where I'm from. I've been with SolarWinds going on nine years, and one of my tasks is to work with our prospective customers and customers, looking at problems in their databases and giving them ideas on how to tune. There's always this common deer-in-the-headlights look when they find the SQL to tune: where's the magic button, what's going to fix it, I don't know what to do. That's the reason for this talk. I'll take questions at the end — it's long, and I've got some case studies. The agenda: we're going to talk about the challenges of tuning, who should tune, which SQLs to tune,
and then a technique, or methodology, I've used for years called response time analysis. I feel it's the best way to quickly find the SQLs to focus on, and it also gives you great clues on how to go about tuning them. We'll talk about gathering details about the SQL you're going to tune; we'll look at the explain plan, and how to find the best execution plan. I use SQL diagramming, or query diagramming — anybody familiar with that? Okay, good, we have a few people in here. It's a great way to get clues for finding the best execution plan; it's a little scientific way to think about how to tune without having tools tell you. Then we'll go through some case studies, and at the end we'll talk about monitoring to make sure it stays tuned.

Let's face it, I think SQL tuning is hard, and the reason is that you need expertise in many areas. You need to know how to write good SQL, because a lot of the time when you tune you have to rewrite it. You need to know how to read the explain plan and how MySQL is going to access the data off disk, and find the best way for each SQL statement — it's not always the same way. You also need to know the business reason for the SQL: what's its purpose, who's running it, how often should it run? I can't tell you how many times in my career I've worked with developers who put some SQL into production and were then amazed that it was running millions of times a day when they thought it should run once or twice a day. How many people here are developers, just out of curiosity? Okay. DBAs? "Do about anything" — that's my Twitter tag. A lot of times the business side is really important to know. Years ago I worked for a manufacturing firm, and every month they complained the database was slow. We started looking around, and an accounting clerk was running these horrendous reports that took 8-10 hours to complete. We looked to see who was using them, and nobody was: she had inherited the job after downsizing and was instructed to run these reports and file them; nobody even looked at them, they were obsolete. The quickest way to tune that situation was just to shut them off. It's not always that easy. Tuning also takes time: there are a lot of SQL statements in a database — you saw Peter's presentation, what did he say, a million SQL statements a second or something — there are all sorts of different statements running in MySQL, and not only do you have to find which one to tune to get the biggest bang for your buck, the solution is never the same twice; there's no cookie-cutter approach. Low priority in some companies is another challenge. How many of you have vendor applications? Not too many — good, because a lot of the time they tie your hands and you can't tune them. How many have tried to throw hardware at a problem? Did it work? For a little while, I bet — and I bet you came back to it in the end. Inefficient SQL will eat up resources, and the bigger your data grows, the more it's going
to eat up. So sometimes that's the challenge. Tuning is also never-ending, but that can be a good thing, especially if you become known as a good tuner in the company: people will seek you out, and that's a good feeling, because you know you're affecting something in the company.

Who should tune — that's another challenge. Developers: you'd think they should tune, because they wrote the code and know the application. Often, though, they're not focused on tuning, they're focused on a deadline. You developers out there — how many of you have been given a deadline before you've even been given the functionality that's needed? Yeah, I always love that one: "you want it when?" So they don't have time to focus on tuning, and often they don't have an environment that looks like production, so even looking at a query running in development might show the wrong plan compared to production. At one network company I worked at, the Java developers used Hibernate to generate all their SQL — awful SQL — and they didn't know how to write SQL themselves, so that's a gotcha too. DBAs: you'd think they should tune, because they ought to know how to write good SQL; however, they're often busy keeping the lights on, doing more with fewer people. How many DBAs here are responsible for 50 instances or more? How about more than 100? I've seen DBAs responsible for an awful number of instances, and finding one SQL statement within one database and tuning it is time they just don't have, because they're trying to keep everything up and running. So I think the best thing is a team approach, and I really do think tuning should be treated as a project. I get mad at DBAs who shut their door and tune for an hour and don't tell anybody what they're doing. Call it a project, get the end users involved, get the developers and DBAs together, actually tune it, and show the results to upper management — because tuning your databases is the way to get the biggest bang for your buck with little time and little money, and it's a great way to get resources back.

Which SQL? Finding the SQL to tune can be a problem. How many people have tuned something and nobody noticed? Okay, a few — and you kind of wonder whether you're working on the right things if nobody notices when you tune. I've used these methods in the past to find which SQL to tune. The top one is easy: an end user is complaining, so you trace their session or look at what they're running and work through their SQL, or if a batch job is running long, you look at the queries within it. But if you want to tune holistically, I've used the other methods too. I often sort my queries by IO. I'll look at rows examined versus rows affected or sent and compare them: if a query examines a lot of rows for the results it returns, I know it's inefficient, and I'll tune it to bring that down — you'll see me use this in the case studies as a metric to compare against as I tune. You might want to find your highest CPU-consuming queries, because if you tune those you give resources back for other processing. And then, finally, the method I'm going to talk about here is high response time.
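As a side note, here's roughly the kind of query I mean for the rows-examined method, using the statement digest summary that the performance schema keeps in 5.6 and later. The thresholds are just illustrative; this is a sketch, not the exact query I use.

    SELECT digest_text,
           count_star                                            AS execs,
           sum_rows_examined,
           sum_rows_sent,
           ROUND(sum_rows_examined / GREATEST(sum_rows_sent, 1)) AS examined_per_row_sent
    FROM performance_schema.events_statements_summary_by_digest
    ORDER BY sum_rows_examined DESC
    LIMIT 10;

Statements where examined_per_row_sent is far above 10 are usually the inefficient ones worth a closer look.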
DPA, our product, uses that approach, but basically we're going to look at the performance schema and the information schema and get good information from them — not only which SQL we should tune, but great clues on where to start tuning. What is response time analysis? If you think about it, historically most database tools look at health metrics: statistics in the database, the server, and so on. Those conventional tools often cause finger-pointing — that's one of the monsters we call the Blamesaurus — because they just don't get you to the root cause as quickly as looking at what the end users are actually waiting on. This is my only funny slide, but I put it in because these are the monsters, and I relate to them (there are a few more I could add). You've got the performance hog, the one hogging all the resources so everyone else has to wait because of it. You've got the virtual vandal: if you've virtualized your databases, you don't know what resources you're actually getting, because they're taken from you or moved around on the fly. You've got the time sucka, the time spent trying to find the right SQL to work on, and then the query blocker, who's blocking everybody else in the database while they just sit and wait. The Blamesaurus we talked about, and look at that last one, the dev obstacle — that's most of the DBAs who have taken on bad code and are afraid to put any more into production because they don't want more messes. You can probably think of your own monsters happening in your databases.

Response time versus health metrics: think about how you figure out the fastest way to work. Do you look at your gas gauge, or your oil, maybe your speedometer? How does that help if there's a wreck up ahead, or a slow school bus, or that one stop light you swear slows you down every time? You probably use a navigation tool like Google Maps to see that and route around the roadblock, and find the fastest way. That's how MySQL, and most database engines I've worked with, instrument themselves: with what we call waits, or thread states. As a SQL request goes through the database it performs hundreds or maybe thousands of steps, and the database records not only the time each step took, but also the resources it waited on or used up. If you collect that data at that level of detail, you can quickly see where all the time is being spent for your end users, and you get clues on where to focus your tuning. If a query spends all its time on IO, would you want to add more CPU? You could, but it wouldn't help; you need to know which resources matter and where to focus. Just a show of hands: anybody still on 5.5? Okay, and some earlier versions — that's fine. 5.5 is when the performance schema came out, and you can see down at the bottom it had 17 tables. At the beginning of 5.6 — is everybody on 5.6, or most of you? Okay. Anybody on
5.7 yet? A few, okay. Well, in 5.6 and above they greatly improved the performance schema. It used to be a bit of a performance hit when you accessed it; it's not so much any more, and it gives you more information — 52 tables as of 5.6 — with current and historical events at all levels: the statement level, the stage level, and waits. MySQL has instrumented itself so granularly, down to picoseconds, that to me it's just amazing; it's one of the better-instrumented databases, and you can get all that good detail about what your SQL is doing and what it's waiting on. Beginning with 5.6, the performance schema consumers are turned on by default, so you no longer have to configure all of that, and the default storage engine is now InnoDB instead of MyISAM, which is great, because with transactional data you can now get blocking, locking, and all the good detail about exactly what your queries are doing if you're using InnoDB. In 5.7, if you're not there yet, there are even more improvements: it has 87 tables, and they've greatly reduced the overhead and the footprint, so you can actually pull from it and get good information back. In 5.7.10, which is what I have loaded, there are 1,005 instruments you can turn on — a number of them are on by default — and they've now instrumented transactions, metadata locks, memory usage, and more, so you can really pinpoint what your SQL is doing.

When I say thread states: I don't remember how many there are, but there are hundreds of thread states you can look at, and each one is documented — not only what that state means, but what you can do to reduce the time spent in it. Here's an example, "Sending data", one you see a lot: it occurs when MySQL tends to perform large amounts of disk reads. The documentation says who should resolve it and gives you solutions — look at number five there: compare your rows examined to your rows affected, and if it's more than ten times, you need to tune that SQL statement. The manuals have all of this; what I'm showing is from our tool, but you can go to the manuals and get the same information. That's what you're recording when you query those performance tables.

Here's an example. A developer in our office decided to look at each one of those steps, just to show how granular MySQL's instrumentation is — remember, it instruments the statements, the stages, and the waits. He issued this one command — a SELECT of visibility from a linkdb link table, just a test, passing some parameters, with FOR UPDATE on it — and you can see all the stages and waits it went through: the event time, the stage and statement time, and the actual thread state. I'm not going to walk through every row, but it shows how granular each step is and how much time each one took.
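If you want to reproduce that kind of per-step breakdown yourself, the stage and wait instrumentation has to be collecting. This is a sketch of the idea against the standard performance schema setup tables; which instruments you can enable varies a little by version, so treat it as a starting point.

    -- make sure stage and wait instruments are collected and timed
    UPDATE performance_schema.setup_instruments
       SET enabled = 'YES', timed = 'YES'
     WHERE name LIKE 'stage/%' OR name LIKE 'wait/%';

    UPDATE performance_schema.setup_consumers
       SET enabled = 'YES'
     WHERE name LIKE 'events_stages%' OR name LIKE 'events_waits%';

    -- then look at where recent statements spent their time, stage by stage
    SELECT event_name,
           timer_wait / 1000000000000 AS seconds   -- TIMER_WAIT is in picoseconds
    FROM performance_schema.events_stages_history_long
    ORDER BY timer_wait DESC
    LIMIT 20;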
Back in the example, you can see here a wait taking 11% of the time — wait/io/table/sql/handler — which means it's going to disk; you can see it's doing a fetch. Above that, 17% of the time was spent on statistics, coming from the statistics stage. Knowing that data and concentrating on the expensive steps is a great way to tune, because it quickly points you in the right direction.

How do you get this information? From the performance and information schemas. How many of you know what the information schema and the performance schema are? Okay. The information schema is metadata about all the objects sitting in your MySQL instance: table definitions, constraints, all that good stuff. The performance schema is a real-time database, if you will, that records performance information as it happens. It has current tables where you can see what your threads are doing right this minute, plus history and long-history tables that keep a longer window, so you can get good information from it. I'm only showing the current tables here. From the process list I can get the process ID and join it to the performance schema's threads table, and from there I get user, host, db, the command it's running, and the time. You only want to look at your active sessions — don't pull the idle ones — and you can also see whether the thread is instrumented. You take that threads table and join it to events_waits_current, which gives you not only the event name but, if it's working on an object, the schema and object name — the index name, if it's an index — and the operation it's doing. No, this is just the performance schema within MySQL; and yes, that other one is MySQL Workbench — I'll get to that a little later. Then you can join threads to events_statements_current, which gives you the digest, the SQL text, and — notice — rows affected, rows sent, and rows examined. If I wanted a quick hit I could go to this table, sort by rows examined, bring back rows sent or affected, and do that comparison to get my top SQL statements right there and know where to focus. Because what is the slowest component in a computer? Disk access, yes — physical disk is the slowest, so if you can reduce that, you speed up your queries. From there you can join the threads into the InnoDB transaction table, which is in the information schema; there you get the SQL again, the tables it's using, and whether it's locked or blocking is going on, and that's where you can dip into the InnoDB lock waits table, also in the information schema, to get the blocking transaction ID and the thread behind it. So you can actually see the blocker and the waiter, which is nice if you have locking and blocking. How many people have locking and blocking problems in their databases? Okay — it's nice to be able to see the blocker when you do.
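Pulling those pieces together, here is a sketch of the kind of joins I'm describing. The column lists are trimmed to the interesting ones; on MySQL 5.6/5.7 these tables and columns exist as written, but treat it as an illustration rather than a finished monitoring query.

    -- what each active session is running and waiting on right now
    SELECT t.processlist_id,
           t.processlist_user,
           t.processlist_db,
           s.digest_text,
           s.rows_examined,
           s.rows_sent,
           w.event_name    AS current_wait,
           w.object_schema,
           w.object_name
    FROM performance_schema.threads t
    JOIN performance_schema.events_statements_current s USING (thread_id)
    LEFT JOIN performance_schema.events_waits_current  w USING (thread_id)
    WHERE t.processlist_command <> 'Sleep';

    -- who is blocking whom (InnoDB tables in the information schema, pre-8.0 layout)
    SELECT r.trx_mysql_thread_id AS waiting_thread,
           r.trx_query           AS waiting_query,
           b.trx_mysql_thread_id AS blocking_thread,
           b.trx_query           AS blocking_query
    FROM information_schema.innodb_lock_waits lw
    JOIN information_schema.innodb_trx r ON r.trx_id = lw.requesting_trx_id
    JOIN information_schema.innodb_trx b ON b.trx_id = lw.blocking_trx_id;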
So, if you take these five or six tables — and this is not all the columns by any means; like I said, there are 87 tables in the performance schema now, so if you're not familiar with it, it's worth going and acquainting yourself, because there's a lot more information — and you want to do response time analysis, you can just create a query over these tables, pull it at some interval, put the results into another table, and then rank your SQLs. We do it with a timestamp: we pull this information every second, put it in a table with the timestamp, and then you can sum it up and see not only the total time a statement spent, but how long it spent in each thread state — so you get the expensive thread states and start tuning those.

Here's a quick screen of how response time analysis can show you this graphically — and I actually tuned this one. It's a chart of a 24-hour period; the different colors are different SQL statements, and you can see this olive-green one that spent an awfully long time. From 10 to 11 a.m. it spent 35 minutes out of the 63 minutes of activity of the top 15 SQL statements — pretty hefty, 57% of the time — so if I tuned that, I could give back up to 57% to my top 15. In that hour it executed 1,050 times, averaging a little over two minutes per execution. Pretty sluggish. That's my top SQL, and I see it really quickly just by collecting into that one table. I looked at it and tuned it, and response time analysis can also show you very quickly how well you've tuned. Here it is after tuning: now 12 minutes total across 5,000 executions instead of a thousand, so not only did I get it down to nine seconds per execution instead of two minutes, I increased the throughput five times. Great things to brag about to upper management — I'm doing my job.

Another one, quickly: blocking and locking. If you have locking issues and you go to those InnoDB transaction tables, you can get the blocker, how long they blocked, who they blocked, and what they were running while they were blocking, as well as the waiter. In this example you can see one session caused 235 seconds of wait; looking at its details, it was updating STOCK while these other sessions were all trying to SELECT ... FOR UPDATE from STOCK, so they had to wait. It gives you a clue where to focus if you have blocking and locking issues.

Here's a scenario for you — which is worse? SQL statement number one executed a thousand times, caused 10 minutes of wait for end users, and spent 90% of its time waiting on Sending data. SQL statement number two executed once, caused 10 minutes of wait, and spent 90% of its time on a system lock. Which one's worse? Anybody care to guess? You answered it correctly: from the end user's point of view it doesn't matter — they waited 10 minutes either way. But developers might see number two as the harder one, because it could be a concurrency problem, as opposed to sending data, where you might just restructure the query a little. So we've got that SQL — we've found the one we should be working on with response time analysis. What do we do with it?
Well, the first thing I do is gather baseline metrics: how long does it take now, and what's acceptable? Especially if you're going to be tuning a lot, talk to your end users. If something runs ten minutes and they want it down to a minute or two, that's probably achievable; but if you've got an 8-10 hour process and they want it sub-second, it may be more than a tuning issue — it may be a design issue. Set your expectations and tune only to that; don't tune to the nth degree, because you'll burn yourself out, back yourself into a corner, and never want to tune again. I always gather rows examined and rows sent or affected as well, because I want a metric to compare against as I tune: if I'm tuning a query with a million rows examined and when I'm done it's at 10 million, I don't think I tuned it right — I want that number reduced. You want some metric; I like that one, and there are others you can choose. Also gather the thread states: locking, blocking, system lock, IO, sending data, calculating statistics (that's the statistics state), writing to net if the network is slow — some SQL may have more than one issue. What I usually do is start with the thread state where the most wait time is being spent and tune that first, because the thread states often change as you keep tuning; and remember that the different thread states have different resolutions.

Once you've documented your metrics, you want the execution plan, and MySQL gives you several ways to get it. You can do a simple EXPLAIN, or EXPLAIN EXTENDED — one I like, because it tells me how MySQL transforms the query after I've entered it, which is a neat way to see where you could improve how you write it. Optimizer trace is available in 5.6.3 and above, and there's MySQL Workbench; I've got examples of all of these. Let me back up a little and go through how you get an explain plan: you just put EXPLAIN in front of your SQL statement in the session. Here I've loaded sakila, the little DVD rental store database — I think Percona hosts a copy you can play with. The query selects the title, first name, and last name from film, joins it to film_actor on film ID, then joins that to actor on actor ID, and looks for last name LIKE '%PACK%' or first name LIKE '%ANGELINA%' — a nonsensical query, but you can see I used aliases. EXPLAIN output in MySQL is pretty easy to read: you read it top-down. The first thing it does is go into actor, which I aliased as "a" — and that's something I'd avoid in MySQL if you don't have to alias; sometimes you can't avoid it, because you access the same table twice, but if you alias, EXPLAIN doesn't repeat the table name, and with lots of tables that gets hard to read. That's the only reason for the circle there. You can see the first step is a simple access into actor; it lists the primary key as a possible key but decides to use the idx_extra_full_name index, and you can see the key length and the estimated number of rows it's going to
examine, and that it's filtering with the WHERE clause. Then it dips into film_actor, decides to use the primary key, returns 13 rows there using the index, and then dips into film, again on the primary key, returning one row. So you can see how that reads.

The next cheat sheet is a little dated, but I like this chart and I've actually printed it out, because it's a great way to see all the pieces of the explain plan. This column identifies all the output columns and what they mean, and over here it gives the definitions for the type column — the data access method — ordered from most selective to least selective: the top one is system, then const, and the bottom one is a full table scan. It also shows you what can appear in the Extra column and what it all means. If there's a place to post the slides, I'll post them so you can have it.

Here's an example of extended explain. All you do is run EXPLAIN EXTENDED — you can see I'm not using aliases this time, which makes it a little nicer to read; same plan, I won't go through it — and at the end you run SHOW WARNINGS, which comes back with what MySQL transformed the query into. If you have implicit data conversions, or you did a SELECT *, it translates that into what it will actually execute; notice it even put the database name in there although I didn't. It's a neat way to learn to write better SQL: why make it transform the query when you could write it that way in the first place?

Optimizer tracing — anybody use this? I thought this one was pretty interesting; it gives you all kinds of good information. The way you use it is you set optimizer_trace='enabled=on', run your SQL statement, and then set it off; the trace is stored in the information schema table called OPTIMIZER_TRACE, so you select from it. It spits out a lot of information: it will tell you every join order it considered, all the filtering predicates, all the cost information it weighed. It's quite wordy, and you can see the thing I've circled down here, missing bytes beyond max mem size: by default it only keeps 16 KB in that table, so a query with lots of joins fills it up right away and not everything is reported. You can change that — there are a couple of settings. Notice this is how you turn it off, optimizer_trace='enabled=off', and up here you can see SET optimizer_trace_max_mem_size: you can raise it if you want to see the whole trace, just realize it's kept in memory, so you're using memory, and in production you might want to be careful, especially with a huge query. You can also set optimizer_trace_features so the greedy search part is off, which makes it less verbose: it won't go through all the join orders it has to consider, so you won't see those, but it still gives you a decent picture of the plan.
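Put together, the sequence looks roughly like this. The statement being traced is just a placeholder for whatever query you're investigating, and the memory size is an arbitrary example value.

    -- see how MySQL rewrites the query before planning it
    EXPLAIN EXTENDED SELECT ...your query... ;
    SHOW WARNINGS;

    -- trace the optimizer's decisions for one statement in this session
    SET optimizer_trace = 'enabled=on';
    SET optimizer_trace_max_mem_size = 1048576;   -- default is small (~16 KB), so raise it if the trace gets cut off

    SELECT ...your query... ;

    SELECT trace, missing_bytes_beyond_max_mem_size
    FROM information_schema.optimizer_trace;

    SET optimizer_trace = 'enabled=off';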
In fact, I couldn't fit the whole trace here on one slide, but you can see the row estimates and the costs for all the tables and how it decided on the plan it chose. If you ever get an execution plan that you think MySQL is stuck on, I would trace it to see why, because this is very useful information.

MySQL Workbench is another good one. If you haven't used Workbench, download it — it's a neat tool. It's where you can do ERDs really quickly, which I'm a great believer in: diagram the tables in your queries, because you'll see their relationships. I can't tell you how many systems I've worked on where the developer didn't know the full relationships — they got the right answer, so they stopped there — but you need to know the full relationships between the tables, and declaring them gives the optimizer much more information to build better plans. This is the same query in Workbench; you can view the plan in tabular or visual form, and what I like is that it highlights the expensive steps — the ones marked red are expensive — and it explains, for example, that the cost can be high, especially for large indexes.

Okay, so we've looked at the execution plans and we know the expensive steps — what do we do with that? Well, go get the table definitions. First of all, is it really a table, or is it a view? You might have to look at the view first and tune the view. I like to use a little utility called mysqlshow with the --status option: you can give it the database name, or get more specific and give it a table, or even database, table, and column. I just gave it the database here, and it prints out the tables, the number of rows it thinks each has, the average row length, the data length, the index length, and the data free — a good, quick way to see how many rows you're working with in each table. I also think it's important to examine the columns in the WHERE clause: know their cardinality, whether there are indexes on them, and, if an index is multi-column, what the left leading column is — MySQL will only use a multi-column index if that left leading column appears in your WHERE clause; without it, the index won't be used. Also know whether there's data skew, because that can produce different plans at different times depending on the values you pass. And know the cardinality of the indexes themselves — how selective they are — and their sizes, because MySQL looks at the size of an index and sometimes makes silly decisions; I think I have a case study where it used an index on a column I didn't even reference, and the only explanation I could come up with was the size of the index.

So we've got all that good information, and we haven't tuned anything yet. I like to use case studies to show how to tune — I think it brings it home, and we'll get to SQL diagramming. The first case study was a real customer who called in and asked us to help: a university, and they were trying to answer the question
of who registered yesterday, for their billing system, because they billed their students as they signed up. I changed it a little, and all the timings here are from my little laptop. It actually came from an Oracle database, but I transferred the data to see if I could reproduce it in MySQL, and it behaved exactly the same way. So we're answering "who registered yesterday for SQL Tuning": we select the student's first name, last name, and the registration signup date; we join student to registration on student ID, join that to class on class ID, and look for class name equal to 'SQL Tuning', the registration signup date equal to yesterday, if you will, and cancelled equal to 'N' in the registration table. That table was historical in nature — they never deleted from it; if a student dropped or cancelled a class, they just flagged it with a 'Y' — so it was a very large table. When I ran this on my laptop: 9,320 executions an hour, average execution time 9 seconds; look at the rows examined compared to the rows I got back, and look at what it was spending all its time on — sending data, 99% of its time. What I actually did was use mysqlslap — does anybody use that? It's a utility on Linux where you say "I want this many connections, iterate over this query this many times", and you give it a query or a script. I put it in a loop and kept changing the date, so the date walked through the last year of my data set; it was never running the same day over and over, and that's why there were zero rows — nobody had registered on the day I looked at.

Looking at the response time data, you can see it's spending all of its time on sending data — there's also a little "waiting for table metadata lock" that I'll get to. It was spending almost 10 seconds per execution, dipping down a little; you can see the throughput, the executions per hour, going up and down; and finally the rows sent. Like I said, the date changed each day and there were zero rows — nothing registered — until around the point where it was spending more time, almost 13 seconds on average (we're looking at hourly data): throughput went down, but it actually started returning data, an actual result set. So you can see it was running pretty sluggishly. We get the execution plan, and MySQL Workbench says there's a very high-cost step here: a full table scan on student, almost 9,000 records; then nested loops into registration using the primary key, getting four rows; then nested loops into class, one row. And of course: very high cost, large tables, "no usable index found on this table" — it's telling me all of that. But how do I know how to tune it from this, other than the hint about a very high cost on large tables?

This is where SQL diagramming comes in. If you don't know SQL diagramming — I've probably sold a lot of books for this man, because I put him in all my talks — I really believe in diagramming your queries, and it's a great book to download and use; he not only covers this, he's a great tuning reference in general.
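For reference, the statement being tuned looks roughly like this. The table and column names are my reconstruction from the description above, not the customer's actual schema, and the date handling is simplified:

    SELECT s.first_name, s.last_name, r.signup_date
    FROM student s
    JOIN registration r ON r.student_id = s.student_id
    JOIN class c        ON c.class_id   = r.class_id
    WHERE c.name         = 'SQL Tuning'
      AND r.signup_date >= CURDATE() - INTERVAL 1 DAY
      AND r.signup_date <  CURDATE()
      AND r.cancelled    = 'N';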
Basically, what SQL Diagramming is, is that you take the tables in your query and draw them as an upside-down tree. You take your detail table as the top, in our case registration, and then you draw links, or arrows, out to your lookup tables, your master tables if you will. So if you remember our query, the detail table was registration, with a lookup into student and a lookup into class. The little numbers here on the arrows are basically your join conditions. For this relationship we're saying that for every one student there are probably about five courses in registration for the year; that's probably a decent load at a university for a semester. And for every one class there are probably about 30 students, probably not realistic for a university, but stay with me. The point of the join numbers is that for every one student you'd never have to read more than about 150 records out of registration; if I'm reading a million, I've got a problem. The little numbers with the underline are your filtering criteria. If you remember, we had two filters on registration: where the registration signup date was yesterday, and where registration cancelled equals 'N'. Out of that we had 4,228 records out of 79,981 records in the whole table, so roughly 5% of registration would have to be read with that filter. Our other filter was on class, where class name equals 'SQL Tuning'; we had two records come back, probably different levels of tuning, out of roughly 1,000, so about 0.2% of class would be read with that filter. Which is the better, more selective filter? Class, right. So we want to drive this query by going to class first, because, I don't know if I said it already, when you're tuning queries you want to read the least amount of data first and build upon it, as opposed to reading all the data and throwing most of it away at the end. Isn't that a waste? So we want to find the driving table, and I think class is our driving table. Let's go look at our ERD, like I like to do, and find our relationships.
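If you want to verify those underlined filter numbers rather than eyeballing them, a few counts give you the selectivities quoted above: 4,228 of 79,981 rows in registration, about 5%, and 2 of roughly 1,000 rows in class, about 0.2%. This uses the same assumed schema as the earlier sketch.

    -- Rows passing the registration filters vs. the whole table (~5% here)
    SELECT COUNT(*) FROM registration
    WHERE cancelled = 'N'
      AND signup_date >= CURDATE() - INTERVAL 1 DAY
      AND signup_date <  CURDATE();
    SELECT COUNT(*) FROM registration;

    -- Rows passing the class filter vs. the whole table (~0.2% here)
    SELECT COUNT(*) FROM class WHERE class_name = 'SQL Tuning';
    SELECT COUNT(*) FROM class;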
And you know what? No relationships. Right there is a big red spot. Remember I said I got this from Oracle; well, I transported the data over from Oracle and forgot to bring the keys with it. I don't know how many of you use constraints, or do not use constraints would be the better question, but just remember: if you don't use constraints, you're tying the optimizer's hands, because constraints give it more information for building execution plans. So I had forgotten to add the constraints. When I showed the indexes you could see what I have: my primary key in registration is on student ID, class ID, and signup date, I have a primary key on student ID in student, and a primary key on class. So let's go add those foreign keys (the statements are sketched below). Here you can see I'm doing ALTER TABLE registration ADD FOREIGN KEY on student ID, I didn't give it a name so it will create its own, referencing student's student ID, and I do the same thing for class ID referencing class. Then I make sure they're actually there this time by selecting from information_schema.key_column_usage; you can do this to see your constraints. I'm looking for table schema equals csu and table name equals registration, and you can see I get not only the table name and column name but also the constraint name it generated automatically and what the referenced columns are. Let's go add the index. I'm going to add it on class name, because remember, I want to drive the query by class, so I create an index on class name and rerun the explain. You can see now I'm driving it very efficiently: class name gives two rows, nested into registration with 38, nested into student with one row, much less than the almost 9,000 it was reading the first time I did it. So did I improve it? Well, here we go. As you can see, it was running really sluggish here, 13, upwards of even 22 seconds at the top, and at the end you can see that little metadata lock; that's when I added the foreign keys, the little purple marker, I don't know if you can see that very well. But notice that when I added that index on class name, during that one hour, after that it just went through the roof: 86,000 executions per hour, and I think I was doing 9,000 before, at 432 milliseconds per execution. Remember it was reading something like three-quarters of a billion rows before with no rows returned; now it's 2.7 million examined and 1.1 million returned, so that ratio is pretty good. Usually you don't want the examined-to-returned ratio to be more than 10, and ideally way less than 10. It's still spending most of its time on sending data, but look how little that is now, just milliseconds. So that's a great way to see that we've tuned it. We're running out of time, so let's make sure we get to the second one; it's a little more meaty. Like I said, I downloaded the sakila database, which is a DVD rental store, and then I created these queries; they're kind of contrived. What I have is a master-slave environment: on the master I had all these customers renting and paying overdue payments, all this good inserting and updating type of stuff, and the slave was doing all the reporting. This one happens to be on the slave, and we're looking for overdue DVD rentals for customers for the last month, because the sales reps needed to call them and say, come on, return the DVDs, will you please. So this select statement is looking at customer last name, first name, and phone number, and getting the film title and the rental date.
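Before we dig into that one, here is a sketch of the fixes just applied to the registration example: the two foreign keys, the information_schema check, and the index on class name. The constraint and index names and the exact column names are my reconstruction, not necessarily what was typed in the demo.

    -- Add the missing foreign keys (MySQL generates the constraint names)
    ALTER TABLE registration ADD FOREIGN KEY (student_id) REFERENCES student (student_id);
    ALTER TABLE registration ADD FOREIGN KEY (class_id)   REFERENCES class (class_id);

    -- Confirm the constraints are there
    SELECT table_name, column_name, constraint_name,
           referenced_table_name, referenced_column_name
    FROM information_schema.key_column_usage
    WHERE table_schema = 'csu'
      AND table_name   = 'registration'
      AND referenced_table_name IS NOT NULL;

    -- Index so the optimizer can drive from class, then rerun EXPLAIN on the query
    CREATE INDEX class_name_idx ON class (class_name);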
It goes from rental, dipping into customer on customer ID, then it joins to address on address ID, then it joins to inventory on inventory ID, and then it joins to film. It's looking for rental return date IS NULL, which means the DVD is still out, and on the rental date it actually does an interval with the film's rental duration, looking at the last 31 days to see if it's still out there; then it filters on customer last name, which gets passed in, and it orders by rental date descending (the full query is sketched below). When we ran this it was taking on average two minutes per execution. Again, look at my rows examined: 424 million, so about half a billion rows examined, with 59 million returned, still pretty sloppy with all the rows being read, and of course it's on sending data. You'll see that sending data is the state where, if you have I/O-intensive queries, most of the time will go; now, I have a really poor VM environment where the SAN is horrible, and all my queries run on that, so it may not look the same for yours. This is the response time data, and this chart probably looks familiar; this is the one I tuned, and that was the query I was running. Notice the actual executions, two minutes each, taking a lot of time; it was the top one in my database. We go get the explain plan again, and here you can see all the good stuff. It first dips into film, doesn't use any keys, and reads all the records, a full table scan, so 3,679 records. Then it dips into inventory using that idx_fk_film_id foreign-key index and gets one record, then it dips into rental and gets 49 records, using a WHERE, and then it dips into customer on its primary key, as well as address on its primary key, returning one record. You can see the transformation there; I always think that's easy and good to look at. Let's diagram it to see how we should really be driving that query. Here we've got five tables: our detail table is rental, with a lookup into inventory and into customer; inventory has a lookup into film, and customer has a lookup into address. We're saying for every one film there are three in inventory, for every inventory there are about 94 rentals, for every customer there's one address, and for every customer there are 91 records in rental. So again you can do the math and figure out how many records you're going to join. In all actuality, if you make sure your join columns are indexed, left-leading however you've indexed them, and make sure it can use that index, MySQL is going to do the right thing; you don't need to work out the join numbers, just make sure they're indexed. What really matters, and in fact when I do SQL Diagramming I don't even bother with the join numbers, I just go for the filters, is that you're trying to find the driving table, and the filters are going to do that. We had two filters, one on rental and one on customer, and customer was more selective: 0.2% on customer. The way I worked that out, since I was doing a LIKE, was to take the average count for a last name; I counted them just to see what the average would be, and there were about 32 similar names in there, so that came to 0.2%, versus the 7% of rental you get looking for the return date. So I want to drive with customer. Let's go get the ERD, and lo and behold, there is an index on customer.
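For reference, against the stock sakila schema the overdue-rentals query described above comes out roughly like this; the exact date arithmetic and the LIKE on last name are my reconstruction of what the demo was doing, so treat them as assumptions.

    -- Overdue DVD rentals from roughly the last month, per the description above
    SELECT c.last_name, c.first_name, a.phone, f.title, r.rental_date
    FROM rental r
    JOIN customer  c ON c.customer_id  = r.customer_id
    JOIN address   a ON a.address_id   = c.address_id
    JOIN inventory i ON i.inventory_id = r.inventory_id
    JOIN film      f ON f.film_id      = i.film_id
    WHERE r.return_date IS NULL                                       -- still out
      AND r.rental_date + INTERVAL f.rental_duration DAY < CURDATE()  -- past its due date
      AND r.rental_date >= CURDATE() - INTERVAL 31 DAY                -- rented within the last month
      AND c.last_name LIKE '%SMITH%'                                  -- the "sloppy" parameterized filter
    ORDER BY r.rental_date DESC;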
Well, why isn't it using it? Correct: sloppy coding, sloppy coding. Because of the leading percent sign, it can't do the B-tree traversal; it doesn't know where to start, so it can't use the index and has to do a full table scan. There are other ways to fix this: if you have to use that leading percent, you might concatenate a field, or use a multi-column index and maybe put rental date in it. You can't lead with return date, because we're looking for IS NULL, and that won't help there unless you combine it with another column. But I know this is just me being a sloppy coder: in my little load routine I put a percent sign around the parameter and passed different names into it, so I fixed it just by taking the leading percent out (the before-and-after is sketched below). When you do that, you see it then uses the index, and instead of that 4,000 records or so, it only looked at the 2 it needed. So did we tune it? This one was kind of funny, because we did tune it and I kind of forgot that I had it iterating a thousand times; I sped it up so much that it finished, so you can see I missed a couple of hours because it wasn't running anymore. It did pretty well, because it went through the whole load. Last one, and I know we're running out of time; this one is just more meaty, mainly to show you how to diagram. I did 11 tables here: a weekly sales report by category and by region. Again you can see all 11 tables, and it was very sluggish, taking 5 minutes per execution, 110 executions, when I ran it without doing anything to it. This is where it got kind of funny: it dips into film using this index idx_fk_language_id, and I didn't even reference that column, so I don't know why it decided to use that index, other than maybe the size of the index, I don't know. Then it dips into film_category and all that, and as you can see, it's again reading all the tables and then throwing the rows away at the end. Look at the number of rows examined, 233 million, while the rows affected was 2,000, so again very inefficient. This is where I was going: why isn't it using the primary key? It seems like that would be a better way to do it, since it's more selective on cardinality; the index it chose only had two values in it, so maybe that was why, and maybe it assumed the values were equally distributed when in all actuality there was data skew. So probably not a good reason to pick that one. Here's the diagram of that query, just to show you how to do it on something more complex. Again, notice I don't put any join numbers there, I just go for the filters, and I'm saying staff has the best filter, so we're going to go put an index on staff. When we do that, you can see it rearranged things a little; it still had to go through rental and look at some of that information, but it really did improve it. The throughput went down to around 940 milliseconds per execution, most of it on sending data, and look at our rows examined: instead of 233 million we're doing 15 million now, and we're affecting many more rows because we increased the throughput. So finally, monitor is the last step: prove that your tuning made a difference, brag about yourself because no one else will, and keep looking for the next tuning opportunity, working on the right SQL to tune.
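To make that leading-wildcard point concrete, here is the before-and-after against sakila's customer table. idx_last_name is the index stock sakila ships on last_name; the behavior described in the comments is the typical outcome, not necessarily the exact plan from the demo.

    -- Leading wildcard: no starting point in the B-tree, so a full scan of customer
    EXPLAIN SELECT * FROM customer WHERE last_name LIKE '%SMITH%';

    -- Trailing wildcard only: a range scan on idx_last_name touching just the matching rows
    EXPLAIN SELECT * FROM customer WHERE last_name LIKE 'SMITH%';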
And a shameless pitch for our product: download it, we've got a two-week free trial. There's a summary slide I'm not going to go through, because I know we've got the raffle. But if you want to stop by our booth, it's 531, I've got this little infographic, kind of a neat little poster with 12 steps of MySQL tuning; it's just a nice reminder, and you can pick one up for free, or there's a place you can download it. And if you're a cat lover, there's a 12-step program for cats too. Thank you, guys.