Good morning and welcome. I think this is the last presentation of the morning, right? After that we will have lunch, so I'm the only thing standing between you and your lunch. You can see the title of today's presentation: we're talking about database as a service, but not just a database as a single instance. We're talking about a database cluster as a service in OpenStack. Let's look at the objectives, and I hope you will find information that is interesting for you.
First of all, my name is Ivan Zoratti. I'm the CTO at SkySQL. SkySQL is the company behind, well, I think the most interesting brand is MariaDB. Who knows MariaDB, by the way? Okay, not many, but quite a few. We are basically an alternative, for services and products, to the standard Oracle MySQL. And we were founded by the core developers of the original MySQL, meaning Monty Widenius and David Axmark.
But let's talk about the objectives for today. First of all, I will give you a brief introduction to where we stand today with MySQL, MariaDB and Percona, which are of course the three most interesting distributions of MySQL. Then we will talk about the integration into OpenStack, and we will touch a little bit on other cloud services, AWS and others.
So let's start with MySQL in the cloud. Some of you are familiar with MariaDB. Who knows Percona Server? Who has used Percona Server or is using it? Not many. What about MySQL? Okay, a few more. Right, so we are at pretty much 80 or 90% if we sum up the three. As I said, you can see three columns here. They refer to the three different distributions of MySQL available nowadays. I apologize to those of you who know MariaDB or Percona very well, because these are not very detailed bullets, but you will see that they pretty much match the reality. These are, let's say, the main elements.
MySQL, of course, now belongs to Oracle, acquired together with Sun Microsystems in 2010. The trademark, the code, the documentation and all the bugs are available on the various MySQL websites under the MySQL domain, which is still owned by Oracle. The current GA release is 5.6. The majority of installations right now, I believe, are on 5.5. There are still a few on 5.1 and earlier releases. 5.6 is catching up, I would say. It's a very good release, and there are very good reasons to upgrade. 5.7 is currently under development. There are developer milestone releases available. They're definitely not feature complete and they're still under heavy development. So it's good to start testing them, but it's not a good idea to put them into production.
MariaDB is a project started by Monty Widenius. As I said, Monty is one of the original founders and creators of MySQL. MariaDB is based on a branch of Oracle MySQL 5.5. That's the current GA version, with some add-ons from Percona and other key contributors. We also have a new version in development, and actually, as of today, this slide is already outdated, because 10.0.5 is out starting tonight. So we have the first beta release that is a real fork of Oracle MySQL. It's no longer a branch, and it has add-ons and new storage engines that I'm going to explain in a little more detail on another slide.
On the other side, we have Percona Server. In this case, the latest release is 5.6 GA. It's a branch of Oracle MySQL, again with add-ons that come from MariaDB and from other key contributors. So all in all, you can see that there is quite a strong interaction and collaboration between the three. Now, there are lots of features, and many people are pretty confused about which version and which distribution to use.
And I definitely don't blame them, because there are really lots of features that are similar, or at least are claimed to be similar. Then you look into the details and there are quite significant differences.
Anyway, just to be a little more detailed on Oracle MySQL 5.6: we can see great improvements in terms of scalability, and that's really great news. There are new features with plug-ins, and new tools for analysis, performance review, et cetera. There are definitely great features in InnoDB, which is the transactional engine for MySQL. Another interesting feature is in MySQL replication, with the global transaction ID, which was a long-awaited feature for MySQL. It's probably not quite right in 5.6, and we are looking at 5.7 for a really good implementation of the global transaction ID. Plus there are many utilities, and more online DDL operations, which I would say has been a big problem for MySQL since the early days compared to other databases like Postgres. And there are subquery optimizations and other optimizations.
Then you see, and I know the font is a little small, that in red there are some paid features. These are not open source. In the left column, they are offered as features if you buy a subscription from Oracle: the thread pool, PAM authentication, and the audit plug-in.
MariaDB, on the other hand, is still based on 5.5, and that's the GA version. So it's a little bit behind from this point of view. But there has been a lot of work on subquery optimization. So if you have quite complex queries in your SQL, you can really benefit from that. Let's put it this way: the original optimizer team at MySQL, the original MySQL company, is now working in the MariaDB team. So of course there is quite a lot of effort there, and there are very good results from this point of view.
That's why you can see here that index condition pushdown, batched key access, etc. have been significantly improved in MariaDB. The other interesting point for this release is the extension of collaborations with other companies. So there are more storage engines. And the storage engines give you the best tool for what you need to do: for example, full-text indexing, or a kind of sharding for scalability, or a more specific request like column-oriented storage and so on. Another point that I think is quite important is group commit, which is another interesting feature that was missing for a long time. It's been in MariaDB 5.5 for about, I would say, 18 months now.
On the right-hand side here, you have Percona Server 5.6 GA. 5.6, of course, is a port of Oracle MySQL 5.6, plus a definitely great addition, which is the evolution of the InnoDB storage engine called XtraDB. So basically, with XtraDB 5.6, you have all the great features and improvements that you can see on the InnoDB side in Oracle, plus extra bits that have been added by the Percona team.
Okay, moving on. On this slide, we have what's coming: what is not GA yet, but what you should expect in the next releases. You can see that with MySQL we have 5.7, and with MariaDB we have MariaDB 10. So there is a significant difference. With MySQL, there is a continuation from 5.6 to 5.7, where I would say Oracle has fixed, or is fixing, some of the issues that were present in 5.6. And of course, there are also great new improvements. MariaDB 10, as I said, is a fork. So in this case, we deviate from the original trunk and we have something more ambitious, again based on the collaboration with other companies and other developers outside the core MariaDB team.
So there are new storage engines, like the Cassandra engine, which integrates MySQL with Cassandra, and the CONNECT engine, which can connect all sorts of different files and other databases to MySQL. You can think of it as connectivity to standard file systems and files, or to other databases through ODBC connections, JDBC connections, etc. That's the idea of the CONNECT engine: interoperability between databases.
So all in all, as you can see, there are of course lots of other choices out there in terms of databases. But with MySQL there are still significant improvements, and the database is definitely alive and is doing incredibly well in terms of new features, improved performance, scalability, etc. I don't have anything on the right side from Percona, and the reason is that Percona has just released 5.6, and that has been a significant effort for the Percona team. We are waiting, of course, for the next move and for what we can add here. Right, and the question is: what's beyond 5.6? I think we'll see, right? Great.
Okay. When we talk about MySQL, it's always been a matter of high availability, performance and ease of use. That's not just MySQL, to be honest. It applies to any other database as well. But we're still working on these three specific aspects.
In terms of high availability, let's see what is available with the different options. On MySQL, the standard availability is based on MySQL replication. Who's familiar with MySQL replication? Okay, right. DRBD is another good choice. It gives you a typical active/passive environment based on the file system and the storage underneath. And then there's a specific shared-storage availability option based on Oracle Enterprise Linux.
So I would say that's a little bit of a niche, considering the adoption of Oracle Enterprise Linux.
On the MariaDB side, we still have MySQL replication. But here we have added something more. We have added what we call MHA, the Master High Availability tool, which controls the interaction between the master and the slaves. In case of a fault on the master, it either helps the database administrators or works automatically to move the master role from the failed master to one of the designated slaves. The result is fewer errors and fewer possible issues, because you, as a database administrator, may do something wrong or may miss an operation that should be done. We also added Pacemaker, with the idea of having a resource agent that can control the whole operation, integrated with MHA.
But probably one of the most interesting aspects, and the one most relevant to what we are going to discuss now with OpenStack, is MariaDB Galera Cluster. In that case, we have synchronous replication, so no longer asynchronous or semi-synchronous replication. This technology is being developed by a small company in Finland called Codership. MariaDB, like Percona with a similar product called Percona XtraDB Cluster, is now available with this synchronous replication. And then there is also support from the different companies for DRBD and shared storage.
On the Percona side, similarly, we have MySQL replication with what is called the Percona Replication Manager. So as you can see, there are similar offerings. The key point here is that there is also synchronous replication on these two specific distributions.
For performance and scalability, there is improved scalability related to InnoDB, as I already said, and a few other bits and pieces, like group commit, et cetera.
I will not go into the details of these aspects, because I think we don't have much time to cover everything else. But you will have the slides, and if you have any specific question, you can stop by after the presentation or you can always drop me an email.
For ease of use, again, there are utilities and tools around. Some of them are from the original MySQL. Others have been added. As you can see, the red ones are commercial. We have, of course, XtraBackup, which covers one of the most important aspects, especially when you use InnoDB and transactional environments. XtraBackup is available for MariaDB and Percona. And the Percona Toolkit is, of course, one of the most interesting tools that you can see. And now we have something new that I'm going to discuss a little bit, which is called MariaDB Manager. You will see it in a few minutes.
Now, what do we have in terms of availability of MySQL in the cloud today? What is available in terms of database as a service? Rackspace provides relatively old versions of MySQL. If you go to Rackspace and you want to use a MySQL server, what you get is MySQL Community 5.1. Otherwise, the other option is to use a standard server and just bring your own database, your favorite one. It might be Percona, MariaDB, or MySQL, and there you go. So that's one option. You have, of course, limitations related to the instances and the size of the instances you can have, but that's pretty much it.
On HP Cloud, you have something similar. But in this case, the database is a little more up to date, so you can have 5.5 and 5.6. You typically have Percona in this case. And you can manage a single database instance, typically using a Rackspace API. That means that you can integrate and orchestrate the provisioning and the availability of the database directly with other tools.
The standard servers are where you can bring your own database. In that case, of course, you have more choices in terms of size, RAM, et cetera.
If you compare what is available in typical OpenStack-based clouds with what AWS does, well, RDS is pretty advanced from this point of view. They provide different versions, 5.1 to 5.6. They can provide Provisioned IOPS and flash storage with MySQL. And they already provide automatic backups, storage replication, and MySQL replication, all combined together. If you want to start a server, you can do that. Next, you may say, OK, I want to connect another server with MySQL replication, and there you go. And then you can continue to add more. The tuning is limited, though, which is a good and a bad thing at the same time. If you are not familiar with MySQL, or the database is really not your primary job, maybe it's a good thing not to mess with in-depth tuning. On the other hand, if you are pretty good at that, you may want more control, and you can't have it with RDS. So the other option, of course, is to have a standard server: you just fire up another instance in EC2, select something from the marketplace if you want, or again, bring your own database version. And you have a similar range of sizes for the instances that you may want to use.
Now, let's see what we can have with OpenStack with regard to MySQL. First of all, we have to look at two aspects. One is the fact that MySQL can be used as a repository for OpenStack, so it is integrated into OpenStack. I'm not referring to the use of the database by the clients, by the users. I'm referring to the use of the database within OpenStack as software. That's the alternative to SQLite, of course. The reasons are pretty obvious. You have increased portability. You have high availability using MySQL. But you also have some other issues.
Because once you include something that is not just a set of files but a set of running processes, you have to take care of it, and you increase the complexity of your infrastructure. So this is mostly used for larger infrastructures. And as I said, availability is the key factor.
The three options that we usually see with OpenStack start with standard MySQL replication, which is still a very, very good option. Yes, there are some flaws. Yes, there may be problems. But all in all, it's probably the most robust technology that you can use compared to Galera or DRBD. Still, DRBD and Galera are the two other options, and they are definitely interesting.
When we talk about MySQL replication, the definition would be: you use a MySQL server. You may co-locate it with one of the nodes, such as the controller node, or you may have it completely separate. And you create a cluster of one master and at least two slaves. OK, the minimum might be one master and one slave. But with only one master and one slave, you can run into issues whenever you need a third node for administration purposes. So probably the best solution is always to have this environment with three machines. We're not talking about resource-hungry machines here, because the database is, of course, very important, but it is not heavily used compared to other uses of MySQL in the usual environments.
Definitely the best thing to do is to attach MHA, the Master High Availability tool, to that. Why? Because it's just a set of scripts, fairly lightweight, that run together with MySQL replication. In case of a fault, they simplify and, let's say, optimize the failover from one server to another.
Another important point is that with MySQL replication you can also directly implement a disaster recovery environment. By using the very same technology, you can also replicate to other data centers.
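As a minimal sketch of the kind of decision a failover tool automates in a one-master, two-slave setup: when the master fails, pick a reachable, designated slave and repoint the others to it. This is not MHA's actual algorithm or configuration. The `Slave` fields, the "most advanced position wins" rule, and the function names are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slave:
    name: str
    reachable: bool
    applied_position: int   # hypothetical: how far this slave has applied the master's log
    candidate: bool = True  # only designated slaves may be promoted


def pick_new_master(slaves: List[Slave]) -> Optional[Slave]:
    """Return the reachable, designated slave that has applied the most changes."""
    eligible = [s for s in slaves if s.reachable and s.candidate]
    if not eligible:
        return None
    return max(eligible, key=lambda s: s.applied_position)


def failover_plan(slaves: List[Slave]) -> List[str]:
    """List the steps an administrator (or a script) would carry out, in order."""
    new_master = pick_new_master(slaves)
    if new_master is None:
        return []
    steps = [f"promote {new_master.name}"]
    for s in slaves:
        if s is not new_master and s.reachable:
            steps.append(f"repoint {s.name} -> {new_master.name}")
    return steps


if __name__ == "__main__":
    cluster = [Slave("s1", True, 120), Slave("s2", True, 150), Slave("s3", False, 200)]
    for step in failover_plan(cluster):
        print(step)
```

Writing the steps down as an explicit plan is the point the talk makes about reducing operator error: the order of operations is fixed in code rather than remembered under pressure.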
And in this case, of course, you have a completely recoverable system even in case of a full disaster in a data center. What is still not great is that when you have a failover inside the data center, you have serious problems with this disaster recovery setup, because you need not only to move the master there, but also to reconnect the systems here. With MySQL 5.5, without the global transaction ID, that is a serious problem. So in case of a fault internal to a data center, you may have some issues. That's why 5.6 is definitely a better solution here, and I would recommend that you consider automating this only if you are going to 5.6, not with previous versions.
With DRBD, you have something very basic, very simple, but still working very well: a typical active/passive environment. In this case, you don't have shared storage, but you have synchronous replication of the block devices between two servers. The result is that when you have a fault on one of the servers, you can switch over, or have an automatic failover, to the other server. You can still create replication from one data center to another for disaster recovery. And that can be done in a cleaner way compared to the standard MySQL replication that I showed you before. Here it works flawlessly, because there is no possibility of messing things up with different servers running at the same time: with DRBD, you have only one server running at any one time.
Galera Cluster, on the other hand, is probably one of the most interesting options. The first point is that this is really all active. There is no longer any difference between masters and slaves, or between actives and passives. The systems are all running at the same time. Yes, there are some issues, strictly speaking.
And it's usually recommended to send read and write operations to one server only. But in a typical configuration with three nodes, for example, all the other servers are running at the same time. If you hit one of them and you write data, that data is synchronously available on all the other nodes. That, of course, means that you have fewer issues if there is a glitch in your communication, where you see one of these servers as down when in reality it may be a network problem or something else. So by using Galera, you have a more stable and more reliable environment in terms of high availability. Even in conditions that are not optimal for high availability, you can still have a very consistent database, which is an area where other technologies suffer quite a lot.
Again, you can attach replication to Galera in order to move from one data center to another. It's not recommended to use Galera itself to replicate between data centers. There are some examples, but it's something that needs to be carefully reviewed because of the latency involved, since we are talking about synchronous replication here. The laws of physics always rule, and you must consider that when you introduce remote data centers.
So these are the three options. There may be others. But I would say that if you are considering installing and running a fairly large OpenStack environment, one of these options will probably work for you. So far, I would say the most recommended one is Galera, because of the availability, as I said, and because there is less risk, generally speaking, in running it in a high availability environment. There may be situations where other solutions are better.
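The reason a three-node configuration copes with a network glitch can be illustrated with a simple majority rule. This is a simplified sketch that assumes every node has an equal vote. It is not Galera's implementation, and the function names are hypothetical.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """A group of nodes may keep accepting writes only if it is a strict majority."""
    return reachable_nodes * 2 > cluster_size


def tolerated_failures(cluster_size: int) -> int:
    """How many nodes can be lost (or cut off) while a majority still remains."""
    return (cluster_size - 1) // 2


if __name__ == "__main__":
    for size in (2, 3, 5):
        print(size, "nodes tolerate", tolerated_failures(size), "failure(s)")
```

Under this rule a two-node cluster cannot lose either node without losing its majority, while a three-node cluster can lose one, which is one way to read the preference for three nodes in the talk.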
But based on my experience, and on the experience we have in the organization I work for, this is definitely one of the best options.
Now let's talk about MySQL as a service. In this case, we refer to the fact that MySQL is available to our users when they run applications within OpenStack. Starting from provisioning, one of the key aspects is that we must have an API available for provisioning. That's needed for interoperation. And that's something that we already have within OpenStack with the Trove project, which you may have seen among the new projects here at the summit this fall. That's one very good aspect. I think there are other things that absolutely must be considered, and I'm going to discuss them in a minute. For end users, a GUI-assisted version may also be very good, because in that case they can provision as we have seen with HP Cloud, Rackspace, or AWS. They basically have a self-service operation.
The key element, though, is the last bullet that you can see on the slide: from the server to the cluster. If you look at the deployment of MySQL today, how many MySQL servers that run serious applications, I mean applications that must stay up 24/7 because of the service they provide, are just a single machine? Not many, I think. They always require high availability. They always require some sort of scalability in order to handle the load during peak hours, for example. And yes, of course, you can provision two servers, but then you have to do these operations manually. Yes, there are some recipes and tools that can help you set everything up in the best possible way. But then technology evolves, you have new versions or new products, these recipes break, and you need new ones. So there is a lot of manual operation around that, and that is one of the key elements.
Another important point is what I call the false promise of elasticity. Everything is elastic in the cloud. We all know that. That's why we are here. But guess what? The least elastic element is the database. Well, maybe there are others, but at least from our point of view, that is a big issue. We talk about something that is great in the cloud: if we need more power, we just fire up new instances, and when we don't need them, we shut them down and destroy them. You can't do that with the database, right? Or you can, but how hard is it? It's really difficult.
The typical use with MySQL is standard MySQL replication for read scalability, which is good. OK, it's limited because it only scales reads, but it covers many of the requirements of typical web or mobile applications. Sharding is another option. But how can you just fire up another MySQL slave, or 10 more MySQL slaves, when you need them? You need to replicate. You need to copy all this data, and then it must be available. It's not simple at all. So that's quite an important aspect to consider.
And it's even more true for sharding. If you need to re-shard your environment, that is not easy. There are technologies out there that address that. They just say, OK, if I need more storage, I will add a new shard. But the compromise is that they need to deal with the sharding keys and the way the data is distributed, and work with chunks that sometimes do not balance well enough. So there is always a trade-off between what you can and cannot do. What I'm referring to here is, of course, MySQL, but many of the same rules apply to NoSQL technologies as well. It's not just a MySQL issue. That's why you can see so many companies around trying to fix this problem. But the problem, let's face it, has not been fixed yet. And it's really, really difficult.
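A small sketch shows why re-sharding is costly under one common and deliberately naive scheme, where the shard is the hash of the key modulo the number of shards. The scheme, the key format, and the function names are assumptions for illustration, not something described in the talk.

```python
import hashlib
from typing import Iterable


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard with a stable hash (md5 is used only for distribution)."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % shard_count


def fraction_moved(keys: Iterable[str], old_count: int, new_count: int) -> float:
    """Fraction of keys whose shard changes when the shard count changes."""
    keys = list(keys)
    moved = sum(1 for k in keys if shard_for(k, old_count) != shard_for(k, new_count))
    return moved / len(keys)


if __name__ == "__main__":
    sample = [f"user:{i}" for i in range(10_000)]
    print(f"{fraction_moved(sample, 4, 5):.0%} of keys move when going from 4 to 5 shards")
```

With modulo placement, going from 4 to 5 shards relocates roughly four keys out of five, so adding one shard means copying most of the data. Schemes that move less data exist, but they bring the key-distribution and chunk-balancing compromises mentioned above.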
It's probably one of the most difficult aspects to face.
Another point that we must consider is: are we running a database as a service, or is it better to work with a platform as a service? Meaning, do we really want to have the database separate from our applications and the way we set them up, or is it better to say that the database is part of the application, and simply release the whole environment with the database in it? I am a database man, so of course I always think of what is on the left side there. But the reality is that if you are an application person, you look at the other side. And combining these two aspects is another problem. PaaS vendors, of course, rely on this. But again, there is a big issue, because the way you can scale the database and make it elastic really depends on how the application has been built and what it allows you to do. So it's not just a matter of adding some MySQL service and the job is done. The application really needs to use this MySQL service, and it must be designed to do that.
Here is another aspect. Not a problem in this case, just the way things can be handled when it comes to storage. We have two options, block storage and object storage. Of course, the best thing to do is to use the cheaper kind of storage when we don't need the full capabilities of block storage. So for example, as Trove does, and as we also see with Percona and MariaDB installations, Swift is definitely the best solution for full or incremental backups of our database. Once you have done that, you have cheaper storage and you can also move your backups easily. You can always use Cinder to manage your tablespaces within the database. That is another interesting aspect, because with 5.6 and the new versions, you can basically move these tablespaces.
So you can take what you have within Cinder and copy it or move it from one instance to another, then reattach it and run MySQL. When you can use an API to integrate this aspect, you have something that works very well for high availability and also for scalability. Let's say that we have taken another step towards that ideal elasticity that I was talking about, because you can take one of these tablespaces, replicate it, and then continue to run your system. At the same time, you can also increase the size of your database cluster.
Another important aspect that you have to consider is networking, and the way you handle networking within your environment, within OpenStack. Here we have two main points. On one side, you have an application that has to rely on a cluster, while the application is usually written for a single server. What you need to do is put something in between, a proxy system or a load balancer, that can help you identify the best server to use down there. The other option is to have this proxy as part of the PaaS. In that case, you have fewer hops, and the proxy and the application work together to identify which database and which database server to use. Now, we wouldn't need this if we had a more advanced set of connectors, but unfortunately the current connectors are not designed to work in this way. That's another area that MariaDB and others are working on, because with clients that are not up to date with what you can do in a cloud environment, you have the problems with elasticity that I mentioned before.
Security is another aspect. In OpenStack, we have Keystone, and we usually have integration with LDAP. And that's a great thing. But what about the services that we offer inside?
Yes, we can have integration with LDAP, or we can have integration with Keystone. That would probably be ideal, because in this case we don't create duplicate authentication within the database and externally. So one way to look at that is to have an integration with the security plugins that are available in MySQL, or, again, to use a proxy that hosts the security plugin and integrates it with LDAP or Keystone and the database.
Very briefly, because I can see we are running out of time: we have added something to the MariaDB server which is quite important. It's called MariaDB Manager. Let me skip through these slides a little. We have something that is already available with Trove in OpenStack, but with Trove, we have it for a single server. What we are going to do with MariaDB Manager is to provide a RESTful API and monitoring, connected to agents, to deploy and control a cluster of servers. That's the idea. Initially, it is based on Galera. So we have this solution starting with a Galera cluster, but we will evolve it to include MySQL replication and other technologies as well. Typical block storage is, of course, another important aspect for the cloud. Interestingly enough, this is available not only in OpenStack, but also on premises and on AWS. But the first deployment is on OpenStack, as you can imagine.
So for example, if you want to provision a node, there is a set of APIs that we need to define. The first thing to do is to create a node. The second is to open a connection to that node and check what's on it. The third is to probe the state of the node and see whether it's able to accept the software. And the final one is the provisioning of the software. That is all available today.
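The four provisioning steps just listed (create, connect, probe, provision) can be sketched as an ordered workflow. The class below is an in-memory stand-in written for illustration only. It is not the MariaDB Manager API, and every name in it is hypothetical.

```python
from enum import Enum
from typing import Dict


class NodeState(Enum):
    CREATED = "created"
    CONNECTED = "connected"
    PROBED = "probed"
    PROVISIONED = "provisioned"


class FakeNodeApi:
    """Hypothetical stand-in for a provisioning API; it only tracks node state."""

    def __init__(self) -> None:
        self.nodes: Dict[str, NodeState] = {}

    def _advance(self, name: str, expected: NodeState, new: NodeState) -> None:
        # Each step is only valid if the previous step has completed.
        if self.nodes.get(name) is not expected:
            raise RuntimeError(f"{name}: expected state {expected.value}")
        self.nodes[name] = new

    def create_node(self, name: str) -> None:
        self.nodes[name] = NodeState.CREATED

    def connect(self, name: str) -> None:
        self._advance(name, NodeState.CREATED, NodeState.CONNECTED)

    def probe(self, name: str) -> None:
        self._advance(name, NodeState.CONNECTED, NodeState.PROBED)

    def provision(self, name: str) -> None:
        self._advance(name, NodeState.PROBED, NodeState.PROVISIONED)


def provision_node(api: FakeNodeApi, name: str) -> NodeState:
    """Run the four steps in the order described in the talk."""
    api.create_node(name)
    api.connect(name)
    api.probe(name)
    api.provision(name)
    return api.nodes[name]


if __name__ == "__main__":
    print(provision_node(FakeNodeApi(), "db1").value)
```

Enforcing the order in the client is a design choice of this sketch: software cannot be installed on a node that has not been probed, which mirrors the "check that the node can accept the software" step.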
And you can use this API to run these operations within OpenStack, or add it to your own tools if you need that. Then you can start a node, and again there is another API call that can start it. Or you can use the GUI to do that. I will skip most of these slides, but all these aspects are available. They're open source, and they are, of course, available for you. Let's say that the only non-open-source part may be the extensions that we add in the future. But right now, everything is available as is. You can also monitor the databases, of course.
And here is what's coming. It will be part of MariaDB 10. It will be integrated with other proxy servers that will cover the aspects that I mentioned before. And there is, of course, a project to integrate this with what is currently available in Trove. So the idea is to provide the cluster side: basically, the evolution of Trove from the server to the cluster. That's one of the aspects that we're going to add.
Anyway, you can find this information on MariaDB.org and MariaDB.com. Despite the name, the content there relates not only to MariaDB, but to MySQL and all the others in general. The slides will be available. So you're free to take pictures, of course, but you can also review them later. I'm sorry, we are running out of time, and that's unfortunately a problem. But I'm available, so if you have any questions, I can take them.