Hello everyone. Today we'll talk about how you can take your database from running in a container to a really powerful, fully open source database as a service powered by Kubernetes. Let me start with a bit of history, or at least what my experience has been with infrastructure and open source software. In the early days of open source software, the early days of Linux as I remember them, things were quite tricky. You would have to download the source code, find the proper compiler version, maybe apply some patches, compile lots of stuff yourself, and so on and so forth. Since then we have seen a never-ending move towards simplicity, as we try to be as efficient as possible, and it is of course the simple things which drive such efficiency.
If you look at the path we've taken, it went from downloading the sources, patching and compiling, to tar.gz binaries with an install script. That simplified installation greatly, but it created a problem of dependencies: you might install the binary, but the system would be missing some libraries, or would have incompatible libraries, and it would not work properly. With that, we ended up getting packages with dependencies, such as DEB and RPM, and then simplifying package download and installation with APT and YUM repositories. The problem that remained was having multiple conflicting versions of software at the same time. Many software packages would share the same paths, so you could not easily install multiple versions of the software side by side. With that, yet another simplification developed: technologies such as Docker and Snap, which allow you to avoid those problems.
In this presentation we will focus on Docker, as I think it is by far the most common software of its kind. So what about running a database in Docker? How and why? Well, there are actually two different kinds of environments where you can consider running your database in Docker: test/dev and production. In test and dev, using Docker to run your database is very common and very convenient. It is wonderful because you can create many instances of any database which are isolated from each other. For example, if you want to test your application against different database versions, you can deploy multiple of them on the same computer very conveniently, without having to deal with a complicated installation of multiple versions at a time. If you look at a conventional MySQL installation, for example, in most Linux distributions you cannot easily have multiple versions installed at the same time, but you can easily have that in Docker, which can be very handy. You can also use Docker Compose to simplify the deployment of your database or your whole application.
Now, if you look at production, there are some problems and challenges. First, there are fears of overhead. I think they're mostly unfounded at this point, but early versions of Docker could in some cases cause significant overhead, which was too much to pay in production. The second problem, of course, is the extra complexity you have to deal with. You would very likely need to make sure, for example, to store data in separate data volumes and be very careful not to lose your database by dropping the container, which I think is something that bit a number of people when they started to use Docker. Also, if you think about monitoring and observability tools, many of them initially lacked proper Docker support, which adds additional overhead.
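To make the test/dev scenario above concrete, here is a minimal sketch in Python that only builds the `docker run` command lines for several MySQL versions side by side, each with its own named data volume so that removing a container does not remove the data. The helper names, the container naming scheme, the starting port and the placeholder password are assumptions made for illustration; the image name, data directory and `MYSQL_ROOT_PASSWORD` variable follow the official `mysql` image.

```python
import shlex
from typing import List


def docker_run_command(version: str, host_port: int,
                       root_password: str = "example") -> List[str]:
    """Build a `docker run` command for one isolated MySQL instance.

    The container name and volume name are derived from the version
    (a hypothetical naming scheme chosen for this sketch).
    """
    name = "mysql-" + version.replace(".", "-")
    return [
        "docker", "run", "--detach",
        "--name", name,
        # Each version listens on its own host port.
        "--publish", f"{host_port}:3306",
        # A named volume keeps the data outside the container's lifecycle.
        "--volume", f"{name}-data:/var/lib/mysql",
        "--env", f"MYSQL_ROOT_PASSWORD={root_password}",
        f"mysql:{version}",
    ]


def commands_for_versions(versions: List[str],
                          first_port: int = 3307) -> List[List[str]]:
    """One command per version, with consecutive host ports."""
    return [docker_run_command(v, first_port + i)
            for i, v in enumerate(versions)]


if __name__ == "__main__":
    # Print the commands instead of executing them.
    for cmd in commands_for_versions(["5.7", "8.0"]):
        print(shlex.join(cmd))
```

The sketch prints commands rather than running them, so it can be reviewed before anything is started; in practice the same layout could equally be expressed as a Docker Compose file.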
Now, if you think about the state of open source databases and Docker, you'll find that most open source databases have official Docker images out there. With that, they are very commonly deployed for test and dev, but have relatively limited use in production. I see very few people, in fact, saying, hey, you know what, we just use Docker to run our database in production. Typically, either you are running your databases on VMs or bare metal, or you go all the way to Kubernetes, which we'll talk about a little bit later.
If you look at Percona, we also provide some solutions in this space. We provide Docker images for MySQL and MongoDB, and Percona provides an enhanced, enterprise-grade distribution which is free and open source. Basically what that means is that we provide many features which are otherwise only available in the enterprise editions from the respective vendors, but as an open source edition. Let me clarify something here. Percona strives to provide open source solutions, but MongoDB itself changed its license to a source-available license a couple of years ago. So our Percona Server for MongoDB is, because of that, also source available rather than truly open source, as we would prefer.
Now, what is the unsolved problem with Docker? To a large extent it is day-two operations. Docker makes it very easy to provision a database. It doesn't really help very much with managing high availability, upgrades and so on and so forth. Yes, of course you can go ahead and tear down your database and provision a new one, but that is not really how we operate databases at scale. Typically in production we have clusters, or single servers, provisioned, and then we do maintenance, upgrades and so on. We want to do that in a rolling way, so that the cluster remains highly available even though some of its nodes go down for maintenance.
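The rolling maintenance idea mentioned above can be sketched in a few lines of Python. This is a simulation only, assuming a cluster represented as a dictionary of node name to version; the function name, node names and version strings are hypothetical, and a real rollout would also wait for health checks and replication to catch up before moving to the next node.

```python
from typing import Dict, List


def rolling_upgrade(nodes: Dict[str, str], target: str,
                    min_available: int) -> List[str]:
    """Upgrade nodes one at a time and return a log of the steps.

    `nodes` maps node name -> current version and is updated in place.
    At most one node is taken out of service at any moment, so the
    upgrade is refused if that would drop below `min_available`.
    """
    if len(nodes) - 1 < min_available:
        raise ValueError(
            "cannot take a node down without dropping below min_available")
    log = []
    for name in sorted(nodes):
        if nodes[name] == target:
            continue  # already on the target version, nothing to do
        still_serving = len(nodes) - 1
        log.append(f"drain {name} ({still_serving} nodes still serving)")
        nodes[name] = target  # simulated upgrade of the drained node
        log.append(f"{name} back online on {target}")
    return log


if __name__ == "__main__":
    cluster = {"db-0": "8.0.32", "db-1": "8.0.32", "db-2": "8.0.32"}
    for line in rolling_upgrade(cluster, "8.0.36", min_available=2):
        print(line)
```

The point of the sketch is the ordering: one node leaves, is upgraded and returns before the next one is touched, which is exactly the bookkeeping that plain Docker leaves to you.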
Docker itself is a single-node solution, so it doesn't really deal with clusters. And while there was a solution, Docker Swarm, it did not really get much traction and does not really offer a very good way of managing databases. What has emerged as the solution instead is Kubernetes.
Now, when I talk about databases and Kubernetes, some people are surprised, though I would say fewer and fewer people are surprised every year. Kubernetes and databases do have a kind of complicated relationship, because Kubernetes was designed for stateless applications, and a database is, well, very much about state. The database is typically where you store your state so that your application can be stateless. The rallying cry for Kubernetes was: write your applications in a stateless way and run them on Kubernetes. So running a database on Kubernetes sounds almost like anathema.
Well, things have improved and changed in recent years. As the Kubernetes ecosystem matured, the understanding came that there are a number of stateful applications which you want to run on Kubernetes, and databases are an important part of that. There is actually a very active Data on Kubernetes community, which is focused exactly on that problem and has been growing rapidly and having a lot of success. If you look at this tweet, even a few years ago some of the very experienced people in Kubernetes were not very excited about running stateful workloads on Kubernetes. But I think things are changing, and I wonder what Kelsey would say right now, three years after he posted this tweet. In reality, we see a lot of Kubernetes adoption with database technology. In fact, there is a pretty large number of public database-as-a-service solutions which are powered by Kubernetes.
These include Cockroach Cloud, InfluxDB, PlanetScale, DataStax Astra and Altinity Cloud, and I also heard that EnterpriseDB launched their public cloud offering, which is also powered by Kubernetes. What that means is that we are talking about tens or probably even hundreds of thousands of database nodes running in Kubernetes in production right now. So it is possible, and it works.
Now, what is so exciting about Kubernetes? Well, I think about Kubernetes as an operating system, but for your data center. And I think in this regard it is taking a path somewhat similar to the one Linux took. If you remember the early Linux days, it was not a very good operating system. I remember when Linux did not support files of more than two gigabytes in size, or was limited to four gigabytes, because it was a 32-bit system for 32-bit CPUs. People who worked with real big iron were laughing at Linux. But guess what? It improved to become pretty much ubiquitous, the standard operating system we work with. I think Kubernetes is going the same way, providing us the same experience for the data center.
Because distributed systems are different, the API and the mental concepts Kubernetes provides are different from what we got used to with Linux, so it can be hard for some people to get a grip on at first. But over time it becomes wonderful. I was explaining to some people that Kubernetes is like beer. When you try beer at first, it tastes bitter. But then you develop an acquired taste and start loving it. The great thing about Kubernetes is that it has very robust mechanics for dealing with node failure, which is very important for a large-scale distributed system.
What is also very helpful for databases specifically is that newer versions of the Kubernetes ecosystem have the operator framework, which really allows for automating a lot of complex database operation tasks: upgrades, various maintenance operations, self-healing from various kinds of failures. All of those are not trivial and require some work, but they can all be implemented in the operator framework.
Now, the support of Kubernetes by database vendors is kind of interesting, in that it has had a much slower pickup by the vendors themselves. I think this is because many of them want to steer you to their managed hosting service. If you look at how Oracle, for example, wants you to consume MySQL in the cloud, well, guess what? You just run it in Oracle Cloud. The same goes for folks like MariaDB, MongoDB with MongoDB Atlas, and so on and so forth. As a result, there are actually many third-party Kubernetes solutions being developed. You will often find multiple Kubernetes operators for MySQL, for Postgres, and so on.
Those Kubernetes solutions come in two forms: either Helm charts or operator packages. Helm charts are something which just helps with installation, whereas operators are typically more advanced and provide full life-cycle management. In terms of Percona, we have operators for MySQL, MongoDB and Postgres, which you can install directly, or you can install the operator through a Helm chart. As I already mentioned, that provides improved enterprise-grade features. But I think our operators go much further than that and provide many features which are more robust, or simply non-existent, upstream. So for example, if you really want a very robust operator to run MySQL, there is really no match for Percona's operator in terms of functionality, other than, to some extent, Vitess.
The other thing which I love about operators is that they allow you to have software-defined infrastructure, infrastructure as code. This, for example, is how you can define the cluster configuration. It tells you everything you want to deploy: what kind of storage, how many nodes of each type. By applying this as a YAML file, you always get the same cluster. And if you want to change what you deploy, you change the YAML file, and you can put it in version control. I think this infrastructure-as-code concept is supported by operators uniquely well.
So what are the unsolved problems with Kubernetes? Well, I think there are a couple of problems here. Running a business-critical stateful application on Kubernetes is not really easy, because there are many moving parts in Kubernetes. You have to figure out how to properly set up your storage so that it is robust, has no single point of failure, is performant, is secure, and so on and so forth. You'll need to set up backups and all this kind of stuff, which is not easy for non-Kubernetes experts.
Another thing which may not be easy is the API. As I mentioned, the Kubernetes API is different: it is declarative rather than action-driven. If you are somebody who got used to the Linux approach, that often goes as step-by-step actions. I download the package, I install it, I wait for the installation to complete, then I go ahead and, let's say, create the MySQL users which I'm going to use, all this kind of stuff. That is not how you really work with Kubernetes. Kubernetes is declarative, and in many cases it is asynchronous. You submit the deployment, saying, hey, create this new environment for me, and Kubernetes does its work, and eventually you get that environment provisioned in the state you wanted.
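The declarative model described above is usually explained as a reconciliation loop: you record the desired state, and a controller keeps comparing it with what actually exists and makes one small change at a time until the two match. The following Python sketch simulates that idea in memory. It is not the Kubernetes API or any operator's actual implementation; the `ClusterSpec` and `ClusterState` types and their fields are assumptions invented for this illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ClusterSpec:
    """Desired state, as it might be written in a YAML file under version control."""
    nodes: int
    version: str


@dataclass
class ClusterState:
    """Actual state observed in the (simulated) environment: one version per node."""
    versions: List[str] = field(default_factory=list)


def reconcile_once(spec: ClusterSpec, state: ClusterState) -> str:
    """One control-loop step: make at most one change towards the spec."""
    if len(state.versions) < spec.nodes:
        state.versions.append(spec.version)
        return "added node"
    if len(state.versions) > spec.nodes:
        state.versions.pop()
        return "removed node"
    for i, version in enumerate(state.versions):
        if version != spec.version:
            state.versions[i] = spec.version
            return f"upgraded node {i}"
    return "in sync"


def reconcile(spec: ClusterSpec, state: ClusterState,
              max_steps: int = 100) -> List[str]:
    """Repeat single steps until the actual state matches the desired state."""
    actions = []
    for _ in range(max_steps):
        action = reconcile_once(spec, state)
        if action == "in sync":
            break
        actions.append(action)
    return actions


if __name__ == "__main__":
    state = ClusterState()
    print(reconcile(ClusterSpec(nodes=3, version="8.0"), state))
    # Changing only the declared version triggers a node-by-node upgrade.
    print(reconcile(ClusterSpec(nodes=3, version="8.4"), state))
```

Notice that the caller never says "install" or "upgrade". It only changes the declared spec, and the same loop works out which actions are needed, which is why the result arrives eventually rather than as the return value of a single command.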
This declarative approach can be slightly unexpected and confusing for people, and it requires a different mental model. Now, if you want to learn how to use Kubernetes with, let's say, MySQL, I created this blog post, which includes a very good tutorial for Minikube and takes you through the most important operations with the cluster.
Okay, at the start of this presentation we talked about simplicity. So let's think about what the state of the art in simplicity is for a database, especially a database in the cloud. Not surprisingly, it is database as a service. When you think about database as a service, we really have multiple solutions, which all tend to be proprietary. First, all major clouds, and actually a number of secondary clouds these days, have their own proprietary database as a service. What I mean in this case is that whether the database itself is open source, as with Amazon RDS for MySQL, or proprietary, as with Amazon Aurora or Google Spanner, in the end it is wrapped in proprietary management code and API, which offers you a very different experience compared to the open source component alone. That is why I call it proprietary: if you move away from the cloud and say, hey, I just want to run it on-prem now, well, guess what? You will not be able to use the same nice GUI to provision the database as you can with Amazon RDS, for example.
Database vendors also tend to have their own proprietary services. Think about MongoDB Atlas, SkySQL, Cockroach Cloud and so on and so forth. All of them do the same thing, just from their own side. Then there is also a new generation popping up of multi-vendor, multi-cloud solutions such as Aiven and Instaclustr, which again have a proprietary management layer over an open source database. Those companies will often state how wonderful open source is and so on and so forth.
But in the end, their solution is still partly proprietary, especially in terms of the management layer. Database as a service is actually fantastic. It provides a lot of benefits, starting with removing toil, which stands for activities that are not particularly productive for your system: managing availability, database patching, backups, performance tuning. Database as a service makes the database easy to scale. In most cases you can just say, hey, go to a larger instance size. Scaling by credit card is very easy with database as a service.
These services are also often open source compatible. But open source compatible, in the database-as-a-service sense, typically means: if you're using an open source database, it's easy for you to move to our solution. And then we'll provide you with extra value which is not available in the open source package itself, so it will be very hard for you to move back to the proper open source solution. So be careful with that.
What you also see is that database as a service, especially when it comes to the major clouds, tends to be over-marketed. It is often described as a fully managed database as a service, where people may say, oh, great, that means I don't need any database experts on staff, Amazon will just do everything for me. Well, then you turn around and hear that, actually, a lot of things are a shared responsibility. Amazon Aurora is not going to, at least not yet, automatically design your application queries, your schema, your indexes and so on and so forth. So you have to know what fully managed means, and typically it does not mean that you can avoid having any database experts on your team.
We also have the issue of potential database-as-a-service vendor lock-in. As I mentioned, that is the game everybody plays in this market: making sure that you are so reliant on the database-as-a-service solution that you would be hard-pressed to run just the open source solution.
What is interesting in this case is the price differential between what the hardware, or the infrastructure in the cloud, costs to run a database and how much the database-as-a-service vendor charges for it. That cost differential continues to grow. I remember that in the initial releases of RDS, for example, they were charging about a 40% surcharge for that convenient management layer, and in the latest generation, especially with Graviton, it's more like 2x.
I think this is also where we can learn lessons from the past. If you are wondering who that good-looking guy in the picture is, this is young Larry Ellison from Oracle, who was initially saving folks, with Oracle, from the hardware vendor lock-in of IBM, which required you to run everything on those huge mainframe computers. But guess what? Once folks had been sufficiently locked in to the Oracle database, they found themselves on a never-ending journey of rising prices. And now there is a saying about Oracle that Oracle doesn't have customers, Oracle has hostages. How things change. So I think you have to be very mindful of locking yourself in with other technologies, and specifically in the cloud.
I believe cloud is at a very interesting stage right now. We have two ways to run things in the cloud. Either we can go in and completely drink the vendor Kool-Aid and lock in, for example with Amazon or another proprietary cloud vendor, and use all of those highly differentiated, very cool, but very proprietary solutions, and then we are completely at Amazon's mercy. Or we can go with solutions from the Cloud Native Computing Foundation, which now offers a lot of open source solutions and which treats cloud as a commodity. Cloud is great. You are still very likely to run a lot of your applications in a cloud over the next two years, but there are multiple ways you can do that.
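To put the two figures from the talk side by side, here is a small worked calculation in Python. Only the 40% surcharge and the 2x factor come from the talk; the monthly infrastructure cost is a made-up round number used purely for illustration.

```python
def dbaas_price(infrastructure_cost: float, markup_factor: float) -> float:
    """Price charged for the managed service, given the raw infrastructure cost."""
    return infrastructure_cost * markup_factor


def management_premium(infrastructure_cost: float, markup_factor: float) -> float:
    """The part of the price that pays for the management layer."""
    return dbaas_price(infrastructure_cost, markup_factor) - infrastructure_cost


if __name__ == "__main__":
    base = 100.0  # hypothetical monthly infrastructure cost, in any currency
    early = management_premium(base, 1.40)   # about a 40% surcharge
    recent = management_premium(base, 2.0)   # more like 2x
    print(f"premium at a 40% surcharge: {early:.2f}")
    print(f"premium at 2x:              {recent:.2f}")
    print(f"growth of the premium:      {recent / early:.1f}x")
```

Under these assumptions the management premium per unit of infrastructure grows from 40 to 100, that is, two and a half times, even though the headline factor only moved from 1.4x to 2x.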
I would also point out that this is something Amazon themselves recommended years back. This is a slide from their presentation which compared cloud computing to electricity. Of course, for most of us in most environments it doesn't make sense to run our own generator all the time. It makes sense to buy electricity from a utility. But the thing about electricity is that it is a commodity. You can change your utility provider and still keep your TV, your fridge, your laptop. Well, that is something the cloud tries to take away from us, saying, you know, guys, you are buying this fantastic TV, but guess what, it only works if you buy electricity from us. Not a very good situation.
So again, Kubernetes is a fantastic solution in this case, because it provides a universal API which works in both public and private clouds. If you run on Amazon, you can use Kubernetes. If you use Google, Azure or your on-prem solutions, Kubernetes supports all of them, and we believe you can use that API to provide a database-as-a-service-like experience which works in both public and private clouds.
At Percona, that is a work in progress. Our operators are very robust, and we have a lot of deployments of our operators on Kubernetes in mission-critical environments. Our database as a service is currently a work in progress. It's experimental. But we would encourage you to check it out, which you can do by downloading PMM, and give us some feedback. And hey, it is open source: if you don't like it, you can fix it and send us a pull request. So we generally see the use of open source database as a service from two angles.
One angle is the interface for developers, who say, hey, I want to make a single API call and just provision a cluster which is self-healing, self-patching, self-tuning and so on and so forth. That is something which we can provide as an open source package. Then, if you are looking at a fully managed solution, you actually need people to deal with some of the database problems, because even as software gets better and better, it cannot solve everything. That is something where you have to do it yourself or work with a partner. Our vision is that we create the open source software so that you can do it yourself, or you can work with partners, and Percona would be one of the companies you could hire to run that open source software.
So if you look at the summary takeaways: open source databases are really on a path from support for containers, which is very robust and has been there for years, to a full open source database-as-a-service experience. The tricky thing about the open source database-as-a-service experience, though, is that it is something for us to build, because many of the database vendors, especially those that are single-vendor enterprise open source, like MySQL for example, have their own proprietary database-as-a-service solutions which are in competition with this ideology. So Docker support, as we discussed, is quite mature. Kubernetes support is getting there; as I mentioned, we are seeing thousands and thousands of nodes and database clusters running on Kubernetes. The fully open source database-as-a-service experience is still a work in progress. As it goes in open source, you can be part of the solution, and we encourage you to do that.
To finish up, I believe database as a service really has won the hearts and minds of developers, because it just offers unparalleled convenience in using database software. Vendor lock-in sucks, and as many times before, open source is coming to the rescue. That's all I have to say, and here are some links if you would like to connect. Thank you.