It looks like it's time to get started. Today we'll talk about running databases in containers, and about the path a database takes from a container all the way to database as a service on Kubernetes.

At this conference we've been talking a lot about how Linux turned 30 this year. I am too young to have been using Linux 30 years ago, but it is close to that, and I remember the early days of open source and how complicated things were compared to what we deal with now. If you wanted to run some open source software, you typically had to download the source code, figure out the proper compiler version to build it with, maybe find some patches to make it work on your particular system, and so on. I imagine some of you had that experience too.

From there we have had a never-ending drive toward simplicity. Not just in open source; I think technology overall has become much easier to use over the last couple of decades. But in the Linux software space in particular you can see the path. We started with downloading the source, patching it, and compiling it. Then projects started providing binary tarballs with install scripts, which was the next level of simplicity. You still had quite a bit to figure out, for example making sure you had the proper versions of libraries installed, but it was much easier. Then we went further with packages, deb and RPM, which declare dependencies, so you don't have to figure out which particular library version a thing needs. And then came yum and apt repositories, which install everything you need in one step.
And finally, we got solutions like Docker and Snap, which simplify installation even further. For example, as a developer using Docker, it's very easy for me to install multiple versions of the same software, or similar software that would otherwise conflict. In the MySQL space, if I wanted to install both MySQL and MariaDB side by side from normal Linux packages, to try them out and see which works better for me, I couldn't really do that. They conflict with each other, because they use the same data paths but different data formats. With Docker I can do that quite easily.

So here comes the database in Docker. Let me ask you: does anybody here have experience running a database in Docker? Anyone? No? Okay.

For databases, whatever environment we are looking at, I think it's very important to distinguish two use cases. One is test and dev, where we ask how to make things as simple as possible; simplicity and velocity are what matter. The other is production, where first and foremost the database should not crash, should not lose your data, and should not expose that data to intruders. We are often ready to put up with extra hassle in production to get that.

In test and dev, we see a lot of people using Docker, especially those who already use Docker for other parts of their application, because it provides a lot of benefits. A database provisioned in Docker gives you a very clean and consistent environment, with no dependencies on the paths, libraries, or anything else you may have in your operating system. And it's easy to have multiple environments if you want to.
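The side-by-side MySQL and MariaDB setup described above can be sketched roughly as follows. This is only an illustration: the helper `docker_run_args`, the container names, the image tags (`mysql:8.0`, `mariadb:10.6`), the placeholder password, and the host ports are all assumptions, not something from the talk. The point it shows is that each engine gets its own named volume and its own host port, so the shared default data path and the different on-disk formats no longer collide.

```python
import shlex


def docker_run_args(name, image, password_env, host_port):
    """Build a `docker run` argument list for one throwaway database container.

    All values passed in are illustrative; nothing is executed here.
    """
    return [
        "docker", "run", "--detach",
        "--name", name,
        # Root password variable read by the image's entrypoint (placeholder value).
        "--env", f"{password_env}=change-me",
        # One named volume per engine, so the data directories never overlap.
        "--volume", f"{name}-data:/var/lib/mysql",
        # Both engines listen on 3306 inside the container; use distinct host ports.
        "--publish", f"{host_port}:3306",
        image,
    ]


# Hypothetical names and tags, chosen only for the example.
mysql_cmd = docker_run_args("mysql-test", "mysql:8.0", "MYSQL_ROOT_PASSWORD", 3306)
mariadb_cmd = docker_run_args("mariadb-test", "mariadb:10.6", "MARIADB_ROOT_PASSWORD", 3307)

if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for cmd in (mysql_cmd, mariadb_cmd):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch safe to run anywhere; pasting the printed lines into a shell on a machine with Docker would start the two containers next to each other.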
In many cases you can also use tools like Docker Compose to simplify development of a full stack. You can spin up your database, your app server, whatever you need, right there on your machine.

In production, things are more complicated. First, as with any virtualization or layering technology, database people tend to be concerned about overhead. I remember people running databases being very concerned about virtualization years ago, saying it has a lot of overhead and that if you really want the best performance and stability you have to use bare metal. Docker went through the same process. With some early Docker versions and some configurations, there really was overhead for databases. A lot of that has been fixed by now, and with modern Docker on modern Linux the overhead is quite minimal, but some people still repeat the rumor that it will slow everything down.

Docker also comes with some extra operational complexity. For example, you need to configure data volumes properly, because otherwise, when you remove your Docker container, your lovely database disappears with it. That is not your experience on plain Linux, where uninstalling MySQL, Postgres, or whatever is not going to wipe your data. So it takes some learning. Observability and monitoring were also more complicated with Docker, at least initially: you couldn't get as much insight into database operations running in Docker as you could when running a plain process, or set of processes, on Linux.

So what is the state of open source databases with Docker right now? Most databases have official Docker images.
Or at the very least, if the project didn't provide one, Docker has built one for them. So it's very easy. These images are commonly deployed for test and dev, but they are not very commonly used in production.

So what does Percona do in terms of Docker? What is our approach? Well, obviously, we also provide Docker images, which we have for MySQL and MongoDB at this point. We also have them for our Percona Distributions, Percona Distribution for MySQL and Percona Distribution for MongoDB, which are our enhanced versions of those products, open source for MySQL and Postgres and source available for MongoDB, and which include a number of performance enhancements and enterprise features at no added cost.

Now, what is the reason, in my opinion, for that relatively limited Docker usage in production? It is that a lot of what we call day-two operations are not solved very well by Docker alone. If you run a database at scale for a long time, you install it relatively rarely. What you typically have to deal with is things like upgrades and high availability; if your replication breaks, you have to figure out how to fix it. Docker does not really simplify any of that. And the fact that it lets you provision multiple database instances on the same physical host or VM, while very important in development, is not that important in production, where we often want a single database per VM, for isolation reasons if nothing else. So Docker alone does not make those operations much easier.

And here, I think, is where Kubernetes brings us a great solution. If you're at this conference, I assume you've already heard about Kubernetes, so I don't need to explain what Kubernetes is any more than I would explain what Linux is. So let's talk about Kubernetes and databases in particular.
I think Kubernetes and databases have a complicated relationship, and have had one from the early days. Kubernetes was first conceived around stateless applications: make everything stateless and that will be wonderful. But a stateless database is kind of an oxymoron. The database is where you keep your state, in Kubernetes terms. So a lot of requirements had to be addressed after the early Kubernetes versions to make it possible to run databases, and a lot of improvements have been made over the last several years. Important ones are the storage concepts, persistent volumes and StatefulSets, which matter a great deal for how you deal with databases and other stateful applications on Kubernetes. A third component I find very important is the operator framework, which allows quite complicated processes to be run in a Kubernetes cluster.

With something like a web application, you might say: I want to provision 50 instances of this. It doesn't matter in which order they come up; they are all equal and will happily work together. With databases, when you bring up a database cluster, the sequence typically matters. Depending on the database technology, you need to provision nodes in a particular order, handle failures in a particular way, and so on. That requires more control, which we now have.

Even with that, we still have a number of people who are rather influential in the Kubernetes community, like Kelsey Hightower, who are not particularly enthusiastic about stateful applications on Kubernetes. To be honest, I've been using that tweet for a couple of years now.
So we probably should check whether he has changed his opinion. Because I think that sooner or later we will consider Kubernetes database-capable, at least to a certain extent.

Something else I want to mention. People who want to argue that you should not run databases on Kubernetes will say: what if I have this 50-terabyte single-instance database, Oracle, Postgres, you name it? Can I really move it to Kubernetes, in terms of performance, or in terms of how Kubernetes handles an instance of such huge size? And the answer may well be: not yet. We had the same answer in the past about virtualization and cloud. You would often keep your monster, very mission-critical database on bare metal long after you had moved a lot of smaller or less business-critical databases to virtualized environments, say 15 years ago or so. I expect the same thing to happen with Kubernetes.

What is also interesting, for all that talk about whether Kubernetes is helpful or not, is that a number of vendors have stated publicly that they use Kubernetes in their database-as-a-service offerings. From that we can speculate that many tens, if not hundreds, of thousands of database nodes are running in production on Kubernetes right now at those cloud vendors alone. Obviously, in many of those cases they either designed the system from the ground up to be Kubernetes-friendly or retrofitted it appropriately. For example, the Vitess folks at PlanetScale are very deliberate about avoiding very large instances. They shard the database across many containers, each of them relatively small, no more than 256 gigabytes or something like that.
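As a rough back-of-the-envelope illustration of that sharding approach, the snippet below computes how many shards the 50-terabyte example would need if each shard were capped at the roughly 256 gigabytes mentioned above. The function name `shards_needed`, the use of binary units (1 TB = 1024 GB), and the assumption that data spreads evenly across shards are all simplifications made for the example.

```python
import math


def shards_needed(total_gb: float, max_shard_gb: float) -> int:
    """Smallest shard count such that no shard exceeds max_shard_gb.

    Assumes data can be spread evenly across shards, which real
    workloads rarely achieve exactly.
    """
    if total_gb <= 0 or max_shard_gb <= 0:
        raise ValueError("sizes must be positive")
    return math.ceil(total_gb / max_shard_gb)


# 50 TB expressed in GB, using binary units as an assumption.
TOTAL_GB = 50 * 1024

if __name__ == "__main__":
    print(shards_needed(TOTAL_GB, 256))  # prints 200
```

So the single monster instance turns into on the order of a couple of hundred small containers, which is a much more natural fit for the way Kubernetes schedules and replaces workloads.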
So anyway, I think I have covered most of this slide already: the promise Kubernetes holds and how it can help us with databases. It is an operating system for your data center rather than for a single server, which is important because real production databases, because of high availability requirements, cannot really be managed within the concept of a single server. It has very robust mechanics for dealing with hundreds of different kinds of failures, and it allows us to build automation for handling them with operator frameworks.

Now, open source databases on Kubernetes have actually seen slower pickup by many vendors, and because of that you will find that many third-party operators became available first. For PostgreSQL there is a variety of operators in existence right now, and none of them, as I understand it, is blessed by the PostgreSQL Global Development Group as the one and only official solution. For MySQL, Oracle did not offer an operator for a very long time. I think they have only just officially released a beta version of their operator; it's not GA yet.

I think part of the reason is that a lot of vendors are conflicted about their market position. Take Oracle: how does Oracle want you to run MySQL? They want you to run MySQL on Oracle Cloud. If instead they give you an awesome MySQL operator that you can use to run MySQL independently, you may get the wrong idea, and instead of allowing Larry to get a bigger boat, you will just run your own MySQL.

Now, database solutions for Kubernetes are typically packaged either as Helm charts or as operators, or as a Helm chart that installs the operator.
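To make the idea of operator-driven automation a little more concrete, here is a toy sketch of the kind of ordered bring-up and failure handling a database operator has to encode. It does not use the Kubernetes API or any real operator; `Node`, `Cluster`, `reconcile`, and `run_until_stable` are hypothetical names, and the rule "start the primary before any replica" is just one assumed example of an ordering requirement.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    role: str            # "primary" or "replica"
    ready: bool = False


@dataclass
class Cluster:
    nodes: list
    log: list = field(default_factory=list)


def _start(node, cluster):
    """Pretend to start a node and record the action."""
    node.ready = True
    cluster.log.append(f"started {node.name} ({node.role})")


def reconcile(cluster):
    """One reconcile pass: take at most one action toward the desired state.

    Returns True if an action was taken, False if the cluster is stable.
    """
    primary = next((n for n in cluster.nodes if n.role == "primary"), None)
    if primary is None:
        raise ValueError("cluster has no primary")
    # Ordering rule (assumed): replicas may only start once the primary is ready.
    if not primary.ready:
        _start(primary, cluster)
        return True
    for node in cluster.nodes:
        if node.role == "replica" and not node.ready:
            _start(node, cluster)
            return True
    return False


def run_until_stable(cluster, max_steps=100):
    """Repeat reconcile passes until nothing is left to do; return the step count."""
    steps = 0
    while reconcile(cluster):
        steps += 1
        if steps >= max_steps:
            raise RuntimeError("cluster did not stabilize")
    return steps


if __name__ == "__main__":
    # The primary is deliberately listed second to show that list order does not matter.
    demo = Cluster(nodes=[Node("db-1", "replica"), Node("db-0", "primary"), Node("db-2", "replica")])
    run_until_stable(demo)
    demo.nodes[2].ready = False      # simulate a replica failure
    run_until_stable(demo)           # the same loop repairs it
    print("\n".join(demo.log))
```

The same loop that performs the initial bring-up also repairs a failed replica later, which is the part a one-shot installer cannot do.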
In my opinion, you really want to look at operators, because Helm charts help with installation, but on their own they generally do not help with all those day-two operations, which is where a lot of the work really happens.

At Percona, we are focusing a lot on this market, because we think there is a lot of potential for building fully open source solutions. We have operators for MySQL, MongoDB, and now for Postgres, in beta. We have our own branded operators because we want to provide a very uniform experience across different database technologies. They are available to install as operators or as Helm charts, and they work on the majority of managed Kubernetes services in public clouds as well as the majority of independent Kubernetes distributions.

Now, what are the unsolved problems with Kubernetes and databases, as I see them? As I mentioned, stateful applications on Kubernetes are still kind of tricky. It is not impossible, but it is not trivial. If you are looking at a mission-critical database where you can never, ever lose data, you need significant Kubernetes experience to set it up; if you are not a Kubernetes expert, it may not be easy for you. If you have just learned to Google and copy-paste some commands to get your stuff deployed on Kubernetes, that is probably not going to be enough.

Compare that to the seductive simplicity the major cloud vendors offer, called database as a service. The database-as-a-service experience means that you, as a developer, can provision a database with a simple API call or a couple of clicks, and that does a lot of things for you: it manages availability, maybe includes backups, and so on.

Now, what is the current state of database-as-a-service solutions?
Well, all the major cloud providers have proprietary database-as-a-service offerings, of course. Even tier-two providers do: think of DigitalOcean (Linode, no, Linode doesn't have one), or country-specific providers like Alibaba in China or Yandex in Russia. They all have database-as-a-service offerings as well.

We also have the database vendors, and typically each of them is now building its own proprietary solution. MongoDB has Atlas, MariaDB has SkySQL, you have Cockroach Cloud, you have solutions from DataStax and from InfluxDB. Whatever the technology is, its vendor is typically going to have a cloud offering. And there are also third-party solutions, like Aiven or the Instaclustr folks, who are presenting here at the show. They also have solutions for databases.

While the database itself may be fully open source and unmodified, the management layer on top of it is typically proprietary, which creates a sort of lock-in.

Now, I believe database as a service is amazing. It gives you a fantastic developer experience: managed availability, automatic database patching, backups, maybe some performance tuning that is better than running the database with its defaults, and an easy push-a-button or swipe-a-credit-card ability to scale if you're running slow. But it also has challenges if you really love the open source approach. Because, as I mentioned, these services are not open source.
Many of them advertise that they are open source compatible, and that is especially a problem with heavily modified solutions. Think about Amazon Aurora: it can do what MySQL or Postgres can do, and other things too, but that means that if you fully adopt all the Aurora features, you cannot really go back to running on the open source database, even leaving aside the management framework. What "open source compatible" often means is: this technology allows you to move to our solution in the cloud, and we don't particularly expect you to need to move back.

Another thing we have noticed with database as a service is that cloud vendors like to call it fully managed. You can see that in any public cloud, while in fact it is not quite fully managed. For example, there have been a lot of security incidents recently involving some database left exposed in a public cloud, and then you learn that security is a shared responsibility, as frankly it should be. Or you learn that while certain performance optimizations are done for you, the cloud vendor is not going to work with you on designing your schema or optimizing your queries, at least not at the base level of those managed services.

So I think the main thing with database as a service is that it comes with lock-in. It may be painful for some already, in terms of the cost premium you have to pay, and it is likely to be even more painful in the future. Why am I saying that? Look at Amazon RDS, which I think is an interesting example because it has been around for a long time. The first generation of RDS had, I think, a 30% or 40% price premium over what the underlying hardware would cost to provision.
Now, with the latest Graviton-powered instances, you pay about 2x what the hardware costs, so you essentially double your spend: half pays for the hardware and half for the software. And that is for RDS, which is relatively unmodified MySQL or Postgres. If you go for Aurora, which is positioned as a more advanced database, the premium is going to be even more.

Now, I think this is a case where history may repeat itself. If you don't know this guy right here, this is my friend Larry from Oracle. If you think about how Oracle started, it was actually saving people from lock-in to Big Blue's dominance in the early days. You would have to buy a mainframe to run a database, which was really expensive, or you could get Oracle and run it on smaller computers. But we know that as that technology got adopted substantially, it became expensive, to the point that many folks in the database space are now looking for ways to migrate from Oracle to open source databases. Not because Oracle is a bad database: many people I talk to say that Oracle engineering is fantastic and the Oracle database is great. But, you know, if the choice is between keeping Larry happy and losing my job, or moving to open source software, then we had better do that.

What I also find interesting is how the cloud was advertised originally versus how it is advertised now. This slide is actually from a very old AWS presentation; it even has the old AWS logo. At a time when the cloud wasn't so well known, they promoted it as being very similar to electricity. It doesn't make sense to generate your own electricity, except in some exceptional cases. It's much easier to buy it, and all or most of us do.
But what is important about electricity is that it is a commodity. Either you have access to multiple vendors, which are easily replaceable and which you can choose between, or, if not, it is heavily regulated as a utility, so the government doesn't allow the only company providing power in your region to skin you alive.

The cloud is kind of different. It's like saying: buy electricity from us, and we'll also sell you a TV, but it will only work if you get your power from us. That wouldn't make much sense, would it? But that is exactly what a lot of the cloud vendors are focused on. If you talk to cloud vendors at a high level to learn the best practices for using Amazon, Google, or Azure, the advice is always to use as many of their high-value proprietary services as possible. Use DynamoDB, or Aurora, or Redshift, things like that. Those folks are not going to tell you: just use the commodity stuff, such as compute services, and build the valuable parts on open source, for example on the Kubernetes platform. But that is what makes sense for you.

And I think it makes sense for us as a community, because while open source is not yet as easy to use in the cloud as those native cloud solutions, I think it will get there. In my memory, open source often takes time to match the performance or usability of proprietary software, but with enough folks focusing on it, it gets there. You probably remember the situation with Linux. I remember starting to use Linux in 1999, I think. At that point I would talk to Solaris folks, or some others, and they would tell me: oh, what a joke.
Really, an operating system that cannot even handle files larger than two gigabytes? Some of you may remember those challenges, or being restricted to 2.7 gigabytes of memory with 32-bit kernels, things like that. Linux was a joke, they said, compared to the real Unix operating systems. But those have been replaced.

The same happened in the web server space, where the initial proprietary solutions were almost completely wiped out by Apache- and Nginx-based solutions. In the database space you cannot say that Oracle is gone, but if you look at development of new applications, I think very little of it happens on those legacy proprietary databases. It goes to open source, open core, source available, whatever database technologies.

Anyway, in our vision, the alternative, more open-source-friendly way to run databases in the cloud is to use Kubernetes as a universal API for public and private clouds, because it is pretty ubiquitous: it exists on pretty much every major public cloud as well as in many private cloud solutions. What we have been building on top of it is a graphical user interface and API gateway, which lets you get that database-as-a-service experience from an open source solution, in an environment you completely control. It is a work in progress; you can think of it as beta right now. But we are going to get there. And I actually hope that we are not going to be the only vendor investing in making sure that this usability piece is not offered only in proprietary packages.
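One way to picture "Kubernetes as a universal API" is that a database-as-a-service front end reduces a user's request to a declarative object, which an operator then acts on. The sketch below builds such an object as a plain Python dictionary. Everything in it is hypothetical: the API group `dbaas.example.com/v1`, the kind `DatabaseCluster`, and the field names are invented for illustration and are not the schema of Percona's or anyone else's operator.

```python
import copy
import json


def database_cluster_manifest(name, engine, version, nodes, storage_gb):
    """Turn a simple provisioning request into a custom-resource-style document.

    The apiVersion, kind, and spec fields are made up for this example.
    """
    if nodes < 1:
        raise ValueError("a cluster needs at least one node")
    return {
        "apiVersion": "dbaas.example.com/v1",   # hypothetical API group
        "kind": "DatabaseCluster",              # hypothetical resource kind
        "metadata": {"name": name},
        "spec": {
            "engine": engine,
            "version": version,
            "nodes": nodes,
            "storage": {"size": f"{storage_gb}Gi"},
        },
    }


def with_changes(manifest, **spec_changes):
    """Return a copy of the manifest with some spec fields changed.

    Scaling or upgrading becomes an edit to the desired state,
    not a per-node procedure.
    """
    updated = copy.deepcopy(manifest)
    updated["spec"].update(spec_changes)
    return updated


if __name__ == "__main__":
    cluster = database_cluster_manifest("orders-db", "mysql", "8.0", nodes=3, storage_gb=100)
    upgraded = with_changes(cluster, nodes=5)
    print(json.dumps(upgraded, indent=2))
```

Because the request is just data, the same document could in principle be submitted to any conforming cluster, in a public cloud or on-premises, which is the portability argument being made here.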
Now, one question I get is: Peter, when you talk about database as a service and open source, how exactly do you think about it? Because with database as a service from folks like Amazon and Google, the software piece and the people piece are combined, and you often don't know which is which. For example, with Amazon Aurora or RDS, if something breaks because of a software bug, they may update the software to fix it without you even knowing.

So I think about database as a service as two different components. One is the interface. As a developer, as a user, the database-as-a-service experience means you can get the full database, provision the cluster, with simple actions. If I want to upgrade my database cluster to a new version, I don't have to figure out how to install new packages on 25 nodes; I click a button or make an API call. That is something open source is very well suited for.

The other is the complete management piece, and that is where you need humans involved. However great the software we build, however much artificial intelligence, self-healing, and whatever other buzzwords we put into it, there are going to be cases where the software fails in some unusual way, and at least at the current state of artificial intelligence we still need humans to resolve complicated problems. So you either need to have that capacity in-house, as many companies do right now, or have a partner to do it with.
So, as a summary: open source databases are traveling this path from containers to database-as-a-service functionality right now. Docker support, as I said, is very mature. Kubernetes is getting there. And I think we are at a relatively early stage of having a fully open source database-as-a-service solution.

To finish up in simple words: I really believe, first, that database as a service has won the hearts and minds of developers, because it gives us unparalleled simplicity and convenience in using a database. I also believe that software vendor lock-in sucks. I think many people know that, and even more people are going to learn it, sometimes in a very hard and unpleasant way. And I think that, as with many things before, this is where open source will come to the rescue.

That's all I have, and if you folks have questions, I would be happy to answer.

Can you talk a little bit more about this work in progress? Is this the work to make Kubernetes a universal API? Is that already done, or what is it?

Okay, so we have a project, PMM, Percona Monitoring and Management, which has database-as-a-service functionality that is in beta. Basically, you register your Kubernetes cluster in it, and then you have a GUI and a set of APIs through which you can provision database clusters, scale them up and down, upgrade them, that kind of thing. Does that answer your question?

So it's limited specifically to the Percona distributions?

Well, that's right. It is specifically for the distributions Percona offers, but I think two things are important here. A, it is a completely open source project, including our distributions.
B, we always build software in a pluggable way, even for things we do not necessarily need ourselves as a company, since our customers run the Percona variants of the software. We want to make it easy for other people to build what they want. So if you say, I want to use that internal API to provision, let's say, Redis, which we don't support, or to plug in our own package of Postgres, people are very welcome to do that. And again, open source is what makes that possible.

Okay, any other thoughts, questions? Come on, nobody's going to say, Peter, you're wrong, that's complete bullshit? No? Oh, the lady in the back. Thank you.

Okay, so what I would say is this. As for open source database as a service, as I mentioned, I am not aware of a complete package that is GA quality right now. So to some extent you can say, stay put for now. But in your plans I would encourage you to understand that it is coming, probably both from us at Percona and from other vendors, because I think there is a big need for it in the market. And with open source, we can have a variety of solutions coming up. That's one thing. The other thing, and it depends on what scale of company you are: I think it's important to know that we can achieve more together. So if you say, that sounds interesting, can we work together to meet our needs faster? Yes, let's do that.

Well, you're right. When you are talking about moving databases, there is always some risk. Even if you're moving from, say, PostgreSQL 13 to 14, it's not going to be completely risk-free. But say you are moving between completely different database technologies, for example from Oracle to PostgreSQL.
That's even more risk and complexity. And yes, with Percona and our partners, that is something we can help with, whether it's moving from one database technology to another, from one cloud to another, from cloud to on-prem, or from on-prem to cloud. We have a lot of experience doing that.

Yes, yeah, okay. Thank you, Matt. So there are three levels, if you look at it. The distributions, which are GA quality. Then the operators, which are the second layer; we have them GA for MongoDB and MySQL and in beta for Postgres, which is going GA in, I think, a month or so. And then the GUI and API, which is what is a work in progress. Thanks, Matt, for clarifying that.

Okay. Any other thoughts, questions, concerns? If not, then that's all I had for you folks, and I can get you out of here a couple of minutes early.