Well, welcome to this lightning talk addressing a very popular discussion topic: whether you should run your database on Kubernetes or not. I'm George Zanzaras. I'm a director of engineering for MongoDB on Kubernetes, and based on where I work, you probably know what the answer is going to be. As the days went by leading up to this presentation, I kept getting the question: why are we still talking about this? And the truth is that although most of the reasons have been overcome, this is still a major concern, especially for major companies.

So how did this debate start? First off, Kubernetes wasn't originally designed for managing stateful applications. Kubernetes was designed to manage containers dynamically, meaning that individual containers may be brought up or taken down at any time to meet your deployment's needs, and managing databases is already complex enough, so adding Kubernetes makes it a bit harder. Persistent volumes provided a safe way to store and manage data. However, Kubernetes often needs to restart and reschedule pods, and in Deployments the creation and deletion of pods happens in an arbitrary order, and pods don't have stable identities. Because of that, it's difficult to move data between nodes while ensuring a persistent volume stays attached to the correct container. As a result, StatefulSets were introduced to bring some order to the chaotic way pods were being managed. Pods in a StatefulSet are created with a stable identity and are deleted in an ordered way when scaling down. The result is that when a pod is rescheduled, we can easily attach the right volume to it. So that kind of solved the storage problem. And this seemed like a great solution for databases: having stable indexes for pods and being able to make distinctions between primary and secondary nodes within a deployment made it sound like a ready-to-use idea.
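The stable-identity idea can be sketched as a minimal StatefulSet manifest. This is an illustrative sketch, not a production config; the names (`db`, the `mongo` image, the storage size) are assumptions for the example:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: db
spec:
  serviceName: db            # headless Service: gives each pod a stable DNS name
  replicas: 3
  selector:
    matchLabels:
      app: db
  template:
    metadata:
      labels:
        app: db
    spec:
      containers:
        - name: db
          image: mongo:7.0   # example image; any database image works here
          volumeMounts:
            - name: data
              mountPath: /data/db
  volumeClaimTemplates:      # one PVC per pod: data-db-0, data-db-1, data-db-2
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

The pods come up in order as db-0, db-1, db-2, and when, say, db-1 is rescheduled it keeps its name and reattaches to the same data-db-1 volume, which is exactly the ordered, identity-preserving behavior described above.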
But although this was a much better solution, it was not good enough yet. The reason was the second point we made earlier: databases are hard to deploy and manage in cluster setups. Day-two operations, things like backups and restores, are very complex, and Kubernetes primitives were not enough to address those issues. StatefulSets provide a solution for the persistent volume issue but don't address the management issue. We often refer to Kubernetes as the platform of platforms, because it's now considered not a ready-to-use, off-the-shelf platform of its own, but a set of tools to extend and build a platform on your own. And it's all based around interfaces. Around interfaces you can build plugins and customize Kubernetes the way you want. When it comes to networking, you choose from multiple CNI plugins. When it comes to storage, you have multiple CSI plugins, CRIs, and so on. So you see that everything in Kubernetes is pluggable. Similarly, StatefulSets aren't a solution that can just solve everything; they're again an interface that can help you solve the storage issue. So when it comes to managing databases on Kubernetes, instead of taking a StatefulSet and running everything on it directly, it's easier to use StatefulSets and wrap around them. And the way you do that is by using custom resources backed by custom controllers to manage the database deployment. The operator framework came to provide the best way to automate database management through Kubernetes: automating the tasks that we would otherwise expect a human operator to be doing to our database. So operators came to solve the second problem, managing complex databases. So back to the question: should we run databases on Kubernetes? The answer is yes. But there's always a but. We should always make sure that our database wants to be run on Kubernetes. And for that, we use four questions.
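To make the operator pattern concrete, here is roughly what a custom resource for a database deployment looks like. This is a simplified sketch modeled on the MongoDB Community operator's `MongoDBCommunity` kind, with most required fields (users, security settings) omitted:

```yaml
apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: my-replica-set
spec:
  type: ReplicaSet   # the controller creates and manages the StatefulSet for us
  members: 3         # desired replica set size; scaling is a one-line change
  version: "6.0.5"   # database version; the operator drives upgrades
```

A custom controller watches resources of this kind and reconciles the cluster toward the declared state, wrapping the underlying StatefulSet and handling the day-two operations that raw Kubernetes primitives don't.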
Is the database designed to run in distributed setups? Is the database designed to be replicated, sharded, and so on? Can the database tolerate faults on its nodes: if the database cluster loses a node, can it continue operating? What consistency guarantees does the database cluster provide us? In most cases, eventual consistency might not be good enough when we have a large multi-availability-zone setup. And finally, can we support horizontal scaling? If we have a database that can only scale vertically, Kubernetes is usually not the best option for us. And that's it. We have a few minutes for questions.

Questions, anybody? How do you feel about some of the more advanced storage operators, such as, for example, the one from AWS, which does provide for backup and storage via Kubernetes resources that are then agnostic to what kind of database you might be running?

Well, there are quite a few operators. As you said, there are a few that are agnostic of the underlying database, and there are others that, like our operators, are database-specific. In selecting the specific technology, as I talked about before with interfaces, I would say each use case has different needs. If you have a setup where you only run MongoDB, then running an operator that is agnostic probably doesn't make much sense. If you have a multitude of databases and a very complex engineering organization, maybe it makes sense to explore other options as well. But what we have seen in our case is that when someone goes down the path of building their platform on top of Kubernetes and running multiple operators, it's easier to run multiple distinct operators, one for Mongo, one for Postgres, and so on, because when it comes to advanced operations, a one-size-fits-all approach sometimes isn't going to work out. Any other questions? Okay, great. Thanks, George.