Hi, everyone, and welcome to this CNCF webinar. Today we have a very cool agenda for you. For the next hour, we're going to be talking about databases in Kubernetes and how to run them in the most secure and reliable way possible, and most importantly, how to make sure business or DBA-crafted guidelines can be applied and enforced during the lifecycle of your stateful application. And because it's also important to walk the walk as well as talk the talk, we have a couple of demos in store for you. We hope you're going to enjoy it.

Again, welcome to this session, How to Build Kubernetes Policies to Ensure Compliance for Databases. During the next hour, we're going to be touching upon a couple of CNCF projects related to that topic. The first one is Flux, originally created by Weaveworks, which is a continuous delivery solution for your applications, including stateful applications and databases. The second tool we're going to be talking about is Kyverno, which is also a CNCF project and is a policy engine specifically designed for Kubernetes.

So let's get started. My name is Nick Vermundi. I'm a Principal Developer Advocate with Ondat. I've been working with Kubernetes for the past five years. Before Ondat, I worked at Aviatrix, a startup focusing on multi-cloud networking. And before that, I spent six years at Cisco as part of the engineering team responsible for the container network interface, or CNI.

In terms of the agenda for today, we're going to start with databases in Kubernetes and talk about where the industry stands. Is it a safe thing to do, running a database in Kubernetes, and how do people do it? We'll also talk about how to do it in a way that can match the requirements of enterprises. Then, in terms of delivering those databases in production, we're going to be talking about policy as code, which is focused on how we can put guardrails around how people deploy databases in production. To that same end, we're going to take a look at GitOps principles and how to build GitOps pipelines for stateful applications that also incorporate some notion of policies and compliance. And then, of course, we'll conclude with a demo that makes use of all the principles we've mentioned during this talk.

But first, let's address the elephant in the room. Is it a good idea to run a database in Kubernetes, or, more generally speaking, is it a good idea to run a stateful application within Kubernetes? In 2016, Kubernetes introduced the notion of PetSets, a first try at handling stateful applications as first-class citizens. Prior to that, any application running in Kubernetes was strictly supposed to adhere to the twelve-factor app principles, basically a stateless application. That's the old war between pets and cattle. Over time, PetSets became StatefulSets, and a comprehensive set of features has been added to safely run stateful applications. This includes solutions around storage, networking, identity, and application lifecycle management. For example, you can attach distinct persistent volumes to the individual pods composing your StatefulSet. In terms of the network, network identity is guaranteed and stable, meaning that in case of pod failure, the same pod can be restarted on the same node or another node while keeping its original ordinal as well as its hostname.
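Just to make that concrete, here is a minimal sketch of what such a StatefulSet might look like; the names, image, and sizes are purely illustrative, not the exact manifest we'll use later:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: mongo
spec:
  serviceName: mongo-headless     # headless Service that provides stable per-pod DNS names
  replicas: 3
  selector:
    matchLabels:
      app: mongo
  template:
    metadata:
      labels:
        app: mongo
    spec:
      containers:
        - name: mongo
          image: mongo:5.0
          volumeMounts:
            - name: data
              mountPath: /data/db
  volumeClaimTemplates:           # one PVC per pod (mongo-0, mongo-1, ...), retained across restarts
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        resources:
          requests:
            storage: 10Gi
```

The headless Service referenced by serviceName is what gives each pod a stable DNS name such as mongo-0.mongo-headless, and each pod gets its own PersistentVolumeClaim from the volumeClaimTemplates section.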
Then, when it comes to upgrading your stateful application, StatefulSets support both rolling updates and partitioned rolling updates, two features that are critical when it comes to managing your stateful application. So all these various options can be tailored and fine-tuned to suit your needs. But the real question is: is this sufficient to manage databases in Kubernetes? We'll get to answer this a bit later.

But first, let's take a look at the most popular container images when it comes to running stateful applications in containers. Out of those 14 images, we can notice that we have Redis, Postgres, RabbitMQ, MySQL, Mongo, Kafka, Vault, etc. All of those containers are actually stateful applications, meaning that they have to persist some sort of data to disk. That makes sense, because people have been running stateless applications in containers for a while, but there's no such thing as a completely stateless application. You always have a component that is stateful, maybe a message queuing solution or a database. So this graph shows that people now want to collocate their stateless applications together with their stateful components. There may be different reasons for this: operations, but also maybe latency.

Now, if we take a look at the top container images running as Kubernetes StatefulSets, the result is actually very close, which shows that those popular stateful containers should be run as StatefulSets, not as classic Deployments, for the reasons we mentioned before. Stateful applications have specific requirements when it comes to network identity, stability, hostnames, and also the order of operations. When you deploy a stateful application, you want to deploy the pods in a particular order, and if you need to scale down, or if you need to upgrade the image of the application, then you need to follow the reverse order. And this is really key, because it's not only about deploying pods; it's about making sure that the application running on top of those containers is actually in a healthy state. For Mongo, for example, that means the cluster is formed, so if we remove or add a node, the state of the cluster also needs to be updated. This is not automatic, and we're going to see how we can potentially address this part. But essentially, those stateful applications need to be run as StatefulSets to begin with, and that's only one part of the picture.

Once you have your StatefulSet defined with your pod template, you now need to install the application on top. What is particular to Kubernetes is that you now have a higher-level controller, the StatefulSet controller, that controls how the containers and pods are provisioned, deleted, updated, and so on. You also need a mechanism to install the application on top, because most of the time the application needs to form some sort of cluster and organize itself as a whole, which means you need to encapsulate some sort of knowledge into code. And there's a specific pattern in Kubernetes that does exactly this: it's called the operator pattern. At a high level, a Kubernetes operator is just another type of Kubernetes controller, but one that monitors custom resources, as opposed to native Kubernetes resources. A custom resource is just an extension of the Kubernetes API that allows you to represent any object, within or outside of the Kubernetes cluster, as a first-class citizen within Kubernetes.
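To give you an idea of what such a custom resource can look like for a database, here is a rough sketch loosely modeled on the MongoDB Community operator we'll use later in the demo; the exact schema varies from operator to operator, so treat the field names as illustrative:

```yaml
apiVersion: mongodbcommunity.mongodb.com/v1
kind: MongoDBCommunity
metadata:
  name: mongodb
spec:
  members: 3                      # desired replica-set size; the operator forms the cluster for you
  type: ReplicaSet
  version: "5.0.6"
  security:
    authentication:
      modes: ["SCRAM"]
  users:
    - name: admin
      db: admin
      passwordSecretRef:
        name: admin-password      # Kubernetes Secret holding the admin password
      scramCredentialsSecretName: admin-scram
      roles:
        - name: clusterAdmin
          db: admin
        - name: userAdminAnyDatabase
          db: admin
```

When you apply a manifest like this, the operator reacts by creating the underlying StatefulSet, Services, and users, and it keeps reconciling them against what is declared here.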
So the custom controller effectively watches custom resources, and any action performed on the custom resource, in terms of CRUD operations (create, read, update, delete), is translated into an automated action from the custom controller. And that can be anything. For example, let's say a custom resource represents an AWS cloud instance. Then, as you add new custom resources, the custom controller will create new instances in AWS. Likewise, as you delete the custom resource, the custom controller will delete the instance within AWS. You, as a developer, need to embed the knowledge required to perform all these actions within the controller: how to create an AWS instance, how to delete an AWS instance, et cetera.

So if we apply this principle to a database, we can get a lot of benefits. The operator can automatically perform critical operations such as database scaling, backup, and upgrade. All of that can be managed by an operator, potentially simplifying the deployment, scale-out, and scaling of cloud-native applications. And finally, because there is a reconciliation loop performed by the operator, it also natively enforces a sort of compliance by design. For example, let's say that when you deploy the database with the operator, you set up an admin user with specific permissions, so the database gets deployed with that setting. Now, say you want to change that permission manually, using your database commands. What's going to happen is that, because the operator is constantly monitoring the database and its different components, it's going to revert the permission to what is defined in the custom resource. And this is because the reconciliation loop trusts the declarative intent, not the imperative command that you applied to the environment.

But operators also come with their own set of challenges. First of all, there is no standard for expressing custom resource settings for a database. For example, how you refer to the storage or the PVC, in terms of the path to that particular setting, is not defined by Kubernetes. Every operator creator or provider can choose its own schema to express its own settings; there is no standard across all databases, for example. And typically, once you start using operators to deploy and manage the lifecycle of applications and software solutions, you tend to get a sprawl of those custom resources, which can lead to increased difficulty when you're troubleshooting what's happening in the cluster. Then we also potentially have challenges with supply chain quality control: as you are probably going to use existing operators rather than create your own, you need to be able to trust the people and the software engineers who are building those operators. And since there isn't really any standard for those custom resources, how can you validate that the settings you enter and configure for those schemas are valid in your environment? This is what we're going to think about in the next section. And finally, documentation: it's quite difficult to find documentation that exactly matches your use case. Most operators have GitHub repositories, of course, with some examples of use cases, but there's not really any boilerplate for your particular use case.
So you'll have to find different pieces here and there, put them together, and just try it out to see if it works and really fits your use case. There's typically no comprehensive documentation you can just pick up. However, because these custom resources are fundamentally Kubernetes resources, you can explore their schema and their fields simply by using kubectl explain and specifying the custom resource you want to explore.

So far, we've been talking about the operator architecture, but now let's focus on how to use these operators and apply some policies to the values you express in their settings. For this, there's a well-known pattern called policy as code, and one of the tools we're going to be talking about to realize it is Kyverno. But when it comes to policy as code within Kubernetes, it's not really code anymore: a Kubernetes policy engine should just support YAML, because YAML is what we do in Kubernetes. So let's take a look at how we can use policy as YAML with Kyverno and apply this to custom resources.

There are a couple of principles to apply when using policy as code, or as YAML. First, we want to decouple the policies themselves from the validation or enforcement process. For example, we want to be able to store our policies independently from the validation process, so typically you want to use Git or any version control system to store them, so that you can track history, share them with your teams, et cetera. Of course, they need to be in a declarative format, and Kubernetes already has YAML. There are solutions that introduce a new language to represent the policies, but if you're already running Kubernetes, that's not necessarily something you want to start with. We have YAML, so let's stick with YAML, and if the solutions you're contemplating don't satisfy your requirements because those requirements are too complex, then maybe you can try a solution that involves a new language, like Rego, for example. In that sense, the policy-as-code solution doesn't have to be Kubernetes-specific to work, but you should start with your native tools, and in the case of Kubernetes, that means YAML.

We also want to control and validate the source before committing to the cluster. If we just rely on an admission controller to validate or mutate the input, then I would say it's already too late. For example, if you have an application composed of five different manifests and two of those manifests don't pass the policy validation, then you'll end up with three manifests deployed to the cluster and two that will not be deployed, potentially leading to some inconsistency. So it's better, to begin with, to have the ability to run validation within your GitOps pipeline before deploying the manifests to the cluster. Optionally, it's always good to have the ability to mutate the input: if you have a non-conformant input, rather than invalidating it and sending an error message, you can transform the input so that it fits within your policy boundaries. There are multiple solutions on the market that can help you build this policy as code, or policy as YAML: OPA Gatekeeper, Kyverno, and Datree are all valid examples.
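To show what policy as YAML actually looks like, here is a minimal Kyverno validation policy. It's purely illustrative (it simply requires every StatefulSet to carry a team label), and the match syntax can differ slightly between Kyverno versions:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-team-label
spec:
  validationFailureAction: Enforce    # reject non-compliant resources; use Audit to only report
  rules:
    - name: check-team-label
      match:
        any:
          - resources:
              kinds:
                - StatefulSet
      validate:
        message: "StatefulSets must carry a 'team' label."
        pattern:
          metadata:
            labels:
              team: "?*"              # any non-empty value
```

It's just another Kubernetes resource that you can store in Git, review in a pull request, and apply with kubectl.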
But for this session, we're going to focus on Kyverno specifically. If you look at the traditional process for handling any API request from a Kubernetes point of view, you can insert webhooks at two different points within that workflow: the mutating admission and validating admission webhooks are the two places where you can insert specific logic, external logic from any sort of software you want to integrate with Kubernetes. In the case of Kyverno, the validating admission webhook is used to validate or invalidate specific requests, in our case by comparing values against policies. If the values are within the policies, then Kyverno will validate the request and send it back to Kubernetes. If the validation comes back with a negative answer, Kyverno will return an answer saying no, this is forbidden, we are not going to move forward with that request. The mutating admission webhook, in turn, does its own thing with the values of the different fields: if a value is not within what the policy has determined, it's going to be changed into the value that you want to apply. We are going to demonstrate the mutating and validating capabilities of Kyverno later on during our demo.

Basically, Kyverno has a wide range of capabilities. We've been talking about validation and mutation; a quick detail about mutation: you can use either a strategic merge patch or a JSON patch, depending on the granularity you need when modifying a particular field or set of fields. Kyverno is also able to generate new resources when a resource is created or updated. It also has a notion of preconditions, which means it can gather data from the admission request payload (the AdmissionReview, actually), save part of that data into variables, and reuse them when building Kyverno policies. In addition, Kyverno supports image verification through the verifyImages rule, which uses Cosign to verify container image signatures, attestations, and more, stored in an OCI registry. And finally, Kyverno supports JMESPath, a query language that lets you perform more complex selection and manipulation of fields and values, combined with filters.

So let's take a look at how we can integrate Kyverno with your traditional continuous integration pipeline. A GitOps pipeline allows you to use Git as the single source of truth for both your application code and your Kubernetes manifests. They can actually sit within the same repository; it doesn't really matter, as long as everything is hosted in a version control system. As usual, and as in more traditional pipelines, the first thing you do is build a container from your application code, then you specify the container name, tag, and other details in your Kubernetes manifests. You can use a tool like Kustomize to do that: its job, depending on your target environment, is to fill the right fields with the container information and environment-specific values. Once you have your Kubernetes manifests ready to be pushed to your Kubernetes cluster, with staging, prod, and dev Kustomize overlays, a GitOps tool then takes over.
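As a reference point, a production overlay's kustomization.yaml typically looks something like this; the directory layout, image name, and patch file are just illustrative:

```yaml
# overlays/prod/kustomization.yaml  (layout and values are assumptions, not the lab's files)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base                    # shared manifests: front end, database custom resource, job, etc.
patches:
  - path: frontend-replicas.yaml  # e.g. bump the front-end replica count for production
images:
  - name: myorg/marvel-frontend   # hypothetical image name
    newTag: "1.4.2"               # pinned by the CI step that built the container
```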
In our case, that GitOps tool is Flux, which picks them up and makes sure the reconciliation loop synchronizes the state of the cluster with what you have in your Kubernetes manifest repository. On the diagram here, you can see that Kustomize is used as part of the pipeline, but Flux also supports Kustomize natively, which means the only thing you need is the overlays in the Kubernetes manifest repository. Flux will be intelligent enough to leverage those overlays, render the target manifests, and then use them to deploy your application to the target Kubernetes cluster. Which leaves us with a question: where can we integrate Kyverno in this picture?

There are actually two solutions. The first one is to use Kyverno as a CLI, let's say within this dotted line here. As you build your Kubernetes manifests, just after that, you can use the CLI to check them against the policies that are defined as YAML files. The Kyverno CLI will take, on one side, the Kubernetes manifest YAML files and compare them against the Kyverno YAML policies that also sit in a repository. The second solution is to use the admission controller, and then again we can either mutate or validate. In the case of validation, things will happen after the Kubernetes manifests have been applied to the cluster by Flux. Flux first checks the manifests; if it notices a change, it needs to synchronize with the Kubernetes cluster, so it will pull those files into the Kubernetes cluster and apply the manifests there. As a result, the admission controller will either authorize or prevent each manifest from being deployed into the cluster. Then again, we're in a situation where we can have some inconsistencies, because some of the manifests may have been deployed but not others. In the mutation use case, it's the same thing again: you have the single source of truth sitting in the repository, it's picked up by Flux, Flux applies the manifests to the cluster, and the admission controller then changes part of the values to match your policies, so that the values are within your policy boundaries. But then, as a consequence, one could argue that the Kubernetes manifests here on the Git repo are no longer the source of truth, because some of the values have been modified by the admission controller. And that is a fair statement. So it's up to you whether to mutate or validate within the cluster, but my personal preference would be to keep Git as the single source of truth, and therefore to use Kyverno in the context of a GitOps pipeline as a CLI tool within the workflow.

I hope this makes sense to you, so let's just sum it up: enforcing compliance with Kyverno, when, where, and how. When? Ideally during your pipeline execution, and preferably, if you're using Kubernetes, GitOps is a best-in-class solution to implement continuous integration, with Flux then acting as the continuous delivery mechanism. Where? As part of this pipeline, you obviously want to have your Kyverno policies sitting in a Git repository. How? Preferably using the Kyverno CLI, which will, on one side, leverage the manifests in one Git repository for the Kubernetes application.
And, on the other side, the Kyverno policies, which are also represented as YAML files, probably in another Git repository.

So now let's see this in real life in our demonstration, where we're going to create Flux sources and Kustomizations so that our application can safely be deployed in the Kubernetes cluster following GitOps principles. We will validate the application using the Kyverno CLI off-cluster, and then, just for comprehensive testing, we'll also show you the same thing, but this time with an admission controller that will validate and mutate non-conformant resources.

Let's get started with the demo. This demo is available as part of a workshop, or lab, I've developed on Instruqt, and I'll make sure to put the link in one of the last slides of the deck so that you can also use it if you wish. So let's launch this lab. Okay, let's jump into the last section of this lab. The first thing we're going to do is validate some policies with Kyverno off-cluster, meaning we're going to be leveraging application manifests made for two distinct environments. The first environment is a dev environment; the Kustomize overlay is there. We have a couple of components within this application. We have a front end, which is a Flask web application displaying Marvel characters picked from a MongoDB database. This database, as you can see here, is represented in the YAML as a custom resource, so as soon as we push this manifest, the MongoDB operator that has been installed in the cluster will react and deploy a StatefulSet, and also the MongoDB cluster on top of that StatefulSet. We also have a storage class: we're going to be using Ondat as the storage class for the StatefulSet, so Ondat will be responsible for the underlying distributed storage layer, providing enterprise-grade features such as at-rest and in-transit encryption, persistent volume replication, NFS shares if you wish, optimized performance, et cetera. But it's just going to be used as a storage class as part of this lab. Then we have the last component, which is a job; the goal of this job is simply to make requests to the Marvel API and populate the database. The front-end application then displays the information sitting in this database.

The production overlay is essentially a replication of the dev section, but with some differences. For example, in the dev environment we have two replicas for the front end; in the prod overlay we'll have five. We also have a specific service that exposes the application to the outside world, whereas in dev this is just a ClusterIP, so it's only available within the cluster boundaries. In terms of the storage class, there are also some differences: in the dev environment we have no replicas and no encryption, while for production we want to enforce two replicas and encryption as well. And for this, we're going to be using Kyverno policies.

So the validation policies are described there. First, in relation to the MongoDB database, what we want is to have at least one admin user that has all the permissions listed there. If you compare with the original manifest in the application, you can see that it has four different roles. On top of this, we also want to check that encryption is enabled. You can see the kind of this object is ClusterPolicy.
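Just to give you a feel for the shape of such a rule (the lab's actual policy and the exact storage parameter name may differ, so treat this as a sketch), the encryption check could look roughly like this:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-storageclass-encryption
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-encryption-enabled
      match:
        any:
          - resources:
              kinds:
                - StorageClass
      validate:
        message: "Production storage classes must enable volume encryption."
        pattern:
          parameters:
            storageos.com/encryption: "true"   # parameter name is an assumption, check your storage provider
```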
All the policies have this kind. Again, it's just YAML, nothing too different from a normal Kubernetes manifest. On top of this, we also check that the number of persistent volume replicas defined by the storage class is greater than or equal to two. And finally, the maximum size of any persistent volume associated with the MongoDBCommunity custom resource should be less than or equal to 10 gig.

So now what we want to do is validate the policies with Kyverno using only the manifest files, nothing in-cluster. For this, let's first check that we don't have any policies or any application in the cluster: the application has not been deployed, Kyverno is installed, but if we look for ClusterPolicy objects (cpol for short), we don't have any policies implemented in the cluster yet. Before checking the policies, let's introduce some non-compliant information. For this, we go back into the editor and use the production overlay to simulate this non-conformant information: we delete the cluster admin permission there, we change the data volume size to 50 gig, which is more than what is allowed, and in the storage class we set encryption to false to disable it. Okay, here we go. We save the changes, and now we can use Kustomize to generate the manifests for the production environment and pipe them into the Kyverno CLI, which uses this directory as the source for the policy validation, which is exactly what I showed you before. Let's run this. And hopefully we should get three errors, corresponding to the errors I've just introduced: one about the required Mongo permissions, since one permission is missing, if you remember; then the encryption that is not enabled in the storage class; and also the PVC volume size that is greater than 10 gig.

Next, we're going to see what happens if we use the dynamic admission controller for the validation: basically doing exactly the same thing, but this time applying the manifests to the cluster and using Kyverno in-cluster. So we need to deploy the policies in the cluster first. Here we go. And if I check the policies now, there are four of them, and all of them are ready to go. I'm not going to use Flux right now to deploy the application; I'll use Flux for the last part, for the mutation. It's the same principle, the only difference being that I'm using kubectl apply rather than a GitOps CD methodology. So let's see what happens. If I paste the command here, again we're going to use Kustomize to build the manifests and pipe them, this time not into the Kyverno CLI but directly into kubectl apply, deploying all the manifests to the cluster. As expected, we get errors, and they are exactly the same ones as before, complaining about the volume size, the Mongo permission, and the encryption. But the difference now is that if I look for the application, you can see that when I applied these manifests, part of the application started running. Of course, we are missing the non-conformant resources, which are the storage class and the StatefulSet. So our application is currently broken, and this is not necessarily something you want. This is why I was telling you before that, for validation, it's better to use it in a pipeline.
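As an illustration of what that pipeline step could look like, here is a sketch of a CI job running the Kyverno CLI against the rendered manifests; the repository layout, paths, and action versions are assumptions, not part of the lab:

```yaml
# Illustrative GitHub Actions job; repository layout, paths, and action versions are assumptions.
name: validate-prod-manifests
on: [pull_request]
jobs:
  kyverno-validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install the Kyverno CLI
        uses: kyverno/action-install-cli@v0.2.0
      - name: Render the prod overlay and validate it against the policies
        run: |
          kustomize build overlays/prod > /tmp/prod-manifests.yaml
          # A non-zero exit code fails the pipeline if any manifest violates a policy in policies/
          kyverno apply policies/ --resource /tmp/prod-manifests.yaml
```

The point is that nothing non-conformant ever reaches the cluster: the pull request fails before Flux gets a chance to reconcile it.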
As for mutation: again, if you're using GitOps and you mutate values as your application is getting deployed into the cluster, is it really GitOps? I'm not sure. But let's proceed with the next section, which is going to be about mutating these resources. First, let's delete the application altogether so that we can use Flux to deploy it again. Okay, let's check that our application has been deleted. This is good. Now let's apply the mutate policies, and let's also take a look at them.

So what are the Kyverno mutate policies going to do? We're going to enforce the admin user to have those four permissions for the MongoDB custom resource. For the storage class, we're going to enforce a couple of settings: regardless of what the user puts in the storage class manifest, we will change it to force the number of replicas to two, force encryption, and force the usage of XFS as the file system, which is the one recommended for Mongo. And in terms of the volume size, we're going to enforce five gig for all the volumes that are going to be attached to the MongoDB StatefulSet, so not only for data but also for logs, for any volume that gets attached there. Okay, let's go back to the console. The mutation policies have been created. Let's check the cluster policies again: you can see these three new policies are now living in the cluster.

What we need to do now is create the Flux resources so that Flux can deploy the application into the cluster in a GitOps fashion. First we need to tell Flux which repository to monitor for the non-conformant resources, and then we also need to create the corresponding Kustomization to tell Flux what to do with it. Okay, this is done. Now, if we look at the fleet infra repository, which is the repository that hosts the configuration for Flux, we've been using the apps prod YAML file, which is this one. If I display this prod YAML file, you can see these two resources: the source, which is the GitRepository, and the Kustomization. The repository I've configured there is essentially a mirror of the changes that we made: when we changed the conformant prod manifests into non-conformant ones, this is exactly what this Git repository contains. We have modified this locally within our Flux configuration repository, and because Flux itself uses GitOps principles to configure itself, everything is done via Git, so we need to push the changes to our remote repo. Here you go. We can monitor the Flux Kustomization, and after a few seconds we should see a new Kustomization getting reconciled within the cluster and the application getting deployed. And what we're going to see, hopefully, is that the settings that have been set within the manifests are then overwritten by Kyverno.

So now let's verify the application configuration. For this, we can check the volume size for the MongoDB database, and you can see that we moved from 50 gig and one gig respectively for the data volume and the log volume to five gig, which is what has been enforced by the policy. If we also get the Ondat prod storage class, we see that encryption is true, the number of replicas is two, and the file system is now XFS, which is what the policies have enforced. And finally, let's check that our application is running. Our job is now completed.
We have those five replicas for the front end, and we should also see a LoadBalancer-type service there. And if we browse to that IP address on port 8080, we should see our application running. It is there, it's working: we have some random Marvel characters being displayed on the screen, as expected. I hope you enjoyed this demo; again, if you want to try it by yourself afterwards, the link will be in the next slide.

A few takeaways for today. First, we've seen that Kubernetes is now ready for hosting databases and running cloud-native data. The key is to make sure you can reach the right level of availability, scale, and performance, so the underlying storage solution is also very important to ensure you can run these stateful workloads with enterprise-grade features. We've also seen that GitOps and policy-as-code principles provide best-in-class paradigms to manage enterprise application lifecycles, and we've been testing Flux and Kyverno to that end. Finally, embrace these principles to enhance your platform security, facilitate collaboration between development teams, and ultimately experience faster innovation cycles.

And a couple of actions on your side: if you want to test the lab, you've got the link there, and if you want to learn more about Ondat, we also have other labs you can try out; the link is also displayed there. You can test Ondat in your cluster or on the SaaS portal, and also subscribe to our newsletter. And if you want to chat with the Ondat team or with myself, or if you have any questions about this session, you can join us on Slack. So again, thank you for making it through to the end. This was a webinar I really enjoyed doing for the CNCF, and I hope to see you in the next one.