Thank you, Scott, for the introduction. As Scott mentioned, our talk is about how we can use policies to build a secure and automated GitOps workflow. Hello, everyone. My name is Avni Sharma. I am currently a graduate student at Indiana University. I also interned as a product manager at Intuit on the Argo CD project, and I've been a software engineer at Red Hat. We also have Jim here, who is co-founder and CEO of Nirmata. He is also co-chair of the policy and multi-tenancy working groups, and he's a maintainer and co-creator of Kyverno. So this is the agenda. We will go through the GitOps trends, what the gaps are in current GitOps workflows, and how policies can help in minimizing those gaps. Then we will demonstrate that with an example demo workflow. So let's quickly walk through some of the important GitOps trends. There was a user survey conducted in 2022 by the Argo project, which showed that more than 80% of the respondents were using Argo CD in production, and more than 66% of respondents had been using it in production for more than six months. We also saw from the survey that advanced app patterns provided the impetus for ApplicationSets in the project, which help with multi-cluster deployments. We can also see from a snippet taken from that survey how the app-of-apps pattern has helped with multi-cluster deployments, and ApplicationSets are being widely adopted in the community. We've also seen from the survey that it's not only applications anymore: GitOps is going beyond applications, and we are increasingly using it for infrastructure management as well. So what doesn't work well? Security in Kubernetes itself is already complex and hard, and it becomes even harder with complex workflows like multi-cluster deployments. Self-service with guardrails is something that is not really cohesive today, and it doesn't work well.
So we want a model with guardrails, which helps prevent misconfigurations and standardize workflows. Automation across multiple projects and controllers is also tricky, and that is somewhere we can definitely improve. So these are some of the gaps that exist today, and now Jim will go over how we can use policies to minimize them. Thank you, Avni. So let's talk a bit about what policies can do to help with this. The way I think about policies, especially in the Kubernetes world, is as a contract between the different roles that end up sharing a cluster. You typically have your developers, who are deploying the applications; the security teams, who of course want to secure everything from the cluster configs to the workloads, pod security, et cetera; and then you have your operations team, which is managing all of the shared services and the cluster itself. So policies become this digital contract between the dev, sec, and ops roles. One of the main use cases we see for policies is preventing misconfigurations, and another is standardization and automation. And that's one of the other things: typically, when you think about policies, you think of them as restricting things or enforcing certain types of checks. But policies can also be used for automation. We've done this in other IT systems, in several other domains; why not do it with Kubernetes, especially with the powerful declarative configuration concepts we have? So in the demo, we'll use Kyverno, which is an admission controller, a policy engine designed for Kubernetes. With Kyverno, policies themselves are declarative resources like any other Kubernetes resource. So if you're comfortable writing a deployment or a pod or any other Kubernetes resource spec, Kyverno will look extremely familiar. There's no other language to learn, nothing else to step out to.
Kyverno runs inside the cluster as an admission controller, but you can also use it as a CLI, a command-line tool in your CI/CD pipelines, and it can also do background scans. So there's a lot of flexibility in how you apply Kyverno policies. In terms of the policy rule types themselves, you can use it for validation: verifying different configurations, either just in audit mode or by blocking certain configurations. You can use it to mutate configurations as they come into the cluster, or even generate brand-new resources based on triggers that you set up. Those triggers could be existing resources, or they could be new resources being created. A classic example is when a namespace is created: you might want to generate a default network policy and different role bindings, roles, et cetera, for that particular namespace. Also, in the last talk we looked at Cosign and how it can be used to sign and verify images. Kyverno has built-in integration for verification of image signatures as well as YAML manifests, integrating with Sigstore Cosign. That's a widely adopted and growing use case as well. So here's a quick example of a Kyverno policy. It's an extremely simple policy: it's just checking the image tag and making sure the latest tag is not allowed, so that your tags specify a version and you have an immutable tag. Here's a more complex policy. One question we often get is: Kyverno is nice and simple, it's so easy to get started with, but can you really do complex policies? And our answer is almost always: show us the use case, and we can write a policy for it. Kyverno can handle extremely complex policies which require API calls and other kinds of lookups. It can even fetch data from OCI registries, like signatures, layer information, et cetera. So everything you would want to do for Kubernetes guardrails, you can do.
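As a rough sketch, the "no latest tag" check described above looks something like this. This is illustrative, not the exact policy shown on the slide; the canonical version lives in the Kyverno policy library:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
spec:
  validationFailureAction: enforce   # use "audit" to report without blocking
  rules:
  - name: require-pinned-tag
    match:
      any:
      - resources:
          kinds:
          - Pod
    validate:
      message: "Using the mutable ':latest' image tag is not allowed."
      pattern:
        spec:
          containers:
          # every container image must carry a tag other than 'latest'
          - image: "!*:latest"
```

The same declarative pattern style extends to the more complex rules mentioned next; only the match and validate sections grow.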
In this example, we're making an API call and then applying various JMESPath expressions to implement some logic, in this case to block duplicate ingresses. So, one note before we proceed further. I mentioned that Kyverno can do validation and mutation as well as generation of resources. Mutate policies and GitOps do sometimes end up conflicting with each other, because with GitOps the whole idea is that Git is your source of truth and you want to make changes there. You can mutate certain elements, but then you have to make sure your GitOps controller excludes those fields so you don't get into a reconcile loop. With some of the newer features, like Flux's server-side apply (Argo CD is also adding that feature) and support for managed fields, policy engines can now claim ownership of certain fields within a resource manifest itself. So there are ways to work around this, but just be careful if you're adopting mutate policies. All right, let's take a look at a sample workflow, which we'll use for the demo itself. The goal we set out to solve when we were looking at this was to use a combination of different tools and see how policies can help. We wanted to offer secure self-service clusters to teams within an organization, within an enterprise. We wanted to use standard tools wherever possible, like kubectl and other command-line tools which are already familiar, and we wanted this to be completely automated end to end. So, some terminology we'll use during the demo. Let's assume you have a platform team that's managing all of this infrastructure, the policy sets, and the shared services. You have a cluster owner who's going to request a cluster and assume that role, so they can manage the lifecycle of the cluster, and they have full cluster-admin permissions. And then you have a management cluster and a tenant cluster.
So the management cluster is what you're going to use to provision new clusters using Cluster API. Tenant clusters are the clusters that your teams, the folks within the organization, will end up using. Looking at the scenarios, some of the challenges we immediately encountered as we tried to build up this demo: first of all, cluster config is complex. For folks who have looked at Cluster API, it's very declarative and very powerful, but there are gobs and gobs of configuration, which is not something you want to allow everybody to edit and use. So you want to standardize and automate how cluster configs are managed. CAPI, or Cluster API, also has a concept of templates now, and clusters which can leverage those templates. We're going to use those, and we're going to use policies to standardize the types of clusters somebody can request. The other challenge is that every tenant cluster, once it's spun up with its control plane and worker nodes, is not secure by itself. You need to harden that cluster: you need to put policy management on it, and maybe other security tools and add-ons. So how do you standardize the delivery of add-ons with each of these new clusters that get spun up by a user request? Another challenge is that the management cluster now becomes a shared resource. Of course, you could take the approach that nobody touches the management cluster, but we wanted to be a little bolder and say: let's share that cluster. Let's see if we can enable multi-tenancy on the management cluster, allowing multiple application teams to make requests without tripping over each other's tenant clusters. So that's where, again, we're using policies for multi-tenancy, with generated resources, enforced RBAC, and the other standard configs you'll find in Kubernetes.
And finally, one of the other challenges: every cluster owner, say if I request a new cluster, has cluster admin. In a vanilla Kubernetes deployment, there's nothing preventing the cluster owner from going and removing all the policies your platform team might have configured. So we wanted to prevent that, and we'll show how that can be done as well. You want to make sure that even with cluster admin, there are certain restrictions and guardrails in place. So with that context, let's dive into the demo itself, and Avni will lead us through that. Thanks, Jim. For the demo, there are certain prerequisites that we have established on the management cluster. We're going to use Argo CD as our continuous delivery tool; this is going to be our GitOps software agent. Then we have Cluster API, which we're going to use for creating and provisioning our tenant clusters. And as Jim mentioned, we're going to use Kyverno as the policy engine. All of this is already pre-installed in our management cluster. We've also defined two cluster owner roles, Nancy and Ned, who are going to request tenant clusters. So how are we going to request our tenant cluster? A user, Nancy, would create a namespace, let's say prefixed with "cluster". You can use any regex for that; we just wanted to distinguish a tenant cluster request from a normal namespace creation. So let me create a cluster real quick here. I'm going to create a namespace, cluster-one, as Nancy, and this is going to trigger the tenant cluster creation. Once the namespace is created, it kicks off this workflow; this is like a cluster-as-a-service concept. The policy on the management cluster will create the relevant CAPI templates, roles, and role bindings for the specified namespace, and Cluster API, or CAPI, provisions the tenant cluster, bringing up the worker nodes and the control plane.
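A Kyverno generate rule for this cluster-as-a-service trigger might look roughly like the following sketch. The `cluster-*` name match is from the demo's naming convention, but the `capi-templates` namespace and `tenant-cluster-template` name are assumptions for illustration:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: create-tenant-cluster
spec:
  rules:
  - name: clone-capi-cluster
    match:
      any:
      - resources:
          kinds:
          - Namespace
          names:
          - "cluster-*"   # only namespaces following the naming convention
    generate:
      apiVersion: cluster.x-k8s.io/v1beta1
      kind: Cluster
      # name the generated Cluster after the requesting namespace
      name: "{{request.object.metadata.name}}"
      namespace: "{{request.object.metadata.name}}"
      synchronize: true
      clone:
        # hypothetical location of a pre-approved CAPI cluster template
        namespace: capi-templates
        name: tenant-cluster-template
```

Similar generate rules can stamp out the roles and role bindings for the namespace, so the cluster owner gets scoped permissions automatically.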
Once our tenant cluster is ready, there is a policy that generates a secret to register that tenant cluster with Argo CD. After this step, the Argo CD ApplicationSets, which are already installed in the management cluster, take over: once the cluster is registered with Argo CD, an ApplicationSet will start applying the Calico controller to each tenant cluster via app sets. Then Argo CD will also apply the Kyverno controller to each of the tenant clusters via app sets. And once the Kyverno controller is installed and running, Argo CD will apply the policies to each tenant cluster with the help of ApplicationSets. To achieve all of this, we are using the cluster generator concept of the ApplicationSet. The ApplicationSet controller reads the secret, from which it knows which clusters are registered with Argo CD, and we can also target specific clusters with matching labels. For this, each secret has a label, cluster-type: tenant. Once the ApplicationSet controller sees that, it spawns the Calico controller, the Kyverno controller, and all the policies for the tenant cluster. All of this is sequential, and it is enabled by using sync waves, because we want the Calico controller, the CNI plugin, to be up and running first, then the Kyverno controller, and then the policies. So let me just quickly refresh the apps. Here on the UI, if we go under Clusters, it will show us that our tenant cluster was registered. It's taking a while to load right now, so I'll just show it through the CLI. Here we see that our tenant cluster has been provisioned; we have a worker node and the control plane up and running. Now let me switch to my tenant cluster-one context. The Kyverno pod is initializing, so let's wait for it to run. And here you can see that the tenant cluster-one that was spun up is showing as a listed cluster on the Argo CD UI.
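The cluster generator pattern described here can be sketched roughly like this. Only the `cluster-type: tenant` label selector comes from the demo; the repo URL, path, and sync-wave value are placeholder assumptions:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: ApplicationSet
metadata:
  name: tenant-kyverno
  namespace: argocd
spec:
  generators:
  - clusters:
      # only clusters registered with this label receive the add-on
      selector:
        matchLabels:
          cluster-type: tenant
  template:
    metadata:
      name: '{{name}}-kyverno'
      annotations:
        # ordering hint so the CNI app syncs before this one
        argocd.argoproj.io/sync-wave: "1"
    spec:
      project: default
      source:
        repoURL: https://github.com/example/tenant-addons  # placeholder repo
        targetRevision: main
        path: kyverno
      destination:
        server: '{{server}}'
        namespace: kyverno
      syncPolicy:
        automated:
          prune: true
        syncOptions:
        - CreateNamespace=true
```

One ApplicationSet per add-on (Calico, Kyverno, policies) with increasing sync-wave values gives the sequential rollout described above.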
I hope the screen is visible; I can also enlarge it. Okay, so now we can see that our Kyverno controller is up and running, thanks to ApplicationSets. Now let me try to edit a cluster policy, disallow-latest-tag. These cluster policies have been set by the platform team, and the intent is that nobody else is able to edit them. So when I try to, say, edit an image regex here, it should be disallowed. And here the webhook, the admission controller, kicks in and does not allow us to edit that cluster policy, which is managed by the platform team. Now let me try to run a test image which is not signed, so it's not going to pass validation by the policy engine. It's a basic nginx image, and when I try to run it, the policy engine tells us it's not allowed because the image is not signed. Next, let me try to run a test pod with an image that is signed. Even though it's signed, there were certain other policies that had to pass: we cannot spin up our deployment in the default namespace, the deployment requires certain labels like app.kubernetes.io, the pod requires readiness and liveness probes, and it requires resource requests and limits. So let me show a good deployment YAML. This one has everything needed to pass validation by the Kyverno policy engine: a signed image, all the requests and limits, and the liveness and readiness probes set. Now if I try to run this... we do remember that it needs to be in a namespace other than default, so let me create a namespace, test, and apply this good YAML there. And now we see that the creation of that deployment succeeded. The controller did its job: it validated the deployment against all the policies that had been added to the tenant cluster as a standard add-on. Now let me go back to the management cluster. We also mentioned that the management cluster itself is a shared resource.
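Blocking edits to platform-managed policies can itself be expressed as a Kyverno policy. A rough sketch, where the `platform-admin` cluster role exclusion is an assumption for illustration:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: protect-platform-policies
spec:
  validationFailureAction: enforce
  background: false   # only evaluate at admission time
  rules:
  - name: block-policy-changes
    match:
      any:
      - resources:
          kinds:
          - ClusterPolicy
    exclude:
      any:
      - clusterRoles:
        - platform-admin   # hypothetical role still allowed to manage policies
    validate:
      message: "Platform-managed policies may not be modified or deleted."
      deny:
        conditions:
          any:
          - key: "{{request.operation}}"
            operator: AnyIn
            value: [UPDATE, DELETE]
```

This is why even a user with cluster admin on the tenant cluster gets rejected by the admission webhook when editing the disallow-latest-tag policy.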
So we need some namespace-level tenancy as well. We also have a user, Ned, who has not created the tenant cluster, so he shouldn't be able to list, delete, or do anything with it. We can try that with kubectl get clusters as Ned, and we see that the policy on the management cluster says that this is not allowed for the user Ned. If we do it as Nancy, who actually requested the tenant cluster, it succeeds. We also want to illustrate creating a bad namespace, that is, requesting a cluster that doesn't conform to the configured standards and won't pass validation by the policy engine. Let's say we don't want to go crazy with worker nodes beyond three, and we really want to keep resources to a limit. For this example, we've specified that the control plane and worker nodes cannot exceed three each, but in this bad namespace example we're requesting more than three, and these sizes are provided as annotations. So if I try to apply this bad namespace, asking for a tenant cluster with something that's not valid, we get an error. Here you can see a validation error: we can only have a maximum of three control-plane and three worker nodes for this demo example. So we saw how we can achieve a policy-based GitOps workflow which enables a standardized, automated, and secure setup. There was no manual handoff, and it was all secure end to end. In the demo, we created a tenant CAPI cluster, which was provisioned based on configured standards. GitOps was used to configure the required add-ons and policies on the tenant cluster, cluster-one in our case. And policies were also deployed to the tenant cluster for workload security, best practices, image signing, and manifest signing. So we covered quite a lot in the demo, but of course there's still room to improve and do better.
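To recap the node-count guardrail from the bad-namespace example in policy form, a sketch might look like this. The annotation keys are hypothetical; the demo only tells us the sizes are passed as annotations and capped at three:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: limit-cluster-size
spec:
  validationFailureAction: enforce
  rules:
  - name: max-three-nodes
    match:
      any:
      - resources:
          kinds:
          - Namespace
          names:
          - "cluster-*"
    validate:
      message: "A maximum of 3 control-plane and 3 worker nodes is allowed."
      deny:
        conditions:
          any:
          # annotation keys below are placeholders for the demo's annotations
          - key: '{{ to_number(request.object.metadata.annotations."demo.example.io/worker-count" || ''0'') }}'
            operator: GreaterThan
            value: 3
          - key: '{{ to_number(request.object.metadata.annotations."demo.example.io/control-plane-count" || ''0'') }}'
            operator: GreaterThan
            value: 3
```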
So, a few things we would add if we were to put this anywhere near production use. Obviously, you could also use Git to provision some of the initial workflows. In this demo, we used kubectl to just create a namespace, but that could be done through a pull request. You also, ideally, might want separate roles and service accounts for the cluster admin and for Argo CD itself, which is also provisioning and managing things on the cluster. But we saw that policies can still be used even if both end up with cluster admin; you can still add additional restrictions through your policies. And finally, you probably want to protect your Calico, kube-system, and other namespaces through additional policies as well. Some key takeaways: policies are not just about enforcement and validation, although that's one of the key use cases. You can do end-to-end automation and use policies as the glue, as the handoff between different controllers. GitOps and policies work really nicely together; the combination is extremely powerful, as you saw in the demo. The one caveat is with mutate policies, and there are new features coming in both policy engines and GitOps controllers to help with that. The whole goal here is to get to secure self-service, and that's an interesting combination of words: security by itself is difficult, and self-service, yes, you can automate in many different ways, but putting the two together without things like policy engines and GitOps becomes extremely daunting and a very complex task. So if you want to check out Kyverno or, of course, Argo CD or any of the other tools, go to some of these links. All of these projects have booths Wednesday through Friday, so stop by if you have any other questions. Here's some info on the Kyverno project itself.
And certainly, if you want to get in touch with us or ask any questions or give feedback or anything else you would like to see, feel free to do that as well. Thank you.