Hello, welcome. This is the Open Policy Agent project update and intro. Today we've got Max with us. Hello, I'm Max. I'm an engineer at Google and one of the maintainers of the Gatekeeper project. You can look me up on the OPA Slack, or I'm on Twitter, though not very often. Alright, and then myself, Patrick, OPA maintainer, engineer at Styra, on Slack all the time and happy to help. Same deal on Twitter: you can tweet at me, but I may not tweet back. Alright, so let's talk for a second about what OPA is. It's an open source CNCF project, founded by Styra in 2016. In 2018 we donated it to the CNCF as a sandbox project. Since then it's gone into incubating, and the slide here says "graduated?" with a question mark. As of recording this, it's up for a vote by the TOC, but it is in the process of becoming graduated. Yeah, fingers crossed, there are no blocking issues, so hopefully by the time people are watching this, good things will have happened. The project has tons of contributors from a pretty broad spectrum. We have the usual suspects at Google and Microsoft, plus a bunch of startups as well as end users. We have people like Chef and some others that use OPA, came back, contributed, and help out the community. The users are also across the spectrum, everywhere from the big cloud giants down to startups, financial institutions, and other more traditional enterprises. So what is OPA? It's a general-purpose policy engine. That sounds super vague, and the reason is that it sort of is, in the sense that it's not domain specific. It's not just for solving authentication and authorization, and it's not just for image scanning. It's a policy language, plus the runtime and the tools to actually evaluate policies and make decisions. At a high level, you query OPA, and the input to that query can be any JSON: if you can JSON-serialize it, it can go into OPA as input.
The decision is the same deal. Oftentimes you're going to see a Boolean, an allow yes/no kind of thing, but that doesn't have to be the case. It could be a set of reasons why a request wasn't allowed, or a set of mutations saying what labels to add to a pod, something like that. The service here in this picture: I think all of these logos have an existing integration, but it really could be anything. Your custom service can use a plug-in or some external authorization tool; some of them actually have a plug-in mechanism, and some have a separate REST API or something that needs to be implemented. But it doesn't really matter for OPA: as long as you can translate it into some JSON payload, it'll work. So again, digging a little deeper, you're getting the declarative policy language, Rego. It's essentially a query language built to let you reason about structured data, asking questions like: can a user do some action on some resource? Or: is this object missing a field, or is some value or property of that object valid or invalid? There are a ton of built-in functions in the language, and it provides the ability to do context-aware things. The main selling point there is that you don't just have your policy and your input; you can have external data too. So if you're trying to answer a question about some API request, you can also provide external JSON data, such as the entitlements for users or group permissions, say by hooking it up to your Active Directory or something. It's very flexible. Again, JSON in, JSON out. The other piece of having this custom language is performance optimization. The query language is declarative, so you write your policy and you don't have to worry about the performance so much; that problem is on the OPA developers.
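To make the external-data idea concrete, here is a minimal Rego sketch. The data paths `group_membership` and `group_permissions` are hypothetical names chosen for illustration, not part of any particular integration:

```rego
package authz

# Deny by default.
default allow = false

# Allow when some group the user belongs to (from external data
# loaded into OPA, e.g. synced from a directory service) grants
# the requested action.
allow {
    group := data.group_membership[input.user][_]
    data.group_permissions[group][_] == input.action
}
```

With input like `{"user": "alice", "action": "read"}` and external data mapping alice into a group whose permissions include "read", `allow` evaluates to true; otherwise the default false wins.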
The nice part, in addition to optimizations, is that the way policies are written and structured allows us to do a lot of cool tricks for reasoning about a policy: not only how fast it is, but also how correct it is. For the last piece here, you might be wondering: we've been talking about features, but what actually is it? OPA is typically used either as a Go library or as a sidecar or host-level daemon. OPA comes as a binary that you can run; most people just use the Docker image. The Go library gives you the lowest-latency approach. Then there's the higher-level stack of features, the, I guess we'll call them enterprise features. You have a set of things for managing the OPA agents: their status, decision and audit logs, packaging and delivery for the policies themselves, as well as dynamic configuration and things like that. The last piece, to round it out, is the tooling to write policies. The CLI that the OPA binary provides has a bunch of tools built in, but there's also a pretty fully fledged ecosystem now of IDE and editor integrations that give you really nice ways to write your policies, but also test and profile them, to do regular code development with them. One very common tool is the Rego Playground, so I'm going to take a second to show that to you. This is the Rego Playground; it's an open tool, anybody can use it. It's hard to see the URL, but it's play.openpolicyagent.org, and if you go to the OPA website there's a link to it. This gives you a really good way to just play around with the language, try things out, and explore not only how to write policies but how to troubleshoot them and reason about them. Let's walk through what, for some of you, will be your first policy.
Essentially, each policy file defines a few things. Each has to have a package, which defines where in the virtual document structure that we query against the policy lives. We also have rules, which come in a few types. The overall syntax is basically: define some rule name and a value for it, with conditions. So let's look at something a little more complicated than this hello-world one: the role-based access control example. What we're saying is, we're declaring a virtual document. Under that document tree, under app.rbac, we have a rule. By default, this rule allow is false; there are no conditions, that's it, by default it's false. Our next statement says allow is true, and omitting the "= true" is an equivalent statement, if the user is an admin. user_is_admin here is another rule. We can see that user_is_admin is true if, for some i, with i being a variable, data.user_roles[input.user][i] equals "admin", this whole big long selector here. What this is doing is querying the document structure and saying: find a value for the variable i inside this path I'm defining. The rule is true if everything in the body is true. So data: we're referencing into our external data, which we have here in this middle section. data.user_roles: inside user_roles, we're going to look for input.user. If we jump up to our input tab, we can see that there is indeed a user, with the value "alice". So if we go back down, we look for user_roles.alice. Okay, so far so good. That thing is an array, and at the zeroth element there is a string that says "admin". So we've reasoned our way through what OPA would evaluate: yes, there exists an i where the value is equivalent to the string "admin". We can verify that by doing something cool here in the playground and evaluating the selection.
What this is telling us is that OPA crunched the numbers and found a value for the variable, i equals 0, that made this expression true. And really, this concept is what everything in OPA is built on: OPA is trying to find variable bindings that make expressions true. If they exist, the result is defined, and it bubbles back up to the top. That's the case for the input we're looking at: the user is alice, they are an admin, therefore allow is going to be true. And if we evaluate allow, we will see that yes, indeed, it is true. So there you go, that was our crash course; everybody is now an expert. It's your first taste of the language, but it really is kind of that simple: you're defining things that essentially set up some condition over structured data, and OPA tries to see if it's true. Let's switch back to the slides. Okay. Now that we've seen a little bit of the policies themselves, you might be wondering where you'd use these things. There was a slide earlier that showed a service box with some logos. The important thing is to step back and take a look: we're at KubeCon, attending KubeCon, and in these ecosystems everybody's got CI/CD pipelines, deployment management, container orchestration, and cloud management. All their infrastructure is code, everything is automated, there's tooling and configs, and everything goes through APIs, right? Your applications themselves are all built on microservices, most of which are using some kind of service mesh, API gateway, or proxy. And then you also have the databases backing a lot of these things. Every single one of these spots has policy, and every single one of these spots has an integration point for OPA, whether that's an existing plugin or a place where you can add your own custom call-out to do the authorization or policy enforcement.
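The playground walkthrough above corresponds to a policy along these lines. This mirrors the playground's role-based access control example, so treat the exact formatting as approximate:

```rego
package app.rbac

# Deny by default.
default allow = false

# allow is true if the user is an admin.
allow {
    user_is_admin
}

# user_is_admin is true if "admin" appears anywhere in the
# user's role list under data.user_roles.
user_is_admin {
    some i
    data.user_roles[input.user][i] == "admin"
}
```

With input `{"user": "alice"}` and data `{"user_roles": {"alice": ["admin"]}}`, OPA binds `i` to 0, `user_is_admin` is true, and `allow` evaluates to true.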
OPA is there as the common, unified way of defining these policies. That's really the primary objective; that's what it was built for, what we set out to do. And I think the ecosystem page shows that we're succeeding. There's a pretty broad spectrum of integrations here: everything from object storage authorization to, I think, something like ten API gateways that people have plugins for, as well as admission control, Terraform plan validation, config validation, and all kinds of other stuff. I recommend you check this out. The short link should work; if it doesn't, just go to the OPA website and click on the ecosystem link, it's pretty easy to find. And we do keep this up to date, so it should reflect the latest and greatest stuff. Speaking of integrations and the latest and greatest, one that we want to call out is Conftest. I think a lot of people have probably heard of it; if you haven't, go check it out. There's a quote here straight from the README, but essentially it gives you a slightly opinionated way to write OPA policies and validate structured data. In this case, that structured data is config files, and whether those config files are Kubernetes manifests or a Dockerfile or whatever doesn't really matter. Back to that earlier point: if you can turn it into a JSON object, you can write policy over it. And that's what Conftest does. It's super popular, and there's a reason for that; check it out. One thing I do want to call out, which is really cool: Conftest is now an official OPA project. They've moved in underneath the Open Policy Agent organization, and there are ongoing efforts to further integrate functionality between core OPA and Conftest. Definitely something to look at.
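As a sketch of how Conftest is typically used: policies live in a `policy/` directory under the `main` package, and `deny` rules produce failure messages. The specific rule below is illustrative, not from the Conftest docs; it flags a Kubernetes Deployment that doesn't force a non-root user:

```rego
package main

# Flag Deployments that don't set runAsNonRoot in their
# pod-level security context.
deny[msg] {
    input.kind == "Deployment"
    not input.spec.template.spec.securityContext.runAsNonRoot
    msg := "Deployments must set securityContext.runAsNonRoot: true"
}
```

Running `conftest test deployment.yaml` then parses the YAML into `input` and reports the message for each `deny` rule that fires.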
So let's stop talking about config files, because we all love YAML. Probably the most popular use case for OPA right now is as an admission controller for Kubernetes. This has been around in a few different forms, but needless to say, OPA and Kubernetes go together like peanut butter and jelly; everybody should be using it, and if you don't have this in your cluster yet, go do it. What I'm going to do here is transition over to Max, who's going to talk to us about Gatekeeper. Okay, let's take a look at Gatekeeper. First of all, what is Gatekeeper? Our tagline is that Gatekeeper is a customizable Kubernetes admission webhook that helps enforce policies and strengthen governance. If we look at the history of the project, we can see how it's an outgrowth of earlier Styra efforts. Styra originally released the kube-mgmt sidecar back in 2017, which basically watched ConfigMaps as a way to import OPA policy. Then Microsoft, in December of 2018, wrote a project called kubernetes-policy-controller that built on top of that and added mutation. A little after that, Microsoft, Google, Styra, and other collaborators started working together on a new reimagining of what this could look like. That was called Gatekeeper, and that's what we've been working on since. As for what Gatekeeper actually is in practice: it's both a webhook that provides admission control for Kubernetes and an audit system that provides ongoing monitoring of your Kubernetes configuration, to make sure it hasn't drifted away from whatever policy you're attempting to comply with. If we look at the webhook piece and how it works, we see there's a big blue rectangle here called the API server, and a user with an inbound request, say they run kubectl apply or something similar.
It's going to hit the API server, which, as part of its admission control process, sends an admission review request to Gatekeeper, which then forwards a query to OPA to produce a response. In terms of how we figure out what response to give: Gatekeeper establishes a watch on the policy configuration objects, which for Gatekeeper are constraints and constraint templates, as well as any other objects in the cluster that you may be watching in order to have referential policies. That's something like "make sure the value of this label is unique on a per-namespace basis," that kind of thing. Based on those cached resources and the policies that have been loaded, OPA formulates a response that Gatekeeper passes back to the API server, which renders its verdict back to the user. So, if we think about the core features Gatekeeper offers users, the highlights are these. It's a validating admission controller, which we just covered. It provides audit functionality, the periodic checking of your cluster config to prevent drift. It maintains OPA's ability to provide context-aware, referential policies, but with a specific focus on being compatible with the KRM conventions of Kubernetes. On the policy side of things, the constraint and constraint template conventions give users easily shareable policy bundles that are designed to be non-interactive: once they load those constraint templates, they're able to create policy simply using constraint objects, which should be just straight YAML configuration, like "these are the specific labels I want to make sure exist," as a common example. Another highlight feature of Gatekeeper is dry run.
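The required-labels example just mentioned looks roughly like the following template/constraint pair. This is adapted from the Gatekeeper docs and community library and trimmed for brevity, so treat it as a sketch rather than a drop-in manifest:

```yaml
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8srequiredlabels
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredLabels
      validation:
        openAPIV3Schema:
          properties:
            labels:
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredlabels
        violation[{"msg": msg}] {
          provided := {l | input.review.object.metadata.labels[l]}
          required := {l | l := input.parameters.labels[_]}
          missing := required - provided
          count(missing) > 0
          msg := sprintf("you must provide labels: %v", [missing])
        }
---
# The constraint itself is plain YAML: require an "owner" label
# on every namespace.
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredLabels
metadata:
  name: ns-must-have-owner
spec:
  # enforcementAction: dryrun   # audit and report without blocking
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Namespace"]
  parameters:
    labels: ["owner"]
```

The template carries the Rego once; each constraint instantiating it is pure configuration, which is what makes these bundles easy to share.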
We would like users to be able to test their policies on a provisional basis and get information about whether adding a new policy would actually break the cluster. So this is kind of a pretend enforcement that gives you feedback, through Prometheus metrics, logs, or audit results, that if you were to actually enforce this policy, these requests would have been rejected, or this particular resource already in your cluster would no longer be in compliance. If we look at what's new since the last time we talked to you, which was the virtual KubeCon EU, there are a few changes. The main one is that we are now stable. We have support for Helm 3 for those users who deploy using Helm charts. In terms of efficiency, we have added pprof profiling, which gives users the ability to see where Gatekeeper is spending its CPU cycles and where RAM usage is going, and we've overhauled the memory usage profile for audit, hopefully lowering the maximum possible memory usage when you start scaling to large clusters. We are exploring an events framework where certain audit incidents or rejections are reported to the user as Kubernetes events. We would definitely like feedback on the granularity that users would like to see for events. Other than that, a lot of process improvements have been happening on the back end. Backwards compatibility guidelines are a big one, for users who want confidence that compatibility is maintained across upgrades. The release management doc will give you some idea of our release cadence. And we have split the constraint template library, a repository of constraint templates developed by the community that users can draw on to form their own policy, into its own repository. All right, let's take a look at what Gatekeeper can do.
So in this shell I have open here, we have a connection to a Kubernetes cluster that is running Gatekeeper. You can see that if we list the namespaces: gatekeeper-system exists, which is a good indicator that Gatekeeper is at least installed. Let's look at the constraint templates we have. We have just the one, in this case called k8spspprivilegedcontainer. This is an analog constraint to pod security policy: what it does is disallow the use of privileged containers inside the cluster. So let's take a look at what constraints we have installed. Excuse me, we should be able to get constraints here; that will change in a soon-to-be-released version. We can see that I have instantiated this constraint template as a constraint. Let's see what its status is: kubectl get the psp-privileged-container constraint, and let's actually get the full YAML. We can see in the status of this resource that we have four total violations, and they're listed below. Gatekeeper by default lists a maximum of 20 violations per constraint. Each of these is going to be a pod, right, because this is a pod-security-policy-equivalent constraint. We can see I'm running a version of OPA that is requesting privileges, kube-proxy is running with higher privileges than we would like, and a couple of other kube-system pods. I'm actually going to ignore the kube-system pods for now, because it's kube-system. Let's look at this OPA pod: kubectl get pods. "opa", it was called; you can see the name here, and it's in the default namespace. This gets the full config for it, and we can see, somewhere in here, that I've set securityContext privileged equal to true, which is why the constraint is unhappy. So that's good: this is what we would expect, audit is telling us that our constraints are being violated. So what happens if I try to apply a bad object like that? Let's see.
I've helpfully called this directory bad-resources, and we want new-opa-privileged. We can see this is basically an OPA container again, with privileged set to true. If we kubectl apply this new privileged OPA resource, it gets rejected, because we are requesting more privileges than the policy wants us to have. Can we remediate this? Let's edit the resource and just get rid of the lines where we request extra privileges, apply it here, and the OPA pod is created. And to see this experimental new events feature, if I run kubectl get events on the gatekeeper-system namespace, we find a few interesting events. We have audit violations showing up, and we also should have, yes, right down here, a failed admission, about 46 seconds old, from when I tried to apply the invalid OPA pod. This is just a sample of some of the ways you can use Gatekeeper to define and enforce policy on your cluster. As far as Gatekeeper's current status as a project: as I said before, we are now stable, but we are always looking for contributors. If you want to try us out and report any issues or general feedback, we would love to hear from you. If you have interesting user stories that you would like to see supported, those are definitely welcome; raising issues on the GitHub project is a great way to provide them. And of course, if you're a developer looking to contribute to Gatekeeper, we would love to work with you. Please join us. Looking toward the future of where Gatekeeper is going: these top three items are things we're actively focusing on. The biggest, I think, is mutation, which is pretty heavily requested. We have an active design out for that, which is pending approval as of this recording; hopefully by the time you see this, we'll be in development on an alpha for mutation. Another area of emphasis is developer tooling for writing constraint templates, making that process a little easier.
And as I said, the violations-as-Kubernetes-events feature is in alpha, so we're going to continue to think about what that story looks like going forward. These other items are things we're interested in and have our eye on; the top three are where we're currently directing our effort. Please join us. You can contribute to Open Policy Agent by visiting the GitHub repo at github.com/open-policy-agent/opa, or you can look at Gatekeeper at github.com/open-policy-agent/gatekeeper. Join us on Slack, or however you're comfortable communicating. We'd love to hear from you.