My name is Peter. I work for Styra, and I'm the community advocate for the Open Policy Agent project.

My name is Rita Zhang. I'm a maintainer of the OPA Gatekeeper project.

Awesome. A quick show of hands: who here is already using OPA? A couple of people. How many people are using Gatekeeper specifically? A couple of people. Who's brand new to OPA and policy as code? Excellent, okay, new folks. That's why I asked the question: I wanted to understand where we should start. Luckily we have a few intro slides that talk about the project, the problems we're trying to solve, and the direction we're going. So with that, let's get started.

Today's agenda: we're going to do a little intro of the Open Policy Agent project, then we're going to talk about the updates that have come out over the last six months or so, and then I'll hand it over to Rita, who's going to do the same for the Gatekeeper-specific stuff, and we'll talk about how Gatekeeper has OPA and Rego under the hood.

All right, Open Policy Agent, quick introduction. OPA is an open source, general-purpose policy engine. It's a graduated project in the CNCF as of 2021. We are looking at the problem as a unified toolset and framework across your entire stack. That means you're going to be able to implement policy-as-code features with any of the tools and services that you are using in your stack. This works by decoupling the policy from your application logic, so that the decision point happens next to these services, not inside of them. We'll get into Rego in a little bit. First, a little bit about the community and numbers.
We've had over 300 contributors to the project. We have 85 integrations in our ecosystem already. These integrations are tools that people have built around OPA or on top of OPA, and they're in the ecosystem for community members to pick up. They're going to help you do specific things like working with service meshes, API gateways, and the other tools that you have. Community members have solved a lot of these problems, so it's very helpful to work on top of what other community members have already done. We have a good number of GitHub stars and an active Slack community, with about 6,400 members in the Slack, so getting questions answered is fairly easy. We have a good number of downloads. We also have some popular tools, such as Conftest for working with configuration data and Gatekeeper for working with Kubernetes as an admission controller, and some fun IDE plugins for VS Code and IntelliJ. If you're a Vim user, we have syntax highlighting and a bunch of stuff to make policy authoring nice and easy.

So how does it work? This is the policy decision model. As I stated, this is going to be a universal policy tool and framework. It takes all of the different languages that you want to write in, all the different infrastructure tools, and the different clouds you want to use, and it gives you one centralized policy tool to work with all of them. This is a very simple diagram that we have in our docs. It shows you that your service sits at the top and OPA sits next to it. You pass in a JSON object that comes out of your service: you take the information, which could be JWT tokens, user info, whatever you want, package it up as a JSON object, and hand that to OPA.
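As a rough illustration of packaging service data as a JSON object and handing it to OPA, here is a minimal Python sketch. It assumes OPA is running as a server on localhost:8181 and uses OPA's REST Data API (a POST to /v1/data/<policy path> with the document under an "input" key). The policy path, the field names in the input, and the helper function names are all hypothetical and were not shown in the talk.

```python
import json
import urllib.request

# Assumption: an OPA server listening locally on its default port.
OPA_URL = "http://localhost:8181"


def build_input(method, path, user, token=None):
    """Package what the service knows about a request as a plain dict."""
    doc = {"method": method, "path": path.strip("/").split("/"), "user": user}
    if token is not None:
        doc["token"] = token  # e.g. a JWT the policy can inspect
    return doc


def build_query(policy_path, input_doc):
    """Build (but do not send) a POST to OPA's Data API."""
    body = json.dumps({"input": input_doc}).encode("utf-8")
    return urllib.request.Request(
        url=f"{OPA_URL}/v1/data/{policy_path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_opa(policy_path, input_doc):
    """Send the query; OPA replies with a JSON object holding a "result" key."""
    with urllib.request.urlopen(build_query(policy_path, input_doc)) as resp:
        return json.load(resp).get("result")
```

A service would call something like ask_opa("httpapi/authz/allow", build_input("GET", "/finance/salary/alice", "bob")) and act on the returned decision; the service itself contains no policy logic.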
OPA is going to look at this information and return a decision. Very simply, this decision could be a yes/no boolean, but it could also contain more information. You could pass data in, like a JWT token, and pass back a security answer, so you're returning information as a JSON object as well. That returned decision comes from the input data: we take that and compare it to the policy that you've defined. The policy states the intent that you want, and it's written in Rego, which we'll cover in a second. Then there are any data objects that you want to connect to. You may be storing data from your LDAP server, your SSO, any sort of database information, or just stats or metrics you're collecting, and you can compare this information to the request that you're sending from the service. So now you have three parts to this decision model: your data, your policy describing what is going to happen, and the input data from your service. Together they give you the decision that you return to the service, so it knows how to behave given this request.

So, Rego. Rego is the language that comes with OPA. It's a purpose-built policy language: it was built specifically for defining how your policy should look for your services. We decided to build something from the ground up. We tried various things first. We tried Golang, we tried JavaScript, and we compared how it would look to define policy with existing tools. We found that the amount of infrastructure you have to build around those languages ends up making the trade-off not great.
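Going back to the three-part decision model for a moment (input, policy, data), it can be pictured with a small Python stand-in. This is only a sketch under stated assumptions: in OPA the policy would be written in Rego, and the data, field names, and rules below are invented for illustration.

```python
# Hypothetical external data, e.g. synced from LDAP or an SSO system.
DATA = {"managers": {"alice": ["bob", "carol"]}}


def policy(input_doc, data):
    """A stand-in for a policy: combines input and data into a decision.

    The decision is a JSON-style document, not just a boolean, so the
    service can also see why a request was refused.
    """
    user = input_doc["user"]
    owner = input_doc["path"][-1]
    reasons = []
    if input_doc["method"] != "GET":
        reasons.append("only GET is permitted")
    # A user may read their own record or the record of someone they manage.
    if user != owner and owner not in data["managers"].get(user, []):
        reasons.append(f"{user} may not read {owner}'s record")
    return {"allow": not reasons, "reasons": reasons}
```

The same input can produce a different decision when only the data changes (for example, when a reporting line is updated), which is the point of keeping data, policy, and input as separate parts.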
So we started from the ground up and built a language called Rego. It's a declarative policy language. Declarative just means that you state the intent of your policy and the outcome that you expect, and Rego figures out how to take the input and create that output. You don't have to take the imperative approach, where you go line by line and do the conversion. Instead you define what you want at the end and Rego figures out how to get there. If you're using HashiCorp's HCL, with Terraform or something, that is also a declarative language, so you can make the comparison there.

With Rego, a policy consists of rules: a policy is a set, or a package, of these rules, which may be layered on top of each other. You may have something like an allow rule, which is a very popular way to say "allow if some number of things are valid," so you do all of your checks. The opposite of that, which is very popular on the Gatekeeper side of things, is a deny list: you give all the reasons you want to deny an action, and then you create the error messages from that deny list. So you just stack up all the things that you want to check for and state how you want the end result to look.

All of this is well documented. We have pages and pages of docs for each specific thing, and you can check them out at openpolicyagent.org/docs. You can also give it a try at play.openpolicyagent.org. This is our playground. It lets you bring your own data: if you have a service that you're already using, you can take its data, use it as an input object, and then play around and create a couple of Rego
rules to see how you can transform that data into a decision. So that is a very good tool from our community, and it's one that we use heavily in the community for troubleshooting as well as for communicating things that we want to work on. If you're working on an interesting problem or have an interesting data set, you can drop all that information into the playground, create a share link, and hand that to another community member or team member, and it will show them how the coverage of your data and the policy looks in real time. It's an interactive tool that is very helpful for getting started.

So that's just a high-level intro. Now we're going to get into the project updates, the things that have changed. Our last update was in Valencia, so I'm going to cover a couple of the language changes from then that have not been heavily adopted yet, as well as the new things that have come out in the last six months, and Rita will cover the same thing for the Gatekeeper side.

As a declarative language, we try our best to make Rego as human-readable as possible. When you're declaring intent, it becomes crucial that your policies are human-readable. You want people to understand what is happening as they're reading and writing them; it makes the process much easier. So we've introduced some new keywords.
The first one here is "in". As you can see, this is how we were previously looping over a set of data, an array, or any sort of object. The previous way still works, so if you have existing policies you can still use them. That bracket at the end is how you loop through a list, an array, or a set. What we've done to make this more readable is that you can now just use the "in" keyword: if you're looking for something in this set, you can just drop in the "in" keyword.

Partnered with this, we have a way to create variables. What we used to do was take this value, assign it to a key, and then use those variables. What we do instead now is use the "some" keyword. "some" says that we're now working with a set of data: here are the keys and values, extract them from this object. You can see that this is a simpler way for someone who may not be a regular Rego writer to understand what is happening with that object and how we're using those variables.

We also introduced the "if" keyword. This one caused a little bit of confusion in the community, because people thought that we were introducing new logic. The "if" keyword is actually just a little bit of syntactic sugar. With the previous method, you can see that we are saying allow equals true if this body is true, and that's a little bit abstracted. Now you can streamline it: allow if true. It's just an easier way to make things nice and clean.

We also have the "contains" keyword now.
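For readers more at home in a general-purpose language, the behaviour of the "in", "some", and "if" keywords can be approximated in Python. This is only an analogy, not Rego and not how OPA evaluates rules: "in" on its own is a membership test, "some key, value in object" binds each key and value pair that satisfies the rest of the body, and "allow if ..." makes allow true when every statement in the body holds. The input shape and function names are made up for the example.

```python
def is_admin(input_doc):
    # Roughly: "admin" in input.user.roles  (a membership test)
    return "admin" in input_doc["user"]["roles"]


def privileged_ports(input_doc):
    # Roughly: some name, port in input.ports, followed by port < 1024.
    # Every (key, value) binding that satisfies the body contributes a name.
    return {name for name, port in input_doc["ports"].items() if port < 1024}


def allow(input_doc):
    # Roughly: allow if { <admin check>; <no privileged ports> }
    # Statements inside a rule body are ANDed together.
    return is_admin(input_doc) and not privileged_ports(input_doc)
```

The Python version has to spell out the iteration and the combination of checks; the declarative form only states the conditions and leaves the search for matching bindings to the engine.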
This partners with the "if" keyword. When you are writing a rule, one style is to create a list; you might create a list of deny reasons. This is called a partial rule, because it doesn't have one exact answer. It collects a bunch of answers and puts them into an object for you, and then you can reference that object as part of the data transformation for the decision. So you might have a list of deny rules, and this is what we use the "contains" keyword for. Previously, you can see, we were collecting a bunch of errors from the input and shoving them into a list called errors. Now you can use the "contains" keyword: errors contains err if that exists, and that puts it into the partial set.

The last one I have here is the "every" keyword. Once again, we have the old style here. We are doing a comprehension, which looks a bit complicated because it is a little bit complicated. You see the containers, and then we have this string here. The expression at the end is essentially looping through all of the containers, selecting the ones that start with that Acme Corp string, and putting them into the initial variable that we created there. Now, instead, we can just loop through that list, create a variable called container, and do that check.

Another update is metadata. We needed a way to have a little bit more than comments. The metadata here lets you comment and annotate your rules, and then you're able to actually call this information as well. So it's more than just making a comment that tells your developers what you expect to happen here.
Now you can actually have things in there that are callable. On the second slide here, you can see that we are using annotation.custom.severity, so we're actually pulling the annotation metadata into our policy itself, and you can reuse this information for various rules inside of your policies. This is not a full list of the annotations available, just a small subset, so check out the docs if you want to see the other annotations that we have.

The next one up is a GraphQL built-in. You could have done this previously, but there were a lot more manual steps to work with the data. Now you're able to use our GraphQL built-in to walk the structure tree of that GraphQL query. Things like this pop up due to community demand. I think we have about 150 built-ins so far for different tools that community members are using. So if you're using something that we have not built built-ins for and it's hard to work with that data, let us know or write a PR, and we can figure out whether there's a need for it in the community and then build the tools around it, so that we can make your jobs as policy authors much easier.

Other new things, some newer than others. The newest one on this list is the ND built-in cache, the non-deterministic built-in cache. This is going to allow you to have better decision logs. The decision log is the log of decisions, obviously: every decision that OPA has made to say "I'm allowing this thing" or "I'm denying this thing," and why. One thing that we noticed, though, is that built-ins that have non-deterministic values (we don't know what the value of random is going to be, for example) did not get logged very well, and that made troubleshooting very hard. So now we have this ND built-in cache.
We're actually taking this information and letting you know that we did not know what the outcome was going to be beforehand, and giving you a little bit more information, so that when you are troubleshooting you can trace that back and say, "Why did this HTTP call produce the outcome that it did?"

Delta bundles are the ability to take your bundle of policy and data and update it in place without having to replace the entire bundle.

Disk storage is for when you are working with a large data set. Typically, OPA stores all of your data in memory. This makes it very performant, very quick to come to decisions, but it comes with a memory overhead. If you're working with hundreds of megabytes of policy data, you'll end up with gigs of memory; I think it's something like a 20x cost. Disk storage allows you to store a lot of that on disk and access it from there. It's not as performant as in memory, but if you are working with a large data set, it's very helpful.

Function mocking was a community request. People have written a lot of custom functions, and they want to be able to replicate that ability for testing.

Strict mode is for all of the new things that we've come out with. Strict mode will say, "Hey, you are not using the latest OPA features. We're going to deny this."
This is helpful on the upgrade path for OPA. As we introduce new keywords and new features, you want to make sure that your policies are using the latest OPA stuff, so you turn on strict mode and it'll tell you that your policies are no longer going to be valid with the latest things. And then we added OCI bundle registry support, for downloading or storing bundles with the OCI protocol. With that, I'm going to hand it over to Rita to do a little Gatekeeper intro and project update.

Thank you. That was great. Thanks for all the updates in OPA; as a Rego user, I know those definitely help Rego authors.

So my name's Rita, and I'm a maintainer on the OPA Gatekeeper project. This is KubeCon, after all, so you're probably wondering, "How do I use this for my Kubernetes clusters?" Well, we have a project called OPA Gatekeeper. It's a customizable Kubernetes admission webhook that helps your organization enforce policies and strengthen governance.

What does that mean? In any large organization you have different personas. You have people who write the policies, people who enforce the policies, the cluster operators, and also people who are just trying to deploy their workloads from CI/CD all the way to production. This is a solution that can help all these different personas and create a separation of concerns. Your policy authors can write the Rego and think about the logic for validating that the resources getting deployed to production actually comply with your enterprise's or your company's policies. And as a developer, you then just need to focus on: am I following the right
best practices? Things like container limits, unique ingress hostnames, stuff like that. So again, this separates the folks who are writing the policies, the people who are enforcing them and rolling them out in your enterprise, and the people who are just trying to follow best practices and make sure that their stuff is going to run well in production.

I'm going to focus this talk mostly on updates, but I'll try my best, for the newer folks, to talk a little bit about the different functionalities. So, real quick: Gatekeeper does validation as well as mutation, and it also comes with audit. Think about trying to introduce policies into your organization. Before you even roll a policy out in enforcement mode, how do you get an audit to see how the workloads are doing with that policy introduced? Am I going to break people? This is why Gatekeeper has different enforcement actions. When you introduce a policy, it starts with audit. Then you turn on warning, so you warn people when they deploy their workloads. And last, when you feel it's safe, that's when you turn enforcement on as deny, and that's how the validation blocks the deployment.

So, the updates that are coming in Gatekeeper. We upgraded to OPA 0.44 to leverage the latest and greatest, and it's now compatible with Kubernetes 1.25. In the past it worked with PSPs, but now it works with PSA, so it's definitely great to move to that standard. Also, mutation was in beta and now it's stable.
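Stepping back to the rollout sequence described above (audit first, then warn, then deny), the effect of each enforcement action on an admission request can be modelled in a few lines of Python. This is a simplified sketch, not Gatekeeper's implementation: the response shape and function name are hypothetical, and the action names "dryrun", "warn", and "deny" are taken from how the modes are referred to in this talk.

```python
def admission_response(violations, enforcement_action="deny"):
    """Model how one constraint's violations affect an admission request.

    "dryrun": violations are only recorded (they show up in audit results).
    "warn":   the request is admitted, but the user sees the messages.
    "deny":   the request is rejected. This is the default in the sketch,
              matching the default described in the talk.
    """
    response = {"allowed": True, "warnings": [], "recorded": list(violations)}
    if not violations:
        return response
    if enforcement_action == "warn":
        response["warnings"] = list(violations)
    elif enforcement_action == "deny":
        response["allowed"] = False
    elif enforcement_action != "dryrun":
        raise ValueError(f"unknown enforcement action: {enforcement_action}")
    return response
```

Moving a policy through the three modes changes only the enforcement action, never the policy logic, which is what makes a gradual rollout safe.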
So if you're using Gatekeeper v3.10, you can rest assured that mutation is going to work for your production workloads. We also introduced three new features that are currently in alpha: validation of workload resources, external data, and the gator CLI, which you can use as part of your CI/CD pipelines. I'll talk about these three in more detail in a bit. You also have the ability to validate subresources. Then metrics: getting violations as part of your metrics solution, whether that's the audit last run end time metric or support for OpenCensus as well as Stackdriver. And last but not least, we allow a wildcard at the start and end for excluding namespaces. I know a lot of users have been asking for this, to make sure that certain namespaces are excluded from the enforced policies.

We definitely care about performance a lot, so with every Gatekeeper release we're constantly trying to reduce the audit and webhook memory footprint, and, for scaling, to make sure that constraint violation limits are set and memory usage is reduced. We also introduced defaulting the max serving threads to GOMAXPROCS, to address CPU starvation and memory scaling; that's in the latest Gatekeeper release. We also reduced the request duration when policies run against replicated data. Again, think about when you want to check uniqueness in your cluster: this is where you need to replicate data into the OPA cache, and we've done a lot to improve the request duration so that every request takes less time to get validated. We also reduced CPU run time, again for when you need to add data to the OPA storage.

All right, I'm really excited to talk about this new feature. We got a lot of users on GitHub asking:
"Hey, it's great that I can see my pod is failing certain validation, for example that a container doesn't have limits, but I'd really like to see that at the deployment level." This new feature, called validation of workload resources, does basically that. It rejects a workload resource when the resource it creates would violate a constraint. Think about your Deployments, ReplicaSets, or Jobs: you get early detection that the resources created from the pod templates are going to fail. With this you can also see it in your audit results. Think about Deployments where the replica count is zero. You can still get a violation that says your Deployment is failing, and that when the replicas increase above zero, this is where the pods are going to fail. Again, early detection.

Here's an example. As you can see, this is a new custom resource called ExpansionTemplate, and this is where you specify that, for these pod templates, the generated resource is a Pod. So for the Deployment that you deploy, it generates the pod, and that is the resource that we want to validate. When you apply this Deployment, which as you can see doesn't have any container limits, it fails, and it tells you exactly why it's failing: the admission webhook denied this request because the container resource limit is not provided.

So again, early detection, and this is a feature that the community has been asking for for a while.

The next feature I'm really, really excited to talk about is called external data. In any organization, you probably have data that doesn't actually sit on the cluster. So think of scenarios where
you need your policy to communicate with systems outside of the cluster. This is much more secure than the http.send function in Rego, and out of the box you can batch your requests natively. Some scenarios you can think of: LDAP, as in, how do you validate that the user making the request is actually in the allowed list? Another one that is very dear to our hearts: I want to be able to check for CVE vulnerabilities before the container image actually starts running in the cluster. Again, where does that data live? It probably lives in some scanner tool, a solution that sits outside of the cluster. And another one is image signatures. Nowadays everybody is signing their images, but how do you validate that an image is signed by your organization? That is what the external data feature allows us to do: it's an ability to extend the policy engine to talk to external sources.

This also works with mutation. Imagine talking to another system, maybe LDAP, that says, "Hey, here's the owner of my pod." Gatekeeper gets that information and mutates the resource on the request.

All right, we love early detection, and nothing says early detection like making sure you get that validation as part of your CI/CD pipeline.
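Before moving on to the CLI, the external data idea can be sketched in Python: the policy engine hands a batch of keys (here, image references) to a provider that knows about some system outside the cluster, and each answer either clears the key or carries an error that becomes a violation. Everything below is an assumption made for illustration, including the function names, the registry names, and the shape of the results; it is not Gatekeeper's actual provider protocol.

```python
# Hypothetical stand-in for an external system, e.g. a signature or CVE service.
SIGNED_IMAGES = {"registry.example.com/app:1.2.3"}


def provider(keys):
    """Answer one batched lookup: every key gets either a value or an error."""
    results = []
    for key in keys:
        if key in SIGNED_IMAGES:
            results.append({"key": key, "value": "signed", "error": ""})
        else:
            results.append({"key": key, "value": "", "error": "no valid signature"})
    return results


def violations(pod, lookup=provider):
    """Collect one message per container image that the provider rejects."""
    images = [c["image"] for c in pod["spec"]["containers"]]
    # All images go out in a single call rather than one request per image.
    return [f'{r["key"]}: {r["error"]}' for r in lookup(images) if r["error"]]
```

Because the lookup is passed in as a parameter, the same policy logic can be tested against a fake provider and run in production against a real one.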
So the gator CLI is relatively new and still alpha. In a lot of ways it's similar to Conftest, if you've used that before. gator verify is the way to unit test your policies: test the Rego, the constraint templates, and the constraints before you roll them out to your users. gator test is a way for us to shift left, where you can validate, say, your Helm charts or your Kubernetes deployments. You want to be able to test them before they are even pushed to your clusters, and you can do this by using the CLI right in your CI/CD. And gator expand basically helps us mock the resources that come from workload resources, like, again, Deployment to Pod. And for the brew users out there, you can now brew install Gator. Yay.

All right, here's an example of Gator. As you can see, I have some resources that I want to verify, and all I need to do is run gator verify. As a result, my test suite basically says, "Hey, looks like your allowed repo is not correct," because these YAMLs have containers pulling images from another container registry. And here, as you can see under expected violations, you got one violation.
Why? Because you're using an image tag that is latest, and no one should be using latest. There are more demos, and again, this is to test Kubernetes resources in CI/CD, so I'm just going to skip ahead.

Also really exciting: we now have a Gatekeeper policy website that you can search through. These are community-maintained Gatekeeper policies that a lot of different companies have created and learned from over the past few years. It's a really great website, so definitely check it out. We're also on Artifact Hub, so check that out if you want to implement this in your organization. And we're on GitHub and Slack, so if you have feedback or issues, or just want to talk to the maintainers, please reach out.

Awesome, thank you. All right, we have, I think, just a couple of minutes for questions. That hand went up quick.

Thank you. The validation of workload resources is what we're looking for; it's really helpful. A couple of questions. First, can the validation happen at runtime, not only at deployment time? Second, can I have soft errors versus hard errors? Do we have that capability? And third, you mentioned a dashboard; I want to learn a little bit more about how the dashboard works. Thank you.

You want to take the first one?

Yeah. Gatekeeper is an admission webhook, so validation only happens at admission time, and audit is what runs continuously in the cluster. So not runtime. The second question was, I think... can you expand on that?

Yeah, I think by soft error you mean you don't want the error to actually block anything; you just want it to... yeah. That's available in regular OPA, and I think it's available in Gatekeeper as well.

Yeah. As I mentioned earlier, Gatekeeper comes with at least three enforcement actions. So you define your policy, which we call a constraint.
It's just a CR, a custom resource, and you can literally say, "Hey, I want this to be in warning mode," or "I want this to be an audit," or "I want this to be in deny mode." By default, if you don't say anything, it's deny. But we added warn and dry run specifically for that purpose, because we know people want to test and get data from running clusters before they introduce a policy. We all know policies can be very dangerous. What was the third question? Was there a third part to that?

Oh, you mentioned a dashboard.

Do you mean the website? If you've been following Gatekeeper, it has a library of policies on GitHub, at open-policy-agent/gatekeeper-library, and this is where you can also contribute and add your own policies to share with your friends. But we created a website specifically to make searching a little easier, and as you can see, it comes with the template and also shows how you can install and deploy it to your cluster.

Awesome, thanks.

For workload resource validation, does that support expanding nested resources? For example, if I have a CronJob that produces a Job that produces a Pod, can it do multiple levels of expansion?

Yes. As you can see, DaemonSet and ReplicaSet are here as well, so it depends on how you write that. As long as the parent is in this list, it should be able to expand.

Okay, thank you.

Next question over here.

One of your slides talked about the OpenCensus and Stackdriver exporter integration. I just want to know the use case, because how do you enforce anything once the metric has already told you what has happened?
So metrics work like any metrics: you want to get the violations as metrics, or you could use the metrics to see how Gatekeeper is running, whether it's healthy or whatever. In essence, OpenCensus and Stackdriver are just options for exporting that data. Does that help?

Is it possible to use OPA to validate against external data? You build a policy, and you may want to validate against something that's an external lookup or something like that. Is that possible?

Yes, that's exactly what this feature does. Some of the examples I'd point to are things like image signatures: you have the signature in your registry, but how do you validate it at admission time? I can speak to Gatekeeper first. We built this feature such that any external data provider can extend it. The extension points are not in-tree: anybody can write an external plug-in, and that plug-in does the reaching out, communicates, grabs the data, and returns it to Gatekeeper. Gatekeeper then gives it to OPA, and OPA evaluates that response as part of the Rego. So anyone can write plug-ins, and in fact we have a reference implementation for anyone who wants to write one, and some sample external data plug-ins that call out to Cosign as well as Notary v2 to verify signatures. I highly, highly recommend everybody do this in production.

And on the vanilla OPA side, if it's a very simple call-out, we have http.send, if you just need one quick piece of information. We don't recommend it if you're doing a ton of lookups this way; you'll get terrible network latency. If you do need a more tightly coupled integration, we have the OPA SDK, so you can write something, or you can build it in.
We have Go: you can compile it with Go and write it right in as part of your application to connect the two things together.

Is the audit action available to non-Gatekeeper, regular vanilla OPA?

Yeah. We do have a set of audit capabilities built into the Go OPA binary, so you can do a lot of those things, depending on exactly how you want to audit and what you want to do.

Question back there.

That was a great talk, thank you. I went to a talk earlier in this conference about CEL and how some of this functionality is being put into the control plane itself. So what are your thoughts? I guess [inaudible] has contributed to this project, which obviously was a big part of that. How do you think about this moving forward?

I've been waiting for this question. I think the good news is, we all recognize that Kubernetes admission webhooks have problems of their own, and the community is working very, very hard to try to solve those problems. Just so you know, I'm in SIG Auth, and I reviewed the CEL KEP, and there are people in the Gatekeeper community who are also part of that feature that's being added to Kubernetes. So rest assured, we have folks who are actively working on it, thinking about how to integrate these different solutions together to make the policy management experience better for the community. Thanks for the question, and please give us your feedback.

Do we have time for more questions? Okay, I think that's it. Thank you for all the questions, and thank you for coming. We really appreciate it. Thank you.