I'm Anders, and I'm very happy to be here. Good to see you all made it here. The topic of today's talk is going to be Open Policy Agent. Anyone used OPA in the past? Yeah, quite a few. So there might be a lot of things that you know already. It's going to be an introduction, but we might touch on some more advanced topics as well.
First off, I'm Anders. I work as a developer advocate at Styra, the inventors of the OPA project and still the maintainers of it as well. I have a pretty long background in software development, where I used to work primarily in identity systems. So the kind of systems where you authenticate: multi-factor authentication, single sign-on, solutions like that. And what got me interested in OPA was that when you're in identity and authentication, you tend to shy away from authorization. That's the concern of somebody else. And eventually, I got interested: what if that was my concern? How would I tackle that? I used to work for a large corporation where we integrated OPA in a big microservice environment, a highly distributed environment, about 700 microservices or so. So that's how I got involved in the OPA project. And yeah, I guess I had so much fun that eventually I landed at Styra. When I'm not coding or working with OPA, my interests are cooking, food, football. And if you want to reach out later, it's just my first name and my last name at most of these social channels.
So to start off, what's the challenge, or what problem is it that we're trying to solve with OPA? It's basically this. We're trying to manage policy in increasingly distributed, complex, and heterogeneous systems. And when we say policy, what we basically mean is a set of rules. And our application stacks are quite complex. If you go to any larger organization, chances are pretty good you'll find a few of these logos. If it's a really large organization, you might even find all of them.
And of course, your applications will have to live somewhere. They're gonna have to run somewhere. And your infrastructure is gonna have to run somewhere as well, as well as your data. And of course, it's not like OPA invented the concept of rules. That's been there all along. But the challenge here is basically that all of these systems have their own way of managing policy, or managing rules. And what we found is that this doesn't really scale when you start to add more and more of these logos and when you start to have these 700 microservices. How do you know which rules are deployed to any particular system at any particular point in time? So the goal of OPA is to unify policy enforcement across the whole stack. That's basically what OPA does.
So if that's what OPA does, what is OPA? It's a general-purpose policy engine. We'll go into detail on the general-purpose bit later, but it kind of has to be general purpose in order to encompass all these diverse technologies. So it provides a unified toolset and a framework for working with policy across the stack. It decouples policy from application logic. That's a key concern for us. The way you can think of it is like how you decouple storage from your application and move that into a database. You can think of OPA as doing the same, but for policy and your rules. So the rules no longer live inside of your application, but they are properly decoupled. So you can manage your rules, test your rules, and so on, independently from your business logic. Policies are written in a declarative language called Rego, which I'll give you a little demo of later. One thing to note is that OPA deals with the decision-making of policies. So based on the policies you have, the rules you have, OPA will provide you a decision. OPA does not enforce decisions, though.
That's still up to your applications, and that's obviously highly context-specific, based on what layer you are at in the stack and so on. So it could be anything from sending back a forbidden response, or it could be adding a log entry to the audit log, or, I don't know, contacting the authorities, depending on the severity.
So, a general-purpose policy engine. We have use cases pretty much everywhere, but some of the most popular ones include admission control in Kubernetes, microservice or app authorization, infrastructure authorization or infrastructure policies, data source filtering, CI/CD pipelines, and so on. So basically, anywhere you have rules, OPA is a good fit. Again, it's an open-source project with a pretty vibrant community. We have over 300 contributors, and 80 or more major integrations with different products like Kafka or Kubernetes, data sources and whatnot. We're used by over 1,000 other GitHub projects, so dependents. 7,000 GitHub stars, 6,300 Slack users, 130 million downloads, et cetera. So it's a fairly established project by now. OPA itself is the core project, but the ecosystem also includes things like Conftest for running policies on files, OPA Gatekeeper for Kubernetes admission control, and there are editor integrations for VS Code, IntelliJ and so on. But of course, OPA is not just a hobbyist open-source project. It's actually used by some of the largest organizations in the world. And as a little summary of this up until now, a quote from Kelsey Hightower: the Open Policy Agent project is super dope. I finally have a framework that helps me translate written security policies into executable code for every layer of the stack. That's basically what OPA provides.
So if that's what OPA is, how does it work? The first thing to keep in mind is this: the policy decision model. It's how OPA can integrate with all these services and all these different technologies.
And most of these technologies didn't have OPA in mind when they were created. So we kind of had to shoehorn in OPA somehow. So how does that work? The decision model works like this. We have a service, and we have a pretty broad definition of service. It can basically be anything that services a request from a user or another service. It could be a Kafka broker, it could be a microservice, it could be a Linux PAM module or whatnot. So when that request comes in to the service, rather than making a decision on our own, we'd rather forward that on to OPA. So we ask OPA for a decision instead. And we do that via a policy query. A policy query is basically just any kind of JSON value. Some common things that you'd normally include would be the name of the user, what endpoint, maybe the request method, things like that. Things that have meaning in this context. And based on the policy that we've written, and optionally external data, OPA makes a decision and sends back a policy decision to the service. And that response is also just JSON. So basically any technology that can read and write JSON can talk to OPA, given that there's some way of plugging that in. So that's the decision model.
As for deployment, OPA is a self-contained, lightweight binary. I think it's like 50 megabytes or so. And again, we're working with distributed systems. So the common deployment model is not that you'd run one gigantic OPA server, but rather that you deploy OPA inside of your services, or as close as possible to your services. One of the benefits of that is of course that you can distribute whatever policy and data is needed only for that particular service, but also that you keep the latency as low as possible. So yeah, in a Kubernetes context, that would normally be a sidecar deployment.
You have OPA running inside of the same pod as your application. There are many ways of doing it, but ideally as close to your service as possible. OPA is normally deployed as a server, so it's queried over its REST API. And inside of the same pod, that would just be a call to localhost, basically. So it's normally sub-millisecond latency overhead. It's very fast. But there are also a few other deployment options. There's a Go library for Go applications. You can integrate it with Envoy and Istio-based applications and other service meshes. You can compile your policy to WebAssembly, and more. So there are really a lot of ways to deploy OPA, which is also key when you need to integrate with all these technologies.
The policies themselves, again, are written in a declarative language called Rego. And by declarative, we mean that rather than saying exactly how you want something evaluated or how you want something done, you just say what you want done. And then it's up to OPA to translate that into instructions. Just like a real-world policy, an OPA policy is just a number of rules. So one of the goals is to take a real-world rule and write something as close to that as possible. Rules commonly return true or false: is the user allowed or not? That's a Boolean decision. But again, any value that's valid JSON is also a valid response, or a valid decision. OPA provides a unit test framework, so you can work with testing of your policies and rules, detached from your business logic or application state. It's a well-documented project, and there's a Rego Playground to try out policy authoring without even having to download OPA.
So at this point, let's see if we can get a demo going here. I'm just gonna see. So, just to show the basics of policy authoring.
So the first thing I'm gonna do here, I'm gonna create a policy file. Let's call it policy.rego. Is it big enough, or should I make it bigger? So the first thing we're gonna do is add a package. This is similar to a module or a namespace from other languages, just a name really. So we're just gonna call this policy. And now for the anatomy of a policy. Again, we said a policy is a set of rules, so that's what we're gonna add. In this case, we're gonna add a rule which is called allow, which makes sense for a Boolean rule. Are we allowed or not? But allow doesn't have a meaning to OPA. OPA doesn't have any keywords like that. So allow or deny are just names that make sense to us. So we're just gonna say allow is equal to true. And here's the thing with rules: they're basically conditional assignments. In a normal language, you'd say something like, if something, something, something, then assign true to allow. But since the conditions are the actual rules, OPA kind of inverts that. So rather than saying, if this and this and that is true, then do that, we say, do this if all of these conditions are met. So we're gonna add a rule body here. And if all of the conditions or assertions inside of the rule body are true, allow is gonna be assigned true. So if we do something like one is equal to one, we know that to be true. And then we say opa eval, we give it the policy file there, data.policy.allow, and we can make it pretty as well. We can see that that was indeed true. Is it big enough? Nope. That's better. Okay, let's try it again. So opa eval, data.policy.allow, that's the rule we want here. Yeah, sorry. Yeah, it looks better. It's gonna cut out there, something like that. Okay, so the allow rule is true because one is obviously equal to one. And if we do something which is not true and we evaluate that, it's just empty.
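A minimal sketch of what policy.rego might look like at this point in the demo. The `import rego.v1` line and the `if` keyword are assumptions about the OPA version in use: recent releases expect them, while older releases also accept the same rule written without `if`.

```rego
package policy

import rego.v1

# A rule is a conditional assignment: allow is assigned true
# only if every expression in the body holds.
allow := true if {
    1 == 1
}
```

Assuming the file is saved as policy.rego, it could be evaluated with `opa eval --data policy.rego --format pretty data.policy.allow`. Changing the body to something that does not hold, such as `1 == 2`, leaves the rule undefined rather than false.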
There's nothing there, meaning the rule is undefined. So if we do want to ensure that there's always some value returned, we might say that by default, the allow rule is equal to false. That's a pretty good default for any authorization system, isn't it? And if we were to say, like, one is equal to one, two is equal to two and so on, OPA is gonna evaluate line by line. If the first line is true, it's gonna continue on to the next. If that one is true, it's gonna continue on to the next. So if each of the conditions in the body is met, the whole rule evaluates. And if one of them is not true, evaluation stops and we fall back to the default value. So it's kind of an inverted if condition.
Now of course, these examples are just silly. In a real example, we're gonna provide a policy query, right? So we're gonna simulate something like that here. I'm just gonna create a file called input.json, and it's gonna look something like this. We might have a user, that seems reasonable. We have a name, we're gonna say Anders, and the user might have some roles. We could just say this is a developer. And we might have a request. If this was a microservice or something, that request might have a method, GET here, and a path, or rather path components. So it's gonna say users. This could be a string as well, where we just have it slash-separated, but I find it easier to work with an array of values. Okay, so now we have a request coming into our service. We have a user here in the request, a name of the user, and we have some roles. We also have a request method. So once we have an input, we can reference any value from that input by just saying input: input.user.roles, for example, if you wanna reference the roles. And here we might try some iteration. We could do something like that. We're not interested in the actual key here, or the ordinal.
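A sketch of where the demo stands at this point: a default value plus one rule that iterates over the user's roles. The input document in the comment follows the description in the talk, but the exact spelling of its values (for example the lowercase name) is an assumption, as is the `import rego.v1` line.

```rego
package policy

import rego.v1

# Assumed input.json, following the description in the talk:
# {
#   "user": {"name": "anders", "roles": ["developer"]},
#   "request": {"method": "GET", "path": ["users"]}
# }

# Without a default, allow is undefined whenever no rule body succeeds.
default allow := false

# input.user.roles[_] iterates over the roles. The underscore means
# we do not care about the index, only about the values.
allow if {
    input.user.roles[_] == "admin"
}
```

With the input above, no role equals "admin", so the rule body fails and the decision falls back to the default.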
So we're just gonna say, if we iterate over all the roles and one of them is equal to admin, then allow is gonna be true. That seems like a sensible rule. Now, inside of the rules, remember how we checked all the conditions. They are ANDed together. They all need to be true. What if we wanted to add more conditions here? We might, for example, have some public endpoints where we don't want to require any roles, at least not for reading. So that would be an OR condition, right? And the way we do that in Rego is just to add another rule. We could say that allow is equal to true either if you're an admin, or the other. So all of these allow rules are gonna be evaluated. And if one of them is true, the final evaluation is also true. So we might wanna say, if input.request.method is equal to GET, and input.request.path, we could just say the first path component, if that's equal to users, I think it was, right? So everyone should be able to read from the users endpoint, or to read users. That makes sense as well. You don't need any particular role for doing that, and we can allow that for anyone. So if we try it out, okay, we get false because we did not provide the input data here. So let's try that. We see that, yeah, indeed, that's true, because we were sending a GET request to the users endpoint. If we try something else here, I don't know, sending one to the admin endpoint, we are no longer allowed. So that's basically how policy evaluation works. Some simple assertions based on the input you provide to OPA.
And I think we have time for one more example. We said that anyone can read users. That's fine, but if you wanna modify a user, it must be your own user. So it must correspond to the user name in the request. So we could say that allow is equal to true if input.request.method is equal to PUT, and input.request.path.
And in this case, we're gonna say the first path component should be users and the second one should be the user from the request. So input.user.name. So with this rule, we should be able to modify. Change this to a PUT request. So users, Anders. We should be allowed to do this. Try it out. And we are. And if we now change the path, and try to modify Jane, we should no longer be allowed, and we aren't. So that's a very brief introduction to Rego, and how you can build your policies from these small rules, and how they're all combined into something bigger. And of course, this is very simplified. There's a whole lot more to Rego than this in terms of iteration and built-in functions. There are over 170 built-in functions, for if you wanna ensure things like all email addresses must end with acmecorp.com or whatnot. But the basic principles are all contained in here. Rules are built from simple assertions, and in this case we had a Boolean rule which returns true, but this could be one, or it could be yes. Again, any value that is valid JSON is also a valid decision.
All right, so let's hop back here. Any questions on Rego or policy authoring? Yeah, so the question was, should the input always be in JSON format? And yes, OPA supports JSON and YAML by default. So if you have any other format, you're advised to convert to that before submitting it to OPA. There are some projects around OPA, like Conftest, which support a whole lot more file formats, but OPA, the core engine, is JSON or YAML. So it needs to be structured data. Okay, so, yeah? Could you define the rule set at runtime if you have this running as a service? Oh yeah, for sure. So one of the ideas of having decoupled your policy from your application state is that the policies can be updated independently of the life cycle of your applications.
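Pulling the demo together, the finished policy with all three rules might look like the sketch below. The method strings and the lowercase user name are assumptions about how the input was spelled on screen, and `import rego.v1` is an assumption about the OPA version.

```rego
package policy

import rego.v1

default allow := false

# Rule 1: admins are allowed.
allow if {
    input.user.roles[_] == "admin"
}

# Rule 2: anyone may read from the users endpoint.
allow if {
    input.request.method == "GET"
    input.request.path[0] == "users"
}

# Rule 3: users may modify only their own user resource.
allow if {
    input.request.method == "PUT"
    input.request.path[0] == "users"
    input.request.path[1] == input.user.name
}
```

Expressions inside one body are ANDed together, while the separate allow rules act as an OR: the decision is true if any one of them succeeds. Assuming the file names from the demo, a query could be run with `opa eval --data policy.rego --input input.json --format pretty data.policy.allow`.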
So if you have 700 microservices and you wanna add another role, or you wanna add another rule here, it's kinda inconvenient having to redeploy 700 microservices. So that's one of the ideas. And I'll get into that a bit later, on the management features. But yeah, OPA can be configured to periodically pull down configuration, or policy and data, from remote endpoints, for example.
Okay, so with some basic knowledge of Rego, I figure we can take a look at one of the more common use cases, which is Kubernetes admission control. And before we dive into OPA and how that works, we can have a little reminder of how the Kubernetes API works. So basically, whenever you do something like kubectl apply with your application or a deployment or whatever, that request passes a series of modules. Any request needs to be authenticated, needs to be authorized, and it will also pass the admission controllers, which can either mutate the request on its way in, or just validate that this looks all right. These modules are chainable, so you can have as many authorizers or admission controllers as you want. And of course, Kubernetes comes with a whole bunch of built-in modules for all of these, like RBAC for authorization, and a whole bunch of admission controllers as well. But for OPA, this is really what we're interested in, because we wanna extend Kubernetes with our own rules. And again, if you paid attention, this is basically the policy decision model, right? You have something, a service, that takes a request, and rather than making a decision of its own, it's passing that on to OPA and has OPA make that decision. So we don't really do authentication with OPA, but authorization, and mutation or validation, is what we do here. So I'm gonna zoom in on the validating admission controller. And the reason I chose that is it's by far the most popular module to extend.
And the reason that is so is that it allows us to build these kind of policy-based guardrails around our clusters. So some policies that are commonly enforced would be rules that say, like, you can pull down any image you want, but it must be from the company registry. Some organizational things, like any resource deployed must have a tag or an annotation, a label, which says who deployed it, the team name or the cost center and so on. So it's definitely not just about security. These kinds of rules and policies can be about anything, really. Ingress host and path uniqueness, that's another one. And that's an interesting one, because it requires some knowledge that's not just hard-coded in policy. You actually need to know what other things have been deployed previously, so there's no conflict when we deploy this. TLS for your service configurations. You might wanna deny certain attributes, like host path volume mounts. You might wanna enforce some limits on resource allocation, pod security policies and, yeah, basically anything you can think of. So if there's any rule you might wanna have, OPA is your friend.
And just to provide a very simplified example. If you remember from the demo, we had an input which was just a JSON file. This is exactly what we'll get from the Kubernetes API server. In this case, it's gonna be in the form of an admission review object. So we just have some JSON data here, where we see, okay, these are the containers which are about to be deployed. We have a simple policy in the middle. It says, if there's no cost center under input.request.object.metadata.labels, then: every resource must have a cost center. In this case, we return a string, so that the admin actually knows. It's not very useful if we were to just say deny here, because how would you know what to fix?
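A rough sketch of the kind of admission policy described here. The package name, the exact label key `costcenter`, and the wording of the message are assumptions for illustration; the transcript only tells us that the rule checks for a cost center label on the object in the admission review and returns a string.

```rego
package kubernetes.admission

import rego.v1

# deny is a set of messages. An empty set means nothing was violated.
# Returning a message instead of a bare true/false tells the person
# deploying what needs to be fixed.
deny contains msg if {
    # True when the label is missing from the submitted object.
    not input.request.object.metadata.labels.costcenter
    msg := "Every resource must have a costcenter label"
}
```

Here `input` would be the admission review object sent by the Kubernetes API server, so `input.request.object` is the resource about to be created.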
So in this case, we just say every resource must have a cost center label, and that error message is gonna be propagated back to the user who made the request, so that they can remediate that.
Authorization is another popular, or common, use case. And I'm not gonna go into all of the details around authorization. It's such a complicated and big topic. But just to provide a bit of an overview of what an OPA deployment might look like in that context. Remember the different deployment models: in an ideal scenario, we'd keep OPA as close to our services as possible, and not just have one OPA server that they would all go and query. So in this case, we have deployed OPA in all these individual services. We have an identity system, which is also decoupled, and those have been properly decoupled for the last 20 years or so. It's authorization that has, up until now, been still very much hard-coded into our application logic. So we have an identity system. We know who the user is. That user would commonly be represented using a token. This might be a JSON Web Token or something, which is then passed along in all these requests. So we know who the user is. We know the request method, we know whatever details we need to make an informed decision. And when you see this, you might ask, why can't we just put OPA in the first service? And that's basically how we did authorization in the past. We used to put that kind of decision in a gateway, or somewhere where you'd authorize the request, and then we'd just let it through. The zero trust security model says that that's not a good idea. Because of course, once you've made it past that first step, it's basically an open highway. You can do anything, because the request is just assumed to have been authorized somewhere.
And once that assumption breaks, that's a pretty bad place to be. And another problem with that model is of course that one of these services might run a cron job, which would then be behind the gateway, and would then also not be authorized. So the zero trust security model kind of mandates that you do authorization in all these locations.
In this case, we have a control plane as well. The arrows here do not mean that OPA is gonna reach out as a synchronous action. This is what you were asking about before. So OPA does not reach out to this control plane when a policy is evaluated. It rather goes there periodically to fetch new policy and new data and whatever else it might need. And with a control plane, you get more things, like decision logging. So any decision OPA ever made is sent back to that control plane, where you can then later follow up. How many violations did we have in our cluster? Our OPAs, are they all healthy? Are they all up and running? And so on. And yes, Styra, the company I work for, we provide such a control plane. But basically, in order to just serve policies, you can just set up an nginx or an S3 bucket or something to get started. So you can go from very simple models to more advanced.
All right, so that was an introduction to OPA, and if you think this sounds interesting, some advice for getting started. I think starting small is a good thing. So similar to what we did in the demo, just try and write a few simple policies and rules, and build experience from that. Again, the OPA docs are a great resource. So get a feel for all the basics of the Rego language, and all the built-in functions available. And then, with some knowledge of the basics, I think, start to consider possible applications near to you. So if you have a microservice or something, start there and try it out. And again, you don't need to rewrite your whole application to use OPA.
You could just start with a single endpoint, maybe a single role. Maybe have the admin check go via OPA, and have OPA decide: is this user an admin or not? And from there, you can scale up. I mentioned the management capabilities, things like logging, providing policies and data via bundle servers and so on. And that's obviously the Styra DAS, which is the commercial control plane from Styra. And I should say there's a free edition as well, which you can all try out. There's the Styra Academy, which is another good learning resource. And there's the OPA Slack community, where there are a lot of friendly faces that are happy to help you on your OPA journey.
All right, any questions? Yeah, that's a good question. I have seen a few community projects for working with that. But I think in general, it's not gonna be at the layer of the firewall, but rather more at these kind of API gateways, where you can work with HTTP or JSON and not just more low-level rules. But yeah, if they have some pluggable system for writing rules, I'm sure OPA could fit right in.
So it seems like when you move to OPA, you move from a system where authorization is a distributed problem, and you move into a system where a whole lot of rules are coming into one system. Do you find that gets really complex, and how do you manage it? Yeah, so the question was, when you move to OPA, you move from a distributed system to a unified system, and the kind of complexity that comes with that. I think it's basically the other way around. When you do have your rules distributed in all these different systems, and they are written in different programming languages and so on, it's very hard to know what rules are deployed at any time without going out and asking, especially if they're all managed by different teams. And that's pretty much how I got into OPA. We were looking to solve authorization for a big company.
We had, I think, eight or nine programming languages running in our clusters. And if we were to just say, these are the rules we want in all our systems, we kind of knew that if we were just to go out to all these teams, they would all implement it differently. And probably not on purpose. It's just that there are so many ambiguities, or things that aren't really clear in how to interpret rules and so on. So we needed one way where we could say, here are the rules. They're all in one place. And then those rules are distributed to all these components, but they have a single place where they live, and they're all defined using a common language.
That final rule file starts to get very large and complex in a big system. How would you go about managing it? Are there tools to help you, you know, validate it? Oh yeah, yeah, sure. So the rules themselves don't have to live in a single file. You can have as many files and modules as you want, and you can use imports and so on. You can have common libraries. So there are many ways to work with OPA and Rego for larger and more complex systems that will help you.
Your policy files, do you treat them like code, being in Git, and have them distributed via pipelines? Yeah, I think that's gonna be a little bit different depending on where you go. But I think that's one of the good things, or one of the premises, of treating policy as code: of course, you can treat Rego as you would any other code. So you have version control, you have things like testing, code reviews. So any change to policy must be reviewed by two members of the security team, or whatnot. So it's another benefit of this approach. All right, I think that's it then. So thank you.