Thank you for sharing your time today with me. Today we are going to talk about the Open Policy Agent, and specifically how the Open Policy Agent lets you enforce dynamic policy in microservice environments. A bit about me: I am Ash Narkar. I am a software engineer at Styra, I am one of the maintainers of OPA, and I'm excited to be here today. So let's get started. We will first look at OPA's community. Then we will talk about OPA itself: what OPA is, its features, and its use cases. And then we will do a deep dive into a microservice API authorization use case.
So OPA, which is the Open Policy Agent, was started at Styra in 2016, and the goal of the project has been to unify policy enforcement across the stack. One of the earliest adopters of OPA was Netflix, and they've been using OPA for authorization for their gRPC and HTTP APIs. There are a bunch of other companies, like Medallia, Chef, Pinterest, and Intuit, who are using OPA for a variety of use cases, like API authorization, RBAC, ABAC, admission control, risk management, and so on. OPA is a CNCF project, and it's currently at the incubation level. We have more than 100 contributors on GitHub. It has a healthy Slack community of more than 2,500 members, and it has more than 3,500 stars on GitHub, which shows a lot of people are adopting and starting to use OPA. OPA is integrated with more than 20 of the most popular open source projects out there, and we will see some of them later.
So what is OPA? OPA is a general-purpose policy engine which you can use to offload your policy decisions. The idea here is to decouple policy enforcement from policy decision-making, so that your services can offload policy decisions to OPA by executing a query. Let's try to understand this concept using this figure. Imagine you have a service, and this can be any service at all. It can be your Kubernetes API server, it can be your custom service, or it can be your Kafka, anything at all.
Whenever your service gets a request, your service is going to ask OPA whether this request is allowed or not by executing a query. This query can contain information like the method, the path, the user, basically any JSON data. What OPA does is evaluate this query based on the policies and the data that are loaded into it, make a decision, and send that decision back to your service, where it gets enforced. This decision, again, can be any JSON value. It can be a Boolean, it can be a composite value, and so on. So you can see from this picture that we are trying to decouple the policy enforcement, which happens at your service, from the policy decision-making, which happens on the OPA side.
So let's look at some of OPA's features. At the core of OPA is a high-level declarative language called Rego. Rego allows you to write policy decisions which are more than Boolean, more than allow/deny or true/false. So for example, you can ask a question: what fields is a user allowed to see? And you can write a policy which returns those fields. Or you can ask a question: what servers can this workload be deployed on? And OPA can return you the list of servers on which the workload can be deployed. So you can see it's more than just yes/no, true/false, allow/deny. It's much more than that. You can return objects as policy decisions, basically any JSON value.
OPA is written in Go, and it's designed to be as lightweight as possible. You can deploy OPA as a sidecar or a host-level daemon, or you can integrate OPA into your Go application as a library. Typically, OPA and your application will reside on the same machine, and this is done so that you get high availability and low latency. You can think of OPA as a host-local cache for your policy decisions. OPA does not have any runtime dependencies.
So if it has to make a decision, OPA does not have to go out and talk to an external service or an external database to make that decision. You can extend OPA to do that, but that's completely optional. OPA does allow you to pull down policies and data from your external service by using its management APIs. It also allows you to upload status about what's going on with the OPA agent itself to an external service. And most importantly, it allows you to upload decision logs. What actually happened? How did OPA make that decision? What was the input to the decision? It allows you to upload these decisions to an external service where they can be analyzed offline and used for other kinds of analysis.
And finally, along with the core policy engine, OPA provides you a rich set of tooling which you can use to build, test, and debug your policies. It gives you a unit test framework which you can use to test your policies before actually putting them in production. It allows you to profile your policies to see where they are going wrong or what part of the policy is slow, and it gives you tracing and profiling tools to do that. It also has integrations with IDEs like VS Code, and there's one coming up with IntelliJ, which are going to help you author your policies. So these were some of OPA's features: a declarative language, multiple deployment models, management APIs for control and visibility, and a rich tooling set.
So like I mentioned before, OPA is integrated with more than 20 open source projects out there, and this is just a snapshot of those projects. One of the hottest use cases for OPA is admission control in Kubernetes. If you are familiar with admission control, it's essentially a piece of code that intercepts requests to the Kubernetes API server before that request is actually persisted in etcd. So using OPA as an admission controller, you can write really fine-grained policies.
For example, a container should not use the latest tag in the image. Or if your container is pulling down images from Docker Hub, you probably want to avoid that. Or if your containers or your deployments are not specifying any CPU or memory limits, you can also stop that at admission time using OPA, before that workload actually gets deployed inside your cluster.
Another integration is OPA with Linux PAM, where you can perform fine-grained authorization over SSH and sudo using the Pluggable Authentication Modules. OPA is also integrated with Kafka, Elasticsearch, SQLite, and MinIO. In the case of Kafka, there are some topics which have high fan-out, and you probably don't want just anyone to write to such a topic because it's going to be consumed by many consumers. So if you want to control who can write to such high fan-out topics, you can use OPA to enforce those kinds of policies in Kafka. And finally, OPA is integrated with a bunch of service mesh projects like Istio, Linkerd, and Envoy for API authorization, and we will look at this particular use case in detail a bit later. The cool thing I need to point out here is that you can use any of these integrations out of the box without having to write a single line of code. So you can take these integrations and start enforcing policies today using OPA.
So now let's go to how OPA actually works. We've seen this figure before. You have your service, and it gets a request. It asks OPA for a policy decision. OPA makes a decision based on the policy and data it has access to and sends it back to your service for enforcement. So let's take a sample policy in English and try to write that in Rego and enforce it with OPA. The policy in English says that employees can read their own salary and the salary of anyone they manage. So let's see how we can take this policy and write some Rego.
So like I said before, when your service asks OPA for a policy decision, it has to send some information in the input. Imagine for this example that the service sends the request method, the request path, and the user who's making the request. One thing I want to emphasize here is that OPA does not do authentication. It's not trying to solve the problem of whether Bob is who he says he is. It's trying to solve the problem of what Bob is allowed to do. With that being said, let's try to see how we can enforce this policy, which says employees can read their own salary and the salary of anyone they manage.
To do this, I'm going to use the Rego Playground. The Rego Playground is this awesome online tool which you all can use to experiment with your Rego policies, test them out, and share them. So this is what the Rego Playground looks like. It's available at play.openpolicyagent.org. You can see that there is good syntax highlighting for your Rego code, which makes it easy to read your code and helps you debug it as well. The way you read this policy is that the rule allow is true if input.method is GET, and input.path is salary followed by the employee ID, and input.user is the employee ID. The cool thing about this policy is that the employee ID variable on line seven and line eight will be bound to a value from the input.
So let's give a sample input to this policy and see how it works. Here in the input pane, I have some sample input for this policy, wherein my method is GET, I have a path of salary, Bob, and I have a user called Bob. So the question we are trying to ask OPA is: is Bob allowed to see his own salary? And if I evaluate this now, the result is true, which means OPA says Bob is allowed to see his own salary. What happened here is that the input method was GET, the employee ID variable got bound to Bob, and on line eight, your user is Bob and the employee ID is Bob.
So all the statements on lines six, seven, and eight were true. As a result, the allow rule was true, and that's what you finally get. So let's try with a different user. Now let's say Alice is curious about Bob's salary, so let's see what happens if Alice tries to see Bob's salary. The question we are asking OPA is: can Alice see Bob's salary? If you evaluate it now, the result is false, and the reason it's false is that on line eight, the user is Alice and the employee ID is still Bob. As a result, this expression failed, which caused the overall rule to fail, which is what we wanted, right? We don't want anybody to see Bob's salary except for Bob himself. So that's the first part of our policy, which says that an employee can see their own salary.
So now let's say Alice gets promoted and she becomes Bob's manager. The policy says that a manager should be able to see the salaries of the employees she manages. So we need to tell OPA this new information. We have to communicate to OPA that Alice is now Bob's manager. How can we do that? You can see this data pane. We can just give some sample data to OPA in this pane. For example, let's say managers is an object, and let's say for Bob, Bob's managers are Alice and, say, Fred, and for Alice, her manager is Fred. So what we've done is give OPA some data which kind of encodes the company hierarchy. You can imagine that this data could be given to OPA via your LDAP server, or it could be stored somewhere externally and you could ask OPA to pull it down during policy evaluation. But for the purposes of this example, we can simply provide data to OPA using this data pane. So what we can do now is extend this policy to make use of this new data. So, managers... and, okay. What we are saying now is that if the person making the request is the manager of the person whose salary they want to see, allow this request.
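
The two rules described so far can be mirrored in a few lines of plain Python. This is only a sketch of the logic, not Rego and not OPA's API: the function names, the list form of the path, and the lowercase user names are assumptions made for illustration, while the manager hierarchy follows the sample data given in the talk.

```python
# Plain-Python mirror of the two allow rules from the walkthrough.
# This is NOT Rego and not OPA's API; all names here are illustrative.

# Assumed shape of the data document: employee -> list of their managers.
MANAGERS = {
    "bob": ["alice", "fred"],
    "alice": ["fred"],
}


def employee_from_path(path):
    """Return the employee ID when path looks like ["salary", <id>], else None."""
    if len(path) == 2 and path[0] == "salary":
        return path[1]
    return None


def allow_self(request):
    """First rule: an employee may read their own salary."""
    employee_id = employee_from_path(request["path"])
    return (
        request["method"] == "GET"
        and employee_id is not None
        and request["user"] == employee_id
    )


def allow_manager(request, managers):
    """Second rule: a manager may read the salary of someone they manage."""
    employee_id = employee_from_path(request["path"])
    return (
        request["method"] == "GET"
        and employee_id is not None
        and request["user"] in managers.get(employee_id, [])
    )


def allow(request, managers=MANAGERS):
    """The request is allowed if either rule holds."""
    return allow_self(request) or allow_manager(request, managers)
```

Calling `allow` with Alice as the user and Bob's salary as the path returns a different answer depending on whether the manager data is supplied, which is the same effect the data pane has in the playground.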
So again, the question we ask is: is Alice, who is Bob's manager, allowed to see Bob's salary? If I evaluate this, and if I have typed correctly, the allow rule should evaluate to true. This is because even though the first allow rule was false, in the second allow rule all the statements on lines 12, 13, and 14 were true, and as a result the allow rule evaluated to true. Meaning, in Rego, if you have two rules with the same name, it's essentially an OR condition. So this was the second part of our policy, which said that a manager should be able to see the salary of the users they manage. We took that simple policy in English and very simply wrote a Rego policy which encodes it.
Now that you've written your first Rego policy, you can share this policy with the entire world, with friends and family, by clicking the publish button, and it gives you this beautiful link which you can copy. If you open that link, you not only get the policy that you authored, but you also get the input as well as the data that you sent to your policy. So again, if you evaluate this, it should evaluate to true, and now you can share these policies in blogs or in posts or with your family and friends throughout the world. So this was how you can use the Rego Playground to write, experiment with, and share Rego policies.
We just saw how we could take that simple policy in English and write Rego rules around it. Next, let's talk about a use case around API authorization. We mentioned that OPA can be used for multiple use cases, like admission control, RBAC, and ABAC, and one of the popular ones currently is API authorization. Microservices improve the productivity of individual development teams by breaking down applications into smaller standalone parts.
And hence, many organizations are adopting microservice-oriented architecture to design and build their software systems, and using the public cloud for deployment. However, microservices alone do not solve age-old distributed system problems like service discovery, authentication, and authorization. In fact, these problems are often more acute due to the heterogeneous and ephemeral nature of microservice environments. Applications that are architected as microservices have a huge problem. Imagine you have tens or hundreds or even thousands of microservices, and each microservice has to decide whether each of its thousands of API calls per second is authorized or not. So you can understand the magnitude of this problem. In fact, Netflix, who is one of the leaders in microservices, talks publicly about how important fine-grained authorization actually is and how strict the performance and availability demands are at scale.
So what we need is a unified way of understanding and controlling the authorization policies enforced by hundreds of polyglot services, tens of different data storage systems, and a plethora of supporting software running across different clusters and even different data centers. I hope we understand at this point that we cannot simply hardcode these policies into all these microservices, because it's going to be a management nightmare when you start to scale this. Every time your policy is updated, you now need to update your entire service, which makes it really difficult and expensive. So the solution must not only give us the correct authorization decision, it also has to be highly performant.
Any guesses for what that solution is? It's OPA. That's what we're here for. So yeah, you guessed it. The solution to this is OPA. OPA was designed for architectural flexibility, so you can integrate it seamlessly throughout your application stack, from the UI to the gateway to the backend to the database.
You can choose high availability and performance by distributing OPA to the edge, or you can choose strong consistency by integrating OPA into a central service. So let's focus on integrating OPA with your backend services. Given that OPA has multiple deployment models, you can do this in different ways. You can integrate OPA as a library in your backend service, but this would require your backend service to be written in Go. You can run OPA as a service and have your backend service call out to OPA on every API request. This is good, but if you're in a microservice environment and you care about latency and availability, you're going to have to pay for this network call on every single API request. And lastly, you can integrate OPA with your service mesh or your network proxy, so that you can enforce policy without changing the backend service at all.
So let's see an example of this last approach using Envoy. For those of you who are not familiar with Envoy, Envoy is basically a Layer 7 proxy and communication bus designed for large, modern service-oriented architectures. Let's imagine you are deploying your application on top of Kubernetes, and inside your application pod you have your application container and an Envoy sidecar running as well. The way this typically works is that when your application gets a request, it's going to be intercepted by Envoy. Envoy then sends this request to your microservice, to your application, gets the response back from your application, and sends it back to your client.
You can take the same architecture, the same flow, and now you can inject OPA into this flow. The way you do that is that when you get a request, Envoy has an external authorization filter which can take that request and send it to an external authorization service to check whether that request is allowed or not. So this feature makes it possible to delegate authorization decisions to an external service.
In this case, that service is OPA, and Envoy provides all the request context to this external service. So Envoy is going to send OPA the request, the path, the user, any kind of information that is coming in the request. All that information can now be sent to OPA. OPA can use this information to make an informed decision and send that decision back to Envoy for further processing. If OPA denies the request, Envoy sends a 403 back to your client. If OPA allows the request, then like before, Envoy sends the request to your microservice application, gets the response back, and forwards that response to your client.
So in this example, you have intercepted the incoming request coming into your pod using the filter for external authorization, then fetched the data and sent it back to your client. You can also do this on the egress side, meaning that if your microservice or your application tries to go out and reach an external endpoint, you can redirect that traffic to Envoy and have Envoy check whether that egress request is allowed or not. So you can use this exact same flow for ingress as well as egress. And again, the main goal of OPA, which was to decouple policy enforcement from decision-making, is pretty evident here. OPA makes the decision. Envoy's external authorization filter enforces that decision.
So now we've discussed API authorization, which is the problem of whether this user can execute this action on a specific resource. In the previous slide, we saw how Envoy can be leveraged to make use of OPA. You can also use Istio. You can also use Linkerd with similar mechanisms, wherein you have a plugin which calls out to OPA. OPA makes a decision and sends it back to your enforcement point for enforcing the final decision. So it's a very flexible model that can be used not just with Envoy, but with other service mesh solutions as well.
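
The split between deciding and enforcing that this flow relies on can be sketched in plain Python. This does not model Envoy's actual filter configuration or protocol: `ask_policy` and `forward_upstream` are hypothetical callables standing in for the call to OPA and the call to the application, and treating anything other than an explicit true as a denial is an assumption of this sketch.

```python
# Sketch of a proxy-style enforcement point. The callables are stand-ins:
# ask_policy plays the role of OPA, forward_upstream plays the application.


def is_allowed(decision):
    """Only an explicit True counts as allow; False, None, or anything
    else is treated as a denial (an assumption of this sketch)."""
    return decision is True


def proxy(request, ask_policy, forward_upstream):
    """Ask for a decision first, then either reject or forward."""
    decision = ask_policy(request)
    if not is_allowed(decision):
        # Denied: the application is never contacted.
        return (403, "Forbidden")
    # Allowed: pass the request through and relay the response.
    return forward_upstream(request)
```

Because the policy call is injected, the same `proxy` function can front an ingress or an egress path; only the request contents and the policy behind `ask_policy` change.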
So the API authorization problem in microservices needs to be solved, right? We've seen that we have all these different applications, and each application is different from the others. So how do you find a unified way to enforce policies in all these different applications? We need to have a good solution. OPA's declarative approach to policy enforcement and its general-purpose nature allow us to define custom security policies and enforce them by injecting our application with an Envoy sidecar, like in the previous example, which results in policy-enabled applications that are now ready to provide least-privilege access. You can imagine a situation in which you have different microservices, and each microservice wants to enforce its own custom policy. So you can have OPA running alongside every microservice, implementing that custom policy, and package this together in a deployment along with your network proxy, like Envoy in this example.
So this was a talk on OPA, which is an open source, general-purpose policy engine. We've seen how it can act as a unified authorization language and administration toolset that works across all your microservices to enforce fine-grained security policies. If you want to learn more about OPA, check out the website, openpolicyagent.org, where you will find all these tutorials with Envoy, with Istio, with Kafka, with Kubernetes. You can try them out from the website as well. If you have any questions about OPA, or if you have any use cases around OPA, you can join us on our Slack channel. It's also an open source project, so check out the source code on GitHub, and if you like what you see, please do star the project. So again, I'm Ash Narkar. I thank you all for sharing your time with me, and I hope you enjoyed this talk about OPA and API authorization. Thank you.
I'm open for questions now. I'm going to go through some questions.
So the first question we have here is: can the policy be consolidated in a single policy where the input user is either a manager or the user him or herself? So Leonardo has this good question. And yes, you can consolidate the policy in a single policy. OPA has this concept of packages. You've seen packages in programming languages; Go, for example, has a package structure which allows you to organize your code into a particular set of packages. You can do the same thing here. You can have multiple packages inside of OPA, and then you can stitch them all together into a single high-level policy and call individual packages for individual policy decisions, just like you would do in any other programming language. Just quickly, if I have to show an example: like you saw here, there's a package play. You could have had a package play one, play two, play three, and then you can have a top-level package, say package authz, which calls the individual packages, gets a decision, stitches it together, and returns the final response to the user. So yes, it's totally possible to consolidate everything into a single policy file in OPA.
Okay, let's look at the next question. Srinivas has a question: can this be thought of as a replacement for OAuth, which is in place for many microservices apps? This is an excellent question. If you all are familiar with the OAuth architecture, the OAuth framework is essentially for third-party applications to gain access to a user's account. OPA is not a replacement for OAuth, but OPA can be a supplement to OAuth. For example, inside of the OAuth framework you have the authorization server. So what you can do is implement that authorization server using OPA. One of the other pieces of the OAuth framework is the generation of tokens.
So you can have OPA generating the tokens based on some kind of an input. So I would say OPA is not a replacement for OAuth, but OPA can be used along with OAuth in its different stages. One thing with OAuth is that there are so many stages involved in the OAuth framework that there is an inherent latency associated with them. But if you're doing something like microservices, your latency is on the order of microseconds, so you would typically not use OAuth as is in a microservice environment. That is why OPA and OAuth have different use cases, and they supplement each other pretty well. Not a replacement. So thank you, Srinivas, for that question.
The next question is by Rodrigo, and it asks: do you have any performance benchmark results for time and memory consumption? Good question again. If you go to the OPA website, we have performance benchmarks of how the memory and the time taken by the policy increase as you increase the number of rules. One thing to note here is that as you add more policies and provide more and more data to OPA, it's going to consume more memory, and it's probably going to take more time to make a policy decision. But OPA has a bunch of optimizations which you can use to make your policies run faster, and we have examples of this on the website. There is a section on policy performance which you can check out, which discusses all the optimizations OPA has to make everything run faster, and also how you can manage memory when your data increases.
Next question, by LingWang. LingWang asks: what would the overhead look like when offloading the authz request to OPA? Great question, LingWang. There is obviously going to be some overhead involved when you are asking OPA for a policy decision.
But again, OPA is a very lightweight agent, and the decisions OPA makes are on the order of microseconds, and you can actually optimize them even further. So if your overall latency budget, which is pretty tight for API authorization, is on the order of, say, microseconds, OPA can definitely meet that demand. So yes, there is an overhead, but it is a very small overhead, and OPA is specifically designed for such high-performance use cases.
Okay, next question, by Alejandra: how does OPA integrate with Terraform? Yes, OPA has an integration with Terraform, and the use case here is that if you want to test the changes Terraform is about to make before it actually makes them, you can use OPA for that. There is a good step-by-step tutorial on the website, Alejandra, which I would recommend you check out.
Next question, by Biao: how do you monitor failure of a token request to OPA? Okay. Whenever you call out to OPA, OPA will either return the policy decision to you, or it will tell you that it failed, or it will return something like an undefined or a false, based on how you author your policies. Then it is up to your service, which is calling out to OPA, to interpret that decision. For instance, in the example I showed before, if allow was true, your service would allow the request; if allow was false, it would deny the request. So it's up to your service how to interpret those results, those policy decisions which have been sent by OPA.
Okay, next question is by Eric: in your opinion, what is the best way to retrieve external data for making policy decisions? How can decisions be affected if we need to make HTTP calls? This is an excellent question, Eric, and it's come up so many times that we actually have an entire page dedicated to this question on the OPA website.
But just to give you a quick understanding of this, there are many ways you can get policies and data into OPA. One way is the push model, in which you use OPA's REST APIs to push your policy and data into OPA. The other way, and the way we recommend, is the pull model, in which OPA will periodically pull down your policies and data from a remote HTTP server. So imagine you have an HTTP server running somewhere; you can point OPA to that server, and OPA will periodically pull down policies and data and use them in its decisions. And another way to get data is from within the policy itself. OPA has a built-in called http.send which allows you to make HTTP calls from within the policy. So as you're evaluating your policies inside of Rego, you can make a call to your service, fetch that data, and then use that new data inside your policies to make a decision. The bottom line here is that there are multiple ways you can feed OPA external data, but it depends on your latency requirements and on what you are trying to achieve.
Okay, next question is by Paul. The question is: where do the OPA policies live? Are they stored in an SCM or in an external DB? Again, great question, and it kind of follows from the last question. You can store your OPA policies wherever you want. You can store them in Git. So you can imagine having your OPA policies inside of GitHub. You can do the same things you do for normal code: it goes through the entire PR process, you then push those policies to master, and then you can point OPA to your service and it pulls down those policies from GitHub. So that's one way you could do it. You could obviously store those policies on disk and have OPA load them from there, or you can store them on your external servers, point OPA to those servers, and OPA will pull those down.
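
As a rough illustration of the push model mentioned in the answer on external data, the sketch below builds the two HTTP requests a client might send to OPA's REST API: one to write a data document and one to upload a policy. The base URL, the helper names, and the `salary` policy ID are assumptions for illustration; the requests are only constructed here, not sent.

```python
import json
import urllib.request

# Assumption: OPA is running as a server and reachable at this address.
OPA_URL = "http://localhost:8181"


def build_data_push(path, document, base_url=OPA_URL):
    """Build a PUT request that writes a JSON document under /v1/data/<path>."""
    return urllib.request.Request(
        url=f"{base_url}/v1/data/{path}",
        data=json.dumps(document).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def build_policy_push(policy_id, rego_source, base_url=OPA_URL):
    """Build a PUT request that uploads policy source text under /v1/policies/<id>."""
    return urllib.request.Request(
        url=f"{base_url}/v1/policies/{policy_id}",
        data=rego_source.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain"},
    )


# Example: the manager hierarchy from the talk and a placeholder policy body.
data_request = build_data_push("managers", {"bob": ["alice", "fred"], "alice": ["fred"]})
policy_request = build_policy_push("salary", "package play\n")
```

Passing either request to `urllib.request.urlopen` would send it, which only makes sense against a running OPA instance. The pull model recommended in the talk avoids this kind of client code entirely, since OPA fetches policies and data itself.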
So yeah, depending on your use case, there are multiple places you can store policies for OPA to get them from.
Next question is by Robert. The question is: would you kindly provide a link to download the slides later on? This is more of a question for the CNCF. I think we will provide a link to the slides later on, but they can probably answer this question better. So I just got a reply that yes, we will be making these slides available later. Robert, you can access them later.
So I hope you all enjoyed this session, and if you have more questions, we are always available on Slack, so you can join us there as well. Thank you so much for joining me today. Thank you.