The topic for our talk today is: can it be done? Building fine-grained access control, or authorization, for Backstage with OPA. My name is Anders. I work for Styra, the creators of the Open Policy Agent, or OPA, project. And I'm here with Peter McDonald from VodafoneZiggo. Yeah, I'll let you do your introduction.
Yeah, my name is Peter McDonald. I work as a developer at VodafoneZiggo. I'm a user of Backstage, and I'm a contributor to both projects, actually, OPA and Backstage.
All right. I figured that most of you are Backstage developers or users, but many of you might not have used OPA or even heard of it. So I'll start by introducing the project, what it does, and how we can use it in Backstage. Then Peter will talk more about the implementation specific to Backstage.
So before we talk about Open Policy Agent: what even is a policy? That's a good place to start, I guess. Basically, a policy, in Backstage or anywhere else, is a set of rules. These rules could be things like permissions for app authorization or access control. They could be Kubernetes admission control, where you have rules saying that certain types of deployments should not be allowed unless they have a well-defined security context, and so on. They could be Terraform plans, or basically anywhere you can imagine wanting to have a set of rules. That is a place where you define a policy. Of course, these policies were traditionally often defined in things like PDF documents or Word documents. So there is this movement towards treating policy as code, and when we talk about policy, we often mean policy as code. We want all the benefits of dealing with anything as code, which is that we can test things and we can work collaboratively, with pull requests, reviews, testing, linting, and so on. Another core concept for OPA and policy engines is that we want to decouple policy from our applications.
And we'll get into the benefits of this model in a bit, but that is another key concept. It's similar to how we would decouple storage and move that into a database. You can think of decoupling policy in the same way. We'd rather treat the rules of our systems separately from things like application logic.
So that's policy. What's OPA? It's an open source, general-purpose policy engine. And the general-purpose part is important. That's really what makes it possible to do things like integrating with Backstage, which is, of course, a project that wasn't even on the radar when OPA was born. So OPA has a number of use cases, Backstage obviously being the one we're talking about today. As of 2021, it's a graduated CNCF project, and it was a CNCF project for a long time before that, of course.
What OPA does is provide us with a unified toolset and a framework for working with policy across the stack. It decouples policy from application logic, or whatever other type of logic you have, so we can work with policy as a first-class concept. One thing to note is that OPA separates policy decisions from enforcement. OPA deals with policy decisions. It's a decision point. It makes a decision, but it does not enforce it. That will be up to your application or your framework to actually do. That could be sending back a 401 or a 403 response if it's an HTTP app, but enforcement comes in many forms and types. Policies are written in a declarative language called Rego. I think we'll look into that in a bit as well.
As for the community, I think the first commit in OPA was in 2015, so we've been around for almost 10 years. During that time, we've established quite a vibrant community around the project. We have over 100 integrations listed in our ecosystem catalog.
9,000 stars on GitHub or something like that, 8,000 Slack users, hundreds of millions of downloads, and also a lot of projects in the OPA organization, or whatever we want to call that, like Gatekeeper for doing admission control, Conftest for testing things on the file system, and a whole bunch of editor integrations.
So how does it work? If OPA can work with these hundreds of integrations, where Backstage is one of them, how does that model work? It's a fairly simple model, and I guess it has to be, considering all the tools and systems that we integrate with. It basically works like this. Take anything where we service a request, i.e., a service, and we have a pretty broad definition of service. It could be anything. It could be the Backstage permission system. It could be a Linux PAM module. It could be a microservice. Rather than trying to figure out ourselves whether we should allow this or not, we delegate that responsibility to OPA, and we do so via a policy query. We basically send any relevant data we might have, such as the user identity, maybe the resource being accessed, or the method being used. And OPA, based on the policy it has been provided, makes a decision and sends that back to the application. So we delegate responsibility for the policy decision over to OPA.
So if we zoom in a bit and also look at the Rego bits, how does that work? What does the policy actually look like? In this example, we have some input. We have a request, a fairly typical thing for an HTTP API. We have a path, we have a request method, and we have a user. And in this case, we have a policy that states that by default, we should not allow anything: allow is equal to false. But we should allow if there is an admin role among the user's roles, which is not the case here. The next condition states that if it's a GET request, where somebody is just trying to read the users, we should always allow that. That is not the case here either. It's a PUT request.
The third condition in our policy says that we could allow this if it's a PUT request and the name of the user in the path is equal to the name of the user making the request, meaning someone should be allowed to modify their own user attributes, but not those of other users. That is also not the case here, since the user in the path is Anders, while the user trying to make this request is Peter. So the policy decision in this case would come back as false. That was not allowed. That's basically policy evaluation in one very simple slide.
So if you want to try OPA out: it's a lightweight, self-contained binary of around 30 megabytes. Since we want to integrate with all these tools, it's also important that there's a whole bunch of deployment options. You can run it from the command line. You can run it as a REPL or as a server, and this is the typical way to integrate OPA with things like Backstage. You run OPA as a process on the side, and you query OPA on its REST API. But there are many other integration options as well, like libraries or Wasm and so on. And if you want to try it out, there's also a whole bunch of useful tools that you could use for that purpose.
OK, so why would we use OPA if there are already these permissions frameworks? Backstage has one, and it's pretty good. Other tools, like Spring Boot or C#, most of them will come with a way to do authorization. And the problem with that is, of course, that they all come with their own different ways of doing this. The goal of OPA is basically to try and unify this across the whole stack. One reason why we want to do that is, of course, that we can share policies and authorization logic across different types of applications. That, in turn, allows for centralized management.
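The three-rule policy walked through earlier in this passage (admin role, GET request, PUT on your own user) could be mirrored in TypeScript roughly as follows. This is only a sketch of the decision logic, not Rego and not OPA's API. The slide is not reproduced in the transcript, so the input field names, the `users/<name>` path layout, and the `developer` role are all assumptions made for illustration.

```typescript
// Hypothetical shape of the input described in the talk: a path, a method, a user.
interface PolicyInput {
  path: string[]; // assumed layout, e.g. ["users", "anders"]
  method: string; // HTTP method, e.g. "GET" or "PUT"
  user: { name: string; roles: string[] };
}

// Deny by default; allow as soon as any one of the three rules matches.
function allow(input: PolicyInput): boolean {
  // Rule 1: a user with the admin role is always allowed.
  if (input.user.roles.some(role => role === "admin")) {
    return true;
  }
  // Rule 2: read requests are always allowed.
  if (input.method === "GET") {
    return true;
  }
  // Rule 3: users may modify their own attributes, but nobody else's.
  if (
    input.method === "PUT" &&
    input.path.length === 2 &&
    input.path[0] === "users" &&
    input.path[1] === input.user.name
  ) {
    return true;
  }
  // Default: allow is false.
  return false;
}

// The request from the talk: Peter tries to PUT to Anders' user record.
const request: PolicyInput = {
  path: ["users", "anders"],
  method: "PUT",
  user: { name: "peter", roles: ["developer"] }, // role name is an assumption
};
const decision = allow(request);
```

In a real deployment this logic would live in a Rego policy evaluated by OPA, and the application would only enforce the returned decision.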
So we can have things like organizational policies where we say that these rules should be upheld no matter what type of application it is, whether it's written in Python or Java, or whether it's a Backstage module or plugin. Another important thing, and even more so if you work in a regulated industry, is of course the problem of auditing. If you're doing authorization in five different ways in five different applications, how would you audit what goes on? The fewer components you have to worry about, the easier it is to work with. Then there's testing: being able to test our policies in isolation, without being dependent on our application logic. And finally, policy updates. That is a big thing for most OPA deployments and, as I think Peter will show, also for Backstage. We want to be able to update our policies without having to recompile or redeploy applications.
All right, so before I hand over to Peter, there's one more thing. I don't know how many of you have worked with the permissions framework in Backstage? All right, yeah, so quite a few hands here. One thing that you should be aware of, which is an interesting implementation detail, is partial evaluation. Or that's what we would call it in the policy space. I think conditional evaluation would be the more Backstage term. But it's a common thing in this space. If you remember the example, what we normally had was a user, an action such as a GET or PUT request, and a resource, which would be the path. In Backstage, the permissions framework will not provide us with the resource. The resource ID is not provided in that context. Instead, we have to do something called partial evaluation, where we don't have access to all the data. This works pretty much like a partial function, if you've worked with a functional programming language, where you say something like plus one. Yeah, plus one, and then plus what? That is not known at the time.
So instead of getting back an answer, you get back another function that you can then use to add one to something else at a later stage. Partial evaluation, or conditional evaluation, works in basically the same way. You don't get back a decision directly. You get back another query, which you can then translate into something like a SQL query or a database query, where that data is made available. So this system works, and a benefit of it is, of course, that it scales very well, because you basically delegate the decision down to the database level. So it's quite efficient. A drawback of this system is that if OPA doesn't make that final decision, you don't have all the benefits of running OPA, like the audit logs and whatnot. Because if OPA doesn't make the final decision, it's not going to log the decision either. But there are ways to solve that with OPA and Backstage as well. It requires more work.
OK, thanks, Anders. So I'm going to talk a little bit about why I did this. To go back a little bit, my first KubeCon was in 2022 in Valencia, and I was blown away by all the cool stuff that I was seeing. When I got back home, I really wanted to get involved in a project, and OPA turned out to be the one that I got involved in. Then I discovered Backstage, which has also been a fantastic experience. Full disclosure: I'm not a TypeScript developer or a JavaScript developer. I'm quite familiar with Golang. But since picking up Backstage, I guess I'm now a TypeScript developer or a Node developer. I wanted to find a way to put my two favorite open source projects together and see if it could be done, basically. It was driven more by curiosity than anything else. And when I found out that this might actually work, and that it might be useful for people, I decided to fully open source it as a plugin.
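The partial (conditional) evaluation that Anders described just before the handover can be sketched in TypeScript. First comes the "plus one" partial function, then a toy policy that knows the user but not the entity, so it returns a leftover condition that is translated into a SQL `WHERE` fragment. The `Condition` type, the group names, and the `kind` column are hypothetical illustrations; this is not the condition format that Backstage or OPA actually use.

```typescript
// "Plus one, and then plus what?": a partially applied function.
const add = (a: number) => (b: number): number => a + b;
const plusOne = add(1); // the second argument is not known yet

// Hypothetical leftover condition; NOT Backstage's or OPA's real format.
type Condition =
  | { kind: "always" }
  | { kind: "never" }
  | { kind: "fieldEquals"; field: string; value: string };

interface User {
  name: string;
  groups: string[];
}

// Known: the user. Unknown: the entity being read.
// The result is a definite answer or a condition to check against each entity later.
function partialAllowRead(user: User): Condition {
  if (user.groups.some(group => group === "admins")) {
    return { kind: "always" };
  }
  if (user.groups.some(group => group === "api-readers")) {
    // Cannot decide yet: it depends on the entity, so return the remaining check.
    return { kind: "fieldEquals", field: "kind", value: "API" };
  }
  return { kind: "never" };
}

// Push the leftover condition down to the database as a parameterized WHERE fragment.
// The column name comes from the policy, never from user input.
function toSqlWhere(condition: Condition): { sql: string; params: string[] } {
  switch (condition.kind) {
    case "always":
      return { sql: "TRUE", params: [] };
    case "never":
      return { sql: "FALSE", params: [] };
    case "fieldEquals":
      return { sql: `${condition.field} = ?`, params: [condition.value] };
  }
}

const where = toSqlWhere(
  partialAllowRead({ name: "peter", groups: ["api-readers"] }),
);
```

Because the database applies the condition while fetching rows, the policy engine never has to see each entity individually, which is the scaling benefit mentioned in the talk.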
So I work at VodafoneZiggo, a telecoms company, and we're in the early stages of our adoption of Backstage. We call it Cockpit. Permissions were quite important for us because there are certain plugins where we don't want certain teams doing things, or maybe certain teams viewing things. So we wanted to use the permissions framework. But we didn't like the fact that we had to redeploy the application every time we made a change, because the requirements tended to change more often than we'd like. We wanted another, more user-friendly way of defining permissions in policies, like with OPA, so we didn't have to be the bottleneck in applying these permissions. We could hand it over almost completely to other teams if they wanted to define their own permissions in a policy. With the conventional method, obviously, it's down to the integrator to do that in TypeScript. Not everyone's a TypeScript developer, and I think a lot of teams that use Backstage don't always have a development team at the ready to take it on. So again, this was to try and lower that barrier of entry for permissions. And I also wanted it to be flexible, so that anybody could just update a policy without having to involve us that much, like, oh, you need to redeploy because we want to change a permission or something like that.
And so, can it be done? Obviously, the answer is yes, because otherwise I probably wouldn't be here talking about it. So it can be done. Now, for the people here who don't know Backstage that well, or who are curious about the permissions framework and not too sure how it works, I'm going to go over it at a very high level, and carefully, because I know a few people here have probably built this. So the user triggers some kind of action, like a read request. That will go through the plugin that you're trying to set permissions on.
And that will be forwarded to the permissions framework backend, which then delegates the decision to the permission policy. This is what you would probably define in code, in TypeScript or something like that. That makes a decision and forwards it back to the plugin, which then either allows the user to read it, or not in this case.
So what would this look like with OPA? If you're keen-eyed, you can probably see where I'm going with this. You basically just replace the permission policy with OPA. So instead of sending it to your policy that's written in TypeScript, the permissions framework backend just forwards it to OPA. OPA makes a decision and sends it back to the permissions backend, and then that sends it down the chain again to the user.
This is what we can send to OPA at the moment. You can see the permission name is catalog.entity.read. This could be whatever your plugin defines. It could be, I don't know, something, admin, read, or whatever you want to call the permission. And then there's the user and the user's claims, which are the groups and the actual user entity.
How can you use it? With the new backend system, which is amazing, by the way, it's literally just this line. The old backend system takes a little bit more, but I'm close to time, so I don't want to go over the full thing. Provided you have the permissions framework up and running, and you've set that up, you just import this one line, and then you're basically good to go if you have OPA up and running next to it as well. This is the only config you need. You point the base URL to wherever OPA is. As Anders mentioned, generally that's next to your Backstage deployment somewhere. I think we deploy it as a sidecar. And then you have an entry point, which is the rule head that you want to evaluate against.
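To make the base URL and entry point concrete, here is a minimal TypeScript sketch of how a permission request could be turned into an OPA query. The `POST /v1/data/<path>` endpoint with an `{"input": ...}` body is OPA's documented Data API, and 8181 is OPA's default port. Everything else is an assumption for illustration: the config field names, the entry point `backstage_policy/decision`, and the exact shape of the input are hypothetical and may differ from what the plugin really sends. The sketch only builds the request and does not send it.

```typescript
// Hypothetical config: where OPA runs and which rule to evaluate.
interface OpaConfig {
  baseUrl: string; // e.g. a sidecar on localhost
  entryPoint: string; // path to the rule head, e.g. "backstage_policy/decision"
}

// Hypothetical input, based on the talk: permission name, user, and claims.
interface PermissionInput {
  permission: { name: string };
  identity: { user: string; claims: string[] };
}

interface OpaQuery {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Build (but do not send) a query against OPA's Data API.
function buildOpaQuery(config: OpaConfig, input: PermissionInput): OpaQuery {
  const base = config.baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const path = config.entryPoint.replace(/^\/+/, ""); // drop leading slashes
  return {
    url: `${base}/v1/data/${path}`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      // OPA expects the query data wrapped in an "input" key.
      body: JSON.stringify({ input: input }),
    },
  };
}

const query = buildOpaQuery(
  { baseUrl: "http://localhost:8181/", entryPoint: "backstage_policy/decision" },
  {
    permission: { name: "catalog.entity.read" },
    identity: {
      user: "user:default/peter", // hypothetical user entity reference
      claims: ["user:default/peter", "group:default/developers"],
    },
  },
);
```

The resulting `url` and `init` could then be passed to an HTTP client, with the decision read from the `result` field of OPA's JSON response.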
And I can try and do a demo. I've got two minutes, I think, so I can just go to this. I just have to wait for it to hopefully spin up, and it's been a bit slow. Okay, so this is my fictional Backstage that I've made, and you can see that I can click on this, if I can find the mouse. I can delete this entity if I want to. And then I'll bring this over so we can see the policy. Basically, what I'm saying here is that if the claims contain this, which they do, because that's one of the claims I've set up for myself, then the user is allowed to delete something from the catalog. But if I change this, and I have watch turned on, so it should apply instantly, I can now no longer unregister this. And we can do the same with reads. If I change that back, I can say maybe we do this, and now if I go back to the catalog, I can only see APIs. So that has applied the rule that if they're in this group, they can only see APIs. And if I change this back and reload, now I can see everything again. So you can change the permissions without having to redeploy the application, basically. You just have to wait for the next time the permissions backend makes a request.
I'm one minute over, so quickly. To wrap up, the idea of this for me is to lower the barrier to using permissions, hopefully making it easier for people. There's a lot of work going on in Backstage to lower that barrier so people can onboard quickly and easily. There's no need to rebuild or redeploy the application when you update policies. You can just do them on the fly, as you saw.
It should be really easy to deploy and use. I hope so anyway. If it's not, please let me know. And obviously both sides have amazing communities and rich ecosystems, so you won't get stuck on either side. You can ping me directly on Discord if you want to use this. And it's about trying to decouple the policy from the application a little bit, so it's not all tied together. And a big thank you to everybody for letting me be here and letting me talk to you about this. You can find the repository on there if you want to look at it, you can reach out to me on any of those, and these are some helpful resources to help you get going on the OPA side if you want to use it.
All right, thanks. Do we have time for a question or two?
Time for questions, yes, we can accommodate a few.
Hello, who's working here? I had a thought while you were talking. Nice presentation, by the way. I know it's probably not possible right now, but it would be nice to be able to use the Rego file that protects the resource itself and use the same Rego file to protect the resource in Backstage. So for instance, if you shouldn't be able to see a service in Kubernetes that's protected with Open Policy Agent, the same thing would happen in Backstage. That would be nice. That was my thought.
Right, so you would reuse a policy from something like Kubernetes in Backstage. Yeah, that should be doable.
We have one more over here.
Thank you for inspiring us to decouple policies from the application. That's good input, I think. I wonder, when we list something like 100 catalog items on a page, and for every one a rule has to be checked and a decision has to be made, are we setting ourselves up for a performance bottleneck there? What should we expect?
Yeah, I think part of why Backstage has chosen this conditional model is to delegate that responsibility to the database. But yeah, you're right. It's still OPA serving that conditional response.
And one of the reasons why we normally want to run OPA on localhost is to have sub-millisecond latency for that query. And that's normally what you'll see. But yeah, of course, there are limits to that as well. If you query for millions of objects without pagination or something like that, that could definitely be a problem. But I think this type of system is normally not that critical if you have to wait a few extra milliseconds for your catalog to show up. So that's probably not going to be an issue. But for certain applications, like distributed microservices in a zero trust environment, where each service queries OPA and there are hundreds of services, yeah, that might add up. There are different strategies to tackle that. I think that's probably out of scope. I could talk for an hour on that topic alone. But yeah, I'd be happy to talk more about that with you later.
All right, thank you guys. Appreciate it. Thanks. Thank you. Thank you.