All right. Welcome to my session on the Cloud Native Authorization Landscape. I'm Jimmy Zelinskie. You can find my Twitter and GitHub handles on screen. However, before you check out my socials, it might make sense to understand why I'm worth listening to today. So who am I? Full disclosure, I'm the co-founder of AuthZed, which is the company behind SpiceDB. SpiceDB is an open source database that computes permissions. So while I don't necessarily have an agenda today to promote my own business, what I am doing today is helping people understand the ecosystem of authorization tooling so that they can find the right tool for the job. If SpiceDB is the right fit for them, they'll find SpiceDB. If something else is the right fit, they'll find that. Either way, we get more qualified leads, and the whole ecosystem gets more qualified leads and more education.
So with that out of the way: before I worked on authorization, I'd been in the Cloud Native community since the very beginning. I used to work at CoreOS, which got acquired by Red Hat. In that time, I've been in both software engineering and product positions, where I've had a pretty large impact on the ecosystem. I'm currently a maintainer of the Open Container Initiative, which is the standards body for containers. Like I said, I've been a community member for Kubernetes since the beginning, since before the CNCF existed. I was a first employee to work on the Quay container registry, which is the first private Docker registry. I'm the co-author of the Operator Framework and the original inspiration for a lot of the OCI artifact work. And I've also worked on a couple of other projects such as Clair, which is the first open-source static analysis tool for containers.
So without further ado, let's cut to the chase. There are just too many projects out there that are in the authorization space or have direct implications on it.
I pulled a few project names down from the Cloud Native landscape. These are just random projects from the security section. Some of these are general purpose authorization tools, but it's important to note that you don't necessarily have to be a general purpose authorization tool to have an impact on authorization generally. There are lots of tools that are very focused on solving one particular problem. And if that problem is your problem, congratulations: you found a great solution for it, because it was focused and designed from the beginning to solve your problem.
So a lot of these projects might not seem authorization specific, but they actually end up being so. For example, I just mentioned that I co-authored a project called Clair, and Clair is a vulnerability scanner. That might not seem like it has much to do with authorization, but one of the key use cases of Clair is to prevent folks from deploying containers with known vulnerabilities into production. So in the abstract, that's effectively authorizing whether software can be deployed on a system. You can see how, with this broad definition of authorization and various data sources, you end up with mixed messaging, lots of confusion in the ecosystem, and lots of people sharing similar marketing terminology.
So today, acknowledging that the ecosystem is massive, instead of enumerating every single project out there, we're going to try to establish a methodology for looking at projects like this, which may be general purpose or may be very specific, and come up with ways to determine what makes sense for us: whether it's worth adopting, whether it's worth learning more about. Basically, different lenses to view and slice and dice this super large ecosystem so that we can consume it in bits that make sense for us.
To do so, we're going to break this down into four steps, basically the agenda for the day. First, we're going to suspend any preexisting interpretations of vocabulary. As I said before, a lot of folks use similarly phrased marketing, and that can make things very confusing. I'm sure you've already heard some of the terms I'm going to use earlier today. I'm also going to focus on their application to authorization specifically, so I might ignore certain aspects of these words in order to explore a greater concept in the authorization space. With that, you're going to have to let go of some of your preconceived notions of what these terms mean, and we're going to start from the very beginning. Then I'm going to introduce these concepts, or reintroduce them if you're already familiar. Then we're going to explore how these concepts impact our use cases and how they can help us filter to identify whether something is applicable to us or not. And then I'm going to apply those concepts to some of the most popular projects in the ecosystem so that you can see an example of how you would apply them in the real world.
But no post or presentation on authorization can be complete if it doesn't start from the beginning. And remember when I said that we're going to throw out any preexisting knowledge? Here we're going to start from the very beginning. What even is authorization, or what even is auth? Auth is actually two concepts typically composed together, but today we're going to focus on the latter, authorization, rather than the one that's discussed most of the time, the former, authentication. Most folks are probably already pretty familiar with authentication. The authentication ecosystem is pretty mature. There are a lot of big companies in this space. They've been around for a while.
It's a pretty well understood concept at this time, but I find a lot of the things in the authorization space to be a bit confusing at this point. There are lots of different technologies, and even some of the authentication technologies bleed a little bit into the authorization space. So it helps to bifurcate for now and establish what these terms are; that way we have a shared vocabulary when we want to talk about how they may interconnect.
So authentication is primarily about identity. It is asking the question, who are you? And typically it also has to do with verifying that you are who you say you are. This is typically seen as logging in in user-based systems, but it doesn't need to just be logging in to an application. Oftentimes this is also pre-shared keys between different pieces of software talking to each other over a network. So no human even needs to be involved, but it primarily comes down to identity.
On the flip side of this there is authorization. Authorization is about what you can do. Once you have been identified and we know you are who you say you are, next we need to determine whether you can perform the particular action that you're trying to perform or be in the particular place that you're trying to be. I like to use the word permissions for this, because typically people are building permission systems when they're in the realm of authorization. But just as authentication can cover both users and software, so can authorization. You can authorize particular software to access particular data or not based on varying aspects. So this is the bifurcation we're going to move forward with.
Here are some examples in the ecosystem. For authentication there's a CNCF project called Dex, and Dex is an identity provider. You might be familiar with identity providers if you're in the corporate space.
These are similar to LDAP or Active Directory, and PingFederate and Okta are two leading names in the authentication space. These are all software that you might be using at your existing company to log in, get your payroll, things like this.
And then on the flip side of this there are also authorization tools. As I alluded to earlier, I work at a company that builds a database specifically for answering authorization questions, so we are purely focused on access control. A very popular open source project in the CNCF is called Open Policy Agent, and this is a policy engine that's used for a lot of software infrastructure. So that's trying to gate, for example, what resources can be applied on your Kubernetes cluster. And then I included this last example, which is meant to represent libraries that people use to build applications. Pundit is a library in the Ruby ecosystem and Flask-Authorize is a library in the Python Flask ecosystem. These are libraries that folks build into their web applications, storing information in their relational database that describes the access of the users in their system or anything else that their system interacts with. So these things don't necessarily need to be standalone projects. A lot of the examples I gave are standalone projects, but even libraries that you build into your application, and ad hoc systems that you could be building greenfield yourself in your own organization, can be an authorization system.
So just to clarify, today we are not going to talk any more about authentication. We're going to focus purely on the latter, authorization: access and not identity. Some things can get murky, so I want to make this clarification now. For example, OAuth 2 is a very popular protocol that's used in the authentication space, and it actually has a lot of properties in it that are used for authorization.
And because of that, there is a lot of crossover, and terms will be shared here. If I do bring up any of these identity protocols, I'm going to clarify that I am explicitly talking about their authorization characteristics.
With that said, we can move on to how we're going to categorize authorization. The way I like to approach this is to split it into three categories: the who, the where, and the how. So without further ado, we're going to start with who, which is the question of who is being secured. I'm looking at a system or a problem, and I have the question of who the people are that are being gated access-wise.
There are two buzzwords here that are quite confusing. And as I said earlier, the terminology actually comes from the identity space. So you're going to have to relinquish any previous knowledge you have about these terms, because I'm going to use them purely from the perspective of evaluating the authorization ecosystem. IAM is a term that's used a lot of the time. You can see it in cloud providers that offer IAM services. People refer to Active Directory or any of the authentication systems that I described earlier as IAM products. It's a fairly catch-all term, but the duality here is the thing I'm going to focus on. IAM as it exists in most deployments today is typically talking about what your employees can do. And I stress that this is what you want to keep in mind when looking at authorization products: IAM software, and folks describing their projects as IAM software, are typically talking about restricting your employees, people in your own organization, from things like accessing payroll, their email, their healthcare benefits, production systems, ticketing systems. Effectively, software infrastructure is also included in this.
And then on the flip side, there is a newer term, which appears more in the authorization space and less in the identity space but does exist in both: CIAM, which is customer IAM, and this is largely about what your customers can do. So we're no longer talking about your employees. If you are a bank, this means you're talking about what a person can do to withdraw from their accounts or to move money around. These are the types of interactions and people you're going to be talking about, as opposed to people within your own organization.
The implication of this split is that if you're focusing on your employees, you're not going to be hiring and firing people at the same rate at which folks are registering for a very popular SaaS product. So data doesn't necessarily need to be exactly consistent. If you fire someone, maybe you don't need to update the systems of record until the end of the workday, for example. There's not necessarily a tight bound on when things need to be updated across these systems. On the flip side, on the customer side you might have to ban a user on a forum, for example, who is spamming your service or mining Bitcoin on it. In that scenario, you're going to want some form of immediate propagation of permission changes in your system. That means that if you want to revoke access for someone, it should happen immediately. I've used the term bounded consistency here because you might want only certain properties to be immediately reflected, but not all properties in the whole system. Full consistency can be used in customer IAM systems, but as a performance optimization most of them give you bounds around this, or the goal should be to have bounds around this, so you can express the requirements that your system actually has.
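The trade-off between immediate revocation and cheaper, slightly stale reads can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how any particular product works: the `PermissionStore` class, its `refresh_snapshot` method, and the `fully_consistent` flag are hypothetical names invented for this example.

```python
class PermissionStore:
    """Toy permission store with authoritative grants and a stale snapshot."""

    def __init__(self):
        self._live = set()      # authoritative grants: (user, action)
        self._snapshot = set()  # cached copy served to latency-sensitive reads

    def grant(self, user, action):
        self._live.add((user, action))

    def revoke(self, user, action):
        self._live.discard((user, action))

    def refresh_snapshot(self):
        # Hypothetical: a real system might do this on a timer or via replication.
        self._snapshot = set(self._live)

    def check(self, user, action, fully_consistent=False):
        # The caller decides, per request, whether stale data is acceptable.
        source = self._live if fully_consistent else self._snapshot
        return (user, action) in source


store = PermissionStore()
store.grant("spammer", "post")
store.refresh_snapshot()
store.revoke("spammer", "post")  # ban issued, snapshot not yet refreshed

stale_answer = store.check("spammer", "post")
fresh_answer = store.check("spammer", "post", fully_consistent=True)
```

Here the cheap read still reports the banned user as allowed until the snapshot is refreshed, while the fully consistent read reflects the revocation right away. Choosing the flag per request is one way to put a bound on which checks must be immediate.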
Then there's also coarse-grained versus fine-grained between these two. If you are talking about the employees in your business, you might be able to get by with talking about the department or the team they belong to, and that's probably good enough for describing whether they should have access to particular things. When it comes to customer systems, you might need to talk about access to particular rows in a database or lines in a document. It can be very, very fine-grained. And the difference between these two has a very large effect, because a single user could be a part of 10 million groups in a customer system, and that might be just a normal user, not even a special case. However, if that was the case in your business, it would be very strange for someone to be a part of 10 million groups. Maybe they're the CEO and they just need access to everything, but typically this is not the normal case, and most employee-based systems of record have just very basic group support. These are the types of things you see in systems like Active Directory and what the identity providers can support out of the box.
The final impact that these customer-facing versus employee-facing concepts have is rigidity or flexibility. In an IAM system, where you're talking about your employees, your org structure is unlikely to change that frequently. You're very likely to have some kind of hierarchy. It's probably going to be easily represented in systems like Active Directory. Unless your company goes through reorgs or acquires another business, there are unlikely to be large changes here. So you can be more rigid about the structure, and the constraints on the structure, in these systems tracking your employees. However, on the customer side, there are various domains out there.
There's healthcare, there's finance, there's gaming; each of these domains has its own requirements. And in these domains, they're not going to be able to tell their users or their data to conform to some particular hierarchy that preexists as a limitation of the system. Instead, they're going to need to build a system that can model their domain. So the difference here is that CIAM systems typically give you a blank canvas, with some structure, to paint the picture of the world that you want to see. Those are the deeper implications of just thinking about the who, and without further ado, we'll talk about some examples of these systems.
So let's run through some of the things in the CNCF ecosystem that you might be familiar with. OAuth2 Proxy is a fairly common open source project that folks use to secure web pages. Additionally, there's Teleport, which is a system that does similar things, or VPN products like Tailscale. These are all concerned with gating employees and have authorization systems that reflect that. When we talk about CNCF projects, there's Open Policy Agent Gatekeeper or Kyverno, which are systems that gate access or enforce certain invariants on the resources that you're creating on Kubernetes clusters. So these are all fixated on: can you do this particular thing to my software infrastructure?
On the flip side, for customer IAM, using the examples that I've shown before, there is SpiceDB, which is a database, so you can write a schema that paints the picture of the system that you need, then load data into that schema once you have data, and then query that data. So that's how that one deals with the flexibility aspect. And then we obviously have the library systems that are built into the different web framework ecosystems.
These let you write your own code and build your own abstractions. So they also fall into customer IAM: when you're building web apps and systems like these, you're typically building things for customers, so they fall along those lines.
All right, hopefully with that making sense to everyone, we can move on to the next question to ask, and the next lens through which we will filter the ecosystem, which is where the decision is being made. This one is a little interesting, because the where has implications for how the data you're sourcing impacts your system. If you think about your problem domain, and you think about where the data you need to compute whether someone has access to something comes from, that is what we're going to focus on.
There are two splits here: federated and centralized. Federated is the term we're going to use to describe systems where the data or computation that you're using to grant access to a particular resource comes from various systems. Maybe you have to ask the API of a business that you're partnered with whether a user has access to a particular thing. And then that needs to be joined with information that you have in your own systems to determine whether they have access to one of your resources. This can be the sourcing of the data or the computation. Maybe you have to reach out to an external service, and they will perform the computation for you and just tell you yes or no, and you will take their evaluation as truth. Those would be examples of federation.
On the centralization side, you're saying that the data or the computation that is performed to grant access is going to live in exactly one place. That means that you're not going to be reaching out to various systems.
You know where all the data is, but there is work ahead of time to make sure that the data you need to compute these access requests is in this place, in whatever format it needs to be in. So that takes effort ahead of time: knowing that when data is created, it needs to be created in this location so that access control can happen after the fact.
This has the strongest implications for consistency, as I was talking about before. With federation, you're effectively going to have very loose consistency. You will not have any real sense of time in these systems. A lot of federated systems are purely eventually consistent. They make their decisions with the best data available to them at the time, and that may not be perfect. If a user is revoked, for example, that may not be immediately applied to any of these systems. It could be an arbitrary amount of time until it gets applied, or multiple pieces of software in your system might not even agree on whether the person has been removed yet. And you can see this percolate into applications that are using federated systems.
On the flip side, centralized systems have less of this problem, if it is a problem for your domain, because they offer strict or bounded consistency. If you have everything in one place, prepared ahead of time, and you make a change to it, the change immediately applies to everyone looking at the data or computing the permission in that same place. So while this has better consistency properties, it comes at the cost of having that data stored and available in that centralized location. So, some examples, starting with Open Policy Agent.
A lot of folks run Open Policy Agent as a sidecar next to an application running on Kubernetes. Their application makes queries to Open Policy Agent and gets responses from it, but the data that Open Policy Agent uses to compute that permission is typically synchronized or sourced from some other location, often on an interval. That means your data is only as fresh as the interval on which it is refreshed. This can work great for data that's not changing very often or data that doesn't need to be perfectly consistent. So there's a whole class of use cases where that's totally acceptable and probably for the best.
And on the flip side of that, centralized systems. Once again, SpiceDB falls on the right hand side. SpiceDB is centralized, so it requires folks not only to load data into the centralized place but also to define a schema for that data, to ensure all the data is laid out in a cohesive manner and can be queried very quickly to answer any of the questions of the permission system. And then there are the libraries that folks use for their web applications: if they're building their own authorization systems, they typically store these things alongside the application data in a relational database. This data often gets written in the same transaction as the application data. So in that sense, there is the database schema that they've created there and all the data that they're loading into their database.
All right, finally, we're reaching our third question. If you've reached this point, it should be because you've already explored your problem space with some of the preexisting tools that you've been able to filter down to, and you probably know where the pain points are. This one is a bit of a rabbit hole, because we're going to discuss how the decisions are actually being computed.
What's interesting about this is that it's not going to be an apples to apples comparison. What we're going to do is describe two completely different systems, but they are the most popular paradigms in general purpose authorization systems. These paradigms might not even be used by your tool if you're looking at a very specific tool for the job. But if you're looking at taking one of these more general purpose systems and using it to solve a problem in a domain that hasn't had any custom code or custom systems built for it, this is primarily how you're going to be evaluating these types of systems.
So there are these two paradigms: policy engines and relationship-based access control, or ReBAC. Policy engines are the idea that in order to compute a permission, the user should write a program, and that program will be executed with some input data, and the net result will be a yes or no as to whether a user has access. That's the general idea of a policy engine.
The other side of this is not a direct comparison; it's completely apples to oranges. Other systems that folks use for general purposes, instead of representing permissions as computer programs that execute, describe them as the existence of relationships between data. These are the more database-like systems. You store all of your data laid out in a particular format, and when you do that, they try to find a path in that data through the relationships. If a path exists, that means the user has access to perform an action.
This is not to be confused with something like RBAC, which is role-based access control. You can use ReBAC to implement RBAC. You can use policy engines to implement RBAC. So just to reiterate, we're talking about a layer deeper right now.
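The policy-engine half of this, a program executed against input data that yields yes or no, can be sketched as follows. This is plain Python rather than a real policy language, and the `allow` function, the shape of the request, and the role names are all hypothetical, chosen only to show RBAC expressed as a policy.

```python
# Data the policy consults. In a real deployment this would typically be
# loaded or synchronized into the engine from somewhere else.
ROLE_PERMISSIONS = {
    "admin": {"read", "write", "delete"},
    "viewer": {"read"},
}


def allow(request):
    """The policy: a program that takes input data and returns yes or no."""
    roles = request.get("roles", [])
    action = request.get("action")
    return any(action in ROLE_PERMISSIONS.get(role, set()) for role in roles)


viewer_can_read = allow({"user": "alice", "roles": ["viewer"], "action": "read"})
viewer_can_delete = allow({"user": "alice", "roles": ["viewer"], "action": "delete"})
```

The point of the sketch is that the decision lives in code: changing who can do what means changing the program or the data handed to it, not the shape of stored relationships.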
We're talking about the implementation details for building general purpose authorization systems. You might be thinking, oh, but I only need RBAC, so why should I care about this layer? Well, the answer goes back to what I was saying earlier about pain points. If you have experienced pain points already, you might eventually realize while building your existing system that you actually need RBAC plus some other unique functionality. Or you might realize that some of the default behavior in your system is not typical of RBAC-like models, where you might need finer granularity than just having roles in your application, but 90% of the time just having roles is fine. So now you have two layers: fine-grained access and general-purpose access.
There are a lot of different ways that you can arrive at wanting to peel back a layer. Perhaps you've built a system and realized that it's really hard to iterate on, and you're going through security audits every single time you need to make changes to your application, and that can be painful. Perhaps you're not getting reuse of the policies or permissions that you want computed across your application suite. Perhaps things aren't scaling. These are all things that might force you to peel back a layer of the onion and learn a bit more about how these tools work.
So what are the implications of this? I just listed a whole bunch of reasons it might be important to think about the paradigm that your system is built on, but I'm going to focus on two aspects here. The first is that policy engines are often federated. The decision is pushed into the policy engine that executes, and the policy has to exist within its own execution of the program because, like I said, in this scenario policies are computer programs. That is in contrast with relationship-based access control models.
There, you've centralized all the data into a database, and the decision making is merely a query against the data as it's already laid out in the database. So that fundamentally forces you into a centralized position. That's not to say that all federated systems are policy engines and all centralized systems are ReBAC. That's definitely not the case, but these paradigms influence the design choices of the systems, which lean heavily in these directions. You can, for example, have one global policy engine executing with all the data in it, with various applications all talking to it, and that would be an example of a policy engine that is centralized. But by and large that's not the most common way you see policy engines being deployed.
The next aspect is the federation of data, which is also tied to the decision making. I haven't personally seen any distributed policy engines where policy is executing in a whole bunch of places and then aggregated. You might see data being accessed and federated out and then collected back into one central execution. On the relationship side, you're almost exclusively going to be locked into a centralized model by definition. You might have systems that are sharding under the covers and scaling like a horizontally scaling database might, but you might also see these as vertically scaling databases, like your typical Postgres relational database systems.
All right, so let's make this a little more concrete. In the policy space, if we go back in time, maybe if you've had a college course in computer science, you may have played with languages like Prolog or Datalog.
These are policy engines: you load facts into these systems and then you query them. They're declarative, but what they do is allow you to compute yes or no based on the inputs you provide. So by definition that makes them policy engines. Once again, a very popular addition to the CNCF ecosystem is Open Policy Agent. Policy is in the name. It uses a language called Rego, which is very much Datalog inspired and brought into the modern cloud native ecosystem to make it more approachable for folks. And then Gandalf is a system internal to Netflix that is actually the inspiration for Open Policy Agent. It works in a very similar fashion. I decided to include some of the internal systems at large companies so you can see that one solution isn't necessarily tied to scale. These things can be done at hyperscaler-sized businesses as well as by folks just running one app on one node for a small business.
On the relationship-based access control side, we have examples again. There's SpiceDB, which is inspired by a system at Google called Zanzibar. Basically, they're taking graph database concepts and specializing them very specifically for the application of computing access control. The system at Google has also inspired other proprietary systems at other companies. There is also Facebook's implementation. If you're unfamiliar with Facebook's initial product, or Meta's initial product, Facebook, it is effectively a social graph. It's a social network. So if all of your data is based on relationships, such as this person is a friend of this other person, it makes sense that access control for such a system is also the storage of relationships between them. If you're a friend of a person, it means you can see their photos, for example.
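The friend-and-photos case can be sketched as a search for a path through stored relationships. This is a toy illustration and not how SpiceDB, Zanzibar, or Facebook's system is implemented: the tuple layout, the identifiers, and the `subjects` and `can_view` helpers are hypothetical.

```python
# Each relationship is stored as data: (object, relation, subject).
RELATIONSHIPS = {
    ("photo:beach", "owner", "user:ana"),
    ("user:ana", "friend", "user:ben"),
    ("user:ben", "friend", "user:carl"),
}


def subjects(obj, relation):
    """Return every subject that has the given relation to the object."""
    return {s for (o, r, s) in RELATIONSHIPS if o == obj and r == relation}


def can_view(photo, user):
    """Allow if a path photo -> owner (-> friend) reaches the user."""
    owners = subjects(photo, "owner")
    if user in owners:
        return True
    return any(user in subjects(owner, "friend") for owner in owners)


ben_can_view = can_view("photo:beach", "user:ben")
carl_can_view = can_view("photo:beach", "user:carl")
```

In this sketch the answer changes only when relationships are written or deleted. A friend of the owner gets access, while a friend of a friend does not, because the allowed path stops after one friend hop.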
So you can see from that example that if the domain you're working in naturally falls into a shape where the domain data itself looks similar to one of the particular access control models, it might make sense for you to adopt that model over a solution you're maybe more familiar with, or one that has other trade-offs or properties that you thought were important. The ability for it to jibe with your domain and treat all of your data as similar data is a very compelling reason to make a decision in this space.
So that's a lot. It may not be things you've thought about, or maybe it is, but I wanted to close with a couple of additional thoughts. There really is no silver bullet for authorization in this space. You're going to have varying use cases. They're going to need varying paradigms. They're going to have various requirements for consistency and scale and all kinds of things. And it's going to be up to you to navigate the ecosystem with the knowledge that you have.
The terminology in the space can be super misleading. I personally try to avoid a lot of the confusing terminology. As I said earlier when I was contrasting authentication and authorization, I prefer to lean into terms like identity and permissions, because you can't confuse them. If you just call something auth, it's very vague, and people might not fully understand what you're trying to communicate. There are lots of quote unquote authorization systems, but that doesn't tell you whether they're trying to authorize people, your employees, software, or access to infrastructure. It really doesn't tell you anything about the domain whatsoever. And at the end of the day you're typically trying to solve a problem, so the primary concern you have is your domain.
And finally, nothing is an apples to apples comparison.
The paradigms for building general purpose authorization systems are very, very different from each other. And even software that you might be looking at to solve your problem can be very, very different. If there is a special purpose piece of software designed to solve your exact problem, then it's going to look very different from trying to solve that problem with a general purpose paradigm or solution, because it can take advantage of lots of domain knowledge to optimize, take shortcuts, or sacrifice particular things that might not be generally applicable to other systems.
So you can't necessarily rule out things that don't say authorization on the tin. You might want to look at systems that are tangential to the domain and not even under the security tag for the cloud native ecosystem. For example, SpiceDB is filed under the database tag for the cloud native ecosystem because it is a database technology. And while it is a database technology that you can use to secure your products, it is not in and of itself a security tool. It is a database that you use to build your own security tooling, so it's a framework. So it may not be immediately obvious that you haven't exhausted all solutions simply by going through the things in the cloud native security ecosystem.
As a final reminder, you should always ask questions and let your use case guide you. Don't jump on any bandwagon for any particular concept in this space. It may not be applicable to the problem domain you're trying to solve. Really think about the aspects that are going to matter to you the most and pick the solution that's going to work the best for you.
So with that, thanks. If you have any questions, this is my email. You can find me on socials as well. My GitHub handle is jzelinskie and my Twitter handle is jimmyzelinskie, but email is actually my preferred contact form. So thank you for your time.