Hello. Can everyone hear me? Thumbs up? Awesome. Welcome to "Keys to the Cloud: Centralizing and Improving Permissions in Cloud Foundry." I do not want to lose that mic, so let's keep that there. This talk is about Perm, which is one of the newest Cloud Foundry components. It's not generally available yet, but we are experimenting with it, and it should be available in cf-deployment soon.
As you probably noticed in the other talks, we need to start with the fire exit announcement. Please be aware of the locations of the surrounding emergency exits and the lit exit sign back there. If there is a fire alarm or a different emergency, just exit to the public concourse area (the physical area, not Concourse CI). The emergency exit stairwells are located to the outside of the facility, also along the public concourse; again, not CI. For your safety, if there is an emergency, please follow the directions of the public safety staff. They know more about that stuff than I do, probably more than most of us, so they are the people to follow.
We have a quick introduction, and then we're going to discuss what the problem is today. Then we'll go through a bit of an authorization primer, since Perm is all about authorization; for those who aren't too familiar with things like OAuth 2, we'll discuss what some of the alternatives are. We'll look at what is currently in use throughout Cloud Foundry, discuss what Perm itself is, some of the technical decisions that we've made, and our upcoming roadmap, and then hopefully have some time for questions.
A quick intro to myself: I'm a software engineer at Pivotal, and I'm the anchor of the Perm team. Prior to that, I was on the CredHub team, working with credentials, so a similar space, still in security. I really like security.
So right now, Cloud Foundry's permissions are pretty complicated.
I was hoping to be able to show you the website and actually scroll through this for you, but it turns out I'm not really the greatest Fedora user, so we're going to stick with the slides now that I have them up. This is the first bit. I know it's a bit small, but you can see that we have 10 roles or so with many, many permissions. This is the full list of all the different permissions. I can see my PM just put his head in his hands at this slide. For example, things are very different depending on whether or not you are dealing with something that belongs to an org that is suspended. That has many effects throughout the codebase, and on anything that wants to interact with Cloud Foundry, as we'll discuss later.
Beyond that very crazy grid, roles are fixed. What that means is that if, for example, you have a developer, someone who writes and pushes code, and you want them to be able to do everything that a developer would normally do, but not see environment variables, because those often contain credentials that you don't want your developers to have access to for security reasons, you can't do that today. You can either have a space developer, or you can have someone who is not a space developer and cannot push code. It's one or the other. There's no middle ground.
There's also minimal coordination or centralization across components. There's no easy way of saying, "I want my space developers to also have access to this set of credentials in CredHub," let's say. Today that is a very manual process. Same thing with managing, let's say, BOSH teams, which are, I believe, a fairly new feature. Oh no, I can no longer see Dimitri, so we'll assume that it's a fairly new feature.
Synchronization with external identity providers is also pretty difficult.
There are various tools, some of them open source, for dealing with that, but most people end up having to roll their own solution. So it's not really that fun. It means that every time someone, let's say, transfers teams or joins the company, you need to give them all of the right permissions. And the same goes, in reverse, when they leave. That one's maybe a bit less frustrating to employees, but more important from a security perspective.
Also, Cloud Controller permissions are really tightly coupled to roles. Let's say that you were a different Cloud Foundry component, or an app or something, and you wanted to check whether a user could push an app or see credentials or anything like that. You'd have to ask: are they a space developer? Are they a global admin? Are they an org manager? And so on, rather than just asking, "Hey, can this person push an app?"
So let's go on to the primer stage, with a quick overview of authentication versus authorization. Authentication is a matter of who you are, or what you are, in the case of, let's say, an app or any other silicon entity. In this case, we see that this is the Ikea monkey. There are very many monkeys around; we know that this is the one. On the other hand, authorization is about what you can do. In this case, we see that this is a bear. Bears cannot use cat doors, so it doesn't get to use the cat door; it just gets its head stuck.
OAuth 2 is one of the more popular frameworks for authorization. It's often used, either with additions like OpenID Connect or with additional endpoints, to also work for authentication. That is what UAA does, for example. It provides both OAuth 2 and OIDC, or OpenID Connect, so that you can check both someone's identity within Cloud Foundry and whether, for example, they are a Cloud Controller admin, or whether they have the Cloud Controller write scope, or, let's say, the credhub.read scope, and so on.
The basic flow is that you have a user who owns some resources, and an application, the client, that is trying to access those resources on behalf of that user. The application must get authorization from the user, for which it asks the authorization server. It then uses that authorization, via a token, to ask the resource server to provide those resources. This is very commonly used, as I mentioned, and it is currently used to some extent throughout Cloud Foundry. However, you can see that our model within Cloud Foundry itself is a bit different because, as an example, the Cloud Controller is both the resource server and the application. It doesn't make sense for it to be asking UAA whether a user can access something, because it is also the resource server.
There are also access control lists, which are basically a list of permissions per user. They check whether an actor has, let's say, the write permission for a document, and if so, that user is allowed to edit the document. Let's say that I grant one of you write access to my slides: then you would have edit access. Versus me just granting, let's say, global read access: now every one of you would be able to read them, but not edit them.
Then there's role-based access control, which is what Perm is focusing on today, so this is a bit more relevant to us. With roles, rather than checking directly whether a user has a permission, you're able to say that all users who are members of a particular role or group have that given permission. For example, everyone who is a member of team Perm on GitHub can push code to Perm. We also work pretty closely with the CAPI team, so we've given all the CAPI team members, or Cloud Controller team members, write access as well.
So everyone who is a member of Perm and everyone who is a member of Cloud Controller can push code. I'm a member of Perm. Therefore, can I push code to Perm? Yes.
Currently in CF, we have CredHub, which uses OAuth for basic, gated access to the API, and then ACLs for more fine-grained access to individual credentials. BOSH uses OAuth both for that general gated access and for teams, so you can limit access to, I believe, not a director, but particular aspects of BOSH to a given BOSH team. And then Cloud Controller uses OAuth with kind of a special type of RBAC. Going back to RBAC for a second: normally you have both the aspect of who is a member of what role and what permissions that role has. Currently in Cloud Controller, you just have the roles; they aren't broken down into fine-grained permissions. You have space developers, and then you have this nice grid, but there's no direct knowledge in Cloud Controller itself of what a space developer means. Let's say that I'm trying to push an app: it'll say, "Isabel is a space developer." It doesn't say, "Isabel can push an app." That coupling of knowledge makes things very tricky.
So our solution is Perm. This is our mascot; specifically, the wig is our mascot. Perm is about answering two main questions. Can a given actor perform a particular action on a resource? And for which resource patterns (I'll get into the distinction between a resource and a resource pattern) can a given actor perform a given action? In order to do those two things, Perm needs to have a concept of what roles a user is a member of, as well as what permissions those roles grant. So you can say, for example, that a space developer role entails the permission to push apps for that space, or to view credentials for that space, let's say.
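[Editor's sketch] To make those two questions concrete, here is a minimal sketch in Python. It is not Perm's actual API: the class, method, role, and action names are all hypothetical, and resource patterns are matched by exact string equality purely to keep the illustration small. It only shows the idea from the talk that access is derived from two separate facts, which roles an actor has and which permissions each role grants.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Permission:
    action: str            # hypothetical action name, e.g. "app.push"
    resource_pattern: str  # exact match only in this sketch


@dataclass
class RoleStore:
    # role name -> set of permissions that role grants
    role_permissions: dict = field(default_factory=dict)
    # actor -> set of role names the actor is a member of
    actor_roles: dict = field(default_factory=dict)

    def create_role(self, role, permissions):
        self.role_permissions[role] = set(permissions)

    def assign_role(self, actor, role):
        self.actor_roles.setdefault(actor, set()).add(role)

    def list_resource_patterns(self, actor, action):
        """Question 2: for which resource patterns may the actor perform the action?"""
        patterns = set()
        for role in self.actor_roles.get(actor, set()):
            for permission in self.role_permissions.get(role, set()):
                if permission.action == action:
                    patterns.add(permission.resource_pattern)
        return patterns

    def has_permission(self, actor, action, resource):
        """Question 1: may the actor perform the action on this resource?"""
        return resource in self.list_resource_patterns(actor, action)


store = RoleStore()
store.create_role(
    "space-1-developer",
    [Permission("app.push", "space-1"), Permission("credentials.read", "space-1")],
)
store.assign_role("isabel", "space-1-developer")

print(store.has_permission("isabel", "app.push", "space-1"))
print(store.has_permission("isabel", "app.push", "space-2"))
```

The caller asks "can this person push an app here?" and never needs to know that the answer came from a space developer role, which is the decoupling the talk is after.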
The distinction between resources and resource patterns basically allows us to (bless you) manage hierarchies, so that, for example, all org managers can do various things in all spaces that are contained in that org, by virtue of the fact that they have this parent role.
Technically, we decided that Perm should be a gRPC API written in Go. (Thank you, autocorrect, for upcasing gRPC.) It's a standalone server with different language-specific SDKs. For example, there's a Ruby SDK for CAPI. We also have a little Go SDK; that one's still in the works, but we have a bunch of Go tooling ourselves and therefore added the Go SDK for ourselves. All of this is open source, so you can just go to GitHub and see it.
We've been really focusing on implementing it iteratively. It's in, for example, Pivotal Web Services, but everything is behind not just a flag but also a tool called Scientist, which was originally written by GitHub and is maintained by them, and which basically allows you to run two or more code paths and diff the results. It's kind of like A/B testing, but for big code refactors. It's really handy.
Moving forward, our plan is to become Cloud Controller's source of truth. As I mentioned a moment ago, we're currently using Scientist, which means that we can have all writes go to Perm as well as to Cloud Controller's current database, while reads still get served from CC's database, and then compare those reads to what Perm says. For example, if you have a Cloud Foundry instance that has been alive for, let's say, five years, and you haven't migrated the data into Perm, things are going to be pretty out of sync. You can check that via Scientist and see that the Perm results would be blocking users from accessing things that they should be able to use. Then you can run the migrator.
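[Editor's sketch] The compare-two-code-paths idea can be sketched in a few lines of Python. This is not the Scientist library (which is a Ruby gem) and not Cloud Controller's code; every name below is hypothetical. The existing path stays authoritative, the new path runs alongside it, and any disagreement is recorded rather than acted on.

```python
def run_experiment(name, control, candidate, mismatches):
    """Return the control result; record any disagreement with the candidate."""
    control_result = control()
    try:
        candidate_result = candidate()
    except Exception as exc:
        # The candidate path must never break the real request.
        mismatches.append((name, control_result, repr(exc)))
        return control_result
    if candidate_result != control_result:
        mismatches.append((name, control_result, candidate_result))
    return control_result


# Hypothetical stand-ins for Cloud Controller's database and Perm's data.
cc_roles = {("isabel", "space-1"): "space_developer"}
perm_roles = {}  # nothing migrated yet, so the two paths will disagree


def can_push_via_cc(user, space):
    return cc_roles.get((user, space)) == "space_developer"


def can_push_via_perm(user, space):
    return perm_roles.get((user, space)) == "space_developer"


def check_push(user, space, mismatches):
    return run_experiment(
        "app.push",
        control=lambda: can_push_via_cc(user, space),
        candidate=lambda: can_push_via_perm(user, space),
        mismatches=mismatches,
    )


before = []
allowed_before = check_push("isabel", "space-1", before)

perm_roles.update(cc_roles)  # stand-in for migrating existing data

after = []
allowed_after = check_push("isabel", "space-1", after)

print(allowed_before, len(before), allowed_after, len(after))
```

In both runs the user is allowed, because the control path decides. The only thing that changes is the mismatch count, which is the delta that should fall to zero once the data is in sync.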
We have a migrator tool that lets you populate Perm from an existing Cloud Controller. After you've run that, you should see the delta go down to zero, meaning that both are representing the same data.
We are also going to be working on external group management; in particular, that will be LDAP management. You can currently sync your LDAP and other identity provider groups with UAA, but you can't then have those converge with your Cloud Controller roles. Our plan (something happened back there) is to make that happen, so that you will be able to say, "All members of this team are space developers in this space." So-and-so, let's say myself, Isabel, has now joined this team; therefore, she's a space developer. Now Isabel has left the team; therefore, she's no longer a space developer. Or, if you also had a dedicated Concourse pipeline for that team, then in the future you would be able to say that all members of the team can both be space developers and update that pipeline. And once someone leaves the team, they'll just automatically have those permissions revoked. If you've been following any of the data breaches that have happened over the years, a common pattern is ex-employees, or people who switched teams, still having access to things that they shouldn't. So the idea is to really help prevent that.
We're also going to be working on custom roles and fine-grained permissions for CAPI. That means that, for example, you'll be able to have someone who has all of the permissions that a space developer has today, except for reading or writing credentials or other environment variables. That should be pretty powerful.
And while we are working with Cloud Controller first, the plan is to work with other components in the future.
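[Editor's sketch] The group mapping described above is roadmap work in the talk, so the following Python is only an illustration of the idea, with hypothetical names throughout and no relation to Perm's real interface. Roles are attached to a group rather than to individuals, and a user's roles are derived from current group membership, so leaving the group revokes access without any separate cleanup step.

```python
class GroupMappedRoles:
    def __init__(self):
        # group name -> set of users (in practice this would come from LDAP
        # or another identity provider; here it is set by hand)
        self.group_members = {}
        # group name -> set of (role, scope) pairs granted to every member
        self.group_roles = {}

    def set_members(self, group, members):
        self.group_members[group] = set(members)

    def map_group_to_role(self, group, role, scope):
        self.group_roles.setdefault(group, set()).add((role, scope))

    def roles_for(self, user):
        """Derive roles from whichever groups the user belongs to right now."""
        roles = set()
        for group, members in self.group_members.items():
            if user in members:
                roles |= self.group_roles.get(group, set())
        return roles


mapping = GroupMappedRoles()
mapping.map_group_to_role("team-perm", "space_developer", "space-1")
mapping.map_group_to_role("team-perm", "pipeline_updater", "perm-pipeline")

mapping.set_members("team-perm", ["isabel"])   # Isabel joins the team
roles_while_member = mapping.roles_for("isabel")

mapping.set_members("team-perm", [])           # Isabel leaves the team
roles_after_leaving = mapping.roles_for("isabel")

print(sorted(roles_while_member), sorted(roles_after_leaving))
```

Because nothing is stored per user, there is no stale grant left behind when membership changes, which is the ex-employee problem the talk wants to prevent.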
Things like BOSH: Dimitri is pretty excited about that and has some kind of crazy ideas, I think. CredHub: again, that would allow you to sync things so that, for example, a given app automatically has permission to read its own credentials, let's say. Concourse: they're doing some more identity-related work right now, but we are hoping to do some more authorization with them in the future. And again, that would ideally allow you, especially with the custom roles and the group mapping, to have a team that automatically gets access to these resources in CF itself as well as in your external tools like Concourse.
Those are the main things that we're talking about right now. However, we do have office hours in about an hour or so, and my PM is also sitting back there in the middle; sorry for calling you out. So feel free to come talk to us later today about what you want. And with that, how are we on time? Any questions?
So, for those who aren't familiar with cf-mgmt, it's a tool that basically lets you use a Git repo (it doesn't need to be on GitHub) to manage roles, kind of like what I just described. The idea is that, most likely, cf-mgmt would be working with Perm under the hood, so you wouldn't have to stop using cf-mgmt if you like it. But those who don't want to use it, because it isn't maintained as officially, for example, wouldn't need to.
Yeah, that's a good question. Perm itself knows nothing about BOSH or Cloud Foundry. You can just start the server; it's a Go binary, and you can build it or download it and run it. It's also available as Docker images or, of course, as a BOSH release. So it can be run however you want: standalone, or easily through BOSH.
We are discussing the bootstrapping problem. I don't, like, have, for example, I've dealt with that. Back there? I might defer that one to my PM.
Our tracker is also public. Yeah.
I keep meaning to put it into our repo README. I saw a hand over there, I think. Yeah.
Yeah, that one: we haven't started doing the group mapping yet. That's actually something that we are starting, I think, this week while we're in Boston ("we" meaning my teammates, not me, clearly, because I'm standing up here) and next week. The idea is that, most likely, for things like groups that don't exist in Cloud Controller, it would be available through the cf CLI, probably at first as a plugin; later on, it would be integrated into the core CLI. For existing functionality, nothing will change. Right now, as Christopher mentioned, we have been refactoring CAPI so that it uses Perm under the hood, and that's been running in a big production server for several months or so now. We just haven't actually flipped the switch so that it is used as the source of truth. Code-wise, we basically have all writes going to Perm, and the reads of the new V3 of CAPI going to Perm, but not yet V2. That's the trickier part, and that's why the estimate was a bit fuzzy.
It depends on which writes. Writes for group mapping, so saying that members of this group map to that LDAP group, let's say, would most likely go through the cf CLI.
Sorry, I missed half of that. So, saying that a group gets tied to a particular role would go through the CLI; whether it gets proxied through CAPI or not is still a bit up in the air. If you have thoughts on that, we're happy to hear them later today. Yep.
If I understand your question, is it whether a single user can get matched to multiple roles? Yeah. So, kind of two parts there. A single user can be part of as many roles as you want.
We've done some pretty serious benchmarking, and I think our biggest user had something like 50,000 roles, and it was pretty performant at that point; more performant than CAPI is today. As for the second part, whether you can have a single role that maps to multiple spaces or orgs: with the custom roles, you can do whatever you want.
Back there? With what API? Yeah. With the custom role functionality and the fine-grained permissions, it should do.
Any others?
Most likely, yeah. I'm not super familiar with CFCR, but we've been tentatively talking about that. I know that the Kubernetes RBAC system is pretty flexible, and our PM, Christopher, has done some early prototyping with it, and it looked pretty feasible. So the plan is yes; it's a question of when.
Anything else?