Welcome to KubeCon Europe 2021. My name is Marc Boorshtein. I'm the CTO of Tremolo Security, and we are going to talk about RBAC. So first I'm gonna give a real brief overview of who I am and why I can talk to you about RBAC. We'll then get into what RBAC can and can't do. We'll talk about some of the sharper edges around RBAC, where authentication comes into play, and why that's important. And we'll start talking about some of the business rules of RBAC, especially in multi-tenancy. Finally, we'll get into how you handle all of these concerns, with a quick demo of how all this stuff comes together.

So who am I? My name is Marc Boorshtein. Like I said, I'm the CTO of Tremolo Security. I've got 20-plus years of experience in identity management. I've been working with Kubernetes since about 2016, with several contributions to the project. You can often find me in the Slack channels talking about authentication and authorization. And finally, I just released a book I co-authored called Kubernetes and Docker: An Enterprise Guide.

So what's RBAC and what can it do? RBAC is the authorization system for the Kubernetes API. You can use it for other things, pod security policies as an example, but you really shouldn't if you don't need to. There are different authorization models out there that you can use. You shouldn't try to stretch RBAC outside of the Kubernetes API. One of the great things about it is that it lets you centralize your API authorization, so you can define your roles globally and apply them locally inside of individual namespaces. It's a good idea to cross the streams to do that.

One thing that's really important: you cannot write a policy that says I wanna do everything except X. I wanna give access to everything except these four secrets in a namespace. Or I wanna give cluster admin access to everything except for these three namespaces, or the kube-system namespace. Those types of policies are really difficult to implement.
RBAC won't implement them for you. There are different ways that you can do that, but quite frankly, if that's your model, you probably need to reevaluate what you're doing.

All right, so what makes RBAC so hard? Once we start getting into the syntax of the objects themselves, you'll see that they're pretty straightforward. There isn't a lot there, but there are a lot of things that give it really sharp edges. And the way I have this table broken up, I wanted to show that while some of these things make it harder to get started, in the long term, as you approach maintenance, what's often called day two maintenance, they actually make it a lot easier. So there's a steep learning curve, but once you get over that hump, it makes life a lot simpler.

So the first is no referential integrity. If you create an RBAC policy that references other objects, and this is true of almost anything in Kubernetes, and those objects don't exist, Kubernetes isn't gonna tell you. So that can make debugging a little bit difficult. You gotta be really particular in how you set things up.

Users and groups: they don't exist. There are a couple of corner cases. Service accounts exist, and service accounts have some static groups. But in general, you can't just define an object called a group and add members to it. It doesn't exist. You can't define a user object, at least in upstream Kubernetes. OpenShift is a little bit different.

You can combine cluster roles and role bindings, which makes things easier in the long run, but can often make it harder in the short term because people have a harder time conceptualizing it. I know I did when I first got started.

This one I have marked as harder and harder: authorization is a business problem. We're gonna talk a lot about this with multi-tenancy.
Authorization tends to follow the ultimate problem you're trying to solve, and a lot of the issues around that are business issues, or mission issues if you're a government agency. And so that's something that doesn't get fixed easily with technology.

The APIs in Kubernetes are not self-documenting. So when you go to design your policy object, what you're gonna find is that you can't look at a YAML and say, oh, that YAML tells me everything I need to know to design a policy around it.

And then, same as what we talked about earlier with not being able to write a policy that says everything except, there's very limited wildcard support. Basically you can either specify and enumerate all of what you want, or use a wildcard of star for everything. What you can't say is, hey, I want a rule that applies to anything with a name that starts with "generation", for whatever reason. You can't do that with RBAC. And there are really good reasons for that. One of which is, let's say you deploy an operator that builds cloud objects, and one of them just happens to have a name that matches that wildcard and you didn't realize it. Well, now you've authorized access to it. So RBAC is very explicit. You have to explicitly authorize everything you want to do.

And then finally, this last one: you can directly reference users. I mark this as easier going to harder. A lot of people start designing their policies by referencing users directly. This is really difficult to manage long-term, especially if you have a large number of users. And so I always recommend referencing groups, and then storing your groups inside of an identity provider.

So speaking of identity providers, let's talk about authentication here for a minute. These are the most common authentication methods. The first one everybody runs into is certificate authentication. Your user is stored in the subject's CN, and groups can be stored as O values in the subject. You should not use it.
Kubernetes does not support certificate revocation, which means once a certificate has been minted, it will be accepted by Kubernetes until either it expires or the certificate authority gets changed to a different certificate. So break glass in case of emergency, but really it's just not dynamic enough for day-to-day use.

Service account tokens should never be used from outside of the cluster. They're only ever designed to work in-cluster. The TokenRequest API kind of changed that a little bit, but you still need a way to authenticate and get that token. We have a link at the end about how you can do more secure access from outside your cluster.

Now impersonation: this is where you have a reverse proxy between your users and your API server. The reverse proxy authenticates you however it needs to and then injects headers into the request. This is a very powerful tool, especially when you're talking about cloud-managed Kubernetes where you don't want to use cloud IAM.

OpenID Connect: this is the best way to go, to be honest. Your token stores who you are and what you can do, and you can have short-lived tokens, which really adds to your security profile.

And then finally, you have your cloud IAM implementations. Every cloud has its own implementation, and they all have different ways of managing user and group identifiers. Cloud IAM and impersonation are often joined at the hip, because this gets back to the business question. A lot of times the cloud group doesn't want to own who has access to Kubernetes. They want to just deploy it, stay out of the way, and let the app owner or the team manage who has access. And so impersonation is a great way to do that.

So let's talk about multi-tenancy. There are different ways to look at multi-tenancy. This isn't a case of picking one or the other. It's more like you're going to have all of these, depending on what you're trying to accomplish.
So the first type of multi-tenancy here, which is one that's really popular these days, is this idea of having the tenant at the cluster level. So you have some kind of control plane that is responsible for provisioning clusters, and those clusters are owned by teams or individual applications. The nice thing about this approach is it limits your blast radius. So if your application gets compromised, somebody gets access to a node, they cause some problems, you're not affecting other applications.

The downside to this approach is that it's really poor on utilization. Not just hardware utilization or VM utilization, but also utilization of your people. There's a big difference between somebody who's an expert at writing applications that'll run on Kubernetes and somebody who knows how to manage the nuts and bolts of Kubernetes. There's overlap, but they aren't the same skill sets. It's why you have different tests, right? You have a CKA and a CKAD. And so having a multi-cluster environment means that you have to have more people who know how to manage an individual cluster, versus how to write applications for the cluster. Also, you haven't eliminated your boundaries in RBAC, you've just moved them. So maybe you're doing less work in your Kubernetes RBAC, but you've now moved the authorization layer into the control plane.

Pipeline tenancy: there are a lot of implementations that'll say, you know what? I don't want users interacting with the API server. I just want them checking in their code. I'm using air quotes here. Code could be application code, or infrastructure as code if we're talking about GitOps. There are some implications here, though. Just like with cluster multi-tenancy, with this method your pipelines still need a way to be multi-tenant. You're just moving your authorization layer from Kubernetes into your pipeline system.
Additionally, as you look at how you design this out, you have to think about silos and management, especially in the enterprise. The people who write the apps, who are responsible for the apps, and whose paycheck is ultimately based on things like uptime, are gonna want as much control as possible. So if you're between them and their application when something goes wrong, you're ultimately gonna get the blame for it. And so a lot of cloud teams that start with this approach make this an option, but not a requirement.

Something else to think about when you're looking at pipeline tenancy, and using GitOps especially: make sure you are aware of what can be provisioned through a Git repository. If you are using GitOps and, for instance, you don't want people to commit RBAC bindings, or you don't want them to commit things outside the namespace, often that can't be controlled via RBAC anymore, because you're using a service account to talk to the API server through your GitOps controller, depending on which GitOps system you're using. So be very mindful of the GitOps controller's controls on what can be written.

And then finally, namespace multi-tenancy. This is where you get the most richness with RBAC. Every application gets its own namespace or set of namespaces. We'll talk a little bit more about that when we get into the toolbox. Authorization will be controlled by admins, and often you don't want system admins controlling that authorization. You want application or team admins controlling it. You get much higher density, not just of your resources from a hardware standpoint, but also from a people standpoint. You can better utilize your cluster admins and have application admins inside of each team. Of course, the major downside to this is that if you don't have proper controls in place, you can have a larger blast radius.
Somebody comes in, there's a bug in their application, it leads to somebody owning the node, and all of a sudden you've exposed multiple applications. So like I said before, this isn't a case where you're gonna pick one out of these three. You're probably gonna end up using all three of these for different scenarios. And that's where it gets back to the business rules, and the business problem that authorization needs to solve, being the hard part.

So let's get into the guts of what RBAC actually looks like. Here we have an RBAC cluster role for certificate signing requests. There are a couple of things to point out. I color coded this. The red stuff you don't care about. The blue stuff comes right out of your URL. So your API group comes right out of the URL for your cluster role. Your resource also comes out of your URL. A couple of things to note. Your version doesn't matter. Versions are not part of RBAC. Also, if you have any sub-URLs, any subresources off of your resource, like approval in this instance, or logs and exec on pods, those have to be explicitly enumerated. Picking the top level doesn't automatically inherit the ones below it. You have to explicitly say, give me access to those. And then finally you list your verbs.

So once you have your role, you then have to bind it to subjects. And this part, again, is pretty straightforward. The red part doesn't really matter. That's just your metadata. Your role ref: what role are you going to reference? If this is a role binding, you can still reference a cluster role. This way you can centralize your management of roles without having to recreate them in every single namespace. And then finally you list your subjects.

Let's talk a little bit about subjects. There are three types. Service accounts: each service account has to be scoped to a specific namespace. Users: for OpenID Connect, if you're not using your email address, you have to prepend the issuer URL to the user name.
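As a rough sketch of the cluster role and binding described above, the Python below builds the two objects as plain dictionaries and derives each rule from a request path, dropping the version and keeping the subresource. The helper `rule_from_path`, the object names, the namespace `team-a`, and the group name are all hypothetical and chosen for illustration; the path parsing is deliberately simplified and does not cover every API path shape. The binding references the built-in `admin` cluster role from a namespaced role binding and uses a group as its subject.

```python
import json


def rule_from_path(path, verbs):
    """Derive an RBAC rule from an API request path (simplified, illustrative)."""
    parts = [p for p in path.split("/") if p]
    if parts and parts[0] == "api":        # core group: /api/<version>/...
        group, rest = "", parts[2:]
    elif parts and parts[0] == "apis":     # named group: /apis/<group>/<version>/...
        group, rest = parts[1], parts[3:]
    else:
        raise ValueError(f"not a Kubernetes API path: {path}")
    # Skip the namespace scope; RBAC rules do not name the namespace.
    if len(rest) >= 3 and rest[0] == "namespaces":
        rest = rest[2:]
    if not rest:
        raise ValueError(f"no resource in path: {path}")
    resource = rest[0]
    # <resource>/<name>/<subresource> becomes "<resource>/<subresource>".
    if len(rest) >= 3:
        resource = f"{rest[0]}/{rest[2]}"
    # The version segment was discarded above: versions are not part of RBAC.
    return {"apiGroups": [group], "resources": [resource], "verbs": list(verbs)}


CSR_PATH = "/apis/certificates.k8s.io/v1/certificatesigningrequests"

cluster_role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "ClusterRole",
    "metadata": {"name": "csr-approver-example"},  # hypothetical name
    "rules": [
        rule_from_path(CSR_PATH, ["get", "list", "watch"]),
        # The subresource must be listed explicitly; it is not inherited.
        rule_from_path(CSR_PATH + "/example-csr/approval", ["update"]),
    ],
}

role_binding = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "RoleBinding",
    "metadata": {"name": "team-a-admins", "namespace": "team-a"},  # hypothetical
    # A namespaced RoleBinding may still reference a ClusterRole.
    "roleRef": {
        "apiGroup": "rbac.authorization.k8s.io",
        "kind": "ClusterRole",
        "name": "admin",
    },
    # Bind to a group kept in the identity provider, not to individual users.
    "subjects": [
        {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "Group",
            "name": "k8s-team-a-admins",  # hypothetical group name
        }
    ],
}

if __name__ == "__main__":
    print(json.dumps(cluster_role, indent=2))
    print(json.dumps(role_binding, indent=2))
```

The printed JSON can be fed to the API server the same way a YAML manifest would be, since the two formats describe the same objects.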
And please don't use email addresses as your identifier. Names change for a lot of different reasons. Emails are tied to names. It's better to use something that will never change, no matter what happens with the person's name. And then finally groups, and this is really where you want to be: let the identity provider store groups, and then specify access by group. You'll have smaller RBAC definitions. They'll be easier to manage, and it's so much easier to audit a directory or a database than it is to try and audit a Kubernetes deployment. In Kubernetes, you can't say give me all the users that are a member of this RBAC policy. You have to literally enumerate every single policy to figure out what's going on. And if you have a large implementation, that can get difficult quickly.

It's a good thing to cross the streams. Like I said, use your cluster roles from role bindings. This will allow you to centrally manage your roles. Like I said, use groups. And then finally, don't use service accounts from outside your cluster.

All right, so we're gonna go through a few tools here that'll make it easier for you to manage RBAC. The first one is this idea of an aggregate role. If you look at the admin or edit cluster roles, you'll see these giant roles that have access to almost everything. The admin cluster role is designed to let somebody own a namespace without affecting anybody else in the cluster. So you can change RBAC, you can create secrets and config maps and pods and whatnot. But you can't, say, create a resource quota, because that would affect everybody else. And so to build that role, you could either have this big giant static role, or what the Kubernetes API server does is let you aggregate additional permissions in.
So instead of having this one big giant static role that you have to update as new objects get created, you create another cluster role and you add labels to it, like aggregate-to-admin, and the role aggregator says, okay, I'm just gonna go ahead and add those to the admin cluster role. So if you're defining a custom resource for an operator to deploy things into your cloud, instead of having to define a cluster role and assign it to everybody, you add this label to the cluster role and anybody who's an admin will be able to create that object.

Number two in your toolbox: automation. Everything's API driven, right? So you wanna be able to automate, automate, automate. Whatever's repetitive, go ahead, automate it. This gives you the ability to have naming standards. There are lots of tools to do this. OpenUnison, our open source project, is what I'm gonna show in the demo, but there's also Terraform, continuous delivery tools, do it yourself. There's no end of ways that you can automate this. And so you can see here, we've got three namespaces, each namespace gets a couple of role bindings, they go to groups, everything is consistent, it's much easier to manage, and you as a human are not building stuff.

Custom controllers. We talked about the role aggregator. Another one that I really like is the Fairwinds RBAC Manager. Instead of having that repetitive role binding in every single namespace, and having this proliferation of groups, you get to have a single object that defines a label and says any namespace with this label gets this role binding. And so now you can have a team-based approach where different namespaces get team labels, and then the custom controller automatically generates your bindings for you. This makes it a lot simpler to manage, cuts down on the repetitive objects you need to create, and is much more expressive.
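To make the label-driven approach concrete, here is a small Python simulation of what such a controller does conceptually: given namespaces with labels and a list of definitions that pair a label selector with a group and a cluster role, it generates one role binding per matching namespace. This is only an illustration of the idea, not the Fairwinds RBAC Manager API or its object format; the function names, the `team: demo3` label, the namespace names, and the group names are all assumptions made for the example.

```python
def matches(selector, labels):
    """True when every key/value pair in the selector is present in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def generate_bindings(namespaces, definitions):
    """Build one RoleBinding per (namespace, definition) pair whose labels match."""
    bindings = []
    for ns_name, labels in sorted(namespaces.items()):
        for definition in definitions:
            if not matches(definition["matchLabels"], labels):
                continue
            bindings.append({
                "apiVersion": "rbac.authorization.k8s.io/v1",
                "kind": "RoleBinding",
                "metadata": {
                    # A consistent naming standard comes for free with automation.
                    "name": f"{definition['group']}-{definition['clusterRole']}",
                    "namespace": ns_name,
                },
                "roleRef": {
                    "apiGroup": "rbac.authorization.k8s.io",
                    "kind": "ClusterRole",
                    "name": definition["clusterRole"],
                },
                "subjects": [{
                    "apiGroup": "rbac.authorization.k8s.io",
                    "kind": "Group",
                    "name": definition["group"],
                }],
            })
    return bindings


# Hypothetical team definitions: one admin group and one view group per team.
definitions = [
    {"matchLabels": {"team": "demo3"}, "group": "k8s-demo3-admins", "clusterRole": "admin"},
    {"matchLabels": {"team": "demo3"}, "group": "k8s-demo3-viewers", "clusterRole": "view"},
]

# Hypothetical namespaces; "other" carries no team label and gets no bindings.
namespaces = {
    "u1": {"team": "demo3"},
    "u2": {"team": "demo3"},
    "other": {},
}

if __name__ == "__main__":
    for binding in generate_bindings(namespaces, definitions):
        meta = binding["metadata"]
        print(meta["namespace"], meta["name"])

    # Removing the label from a namespace removes its generated bindings
    # the next time the desired state is computed.
    namespaces["u2"] = {}
    remaining = {b["metadata"]["namespace"] for b in generate_bindings(namespaces, definitions)}
    print(sorted(remaining))
```

A real controller would also reconcile the generated bindings against what already exists in the cluster, creating and deleting objects as labels change; this sketch only computes the desired set.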
We're gonna use Fairwinds RBAC Manager, which is a great tool for this, but hierarchical namespaces are another way to solve the same problem with a team-based approach.

And then finally, policy generation. audit2rbac is a great tool written by Jordan Liggitt. When you get your error messages, they don't give you enough information. You need something that's machine readable. So if you look at the event, you can see this comes right out of the audit log. You can see the verb that was used, the request URI, what was actually requested, and the user and groups that made the request. So this gives you enough information. So always enable your audit log. You wanna try something, you see that it fails, you tell this tool to go look at your logs, and it'll generate a policy for you. I use it all the time. It's a great system.

All right, so let's talk about the demo real quick here. We're gonna go with the team-based multi-tenancy approach. I see this a lot, where each team can have multiple namespaces. Sysadmins don't wanna be in the job of creating those namespaces or adding access to those teams. So each team's gonna get its own admin and view group. And then once the team gets created, admins inside of that team are able to create namespaces for their team. Somebody wants access, they get access to the team, and they then automatically have access to everything else.

So how does that get implemented? Well, OpenUnison's gonna do the automation. An admin's gonna go in and request to create a team. That'll create groups inside of a database. It'll create an RBAC definition for Fairwinds to run off of. And then when they create namespaces, each namespace will get a label for that team. And then the Fairwinds RBAC Manager will go ahead and generate the RBAC binding.

So let's go ahead and log in here. I'm gonna first log in as an admin, and then I'm also gonna log in as a non-privileged user.
So the first thing I'm gonna do is create a new team. We're gonna call this demo three. Now that we've created our team, we're gonna go ahead and approve the team's creation. And once that's confirmed, we're gonna go over to the dashboard and give this a quick refresh. And we'll see that we now have two RBAC definitions, one for demo admin and one for demo view. And if we take a quick look, we'll see that it's bound to a group, that group exists inside of our database now, and it's gonna match any namespace with the label team demo three.

So as an admin, I'm now gonna go ahead and log in. Now let's create a couple of namespaces. So local deployment, we're gonna create it for team demo three, and we're gonna call this U1. And let's create another one for that team, U2, and then finally U3. All right, so those objects have been created.

And I'm gonna go into the Kubernetes dashboard here as my regular user. We'll see that the namespaces have already been created, but I don't have access to them. It's forbidden. So let's go ahead and request access. I'm gonna come over here, and I wanna be an administrator. You can see that we now have demo three in here. So I'm gonna add it to my cart, do dev, submit the request. The team admin now gets an email that says, hey, you've got an open request, go ahead and approve it. So it's all done without having to create any code control. I'm gonna approve the request.

And if I come over here and go to our reports, this is where it becomes so important to be able to externalize your group memberships. You can see here that we can now run reports against the database. We don't have to go against the API server to see who has access to what. The groups are stored right in the database. So I'm gonna go ahead and log out of jjaxson and log back in as the user. So let's log in as genjaxson and let's go into the dashboard. Nope. And so now if I go to U2, I now have access.
I'm able to get in, I will see pods and secrets and config maps and, am I in the right one? No, I'm not. Demo three, or oops, U2. There we go. We can see there's the kube-root-ca.crt, the secrets, all there.

Now, here's the great thing about the use of teams. Let's say for some reason we need to cut off access to this namespace immediately for all developers. So I'm gonna come in here as an admin, I'm gonna go into namespaces, I'm gonna go to U2, I'm gonna edit this, and I'm gonna remove this label. So within a moment, what we're gonna find, you can see it happened in real time right there: I no longer have access to that namespace, because Fairwinds RBAC Manager removed the role binding because I changed the downstream object. This also works great if you're doing RBAC, I'm sorry, GitOps, where you're checking this object in. Let the controller do the work for you.

So some resources here: some links to a couple of articles I've written on some topics we've talked about. The source code for the demo will be available by the time that you see this in May, if you wanna take a look at that demo and how we put it together, or even get it running on your own. Of course, Fairwinds RBAC Manager, great tool, highly recommended.

And then finally, some shameless self-promotion. Go ahead and say hi to me on Twitter at mlbiam. If you wanna take a look at OpenUnison, here's a link to our website. And if you're interested in the book, there's a link to that on Amazon. Finally, if you wanna roll your sleeves up and get your hands dirty on RBAC and authentication and pod security policies, we put together a lab that you can just download and do yourself. You just need an Active Directory VM and an Ubuntu VM, and we'll help you deploy everything. It's all right there in GitHub. Thanks and have a great day.