Ready to talk about RBAC? Everybody's favorite subject, right? Yes, we love RBAC. I have candy imported straight from Los Angeles for the best questions, or for whoever wants it when we're done. I was told Laffy Taffy and Atomic Fireballs, and that's it. So please come see me after the talk. We're going to discuss RBAC, and hopefully we can have fun while talking about our friend, role-based access control. If you want the slides, now or later, you can go to tinyurl.com/rback-to-the-future. I promise it's not malware.
We're not really going to talk about what Kubernetes is. I assume we're good there, right? Everybody talks about that. We'll have one slide, one second, on that, and then on why identity is so freaking important these days: who you are, what you can do, what you have access to, and why we're kind of missing out in Kubernetes. There's a lot of opportunity to do this better. Then terminology: roles, cluster roles, role bindings, verbs, subjects, all these things that we may deal with every day, or that could be net new, so we're going to cover them. Then a little bit of the authN and authZ flow. And most importantly, some gotchas. I've given this talk one time before, about three months ago in Norway, and since then a new CVE was released last week that's super exciting, along the lines of RBAC and privilege escalation, which is, well, bad and good.
So let's dive in. We know what Kubernetes is. It's really, really important, or we wouldn't be here. We're running it everywhere. People access Kubernetes, and robots do too: service accounts, or programmatic access. And that's enough about that. So identity is the new perimeter, right? We have lots and lots of vendors downstairs talking about just-in-time access, auditing access to things, and zero trust. We hear this phrase thrown around a lot, but what does it mean in Kubernetes?
And it's kind of a complicated relationship, right? Just because you have a token or a certificate, or an IAM role or something in your cloud provider, that doesn't mean you should automatically be granted access to every single cluster and have cluster-admin. It's a very tangled web of configuration, and that's why we're working on untangling it. I haven't even watched Back to the Future in probably 10 years, but I thought it sounded good, so there will be references sprinkled throughout. Hopefully you'll enjoy that.
As Ben mentioned, I am the project lead for the OWASP Top 10 for Kubernetes. OWASP is the Open Web Application Security Project. That doesn't sound very infrastructure related, but it's getting there, right? We're starting to see developers who write application code get involved in some of these decisions, and there was a real need for and draw towards this project. You can see the top 10 over here, and highlighted you'll see overly permissive RBAC configurations. That's because this is a big deal. You've heard about this thing called cluster-admin, I'm sure, and some of the built-in cluster roles, and using wildcards. That's what we're going to talk about today. And it does make the top 10. I'd argue it should be even higher, but that's where it landed when the list was first created. So time will tell. Please visit the project and contribute: add comments, pull requests, et cetera.
So let's talk about the flow of getting access to Kubernetes. There are a number of ways we do this, but it boils down to the API server. This is probably a flow you've seen before. Up top we have kubectl access: these are humans debugging things, creating pods, et cetera. We have service accounts, typically programmatic access from things on the outside or inside that need to talk to, or retrieve data from, the cluster itself, using things like client-go.
The gist is there's a variety of ways to talk to the API server and get the job done that you need to get done. And the magic happens inside of the API server. First, authentication: is this even a valid user? Do they have the right credential to be requesting this sort of thing? A little bit later in the talk, if we get there, we'll go through a case study of how the authentication and authorization flow works in AWS's EKS, because it's pretty wild if you actually deconstruct it and blow it up. Then the second step, authorization. That's where we're going to focus most of our time today. Authentication is a different talk, a different day, a different beast. Authorization is really: what can this entity do? That's a really important step in this flow. Then we have our friend admission control: policies that govern the request. Do we accept this type of workload to be scheduled onto the cluster? And then, obviously, etcd persistence and things being scheduled on the node. But for today, the middle box, authZ, is the one that matters to us.
Authorization in Kubernetes is usually done with role-based access control, or what we just call RBAC. This is a very fine-grained configuration that you, the operator, somebody who's built the cluster and maintains the cluster, maybe in partnership with the security team, need to scope down to limit access for service accounts, humans, et cetera. It sounds easy in theory until you realize how many different pieces make up RBAC. At the end of the day, something needs to access Kubernetes: there's some engineer somewhere, some service account, some dashboard that needs to talk to the API and do the thing it needs to do. And we have to grant that access and nothing more. This is our favorite term, least-privilege access, which is pretty hard, as we'll see throughout this journey.
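To make the order of those boxes concrete, here is a minimal Python sketch of the stages described above: authentication, then authorization, then admission control. It is a toy model, not how the API server is implemented. The token table, the allow list, and the admission rule are all hypothetical stand-ins for real cluster state.

```python
from dataclasses import dataclass


@dataclass
class Request:
    token: str
    verb: str
    resource: str
    namespace: str


# Hypothetical stand-ins for real credentials and real RBAC state.
KNOWN_TOKENS = {"token-jane": "jane"}
ALLOWED = {
    ("jane", "get", "pods", "default"),
    ("jane", "create", "pods", "kube-system"),
}
WRITE_VERBS = {"create", "update", "patch", "delete"}


def authenticate(req):
    """Step 1 (authN): who is this? Returns a username or None."""
    return KNOWN_TOKENS.get(req.token)


def authorize(user, req):
    """Step 2 (authZ): is this user allowed to do this? Allow-only, no denies."""
    return (user, req.verb, req.resource, req.namespace) in ALLOWED


def admit(req):
    """Step 3 (admission): a made-up policy that rejects writes to kube-system."""
    return req.namespace != "kube-system"


def handle(req):
    user = authenticate(req)
    if user is None:
        return "401 Unauthorized"
    if not authorize(user, req):
        return "403 Forbidden"
    # Admission control only applies to requests that change objects.
    if req.verb in WRITE_VERBS and not admit(req):
        return "rejected by admission control"
    return "allowed"


print(handle(Request("token-jane", "get", "pods", "default")))
print(handle(Request("token-jane", "delete", "pods", "default")))
```

A request can pass authorization and still be rejected one step later, which is why the talk treats authZ and admission control as separate boxes.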
So within the RBAC ecosystem, we have three main components, and we're really just constructing a collection of these when we build an RBAC policy. It's not a single file, and especially not usually a single YAML file. It's a collection. You have users, which can be humans or machines: jimmy@ksoc.com, or a service account for the Kubernetes dashboard, something like that. We have resources, and there are lots of these. This is not a comprehensive list. It's whatever the API in that given version of Kubernetes exposes for us to interrogate or talk to: namespaces, pods, deployments, et cetera. And then operations. These are verbs, pretty much like HTTP verbs, that we perform against the Kubernetes API server. Simply put, RBAC is combining these together in some logical way to give a person or machine access to the resources it needs.
Now some of the terminology. Maybe you've seen these, maybe you haven't, but we have roles, cluster roles, role bindings, and cluster role bindings. Sorry, there's YAML on the screen at 5:30 p.m. that you have to look at. I know you didn't plan for that. You could be out having fun or partying. Roles are built to give you that verb-and-resource connection, and they're tied to a namespace. There are nuances here, but we're going to keep this pretty simple. For example's sake, we have pod-reader, and, well, this is going to do more than read pods, so it's already going to fail the audit. It has a wildcard, or star, on all resources, and it allows get, watch, and list. We don't have denies here. It's allow, and that's it. As we keep going down the path, we have cluster roles. You can see this is fairly similar, but as the name says, it applies to the entire cluster rather than one namespace. And then we bind these roles and cluster roles to subjects through role bindings and cluster role bindings, right?
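As a rough illustration of why that pod-reader role fails the audit, here is a minimal Python sketch that models the role as a plain dictionary and checks a verb and resource against its rules. The manifest mirrors what is described above (star on resources; get, watch, and list as verbs). The namespace and API group are assumptions, and the matcher deliberately ignores API groups and resource names to stay short.

```python
# The role as described in the talk; namespace and apiGroups are assumed.
pod_reader = {
    "kind": "Role",
    "metadata": {"name": "pod-reader", "namespace": "default"},
    "rules": [
        {"apiGroups": [""], "resources": ["*"], "verbs": ["get", "watch", "list"]}
    ],
}


def rule_allows(rule, verb, resource):
    """A rule matches if both the verb and the resource match (or are '*')."""
    verbs = rule.get("verbs", [])
    resources = rule.get("resources", [])
    verb_ok = "*" in verbs or verb in verbs
    resource_ok = "*" in resources or resource in resources
    return verb_ok and resource_ok


def role_allows(role, verb, resource):
    """RBAC is allow-only: any matching rule grants access, nothing denies."""
    return any(rule_allows(rule, verb, resource) for rule in role.get("rules", []))


print(role_allows(pod_reader, "list", "pods"))     # the intended use
print(role_allows(pod_reader, "list", "secrets"))  # also granted by the wildcard
print(role_allows(pod_reader, "delete", "pods"))   # not in the verb list
```

The name says pods, but the wildcard means the same rule matches secrets, config maps, and every other namespaced resource.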
So role bindings and cluster role bindings are two separate objects in Kubernetes. We want to take Jimmy as a subject, or Jimmy as part of a group, or a service account, and bind that entity to a role or cluster role. Here you see this user, Jane, is now bound to the role pod-reader. That completes the loop. Now Jane, in this example, can do everything that's encompassed in that pod-reader role. Great, Jane can do what Jane needs to do. It's similar for a cluster role binding. I think we get the gist. You can scope things down to a namespace, but you can also keep them cluster wide.
So best practice tip number one: I would limit the use of cluster roles and cluster role bindings. Your cluster has lots of namespaces in it. A cluster-wide grant encompasses the kube-system namespace, the default namespace, the Istio namespace, whatever you're running, so you automatically bring all of that into your policy. When possible, we should use roles and role bindings rather than cluster roles and cluster role bindings.
Kubernetes gives us some roles out of the box. You can likely guess which one's most dangerous. We do a lot of audits for RBAC, and we've been doing this for a long time, and it's surprising how many folks just fall back on our good friend cluster-admin at the top. Have we ever used cluster-admin? Yeah, you should keep your hand down. No. This is all you need to do everything possible under the sun in your Kubernetes cluster. It's probably necessary. It's kind of like your AWS root account. It is just wildcards, or as my coworker calls it, the five-star review: star, star, star, star, star. It should be used very sparingly, if at all. Then we have admin, edit, and view, each of which has its place. These are really good getting-started placeholders. Oftentimes we want to scope down permissions.
So we'll give developers edit: read/write access to most objects in a namespace. That couldn't be more vague, so we want to do better than this, and that's what we'll dive into. There's a really good reference here, written by some awesome folks, and I love the title of it. It's an official Kubernetes doc called RBAC Good Practices, because I think we want to get away from the term best practices, potentially. It's a really good reference to help you deconstruct what's going on.
What's wrong with this picture? Is this a good snippet of YAML? No? Why is it not good? It sounds like the service account can do the job it needs to do, though. Yeah, cluster-admin, yeah. You're welcome to some candy after the talk. I actually didn't think this would be a packed house. I would have definitely brought more. So what's going on here? We have a cluster role binding called redacted-rbac, and we have some subjects. What is the default service account in the default namespace? Have we encountered this special little Kubernetes feature? Yeah, there's one of these service accounts in every single namespace. It's just available. You can just use it. You can grab its token from pretty much any pod that spins up, unless you tell it otherwise. And this one is in the default namespace, which is even more sketchy, and it's actually bound to what? The cluster role cluster-admin. So imagine you have a web application running in the default namespace. This service account is mounted into every pod. Remote code execution happens, or somebody accesses the file system. They take that token and present it back to the Kubernetes API server. Who are they? Cluster admin. That's not a good blast radius. That's a pretty bad situation.
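The binding on that slide can be caught with a very small check. Below is a minimal Python sketch, assuming the binding has been loaded into a dictionary with the same shape as the manifest described above (a ClusterRoleBinding named redacted-rbac whose subject is the default service account in the default namespace). The function name `risky_subjects` is made up for this example.

```python
# The binding as described in the talk, expressed as a dictionary.
redacted_rbac = {
    "kind": "ClusterRoleBinding",
    "metadata": {"name": "redacted-rbac"},
    "subjects": [
        {"kind": "ServiceAccount", "name": "default", "namespace": "default"}
    ],
    "roleRef": {
        "apiGroup": "rbac.authorization.k8s.io",
        "kind": "ClusterRole",
        "name": "cluster-admin",
    },
}


def risky_subjects(bindings):
    """Return (binding, namespace, account) for default service accounts
    that are bound to cluster-admin."""
    findings = []
    for binding in bindings:
        if binding.get("roleRef", {}).get("name") != "cluster-admin":
            continue
        for subject in binding.get("subjects") or []:
            if subject.get("kind") == "ServiceAccount" and subject.get("name") == "default":
                findings.append(
                    (binding["metadata"]["name"], subject.get("namespace"), subject["name"])
                )
    return findings


print(risky_subjects([redacted_rbac]))
```

In a real audit the input would more likely come from the `items` list of `kubectl get clusterrolebindings -o json`, parsed with `json.loads`.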
And if you were at KubeCon in San Diego in 2019, we did a 90-minute workshop on some of this, and we have lots and lots of documentation about it. This is the kind of audit you need to do in your organization, and Kubernetes doesn't necessarily make that easy. So we want to try to aggregate our permissions when possible. Here's some open-source tooling: rbac-tool from Alcide, and rakkess. We have Krane. We've built an RBAC aggregation and visualization tool that handles a lot of this for you. There are krew plugins. You can run the kubectl auth can-i command, which is an attacker's best friend. But the long story short is that it's pretty hard to get to the root of who has access to what. If a security person had this information, they would be very surprised by the level of openness that exists inside of Kubernetes.
This next one was a contribution to the OWASP Top 10 not too long ago, and it has also made it into the RBAC good practices guide. Time check. We have to be really careful with list and watch as verbs. You would assume that when you run this kubectl get secrets command, you're going to print out a list of the secrets that are available. But under the hood, you can actually access the contents of those secrets, and that can be misleading. It's not really a bug in Kubernetes. It's not really a vulnerability. It's just how these verbs work, and it won't always be clear from the output of get, or specifically list and watch. There's a longer write-up in the Kubernetes good practices guide, and the OWASP Top 10 also has a really detailed walkthrough of exactly how to do this. It's kind of eye-opening when you show people: well, I don't want the contents of every secret across all namespaces to be visible just because I granted this one permission, right?
And you can see here, if you see something like this, and I'll show you a CVE that surfaced in this vein recently, with list secrets. In human language, when I see list and secrets, I would expect to just see the names of the secrets. But if you run a curl command instead of kubectl, or if you pass -A -o yaml, you're actually going to print the contents out, even if you only have that watch or list verb. So it's a little misleading, it's under-documented, and it's one of those things that's really, really hard to detect.
So this was a recent CVE, as of last week, in CubeFS, which is a CNCF project. Nothing against CubeFS. That's not really the point here. The point is that we're starting to see this surface in CVE output, which is super interesting, because it's not like a zero day. This is just what CubeFS asks for when you helm install CubeFS into your cluster. You probably don't actually read the RBAC that's included. You should, but not many folks are out there doing a code audit of what they're installing. This is a snippet of what their daemon set needs: secrets, get and list, which is right back to that example before. So the CubeFS daemon set running in your cluster can, via RBAC, access all your secrets, all the contents of your secrets. It'd be pretty much the same as dumping them straight from etcd, and somebody deemed that bad enough to create a CVE. That's going to keep happening, and we're going to see more and more of this as folks audit third-party Helm charts and open source projects that run in your cluster.
That's a little hard to see, but you can detect some of this stuff. These are just manifests that exist as YAML, whether that's in CI or in the cluster.
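Because these are just manifests, the secrets check can be done statically. Here is a minimal Python sketch that flags any role or cluster role granting get, list, or watch on secrets, either directly or through a wildcard. The sample role is hypothetical and only has the same shape as the grant described above. It is not copied from any real chart.

```python
READ_VERBS = {"get", "list", "watch", "*"}

# A hypothetical cluster role with the same kind of grant as in the talk.
example_role = {
    "kind": "ClusterRole",
    "metadata": {"name": "example-storage-role"},
    "rules": [
        {"apiGroups": [""], "resources": ["secrets"], "verbs": ["get", "list"]},
        {"apiGroups": [""], "resources": ["nodes"], "verbs": ["get"]},
    ],
}


def secret_readers(roles):
    """Return (kind, name, verbs) for every rule that can read secret contents."""
    findings = []
    for role in roles:
        for rule in role.get("rules") or []:
            resources = set(rule.get("resources", []))
            verbs = set(rule.get("verbs", []))
            # 'list' and 'watch' return full objects, so they count as reads.
            if resources & {"secrets", "*"} and verbs & READ_VERBS:
                findings.append(
                    (role["kind"], role["metadata"]["name"], sorted(verbs & READ_VERBS))
                )
    return findings


print(secret_readers([example_role]))
```

Running a check like this in CI against the rendered output of a Helm chart is one way to see the RBAC a third-party project asks for before it is installed.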
A good start is using something like Open Policy Agent or Kyverno or KSOC to find these. You can go in and look at the manifest and look for, what is this? The five-star review. This one is basically Flux, which, again, is not a dig at Flux, because Flux needs to do a bunch of things under the hood, but at least you're aware now that these wildcards exist in your cluster, in a static analysis format.
So how do we get better? You can start with logs. Has anybody gone API audit log spelunking? Have we looked into the audit logs that are generated? A little bit, here and there? If you run on EKS, you probably have them turned on already, though maybe you're not using them to their full capacity. Every request that goes into the Kubernetes API is pretty well documented: what it did and the decision that was made via RBAC. This can help you craft better RBAC policies. It's not going to solve all your issues, but it's definitely a start. This is a classic project that has stood the test of time, audit2rbac, and it basically takes your audit logs and generates an RBAC policy that makes sense for what's actually happening in your cluster. That's the mentality you want to be in: I want to watch what's going across the wire and which RBAC permissions are actually being used, and then I want to create RBAC that makes sense. We want to get back to this principle of least privilege.
You can see here, this is an event that's created by the Kubernetes API server, and if you haven't seen anything that looks like this, that's fine. It has a lot of information from a security practitioner's point of view. Who are you? What source IP did you come from? What was the response? I care about 503s, and I want to see when there's a spike in 503s and when it happened. But for the conversation today, it's really about your username, your group, and then the authorization reason, right?
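The idea of deriving RBAC from what actually happened can be sketched in a few lines. The Python below is not audit2rbac itself, just a toy version of the same mentality: read audit events, keep the ones the API server allowed, and collect the verbs each user really used per namespace and resource. The sample events are made up. The field names (`user.username`, `verb`, `objectRef`, and the `authorization.k8s.io/decision` annotation) follow the Kubernetes audit event format.

```python
import json
from collections import defaultdict

# Made-up events, serialized one JSON document per line like an audit log file.
SAMPLE_EVENTS = [
    {"user": {"username": "jane"}, "verb": "list",
     "objectRef": {"resource": "pods", "namespace": "inventory"},
     "annotations": {"authorization.k8s.io/decision": "allow"}},
    {"user": {"username": "jane"}, "verb": "get",
     "objectRef": {"resource": "pods", "namespace": "inventory"},
     "annotations": {"authorization.k8s.io/decision": "allow"}},
    {"user": {"username": "jane"}, "verb": "list",
     "objectRef": {"resource": "secrets", "namespace": "kube-system"},
     "annotations": {"authorization.k8s.io/decision": "forbid"}},
]
AUDIT_LINES = [json.dumps(event) for event in SAMPLE_EVENTS]


def observed_rules(lines):
    """Map (user, namespace, resource) to the sorted verbs that were allowed."""
    seen = defaultdict(set)
    for line in lines:
        event = json.loads(line)
        decision = event.get("annotations", {}).get("authorization.k8s.io/decision")
        ref = event.get("objectRef")
        if decision != "allow" or not ref:
            continue  # skip denied requests and non-resource requests
        key = (event["user"]["username"], ref.get("namespace", ""), ref["resource"])
        seen[key].add(event["verb"])
    return {key: sorted(verbs) for key, verbs in seen.items()}


print(observed_rules(AUDIT_LINES))
```

Each key in the result is a candidate for one namespaced role rule, and anything a subject is granted today that never shows up in the result is a candidate for removal.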
That authorization reason line is gold. It says RBAC allowed by role binding inventory of role inventory, et cetera, et cetera. This is the Kubernetes API server telling you the outcome of that second box, the authorization box: was it allow or deny? If you see somebody or something getting consecutive denies from an IP that you don't expect, and they're trying to list secrets or get secrets in the kube-system namespace, it's potentially an anomaly. It's probably something your security team would care about that's happening live on your cluster. Or if you see some service account token that's usually only used from inside, and all of a sudden you start seeing it coming in from the internet, from places you don't expect, with some denies mixed in, that's another interesting event. So these audit logs are really powerful beyond just crafting RBAC. I think there's a lot of potential for security practitioners to use these.
There are other considerations. There's no shortage of RBAC insanity. There have been some interesting revelations about what the kubelet has access to do, but that's a conversation for a different day. The actual internals of Kubernetes use RBAC also. But the easy ones are verbs to look for. Again, static analysis: you could write some Rego to find this pretty easily and audit your cluster for the escalate, impersonate, and bind verbs. These are not verbs that you encounter every day when you're writing RBAC policies. You're typically dealing with get and list, et cetera, simpler verbs. But these just sound bad, right? Escalate is a verb. You can use it, and you can build a policy that has it built in. But these are really made for privilege escalation, which is needed sometimes, or impersonation, which is also privilege escalation, or binding to a different role, and that web gets very messy very fast.
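The talk suggests Rego for this audit. As a stand-in, here is the same check as a minimal Python sketch: walk every rule in every role and report the ones that grant escalate, impersonate, or bind. A wildcard verb is treated as granting all three. The two sample roles are hypothetical.

```python
DANGEROUS_VERBS = {"escalate", "impersonate", "bind"}

# Hypothetical roles for illustration only.
sample_roles = [
    {"kind": "ClusterRole", "metadata": {"name": "helpdesk"},
     "rules": [{"resources": ["users"], "verbs": ["impersonate"]}]},
    {"kind": "Role", "metadata": {"name": "pod-viewer"},
     "rules": [{"resources": ["pods"], "verbs": ["get", "list"]}]},
    {"kind": "ClusterRole", "metadata": {"name": "five-star"},
     "rules": [{"resources": ["*"], "verbs": ["*"]}]},
]


def dangerous_grants(roles):
    """Return (role name, verbs) for rules carrying privilege-escalation verbs."""
    findings = []
    for role in roles:
        for rule in role.get("rules") or []:
            verbs = set(rule.get("verbs", []))
            # A '*' verb matches every verb, including the dangerous ones.
            hits = set(DANGEROUS_VERBS) if "*" in verbs else verbs & DANGEROUS_VERBS
            if hits:
                findings.append((role["metadata"]["name"], sorted(hits)))
    return findings


for name, verbs in dangerous_grants(sample_roles):
    print(name, verbs)
```

A finding here is not automatically a problem, but each one should come with an answer to why the verb is needed.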
So you're going to want to know if you have RBAC policies that use these verbs, and if they're actively being used, you should know why. There could be a good reason, that's very plausible, but they are dangerous when left unattended.
And then the other one: persistent volumes. There are these interesting corner cases of RBAC where, if you allow somebody or something to create persistent volumes, that persistent volume can have access to the host file system (a hostPath volume, for example), and then you can break out of the container context of that workload, or of the pods that are part of that deployment, which is a really interesting attack path. If you're a developer starting on day one with very limited access to the Kubernetes cluster, but you just happen to be able to do this with persistent volumes, you've just escaped out of the container and landed yourself on the node, the underlying host.
So let's walk through how authN interacts with authZ. I created this wild flow chart. Let's see if I can follow it myself. On the left, back to the key players: who's acting on the Kubernetes API? This is in particular how EKS handles, at a pretty high level, authentication but also authorization. There are some interesting handoffs here. This isn't really a threat model. It's just showing the complication that exists under the hood that you don't even see. When you run kubectl and you're authenticated with AWS IAM, you don't really see this. First, you present your IAM token. You've authenticated to AWS somehow, we'll skip how, and you want to do a thing: kubectl get pods, all namespaces. You present that to the API server, and then the Kubernetes API server passes that IAM token over to a special daemon set called the AWS IAM Authenticator server. That's responsible for talking to AWS, asking: hey, is this particular token valid? Are they authenticated?
And then there's a response with an identity. Again, identity is the new perimeter. Has anybody ever had to deal with the aws-auth config map? It's a really special, fun part of EKS, and it's a very important piece of the mapping puzzle. That identity is mapped, it's hard to see here, to a Kubernetes subject. AWS doesn't speak native Kubernetes in terms of subjects, so that config map does some pretty heavy lifting, and you can go check it out in your own EKS clusters. The identity is translated to a Kubernetes subject, and that subject is passed back to the API server, combined with the original action, the intent that you had. And then RBAC says yes or no. Meanwhile, you're just sitting there smashing away at kubectl commands and writing scripts or whatever, but under the hood, there's a lot going on. Again, this is not saying this is bad or that it's a security issue, but it's good to understand how these things work together. Every cloud handles this a little differently. It's a similar flow, but with this kind of token translation and RBAC, there's room for error, to say the least.
I think I'm nearing the end of my time. I don't know if I got my five-minute warning, but there is one note at the bottom. This is just a link, but we are bringing a top 10 security project to the CNCF. There are two under review right now. We're translating and moving a lot of the Kubernetes top 10 over to the CNCF under TAG Security, and there's an Istio top 10 coming out. I imagine we'll do something with RBAC. So if you want to get involved, or if you are already involved in TAG Security, here's the link to the GitHub issue. There's some good conversation happening there. I think there's still a lot to be dissected and shared. We have a booth downstairs if you want to talk, or I'll stick around after this and give you all some candy.
So go enjoy the rest of your evening. Thank you for coming and see you soon.