I'm Mike and this is Mo. This is technically the SIG Auth deep dive, so I'll be covering a little bit of SIG Auth stuff. I'm going to give you a highlight reel of what we've been working on over the past year, because I know it's been a while since we talked, and then I'll hand it off to Mo, who will begin the deep dive into pod security. So let's get started. SIG Auth: what do we do, what are we responsible for? SIG Auth is primarily responsible for authentication and authorization to the Kubernetes API, and for various other security controls such as admission and authorization policy like RBAC and extensions therein. One of the admission controllers we're going to be talking about today is pod security admission. So what have we been doing for the past year or so? There's actually been a lot going on. Probably over a year ago we began discussing the deprecation of PodSecurityPolicy and what it would mean to build a successor to it. We knew PodSecurityPolicy had a lot of users in open source, so we debated whether we could point those users at out-of-tree strategies for doing the same thing, like OPA. Ultimately we landed on pod security admission, which we'll discuss in great depth, and which is targeting beta in the upcoming 1.23 release of Kubernetes. The certificates API, which was long overdue for GA, finally went GA in 1.19. I think it had been in beta since something like 1.6, so it was maybe three years overdue for going GA. Before the GA, some notable changes were added, such as a new signerName field, so the API can now be divided between multiple consumers. After GA, a duration hint was added, so now you can request certificates for specific durations. The exec credential plugin for kubectl and client-go also went GA, in either 1.21 or 1.22; I can't actually remember now, I think it's 1.22. That paves the way for moving some of the final in-tree authentication plugins for clouds out of tree, so it's part of the whole cloud provider extraction effort. We also have a proposal to extend those plugins to support request signing and mutual TLS protocols. And the TokenRequest API, the bound service account token volume, and the provisioning of bound service account tokens to replace the legacy secret-based tokens all went GA; I think the final GA was in 1.23. That's a huge accomplishment. We had been working on it for a couple of years, and it should benefit the entire ecosystem, because those tokens have significantly better security properties than the legacy tokens. There's still some more work to do there. And finally, the last highlight is that we have a SIG subproject called the Secrets Store CSI driver, CSI being the container storage interface, with backend implementations for AWS, Azure, GCP, and Vault. It allows configuring volumes that project secrets from external secret storage providers. I think the KEP for GA just merged maybe this last week. So yeah, we've been super busy, and with that I will hand it off to Mo. Cool. I'll make some quick comments on this stuff. On the token request side, some of the work that's left is to stop generating the secrets that back service accounts, and really pull that functionality completely out of the API server and the controller manager. So there are KEPs open for that.
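Since the bound token work keeps coming up, here is a minimal sketch, using client-go, of what requesting a bound token through the TokenRequest API looks like from a client's point of view. The namespace, service account name, and audience below are made-up placeholders for illustration, not anything from the talk.

```go
package main

import (
	"context"
	"fmt"

	authenticationv1 "k8s.io/api/authentication/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Build a client from the local kubeconfig (assumes a ~/.kube/config style setup).
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(config)

	expiration := int64(3600) // one hour; the server may adjust this

	// Ask the TokenRequest subresource for a short-lived, audience-bound token
	// for the "my-sa" service account in the "demo" namespace (both placeholders).
	tr, err := client.CoreV1().ServiceAccounts("demo").CreateToken(context.TODO(), "my-sa",
		&authenticationv1.TokenRequest{
			Spec: authenticationv1.TokenRequestSpec{
				Audiences:         []string{"https://my-audience.example.com"},
				ExpirationSeconds: &expiration,
			},
		}, metav1.CreateOptions{})
	if err != nil {
		panic(err)
	}

	// Unlike the legacy secret-based tokens, this token expires and is bound to the audience above.
	fmt.Println("token expires at:", tr.Status.ExpirationTimestamp)
}
```

In practice most workloads get these tokens through the projected service account token volume rather than calling the subresource themselves, but this is where those tokens come from.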
There's plenty of work there to be done if anyone wants to get involved. We're a very cautious bunch of folks. We don't want to break everyone's stuff; we understand that people run real stuff on Kubernetes now. So we're trying very hard not to break things, but we also need to move the ecosystem forward, so there's a balance there. Rita and Anish have been working on the Secrets Store CSI driver for quite some time, so that is also a place one could get involved to try to push things forward. The rest of the talk is mostly a code walkthrough. Tabitha and Tim had a talk earlier today at 2:30 that covers the new pod security stuff from the perspective of an end user. This one is focused more on the perspective of a prospective contributor: if you care about how the code is implemented, maybe you want to get involved and help us get it from beta to GA, and so forth. Everything in the code walkthrough is based on 1.23 alpha 3, so if you see something that's not what's in the head of master right now, that's probably why. In particular, Jordan and Tim set an incredibly high bar for this code, which is unlike basically anything else that exists in tree: this thing has to be able to run standalone without an API server, can be run as a webhook, and can maybe be used in CI tooling. Also, all the checks, like "is this a privileged pod" or whatever, are implemented as individual units, so there's not some large spaghetti thing that checks all the things. The idea there is that over time we'll evolve these things and add new checks, or iterate on an existing check but at a different version. All of that led to a pretty elaborate implementation for what just sounds like a simple thing: if this field is set on a pod, say no, you're not allowed to do that because that's a privileged pod, or whatever. Most of this talk is just me going through the code editor, but I did have two slides. So as an example, this is from one of Jordan's examples on this stuff: say you wanted to run a little CLI that would do some linter checks on some pod specs that were going to be committed into your prod clusters. You're going to do some promotion, but you want to validate that they meet some security requirements, and you just want to gate on this. This is basically the skeleton of a tool that can run completely without an API server and statically validate things. The pod security stuff, unlike PSP, is much more constrained: you basically say what level you want, like baseline or restricted, and what version. Restricted is best; if your pod can work under that, that's what you want. And maybe you're on version 1.19 right now and you're thinking, all right, I want to make sure this stuff is going to run on 1.20, so I'm going to validate against 1.20 and see if I need to go talk to my developers and figure some things out before we actually move forward. So we pass those things in, and we'll go over what the evaluator and the checks look like. But basically, if you're familiar with the Kubernetes code base, you're probably familiar with the builder pattern that we have; that's all this is doing, passing in some files locally.
And then it's just doing the visitor pattern on all those files. The actual visitor is basically: all right, I was given something with a pod spec, a deployment, whatever; I'm going to extract that spec and then evaluate it against the level and version you passed me, and just check if it's allowed or not. So in roughly 100 lines of code you have effectively a production-ready CI tool (there's a small sketch of that evaluation call a little further down). And that's the hope here: we want to enable people to build tooling off of this and offer value outside of just pure Kubernetes admission. The notes that I have: basically, I went through the code and tried to write notes on interesting aspects that might not be completely obvious. All the notes are on that fork of mine. So now I'm just going to walk through some stuff and hopefully it will be helpful for folks. Can you all read that okay? We tried to make the text nice and big early on. If you're familiar with Kubernetes admission, all of this stuff gets wired up in this API server package. Admission is ordered, in the sense that it's a deny-based system, and with a deny-based system the ordering of checks matters a lot, because an early check that denies guarantees that the rest of the stack won't run. So one of the things pod security does is make sure it runs before pod security policy. Because you can run pod security in an audit or warning mode, it won't deny stuff; what you could do is keep your existing pod security policies, the old style, and put the new pod security stuff in audit mode but at a very restricted level, then just let it run and generate events as things go on. And you, the Kubernetes administrator, can go check the audit log: I see that Mo is running this privileged pod that doesn't meet the requirements; now I can go talk to Mo and start getting this stuff rectified. But that's only possible if you run pod security in front of pod security policy. And I realize this is obnoxious naming, pod security admission versus pod security policy; it's very hard to understand sometimes which one you're talking about. Also, don't be confused by this: this piece of code makes it sound like pod security is enabled by default. It's not; it's feature-gated. The code is there, but it's feature-gated off. This whole function is kind of a pain to read because it's like, what's off by default, what's on, and then the intersection. So it's a little painful. Oh man, it did not globally change the text size; sorry, let's make it bigger. I'm trying to assume that folks don't know how the internals of kube work, to some degree. One of the sort of weird things in the Kubernetes code base is this concept of an internal type, or hub type. Whenever we read your configurations or your objects, we read them as, say, pod v1 or whatever, but there's an internal representation, closely related, that's meant to represent the schema across all versions. Obviously today pod security doesn't necessarily need all of that, but you'll notice some of this structure. So in this bit where it's doing the loading, if you don't provide it anything, it's going to say: all right, I'm going to get the alpha configuration and then default it. We'll look at what the defaults look like.
Then I'm going to convert it into the internal version. Don't get too stressed about this; it's all over our code base, it happens all the time. The defaults for pod security are kind of what you'd expect, which is: if you don't tell us a level, we assume you mean privileged, and if you don't tell us a version, we assume latest. And by the way, privileged means do nothing. So in order to not break all your stuff, we do nothing by default; you have to tell us to do something. It is what it is. When you add something new that can break stuff, you can't just turn it on; you'll break people's stuff. Looking at the actual configuration, it's pretty straightforward: what level and version do you want to enforce, what level and version do you want to audit, and so on. The interesting bit here is that there is a hard-coded list of exemptions. You can imagine that you have a runtime class that is, I don't know, running VMs under the hood, and you're thinking, I don't need pod security, I already have strict isolation. So you could tell us in this bit that if a pod asks for this, I don't know, Windows runtime class or something, just ignore it completely; I have guaranteed out of band that this is safe. You could similarly have some user that's always allowed to run privileged pods or whatever, and then you're saying: I'm taking ownership of somehow gating access to those pods. Because, for example, if you let arbitrary people exec into a running privileged pod, well, you have broken all security for your cluster. Let's walk through. Going back to the actual code, I always like to jump through the name of the plugin to get to where it's wired in. It's pretty straightforward: it runs on create and update of a set of resources. Let me find it. It runs against basically namespaces, pods, and anything that has a pod spec, but for the most part the enforcing side is on pods. The idea is that you don't want to break, say, a deployment controller with this; you just want to prevent the pods from being run. Because you can imagine a mutating webhook that coerces the pod into a correct shape when it's created, even if the deployment has something slightly different in it. So for the most part it doesn't enforce on the embedded stuff. We can look at how some of this code works real quick. The validation logic, as I just mentioned, has distinct paths for how namespaces are validated, how pods are validated, and how pod controllers are validated, pod controllers being controllers that cause the generation of pods because they have an embedded pod spec somewhere in their schema, like a deployment. The most interesting one is pod, because that's the one that actually does the real work: ignore sub-resources that don't matter, and so forth. The reason for handling the pod controllers at all is to hoist errors as early as possible, because it's terrible to create a deployment and have it fail three steps later, asynchronously. But you can see that pod evaluation is strictly enforced, which makes sense: at the end of the day, the pod is what we have to prevent from running; everything else is a byproduct of that.
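Before we dig into the registry, here is a rough sketch of the evaluation call that the standalone CLI idea from earlier boils down to. This is my own minimal example against the pod-security-admission library roughly as it looked around 1.23, not code from the talk, so treat the exact package paths and signatures as best-effort assumptions.

```go
package main

import (
	"fmt"
	"os"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/yaml"

	psapi "k8s.io/pod-security-admission/api"
	"k8s.io/pod-security-admission/policy"
)

func main() {
	// Read a single pod manifest from disk. A real tool would use the builder/visitor
	// machinery to also handle deployments, jobs, etc. and extract their pod templates.
	data, err := os.ReadFile(os.Args[1])
	if err != nil {
		panic(err)
	}
	var pod corev1.Pod
	if err := yaml.Unmarshal(data, &pod); err != nil {
		panic(err)
	}

	// The default checks are the versioned baseline/restricted checks discussed below.
	evaluator, err := policy.NewEvaluator(policy.DefaultChecks())
	if err != nil {
		panic(err)
	}

	// Evaluate against a pinned level and version: "would this pod pass restricted
	// as of 1.22?" The level/version pair is essentially the whole knob you get.
	lv := psapi.LevelVersion{
		Level:   psapi.LevelRestricted,
		Version: psapi.MajorMinorVersion(1, 22),
	}

	failed := false
	for _, result := range evaluator.EvaluatePod(lv, &pod.ObjectMeta, &pod.Spec) {
		if !result.Allowed {
			failed = true
			fmt.Printf("forbidden: %s: %s\n", result.ForbiddenReason, result.ForbiddenDetail)
		}
	}
	if failed {
		os.Exit(1)
	}
}
```

Everything interesting (which checks exist, what level they belong to, and which version of a check applies) sits behind that DefaultChecks and Evaluator pair, which is what the registry discussion next is about.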
I think some of the more interesting stuff is the whole registry. With the old pod security policy stuff you had this really expressive API: you would make individual PodSecurityPolicy objects and try to say, here are my policies. That level of expressiveness is purposely omitted from this API, because the old API turned out to be effectively unmaintainable. What we have instead is this little evaluator, which takes a series of checks. The idea is that the evaluator is given the input, runs all the checks, and says: is everything okay? Some of this code is a little hard to read because it's doing things like, well, you have a check defined at version 1.12 but you're evaluating at version 1.20; that 1.12 check is the newest one available, so it must also be the 1.20 check. It's a little hard to read at times, but for the most part you don't have to worry too much about it; Jordan and Tim got it right. The more interesting stuff is the actual checks themselves. There's this global registry of checks, and you can add to it. Oh, I'm sorry, I can read it, so I assumed you could read it; I apologize. Yep, it does work on that machine. So what does a check look like? It's basically a unique ID, the level it runs at, like baseline or restricted, and then the specific checks over time. So the same check can change its meaning over time. This is one of the core requirements of the pod security stuff: we can't just say that a check as defined today stays valid forever, because the pod spec changes. If you add a new field somewhere in the pod spec that's privileged, or the equivalent of privileged, the existing privileged check needs to start accounting for that, but only at a newer version. We don't break you in place; no Kubernetes API is allowed to arbitrarily break in the middle. To show you what that looks like, the privileged check is the most canonical one: don't let these fields be true. False is good, undefined is good, but true is bad. And you can see that this check has effectively existed since the beginning of time, because these fields have existed since the beginning of time; it's a very early check. You can see that it looks at all the containers, and by that I mean all of them: the init containers, the regular containers, the ephemeral containers, and if one day there are more kinds of containers they'll also be in here. It makes sure that nobody sets privileged equals true in there, and if they do, it fails with a nice error message. So that one's a pretty easy check. The run-as-non-root check is kind of long and sprawling, almost, because it has levels of checks in it, so it keeps going for a bit. But the nice thing is that all of these things are completely distinct: if the pod security standards change and we add a new check, none of the existing stuff has to change; we just add a new check to the registry.
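To give a feel for the shape of one of these checks, here is a simplified, self-contained version of the no-privileged-containers logic described above. This is not the in-tree check (the real one is registered with an ID, a level, and per-version check functions as just described); it's only a sketch of what walking all the container lists looks like.

```go
package checks

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// checkPrivileged mirrors the idea of the baseline "privileged" check: walk every
// kind of container in the pod spec (init, regular, ephemeral) and fail if any of
// them sets securityContext.privileged to true. Unset or false is fine.
func checkPrivileged(spec *corev1.PodSpec) (allowed bool, detail string) {
	var bad []string

	visit := func(name string, sc *corev1.SecurityContext) {
		if sc != nil && sc.Privileged != nil && *sc.Privileged {
			bad = append(bad, name)
		}
	}

	for i := range spec.InitContainers {
		visit(spec.InitContainers[i].Name, spec.InitContainers[i].SecurityContext)
	}
	for i := range spec.Containers {
		visit(spec.Containers[i].Name, spec.Containers[i].SecurityContext)
	}
	for i := range spec.EphemeralContainers {
		visit(spec.EphemeralContainers[i].Name, spec.EphemeralContainers[i].SecurityContext)
	}

	if len(bad) > 0 {
		return false, fmt.Sprintf("privileged containers are forbidden: %v", bad)
	}
	return true, ""
}
```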
And one more thing they wanted to make possible: you can imagine a more expressive version of pod security implemented completely out of tree as a validating admission webhook. Let's say someone doesn't want to run the built-in stuff at all; they just want to run their own thing. It would be really nice for them to be able to validate that the thing they've built conforms to the same validation checks as the in-tree stuff. One of the ways we enable that is that for each of the checks, we have a test that generates a series of passing pods and a series of failing pods. Those checks are obviously run inline in our unit tests and such, but, if we look at this failure case and the zeroth index, they also generate YAML matching that. All of this YAML is available in a very structured way: this is a baseline check at version 1.22, and it's a failing fixture for the privileged check. So if you were implementing equivalent code out of tree, you could run all of this YAML against your out-of-tree implementation and validate that it conforms exactly to the in-tree one. Obviously this is not a strict requirement for having pod security be a thing; it's just that Jordan and Tim were very careful to make it so the ecosystem could build and mature on top of this stuff, and that adds pretty significant complexity to the work we had to do to make it all work.
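As a sketch of how an out-of-tree implementation might consume those generated fixtures: the directory layout and the Allows helper below are hypothetical placeholders, but the idea is simply to decode every failing fixture and assert that your implementation also rejects it.

```go
package conformance

import (
	"io/fs"
	"os"
	"path/filepath"
	"strings"
	"testing"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/yaml"
)

// Allows stands in for your out-of-tree implementation (OPA, Kyverno, a custom
// webhook, ...) evaluated at a fixed level and version. Hypothetical, for illustration.
func Allows(pod *corev1.Pod) bool { return false }

// TestFailingFixturesAreRejected walks a directory of generated "fail" fixtures
// (the path is a made-up example) and asserts that our implementation rejects
// each one, mirroring what the in-tree checks do at that level and version.
func TestFailingFixturesAreRejected(t *testing.T) {
	root := "testdata/restricted/v1.22/fail" // hypothetical layout

	err := filepath.WalkDir(root, func(path string, d fs.DirEntry, err error) error {
		if err != nil || d.IsDir() || !strings.HasSuffix(path, ".yaml") {
			return err
		}
		data, readErr := os.ReadFile(path)
		if readErr != nil {
			return readErr
		}
		var pod corev1.Pod
		if unmarshalErr := yaml.Unmarshal(data, &pod); unmarshalErr != nil {
			return unmarshalErr
		}
		if Allows(&pod) {
			t.Errorf("%s: expected this fixture to be rejected", path)
		}
		return nil
	})
	if err != nil {
		t.Fatal(err)
	}
}
```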
So I'm done with the code. Does anyone want to ask me something before I go back to the slides? I did have a question here from online, and it's a few sentences, so bear with me; it's a little bit longer. What is the recommendation for organizations that plan to use something like Gatekeeper or Kyverno to provide the pod security pieces? Should the built-in pod security admission controller be disabled? Or is the recommendation to have them coexist, with the built-in controller providing some basic level of protection, made more fine-grained by a third-party admission controller? So Mike, you might have a different opinion than me, because I feel like this is an opinion and less of a thing that has a concrete answer. I would personally lean towards always running the built-in stuff in whatever mode you can; obviously, if you can run everything at restricted latest, do that, though that might be impractical. Then, if you have further checks, add those as your validating webhook or whatever you want, OPA or Kyverno, whatever your favorite is. Because as you upgrade your Kubernetes cluster, this stuff gets upgraded by us. This stuff is tested by SIG Auth, this stuff is maintained by SIG Auth, and we care deeply about this implementation and we validate this one. So that's the direction I would push people, but it's more of an opinion; it's not a strict guideline. Mike? Yeah. So one of the initial problems with PSP that was cited when we chose not to graduate it to GA was that users were incapable of turning it on in existing clusters without breaking their users. So what we did with PSP v2, or whatever you want to call it, was design the system such that it can be turned on and the cluster will continue to behave as normal. We included what we saw as the table stakes for constraining pods, and we totally accept that our definitions aren't going to meet all use cases in the world. I think in those cases it makes sense to run OPA or Kyverno. In that case, I would say you can use pod security admission for as far as it gets you and then use Kyverno; using both at the same time is really up to you. I would say it's probably not necessary to turn off the pod security admission controller, and if you feel like you need to, then it's probably a problem with the pod security admission controller and you should let us know why. Yeah, one of the core requirements was for it to coexist with other tools that are doing similar work. PSP made that annoyingly hard, so we've tried hard on this one to not have that happen. So, as I understand it, right now, if I turn on the alpha flag in 1.22, all my namespaces by default are privileged, or using privileged. Are there plans to either ship something so that when I create a namespace I can set it down to a lower level, or change the default in future versions of Kubernetes to steer users towards a more defensive security posture? So I had this discussion with Jordan and Tim and David. They did not believe we could do that while maintaining the compatibility requirements of the Kubernetes API, which is unfortunate. Basically, what they said is that this really comes down to a choice for your distribution and what security stance it takes as its default posture. For example, I work for VMware in the Tanzu portfolio; as soon as this thing is beta, I'm going to go turn it on, and I'm going to hit anyone with a hammer if they try to turn it off. But that's the opinion of the distribution. At least the nice thing is it's very easy to change the default: it's a very simple YAML file that just says, if you have no labels, that means you're at, say, restricted latest. You can really tighten it down very easily. It is a configuration that you pass to the API server, like the rest of built-in admission. It's a little unfortunate, but yeah, we just don't want to break people's stuff. If we were building Kubernetes all over again, I'm pretty sure we'd start from the beginning with this. That was one of the core requirements. And if you are in an environment where you don't have access to that configuration file, it is also fairly easy to create an admission controller on namespace create and just set that default. All right, we have one other question, and then we'll look to the room if anybody else has any questions too. This question is: with the built-in pod security admission implementation, is it possible for admins to add custom profiles, or are the built-in privileged, baseline, and restricted profiles the only ones that will be available without a third-party admission controller? This one I can give a concrete answer to: yes, baseline, restricted, and privileged are all we plan to ever implement. However, the code is modular, so you can go ahead, fork it, and use the same structure to deploy your own admission controller with your own profiles. Yeah, I don't even think you have to fork it, really; you just consume it. It is meant to be a library, so you don't even have to fork it. You could totally do that. But basically, we're trying very hard here. One, we don't want to become a kingmaker; we're not trying to rule your policy in all the little ways. But also, this is what prevented PSP from going GA: it was an ever-expanding API surface, and trying to make it keep up with the growth of the pod spec over time just proved to be untenable.
Being opinionated and simplistic here sort of allows us to even have this at all. And for the people who are willing to, and have the need to, go and define their own profiles, there are a lot of very good existing solutions that would allow them to do that, such as OPA Gatekeeper or Kyverno. So we felt like that was a problem that was already solved. Yeah. At a really simple level, we want to make sure that in the future, on basically every Kubernetes distribution, by default, create pod does not mean root on the kubelet. If we can get to that stage, we have made an incredible amount of progress compared to what the default is today in most environments. All right, come over to you. I'm glad I asked for questions. I had a couple of questions about it. So when it was reading and trying to decide what to react on, it was looking for namespaces or things that had pod specs internally. So if you, say, create a deployment that violates some of the rules and it's turned on to reject, will it reject the deployment creation, or does it still allow the deployment to go through and just reject the pods, the way pod security policy does now? So I might be misremembering, because what Mike said seemed to be misaligned with what I had in my head: I thought we would reject the pod but not the deployment, and only audit and warn on the deployment. I think that's right. So how is warn mode triggered? You have an annotation that sets whether it's enforced or warned? If I remember correctly, because it's under the pod controller path, it's always audited and warned for that piece, I think. The idea was that your deployment could be outside the valid spec, but you could have a mutating admission controller for pods that coerces the pods into a valid spec, so the thing that actually gets created is allowed. That was the nuance there. It's not great, right? That was my follow-up question, because you could mutate your pods in line and they end up being fine, so rejecting the deployment outright would have been a bad idea. Yeah. So that's the use case we're trying to maintain: we know that the pod itself is the thing that gets run, everything else is implementation detail, so that's where we really enforce. And where the namespace stuff is really powerful is that server-side dry run is a really cool feature. What you can do now is a server-side dry run that says, I would like to change the label on all of my namespaces to restricted latest, and just tell kubectl to do that. What will happen is you get back all the warning messages from the API server saying, hey, here are the 17 pods across your entire cluster that do not meet this requirement. But you don't have to mutate anything to do that, so now you've got immediate feedback on what you need to fix without causing any disruption to your cluster. That's one of the cool things about this being label-based, and about the maturity of the ecosystem now that didn't exist when PSP was originally written.
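As a rough sketch of that dry-run trick in code form (this is essentially what a kubectl label command with --dry-run=server does): apply the enforce labels to a namespace with a server-side dry run and let client-go's warning handler print the would-be violations. The warning handler wiring and the namespace name are my own example, not something shown in the talk.

```go
package main

import (
	"context"
	"os"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	config, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
	if err != nil {
		panic(err)
	}
	// Print API server warnings (including pod security violation warnings) to stderr.
	config.WarningHandler = rest.NewWarningWriter(os.Stderr, rest.WarningWriterOptions{})
	client := kubernetes.NewForConfigOrDie(config)

	ns, err := client.CoreV1().Namespaces().Get(context.TODO(), "demo", metav1.GetOptions{})
	if err != nil {
		panic(err)
	}
	if ns.Labels == nil {
		ns.Labels = map[string]string{}
	}
	// "What would break if I enforced restricted at the latest version in this namespace?"
	ns.Labels["pod-security.kubernetes.io/enforce"] = "restricted"
	ns.Labels["pod-security.kubernetes.io/enforce-version"] = "latest"

	// Server-side dry run: nothing is persisted, but the warnings listing existing
	// pods that would violate the new policy still come back from the API server.
	_, err = client.CoreV1().Namespaces().Update(context.TODO(), ns, metav1.UpdateOptions{
		DryRun: []string{metav1.DryRunAll},
	})
	if err != nil {
		panic(err)
	}
}
```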
Thank you. Awesome. We have time for about one more question, if anybody has one. Let's hop back to the slides real quick, can we? All right. Yes. Cool. So we're always looking for people to join SIG Auth and get involved in the community. There are a couple of good links for new contributors to check out, and these slides will be posted next to our talk in the schedule, so come check them out. We have some good first issues. We have a Slack channel, and we meet every two weeks on Wednesday at 11 AM Pacific Standard Time. We have a lot of good conversations and a lot of ongoing initiatives to discuss there, and we welcome anyone and everyone to get involved. So yeah, thank you so much for attending today, and I hope you're having a good KubeCon. As a reminder, the pod security admission stuff is alpha, so if there's something wrong, please come tell us, because we can fix it before we promote it to beta; after that it gets a whole lot harder, because this is Kubernetes and we try not to break people's stuff. Everyone likes to depend on every implementation detail of everything we build. But thank you all. Thanks everyone.