Hello everyone, and welcome to this episode of This Week in Cloud Native. This is episode number nine. In the previous episode we spent quite a bit of time exploring a lot of the information about the Pod Security admission controller, and in this episode we're going to do a little more hands-on work there.

So let me know if you're out there, and I'll be happy to raise your voice into the comments. If you're listening or tuned in, definitely let me know that that's happening. Good to see five of you out there. The HackMD for this episode is located at the link below, where it says hackmd.io/@TWICN, so if you want to see the notes or put anything in there, you can put that stuff there.

How's it going, Russ? Good to see you. Let's dig in here. So Russ is saying hello to everybody, and then here are our notes for this week — some good stuff to cover. I'm doing some pre-work and getting ready for KubeCon. I will be down there in person, so I hope that some of you will be as well, and hopefully we'll get to see each other. It'd be tremendous.

Someone in chat points out that the stream title is off — it should say "This Week in Cloud Native." Interesting, because where I usually set that, it's set correctly. Let me take a look here and see if there's something I can do. Give me one moment. Yeah, sure enough — it is not correct on the stream, but I don't know how to change it, because the place where I normally change it is already set correctly. So I'm not sure where to make the change; I guess we're going to have to fix it in post. All right, well, thank you for pointing it out at least. Sorry about that.

Anyway, let's dig into it a little bit more. So, as you are already aware, this is Cloud Native TV.
This is an official live stream of the CNCF, and so it's subject to the CNCF code of conduct. If you like the stuff that I'm putting out here, make sure that you subscribe to this channel and you'll be notified when things are coming. We're still trying to figure out the calendar, so you can actually schedule ahead of time, but that's taking a little bit of time. If you want to see any recorded sessions — if you missed an episode or if you want to check something out — you can go check out the YouTube streams. This is where a lot of the recordings we're doing will live. There's a bunch of stuff from the last week and the week before, and each of these curated playlists will have the content from the previous shows. Actually, it looks like some of mine are missing — I'll figure out what's up. Oh, that's the Cloud Native Classroom, sorry. If you want to see a previous episode, it'll be there.

For the Kubernetes side of things, there were a couple of good things that happened, including this one from Celeste Horgan. Celeste is a good friend from the community, and she points out that there's a really great list of good first issues inside of the Kubernetes docs. I wanted to share with you both how to find them and how to get involved, right?
And so if you go to go.k8s.io/good-first-issue — that's this URL right here — you're going to see any issues in the Kubernetes repositories that are marked "help wanted" and/or "good first issue". These are great for getting started, and if you wanted to look for a particular category, you could look right in here — there's lots of good stuff in here for good first issues. This is a great place to start if you want to get involved inside of the Kubernetes environment.

There's definitely stuff that is easy for some and hard for others. For example, there's one pointing out that some of the Korean translations of the documentation are incorrect. That's a great first issue for someone who speaks Korean — it would likely be very simple for them — but it would not be easy for somebody like myself, who does not speak that language. There's stuff in the perf tests, stuff in kubebuilder, stuff in hierarchical namespaces — tons of stuff in here, including stuff for docs. So it's a great place to get involved.

There was also just recently a thread on the Kubernetes contributors Twitter account, @K8sContributors. If you want to follow @K8sContributors, there's always good information coming from it regarding things that are happening inside the community.
They'll announce things like the upcoming summits, but they also call out other parts of the Kubernetes community that are looking for contributions. Here's an example: building an intuitive dashboard. So if you'd like to play with Angular, Go, and client-go, SIG UI is looking for contributors to help them build an intuitive dashboard — more information here. There were a few other callouts just this last week from folks who are looking for feedback as well. Go ahead and retweet those, and if you're following me on Twitter, you'll see those retweets too.

SIG Cluster Lifecycle is looking for contributors to help with etcdadm, which is a neat project — it's basically tooling that automates the lifecycle of etcd within your cluster — and with cluster add-ons in kubeadm. Are you a user looking for a way to give back? See here for more information. We also have SIG Scheduling looking for contributors to help document scheduler internals, and SIG Autoscaling is looking for more folks as well. And then here's what I was referring to earlier: if you're considering registering for the contributor summit at the upcoming KubeCon, virtually or in person, please register ASAP if you haven't done that. And if you plan on attending KubeCon, either virtually or in person, definitely check that out — it would be a really good time. So yeah, a great Twitter account for following what's happening in the community and getting updates.

The next thing up: usually we cover security announcements. Looks like I have some other chat — "some kind of bot would be nice, with posts as clickable links; Duffie talked about them." That would be neat. It's something maybe I'll build — that'd be kind of fun.
So, we have Kubernetes CVEs. Usually I check out the kubernetes-security-announce group, and there's nothing in here since July 14th, so that's good — no big new security announcements. I did find another dashboard that I wanted to share with you all, because I thought it would probably be a good source of information as well. I'm not sure that everybody is aware, but Kubernetes actually has a relationship with HackerOne: they have a bug bounty program, and it was launched in January of 2020, which feels like it was a million years ago and only yesterday at the same time.

So this is the "hacktivity" list, and I think you can sort by when a report was published or when it was bountied — I guess you can't really sort this in a way that I think makes sense — but here are some things that have been publicly disclosed about issues happening within the cluster. If you look at that security-announce list, you're going to get to see the same events, but I think this is actually probably a pretty good way of understanding what's coming, since there'd be a coordinated disclosure. If you're following the security list that is in the notes, you would still be able to see this — but this is also where the folks who are actually doing the work of trying to find vulnerabilities in Kubernetes are going to make that announcement.

So let's just take this top one here, for example: "Node validation admission does not observe all oldObject fields." The validating admission webhook for Node objects is passing oldObject fields incorrectly on the AdmissionReview request, it's identified.
This was an interesting one, because the premise is that you could potentially allow users to bypass validating admission by updating node labels, taints, and other fields — which is an interesting attack surface. They created a validating webhook, they created a dummy workload, and they created a potential issue location. So this is a great example of vulnerability documentation that actually provides experiments, provides actions, and really explains the impact of the vulnerability that has been reported. I thought this was really well written and really well done.

And then, as I was saying before, this was resolved as a CVE — this is the CVE here — and if you go to the information within the CVE document, there'll be a link to kubernetes-security-announce, which is the group that I usually check every time we log in here. This is where you can actually see what the issue was rated as, what the affected versions were, and what the fixed versions were. So this particular issue was disclosed and fixed — good stuff. Anyway, more interesting information to go look at.

On to CNCF things I wanted to talk about. I always check the KubeWeekly from the previous week to take a look and see what's coming, what's being announced, and what's happening this week. For CNCF online programs this week, we have "Kubernetes clusters need persistent data" by James Byron of StorageOS, and then that tweet we were just talking about from Celeste Horgan. Then there are some good technical articles: a Prometheus definitive guide — interesting, digging into the Prometheus Operator — by Ninad Desai from InfraCloud Technologies; Kubernetes CI/CD by Alex Chalkias from Ubuntu; and "Sqlcommenter merges with OpenTelemetry" — interesting.
I hadn't seen that one yet — that's from Nimesh Bhagat from Google Cloud. And then we have Daniel and John doing a sneak preview of their KubeCon talk — subscribing to events for push and pull operations, with a UI to view them. That should be a really interesting talk, and I think it will be pre-recorded, so you'll likely be able to see it both virtually and in person. We'll see how that works out.

For upcoming events, there'll be "Kata and Arm: a secure alternative to the SGX space," and "Building an HA control plane for Tinkerbell" — oh wow, by my good friend Jason DeTiberus. That should be a fun one; it's coming up on the 15th, so it hasn't happened yet — two more days. And then on-demand seminars: "Moving from CLIs to control planes with Crossplane," with Viktor Farcic from Crossplane, and using CSI snapshots. So that's what's coming in the CNCF community — definitely check those things out if you're interested in them.

One of the things I also really liked was this blog post by my very good friend Scott Lowe, who gets into Envoy configuration. He's talking about how Envoy works, what you're configuring, and how those bits of it work together. So if you're interested in Envoy and the way that Envoy integrates with service mesh, this is probably a pretty good read on the Envoy piece of it. I thought that was a pretty good article.

All right, now it is playtime. Let's get started. So last week we did a lot of talking about the enhancements — ooh, and I just learned something new. I wonder if… click here. I want to see if this works; bear with me for just a moment, because I think that it does. If it doesn't, then I'll be kind of sad, but I figured out that there is a new short path: "features". Okay, "features" goes to issues. Hmm — I was hoping that was a little easier or more intuitive, but I don't see how it works. Right now my phone's ringing — not today, buddy. So that's how it would work.
So if you know the issue number — 2579, for example — then you can actually go to that issue by going to features/2579. If you are aware of this particular issue: this is the enhancement for Pod Security admission, which is the thing that we're going to be exploring hands-on today. These are the assignees to it, and this is the work that's ongoing. If you know the issue number, there is a short code you can use to get there, which I only just recently learned about and thought was actually pretty cool.

So, we talked quite a lot about this documentation last time — basically how Pod Security admission is going to work, what it is, how we enable it, those sorts of things. And actually, I just saw an issue go by on the Kubernetes development list — I usually look at LWKD for this — and what it pointed out was a recently merged change. What it does is change the admission mechanism inside the API server to enforce Pod Security before PodSecurityPolicy. So if you're in a state where you're running both, Pod Security will run before PodSecurityPolicy, and that enables the functionality of audit or warn — and we can talk a little more about that when we get hands-on. If you have Pod Security admission running in an audit mode or a warn mode, you probably want to be able to see those events before PodSecurityPolicy takes the object and mutates it. So in this case they basically just changed the ordering of the admission controllers so that Pod Security runs before PodSecurityPolicy — the admission for Pod Security runs first. I thought that was pretty neat.
It's a good fix. And then — okay, so this is actually where we're going to jump in. A good friend, Lachlan Evenson, actually wrote this article. I was helping him understand the way the seccomp stuff worked, but this is also just an article that he put together about the very thing that we're going to explore today. And so, since some of this work is already done, we're going to dig in and we're going to start here, to keep it simple for our particular environment. So I'm going to jump into my desktop and pop up a terminal. Let's get started here.

Now, there are a couple of things I wanted to extend in this, and I want to talk to you a little bit about why and how. Inside — oh, shoot — inside the documentation, which is here, part of this documentation talks about what happens if you're in warning mode. Let's get down in here. So in Pod Security admission — here we go — you have audit and warn. And in here it says: policy violations will trigger the addition of an audit annotation to the event recorded in the audit log, or policy violations will trigger a user-facing warning, but otherwise be allowed. And so, as we define policy and enforce that policy in a given namespace, we'll be able to see the three different modes. That's what we know.
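To make the three modes concrete: they're selected per-namespace with labels. A minimal sketch of a namespace opting into all three (the namespace name and the profile values here are illustrative — the episode uses baseline for enforce and restricted for audit and warn later on):

```yaml
# Sketch: a namespace enabling all three Pod Security admission modes.
apiVersion: v1
kind: Namespace
metadata:
  name: nginx-test          # illustrative name
  labels:
    # Reject pods that violate the baseline profile.
    pod-security.kubernetes.io/enforce: baseline
    # Annotate the audit-log event when a pod violates restricted.
    pod-security.kubernetes.io/audit: restricted
    # Return a user-facing warning when a pod violates restricted.
    pod-security.kubernetes.io/warn: restricted
```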
That's what I'm planning on covering today. But in here it specifically calls out the possible need to see what's happening in the audit log, and so I wanted to extend our kind configuration to support an audit log, so that we could actually see the audit event as it is annotated when we go into audit mode — what that security event looks like for each mode. Fortunately, I have done this work before, in the form of a gist, and so I wanted to share with you what I was doing in that gist; hopefully that will give us the ability to export our audit log.

Okay, so inside of here: this is a gist I put up, gosh, thirteen months ago — a year and a month ago — when I was playing with auditing inside of kind. You can even see that the apiVersion is an older version; I think now we're on a different API version. And this is cribbed from the Outcold Solutions "monitoring Kubernetes v4 audit" document, which is probably good enough for us. "Do not log from the collector" — we probably don't need to worry about that piece of it. And then there's: don't log node communications, don't log read-only URLs, log ConfigMaps and Secrets — sure, to catch a little more. There we go. So then here's our kind configuration.
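An audit policy along the lines just described might look roughly like this — a sketch, not a copy of the gist; the exact rules there differ, and these are illustrative:

```yaml
# Sketch of an audit policy in the spirit of the gist described above.
apiVersion: audit.k8s.io/v1
kind: Policy
rules:
  # Don't log read-only health/version endpoints.
  - level: None
    nonResourceURLs: ["/healthz*", "/version", "/readyz*"]
  # Log ConfigMap and Secret requests at the Metadata level,
  # which avoids writing secret payloads into the log.
  - level: Metadata
    resources:
      - group: ""
        resources: ["configmaps", "secrets"]
  # Catch-all: log everything else at Metadata, which is enough
  # to see the PodSecurity audit annotations on request events.
  - level: Metadata
```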
So this is basically what the auditing file will look like when we're configuring auditing inside of Kubernetes. And then the other file here is the kind config that leverages it — it makes the advanced-audit.yaml file available to the API server, so that the API server can actually use that audit policy file, and it specifies an audit path so that we have an audit log that we're able to view.

So what I'm going to do is go ahead and copy this, and we're going to build a kind configuration that supports both Pod Security admission and also the audit logging, so that we can see those events. So bear with me as we work through this, and then we'll be able to see exactly what that looks like. I'm going to copy this, come over here, go down here to the bottom just for a second, paste all of this in, and then we're going to do some handy cutting and pasting. We can actually just use a relative path now, which is cool: the audit yaml, readOnly true, type File.

What this line right here with the extraMounts is doing is saying: for this containerPath on the control-plane node — /etc/kubernetes/policies/advanced-audit.yaml — I want you to mount a copy of this local file, ./advanced-audit.yaml, as read-only. What this does is make the file available on the underlying node. So if I were to docker exec into the control-plane node, I would be able to see that file sitting on the file system inside the container. But what this does not yet do is make the file available to the API server running inside of the static pod container that kubeadm stands up.
So that's where the next piece comes in. We have our control-plane node with this file made available, and then here is where we're going to make our kubeadmConfigPatches. I have to check whether this is still correct — I think it might no longer be — so we have to see what the actual current API version is, and the way we can do that is to go back to the docs. (By the way, the relative path is relative to wherever you're actually running the kind create cluster command.) So our current API version looks like v1beta3 — okay, so it's v1beta3. 1.15 and newer can be used to migrate from v1beta1 to v1beta2; 1.22 and newer no longer support the v1beta1 and older APIs, and v1beta3 is supported. So that works for us — the configuration had changed.

This metadata name is just giving kind something to actually patch against, and then here's where things get interesting. For the apiServer component, we're going to pass some extra arguments to that binary: we're going to set the audit-policy-file flag to our policies/advanced-audit.yaml; we're going to set the audit-log-path to /var/log/kubernetes/kube-apiserver-audit.log, which will be made available on the underlying container; and the format for that will be json. And then here we actually make the file available to the API server binary. So this hostPath will be on my laptop, and the containerPath will be inside the container at the path that we specified — and we're going to look at that here in just a second, once we get this wired up. That makes sense. So in the extraVolumes made available to the API server, we're going to pass in a couple of extra volumes.
We're going to create a volume with a hostPath and mountPath equivalent to those policies, and this will actually give us access to that file that we copied into the node — we're going to make that file available to the API server inside of its container file system. And then the other piece is where we're going to put those logs. We called out above that it was /var/log/kubernetes, so we're just going to mount that path on the underlying node into the container that's running kube-apiserver, with readOnly false so the writes go through.

So in theory, we should have everything we need to turn on Pod Security. That's all we had to do here: the feature gate PodSecurity equals true. And then on the control-plane node we add this advanced-audit.yaml — which we still have to create on the local file system — and we also pass in some configuration options to the API server so that the API server can enable the audit log.

So I'm going to vim advanced-audit.yaml, go to insert mode, and then go back over here to our gist. I'm going to grab this whole thing, come back over here, and drop it in there. Now — oh, you know what, that's probably not right anymore either. I'm assuming this is v1 by now, but let's check. I'll show you how to actually validate that stuff, because sometimes I also don't know. Pretty sure it's v1 now, but we could also just vet it here. So apparently there's some stuff in v1alpha1, some stuff in v1beta1, and there is now also v1. Likely what we're doing falls into the v1 category, so we should be good there.
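Putting those pieces together, the kind configuration being built up here might look roughly like this. This is a sketch: the file names and mount paths follow the discussion above, but you should verify the kubeadm patch apiVersion against what your version of kind actually generates (that becomes important shortly):

```yaml
# Sketch of the kind config described above.
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
featureGates:
  PodSecurity: true
nodes:
  - role: control-plane
    extraMounts:
      # Copy the local audit policy onto the node's file system.
      - hostPath: ./advanced-audit.yaml
        containerPath: /etc/kubernetes/policies/advanced-audit.yaml
        readOnly: true
  - role: worker
kubeadmConfigPatches:
  - |
    apiVersion: kubeadm.k8s.io/v1beta2
    kind: ClusterConfiguration
    metadata:
      name: config
    apiServer:
      extraArgs:
        audit-policy-file: /etc/kubernetes/policies/advanced-audit.yaml
        audit-log-path: /var/log/kubernetes/kube-apiserver-audit.log
        audit-log-format: json
      extraVolumes:
        # Expose the policy file to the kube-apiserver static pod.
        - name: audit-policies
          hostPath: /etc/kubernetes/policies
          mountPath: /etc/kubernetes/policies
          readOnly: true
        # Give the apiserver a writable place for the audit log.
        - name: audit-logs
          hostPath: /var/log/kubernetes
          mountPath: /var/log/kubernetes
          readOnly: false
```

Then something like `kind create cluster --config kind-config.yaml --image kindest/node:v1.22.0` (image tag per the blog post being followed) brings the cluster up.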
We're going to leave that as v1. Also, you can do kubectl explain — well, the audit policy probably can only be known about within the code base. I thought audit was actually something that was surfaced in the API, so you could explain it — oh no, that never happened, that's right. I was thinking maybe it would be surfaced in the API because we were talking about dynamic audit for a while, but I don't think that ever materialized. So it's not going to be known here; it's known in the code base.

All right, here is our audit YAML. We have set it to audit.k8s.io/v1, it's a Policy, and here are the rules that we care about. That should be enough to get it done. We're going to give kind a cluster name, and then we're going to go ahead and bring up our new cluster with all this configuration: kind create cluster --config our YAML, and then the --image string from that blog post. Too many windows. I'm going to do exactly what's inside of Lachie's blog: we want the kindest/node v1.22 image, because that's actually where the feature shows up — before this, the feature gate will apply to nothing. So I'll paste that in there. Enter.

Nothing happened — "field type not found in type v1alpha4 Mount alias." That's changed, so out comes the type field. Now, will it fail or will it work? We are making some pretty significant, interesting changes to the control-plane configuration, so you might get to see some troubleshooting on the kubeadm side. Oh hey — control plane started! Whoop whoop. Worker nodes — I don't think I actually did anything to the worker nodes, so those should start fine; it's the control plane that would have been messed up if we had messed it up. Can't believe we got that on the first try. That was kind of amazing — feeling kind of chuffed. Okay, so now we think we have everything worked out here.
Let's just go ahead and validate our assumptions. So I'm going to docker exec -ti into the kind control-plane node, and then ls /etc/kubernetes/policies — there's that file. And then if we look at our log — oh, that doesn't look right. Yeah, that's not… that is where I put it, right?

Because I was using kubeadmConfigPatches, we can look at the kubeadm configuration file that was actually resolved and verify that the changes we made show up in that resolved file. The way we do that: we can just cat /kind/kubeadm.conf. This is where the settings would show up if they were going to show up. So we're going to cruise on up here — there's kubeadm v1beta2, there's our feature gate showing up, so that's working. But I don't see the audit stuff showing up, so I think we have missed the boat somewhere. If I cat the kube-apiserver manifest under /etc/kubernetes/manifests — nope, that did not work; I see no auditing flags. So this failed in a silent kind of way.

Interesting stuff. It's probably one of those things that happens with YAML, where it will ignore stuff that doesn't match. See — this is v1beta2, not v1beta3. Take a look at that again real quick so I can cement this and make sure we have it right: you can see that the configuration that was passed to kubeadm was of the version kubeadm.k8s.io/v1beta2, but in our configuration we were passing a patch against v1beta3. And v1beta3 is not the configuration being passed to kubeadm, so the patch fails silently, because it does not match. If I go back into my kind configuration and change this value to match, then we'll see if it works. So we thought we got away with it, but we didn't quite get away with it.
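The failure mode here is worth spelling out: a kubeadmConfigPatches entry is matched against the kubeadm config that kind itself generates, so if the patch's apiVersion doesn't match, the patch is silently dropped. A sketch of the corrected patch header, with the version as observed in /kind/kubeadm.conf on this cluster:

```yaml
# The patch apiVersion must match what `cat /kind/kubeadm.conf`
# reports on the node, or kind silently ignores the patch.
kubeadmConfigPatches:
  - |
    apiVersion: kubeadm.k8s.io/v1beta2   # was v1beta3: silently ignored
    kind: ClusterConfiguration
    metadata:
      name: config
```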
We're going to see if we get away with it this time. While this is loading up, I'm going to go ahead and open another terminal. Hey — control plane started. Hey, there we go — now we're seeing it show up. Much better.

A question from chat: is there a way to tell kubeadm to use v1beta3 in the kind config? I think we have to rely on the machinery that kind is using — likely some version of kind going forward will use that next version. Alternatively, we could do a thing where we generate the kubeadm.conf ourselves and overwrite the destination with a complete kubeadm file that it would use. And if we wanted to take an example kubeadm configuration like you see here, the only tricky part is this local advertise address, because we'd need to know for sure that that will be the IP address it would resolve to — but if we did know that (maybe knowing it would be the second Docker container running), then we could go ahead and make this change. Also, there's a default that would probably be okay here, because there's actually only the one IP address on those nodes, so we probably don't need to specify the node IP; it's just being passed because of the way we're generating configuration. I hope that answers your question — I know that's not a super easy answer, but that is one way we could do it.

So let's go ahead and validate our assumptions again. Looks like it worked, but I want to jump in here and take a look around the control plane: tail -f /var/log/kubernetes/kube-apiserver-audit.log — oh, perfect, it's working. All right, so we do have a working audit log. And on the API server side, inside of here we should be able to see our audit flag, and it is pointing at that correct file.
And so everything is working like a champ. All right, so our cluster is stood up — let's play with it. Dig in; should be fun.

What I'm going to do is put this configuration somewhere you can get it. Actually, before I say I'm going to do it, why don't we just do it: I'm going to jump over here into our This Week in Cloud Native notes, go to edit mode, and paste it in. For those who want to play along at home, this will be the advanced-audit.yaml, and this will be the kind config — same here, relative paths — and the create cluster command. Okay, so that should get you to the place where we are.

Let's dig a little further down into the rabbit hole. The next thing I want to do — I'm going back to this blog post, because some of the work that was done here has already been done. It warns about the baseline pod security level, pinned to v1.22 of the policy, on a test namespace. The way I'm interpreting this: when we go back to the documentation, remember we have a couple of different profiles. We have baseline and we have restricted. The baseline one is pretty permissive, and the restricted one is not nearly as permissive — in restricted, for example, you can't run a container as root, but in baseline you're pretty much okay with it. So this is a great example. Let's give this a try — let's actually validate this in our configuration, see what this looks like on our side, and then we can play through what that would look like. So: do I have kubens? I do. Is it label or annotate?
Let's go ahead and do that bit baseline Oh label And then we'll do Warn I'll also do audit These are the labels that I have specified on this namespace I've set the kubernetes.io metadata name Oh, that's actually already done for me But then podsecuritykubernetes.io audit restricted podsecuritykubernetes.io enforce equals baseline And then podsecuritykubernetes.io warn equals restricted So now I should be able to get warnings I should be able to see that in my audit log And I should be able to get user facing warnings about Manifests that don't line up with a reality But they should still allow the creation of a pod So if I were to do kubectl Limit I'm already in the nginx test namespace Clut nginx Image equals nginx Stable For replicas equals 3 80 Enter And I do see a warning I use a facing warning Would violate the latest version of the restricted pod security policy Allow privilege escalation is not equal to false Container nginx must set security context Allow privilege escalation equals false Unrestricted capabilities Container nginx must set security context Capabilities drop all Run as not root is not equal to true Pod or container nginx must be set Must set security context when it's not root The second profile must also be defined Run time default Or localhost That's pretty cool I think that's actually pretty useful for the developer side of things You'd be able to see that in your log If you were doing something like a mechanism by which You were deploying automatically using CI or something like that Then this mechanism would actually probably surface in your logs and you'd be able to see it I also want to see the audit log and see if we see that same output Or what the audit event looks like So let's jump in and see Let's do Ducker exec TI Oh wow This is actually not going to deploy I just noticed the typo Whoops Well It doesn't really matter too much We should still be able to see the event Because this is an admission not in runtime So that it doesn't 
deploy is mostly beside the point. So: docker exec into the kind-control-plane container and grep the Kubernetes audit log for nginx — I'm going to kick this over so you can see it. There we go. This is neat — that's exactly what I was looking for. That last match gives me what I want: this is the event. authorization.k8s.io/decision was allow; the reason was that nothing blocked it; and then pod-security.kubernetes.io/audit annotations — allowPrivilegeEscalation and so on. So this is annotating the event, and it basically gives us the same output we saw before. That's actually pretty cool.

But let's look at this entire event, so we can see what it looks like. Here's the response object for this Deployment — this is coming from the API server as a response to whatever command-line client created this object. It's kind: Deployment, apps/v1. So this is in line with what we saw previously in the documentation: the controller can now warn you about the object as it is created, as part of a Deployment — we don't have to wait for the Pod object to get created. And this is different from the way Pod Security Policies worked. Pod Security Policies only take effect on, and only manipulate, Pods themselves. So you could very easily end up in a state with Pod Security Policies where you just wouldn't see Pods deploying, but the Deployment object would be accepted by the cluster. That led to situations where it was hard to debug or understand what was actually happening in the lifecycle of that Deployment, because of the way it would materialize under Pod Security Policies. So your flow would be: you already have a Pod Security Policy defined inside your cluster; your user, or whatever mechanism you're using to deploy into the cluster, creates Pod objects and is subject to that Pod Security Policy. If the
decision that the Pod Security Policy enforces is to deny admittance of that Pod, then the Pod itself will simply never be scheduled — it will never show up as a Pod object at all, because the object was denied. But you would still get the ReplicaSet, and you would still get the Deployment object. Those would be created and allowed in, because they're not part of what Pod Security Policy enforces — it only enforces the Pod object itself. With Pod Security Admission, though — the new stuff — the controller can evaluate the Deployment object itself, and give you feedback based on that information.

Right, so the manager for this event was kubectl create — we just saw that. Here's what's known about the object, basically what it's being evaluated against: there are going to be three replicas; it's in the configuration; we can still see our typo here, whatever. And then we can see that this event was annotated: authorization.k8s.io/decision is allow, and here is the pod-security audit annotation. Now, I don't think the nginx object itself is annotated with the same information — and we can confirm that it is not. There's no annotation on the object itself; the information is only annotated onto the audit event. We can also see, of course, the ImagePullBackOff, because the image did not work.

There we go: kubectl set image. What this does, just as a quick hack, is change the image to nginx:stable instead of the typo'd name, and now I should see those pods deploying. kubectl set basically gives me the ability to set particular things on a Deployment or a DaemonSet or what have you. And the interesting thing was, even though that was being set,
we still saw the same "would violate the latest version of restricted" warning — and that is being driven by what's configured at the namespace level. So we have things working: we can see the audit events, and we can see the user-facing feedback. We're not getting any feedback on the object itself, which I think is kind of interesting. I don't think you'll be able to see it there — yeah, because the trigger is in the code.

That is a start. Let's go back to our docs here. So: we've got auditing working, and we're able to see the event, which is pretty cool. Now, there were some questions brought to me by rw — it might be "us", the U.S., or is that somebody else? "It'd be interesting to see applying policy on top of existing pods and see what it does. Once an old non-compliant pod restarts, it might not come back up." That's true — if you're in a restricted enforce mode, then it would break things. "Although deprecated, Pod Security Policies are still supported through 1.25. How badly can you break a cluster with both enabled?"
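To make that warning go away, the pod template would need exactly the fields the admission controller called out. A minimal sketch of a restricted-compliant spec, using the demo's deployment and image names (the label selector here is my own addition):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:stable
          ports:
            - containerPort: 80
          # Each field below answers one line of the "restricted" warning
          securityContext:
            allowPrivilegeEscalation: false
            capabilities:
              drop: ["ALL"]
            runAsNonRoot: true
            seccompProfile:
              type: RuntimeDefault
```

Applying something shaped like this in the same namespace should produce no warning, since it satisfies the restricted profile that warn and audit are checking against.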
I think that would be fun to play with — I'm not sure it's in scope this time. But I imagine that, because Pod Security Admission will be evaluated before Pod Security Policies, the bugs would most likely originate in Pod Security Policies, not in Pod Security Admission. The reason I say that is that Pod Security Admission is only validating: it's only looking at the spec and making a decision to allow or deny, based on its evaluation against the profiles you've configured. Whereas Pod Security Policies do quite a lot of other stuff: they might also enforce particular configuration, or mutate the pod spec itself. So I'd say you're more likely to run into trouble with Pod Security Policies than with Pod Security Admission, because Pod Security Admission will only ever allow or deny, while Pod Security Policies are the things that mutate and enforce changes in configuration. That's the way I understand it. In fact, one of my explorations of this effort was to check whether any sort of mutating is happening, and I don't think that is the case.

"Do duplicates of a mode cause an error, or is the last one declared used — or maybe the first?" So, in our example here, let's get the namespace labels. In this case, the mode is somewhat duplicated.
I don't think we can create two labels with the same key and different values — so I couldn't add pod-security.kubernetes.io/audit=warn alongside the existing audit=restricted. But let's try it: kubectl get ns --show-labels... yeah, there's already a value. We can't really duplicate it, but we can overwrite it — if I wanted to change the value of that one key, I would have to pass the overwrite flag. Another question would be: can I pass multiple levels to audit — could I pass restricted and baseline together? I don't know that that would be a valid configuration either, but it might be fun to see. It doesn't accept a comma-separated list of levels. So I don't think that would work — duplicates of a mode aren't really possible.

"I assume pod security is cluster-wide, so you couldn't have different rules per node?" So: this is admission, and it happens at admission time on objects like Deployments, DaemonSets, or Pods. Because it's admission, the node doesn't actually matter — the rules would not apply differently on different nodes. They do apply differently on different namespaces, and you can also have sets of exemptions around things like which user is creating an object. That's actually one of the things I wanted to explore next — exemptions — because you can make it so that a given user has the ability to deploy a pod at a particular enforced level, in a particular namespace; you can effectively override it. But how the node's CRI — how the node would enforce it — would be the same regardless of the node, I think, because it's all happening at the admission layer. Yay, I'm glad that's you. I think you can only pass one — yeah, exactly.
I think that's right — that's what I thought too. All right. "If I understand it right (big if), there are only three built-in policies" — that's right — "and they can't be changed" — also true: privileged, baseline, and restricted. "Is there any way to add your own, or is it too soon, and is that why there are links to other projects?" Yeah. So the way this has shaken out over time is this. Initially, the reason there's even a Pod Security Admission piece is that the API for Pod Security Policies had grown somewhat out of bounds, right? There are just so many things you can configure; it's difficult to configure, difficult to test, and it became very difficult to manage that API's growth over time. That led to the deprecation of Pod Security Policies.

However, Jordan Liggitt stepped up and said: you know what, maybe we could just make a much more simplified API for what's happening here. You'd have the three major models that people use — privileged, baseline, and restricted — following the best practices that we define in the Pod Security Standards document. You would then just apply those to given namespaces, and get information back to users that says "this is not going to work in a restricted mode" or "this is not going to work in a baseline mode" or what have you — giving them a feedback loop that way, but with a much simpler API that only fits a specific set of use cases. And if people need more than that, they can define it in OPA Gatekeeper, Kyverno, and the like. I honestly think that's the right move, right?
I feel like we have to have something in place in the project itself, and that something should follow the best practices we describe as a project for how to secure pods and containers inside your Kubernetes cluster. But providing the entire configurable surface that Pod Security Policies provided was arguably too much — arguably too difficult to maintain over time. So I think this is probably the right decision. That's why there are only the three profiles, and why they're not mutable — why you can't change them. I don't suspect they'll become changeable in the future, but I guess we'll see how that evolves over time.

"Other than getting the feature flag enabled, should this Pod Security Admission mechanism work on cloud providers like GKE and AKS once they upgrade their services to offer a compatible Kubernetes version?" I believe that is the case — yeah, it should just work. Once it becomes default, it might be interesting to understand how the graduation plan would work, right? So if we go back to our notes, back to that KEP-2579 or what have you — oh neat, there's a pod security evaluations metric on the API server. Oh, that's cool. Okay, let's just go look at that real quick. We're going to take a quick sideline: since we already have some requests that would have been allowed, we'll look at some of the metrics that are exposed here. Maybe I have the metric name wrong — pod security evaluations should be there alongside the pod security policy metrics. Or maybe the metrics aren't quite implemented yet, because I don't see them here. Hmm. It doesn't look like the metrics are in place yet.
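For reference, KEP-2579 does sketch a counter for these admission decisions. From my reading of the KEP — treat the exact metric and label names here as assumptions until the implementation lands — it would look something along these lines in the exposition output:

```
# a hypothetical sample, shaped per the KEP's proposal
pod_security_evaluations_total{decision="deny",mode="enforce",policy_level="restricted",policy_version="latest"} 1
```

That's the kind of series the demo was grepping for; since nothing matched, it's consistent with the feature's metrics simply not being wired up yet in this build.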
Unfortunately. You know, while we're here, I'll share one of my other favorite commands, because I always thought it was a super interesting one — one of my big takeaway commands for raw metrics. I only have the one API server, and what I'm doing here is querying the metrics endpoint on the API server using kubectl. I'm doing kubectl get --raw, which means I'm making a raw GET against a path in the Kubernetes API — and I know the /metrics path hangs right off of the root of the Kubernetes RESTful API. So if I do kubectl get --raw /metrics, I'll see the metrics for the API server that I'm connected to. If you have multiple API servers, you're going to see different outputs, because each API server is its own entity and they may report different values for some of these metrics. Likely they'll all agree on this one, but it's still an interesting caveat: if you're talking to a load balancer in front of multiple API servers, you're likely to hit different API servers on different requests.

This is, in my opinion, one of the most interesting metrics we expose at the API server: etcd_object_counts. You can see what the actual object counts are inside the Kubernetes cluster. We can see there are 43 secrets defined, there are 42 service accounts, there are two services, there's zero stateful sets, one storage class — you can see all of these different objects. And one of the really interesting ones is this: there are 107 events currently stored in etcd. If I do kubectl delete events --all, all the events go away; then I query that same metrics endpoint again and look for events.
So, waiting for it to converge... Maybe it's because of compaction, maybe it's because of a cache — I would have expected that to update by now. It takes a few seconds for the exported metric to change. But yeah, what this can tell you — it can actually show you where things are likely to go sideways soon, right? You can monitor the number of namespaces, the number of nodes, the number of pods, the number of events currently stored — all of those things — and spot where your problem is. If you're experiencing something like "my Kubernetes cluster is really slow, and I want to understand where it's spending its time," one way to see that is this information here.

And then there's also rest_client_request_duration — request latency in seconds, broken down by verb and URL. Here's another example: these are calls to the scheduling.k8s.io API group. I think this is neat because it's bucketed — what we're looking at here is a Prometheus histogram, where the duration of each call is broken up into buckets: 0.001 seconds, 0.002, 0.004, 0.008, 0.016, 0.032, 0.064, 0.128, 0.256, 0.512. When you're evaluating this, what you're looking for is where the observations show up in that set of buckets. So for this particular query, the histogram says all of the requests are being resolved within about the 0.064 bucket — they're not taking 0.128, and they're not happening any faster than that. If we look for something that has a few more requests, we might see a more interesting outcome. Here we go, here's another one: getting the
namespaces of the cluster. There have been seven requests to get that information: three of them were responded to within 0.002 seconds, three more within the 0.004 bucket, and one within the 0.008 bucket. So if somebody were to say that kubectl get namespaces is taking a long time, we would start seeing observations graph up toward maybe half a second or longer.

Yeah, that's reading all the types from etcd that the API server cares about. There may be other stuff in etcd that the API server doesn't care about, but this is about the objects that the API server itself manages. I don't believe it stores the metrics key in etcd; what's actually happening is that the metrics code inside the API server counts the objects in the cluster and exposes that as a metric, refreshed over a period of time. And I suspect it doesn't constantly poll to compute that metric — it updates on a schedule. Otherwise it would greatly increase the load on the API server: if every hit to the metrics endpoint had to re-evaluate all of the metrics, that would take too long. So likely what happens instead is that it evaluates the value once, serves that as the response to anybody hitting the metrics endpoint for a period of time, and once that period expires it re-evaluates — so the next time you scrape, you get an updated value. I suspect that's how it's working, which is why my delete-events demo hasn't converged yet, unfortunately — give it a minute. "What are reasonable SLOs?"
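The bucket arithmetic in that namespaces example is worth making concrete. Prometheus histogram buckets are cumulative — each `le` (less-than-or-equal) bucket counts every observation at or below that latency — so per-bucket counts come from differencing adjacent buckets. A quick sketch using the numbers above:

```python
# Prometheus histogram buckets are cumulative: the counter for le="0.004"
# includes every request that also fit under le="0.002".
cumulative = {
    0.002: 3,          # 3 requests finished within 2ms
    0.004: 6,          # those 3, plus 3 more within 4ms
    0.008: 7,          # plus 1 more within 8ms
    float("inf"): 7,   # the +Inf bucket always equals the total count
}

def per_bucket(cum: dict) -> dict:
    """Convert cumulative bucket counts into per-bucket counts."""
    counts, previous = {}, 0
    for le in sorted(cum):
        counts[le] = cum[le] - previous
        previous = cum[le]
    return counts

print(per_bucket(cumulative))
# {0.002: 3, 0.004: 3, 0.008: 1, inf: 0}
```

This matches the reading given on stream: three requests in the 0.002 bucket, three more in 0.004, and one in 0.008 — seven total.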
It depends — on scalability. Oh, hey: "How does this feature react if the API server and/or etcd is unavailable?" Because the admission is happening at the API server, if the API server is unavailable, it would not work — but then nothing would be admitted anyway.

Oh yeah — because it's a label, maybe we can manipulate it in other ways. Let's see what happens if we do that. If I do kubectl label with a bogus level value, I get a kind of confusing error. What if I change this to baseline, then back to restricted? Okay — well, in my opinion, this is a bug: it says the namespace nginx-test is invalid, but really what's invalid is the value I'm trying to set audit to. I'm going to open that as an issue, because I feel like that's a problem.

All right, back to our questions — make sure we got everything. We talked about that; yep, we talked about that; "other than getting..." — yeah, we talked about that too, I agree. So I guess the last thing I wanted to cover was this piece here, which I thought was actually pretty interesting: pod-security-tests. This is a project built by Jim Bugwadia that contains test pod YAMLs for each policy type defined in the Kubernetes Pod Security Standards. I thought we'd try it out, because it looks like a pretty interesting way to exercise all this — and then that'll probably be our day; it's 3:27 right now. From the README: "Configure a policy instantiation for your cluster, like Pod Security Policies, or a policy engine like OPA Gatekeeper. Apply the test YAMLs, like the baseline or restricted YAMLs, by appending the appropriate folder name. Check the created pods. All test pods have the following labels defined: policy (baseline or restricted) and control, the value that identifies the security control being tested. You can view the pods for a policy level..." What's this?
Well, let's try this out — I think it looks pretty interesting. So: copy. What I'm going to do is create a new namespace, then kns restricted to switch into it. Okay, cool. So now, to deploy this: kustomize build... oh, I know — you can do it by URL. Well heck, let's just do that; that's a feature of kustomize I don't think I've ever used. This namespace is not currently restricted, so all of these pods should be created. Blocked AppArmor — blocked. Neat. This is really fun. So, just for a few more minutes, I'm covering some states you may not have seen before, right? It says sysctl forbidden; some things are in a blocked state; some are still sitting in ContainerCreating. That's cool.

Let's do kubectl label to set enforce to restricted on the namespace. I don't think it'll change anything for existing pods, so let's kubectl delete them — and because these are all bare pod specs (I can tell they're pod specs because they're deleted by name), I won't be able to just restart them. So if I run the same kustomize apply again, in theory we should see far fewer of these show up. Very few errors — which is surprising, because we're in a restricted namespace now. Oh — because I don't have warn set, a bunch of the warnings just didn't show up. Oh, look — they're there: privileged, running — interesting. That one was running as non-root, and it's in policy restricted, so it got through. I'm going to add a different label here — a warn label, so we can actually see the output again — and run that build once more. Nope, no output. Interesting — that's a surprising result. Oh — so most of these containers being deployed actually do match the restricted profile. I guess these are all — here, this one is setting an override securityContext: privileged true, runAsNonRoot true — and we see the privileged container start up.
Well, we may dig more into this in the future. Yeah — I have seen sysctl forbidden before, when you try to manipulate a sysctl that isn't exposed by the kubelet, but it's an interesting state.

All right, I think that's the end of our broadcast today. I hope this was educational, and I hope folks found it useful. Thank you for tuning in, and I'm really looking forward to KubeCon this year — it should be a really fun one. One of the big things I can share is that I will be co-hosting, with Mr. Dan Pop from Sysdig, eBPF Day at KubeCon, which is a pre-show event. So if you're going to be at KubeCon, or if you want to attend virtually, definitely check that out — I'll be co-hosting with Dan Pop, and it'll be amazing. I hope you all have a great week, thank you for tuning in, and I'll see you all next time. Thank you.