Thank you very much. I'm really, really excited for this session, and I know I am the only thing between you and some nice cold beers at the end of the day, so I'm going to try my best to entertain you. You let me know at the end; you be the judge.
This is the Hitchhiker's Guide to Pod Security. I have gone through the galaxies of pod security and condensed all the learnings I've had into one little guidebook, and we're going to take you through that guidebook. I have a couple of goals for you in the next 35 minutes. We're going to learn together. My hope is that everybody leaves this session, goes back to their lives, turns on pod security in their clusters, and reports back on whether my learnings actually helped with the journey to pod security. I would love to see more secure Kubernetes clusters out there. What you're going to learn today is how to use some features in Kubernetes to achieve that. Hold me to this, and let me know at the end that you feel comfortable going back and turning this on. If you don't, please ask me for more; I'm happy to give it to you.
Our journey today has a few pit stops. Do we have any Hitchhiker's Guide to the Galaxy fans in the room? I'll make a few puns. You see the emojis up there; there are already a few. I should have brought my hand towel up on stage as well, but I forgot it. We're going to go over the pod security concepts. We're going to see it in action. I have recorded demos; I'm not cool enough to do them live, because I've seen the internet here today. But you're going to be able to see them after the fact and take them with you. Then I'm going to share some next steps: how to turn it on, how to get comfortable, and how to use it in your environment and be successful with it. We will go through all this and more, but I guess you want to know who's talking to you about this.
My name is Lachie Evenson, or Lachlan Evenson. I listen to both. You can call me "Hey mate" or whatever. I'm a product manager at Azure on the upstream team, where we work on cloud native ecosystem tooling to help make folks successful in the community. You'll see me around in this space. I'm a CNCF ambassador, I'm on the governing board, and I've served on Kubernetes steering. I built a Kubernetes release that hopefully didn't ruin your life, 1.16; if it did, please give me that feedback. I love travel, language, hiking, and experiencing different cultures. I'd like to learn seven languages while I'm here on Earth. I know three currently, if you don't count Australian and English as two different languages. Okay, good. Now you know who's talking to you, we can get on with it. Thank you very much.
Okay, so let's get into the concepts. Let's get into what we're here to talk about. Before I get into all the details, I want to go over what pod security is. You might have heard "pod security" or "pod security admission". Pod security is a built-in (I have everything bolded) admission controller. It is built in; you don't need any extra software to run this. It is there and you can use it in Kubernetes today. What does it do? It evaluates pod specifications against a predefined set of pod security standards. Don't worry about that; I'll tell you exactly what they are, but they're super important. They are applied at the namespace level, so you apply policy to each namespace you would like pod security to apply to, and they are actioned when pods are created in those namespaces. It is currently in beta (or "bay-ta" for you Americans) as of Kubernetes 1.23, which is now not the current release but the one behind, so we're on 1.24, and it is planned, if you all give it a thumbs up after this, to go to stable in 1.25, which is the next release. I've got some links down there.
Just follow along here; I'll post the deck later so you can grab the links and whatnot. Excellent.
I think it's important to say why I am telling you about this. Why would you want to know this? What we want to do, again, is level up the security of your Kubernetes clusters and specifically the workloads that are running on them. Pod security provides policy standards to restrict pod privileges. Why would we want to restrict pod privileges? To make our workloads more secure if they are ever compromised. We're reducing the attack surface, and therefore making your cluster more secure as well. The standards are simple and easy to use; that was written in the KEP for this feature. They have to be super, super simple, and hopefully I can illustrate that they are indeed. They are predefined, so you don't have to go make up some magic standards. There is a set that already exists and you simply refer to them by name. It supports and encourages Kubernetes security best practices, so if you use these you will be in line with pod security best practices out of the box. You don't have to do much. And it performs validation only. We'll cover this a little bit later, but you might be thinking, "Can I change resources?" The answer is you cannot, and we'll talk about some different options for doing that.
What about the elephant in the room, or maybe the fish in the room, the Babel fish in the room? What about PodSecurityPolicy? You may have heard of this, or have used it and been terribly disappointed when you heard it was deprecated. It is going to be removed in Kubernetes 1.25. So if you're still using it today, don't worry.
I'm going to go over how you can migrate to this so that you won't be broken in the next release. The difference, however, is that it does not have feature parity with PodSecurityPolicy. So there are some gotchas, which I'll call out specifically. It does not support mutation, and mutation is the ability to change Kubernetes resources server side when you're creating them.
Okay, moving right along. I'm going to cover the migration story. If you're on this path, I'm not going to go into terrible detail, but there is a migration plan. It is well documented in the Kubernetes docs, so you can go from PodSecurityPolicy, which is deprecated and being removed in three or four months, to pod security, which we're talking about today. There is a link, there is a well-defined process for how to do it, and I know a lot of time and effort has gone into that. If you're in that situation and you follow along, you should be okay. Okay, everybody following along so far? Cool so far? Okay, excellent.
Other similar ecosystem tooling. I'd be remiss not to mention that there is an ecosystem of tooling out there. You may have heard of some of these different tools. Pod security is designed to provide a built-in set of capabilities that improve the security of your workloads. It's not supposed to be the only thing. It's complementary and composable; that's the last point. There are other projects out there if you want to do more complex things. I'm going to name two on the screen there, Kyverno and Gatekeeper, if you're interested in more complex pod security use cases. This is designed to be composable, so you can use this and others as well. And again, these aren't all of them.
I just list two there in the interest of showing you some others.
All right, now we're going to get into the elements of pod security. We have a built-in admission controller. It may be run as a standalone webhook, so if you are running versions earlier than the one this first shipped in, which was 1.22, you can actually deploy this using a webhook admission controller. I won't go into the details; you just need to know that exists. There are three different policy levels, known as pod security standards, that range from permissive, which means you can do everything, to restrictive, which removes all the things you shouldn't be doing that are known to cause security threats. The policies are applied with a specific mode per namespace. We'll go into these modes. You can apply multiple modes to a namespace with different policy levels; that'll be critical, and we'll go over why you might want to do that. Those are the elements you need to know.
Lost the monitor. And we're back. That was very dramatic. I didn't record that; this isn't recorded, that I know of. Okay, so let's take a look at the pod security standards.
These are the levels that you will be using, and they are predefined. Privileged, which is open and unrestricted, as I mentioned. Baseline, which covers the known privilege escalations while minimizing restrictions. This is where you want to get to for a start; if you have nothing and have never looked at this before, the goal state is baseline for starters. And then restricted is the highly restricted level, if you have really high security interests. This takes out the most security-prone pieces of the pod specification and makes sure you're in the most restricted mode, but be warned, it may cause compatibility issues, so it's up to you to decide. I think a great goal would be that every Kubernetes cluster and every namespace has the baseline pod security standard applied. I think that would be a cool goal to have.
Okay, but what about these specific standards? What is in them? They are well documented; the link is down the bottom. I'm not going to go over every piece of the pod specification that is restricted by these different pod security standards, but here are some examples of different fields: spec.securityContext.sysctls, spec.hostNetwork.
They are covered. There is a wide range of different pieces of the pod specification that are defined in that link down the bottom, too many to list. But you can go and have a look at them, and I'll show you them in action, so don't worry about writing them down. Again, they're applied per namespace, which allows you to have granular, different levels of security standards based on your needs per namespace. You may not want to apply restricted to the kube-system namespace, for example, but you may want to apply it to your default workload namespace.
Okay, elements of pod security. These things are applied in a specific mode. There are three different modes, they can be set on a namespace, and you can have different policy levels per namespace as well. The three modes are: enforce, which says "I will not allow this" and strictly blocks it; audit, which posts something in the audit log; and warn, which fires a warning message back to the user stating that this is in violation of the policy and you might want to do something about it, but allows the pod to be created. So you can actually get user feedback. If you define a non-compliant resource with enforce, you'll get an error message back via kubectl that says this policy has been violated. Okay, so just remember those three. Sorry, water break. Okay.
In addition, and I'll show you how to do this with labels on a namespace, you can pin policy. With each version of Kubernetes, there will be changes to the pod security standards. Why is this important? Because if the standards change over time and you're not familiar with those changes and you upgrade Kubernetes, you might break your workload. So this allows you to pin, much like everybody pins all their version dependencies in software, right? Oh, bad joke. Sorry, too soon, too soon. But pinning specific versions allows you to have consistent behavior.
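As a minimal sketch of the modes and version pinning just described, the Python below builds the labels for a namespace and prints a manifest that could be passed to kubectl as JSON. The label keys pod-security.kubernetes.io/[mode] and pod-security.kubernetes.io/[mode]-version are the ones Pod Security reads; the namespace name my-team, the chosen levels, and the helper function names are hypothetical and only there for illustration.

```python
import json

MODES = ("enforce", "audit", "warn")
LEVELS = ("privileged", "baseline", "restricted")
PREFIX = "pod-security.kubernetes.io/"


def pod_security_labels(levels, versions=None):
    """Return namespace labels for the given mode -> level mapping.

    versions optionally maps a mode to a pinned version such as "v1.23"
    or "latest"; modes without an entry get no version label.
    """
    versions = versions or {}
    labels = {}
    for mode, level in levels.items():
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode}")
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level}")
        labels[PREFIX + mode] = level
        if mode in versions:
            labels[PREFIX + mode + "-version"] = versions[mode]
    return labels


def namespace_manifest(name, labels):
    """Wrap the labels in a plain Namespace object."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {"name": name, "labels": labels},
    }


if __name__ == "__main__":
    # Hypothetical choice: enforce baseline pinned to v1.23, warn on restricted.
    labels = pod_security_labels(
        {"enforce": "baseline", "warn": "restricted"},
        versions={"enforce": "v1.23"},
    )
    print(json.dumps(namespace_manifest("my-team", labels), indent=2))
```

Keeping the mode and level names in small tuples means a typo such as "enforced" fails in the script rather than silently producing a label that Pod Security ignores.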
You can say baseline:latest, or you can say baseline:v1.22, and that will apply, because remember this is built into Kubernetes, so you can strictly version pin.
Okay, so let's get on to enabling pod security and how we might go about doing that. In 1.23 it's enabled by default, because beta features are enabled by default, so you can use this on any 1.23 cluster out of the box. If you're on 1.22, you'll need to set a feature flag on the API server to enable this feature. So if you don't have access to the flags that are set on the API server, you might have to wait. You could also take the third bullet down the bottom, which is that you can deploy a webhook admission controller (a mouthful for me at this time of the day), and you can deploy that on versions lower than 1.22 if you want to test this out. Now, I had to put the disclaimer here as we're going through the galaxy: don't test this in production. No, no, no. You might get unexpected outcomes. So get familiar with it in a development setting or a lab cluster; spin it up on kind and start seeing how it works before you turn it on. I don't want anybody calling me saying, "Lachie, you told me to do this and I blew up prod." Don't want to hear it. Okay, you've been warned.
Okay, configuring. These are the two labels that you need to apply to a namespace. You have pod-security.kubernetes.io/[mode]. What are the three modes? Excellent, yeah, you got them all: enforce, warn, and audit. Those are the three modes. That label is required to enable it. The second one there is version pinning, which is completely optional. Now, you might say, "Hey Lachie, I don't want our users defining pod security policies or labels on namespaces. No, no, no, I can't have that."
There is an admission configuration way that you can do it, by setting a config file and having the API server pick it up, so that users don't configure anything. There is a way to do that, and there's a way to exclude namespaces. So you can do that; I've got the star down there at the bottom. The possible modes there again: enforce, audit, warn. These are important.
Okay, configuring. A specific version can be applied per mode. Why would we do this? Remember we have warn, audit, and, what's the other one? Enforce, exactly. Thank you. You usually start with warn to give the users back a warning, and then you move on from warn to enforce. You might warn on one level and enforce on another. For example, you might want to move from privileged to baseline, so you might enforce on privileged but warn on baseline, so that you can help the workloads that are currently deployed move up. The users will get a warning message and the workloads will still be deployed, but you can say, "Hey, you're not in compliance; fix it so you don't get this warning message." The warning is actionable, as I'll show you. Then you can move from privileged up to baseline, and even subsequently up to restricted. We're good so far, everyone? Up in the rafters? Good, I see thumbs up. Excellent.
Okay, we're going to go into some demos in a minute. What I wanted to show you is what an audit entry actually looks like, without filtering through an audit log. If you have audit logging on your cluster, you will see a message that looks like this. This one says allowPrivilegeEscalation != false for the container busybox; it must set securityContext.allowPrivilegeEscalation=false. Unrestricted capabilities.
So there are a bunch of areas there. You can put them in audit logs, and people looking at audit logs can see where things have fired. I will also show you that we publish a set of metrics that you could scrape with Prometheus to see which policies are being hit and how often, so that you can see from your monitoring tool how it's operating as well. So I'll show you that.
Okay, from here we're going to show you it in action. By the way, I got all these images from the free and public NASA images site. They have wonderful images of space. That is the Hubble telescope, which is fantastic. Okay, so we're going to take a look at how this works. I'm going to walk through some different examples with you. Don't worry about keeping up with this: there is a blog where you can reconstruct these whole demos by yourself. All the commands are published, so you'll be able to take this from a Kubernetes blog and run the exact same thing.
For this first demo, we're going to confirm that pod security is enabled, because if you're going through all this work you want to make sure you're doing it on a cluster that has the feature enabled. So I'm going to kick this off here. Let's see if it does on my click. There we go. Okay. To walk through the demo: in the top half of my VS Code I have the script that I'm going to run, with some comments about what you should see, and down the bottom half I have a live and running cluster. I'm just going to show you that I indeed have a 1.23.5 cluster here that's up and running. I'm not pulling a rabbit out of a hat; it's all up and running, and you're going to see that there are some nodes on that cluster as well. In the blog, we actually show you how to run this up with kind.
So I'm actually using a kind cluster. If you don't know kind, check it out; it's a great tool for testing out these features and running with them. Okay, so to check whether it's enabled, I'm going to check the API server flags. If you do not have access to the API server flags, you cannot run this; it will not work. But you can run this command, and I'll go and highlight in that big list of different admission controllers that PodSecurity is indeed present. That means pod security is enabled on this cluster and I can use this feature.
Now I'm going to quickly show you a dry run where you can test if it's working as well. The reason you'd run this is if you don't have access to the API server flags. What we're going to do is create a namespace. We created a namespace. What do we have to do with the namespace to turn this on, anybody? Label it. I'm going to label it, here with enforce=restricted. I'm running that command now, and then I'm going to show you the YAML of that specific namespace, called verify-pod-security, just to prove that the label is indeed there. I'll go and highlight it, and you can see there in the list that it is applied to that specific namespace. So what we're going to do here is an interesting command. If you don't know this one, it's a fun one to test.
We're going to do a run. We're going to run test with the image busybox, but with --dry-run=server. So I'm going to try to apply this, and it violates the policy, and I get an error back that it's forbidden because it violates PodSecurity "restricted:latest", for these reasons. There are many reasons in that list that this workload violates the policy, but that's just a great way for you to test that it's running. You apply restricted to a test namespace and then try to dry-run create a workload on the cluster. Server-side dry run is something I use all the time to test capabilities without actually having to run a workload.
Okay, so that's the end of this first demo. I just wanted to make sure that everybody was on the same page about checking that it is on in their cluster, because I know not everybody is running 1.24. Is there anybody running 1.24? Yes? I've got a prize for you. Okay, excellent. I know it's relatively modern, 1.23, 1.24, so you need to check that it's enabled. Now we're going to go through what it looks like to apply some policy, then create a workload that violates that policy, then try to bring it into compliance. Okay, sound good? Excellent. Thank you.
All right, I'll kick this one off. We're going to work at the privileged level: enforce on privileged and create a workload. So again, I'm going to create the namespace verify-pod-security. That's created. I'm going to label the namespace with --overwrite, because I didn't clean up the labels from last time on this test cluster, and we're going to enforce on restricted and audit on restricted. Okay, so we're getting the hang of this now, right? Everybody knows what we need to do to turn this on. And now we're going to take our privileged workload and deploy it to the namespace. We expect this to be forbidden.
Why do we expect this to be forbidden? Because in this pod's security context I have allowPrivilegeEscalation set to true, and that violates the restricted policy. So we're going to take this pod specification and apply it to the Kubernetes cluster, and we should expect an error back. And indeed we get an error: the pod busybox-privileged is forbidden, it violates the policy, allowPrivilegeEscalation != false. So again, a useful error message for your users. They can understand it and see exactly what they need to do to fix it. But the pod will not be created, because we're enforcing on restricted.
Okay, I'm going to go ahead and update the labels to say we're enforcing on privileged, warning on baseline, as I said, and auditing on baseline. So we have mixed modes there across enforce, warn, and audit, and as I said, this is a great way to help users move to baseline. Okay, so I'm going to deploy this privileged workload and then check if it's running. Do we all think it'll run this time? Yes? Okay, let's see. Cue the Jeopardy music. Okay, it is indeed created, because we set the privileged policy, so we're allowed to have privilege escalation. Is privilege escalation great in pods? No, it's not.
Okay, so I'm going to run some cleanup here, and then we're going to work through a more complex, restricted workload so you can see exactly how you might clean them up. You can see that the pod is running, even though in the brackets at the top I say it will not be running on the node; that's because on kind you can actually schedule to the control plane node by default. Okay, so I've cleaned up. That's a privileged-level workload.
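For readers following along without the recording, here is a rough sketch of the kind of pod used in this demo, written as a Python dict that can be dumped to JSON. The pod name, container name, image, and the allowPrivilegeEscalation field come from the talk; the sleep arguments are an assumption added so the container would stay running. The escalation_violations function is a toy stand-in for one single check of the restricted standard, not the admission controller's actual code.

```python
import json


def busybox_pod(name, allow_privilege_escalation):
    """Build a one-container busybox pod with the given escalation setting."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "containers": [
                {
                    "name": "busybox",
                    "image": "busybox",
                    # Assumption: keep the container alive; not from the talk.
                    "args": ["sleep", "1000000"],
                    "securityContext": {
                        "allowPrivilegeEscalation": allow_privilege_escalation
                    },
                }
            ]
        },
    }


def escalation_violations(pod):
    """Toy check: names of containers that do not explicitly set
    securityContext.allowPrivilegeEscalation to False."""
    spec = pod["spec"]
    containers = spec.get("containers", []) + spec.get("initContainers", [])
    return [
        c["name"]
        for c in containers
        if c.get("securityContext", {}).get("allowPrivilegeEscalation") is not False
    ]


if __name__ == "__main__":
    pod = busybox_pod("busybox-privileged", True)
    print(json.dumps(pod, indent=2))
    print("containers failing the toy check:", escalation_violations(pod))
```

The check treats a missing field the same as true, which mirrors the wording of the error in the demo: the field must be explicitly false, not merely absent.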
Now we're going to move on to a restricted-level workload. Let's take a look at this one, and then we'll go and have a look at some metrics. Okay, so again, create the namespace. Now we apply the labels to the namespace. What we're doing this time is enforcing on restricted. Last time we enforced on privileged; now we're enforcing on restricted and auditing on restricted. We're going to deploy a restricted workload in this namespace, and you're going to see slightly different behavior this time, which is really interesting.
Okay, so here's our restricted pod. We have a security context with allowPrivilegeEscalation set to false, and we add a capability, NET_BIND_SERVICE, which, if you look it up in the pod security standards, I should be able to do under the restricted standard. So I expect that to be okay, but let's see what happens. Okay, it is forbidden. Why? Unrestricted capabilities. What does that mean? You have to explicitly set fields in the pod specification when using restricted; you cannot assume that default values are compliant. In this case, the error says I need to drop all the capabilities. If I don't explicitly put that in my pod specification, I will not be compliant. So here I am, dropping ALL capabilities and adding NET_BIND_SERVICE. Again, I think I've addressed this as a user. But this is what's interesting with the restricted policy: it does not allow you to rely on default values that are computed server side. It wants them up front. So that is indeed created, and we're good there.
Now I want to go through one other use case, which I think is really interesting. Check out the pod: it is stuck in CreateContainerConfigError. What's going on? It passed the policy. I thought we were good.
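The capabilities fix described above can be sketched as two container security contexts, before and after. The field names and the NET_BIND_SERVICE and ALL values are from the talk; the function restricted_capabilities_ok is a simplified, hypothetical check of only the capabilities rule, and the restricted standard has other requirements that this sketch does not cover.

```python
# Capabilities the restricted standard lets a container add back.
ALLOWED_ADD = {"NET_BIND_SERVICE"}


def restricted_capabilities_ok(security_context):
    """Simplified check: the container must explicitly drop ALL and may
    only add capabilities from ALLOWED_ADD."""
    caps = security_context.get("capabilities") or {}
    drops = caps.get("drop") or []
    adds = caps.get("add") or []
    return "ALL" in drops and set(adds) <= ALLOWED_ADD


# First attempt from the demo: adds a capability but never drops anything.
first_attempt = {
    "allowPrivilegeEscalation": False,
    "capabilities": {"add": ["NET_BIND_SERVICE"]},
}

# Fixed version: the drop is stated up front instead of relying on defaults.
fixed = {
    "allowPrivilegeEscalation": False,
    "capabilities": {"drop": ["ALL"], "add": ["NET_BIND_SERVICE"]},
}

if __name__ == "__main__":
    print("first attempt ok:", restricted_capabilities_ok(first_attempt))
    print("fixed ok:", restricted_capabilities_ok(fixed))
```

The point of the comparison is that the first attempt fails not because of what it adds, but because of what it leaves unsaid.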
I don't know what's going on, so I'm going to describe the pod in my normal kubectl debug flow. We have a really interesting error message here that we can take a look at: Error: container has runAsNonRoot and image will run as root. So only when we execute the container do we see that it wants to run as root, and we can't have any containers running as root, right? No, no, no. Now that we've seen that, we can go and fix this up. What we need to do is set in the pod specification that the pod cannot run as root. Again, this is getting these pod security best practices out there and giving you a safer operating environment for your workload. I'm specifically setting runAsUser to 65534, so this pod will now not be executed as root in the container namespace, or the pod namespace, I should say. So again, we're going to go ahead and create this. Now that I've set it to run as a non-root user, it's created, and I'll do a get pod to make sure the pod is running. Hopefully it's running this time. Excellent. So there we can see that it will be blocked as the pod is trying to execute, because it violates the policy by running as root. Okay, we're going to clean up there.
One more demo, quickly. This is how I'm going to show you the metrics endpoint so that you can go grab it. I'm using kubectl get --raw /metrics, but you could have Prometheus scrape this metrics endpoint, and we're going to grep for pod_security_evaluations_total. Now, why this is important is that you can light this up in your monitoring dashboard and actually see: hey, what have I got?
I've got a mode of enforce, with the policy level set to privileged and the version set to latest on this specific namespace, and we can see that's been hit 17 times. So you can light this up in Prometheus or any of your monitoring dashboards and know where all your violations are happening, or where all the allows are happening. This is just a great way to see externally, without trawling through audit logs, exactly what's going on. Okay. That good? You feel confident? Cool, that sounds pretty confident to me.
Okay, next steps. As I said, what are you going to walk away with as we wrap up? Go experiment. I want everybody to start playing with this and understand it. It's not scary. It wasn't scary for me, and hopefully it's not scary for you. Use warn and audit on existing namespaces. And there is a really, really cool trick. You might say to me, "Hey Lachie, I have clusters everywhere. There is stuff everywhere. How could I possibly turn this on?" That's the sub-point: you can use dry run to evaluate the violations of all the workloads that are currently in a namespace when you apply the enforce label. When you apply the enforce label on a namespace that has a bunch of pods, you will get an evaluation back that says which pods violate that policy. If you run it as a dry run, it's not going to action it; you're not going to apply the label, but you have a list that you can burn down on a specific namespace to bring all those workloads back into compliance. So this isn't just something you turn on in a brand new cluster and figure out. You have some tools to go and introspect and safely use it on clusters that already have a bunch of workloads deployed.
I always say, in general, set warn to the same level as enforce, because that helps you get some user feedback. And make a goal: you want to start at privileged and work to baseline. You want to make sure that all your namespaces are at least at baseline, and then selectively figure out which workloads could run as restricted. Think you got it?
Okay, I feel like I'm leaving everybody in good hands. This time next year, I don't want to see a talk at KubeCon saying "How pod security ruined my life, and Lachie ruined my life." But if there is, I'll sit in the front row and listen. Okay, I just want to say thank you for coming on this journey through pod security with me. Now it's handed over to you. So long, and thanks for all the fish. I do want to thank our Bridget, Tim and Jim for their review. Thank you very much. Enjoy your KubeCon booth crawl. I'll be hanging around outside here; if anybody has any specific questions, I'd be happy to talk to you and answer them. You can come find me. I'll be here all week. Thank you.