Hi, and welcome everyone to our talk about practical challenges with pod security admission. I'm Christian, I'm an engineer at VMware. I'm P, I'm a PM at VMware. I had a lot of questions about whether there would be any time travel, dinosaurs, etc. in this talk, but I'm sorry, it's just a talk. So we're going to talk about, like Christian said, practical challenges with pod security admission. My clicker just stopped working, one moment. It's DNS. Never mind.

So before we start, I just want to give the TL;DR, because then you know what to pay attention to and what's interesting to you in what we're going to discuss. We're going to talk about pod security admission. It's kind of new, but not really, it's extremely new. It's a security feature that is meant to replace pod security policies. We're going to talk about how it works. The main thing we wanted to discuss, though, is that pod security admission is quite simple to use. It's a very elegantly designed feature. The problem you're going to encounter with pod security admission is that every workload and their dog needs privileges of some sort in order to work: host privileges, network privileges, I don't know, privileges. And if you give privileges to everything, thereby bypassing pod security admission, that defeats the point of the feature. So if you give privileges to everything, all hell breaks loose. We're going to discuss how to un-hell-break-loose-ify your setup so that you can enjoy the advantages of this great functionality.

So before we start, let's take a look at what we have. First we will do a recap of the whole PSA thing, then go into the main challenges and pitfalls, get into some action, and finally cover some guidelines which help you on a day-to-day basis. Yeah, so let's do a quick recap of what pod security admission is and how it works.
Basically, you are going to have three distinct security profiles: restricted, baseline, and privileged. They restrict certain capabilities that we're going to discuss. Another axis of control is what you do about the security profile: we have audit mode, warn mode, and enforce mode. So when we talk about the security profiles, we need to see what it is that we are restricting or allowing in the first place. Yeah, so there are lots of settings in the pod spec you could take a look at to secure your containers: settings about AppArmor, host namespaces, hostPath volumes, capabilities, or privileges.

Yeah, so one thing that's funny, and I totally forgot the joke here: there's a capability called capabilities. I didn't know this and I found it hilarious. I'm certain there's like two of you with the same kind of humor as me. Isn't this funny? Another funny thing, which is not so funny because it's confusing, is that the keyword privileged is used in two different settings that mean two different things, and you're going to work with them together, and that's not confusing at all. It's as simple as enabling the capability capability when you want a capability inside your capabilities. So you have privileged as a pod security profile, and you have privileged as a setting in your pod spec, and that one relates to the Linux configuration. So when you're asking the developers in your team, do you need this to be privileged, make sure you know what kind of privilege you're talking about. Because, as Christian can tell you, sometimes you're having that conversation, you think you're getting along, and you're talking about two completely different things. Yeah, because if you just use one single host path, you already enter the privileged profile, and you are not using the privileged Boolean here.
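To make the two meanings of "privileged" concrete, here is a minimal Python sketch. It is not the real admission logic: it checks only a small, simplified subset of the fields the baseline profile forbids, and the function name, pod name, and image are hypothetical. It shows that a pod with `privileged: false` can still land in the privileged profile just because it mounts a host path.

```python
def needs_privileged_profile(pod_spec: dict) -> list[str]:
    """Return reasons why a pod spec cannot run under the baseline profile.

    Simplified on purpose: the real Pod Security Standards check more fields
    (host ports, added capabilities, sysctls, and so on).
    """
    reasons = []
    # Sharing host namespaces is not allowed under baseline.
    for field in ("hostNetwork", "hostPID", "hostIPC"):
        if pod_spec.get(field):
            reasons.append(f"{field}=true")
    # A single hostPath volume is enough to require the privileged profile.
    for volume in pod_spec.get("volumes", []):
        if "hostPath" in volume:
            reasons.append(f"hostPath volume {volume['name']!r}")
    # The privileged Boolean in the container securityContext is a separate thing.
    for container in pod_spec.get("containers", []):
        security_context = container.get("securityContext", {})
        if security_context.get("privileged"):
            reasons.append(f"container {container['name']!r} sets privileged=true")
    return reasons


# Hypothetical log shipper: the privileged Boolean is false, but it mounts /var/log.
pod_spec = {
    "containers": [
        {
            "name": "log-shipper",
            "image": "example.invalid/log-shipper:1.0",
            "securityContext": {"privileged": False},
        }
    ],
    "volumes": [{"name": "varlog", "hostPath": {"path": "/var/log"}}],
}

reasons = needs_privileged_profile(pod_spec)
print(reasons)
```

The only reason reported is the hostPath volume, which is exactly the case where a conversation about "privileged" can go sideways.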
Yeah, so the Boolean that we're looking at is one thing, the security profile is another thing. Just wanted to make that very clear. Okay, so Christian, I kind of looked at the capabilities. I know, okay, I need this capability, so I need the baseline profile, etc. Once I decide that, what do I do?

Yeah, let's look at the last confusing names. There are three modes. There's the audit mode, which, as it says, creates audit log messages. There's also the warn mode, which returns warnings about the configured policy when you use kubectl or any other client to apply things. They basically cover the same information but through different channels. So you get the same message in a different place? Yeah, exactly. And last but not least, if the force is with you, you can start doing enforcement and block pods that do not comply with your desired security profile from getting created or run.

Yeah, so we have two modes that show you information but don't really limit or restrict anything. And then you have a mode that actually stops things from happening, which is security admission as it says on the tin. So in the end, we're dealing with three settings and three settings. It's a simple system, kind of. And you can apply one of those three and one of those three, too. Yeah, either on a per-namespace basis, or you can also set it cluster-wide. But be aware, the namespace-based setting takes priority over the cluster-wide one. This allows you to enforce a more restrictive setting cluster-wide and have a single namespace which runs your single privileged workload.

Yeah, so the TL;DR, I guess, is that it's a very broad configuration. It's not very detailed at all. It's not meant to be. It's meant to be simple. When you need extremely specific control over very specific things, then you're going to want to use one of those other tools. Gatekeeper and pod security admission, for example, complement each other. They work great together.
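As a rough illustration of how the three modes and three profiles combine on a namespace, the following Python sketch builds the `pod-security.kubernetes.io/<mode>` and `pod-security.kubernetes.io/<mode>-version` labels and prints a Namespace manifest as JSON. The namespace name, the chosen profiles, and the version value are placeholders, and `psa_labels` is a hypothetical helper, not part of any Kubernetes client.

```python
import json

LABEL_PREFIX = "pod-security.kubernetes.io"
MODES = ("enforce", "audit", "warn")
PROFILES = ("privileged", "baseline", "restricted")


def psa_labels(settings: dict[str, str], version: str) -> dict[str, str]:
    """Build the two labels per mode: the profile and the policy version."""
    labels = {}
    for mode, profile in settings.items():
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode}")
        if profile not in PROFILES:
            raise ValueError(f"unknown profile: {profile}")
        labels[f"{LABEL_PREFIX}/{mode}"] = profile
        labels[f"{LABEL_PREFIX}/{mode}-version"] = version
    return labels


# Placeholder choice: block baseline violations, only report restricted ones.
labels = psa_labels(
    {"enforce": "baseline", "audit": "restricted", "warn": "restricted"},
    version="v1.25",  # placeholder version value
)

namespace = {
    "apiVersion": "v1",
    "kind": "Namespace",
    "metadata": {"name": "demo", "labels": labels},  # "demo" is a made-up name
}
print(json.dumps(namespace, indent=2))
```

Since kubectl accepts JSON manifests as well as YAML, the printed output could be saved to a file and applied, but treat it as a sketch to adapt rather than a recommended policy.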
So that's something you might want to do. So here's an example. When you're applying pod security admission to a namespace, you simply apply the labels as they are there. And when enforcing cluster-wide, you have this configuration file used by the Kubernetes API server where you can set the very same settings. There are always two settings per mode. The first one is the profile we use for the mode, like here, enforce privileged. And the second one is the version to apply. Yeah. And you're the engineer here. Is this quite right? Yeah, I think there might be something wrong. Folks, take a look. Maybe you can guess what it is. We'll come back to it later. Yeah, we're going to show you. If you guess right what's wrong here, you get free coffee at the end of the talk at the free coffee booths. We're happy to join. We're happy to join, exactly.

All right. So now we've got an overview of what the feature does. At this point, you need to think: if you're implementing this from scratch, for a new thing that you're developing right now, you're free to take a nap. We're not going to be offended, just a little bit sad. But you're free to take a nap and wake up at the last section of the talk, and then we're just going to tell you the guidelines to develop stuff from scratch using PSA. That's not necessarily a problem. The problem is most of us are not working on brand new applications. We have a ton of stuff running, a billion services here and there, et cetera. And even though PSA is quite simple by nature, when you apply it to a large, complex system that you already have, things are inevitably going to come up that you need to think about. So we're going to talk about the main challenges and pitfalls for that. So to adopt pod security admission, you can basically go through three steps.
The first step would be to pick the right profile for the workload you're currently running. The second is to start optimizing, moving the workload to a better, more restrictive security profile. And the third step would be to add continuity by taking care that you don't run into security issues with these settings in the future.

So let's start with the theory first. Step one, pick the right profile. There's a rule of thumb: if you have host namespaces in use, if your workload requires privileged or any administrative capabilities, hostPath volumes, or any other things, capability capabilities. Yeah, lots of capabilities. Then you probably need the privileged profile for this workload for now.

Yeah, so in the beginning, when we are starting to implement pod security, we're going to have a lot of stuff with a lot of privileges, because we want to get everything up and running. And as we're going to discuss in a minute, if you try to make something run with fewer privileges than it needs, you're going to end up with bigger problems than if you give it broader privileges at first and get it running. Then you can start curtailing those privileges so that it's not just everything doing whatever it wants. So the privileged profile basically means this workload can do whatever it wants. There are no restrictions. Baseline is a set of settings that is generally considered by the community to be, okay, these are the things that cause problems, so it's a good baseline to restrict those. And then restricted is a different thing. The thing is that to apply baseline, you need to be conformant already with this entire list of settings, which, when you're just trying to get the whole system to work, you don't want to do. You first want to get things to run and then you start to optimize. And I think this is what it just said. Okay. All right. So what happens? Why did we say give privileges first and then remove them?
Because if you don't give privileges to a thing that does need privileges, a lot of weird stuff is going to happen. Yeah, like if we just start enforcing the restricted profile, there will for sure be pods which don't even get created anymore. That's a good thing, because the admission works and actually blocks our workload, and we get notified because it doesn't run. Yeah. So the admission controller is, you know, controlling admission. So that's a good thing. But that's only when it works as intended.

Yeah. So there's another case: we removed capabilities, but the workload may need them, and the pod just starts to crash loop. The good thing here is we hopefully have monitoring to cover that and get notified there too. And there's another scenario where the pod may even get running, but, okay, it doesn't get ready and doesn't get served any traffic. So alerts may hopefully cover that. But the very last case is the pod gets running. It also gets ready. So we should be good, right?

Yeah, I think everybody knows this guy at work. You look at them and it looks like they're working. They tell you they're ready. Everything's fine. Zero work gets done at all. That's our employee of the month. And that's the type of pod or situation that you need to be concerned with, because this one's going to be hard to track. Yeah, maybe that one joins us for coffee later.

So these are the must-set settings, which you need to set in the pod specification, because when you set nothing, they fall back to insecure defaults. So to get down to the restricted or baseline profile, we have to take a look at them, which is the list we had in a screenshot before, and set the more secure values there to comply with the better profile. Yeah. So just to get back to the larger structure we were discussing in the beginning, you give things more privileges so that they can run.
Then you start to address the settings to optimize things. So there's a bunch of settings in your spec where, basically, the defaults are not secure enough, so they're not going to be compliant with the more advanced security profiles. You need to remove capabilities explicitly. So here are some of those settings.

Yeah, there's, for example, the runAsNonRoot Boolean, which tells the kubelet to block any pod process from running as root. Be aware, if you just set that Boolean, your pod may not start, because the user ID in the Dockerfile may still be the root user. The kubelet will check that and block it from running. Yeah. Another setting that's pretty straightforward is the seccomp profile. This is a requirement of the stricter pod security admission profiles, so it's something you need to set as well. And yeah, finally, there's the allowPrivilegeEscalation Boolean, for example, in a container's security context. And again, the capability to drop capabilities, so we have fewer capabilities in the pod. Yeah, exactly.

Now, we just talked about removing privileges, and generally that's great, because generally things have more privileges than they should, and you want to remove them as much as possible. So what do you do if a workload actually does need privileges to work, as is the case for any host things or file things or, you know, syscalls, et cetera? You set the whole thing to privileged. I have an idea: we could just set it to privileged and I can go home now, right? Okay, perfect.

Now, here's what you do. Let's say you have a service, an application, whatever it is. You know it needs privileges to work because it requires some of those capabilities we described. The first instinct is going to be: PSA works by namespace, so I'm going to set the whole namespace to privileged. That will work. That's what I just told you.
You should do that at first so that things run, but you should not stop there. Here's an example of what you can do. You can break the application into two. You have one namespace with privileges and one namespace without privileges, and you reduce the surface area of things that can go terribly wrong by a good chunk. You don't have to stop here either. Even if you're okay from the PSA side, there may still be a pod which requires privileges for only one of its containers, and the other containers in that pod could still get reduced in privileges and run more securely.

Yeah, so when we come to this point where we're triaging things inside the namespace, we're not talking about pod security admission anymore. We're talking about those settings we discussed before, where you explicitly set what containers can or cannot do. So this is not about pod security admission, but it is something you should do, because pod security admission, like we said, is not granular. When you want granularity, you need to do that yourself. You can do it manually, like we're describing here. You can use other tools like Gatekeeper, et cetera. What I just said.

And now let's go to some practical examples. We discussed before the capabilities that we are trying to restrict or not. So Christian, what types of applications are going to use those capabilities? Yeah, there's some workload I think every one of us requires when running stateful stuff, like CSI to actually get some volumes for your pods, which requires host paths or also the privileged Boolean to format or mount disks on your nodes. There's also monitoring, like the node exporter, which needs to read information from the file system, or log shippers, which of course need to read the logs. And lastly, there are also the CNIs, which for sure need the host network setting to do their work.
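The per-container settings described above can be sketched in Python as a helper that applies the explicit security context values the restricted profile expects (`runAsNonRoot`, `allowPrivilegeEscalation`, dropped capabilities, a seccomp profile) to one container while leaving a privileged neighbour alone. The container names, images, and the `harden` helper are hypothetical, and this is only a sketch: as mentioned in the talk, setting `runAsNonRoot` on an image that still runs as root will keep the container from starting.

```python
import copy

# Explicit values for the settings discussed above; leaving them unset does
# not satisfy the restricted profile.
RESTRICTED_SECURITY_CONTEXT = {
    "runAsNonRoot": True,
    "allowPrivilegeEscalation": False,
    "capabilities": {"drop": ["ALL"]},
    "seccompProfile": {"type": "RuntimeDefault"},
}


def harden(container: dict) -> dict:
    """Return a copy of the container with the restricted settings applied."""
    hardened = copy.deepcopy(container)
    security_context = hardened.setdefault("securityContext", {})
    security_context.update(copy.deepcopy(RESTRICTED_SECURITY_CONTEXT))
    return hardened


# Hypothetical pod: the driver really needs privileges, the HTTP sidecar does not.
driver = {
    "name": "driver",
    "image": "example.invalid/driver:1.0",
    "securityContext": {"privileged": True},
}
sidecar = {"name": "http-sidecar", "image": "example.invalid/sidecar:1.0"}

pod_spec = {"containers": [driver, harden(sidecar)]}
hardened = pod_spec["containers"][1]
print(hardened["securityContext"])
```

The pod as a whole still needs a namespace that allows the privileged profile because of the driver container, which is the point made above: this hardening sits below what pod security admission can express.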
Yeah, now we discussed before, in theory, that if you try to restrict privileges for a thing that does need privileges, bad things will probably happen. Some of them are easier to find, some of them are not. How do you identify what's wrong with your workload? So the good thing is PSA helps us with that. We can use the audit and warn modes to highlight what's wrong with our workloads if we want to run them in a specific profile. Like here, there's an audit log message which shows us that our pod violates a certain policy because we didn't set the allowPrivilegeEscalation Boolean, for example. Yeah, and like we discussed before, warn mode shows you the same thing, just in a different way. So here you get a warning not in the audit log, but on your command line when you try to apply the manifest.

And the cool thing here is, if this namespace is not yet scoped down, we can use kubectl label with dry-run mode to see how it would look, or whether there are pods currently violating that profile, if we start enforcing. Yeah, and if you apply any new pods or new workload, you also get messages, for example in kubectl or any other client talking to the API server. Yeah, and I think Christian alluded to this earlier, but the cool thing to do is to set the whole cluster to warn mode when you kind of don't know what's up and just want to see everything that you need to fix. And then you can start applying restricted policies to specific namespaces as you fix them and adjust those settings and partition things, etc.

Yeah, and not to forget, if the force is with us, it also blocks pods from getting created. Like here, a ReplicaSet which doesn't get to create its pod because the admission blocks it. So once you've kind of got the point and you don't get a mountain of errors anymore, then you set enforce, and that's actually going to, you know, just boot things out.
It's not going to allow things to run if they're not compliant, as opposed to warn mode, which just throws a warning but runs the thing anyway. And now we're back to talking about our favorite colleague. Yeah, so to also find out what's wrong with him, or to identify that something's wrong with him, we should start going into. Yeah, wait, wait, wait, sorry. No, I just wanted to highlight one thing: pod security admission is great in terms of showing you what's wrong, but it can only show you what it can see, and it can't see everything, because you have these applications where, for Kubernetes purposes, it looks like the whole thing is working just fine, but it's not performing the function you expected it to. So you can trust PSA up to a point, but you still need to test that everything works after you restrict it down. For example, take a look here: there's a pull request, here at Cluster API, and we did run all the necessary tests to check that everything is still working. Yeah, because for some things, to Kubernetes it's going to look fine, but it's not actually going to work. So you do need tests in place.

Yeah, so if the tests pass, it's great. But what do you do if they don't pass? Yeah, so you're going to have two different scenarios here. You think you adjusted your privileges right and you got the right profile or permissions, etc. You ran your tests, and it doesn't work. The first thing you're going to think is, oh, this just needs more privileges. But that's not necessarily true, because sometimes the functionality, let's say the actual code, the thing your program does, needs fewer things to work than what your config thinks it needs. It's very common to find configs that are more permissive than what the application actually needs. And that's going to be a very easy fix for you. So don't overlook that.
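A config that is more permissive than the code needs often just lacks explicit settings. The following Python sketch lists, per container, which of the restricted-profile settings named in this talk are not set explicitly. It is a simplified illustration, not the real admission check: it ignores init containers and other restricted requirements such as volume types, and the function name and the example pod are hypothetical.

```python
def missing_restricted_settings(pod_spec: dict) -> dict[str, list[str]]:
    """Map container name to the restricted settings it does not set explicitly."""
    pod_sc = pod_spec.get("securityContext", {})
    findings = {}
    for container in pod_spec.get("containers", []):
        sc = container.get("securityContext", {})
        missing = []
        # Must be explicitly false; leaving it unset is not enough.
        if sc.get("allowPrivilegeEscalation") is not False:
            missing.append("allowPrivilegeEscalation: false")
        # Capabilities must be dropped explicitly.
        if "ALL" not in sc.get("capabilities", {}).get("drop", []):
            missing.append("capabilities.drop: [ALL]")
        # These two may be set on the pod or on the container.
        if sc.get("runAsNonRoot", pod_sc.get("runAsNonRoot")) is not True:
            missing.append("runAsNonRoot: true")
        seccomp = sc.get("seccompProfile", pod_sc.get("seccompProfile", {}))
        if seccomp.get("type") not in ("RuntimeDefault", "Localhost"):
            missing.append("seccompProfile.type: RuntimeDefault")
        if missing:
            findings[container["name"]] = missing
    return findings


# Hypothetical controller that never needed privileges but forgot one setting.
pod_spec = {
    "securityContext": {
        "runAsNonRoot": True,
        "seccompProfile": {"type": "RuntimeDefault"},
    },
    "containers": [
        {
            "name": "controller",
            "image": "example.invalid/controller:1.0",
            "securityContext": {"allowPrivilegeEscalation": False},
        }
    ],
}

findings = missing_restricted_settings(pod_spec)
print(findings)
```

Here the only finding is the missing capabilities drop, so one added line in the manifest would be the whole fix, provided the tests still pass afterwards.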
Yeah, for example, here, upstream is also aware of the introduction of pod security admission. And there's work being done, for example for cert-manager, to comply with the least privileged profile possible, which in this case is restricted, by just adding the capabilities drop, with the capability thing, you know. But yeah. Yeah, so as you can see here, if we don't look at the config file, if we just look at the actual thing that the thing is doing, let's say, it never needed privileges in the first place. We just had to apply privileges because we forgot to be explicit and say this thing doesn't need privileges. Now that we've said it doesn't need privileges, it complies with the security policy. So this is a policy issue. It's not a functionality issue. It's something to keep in mind. And like we said, this is all you needed to do to get the thing running. So it's the easy fix. What I just said.

Okay, now, sometimes you're trying to do this thing of removing privileges, reducing the surface area of privileges in your cluster. Do we get paid for every time we say privileges? I hope not. Anyway, sometimes you actually need those capabilities because your application needs to work. It's going to need the syscalls, the host access, et cetera. So in that case, let's go back to what we discussed before: what can we do? Yeah. So again, we can separate the workload by namespaces to get into different profiles, and additionally configure the containers and drop the capabilities which are not needed for single containers too.

Yeah. So what does this look like in practice? Let's talk about the vSphere CSI controller. It has a lot of components. Christian, how do you address this? Yeah. So we have a Deployment and a DaemonSet, and we know that the DaemonSet needs to do stuff like mounting disks and formatting disks, but the Deployment should be totally fine with the restricted profile.
So what we can do, because it doesn't need any privileges, is the namespace separation thing, configure it to run as non-root, and so on. And then we're done, and this deployment is already secure. Yeah. So there are two things to think about here. One is, how did you find that this is the case? Yeah. Like again, we took a look into the code, saw that it doesn't need those privileges, and we for sure tested that it works afterwards too. And you tested with warn mode and with your own unit tests? Yeah, exactly. So the whole thing must work afterwards.

So, okay. And another thing we can do, when we look specifically at the part that's left here: we know this is going to need privileges to some extent because it needs to access the hard disk, but there's a small container which basically exposes an HTTP service that could be reached by the API server. This one doesn't need any privileges the way the other two containers do. So we can also scope that down and reduce the privileges of this container, and by that reduce the attack surface.

Yeah. So before, when we started, let's say I just read about pod security admission for the first time today and I'm thinking, well, this is a CSI driver, so it's going to need privileges because it needs to read the disk. I'm just going to set the whole thing to privileged. You could do that, but that's going to give you this whole list of places where things can go wrong. So there's a big surface area with privileges. What you can do instead is break it down. This part doesn't need anything; I can just set it to restricted. Within the part that needs something, it's like, look, there's only one thing with an HTTP endpoint here, which is what I'm more worried about. It doesn't need any extra capabilities. So I'm going to restrict this one as well.
And then we go from this whole thing being privileged in a more naive approach to only precisely two little bits of it being privileged in a more, let's say, refined approach. Yeah. So there are other examples where this should work. Take a look at the Prometheus Helm chart. There's the node exporter in there, which requires the privileges, but there's a lot of other stuff, like Prometheus itself or its additional deployments, which should be fine running restricted. So the same approach could be taken here by moving the node exporter to a separate namespace. Yeah. So we tested the CSI driver. This one we didn't, but we're just saying it's kind of the same concept. This concept is probably going to translate to a lot of the stuff you work with. So that's the general idea to keep in mind. So this is what I said: some workloads require more privileges, but not every part of that workload does. And the smaller the surface area running in privileged mode, the better.

So let's talk about some guidelines. Yeah. So we have to think of PSA preemptively, and we should make it support us through the whole development process. There are some guidelines to follow for that. Yeah. So if you're transitioning a bunch of stuff that you already have into working with pod security admission, like we discussed before, the first thing you want to do is enable warn and audit mode cluster-wide so that you know everything that is not compliant with something. You're going to get those errors. You're going to see what's going on. Start setting enforce mode on a per-namespace basis as you go on configuring things individually. After you're done with that, you can enforce cluster-wide defaults. And you should know that once you're enforcing cluster-wide defaults, that's going to be good, because you're not going to end up with regressions, because things are just not going to run, but that's going to be bad, because things are just not going to run.
And sometimes you're going to forget that you set the setting. And that's going to be problematic. But you're going to be adhering to a very high security standard. Yeah. So we're not only transitioning; we also start creating new things. For that, it would be best to create a new namespace where we already set the enforce mode to the restricted profile, which is what we want. This way we can work backwards. When we know that this workload needs a certain privilege, we can go the way back up and know that, okay, it requires that, so we have to go to that profile. And for sure, at the end, end-to-end test that it all works. Yeah. And for the things that do need privileges, like we said, don't forget: compartmentalize, split out the privileged namespaces, look at the individual pods, go fix their settings, and don't just leave everything open just because one part needs it. If you like to read, the internet has a lot of things to read, so you can do that.

And a couple more things. Yeah. In CI, you can use Kyverno, which has a CLI tool and some pre-built YAMLs to apply, or use Kyverno as a linter during your development process when you adjust some manifest, and see immediately that there's something wrong, that something is misconfigured compared to what we expect.

And here's the thing that was wrong before: we set the version to latest. So far, in the history of pod security admission, there have been no big breaking changes. We're not expecting there to be. But we don't know what the world's going to look like one or two years from now. If you put latest on stuff that you reviewed today, it might break down the line. So just pin this setting to the version that you're currently using, that you know works. And if you want to review it at some point in the future, then you do that deliberately.
You don't want to be enforcing a policy when you don't know what it is because it just got updated and you didn't notice. That's a recipe for a headache. Yeah. So at the end, we applied PSA to everything. So we're secure now, and I can close my epic, right? Exactly. So now that we've got to the end, you applied PSA, you're enforcing a restricted profile. You are 100% secure. It's Friday afternoon now. We can go have a beer, being 100% calm. There's just a couple of things you want to look at later. But you know, it's practically done at this point. Yeah, that's for Monday. Yeah. We're not on call, right?

Okay. So thanks, everybody. I hope this was helpful. We wanted to make this a very practical thing with practical examples. Easy to understand, nothing fancy. When you are working on this in practice on your systems, we hope that this is going to be helpful to review, so that you kind of have a guiding light based on the work we already did. Thanks, everybody. Thanks.