Hi, everyone. Welcome to A Hitchhiker's Guide to Container Security. Today we're going to be talking about container security through the eyes of a couple of characters and a story that we created. We'll also be showing a couple of demos. But before we get started on the talk, just a brief introduction. Hi, I'm Jed Salazar. I like mountains, dogs, and security keys. I currently work for a company called Isovalent as a security engineer, where I'm focused on Kubernetes security. Hi, everyone. I'm Tunde Oluisa. I enjoy teaching and solving complex problems. I currently work with a company called Oteemo, a consulting company that specializes in solving cloud-native problems and working with modern technology. We do this with enterprise IT companies and also in the federal government space. I currently work as a managing consultant and chief architect at the US Air Force DoD Platform One. We both formerly worked together at a small startup called Heptio, which was a great Kubernetes company. And both Oteemo and Isovalent are hiring, of course. So let's get started with the story. Once upon a time, containers, which in this story are known as remora fish, would hitchhike on nodes, or sharks, across the seas in a symbiotic relationship. Over the years, containers ran in a variety of risky ways, such as running as root and privileged. Containers used the shared node's operating system kernel for their resources, so they were known as hitchhikers. Meet Polly, a hitchhiker who runs an NGINX web server with just a few pages. Polly used to run in a dedicated VM environment, and in many ways things were easier for them in a VM because of the lack of choices. They didn't have to worry about resources, networking, and security configurations.
However, the environment that they were provisioned was too large for the few pages that they served, so Polly made the decision to hitchhike on a node as a container. Let's meet our node. This is Nome, a Kubernetes control plane node. Nome is super laid back and will run any containerized workload that is scheduled to them. Also, like other nodes, they'll run workloads with any settings the containers ask for, even if they're risky or dangerous. Nome always runs the newest version of Kubernetes, and they take pride in providing a safe environment for their hitchhiker friends. So we've talked about containers a couple of times, but we haven't touched on what a container actually is. So what is a container? Containers are Linux processes that run in the context of a namespace. Namespaces provide isolated resources for containers, like volumes and networking, making a container think it's the only container on the node due to this isolation. There are a lot of namespaces, but some of the key ones include the PID namespace. Every process in Linux gets something called a process ID, or PID. A PID namespace masks other containers' PIDs, or their processes, so your container only sees the processes running in its own container. There's also the mount namespace, which is used to give containers file systems and access to some directories on the node. This namespace is used to unpack the container image as a file system for the container. As you can see, Polly also has a network namespace here. A network namespace allows containers to have network interfaces and routing tables so that they can send and receive traffic. Not shown here, but also part of containers, is something called cgroups. Cgroups are used to limit the CPU and memory resources that a container can consume.
In the context of security, this is nice because it prevents what's known as a noisy-neighbor container from taking all of the node's resources away from other containers. So Polly knew containers were awesome, of course, but they were also a lot of work. So Polly asked their friend, Nome the node, if they could help them get started hitchhiking as a container, and of course Nome was stoked to help their friend. Polly was given a lot of choices to get started. Overwhelmed with the choices and really just wanting to get things to work, Polly chose a privileged setting. This gave Polly unrestricted access to all of Nome's resources and disabled any safety precautions. While this made it easy for Polly to get started hitchhiking, they didn't know what these privileges meant or the risks they could introduce, both to Nome as well as other containers. As we can see here, Polly is using Nome's resources directly instead of their namespaced resources. What could these privileges enable in the real world, and what risk could they introduce for Nome and other containers? So here we'll actually break into a demo which showcases what a pod, or a container, can do when it's given unrestricted access to a node's resources. In this case, we'll actually be using Nome's Kubernetes admin resources to effectively break the boundary between a container and a node, escape out of the container, and basically use Nome's resources for bad. Let's take a look. All right, so what can Polly do with the privileged permissions they've been granted? Let's take a look. I've got a couple of manifests; we're gonna be using both in this demo. The first one is Polly's manifest itself.
We'll take a look at this manifest and jump through the important attributes that are specific to this demo, but we also have another pod manifest here called incognito, which we'll be going over soon. But for now, let's take a look at Polly. Let's highlight some of the important attributes about Polly that are actually gonna facilitate us, in this case, jumping out of the container and doing a container escape. The way that I like to think about a container escape, each time that I do it, is the chest-burster scene in Alien. But let's take a look. All right, so we have a couple of things in the spec here that are gonna be particularly interesting: hostPID and hostNetwork. Remember back in the presentation where I was describing namespaces, for example the PID namespace and the network namespace. The PID namespace basically meant that, in Linux, every process gets a process ID, or PID, and the PID namespace makes it so that the container can only see the PIDs within its own namespace. What we're asking here is to basically disable that altogether. So we're gonna see the PIDs coming directly from the host: we'll be able to see the PIDs both for anything that Polly creates as well as anything else that exists within Nome the node. But also hostNetwork. In this case, if you remember, a network namespace generally creates a network interface and routing tables in that namespace, allowing you to send traffic. Basically, we are removing that namespace altogether and using the host's network interfaces and routing tables directly. So what that means is, if we have something like network policy, which is kind of like a firewall for your pods, we're able to circumvent that network policy and send traffic directly out as the node. Moving forward, we have a couple of other things I wanted to discuss. We have this concept called volume mounts.
Basically, what this means is that inside of my container I'm going to actually mount a /kubernetes directory. And if I jump down here, we can see that the thing that I'm actually going to mount in that directory is from the host itself. So we're going to mount the /etc/kubernetes directory from the host inside of the container at /kubernetes. Now, this /etc/kubernetes directory is particularly dangerous. It tends to contain things like all of the credentials that are used to basically make up the cluster, as well as some pod manifests that are run locally by the kubelet, which we'll take a look at here soon. And then lastly, to make this even more evil and powerful, we're going to specify that Polly should land on a control plane node. So when we look at this /etc/kubernetes directory, we're going to be looking at it from a control plane node, which gives us additional resources to take advantage of. And of course, as mentioned, Polly is actually going to be running with a security context of privileged: true, so we have the option to decide whether we want to use these namespaces that we defined or not. Awesome. So what we want to do now is: I've created a namespace called polly. Actually, it looks like I haven't yet, so let's go ahead and do that. And then we're going to deploy Polly into that namespace. By the way, I actually have an alias of k for kubectl. So in this case, let's go ahead and apply Polly to the polly namespace and get Polly created. Yeah, it looks like Polly is running. So let's go ahead and exec into Polly. All right, great. So we're in Polly now. And as mentioned, we went ahead and mounted that /etc/kubernetes directory into Polly. So if we just do a list here, we can see that we actually have that /kubernetes directory that I mentioned. So let's move into this.
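Putting the attributes just walked through together, Polly's manifest might look roughly like the sketch below. The image, names, and node label are assumptions (the talk doesn't show the file verbatim), but hostPID, hostNetwork, the /etc/kubernetes hostPath mount, the control plane placement, and privileged: true are exactly the pieces described above:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: polly
  namespace: polly
spec:
  hostPID: true        # see the node's PIDs, not just the container's
  hostNetwork: true    # use the node's interfaces and routes, bypassing network policy
  nodeSelector:
    node-role.kubernetes.io/control-plane: ""   # assumed label; older clusters use .../master
  tolerations:
  - key: node-role.kubernetes.io/control-plane  # tolerate the control plane taint
    operator: Exists
    effect: NoSchedule
  containers:
  - name: polly
    image: nginx        # assumed image; any shell-capable image works
    securityContext:
      privileged: true  # disable the container's safety precautions
    volumeMounts:
    - name: kubernetes-dir
      mountPath: /kubernetes      # appears inside the container at /kubernetes
  volumes:
  - name: kubernetes-dir
    hostPath:
      path: /etc/kubernetes       # the node's credentials and static pod manifests
```

Note that none of this requires elevated Kubernetes credentials to request: as the demo shows, a pod spec simply asks for these settings, and a node with no admission policy will grant them.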
If you list the contents, again, we're actually looking at Nome the node's /etc/kubernetes directory directly. And what we see here is we actually have quite a bit. First and foremost, we have this file admin.conf. This right here is basically the keys to the kingdom: all of the authentication for the cluster itself, which gives you Kubernetes admin. So at this point, if you've just made these configurations in your pod, mounted this in, and utilized this data, you could basically do whatever you wanted with the Kubernetes cluster. You could impersonate any component, launch any pod, join any node. The possibilities are endless. But what we want to do in this demo is actually take advantage of this directory here: the manifests directory. In each node there's a component called the kubelet, and the kubelet is responsible for running pods and making sure that those pods are constantly running. So if a pod fails for whatever reason, the kubelet will basically restart it. And you can see here that inside of this directory, we have all of the Kubernetes control plane components that are run from the kubelet: the API server, controller manager, scheduler, et cetera. So if we wanted to, because we have direct write access into this directory, we could basically inject our own API server or tamper with the controller manager, et cetera, et cetera. But what I want to do is showcase this little-known Kubernetes feature: static pods. And what I'm gonna do is basically insert that incognito pod and have the kubelet run it directly. Most notably, this pod is actually going to be running inside of a Kubernetes namespace that doesn't exist. So we just created the polly namespace.
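To make the trick concrete: a static pod spec of roughly this shape, dropped into /etc/kubernetes/manifests, is all the kubelet needs. The image, names, and namespace here are assumptions standing in for what the demo uses:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: incognito
  namespace: does-not-exist   # a namespace the API server has never heard of
spec:
  hostPID: true
  hostNetwork: true
  containers:
  - name: incognito
    image: alpine             # assumed; the demo picks an image with hacking tools
    command: ["sleep", "infinity"]
    securityContext:
      privileged: true
```

Because the kubelet runs static pods directly from this directory, it restarts them whenever they stop, and the mirror pod it would normally report to the API server can't be created in a namespace that doesn't exist, which is why kubectl never shows it.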
We're actually gonna be just running this and pointing to a namespace that doesn't exist. And the reason this is important is that Kubernetes will not know about the pod as a result. So if we run something like kubectl get pods, or look at the Kubernetes audit logs, or just ask about pods in general, it will basically be invisible to the Kubernetes cluster. So let's do that. I'm just gonna quickly write out this incognito pod spec into this directory, and what this is gonna do is have the kubelet actually run it for me. So we can take a look at this. There's a couple of things here that are interesting, namely this namespace that doesn't exist. So again, this is a pod spec. Its name is incognito, and it's also gonna be running as privileged. We actually picked an image that gives you some hacking tools, so we can take advantage of that, and if we ever needed to, we could jump into this pod and do things like send network connections: basically have the tools we need to be able to continually own the system. Great. We should actually see this pod running locally if we SSH into the node. And the way that we're gonna do that: because this is a kind cluster, and kind is basically running Kubernetes nodes as Docker containers, I'm actually just going to exec into the node directly, which in this case is kind-control-plane. And what I'm gonna do is show the locally running pods with a tool called crictl. And you can see that I do actually see this incognito pod running in this namespace that doesn't exist. And remember, because this namespace doesn't truly exist in Kubernetes, Kubernetes won't actually know about it. So we have this running locally. And basically what I want to show you is that if we jump out of the control plane node and ask Kubernetes about this pod directly,
we're not actually gonna be able to see anything. So in this case, what I'm gonna do is say: let's get pods, and I wanna have all the pods listed. And you can see here that we have Polly shown; Polly's running in their namespace. And we have all of the kube-system pods that we have running here. But what we don't see is our incognito pod. What this means is that the kubelet is going to be running our incognito pod secretly and invisibly in the background, and if for whatever reason that pod stops or needs to be restarted, the kubelet will dutifully do so. So it's kind of a high-availability command-and-control, or incognito, system that we have running. That shows you just a little bit about what you can actually do if you pull node resources into a pod. For some of these configurations, all you really have to do is ask in the pod spec, and by default you'll be granted these privileges. Cool, thanks for watching, and let's get back to the talk. So Polly now understands the implications of using a privileged container and knows that the ability to grant such permissions should be controlled. So what Polly does is employ OPA Gatekeeper. OPA stands for Open Policy Agent, and OPA Gatekeeper is a subproject of the OPA project itself: a Kubernetes-native solution that provides a way to validate what goes onto Nome. It does enforcement of policies and provides governance rules specified by Polly. Gatekeeper is able to inform Polly of violations of the rules that have been specified, via its auditing feature, and also enforce the denial of malicious configurations to Nome by any bad actors. As you can see from the image, in this case Gatekeeper denies this specific malicious crypto-miner hitchhiker from applying dangerous pod configurations to Nome. Let's go ahead and see Gatekeeper in action. So what we have here is a Kubernetes cluster running version 1.19.4.
We'll be using this cluster for deploying OPA. The first thing we do is create the namespace gatekeeper-system so we can install our OPA components in that namespace. So we go ahead and apply our OPA configurations. Now that the Gatekeeper components are installed, we can validate configurations when a resource is created, updated, or deleted. As part of this installation, two objects in particular are installed: one of them is known as the constraint, and the second one is known as the constraint template. These are custom resource definitions. A constraint can be thought of as a given set of rules or requirements that must be met by a configuration being admitted to the cluster. If the constraint is not satisfied, the configuration is rejected by Gatekeeper. Similarly, the constraint template holds the logic that enforces a defined constraint. Before a constraint can be created, its constraint template must be created. So now let's verify that Gatekeeper installed correctly. And now you can see that we have Gatekeeper installed and its components are running. So the next thing we'll do is look at applying the specific constraints and constraint templates that we're going to use to enforce our rules. The first constraint template that we'll be looking at in this case is the constraint template for hostPaths, and this is the logic that enforces that. So let's go ahead and apply that. Great, that is applied. Next we'll apply the constraint template for the privileged pod. Remember, this is the logic that enforces the constraints that we'll end up applying, which are the given set of rules or requirements that must be met by configurations being applied, whether by a pod or a deployment or Kubernetes objects in general. So in this case, we're applying the constraint template for the privileged pod. Remember, this is the logic, right?
It's essentially returning a message that says privileged containers are not allowed, if privileged is set in our pod configuration in this case. So this contains the logic. So we apply that. The next thing we're going to do is take a look at the constraints. The first constraint we'll look at is the deny-hostPath constraint. You can see that we're excluding the namespace kube-system and we're selecting the kind Pod. Part of what OPA allows us to do is set parameters, and here we're saying that the only allowed hostPath is one specific directory, and it has to be read-only. So let's go ahead and apply that. So we apply the deny-hostPath constraint; that's applied. Next, let's look at the privileged container constraint and then apply that too. As you can see here, we explicitly have the line enforcementAction: deny, but this is actually implied; by default, OPA sets this to deny. And this is applied to Pods, and it excludes the namespace kube-system too. This is because kube-system is the system namespace and has pods running that require privileged containers. So great, now we've gone ahead and applied that. The next thing we're going to do is change into our example manifest directory, and we're going to run examples of configurations that violate our rules. The first one in this case is nginx-privileged. And you can see here, we're setting the security context privileged: true. This violates our rule. So let's go ahead and apply that and see what happens. So we've tried to apply that, and we get that response back from our constraint template logic that says that privileged containers are not allowed. So again, it returns the reason why it's failing to us. The next example we'll look at here is the hostPath example. Remember, this was exploited in the last demo we showed.
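Before moving on, here's a sketch of what the privileged-container template and constraint pair just applied might look like. The names and Rego are assumptions modeled on the Gatekeeper policy library, not the exact files from the demo:

```yaml
apiVersion: templates.gatekeeper.sh/v1beta1
kind: ConstraintTemplate
metadata:
  name: k8spspprivilegedcontainer
spec:
  crd:
    spec:
      names:
        kind: K8sPSPPrivilegedContainer
  targets:
  - target: admission.k8s.gatekeeper.sh
    rego: |
      package k8spspprivileged

      # The logic: flag any container that asks for privileged
      violation[{"msg": msg}] {
        c := input.review.object.spec.containers[_]
        c.securityContext.privileged
        msg := sprintf("privileged containers are not allowed: %v", [c.name])
      }
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sPSPPrivilegedContainer
metadata:
  name: psp-privileged-container
spec:
  enforcementAction: deny        # the default, shown explicitly as in the demo
  match:
    kinds:
    - apiGroups: [""]
      kinds: ["Pod"]
    excludedNamespaces: ["kube-system"]
```

The template carries the reusable logic; the constraint binds that logic to Pods and scopes it, which is why the template must exist before the constraint can be created.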
In this hostPath example, we're trying to mount /etc/kubernetes. So let's go ahead and apply the configuration and see if it's admitted into our Kubernetes cluster. Again, we tried this, it violates our rules, and we get a response back that says this path is not allowed. And so what we're now seeing is that configurations that violate our rules are being denied entry. And this concludes our demo. OPA Gatekeeper is a great solution, among many other tools within the cloud-native ecosystem, that's able to provide policy enforcement and governance for objects inside of a Kubernetes cluster. Awesome. So that demo really shows how OPA, or an admission controller, can restrict unsafe or overly privileged container or pod configurations from being admitted to the cluster. So what does this mean for our story? With these protections in place, Polly and the other hitchhikers learned the value of friendship and symbiosis. Symbiosis is when two systems work together to the advantage of both. The nodes love carrying their friends the hitchhikers, and in return, their hitchhiker friends pledged to never run as root or privileged again. This ensured that everyone in this environment got to run safely. Thanks so much for listening.