Warning: the following awesome presentation contains slides with flashing images for storytelling purposes. Any events described may or may not be fictional. Hey, you must be new to this co-working space. Let me introduce myself via the form of a musical number. I'm just Ozzy, engineering skills x10. Is it my life to write YAML again and again and again and again? People love me here, partially because I'm so great at singing in public places when everyone else is trying to work. But with my 10x ability, there is no 'we' in this work today. It's an 'Ozzy works' co-working space. I work for BigCorp, and I'm kind of a big deal there. But I'm sure you already knew that. BigCorp sent me to KubeCon 2023 in Chicago last year, and I'm still wearing the t-shirt. There have been no security incidents since I've been back, and I think this t-shirt has given me good luck. That's why I haven't washed it, and you've probably noticed the smell. What? That smell? No, that must be my lunch. I've been microwaving some fish, and I get so many compliments here about my food, such as, 'Ozzy, you microwaving fish again?' Sorry, co-working space. So, what's that? You're just getting into Kubernetes too? Let me show you the perfect cluster that I single-handedly created. So this is Ozzy's production cluster. We already have BigCorp running production workloads on it. I don't know what everyone has gone on about, actually; Kubernetes isn't so hard. I've moved all of BigCorp's microservices into containers. Lift and shift, baby. And these are running pods; you can see them at the bottom of the cluster. I've set up Services to manage the networking between our workloads, so all the microservices can talk to each other. BigCorp already had a surge of demand after we put the cluster in place, and we scaled to meet their needs. HPA! All this from a three-day conference. A technical masterpiece. Everyone in this co-working space is so lucky to have me here today. I am the perfect engineer.
Right, I need to go talk to management. I deserve a huge promotion. I'm going to go and do a warrior pose next to some random person so they know how important I am while I'm having these conversations. Hey, watch where you're going. Oh, excuse me. I'm so sorry. Hi, I'm Nova. I'm here in this co-working space today too. I'm so sick of that Ozzy guy walking around here like he owns the place. He needs to be brought down a peg or two, am I right? And I think I'm the one to do the job. You see, I look pretty friendly, and I really am deeply kind. For example, I buy my friends lavish birthday gifts with money that I make stealing data from people in co-working spaces like this one. People really should be more careful. So basically, I hang out here, I pretend to work, but really I'm watching people, and sometimes I see someone who gets a bit excited and leaves a desk with their computer unlocked. So I have this tool called a Rubber Ducky. It looks like an innocent USB flash drive, but to the computer, it's a keyboard typing at superhuman speeds. So I'm going to plug it into Ozzy's computer and I'm going to cause some chaos. While I do some stuff with this Rubber Ducky that I'll tell you about later, I'm also going to do more to mess with him in the short term. First, I'm going to apply a YAML to run my own workload in his cluster, and I'm going to take away his admin RBAC, so he can't easily stop my running application. Now let's pull up my app in his browser. Okay, great. Now I'll lock his computer. Oh my God, look at this lock screen. This guy is something else. Okay, I've got to get back to my desk before someone sees me. Yeah, okay. So management said they're going to call me back later because they're kind of busy right now, apparently, right? Let me just unlock my computer and wait. 'Nova rules, Ozzy drools.' First of all, that's not true. The pillow was already wet before I went to sleep, actually. And why is that URL pointing at BigCorp?
Nova has somehow accessed my new cluster. Well, I know Nova. I mean, I know of them. Some fool got pwned by Nova the other day. It was the talk of the co-working space. Apparently Nova got some pictures of them, and they transferred Bitcoin monies to Nova's wallet. And Nova is still asking for more. There was an investigation here to find out who Nova is, but no one's found anyone wearing a dark hoodie and an anonymous mask. I have to hurry up and delete this site before anyone else sees it. Shoot, I'm getting an error. I can't delete their workloads. The error message looks like my RBAC has changed. That's super weird. I gave myself most privileges so I can run this cluster. What's changed? Luckily, I have an offline kubeconfig file with cluster-admin credentials on a USB key that I keep in my wallet, alongside a photo of myself. All right, let's plug that in and switch to that kubeconfig file. Great, I have cluster-admin access again by using the admin profile. I'll use this profile to make changes to the cluster to harden it. I should be careful about managing this admin account, though. We might need to access it in a break-glass emergency, but we should have processes in place to make sure we know who can access it and when. Maybe having it in my wallet isn't the best idea. We could use a secrets manager to store it securely, but equally, having it offline in a safe might be a good option too. So, still using the admin account, I'm able to remove Nova's website. Goodbye, Nova's site. But I'd better try harder. Let's update my RBAC so that instead of having most privilege, I have least privilege when I connect to the cluster with my Ozzy account: enough to do the work I'm supposed to do, and nothing more. Right. Thinking about it, the API shouldn't be available to everyone on the public internet either. We already have strong security in place, via our cloud provider, to control who can access the internal networks of our cluster.
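The least-privilege setup Ozzy describes could be sketched roughly like this; the namespace, role name, and username are illustrative assumptions, not details from the talk:

```yaml
# Illustrative least-privilege RBAC: the everyday "ozzy" account can
# only view workloads in one namespace, nothing cluster-wide.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: bigcorp-apps        # assumed namespace
  name: app-viewer
rules:
  - apiGroups: ["apps"]
    resources: ["deployments", "replicasets"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["pods", "services"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: bigcorp-apps
  name: ozzy-app-viewer
subjects:
  - kind: User
    name: ozzy                   # assumed username
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: app-viewer
  apiGroup: rbac.authorization.k8s.io
```

The break-glass cluster-admin credential then lives outside this entirely, in a secrets manager or an offline safe, with a process around who may use it and when.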
So, using my cloud provider, I'll put the kube API onto a private network to ensure that outside systems can no longer access it, but people in BigCorp can, and they can interact with it if needed. This is more secure, but I've hit a problem. How do I manage what's running in my cluster if I don't have direct access to the kube API? I can only view the cluster now with my Ozzy account, but how do I create, apply, delete, or patch my workloads? I heard about this thing called GitOps, actually. It's like infrastructure as code, but for my cluster. I can put all my YAML into a Git repo and then use GitOps to monitor that repo for any changes to the cluster configuration. I learned about CNCF GitOps tools like Argo CD, Flux, and the Carvel kapp-controller that can help me achieve this when I was in Chicago last year. Right. Then the GitOps tool will be the one to interact with the kube API to deploy all the company's workloads. This all happens within the private network. But how does that help here? Well, instead of having configuration sent into the cluster, the cluster can go out and check a Git repo. Instead of giving away a key to the cluster that can run any workloads, I can give the GitOps tool read access to the Git repo and let it update the cluster via the kube API. Then I just need to secure the Git repo. Instead of the kube API being the threat boundary, the threat boundary becomes the Git repo. Right. Actually, I'm also going to use Kubescape. Kubescape is a CLI tool that scans clusters, YAML files, and Helm charts, detecting misconfigurations. So I'm going to look into these results now to see if there's anything else I should change in my cluster. I wonder what's going on over there. He seems pretty busy. But I've been watching my BigCorp website in his browser, and sadly, he's taken it down. His loss. It was a masterpiece. So, remember I used that innocent-looking Rubber Ducky to interact with Ozzy's computer? Let me tell you what I did exactly.
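As a rough sketch of the GitOps pattern described above, an Argo CD Application like this tells the in-cluster controller to pull desired state from a repo; the repo URL, path, and namespace are assumptions for illustration:

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: bigcorp-workloads
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://git.example.com/bigcorp/cluster-config.git  # assumed repo
    targetRevision: main
    path: workloads
  destination:
    server: https://kubernetes.default.svc   # the cluster Argo CD runs in
    namespace: bigcorp-apps                  # assumed namespace
  syncPolicy:
    automated:
      prune: true      # delete resources removed from the repo
      selfHeal: true   # revert changes made directly against the kube API
```

With selfHeal enabled, a workload applied straight at the kube API, the way Nova's was, gets reverted to whatever the Git repo says.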
First, I made a copy of his kubeconfig file, so now I know the IP address of that cluster he made and won't stop talking about. Actually, since I have this IP address, maybe I can put my website back up, if his kube API server is still public. Hold on. Since I know the IP address of Ozzy's cluster, I can use this software called Nmap to scan the ports, and it tells me he's made his kube API server private. Bummer. Oh well, there's still a lot I can, and will, do. So, back to the matter at hand. With the Rubber Ducky, I was able to procure the IP address of Ozzy's cluster from the kubeconfig file. I was also able to get Ozzy's private SSH key, so now I have his identity. I also got his SSH config file, so I know where Ozzy's private SSH key can be used. And the cherry on top: I took the Git config files, which show me the URLs of all of the Git repositories that Ozzy's contributed to lately. With this information, I can cause some real trouble making Git commits as Ozzy. Huge promotion. Yeah, Ozzy, yeah, yeah, yeah. Oh. That wasn't about my promotion. Apparently, BigCorp is losing money on each sale on the website. Instead of selling our product for $100, we're selling it for something around $15. It costs us $40 to make the product, so we're losing money right now. I linked our sales to a spreadsheet for the finance team. Let's check that out. Is your microphone working? Welcome back to YouTube. I see that instead of making around $100 per sale, everything is $13.37. What kind of idiotic, rookie-liability coworker of mine changed the price of our product? Let me check the Git commits for the payment service. It was me?! I didn't change that code. I've been here all this time. You've all seen that. Here is my commit that sets the value to $13.37. Someone must have my identity. All right. Let me just revert the Git commit. If only I could remember how to revert a Git commit. Let me check ChatGPT for one moment. Great.
I think that's how I do it. That looks okay to me. Let's enter. Has that worked? No. Be quiet, please. I'm doing some super serious work here. I'm looking at some messaging. 'I feel terrible for causing you any inconvenience.' Every day, it seems. Wait. Wait. Someone just committed as me again. The price is back to $13.37. They've reverted my revert. How did I set up my Git account? I made sure I was secure, and I used a private key that's only on this machine. A private key is better than a password, but how has someone got access to my private key? I'm going to have to revoke my SSH key to stop any further commits. I've removed that from our Git repo. Let's revert the commit now, but this time I'm going to sign the commit with gitsign. More about that in a bit, but for now let's check the log, and that looks good to me. We've reverted the revert of the revert. Perfect. One way to sign our commits is to use a tool such as GPG. This provides a way for me to manage and rotate keys, and I could store the key on a YubiKey rather than having it on my laptop. But in this instance, I used a tool called gitsign, which provides a certificate to sign a Git commit when I verify myself with OpenID Connect, like you would on Google, for instance. The signature is then recorded in Sigstore's transparency log, so that the commit can be verified against my identity publicly. So now let's go back to the spreadsheet, and back to $100, and we're okay again. Well, gee, captain, it's only Wednesday. Okay, this shows how important Git is to us: not just my Git commits, but access to Git in general. If someone has my password, they could add a random key and commit as me again. So I'll also set up multi-factor authentication. That way, to gain access to the repo, people will need to know my username and my password, and have access to my multi-factor authentication device. Why didn't I get this set up on day two of the conference? I should have set this up long ago.
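The gitsign setup mentioned here boils down to a few git config switches; these follow gitsign's documented setup (drop --global to apply them to a single repo instead):

```shell
# Tell git to sign every commit and tag, and to use gitsign
# (Sigstore keyless signing) as the x509 signing program.
git config --global commit.gpgsign true
git config --global tag.gpgsign true
git config --global gpg.format x509
git config --global gpg.x509.program gitsign
```

On the next `git commit`, gitsign prompts for an OpenID Connect login and records the signature in Sigstore's public transparency log.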
I guess I was just too busy thinking how good I was. Wait, what? Back to me already? Ozzy wishes. I'm three steps ahead. Ozzy's about to notice that his cluster is running some random container image for his message queue instead of the company-approved image. Why is my cluster running some random container image for the message queue instead of the company-approved image? Well, that's weird. Our messaging seems to be working just fine. There haven't been any health status alerts. This must have been from a commit earlier on. Well, Nova seems to be a great multi-tasker. Or is there more than one Nova in here? All right, focus, focus, focus. We need to look into the message queue. It looks like an image is being pulled from a random registry off the public internet. We pull all our images from a registry we own with our cloud provider. Why would we pull a random image? Let's describe the message queue. Looks like the image name is nova-images generic-message-queue-app, tag 1337. Nova! All right, let's see what this message queue container is running. I can see that data is being sent to a random address, and I don't recognize that address at all. It's not part of the BigCorp IP range. Oh, no. Nova's nefarious message queue app is also sending information outside of the cluster. This can't be good. Why would it need to send information out to the internet? Right, I'll add Kubernetes network policies to ensure that the message queue doesn't have egress access. The message queue app should be taking requests from our website app only and passing them to our database app inside the cluster. There is no reason for any part of the message queue to be sending data out of the cluster. So, while Ozzy's been distracted with prank websites and product prices changing, I've been doing my real money-making attacks.
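An egress lockdown like the one Ozzy applies might look like this; the namespace and pod labels are assumptions:

```yaml
# Pods labelled app: message-queue may send traffic only to the
# database pods in the same namespace -- no internet egress at all.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: message-queue-egress
  namespace: bigcorp-apps          # assumed namespace
spec:
  podSelector:
    matchLabels:
      app: message-queue           # assumed label
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              app: database        # assumed label
```

Once a pod is selected by a policy with `policyTypes: [Egress]`, anything not explicitly allowed, including Nova's exfiltration address, is denied.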
So this whole time, the company message queue has been running my image from my container image registry, my message queue app, instead of the company-approved one. My app seems normal, but I'm able to capture all of the information that moves through the message queue. And I'm going to sell it for big money. Wait, what? The information stopped being sent. Ugh! Ozzy must have put a Kubernetes network policy in place. As much as I hate to admit it, Ozzy's starting to get better at Kubernetes security. Lucky for me, Ozzy didn't notice the other container I put into his cluster. I still have one more big trick up my sleeve. Right, let's fix this message queue first. I'll restore the BigCorp message queue by reverting the Git commit that changed it to Nova's application. And I've already set network policies: I've clearly defined which apps in my system are allowed egress, and the message queue app isn't one of them. But I can do better than that. Let's take a step back and look at our cluster again. Our cluster is an open network once you're inside it. What if someone is inside the cluster? How could they intercept network calls between pods? It'd be good to have encryption set up, like how HTTPS is used to prevent person-in-the-middle attacks within a co-working space. My cluster is like a co-working space for my apps. What happens if one of the apps starts intercepting internal network requests? So I'll add a service mesh like Istio, Linkerd, Kuma, or Cilium. A service mesh is used to set up a network inside our cluster, and it has some additional features that we didn't have before. We can encrypt all traffic between pods so that internal traffic can't be intercepted. This is what we call mutual TLS, or mTLS. As well as mTLS, we can add authentication, which ensures that the services communicating with each other are who they claim to be. This prevents impersonation attacks.
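With Istio, for example, the mTLS step amounts to a single mesh-wide resource (a sketch; the other meshes named above have their own equivalents):

```yaml
# Applied in Istio's root namespace, this requires mutually
# authenticated, encrypted traffic for every workload in the mesh.
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: istio-system
spec:
  mtls:
    mode: STRICT   # reject any plaintext pod-to-pod traffic
```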
We could also use automatic certificate management, which simplifies the operational overhead of mTLS. Certificate mismanagement was exploited in a famous hack in 2017 that cost a company half a billion dollars. A service mesh can further enhance our network policies, reducing the attack surface by ensuring only authorized services can connect to each other. And finally, a service mesh can add auditing and monitoring, allowing for better anomaly detection, so we can start tracing unauthorized or suspicious requests. I guess a service mesh could be kind of a big deal here. Now, what about those workloads coming into our cluster? If someone can access the kube API, they can apply whatever YAML they like, and our cluster will run it. That's what Nova did. Let's see about using a cluster-level policy tool like Kyverno or OPA to put an admission controller in place to make sure our workloads meet our requirements. Then we'll add a rule that any image in the cluster must come from BigCorp's internal registry. This prevents random images from being run in the cluster. Right, done, and I am feeling great. Now, what's that? The policy that I just added notified me that there's a problem. The policy fails for another image being used in the cluster, an image coming from a registry outside of the cluster. nova-images generic-build-app, 1337, again. Oh no, not Nova again. While I was still able to commit as Ozzy into Ozzy's Git repo, back before he set up multi-factor authentication, I started running a container that was disguised to look like Ozzy's company's build service. It's a common rookie mistake for orgs to give extra privileges to their build service containers, and it's a mistake that I now intend to exploit. Ozzy isolated his build service in a CI/CD namespace of the cluster. Here, let me show you. There, so here's the build service in the CI/CD namespace.
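The registry rule could be expressed with a Kyverno ClusterPolicy along these lines; the registry hostname is an assumption standing in for BigCorp's real one:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-registries
spec:
  validationFailureAction: Enforce   # block, don't just audit
  rules:
    - name: require-bigcorp-registry
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Images must come from BigCorp's internal registry."
        pattern:
          spec:
            containers:
              - image: "registry.bigcorp.example/*"   # assumed registry host
```

A rule like this is what would flag an image such as nova-images generic-build-app pulled from an outside registry.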
Ozzy put measures in place to isolate the CI/CD namespace, thinking it will protect him from attack. But from that privileged running build service container, I'll use a tool called nsenter to connect to a different Linux namespace on the host machine. Now I connect to the host process namespace on the node, and that gives me enough privilege to gain access to the whole machine as root, and I can see everything on the entire machine. Ha, ha, ha, ha, ha, ha, ha. Yes, maybe. Quiet, please. I'm going through a serious security incident right now. So, back to nova-images generic-build-app 1337... Wait, 1337 is hacker speak for 'leet', as in 'elite'. Nova's making me look a right noob right now. Let's look at what is being run. Not only is Nova running a random image, Nova has given a number of privileges to that container, including the process namespace of the host machine. But thinking about it, if you run a container with privileges like that, I'm pretty sure you'll be able to move laterally onto the machine running the containers. If that's the case, I'm not in a good position right now. Right. Let's update our runtime policies to prevent this kind of profile from being used nefariously. Using a runtime security tool built on eBPF, like Falco or KubeArmor, I can observe everything happening at the kernel level and enforce policies that I create. eBPF is a technology that allows code to run in the Linux kernel without changing kernel source code or loading kernel modules. eBPF gives us a lot of the same features as a service mesh, but for the kernel. It gives us lots of power, but as we all know from the internet, with great power comes great responsibility. The kernel is outside of the Kubernetes cluster, but I can track what is going on in the cluster at the kernel level. And not just the cluster: everything on the machine. So here I am. I have root access to this whole machine in Ozzy's cluster. Let me see if I can listen in on any of the traffic going across the network.
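A Falco rule for the nsenter-style breakout described above might look like this; the condition uses macros from Falco's default ruleset, and the exact rule is illustrative:

```yaml
- rule: Launch nsenter in container
  desc: >
    Detect a process inside a container trying to enter other Linux
    namespaces, a common step in container-escape attacks.
  condition: spawned_process and container and proc.name = nsenter
  output: >
    nsenter launched inside a container
    (user=%user.name container=%container.name cmdline=%proc.cmdline)
  priority: WARNING
```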
Ugh, the traffic's all encrypted. Drats, Ozzy must have preemptively added a service mesh. Dang, he's getting good. Well, with eBPF, I can track what's going on in Ozzy's cluster at the kernel level. And it's not just Ozzy's cluster, it's everything on Ozzy's entire machine. So here's what I'm going to do. I'm going to use this eBPF-based tool called boopkit. As a side note, boopkit was created by Kris Nóva, a cloud-native security researcher, a wonderful human, and a friend. She inspired my name for this talk today, but she used her hacking skills for good, not for evil, like me. Right, I'm going to use boopkit so that I can manipulate Ozzy's running kernel. The first thing I'll do is create a backdoor. That way I can get back here easily at any time without having to break out of a container again. Next, I want to use boopkit to see the system calls on Ozzy's Kubernetes node. Perhaps there's some data here that I can sell or hold for ransom. But Ozzy's getting good at this security stuff, so I'm going to go old school and leave a calling card: a fork bomb. If I lose access to this cluster, the process will get started, and BigCorp, and more importantly Ozzy, will pay. Okay, for those of you who don't know, a fork bomb is a bash function that gets executed recursively. It's a denial-of-service attack where a process continually replicates itself and depletes system resources. Basically, it crashes the whole system through resource starvation. This is what you get for microwaving fish, Ozzy! Wait, what? It didn't work. Oh no, I just lost access to the cluster. I didn't have enough time to set up the fork bomb. No! No! Completely shut out. Dang it! Sounds like you're having a rough day. Oh man, this guy, he really had it in for him. Sounds like a jerk. You're not mic'd up again? I am. I'm just softly spoken in times of need. Okay. Right, okay, here's what happened.
I looked to add Falco onto the node, but noticed that something didn't look right. It felt like someone was already there and looking to modify something or other, and I was out of my depth, so I just called BigCorp Security, and we put plans together to quarantine the cluster. The engineering team provisioned a new cluster at the same time. This new cluster is built with everything we've put in place over the last, I'd say, 21 minutes, and we've now directed all customer traffic to the new cluster. We can use forensics now on the old cluster to try to figure out how far Nova got. The best part is we have a brand new cluster which has been hardened. So can I tell them about the security features of the new cluster? Sure, but how do you know what I'm doing over here? Never you mind. I don't, so go ahead. So first of all, the kube API isn't public. It can only be accessed from within BigCorp. The next security feature is that we have least-privilege RBAC. We have one heavily guarded account that can actually access the kube API, and then regular users have only view-only access. But then how do the regular users make changes? We do that with GitOps. The GitOps tool is the one interacting with the kube API, and this moves the threat boundary out to the Git repo. And the GitOps tool has read-only access to the Git repo. So the GitOps tool is pulling from the... not the cluster, the Git repo, yeah. Pulling from the Git repo, yeah. Thank you. It applies the workloads to the cluster via the kube API, and it keeps the workloads in sync with the state as defined in the Git repo. Then we set up multi-factor authentication, so someone shouldn't just be able to get in with an SSH key or with a password; we also have to have a secondary device. And we set up Kubernetes network policies.
So in our story, this is what Ozzy used to stop me from being able to send data outside of the cluster, collect it, and use it for nefarious purposes. The Kubernetes network policy defines exactly which applications are allowed to go outside of the cluster. Then we have a service mesh, which can be used for a lot of things, but in this case we talked about mTLS, which encrypts all of the traffic that moves within the cluster between any two applications. We also set up cluster-level policies using a tool like OPA, which stands for Open Policy Agent, or Kyverno. In our case, we made a rule for the cluster that only company-approved images that are in our registry are allowed to run within our cluster. If you try to run any other type of container image, it will fail. And then finally, we set up runtime security. This is at the kernel level: watching system calls and making sure there's nothing going on that isn't expected. Yeah, pretty much. That's what you did. Just a wild guess. Good. So this was my cluster at the beginning. Looking back on it, well, I feel a bit embarrassed that I thought it was actually that good. But equally, I didn't know 30 minutes ago what I do now. So this is closer to what a hardened cluster should look like, and although there's still room for improvement, we can continue to improve. I know this looks intense, but when we break it down, it all starts to make sense. Think about it. And I finally understand what they mean by onion-layer security. It's not about having a single layer, or peeling that layer and then crying until 5 p.m. each day. It's about defence in depth: having lots of layers of security to prevent lots of different attacks. So you've got the clicker, yeah? Yeah. How do we do that again? So hey, everyone. My name is Lewis Denham-Parry, and I give KubeCon talks such as this one.
When I'm not being overly dramatic, I work for an amazing company called Chainguard. At Chainguard, we're rebuilding our images... well, your images... we're rebuilding images, making them all secure by default. So please come find us. And it is my absolute, distinct pleasure to introduce you to the star of the show today, the amazing creative mind who put all of our ideas into these beautiful slides. Thank you. I am a developer advocate at VMware. I host three streaming shows, so if you're interested, I put stickers down here. I do lots of goofy stuff. This is kind of my thing. Actually, it's your thing too; that's how we work so well together. I also... I started in tech about two or three years ago, and I've given two... two KubeCon keynotes in that time. I've given no KubeCon keynotes. Now I'm just super bragging. Okay, so that's me. And then we also want to give a shout-out to Kris Nóva, who passed. She really... When we put together... Thank you. When we put together the abstract for this talk, we did, from the very beginning, name our characters Nova and Ozzy. We didn't realize it at the time, but Lewis and I both have very positive experiences with Kris Nóva. So when she passed, we decided to keep her name in as an homage to her, as part of the talk. Yeah, so we jumped on a call, and we hadn't spoken before about how Kris influenced our careers, and we just chatted about it. For me, I met Kris back in London in 2017, when I was getting started, at a cloud-native conference. I didn't know anyone else there, and I felt a little bit lost, to say the least. But Kris invited me to go for a curry with her and her friends on Brick Lane. Great place if you're ever in London. There I learned about many of the things that she loved, including mountaineering... or mountains, sorry. I'd often see her at conferences like this one, and I was always amazed that she remembered who I was. Yeah.
But most importantly, she always made me feel welcome and made me feel part of the community. And I hope to pass that feeling on to others, a feeling of welcomeness. As I remember clearly: I came here for the technology, but I stayed for the community. I met Kris Nóva at my very first KubeCon, which was only two years ago in LA. As part of a group called SIG Bouldering, one morning a bunch of people who liked to boulder went to the bouldering gym together. I spent that morning with her, we went out to lunch, and then just she and I spent time at the pool. And I was nobody. I'd done zero talks. I didn't know anything or anyone. And she was so, so kind, and a good DevRel, and she showed how to put community first. So Lewis and I both definitely want to pay that forward. With that, thank you for your time. Don't microwave fish in co-working spaces. And please be constructive with your feedback. We have two minutes if there are questions. There's a microphone over there. Otherwise, I've got my own questions right now. Excellent. Thank you so much, y'all. Thank you.