Hello everyone, hope everyone can hear me. Thanks for joining me in this talk. Today we're gonna talk about Kubernetes Security 101, best practices to secure your cluster. My name is Magno Logan, and I'll just give a brief introduction. I'm originally from Brazil and have been in Canada for almost five years now. My background is as a developer, so I have a dark past, I was a Java developer. But early in my career, luckily, I switched to application security. I've done pen testing, web application firewall management, security code reviews, and also some cloud security research around containers and Kubernetes. That all started about three years ago. I'm also part of a training company back in Brazil called Gohacking. We provide intermediate to advanced level cybersecurity training, and we're planning to bring some of that training to Canada, in English, pretty soon. This QR code is just my LinkedIn page, so I think that's the best way to reach out to me if you have any questions about this talk or any related topics. Okay, before we start: who here already uses Kubernetes? Who's dealing with Kubernetes security at the moment? Who has solved the problem of Kubernetes security? Nobody? Okay, that sounds about right. So this is the agenda for today. Since this is the on-ramp talk, I'm going to start from scratch, understanding the components of a Kubernetes cluster, and then we'll talk about security. That's why we're gonna cover the architecture, the control plane, the worker nodes, and things like that. And before I move on: I started this research on Kubernetes about three years ago, when I was helping develop an image scanning product and generating the vulnerability feed for that solution. The solution ran on Kubernetes. So I didn't know anything about Kubernetes three years ago, and that's how I got to where I am right now.
But I had some articles published around Kubernetes security and the basics, like securing the 4Cs. So if you wanna find out more about that, just let me know, and of course I'll share the slides after this talk. Okay, who here can explain in simple terms what Kubernetes is? Anyone? Yeah, go ahead. Because it might have different interpretations, right? I think you said the same thing: container orchestration. Once you have Docker and your containers, and you start growing to dozens or hundreds of containers, how do you manage that? That's what Kubernetes is for. Of course it's not the only solution, but it's the most adopted one out there, and that's why it's had such growing success. Basically, it helps you automate the deployment, scaling, rollbacks, and updates of your containers. A few fun facts about Kubernetes. This one: Kubernetes is a Greek word, which means helmsman, so kind of the pilot of the ship. That's an allusion to Docker being the containers and Kubernetes driving those containers around. You've probably seen ships carrying containers out there. What about this one? Who knows this one? Yeah, the project that gave rise to Kubernetes, the project from Google called Borg. And this third one is a bit harder, maybe. Seven of Nine, but why? The original Kubernetes project was called Project Seven, a reference to Seven of Nine, and that's why the Kubernetes logo, instead of the usual six or eight points, has seven. One more fun fact: why do we call it K8s? Because we're lazy, yeah. Americans like to abbreviate everything, right? K8s makes it easier: the 8 is the number of letters between the K and the S of Kubernetes.
But the first question I like to ask myself, or anyone looking to get into Kubernetes saying, oh, we wanna move to a Kubernetes solution and start using clusters, is: do you really need it? Is it really necessary for your organization? Do you have the resources to deploy and maintain a Kubernetes cluster? It's very tricky. There are a lot of complex inner workings inside Kubernetes that make it really difficult to deploy and maintain. And if you don't have enough resources, people and money, of course, it's gonna be very hard to keep the system running, because Kubernetes was created to keep clusters and containers running for a long time. So it's not just a matter of throwing buzzwords out there. It's important to understand the complexity, and we're gonna realize by the end of this talk that if you really need to go into the Kubernetes world, it might be better to try a managed solution first. Okay, before I talk about anything security-related, we need to understand how Kubernetes works. What are the main components of a Kubernetes cluster? Looking at the left side there, the control plane, which used to be called the master node, we have five main components. The one in the middle, which we see talking to all the other components, including the worker nodes, is the kube-apiserver. It's basically a REST API, which receives and sends all the calls to and from the Kubernetes cluster. It deals with authentication and authorization, and there is also some admission control being done there. And it can store information in etcd. The one at the bottom left, etcd, is your database. It's the one that stores all the configuration of your cluster, so it's like the heart of your Kubernetes. If anything changes in etcd, Kubernetes tries to reflect that back to the cluster.
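That reflect-it-back behaviour, comparing the desired state stored in etcd with what's actually running, can be sketched in a few lines. This is a simplified illustration of the concept, not actual Kubernetes code, and the function name is made up:

```python
# Simplified illustration of the desired-state reconciliation loop:
# compare what etcd says should exist with what is actually running,
# and decide what would bring them back in line.

def reconcile(desired_replicas: int, running_replicas: int) -> str:
    drift = desired_replicas - running_replicas
    if drift > 0:
        return f"scale up: create {drift} replica(s)"
    if drift < 0:
        return f"scale down: remove {-drift} replica(s)"
    return "in sync: nothing to do"

# e.g. etcd says 5 NGINX replicas, but only 3 are running:
print(reconcile(5, 3))  # scale up: create 2 replica(s)
```

The real control loops run continuously, but the core idea is exactly this desired-versus-observed comparison.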
Let's say you have three NGINX containers, and you update etcd to five; Kubernetes will make sure that, okay, now I need to deploy two more. So there's something called the desired state. Kubernetes is like a living thing that's always checking: does what I have here in reality reflect what I have in my database, in etcd, the key-value store? Yes? Okay, I don't need to do anything. Oh no, there's a difference, right? We call that drift. Then I need to make some changes, take down some containers, or add more replicas, things like that. So everything is driven by the information in etcd. Also, etcd has its own API, so now we have two APIs: the kube-apiserver and the etcd API. We're gonna talk about that soon. At the bottom right, still on the control plane, we have the kube-scheduler. The scheduler is basically the component that assigns containers to nodes. So if you need to deploy a new container, the scheduler is gonna talk to one of the worker nodes on the right, through the kubelet, and say, okay, deploy a new container. That might be right now, or it might be one day or one week from now; that's why it's called the scheduler. Before I talk about the kube-controller-manager at the top left, are there any managers here? Okay, I usually make this joke, but the kube-controller-manager, like any manager, doesn't do anything itself. It has other components below it that do the actual controlling of a Kubernetes cluster: you have the pod controller, the service controller, and so on and so forth. That's what keeps your system running. It's always checking and verifying the replicas, the number of containers, the health state, and things like that. And of course, at the top right of the control plane, you have the cloud-controller-manager.
People reference Kubernetes as being the operating system of the cloud. It has the ability to communicate with different cloud providers, so AWS, Google, Azure, and others. It's able to interact with those cloud providers' APIs and provide things that Kubernetes might not be able to provide on its own, such as EBS volumes or load balancers, so it can interact with those components as well. Okay, looking at the right side of the slide, we have the worker nodes, or just nodes. These are the ones that run your applications. Technically, the components on the left can run as containers, and they usually do, but your own applications, your web apps, your databases, should run on a worker node and not on the control plane. By default, there is a setting in a Kubernetes cluster that doesn't allow you to deploy your applications to the control plane. Why? Because they wanna separate the control plane from the worker nodes. So we have three nodes here, and each node has three main components: the kubelet, the kube-proxy, and the container runtime engine. The kubelet is the agent installed on each node that talks back to the control plane and also deploys your containers on that node. You can think of the control plane as being a server or a VM, and each worker node as a separate VM or instance. So the API server will send information to the kubelet, okay, deploy this container on your node, and the kubelet will send health checks back through the API server to etcd to say, okay, everything is working, I have two NGINX containers, it's all good. So you need one kubelet on each node, but look at the diagram here: we don't have a kubelet in the control plane.
At least it's not shown here, and I'll talk about that in a second. There is also the kube-proxy, which by default manages iptables rules on Linux to handle internal and external communication inside your cluster, and the runtime engine, which used to be Docker by default. Now you have other options such as containerd, CRI-O, Podman, whatever. The reason you don't see a kubelet and a kube-proxy on the control plane in this diagram, at least in the latest Kubernetes documentation when I was doing the research, is interesting, because in practice there is one: the control plane components are usually deployed as containers on an unmanaged cluster, so you do need a kubelet and a kube-proxy there. It's just not represented, because technically you don't want to deploy your applications there; you want to deploy them on the worker nodes. Just an interesting fact. The kubelet also has an API, so now we have three APIs: the kube-apiserver, the etcd API, and the kubelet API. The kubelet API is not very well documented, but attackers are already exploiting it. The only API that should be exposed, or is exposed by default, is the kube-apiserver, especially if you need to talk to your cluster remotely, and especially on managed clusters such as EKS, AKS, and GKE. The other two, etcd and the kubelet, are not exposed. Usually you can only reach those APIs if you are on the same network, but sometimes they might not have authentication, and if an attacker is already inside your network, that can be a problem. Okay, now that we understand at least the main components of a Kubernetes cluster, let's start talking about security. There is something called namespaces in a Kubernetes cluster, and namespaces here are different from Linux namespaces. In a Kubernetes cluster, a namespace is basically a folder where you separate your applications.
So you can have different developer teams working on the same cluster, or you can have different namespaces representing development, QA, and production environments. When a cluster gets created, by default you have four namespaces. The main namespace, where the control plane pods live, is kube-system. And there is no security boundary between namespaces unless you specify one through RBAC, role-based access control. There is no protection there by default. So the pods in kube-system, which are critical to your cluster, because Kubernetes uses Kubernetes to run Kubernetes, can be compromised as well. If I compromise one pod in another namespace, with the default configuration I may be able to reach kube-system too. I talked about the API server, right? The kube-apiserver, which is basically a REST API. The idea is that you shouldn't expose your API server unless you have to, because by exposing it, you give attackers information: okay, there is a Kubernetes cluster running here. If you reach the /version endpoint, you can get the version of Kubernetes, the version of Golang, and everything. That information can be very valuable for an attacker trying to exploit your Kubernetes cluster. One of the main things that's important, and it's basic not just for Kubernetes security but for any kind of system security, is file integrity monitoring. Files that shouldn't change, shouldn't change. After you install something, you should monitor those files, because if they change, you need to receive an alert, a notification. File integrity monitoring basically works by generating a hash of these files; you can continuously re-check the hashes of those files and match them against a database.
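That hashing scheme can be sketched in a few lines of Python. The watched paths below are illustrative, CIS-style control-plane manifest locations, and a real FIM tool would store the baseline somewhere tamper-resistant:

```python
import hashlib

# Hash each watched file at baseline time, re-hash later, report drift.
# The paths are illustrative control-plane manifest locations.
WATCHED = [
    "/etc/kubernetes/manifests/kube-apiserver.yaml",
    "/etc/kubernetes/manifests/etcd.yaml",
]

def sha256_of(path: str) -> str:
    """Return the SHA-256 digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def take_baseline(paths):
    return {p: sha256_of(p) for p in paths}

def changed_files(baseline):
    """Files whose current hash no longer matches the baseline."""
    return [p for p, digest in baseline.items() if sha256_of(p) != digest]

# baseline = take_baseline(WATCHED)
# ... later: alert if changed_files(baseline) is non-empty
```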
If something changes, then okay, something's going on. Maybe someone tampered with the system. It can be an update, but it can be a malicious action as well. This is based on the CIS Benchmark for Kubernetes. If you have access to the control plane, on an unmanaged cluster, these are the files and the recommended ownership and permissions. So if you can monitor those and they change, maybe you should look at it carefully. I talked about etcd being the key-value store of your Kubernetes cluster, the heart of your cluster. The first interesting thing is that, luckily, encryption in transit for etcd is set up by default when you install the cluster; you have a key and a certificate. But at rest, no. Everything inside your etcd key-value store is not encrypted by default on an unmanaged cluster. When we're talking about managed clusters, it's gonna depend on the cloud provider, and sometimes you need to set a flag or something to encrypt your etcd. As I said, etcd has an API. If an attacker is inside your network and able to query the etcd API, they're gonna be able to get all the information from your etcd. And if they can change it, your cluster doesn't know where the changes are coming from; it's just going to try to reflect them back to the cluster. Let's say I change something in etcd so that instead of using NGINX, it downloads a malicious image from Docker Hub. That can happen, and it could be an indication of a supply chain attack or something compromising the cluster. That's why you need to monitor that as well. One thing that's really scary about a Kubernetes cluster, from a security perspective, is that it's a flat network. Every pod, right?
When we're talking about Kubernetes, I forgot to mention this, but when I reference containers, I'm talking about pods, the smallest unit of a Kubernetes cluster. Every pod can talk to every pod in a Kubernetes cluster. To me, this is like repeating the same mistakes we made before with networks, when we talked about VLANs and network segregation. So you need to implement something to restrict access between those pods. These are called network policies. You need to customize and implement them. For example: does the front-end pod need to talk to the database, or just to the back end? You can restrict that. This is a basic security principle of reducing the attack surface: by limiting which pod can talk to which pod, you're reducing the blast radius if one gets compromised. Now, for network policies, there is something inside a Kubernetes cluster called the CNI, the Container Network Interface. There are different CNIs, just like there are different runtime engines: you have Calico, you have Flannel, you have others. But not all CNIs support network policies, so you also need to check whether the CNI you're using supports them. Same thing for the worker nodes with file integrity monitoring, from the CIS Benchmarks as well: there are files you can monitor for modifications. And the kubelet, as I said, is the agent that runs on each node of your cluster. It makes sure that all the containers inside a pod are running, and a pod can have one or more containers in it. The usual idea with containers is that you have only one process per container. So if I need another process, another application, for example to collect logs from that first container, I would deploy another container in the same pod, what we call a sidecar container.
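To make the front-end/back-end/database example concrete, here is a toy model of what a network policy expresses. This is only the concept, with made-up tier names; real network policies are Kubernetes NetworkPolicy objects enforced by the CNI:

```python
# Conceptual model of network-policy enforcement: traffic between pods
# is allowed only if an explicit rule permits it. Tier names are
# illustrative; a CNI implements this very differently under the hood.

ALLOWED = {
    ("frontend", "backend"),   # the front end may call the back end
    ("backend", "database"),   # the back end may call the database
}

def is_allowed(src: str, dst: str) -> bool:
    return (src, dst) in ALLOWED

print(is_allowed("frontend", "database"))  # False: must go via the back end
```

Anything not explicitly allowed is denied, which is the attack-surface reduction described above.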
Some of the things you can implement on the kubelet: restrict the kubelet's permissions and rotate its certificates. As I said, the kubelet also has an API, and there is a blog post I published, I think a year or two ago, about a threat actor group called TeamTNT that was exploiting Kubernetes clusters. First, they would get inside the network of the compromised environment, let's say AWS or Azure. Then they would scan the network for the kubelet's port, which, if I'm not mistaken, is 10250. If they found something running on that port, they would send requests to the kubelet API, which by default was unauthenticated, and they were able to deploy pods to mine cryptocurrency, usually Monero. We were able to get access to their command-and-control server; for every compromise, they would send the IP address of that pod back to their C2 server, and there were about 50,000 different IPs there. So roughly 50,000 pods or containers were compromised. There is an article about that as well. Okay, audit logs. It's a basic principle to have logs on your application; it's one of the recommendations from OWASP, the Open Worldwide Application Security Project. You need proper logging to identify if something malicious is happening in your system. But on Kubernetes, the audit logs, which are the ones that can identify suspicious behavior, are not enabled by default, so you need to enable them. If you're using EKS, for example, it's just the click of a button, and then the logs are stored and you can access them through CloudWatch, but it might be different on other managed Kubernetes services. You also need to think about the fact that, since Kubernetes is a living system, it's gonna generate a lot of logs.
So you need to be able to filter and maybe rotate those logs, otherwise you're gonna run out of storage. Through the audit logs you can identify suspicious behavior: pods being deployed, who deployed what. There is a talk I did for a SANS CloudSecNext event where I talk about using these audit logs for threat detection as well. So before I wrap up, here is my recommendation if you're starting your Kubernetes journey, or your Kubernetes security journey. I didn't talk about any specific tools or solutions here, but I'd like to mention what I like to call the Kubernetes security triad. You need three basic things. One is image scanning: you need to scan your images prior to deploying them to your cluster, not just for OS packages but also for libraries and dependencies, especially if you're not doing something like SCA, software composition analysis. The second thing is implementing proper admission control. Kubernetes has one built in; it used to be Pod Security Policy, now it's Pod Security Admission, but you also have third-party tools such as OPA, the Open Policy Agent, or Kyverno to do that for you. What is an admission controller? It's basically what the name says: it controls the admission of a pod into the cluster. It's just like the bouncer at the nightclub: okay, you get in; you, no. You can specify things like: if this container has a critical vulnerability, don't allow it into my cluster. Or: if this container image hasn't been scanned by my image scanning solution, don't allow it into my cluster. And the third tool I recommend is runtime security.
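Going back to the admission controller for a second, the bouncer's decisions can be modelled like this. This is only a toy sketch of the decision logic; real admission control happens through Pod Security Admission or webhook engines like OPA and Kyverno, and the input fields here are invented for illustration:

```python
# Toy admission decision: reject unscanned images or images with
# critical vulnerabilities, otherwise let the pod in. The dictionary
# fields are invented; a real webhook receives an AdmissionReview.

def admit(pod: dict) -> tuple[bool, str]:
    if not pod.get("image_scanned", False):
        return False, "rejected: image was never scanned"
    if pod.get("critical_vulns", 0) > 0:
        return False, "rejected: image has critical vulnerabilities"
    return True, "admitted"

print(admit({"image_scanned": True, "critical_vulns": 0}))  # (True, 'admitted')
print(admit({"image_scanned": True, "critical_vulns": 2}))  # rejected
```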
After you deploy your pods in your cluster, what if they get compromised? Let's say you have a web application that's exposed to the internet and it has a vulnerability; an attacker compromising that web application gets access to your pod, to your container. How do you know that something's going on? How do you know they're using your pod to mine cryptocurrency? Runtime security is a solution that can monitor for that. Most of these runtime tools use a technology called eBPF. You've probably heard about eBPF; if not, you will hear about it today or tomorrow. It's a technology that helps you monitor anything that's happening inside your containers as well. Okay, so that's all I had for today. I hope you enjoyed this session, and I'm open to questions. Thank you. Yeah, go ahead. Oh. So you mentioned you shouldn't be exposing the kube API server, but how are we expected to interact with it? Like kubectl, I'm assuming, is using that same interface, right? Yeah, the API endpoint, right? You can use kubectl inside your network. You don't expose it to the internet, but you can limit the IP addresses, or you can have something like an API gateway or a web application firewall blocking or limiting that access. That's the idea. It's not a huge deal in the sense that most of the managed providers, GKE, EKS, and AKS, expose it by default, but it's not a good practice, because it reveals information about your cluster to anyone. If you go to tools like Shodan, where you can find exposed systems online, you can find a bunch of API servers there as well. Most of them might be honeypots, people deploying clusters to analyze and do some research like I do, but yeah, it's not a good idea. Okay, thank you. Yeah, anyone else? Go ahead. You mentioned something about threat detection using audit logs.
So is there any analytics integrated as part of Kubernetes, or how are we doing that? Any example? Sure, yeah, great question. As far as I know, there isn't anything native in Kubernetes to help with threat detection. You have to implement cloud-native or open-source tools. For runtime security, for example, there is Falco, from the CNCF, which has rules to detect this kind of malicious behavior. It uses eBPF and the Kubernetes audit logs, and you can create your own rules. So you have the default rules provided by the Falco team, but you can also create your own. But natively, there's nothing out there. That's why there are a lot of companies investing in threat detection for Kubernetes environments as well. Thank you. You're welcome. Yeah, can I give him the mic? Could you explain a little bit more about how to enable runtime security with eBPF? Sure, sure. With Falco, for example, which is the one I'm more familiar with, you can install it as part of your Kubernetes cluster. If I'm not mistaken, it's basically a DaemonSet that installs Falco on every node and collects information for you. Falco has another tool called Falcosidekick, where you can even see a dashboard with the audit logs and the eBPF information it collects. It won't act as prevention, but it will detect malicious actions: someone is using curl or wget in your container, which shouldn't be normal, or someone is running a crypto miner. Most attackers, when they run a crypto miner, run a Monero miner, so the binary is known, and you can detect that with runtime security as well. Thank you. You're welcome. Okay, I guess we're done. Yeah, thanks everyone. Thank you.
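The detection idea from that last answer, flagging tools like curl or wget, or a known miner binary, running inside a container, boils down to matching observed process events against a rule set. Here is a toy Python version of the idea; real Falco rules are YAML conditions evaluated over syscall events, and this watchlist is illustrative:

```python
# Flag processes that normally shouldn't run inside a production
# container. The watchlist is illustrative; xmrig is a commonly
# abused Monero miner binary.
SUSPICIOUS = {"curl", "wget", "xmrig"}

def alerts(process_events):
    """Return the observed process names that match the watchlist."""
    return [proc for proc in process_events if proc in SUSPICIOUS]

print(alerts(["nginx", "wget", "xmrig"]))  # ['wget', 'xmrig']
```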