All right, thank you everyone for joining us today. This is today's CNCF live webinar, Thinking Like a Threat Actor in Your Kubernetes Environments. I'm Libby Schultz, and I'll be moderating today's webinar. I'm going to read our code of conduct, and then I'll hand over to Abhinav Mishra, Director of Product Management, and Rewanth Tammana, a consultant, both with Uptycs.
A few housekeeping items before we get started. During the webinar you're not able to talk as an attendee, but there is a chat box on the right-hand side of your screen. Please feel free to drop questions there, and our presenters will get to as many as they can within the time limit. This is an official webinar of the CNCF and as such is subject to the CNCF code of conduct. Please do not add anything to the chat or questions that would be in violation of that code of conduct, and please be respectful of all of your fellow participants and presenters. Please also note that the recording and slides will be posted later today to the CNCF online programs page at community.cncf.io under Online Programs. They're also available via your registration link, and the recording will be on our Online Programs YouTube playlist later today. With that, I will hand it over to our presenters to take it away.
Hi, everybody. Thanks so much for joining. Libby, can you hear me? Everybody good?
Yep, you're all good.
Cool. So thank you so much for joining. I'm Abhinav Mishra. I'm a director of product at Uptycs, where I lead containers and Kubernetes security. Rewanth, do you want to introduce yourself?
Sure. Hello, all. I'm Rewanth, I work as a consultant at Uptycs, and I'm mostly focused on the Kubernetes security side of things. Thanks for joining.
Awesome. We're through intros, so I'll skip through that. Those are things you can get later too, but the real thing is Kubernetes and, of course, the attacks that are happening.
Our goal today is to really be able to think like a threat actor when you're talking about Kubernetes incident response. These are some recent findings from Red Hat's 2023 Kubernetes security report, and there are some very interesting stats. One point that stood out: 37% experienced revenue or customer loss due to a container or Kubernetes security incident. That is a pretty large number when you're talking about thousands of customers, and customer loss is something that, as an enterprise, we never want to experience. Similarly, we see vulnerabilities, malware and sensitive data, those traditional things, being looked at from a security point of view, but we're starting to look at other things such as the supply chain and how that comes into the fold as we respond to incidents such as the SolarWinds attack or the Log4j vulnerability.
So the real point is that threat actors are now Kubernetes security experts. We saw this in cloud: as cloud adoption happened, threat actors learned how to manipulate IAM roles and get inside cloud environments. Now we see the same thing inside Kubernetes clusters and across the container supply chain. So it's very important for us to be able to think about how a threat actor gets inside Kubernetes and performs attacks.
The analogy I like to give is a robber entering a house. Some of the questions he or she would want to answer to carry out the attack are: which room am I in? Where are the security cameras? Where are the valuable items? Notice that they're asking these questions to drill down into specific parts of the house that they can attack or valuable items that they can steal.
And this very much maps to Kubernetes. "Which room am I in?" may map to the specific pod or namespace you land in when you enter the cluster. "Where are the security cameras?" maps to data like Kubernetes audit logs, which keep track of what people are doing. That is data you of course want from a compliance point of view, but it is data that (a) can be exploited and (b) you also need to protect in your Kubernetes environment. We're going to talk a little bit about those nuances. The valuable items, of course, map to secrets or sensitive data. The other rooms you can get into map to lateral movement and Kubernetes network policies. And the doors I can open relate to access controls and role bindings. All of this forms a holistic picture of how we look at threats and what the threat actor is thinking about in the context of Kubernetes. We'll dive deep into some pillars and frameworks for thinking about this.
Before we get into more detail, this is something I always like to share in terms of tools. There's this concept of red teaming: you're going in, trying to perform attacks, to reveal misconfigurations or vulnerabilities. There are tools and frameworks out there that can help you think about how to simulate these types of attacks. You don't know what you don't know. For example, how does a container breakout work? How does privilege escalation work? There are frameworks and tools that help you not just understand what those are, but actually try them out and simulate them. Kubernetes Goat is one really good example: it covers different types of attacks, from the control plane to the data plane in Kubernetes, and helps you simulate those attacks and understand what remediation steps you can take.
Similarly, KubeHound is another one that was recently introduced by Datadog. That one goes into attack paths for Kubernetes: how to think about the different attack paths and ways to get inside a cluster. So these are tools that I highly recommend looking at to think about what a threat actor would do and to simulate the different types of attacks. Rewanth, do you want to add any perspective here from your end, being a Kubernetes security ninja?
Well, nothing specific. I think you have covered most of the things, and once we go through the demos, I think we'll cover all these things again.
Cool. So the first pillar I'd like to talk about is visibility across your supply chain. Supply chain is something that's been getting a lot of visibility. We've seen great thought leadership from the folks at Chainguard, for example, and even analysts like Gartner are talking about it. So what is the container supply chain, and how does it play into the different types of attacks? When we look across the supply chain, we're not just talking about your Kubernetes clusters anymore. Working backwards, we're looking at the container images being built as part of pipelines, for example in Jenkins, or being stored in container registries, for example Artifactory or ECR. We go all the way back to the actual GitHub repos, the code and the PRs where they're being built, and even further back to the developer laptop, where the blast radius and attacks can start. So it's very important to have a holistic view into all of this, because any misconfiguration, vulnerability or malware can be introduced in any one of these parts of the supply chain.
So it's not just about being able to remediate reactively from a runtime point of view, but about being proactive and catching these issues earlier, for example when a container image is being built, and being able to enforce certain policies and guardrails. Rewanth is one of our Kubernetes security experts. Rewanth, do you want to go over this quote and your perspective on what it means?
Yeah, sure. I coined this quote when I was preparing for my Black Hat talk on supply chain security. When you look at the trends in supply chain security, most of the attacks happen just because companies don't know how to patch their systems. It's not that they don't want to, or that they procrastinate or aren't interested in patching. They just don't know where to look. They don't know what their system is made up of. That's when the attacker comes in, tries to understand your system better than you do, and things start to fall apart. So that was the context for writing this. I think the next slides cover a bit more about supply chain security and its different components.
Yeah, thanks, Rewanth. Let's take a specific example. You build a container image. There are certain things we talk about, like image signing, where maybe you have Docker trust and you say, hey, the image is signed and attested by some authority, so we're good there. Similarly, we're scanning at the registry and CI level where the image is built. We're scanning for vulnerabilities, including at the layer level. And you think you're all set. But it's about having that snapshot point of view each time an image is built, and being able to look deeper. So imagine I'm someone who likes to get Insomnia Cookies delivered every week, for example.
Maybe for the first 10 months they're delivered, so far so great, but then it's November. And, not to call out any brand, let's not say Insomnia, let's just say any cookies are getting delivered. We have to understand, at that point in time, what ingredients were used to make the cookie. Was it safe at that given point in time? What are the contents of the cookie on the inside? It's not just that we see the chocolate chips on top; is there something inside that is malicious or something that I cannot eat? What was the state of the factory when the cookie was being made or, in our context, when the container image was being built? So there are a couple of factors. One is what I am looking at at a given point in time, the snapshot point of view. The other is being able to dig deep inside the contents and not just take an outside, peripheral view. That's really important when looking at specific vulnerable packages in a container image. Rewanth, I'll delegate this one to you if you want to talk about the specific example.
Yeah, sure. Thanks a lot. Regarding the supply chain security we discussed a few slides earlier, this is part of the same chain. The faker.js and colors.js packages on npm were compromised directly by their developer. He didn't introduce malicious code as such; he just made sure the packages stopped working. And there are thousands of other packages that rely on them. For faker.js the weekly downloads were around 6 million, and for colors.js around 20 million. There are around 20,000 packages that depend on colors.js. So when colors.js is affected, all these downstream packages and all their users are affected too.
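The knock-on effect described here, where one broken package affects everything that depends on it directly or indirectly, can be sketched as a walk over a reversed dependency graph. This is only an illustration and was not shown in the webinar: the function name, the graph and every package name except colors are hypothetical.

```python
from collections import deque


def transitive_dependents(depends_on, target):
    """Return every package that depends on `target`, directly or indirectly."""
    # Invert the graph: dependency -> set of packages that require it.
    dependents = {}
    for pkg, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(pkg)

    affected = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, ()):
            if parent not in affected:
                affected.add(parent)
                queue.append(parent)
    return affected


# Hypothetical dependency graph: package -> packages it requires.
DEPENDS_ON = {
    "web-app": ["logger-lib", "http-lib"],
    "logger-lib": ["colors"],
    "cli-tool": ["colors"],
    "http-lib": [],
    "report-service": ["cli-tool"],
}

print(sorted(transitive_dependents(DEPENDS_ON, "colors")))
```

In this toy graph, web-app never lists colors itself but is still affected through logger-lib, which is the transitive case the speakers return to when they discuss SBOMs.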
That's the whole point of supply chain: a core component gets broken, and everything that uses it starts falling to pieces. That's why you have to look at how your software is built, similar to the cookie case. If you know what a cookie is made of, you can easily decide: say you have diabetes and don't want to take in sugar. When you look at the ingredients, you can say, this has sugar, so I don't want to consume it. Similarly, with software, if there is a vulnerable package and you have a pre-composed list of what you use, that helps you patch the systems immediately. That's where the image comes in, and similarly traceability and provenance: can the image really be trusted? For example, if you're downloading something from npm, some packages come directly from the Node.js maintainers and some are third-party packages from different contributors in the open source community. So how do you know which one to trust and which not to trust? And it's inevitable that you'll have to use these things eventually. Even when a package comes from an untrusted author, you might end up using it, or you might be using a trusted package that itself uses an untrusted package. How to map all these things is a really challenging problem, and that's where the seriousness of supply chain security kicks in.
One question I have for you: we've heard a lot about SBOMs and having them to understand the specific contents of, for example, an image. Can you talk a little bit about that and some of the pros and cons of SBOMs?
Sure.
SBOM stands for software bill of materials. What it basically does is similar to a cookie: on the back of the package we get the ingredients, what it is made of and the quantity of each ingredient. With software, we usually just say, okay, we have a Node.js application or a Python application and it uses one package. Say it's Python and it uses Flask: we stop at Flask in the requirements file, or for Node.js we look at the single entry in package.json. But that entry subsequently downloads multiple transitive dependencies, and usually we tend to ignore them. That's where SBOMs come into the picture: an SBOM keeps a record of all the transitive dependencies as well. Then, when something happens, you have a way to look up, if you have 100 applications, which of them use a specific package, and you can easily map things.
Awesome, thanks for that. One other example I'd like to talk about is admission controllers. There are, of course, a lot of positives with admission controllers. They can enforce sensible and secure defaults. You can use them to say, hey, only allow trusted repositories, and enforce that when images are being pulled from untrusted repos, or to disallow insecure resources from being deployed, for example dangerous wildcards, controllers or overprivileged service accounts. But when you're thinking like a threat actor, one of the things you have to account for is: are they going to attack the things that are not obvious, the things you think are doing secure work but that, if attacked, can make things even worse?
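Going back to the SBOM lookup described a moment ago ("if you have 100 applications, which of them use a specific package"), a minimal sketch follows. It assumes a simplified, hypothetical inventory format rather than a real SBOM standard such as SPDX or CycloneDX, and the application names and versions are illustrative only.

```python
def apps_using(sboms, package, versions=None):
    """List (app, version, is_direct) for every app whose inventory has `package`.

    `versions` optionally restricts the match to a set of affected versions.
    """
    hits = []
    for app, components in sboms.items():
        for comp in components:
            if comp["name"] != package:
                continue
            if versions is None or comp["version"] in versions:
                hits.append((app, comp["version"], comp["direct"]))
    return sorted(hits)


# Hypothetical per-application inventories, including transitive entries
# (direct=False) that never appear in requirements.txt or package.json.
SBOMS = {
    "billing-api": [
        {"name": "flask", "version": "2.3.2", "direct": True},
        {"name": "jinja2", "version": "3.1.2", "direct": False},
    ],
    "report-ui": [
        {"name": "express", "version": "4.18.2", "direct": True},
        {"name": "colors", "version": "1.4.1", "direct": False},
    ],
    "notifier": [
        {"name": "colors", "version": "1.4.0", "direct": True},
    ],
}

for app, version, direct in apps_using(SBOMS, "colors"):
    kind = "direct" if direct else "transitive"
    print(f"{app}: colors {version} ({kind})")
```

The report-ui entry is the interesting one: the package would not show up in a review of its top-level manifest, only in the fuller inventory.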
For example, if somebody goes into my house and attacks the security cameras, they have access to everything. We may think those security cameras are secure, but we also need to think about what can happen when they are attacked. Admission controllers act as a key guardrail, but when attacked they can be extremely dangerous. So the key challenge is: how do I know that my admission controller is secure at any given point in time? One example we have here is a mutating webhook configuration. We want to do a kubectl run nginx, which is very common, just pulling and deploying an nginx image. But it turns out a mutating webhook was deployed on my cluster, and instead of pulling nginx when we run this command, the cluster actually pulls a malicious image. So you have to watch these specific configurations at all times. It's not enough to say, we have the admission controller, it's deployed on the cluster and we're all set. We have to have continuous posture over what those admission controllers, mutating webhook configurations and other key components of our Kubernetes control plane and data plane are doing. There's a blog here by Rewanth on how to create and set up these malicious admission controllers; feel free to check it out, and we'll dive into a bit of a demo in the second half of this talk. Rewanth, do you want to go into this crypto mining example?
Sure. Just adding on to what Abhinav mentioned about malicious admission controllers or malicious mutating webhooks: there are different ways an attacker can gain persistence on a system. Once a system is compromised, the attacker wants to stay in it as long as possible without being caught, so they can get the most benefit out of it.
One way to do that is to leverage these mutating webhooks to add an init container or a sidecar to each of your deployments. Say you're deploying a Node.js, Python or Golang application. You expose it through a service on a port, and once you deploy through a Jenkins job, your application is accessible from the outside and you start using it. But you don't really tend to look at what's happening behind the scenes, at least not often. So if there is a sidecar running alongside your application and it is running a crypto miner, it runs in the background. It does not tamper with your application or stop it from running. Whenever you use the application, it appears to perform well, since you're able to access it. But behind the scenes that's not the case: your system is compromised, resources are being consumed, or an attacker could be sitting there monitoring all the network traffic. So how do you detect these kinds of things? We do have CIS benchmarks and guidance from different vendors and from government organizations like the NSA and the Department of Defense, but they only look at the static scanning side of things. How do you look at the dynamic side? That's our main aim with runtime security. We'll do a demo on this, so I think it will become clearer.
Yeah, should we start with the demo?
Sure. The first demo shows how easy it is to set up a crypto mining service in a Kubernetes environment. Compared to monolithic systems, Kubernetes systems are quite different. Let me know if you can see the screen.
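One simple way to approach the detection question raised above is to compare the pod spec you declared (for example, in source control) with the pod spec actually stored in the cluster after admission. The sketch below does that over plain dictionaries that mirror the containers and initContainers fields of a pod spec. It is an assumption-laden illustration, not the speakers' tooling: the function names and image references are hypothetical, and legitimate mutating webhooks such as service mesh injectors will also produce findings that need an allow-list.

```python
def container_images(pod_spec):
    """Map (field, container name) -> image for containers and initContainers."""
    images = {}
    for field in ("initContainers", "containers"):
        for container in pod_spec.get(field, []):
            images[(field, container["name"])] = container["image"]
    return images


def diff_pod_spec(declared, observed):
    """Report containers that were added, or whose image changed, after admission."""
    want = container_images(declared)
    have = container_images(observed)
    findings = []
    for (field, name), image in sorted(have.items()):
        if (field, name) not in want:
            findings.append(f"unexpected {field} entry '{name}' running {image}")
        elif want[(field, name)] != image:
            findings.append(
                f"image for '{name}' changed: {want[(field, name)]} -> {image}"
            )
    return findings


# What was submitted (hypothetical): a single nginx container.
DECLARED = {"containers": [{"name": "web", "image": "nginx:1.25"}]}

# What the cluster stored after a webhook mutated it (hypothetical).
OBSERVED = {
    "initContainers": [{"name": "setup", "image": "registry.example/setup:1"}],
    "containers": [
        {"name": "web", "image": "registry.example/not-nginx:1.25"},
        {"name": "helper", "image": "registry.example/helper:latest"},
    ],
}

for finding in diff_pod_spec(DECLARED, OBSERVED):
    print(finding)
```

This covers both cases discussed: the swapped image behind a plain nginx deployment, and the injected sidecar or init container that runs quietly next to a healthy application.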
This is a website called MiningPoolStats, where you can look at crypto mining across all the cryptocurrencies. We're looking at Monero, for example, and these are all the different pools available for us to connect to. Once you mine, you send the results to the pool and you get paid. You can see the number of concurrent miners at a point in time, as well as the connection settings: how you mine something and connect to these pools, and the port numbers, such as an SSL port and a non-SSL port. First, we need a configuration in order to run the crypto miner. We'll use something called XMRig. The URL here comes from this page, and similarly you can create your own user information, so whenever you mine something, you get the profit out of it. For this use case, I'll just copy a random user's ID and use it there. Then we do a simple deployment where we pull an XMRig image. If you check Docker Hub, it's quite popular; attackers have been using it, and it has more than a million pulls, as you can see on the right side. The rest are some basic configurations. First we create a config map from the configuration. I think it already exists because I was testing before the demo, so we'll delete it and create a fresh one. Once we've created a config map with the username and the URLs it has to connect to, we load it into the XMRig deployment. The main idea behind showing this is to demonstrate how easy it is to get a crypto miner running in the world of Kubernetes and containers, unlike monolithic applications where you'd have to do many other things. And you can configure everything, starting from the number of threads to the CPU usage and the memory usage.
If you set those very low, the users will not even notice any difference. And it goes on. So that's a very short demo on how we can leverage the mining. The configurations are available in a GitHub repository; we'll share the links after the webinar, and if you want to try, you can do a quick hands-on as well. We have one more demo to explain the mutating webhooks. I think we'll do that after this. Over to you.
Yeah, thanks very much. That was a great example where thinking like a threat actor and actually simulating the types of attacks proved very valuable for understanding the pros of admission control but also, when admission controllers are compromised, the kinds of things that can lead to. I'm only going to spend a minute on takeaways. Always have a point-in-time snapshot of your security posture. It's not just about starting the monitoring, but about knowing what you are looking at at a given point in time. And for images, rely on a combination of things: not just scanning, signing and verification, but also provenance and traceability, as explained before.
For time purposes, I'm going to move on to the next pillar. When we're looking at attacks, there's a big question around where to start, especially if I'm new to Kubernetes. From the data, it seems the best place to start is RBAC, and then dive deeper. One data point: this is the MITRE ATT&CK framework for containers. If you look across pretty much all these buckets, there's always some element of RBAC that comes into play, because there's always the question of how do I get in?
Once I get in, of course, there are a lot of different things I can do, but it's that initial key to the door that you want to protect at the very least. You want to audit your permissions and understand the different access controls you have in your system. I'll give a couple of examples. This one is from Aqua Security, and they talk about RBAC and cluster roles. It's an interesting one because there's a system:controller:kube-controller name on this cluster role. What happens sometimes is that we're in a mindset of saying, hey, this looks like a system role, let's allow it. But it turns out this system role has excessive permissions across the board. So it's very important to understand that threat actors will try to hide behind benign names, and that components that seem important could actually be harmful or could be exploited. In this case it was a custom cluster role that carried a system-style name, so it seemed benign but was actually malicious.
One important thing to note is that misconfigurations can also be introduced via human error or by using defaults. Many times in our clusters we have default service accounts that have star permissions across the cluster. While those are good from a DevOps point of view for quickly trying things and getting started, when we're thinking about enterprise security, those default service accounts should not actually be used on the cluster. If they're lying around, they can pose major threats. Here's an example of lateral movement via default service accounts. Say you're using a shared EKS cluster with team alpha and team beta and the different namespaces shown here. Namespace alpha one and namespace alpha three in this case have locks on them.
Namespace alpha two is using a default service account that has cluster-wide privileges assigned to it. So if any malware appears in namespace alpha two, it can move laterally and spread to the rest of the cluster. So you need platforms, and you need an understanding of those RBAC misconfigurations, not just at the cluster level but at the individual namespace level as well. Rewanth, do you want to discuss this one on zero trust and IAM in the cloud and how it relates to your Kubernetes service account security?
Sure. I think Abhinav has covered a wide range of things, starting from supply chain security through the RBAC pillar, and this falls under a similar scenario, where we are trying to enforce the zero trust model, or least-privilege access, in cloud environments. In the cloud, when you have an issue or one of the components gets compromised, you want to make sure the blast radius is small. Let's take a scenario where one of your application features takes a file from the user and uploads it to an S3 bucket. How do you make sure the authentication between your application and the S3 bucket is as secure as possible? The simplest way would be to use IAM users in AWS directly, but then the keys sit directly within your application. If the application gets compromised, an attacker can easily extract them and reuse them to authenticate to the S3 bucket, and if those credentials have more privileges, such as access to other resources like RDS or IAM, the blast radius is even bigger. The next option would be to attach roles directly to the nodes your pod runs on, and even there, in some scenarios, if the node gets compromised the blast radius will be huge.
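The first option above, long-lived IAM user keys sitting inside the application, is easy to look for mechanically. A minimal sketch, assuming pod specs are available as dictionaries with the usual containers and env fields: it flags containers that set the standard AWS SDK environment variables AWS_ACCESS_KEY_ID or AWS_SECRET_ACCESS_KEY. The function name and sample spec are hypothetical, and keys can also hide in files or images, which this does not cover.

```python
# Environment variable names the AWS SDKs read for static credentials.
STATIC_KEY_VARS = {"AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"}


def static_aws_credentials(pod_spec):
    """Return (container, variable) pairs where static AWS keys are injected."""
    findings = []
    for container in pod_spec.get("containers", []):
        for env in container.get("env", []):
            if env.get("name") in STATIC_KEY_VARS:
                findings.append((container["name"], env["name"]))
    return findings


# Hypothetical pod spec for the file-upload application in the example.
UPLOADER_POD = {
    "containers": [
        {
            "name": "uploader",
            "image": "registry.example/uploader:1.0",
            "env": [
                {"name": "BUCKET", "value": "user-uploads"},
                {"name": "AWS_ACCESS_KEY_ID", "value": "<redacted>"},
                {"name": "AWS_SECRET_ACCESS_KEY",
                 "valueFrom": {"secretKeyRef": {"name": "aws", "key": "secret"}}},
            ],
        }
    ]
}

print(static_aws_credentials(UPLOADER_POD))
```

A key pulled from a Kubernetes Secret is flagged as well, since it is still a long-lived credential that a compromised pod can read and reuse.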
The more secure way, and the one AWS recommends, is to use OIDC providers, that is, IAM roles for service accounts, where you have a tight binding between your pod and the S3 bucket. We won't go into great detail, but there is a blog post available on that, so if you're interested, you can have a look. The main idea of showing this is how easy it is to take the wrong path to achieve the goal. There are different ways to connect your application to an S3 bucket, but what matters is how you do it and, if the pod is compromised, what preventive controls and defense-in-depth steps you have in place.
Rewanth, one thing around service accounts and RBAC role bindings in a cluster: can you talk about the difference between object scoping and permission scoping, for example if I'm scoping to a namespace? Do you want to talk about that, or I can dive a little bit into it?
Yes, sure, I think you can.
Yeah. One thing we want to account for is that even if we have the locks, say we're scoped to a specific namespace, we also have to look at the permissions inside that namespace. Tomorrow, if that namespace is compromised, and say there's no network policy for it and it's able to move laterally and talk to the rest of the cluster, we have problems. That is something that doesn't get enough attention: not just what we are scoping the permission to for a given developer, saying, hey, that developer can only work in namespace X, but also what the permissions are inside that namespace.
For example, say the developer is compromised and they accidentally delete audit logs because they have the permission to in that namespace. Tomorrow, when we see that the namespace led to a compromise, because permissions were changed and that led to lateral movement into the rest of the cluster, we have a problem as well. So it's not just about the scoping of your role bindings in Kubernetes. Kubernetes of course has the cluster role binding and the generic role binding, where generic role bindings are typically used for namespaces or pods. It's also about the permissions that come with that role binding and what, for example, a developer in an internal developer self-service model is able to do within that context. I just wanted to address that as well.
So for RBAC, in terms of starting with RBAC and diving deeper: always have a way to monitor your identities in and across your clusters. Leverage concepts such as IRSA and pod identity, which I think was introduced at re:Invent this week, to map Kubernetes service accounts to cloud IAM roles that are properly managed and audited, and leverage concepts such as OIDC to make sure you have secure access to resources in your cloud. And of course you can leverage security platforms or tooling that help you answer key questions about your RBAC posture. When you have thousands and thousands of apps, you want high-level questions that you can answer, and we'll go into that a little further into the call. Cool. Rewanth, do you want to go into another demo?
Yeah, sure, let me share my screen and we can have a quick look. This demo is from an end-to-end perspective: you have an application, and how does an attacker leverage it and try to gain persistence within the system?
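Two of the RBAC posture questions from this pillar (which roles grant star permissions, and which bindings hand such a role to a default service account) can be sketched as a check over role and binding objects. The dictionaries below mirror the rules, roleRef and subjects fields of Kubernetes RBAC manifests; the function names and sample objects are hypothetical, with the role name chosen to echo the benign-looking name from the Aqua Security example.

```python
def has_wildcard(rule):
    """True if a rule uses '*' for verbs, resources or API groups."""
    return any("*" in rule.get(f, []) for f in ("verbs", "resources", "apiGroups"))


def wildcard_roles(roles):
    """Names of roles or cluster roles with at least one wildcard rule."""
    return sorted(
        role["metadata"]["name"]
        for role in roles
        if any(has_wildcard(rule) for rule in role.get("rules", []))
    )


def risky_default_bindings(bindings, flagged_roles):
    """(binding, namespace) pairs that give a flagged role to a default service account."""
    risky = []
    for binding in bindings:
        if binding["roleRef"]["name"] not in flagged_roles:
            continue
        for subject in binding.get("subjects", []):
            if subject.get("kind") == "ServiceAccount" and subject.get("name") == "default":
                risky.append((binding["metadata"]["name"], subject.get("namespace")))
    return risky


# Hypothetical objects: a benign-looking name with star permissions.
ROLES = [
    {"metadata": {"name": "system:controller:kube-controller"},
     "rules": [{"apiGroups": ["*"], "resources": ["*"], "verbs": ["*"]}]},
    {"metadata": {"name": "pod-reader"},
     "rules": [{"apiGroups": [""], "resources": ["pods"], "verbs": ["get", "list"]}]},
]
BINDINGS = [
    {"metadata": {"name": "alpha-two-default"},
     "roleRef": {"kind": "ClusterRole", "name": "system:controller:kube-controller"},
     "subjects": [{"kind": "ServiceAccount", "name": "default", "namespace": "alpha-two"}]},
]

flagged = wildcard_roles(ROLES)
print(flagged)
print(risky_default_bindings(BINDINGS, flagged))
```

The check deliberately ignores how trustworthy a role's name looks and reads only what the rules grant, which is the lesson of the system-named cluster role.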
In this scenario, we have an application running on this specific IP address and port. As you can see, it's just a regular application with a whole set of features, displaying things like integrations, some examples of how to use it, some configurations, tags and many other features. As a user, you'd just use this application in the regular way. But attackers try to enumerate the application and see what it's built from, similar to the cookie example we discussed earlier, trying to understand what a cookie is made of. And as you can see, we get a bunch of information, such as the server type and the target host name. There is something called showcase.action that we are seeing. Nikto is a tool that lets you look through web application banners and fetch this kind of information. We got some information, but it's not really helpful so far. So next we'll use a tool called Nmap, which lets you scan a specific port, an entire service or a subnet to find which ports are open and what kinds of services are running on the application or the system. Here we are scanning a specific port number because we know it is available. For some reason the server is blocking our ping probes, so we'll skip those. As you can see, it shows it is scanning an EC2 hostname, which also tells you that this application is running on an EC2 system. And it returns some information like the server headers and the version. So we can quickly search for what is known about it. I just type a space in my browser and it automatically suggests "exploit", so it is definitely exploitable, or at least very old.
What caught my attention was Struts 2 there. So Exploit-DB is a place where you can look up CVEs for specific vulnerabilities and so on. What we'll try to do is just copy this specific payload, which is publicly available as a proof of concept. And since the system is not patched, we'll try to execute it and see if the end system is vulnerable to this vulnerability, or if we can exploit it. So if you see, we have the code, and it's a CVE from 2017, so it's a pretty old one. And we still find these applications every day if you look at Shodan or the other Censys kind of search engines. Maybe it must be Python 2. And it requires the IP address of the machine. So we'll just give the IP address of the machine and then we'll give the command. The Exploit-DB entry has the entire code to leverage the vulnerability and perform a remote code execution. So I execute commands from my machine and they get executed on the application system. So ifconfig.io is a website where you can get the public IP address of your own or any system. This is the IP address of the server that I'm running on. What I'm simply doing is passing in the URL with the IP address and sending the commands that we want to be executed there. And if you see, the IP address is different from my IP address, and it gives the IP address of that machine, which of course we know from the URL, but I just wanted to double check. So this is good for running one-off commands, but if you want to perform some extensive operations, it's not a feasible approach. You should get a shell on the system. And the way to do it is using a reverse shell, for example. NC is netcat. It's a tool that helps us make socket connections with other applications and try to gain access. So here we just get a reverse shell payload from a GitHub repository. You'll find it all over the internet.
It just opens a TCP socket connection from the source to the destination. The destination would be my IP address and the port that I'll be listening on. So it's 1337, and this is my IP address. Once I execute this, we should be able to see a shell, but first let's make sure there is communication between my server and the application. That would be the primary requirement, because we need to see if we can even reach it. So I'll try a ping command, but ping goes on forever, so I made sure I'm giving it -c 1. That way it sends only one packet and returns the response rather than waiting. But still I didn't get anything. So we'll kill that and put in a timeout, saying that if you're not getting any response in two seconds, just terminate. And you can see 100% packet loss happened. So it means the application is trying to connect to my IP address of 5120, 140 and 105, but it's not able to connect. That can be for multiple reasons. Maybe outbound connections are completely blocked from the system, and in those scenarios you won't be able to get a reverse shell. But in some scenarios, maybe only ICMP is blocked, so it's just that we are not able to ping. So we'll see if we can make an HTTP request out of it, so we can be sure that at least the system is able to connect to outside servers using HTTP. I start a Python server on my system and then I just try to make a curl or wget command. I think the code is missing. If you see, the request came in and we are able to see the response. So this confirms that the system can make connections: ICMP is blocked, whereas TCP connections, or HTTP connections, are open. We tested for HTTP. Now we'll try plain TCP. There are reverse shells available over HTTP as well, so we could get an HTTP reverse shell. That's not a constraint for us.
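The distinction drawn here, where ping fails but TCP still works, is worth being able to test from a defender's side too, for example when validating egress rules. A minimal sketch, assuming plain Python sockets; the helper name `can_connect` and the example host and port are hypothetical, and the two-second default mirrors the timeout used in the demo:

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`.

    This exercises TCP only, so it can succeed even when ICMP (ping) is blocked.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Hypothetical target; replace with an endpoint you are allowed to test.
    print(can_connect("127.0.0.1", 8000))
```

A result of `False` here does not say why the connection failed. As in the demo, blocked egress, a closed port, and a filtered protocol all look alike until a different protocol is tried.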
And now what this simply does is open a TCP connection and pass the bash terminal over that socket connection. We should get a shell. Sometimes there can be network issues and it will take a couple of seconds, but usually it should be instantaneous if all the connectivity is good. We'll just try again. Let's restart it, and it should usually work. Yeah, so we have a connection. If you see whoami, I'm the root user, and also, if you see the hostname, it is different from mine, it's web-something, so it's clear that we are on the victim's machine, or the application. Yeah, so if you see, it is. So usually this is the naming structure for a pod: it will be the name of the application, the replica set, and some random values. And we can see all the processes available on the machine. So now, since we know this is running on Kubernetes, or we guessed it, we can try to enumerate the system. But in this specific scenario, since it's made for the demo, I know exactly what's wrong with it, so we'll just look at the service account that is available, because that's a common thing done by all applications. They mount the service account into the Kubernetes environment, or the deployment. I think Abhinav covered this in the previous section on RBAC, so it falls under the same category. So what we'll try to do is connect to the API server from here. For people who are new to Kubernetes, a service account simply gives you a method to authenticate to the API server and perform some operations. So it's like your authentication token, we can say, but it's tied to its own set of privileges. Sometimes you can say it should not have any privilege to do any operation, and sometimes you can give it a wide set of privileges. So we'll use that to connect to the server and see if we can perform some operations.
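For readers new to service accounts, the mechanism described here can be sketched in a few lines. This is an illustration under stated assumptions, not the commands used in the demo: the mount path is the conventional in-pod location for the token, and the helper names `read_token` and `build_api_request` are hypothetical. The sketch only builds the request; it does not send it.

```python
import urllib.request

# Conventional location where Kubernetes mounts service account credentials.
SA_DIR = "/var/run/secrets/kubernetes.io/serviceaccount"


def read_token(sa_dir: str = SA_DIR) -> str:
    """Read the mounted service account token (only works inside a pod)."""
    with open(f"{sa_dir}/token") as handle:
        return handle.read().strip()


def build_api_request(host: str, port: str, token: str, path: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated GET request to the API server."""
    url = f"https://{host}:{port}{path}"
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/json"}
    return urllib.request.Request(url, headers=headers)


# Hypothetical values for illustration; inside a pod the host and port normally
# come from KUBERNETES_SERVICE_HOST and KUBERNETES_SERVICE_PORT.
request = build_api_request("10.0.0.1", "443", "example-token",
                            "/api/v1/namespaces/default/secrets")
print(request.full_url)
```

Whether such a request succeeds depends entirely on the RBAC rules bound to the service account, which is the point being made: the token is only as dangerous as the privileges attached to it.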
So if you see all these values, we are connecting to the API server and we are able to get some values from it. Similarly, we'll see if we can list secrets in the Kubernetes environment using a similar method. Yeah, so it looks like we are able to list the secrets as well. This is all cool, but this is all read-only access. So you can extract the tokens, and if there are any API keys stored on your Kubernetes system, you can easily extract them when this kind of misconfiguration happens. But what we are interested in is something more. So this is my blog, where I write about the security side of things. This is something we just discussed, the malicious admission controller. This is how the workflow looks in Kubernetes. You get a request that passes through multiple phases. We'll not discuss them in detail, but once authentication is successful, it comes to the mutating phase, where you can mutate the request. Once you have an incoming request, you get to choose whether you want to pass it on as it is or make some changes to it, and now we are exploiting that specific component. I also have a GitHub repository on the same, so if you want to try it out you can do it after the webinar or whenever you feel like it. So it's simply Go code. When an application tries to connect to the API server, it performs some mutating operations on the request. Yeah, since we are here, we don't have access to connect to the Kubernetes API server directly. We are communicating using curl, with the service account token to authenticate. But nevertheless, we'll just check if there is any kubeconfig on the system. There's nothing over there; we ran the find command. So we'll just use curl in the same way, but this time, instead of just listing the secrets, we'll try to create some secrets or create namespaces.
So for that, what I've done is, we'll be creating a webhook demo. All this content is available online, so we'll share it once the webinar is done as well. So there's the service account token for authentication. Then we have the namespace; instead of YAML it will be in JSON, because we are passing it in a curl request, and then we'll send the request to the API server. And you can see the status is active, because namespace creation is instantaneous, and we got the response. So similarly, we'll create the other resources that are required to create a mutating webhook. These are the certificates, nothing much over there, so we just create them directly.
Raywant, a time check: it's 12:50, so just a heads up.
Sure, thanks. Yeah, six more minutes. So if you see the image, it's pointing to my Docker Hub repository. Either we can look at that, or maybe we can skip it and just deploy this specific file. That would be the same. We'll just download the file using wget and use a curl request to connect to the API server and deploy it. The status is still pending, because a deployment takes time, unlike namespaces. And this is a service that we are creating to connect to the deployment, and then the service is also created. Similarly, we'll create a mutating webhook. This is the star of the entire operation, where it connects with the service to perform some mutating operations on every object that goes to the API server. So if you see, it is successful, but let's just check the status of the pod. So this is the API server; if you see the namespace, we are looking at the demo namespace specifically, and it is successful. The image is referring to my Docker Hub, and we can see the status, and it should be running over here. Yeah, the container status, the state, is running. So now we have deployed all the things, but what does it mean for us? We used curl and an overprivileged service account, one that has extra privileges such as write permissions.
We are able to write things to the API server directly from the pod. So I already have access to the cluster, so we'll try to create a new pod and see what this mutating webhook does. We'll just switch back to our demo namespace, where we'll try to create an nginx image, but this time, if you see, it is unable to pull. nginx is a public image, so anyone should be able to pull it, but when you analyze the errors you can see that it's trying to pull a different image altogether. It's trying to pull a malicious image even though you're trying to get nginx running. So this one was meant to interrupt the way it works, but if an attacker wants to gain persistence, what they'll try to do is inject a sidecar along with it, or they can add an init container. So you would never detect it, or at least not in a regular fashion. If you look at the mutating webhook, it will have a bunch of things, but eventually it connects with a service, and then we can look at the... So if you just look at the code related to the service, what it is doing is taking every request and replacing the first image that is available with a malicious image. So that's what the mutating webhook is doing. So often, companies rely on CIS benchmarking or compliance in order to test these kinds of things. For example, Kubescape is one of the most famous benchmarking tools for Kubernetes. We'll try to use that. I'll just do a quick clone on the local system and we'll see if there are any checks related to mutating webhook configurations, because you have seen how devastating it can be, or what kind of impact it can have on a system, where it can gain persistence, do defense evasion, and so on. So if you see, there are many matches for mutating webhook, but all of them come from the test data folder, which is purely there for testing purposes, and the other ones are just examples.
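As a rough idea of what a check for the webhook described above could look like, here is a minimal sketch. It is not taken from Kubescape or any other tool. It assumes the `MutatingWebhookConfiguration` objects have already been exported as dictionaries (for example, the `items` of a JSON listing), and the function names and the trusted-namespace allowlist are hypothetical. It flags webhooks that intercept pod creation and point at a backend outside the allowlist.

```python
def webhook_targets_pods(webhook: dict) -> bool:
    """Return True if any rule of the webhook covers CREATE operations on pods."""
    for rule in webhook.get("rules", []):
        resources = set(rule.get("resources", []))
        operations = set(rule.get("operations", []))
        if resources & {"pods", "*"} and operations & {"CREATE", "*"}:
            return True
    return False


def suspicious_webhooks(configs: list, trusted_namespaces: set) -> list:
    """List (config name, webhook name, backend) for pod-mutating webhooks
    whose backend service is not in a trusted namespace.

    Webhooks that call an external URL have no service namespace and are
    therefore always reported.
    """
    findings = []
    for config in configs:
        for webhook in config.get("webhooks", []):
            client = webhook.get("clientConfig", {})
            service = client.get("service")
            namespace = service.get("namespace") if service else None
            if webhook_targets_pods(webhook) and namespace not in trusted_namespaces:
                backend = namespace or client.get("url")
                findings.append((config["metadata"]["name"], webhook.get("name"), backend))
    return findings


# Hypothetical sample data shaped like MutatingWebhookConfiguration objects.
sample_configs = [
    {"metadata": {"name": "demo-webhook"},
     "webhooks": [{"name": "mutate.demo.svc",
                   "clientConfig": {"service": {"namespace": "demo", "name": "webhook"}},
                   "rules": [{"operations": ["CREATE"], "resources": ["pods"]}]}]},
    {"metadata": {"name": "mesh-injector"},
     "webhooks": [{"name": "inject.mesh.svc",
                   "clientConfig": {"service": {"namespace": "mesh-system", "name": "injector"}},
                   "rules": [{"operations": ["CREATE"], "resources": ["pods"]}]}]},
]

print(suspicious_webhooks(sample_configs, {"mesh-system", "kube-system"}))
```

An allowlist is a deliberately blunt design choice: legitimate tools such as sidecar injectors also mutate pods, so the useful signal is a pod-mutating webhook nobody expected, not the existence of mutating webhooks in general.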
So if we just exclude those folders, Kubescape does not have any checks related to this. And none of the available tools can do a check on these kinds of scenarios, where everything happens at runtime. So even when you run a compliance scan, it will be all green, all checks passed; it says you are compliant with everything. But in reality, your environment is compromised, or it has many other issues. Over to Abhinav, I think that's what I wanted to show.
Cool, awesome. So we have only a couple of minutes, so I'll just quickly go through the last things. Raywant was talking about compliance, and it's not just a control plane thing: we have to look not just at compliance but also at runtime telemetry and security. And if you look at runtime security, it relies on observability. You don't know what you don't know. So if I'm looking at, for example, any cluster (this is just an example of an EKS one), of course you want to look at control plane data in terms of audit logs or things coming from the Kubernetes control plane. But a good security solution that can tackle the kinds of threats Raywant was demonstrating earlier also relies on eBPF telemetry. Some examples are looking at specific processes that were started, for example inside sidecars, looking at network events in terms of the TCP or other connections that were started inside the runtime, or looking at specific files that were manipulated or changed. And so the pillar three takeaway, in terms of correlating telemetry across your data plane and control plane, is that you need to start with observability and really have some kind of data lake where you can start to collect this data, because you don't know what you don't know.
And you want to leverage not just the control plane data but also eBPF telemetry, and potentially even forensic techniques such as YARA rule scanning. I'll dive into a little bit of an example there, where basically your process could be hiding behind something benign, and you want to be able to analyze the signature of the process in order to understand what's actually hiding behind it. So, just the last couple of minutes: there are security platforms that can help you really tackle a lot of these security challenges. In terms of the pillars that we talked about, the first one is around supply chain, where, for example, platforms like Uptycs can help you have a single-pane-of-glass view from the developer laptop to your identity providers, from code to cloud. And it's really about having a single golden thread that allows you to understand the overall attack surface and how a threat actor got into each of the different components, using a detection graph. In terms of starting with RBAC and going deeper, this is just a simple example, but it's where you want to up-level certain questions, especially if you have, say, thousands of namespaces or many service accounts. For example, we have 64 and 79 cluster roles, and we want to understand which particular ones of those can access secrets or can exec into pods. And so you can answer these questions at a very high level. That helps you enforce more of a strategy that actually makes auditing policies real, rather than just having to do it on a quarter-to-quarter basis. For example, if some new service account comes in and can access secrets, you can have alerting mechanisms to be able to act on that data. And finally, being able to correlate that data plane and control plane telemetry. Here, for example, we see this 8.8.8.8 and it says QWER, but what's really happening is that it's Nmap. It's enumerating vulnerabilities and doing a port scan.
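The idea of not trusting a process name can be illustrated with a deliberately simpler stand-in for YARA: matching a binary's content hash against known hashes. This is a swapped-in technique for illustration only. YARA rules match patterns inside the content, whereas an exact hash only catches byte-identical files. The function names and sample data below are hypothetical; on Linux, the executable behind a running process is typically reachable at `/proc/<pid>/exe`.

```python
import hashlib
import os
import tempfile


def sha256_of(path: str) -> str:
    """Compute the SHA-256 of a file's content, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def identify_binary(path: str, known_hashes: dict):
    """Return the known tool name for this content, or None if unrecognized.

    The file name is never consulted, so renaming a binary does not hide it.
    """
    return known_hashes.get(sha256_of(path))


# Hypothetical demo: content standing in for a scanner binary, saved as "qwer".
fake_content = b"pretend-scanner-binary"
known = {hashlib.sha256(fake_content).hexdigest(): "nmap"}

workdir = tempfile.mkdtemp()
renamed = os.path.join(workdir, "qwer")
with open(renamed, "wb") as handle:
    handle.write(fake_content)

print(identify_binary(renamed, known))
```

The limitation is worth stating plainly: a single changed byte defeats an exact hash, which is one reason content-pattern approaches such as YARA rules are used for this in practice.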
And that's where using different forensic techniques such as YARA rule scanning, which is actually looking at the signature of a process and saying, hey, based on the data that we're seeing, this is actually Nmap being executed, is really, really important. And of course, make sure you're not just looking at control plane data but also at runtime data at the process level to understand what is happening. So I know we're at the minute, but you can reach out at uptycs.com or on LinkedIn, Abhinav Mishra, and you can reach out to me or to Raywant anytime. And thank you so much for joining this conversation today, we really appreciate it.
Thank you both so much. Thank you everyone for joining us today. Like we posted in the chat, the recording and slides will be up later today on cncf.io. Thank you both for your time today and for all this great information, and we'll see y'all again on another live webinar soon.
Awesome. Thanks Libby. Thanks everyone.
Thank you. Bye.