Hi everyone, welcome to the talk, and welcome to KubeDay Singapore. I hope you all had a great keynote session. Unfortunately, I missed it because of presentation jitters, so I'm very glad my talk is at 10:30 today, so I can relax and attend the rest of the sessions.

I'm Anusha, a technical product manager at Nirmata, which is a startup. As part of my job I do product management and a little bit of pre-sales as well, so over the past year I've been talking to a lot of prospects, and the inspiration for today's talk comes from those conversations. I see two patterns emerging. The first is the need to raise awareness about Kubernetes security, because a lot of people focus on just a single thing and feel that doing that one thing secures their environment. The second is that there are misconceptions, or myths, around Kubernetes security, and around security in general. And on top of that, CNCF provides a wide variety of tools: if you check out the CNCF security landscape, there are about a hundred of them. So which tools should you use, and at which stages of your deployment lifecycle? Those are some of the things we'll touch upon today.

But before that, I have a question: how many of you here have experienced a security incident in production? Wow, OK. And how many of you had to deal with it yourself, like debug it, do the RCA, fix it? Amazing, OK. For those of you who haven't experienced one, I hope you don't have to anytime soon. But it's going to happen sooner or later, so let's be prepared for it.

Kubernetes is complex to secure and scale. It is complex, no doubt. It's only easy if all you have to do is a kind create cluster; everything beyond that is very complex, because there are a lot of things you have to take care of, and everything is configured as a YAML file.
Just to deploy a service in production, you have a deployment file, a service file, you have to set up ingress, RBAC, and whatnot. All of these are configuration files. And these are some of the headlines you may have heard in the recent past: the Log4j vulnerability, Kubernetes instances found exposed online, and so on. None of this is old news; it's all from the last one or two years.

Red Hat has produced a security report, and these are some of the stats from it: a 180% increase in security issues, 93% of respondents reporting a security incident, just like this room, and misconfigurations as one of the leading causes. Because everything in Kubernetes is YAML, misconfiguration is common if you have to do all of it manually. So automation is the key, and the question becomes what tools we are going to use for that.

This is one of my favorite slides when I present to prospects. To be able to secure your Kubernetes clusters, you need to understand the cost of missing it. As DevOps professionals, SREs, or platform engineers, we understand its importance, but how do you convince your management to allow you the time? You cannot do this overnight; you have to dedicate time to invest in security. And management understands numbers, so you want a data-driven approach.

This study was conducted by CNCF; there's a complete, detailed study that I've linked, and it's really great. I'll just explain the top half of this graph. Let's forget about Kubernetes for a bit. In any software, if you find a bug and fix it while you're coding, on the left side of the graph, the cost of fixing it is just maybe one or two developers: you identify the issue, spend a couple of hours, fix it, and push it again. Call that cost 1x.
As opposed to that, imagine a bug in something you've already released to production. There are multiple people involved: SRE teams have to triage, other teams have to go find out what the issue is, maybe you get on multiple calls with different teams, and you're spending a lot of time, and a lot of other people's time as well. Your customers may also be affected. So the loss caused by fixing in production something that could have been fixed in the coding phase is exponential: 640x. To put numbers to this, the cost per defect pre-production is just $25, but at the production level it is around $16,000. And that is per defect; extrapolate it to a cost per cluster and you get a number too big for me to even read. I highly recommend reading that blog to see the detailed breakdown of how each of these costs adds up.

OK, moving on. I told you about my conversations with prospects. The first thing we do is assess what level of Kubernetes maturity the prospect is at. Some are at the very start of modernization, some have already moved part of their workloads to containers, and some are still on VMs; they're at different stages. When I ask what they do for Kubernetes security, or security in general, the most common measure teams take today is scanning images for vulnerabilities. They say, "We're running something like Aqua's Trivy for scanning images," and that's it. But is it enough? If you're not doing anything else, this is a good first step; it is definitely necessary, but it is not sufficient. Don't be under the misconception that just scanning your images for vulnerabilities makes you secure in all aspects of security.
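As a rough illustration of what that first step can look like in CI, here is a sketch of a GitHub Actions job that scans an image with Trivy. This is my assumption of a typical setup, not something from the talk: the image name is a placeholder, and the action inputs follow the community aquasecurity/trivy-action documentation.

```yaml
# Hypothetical GitHub Actions workflow: scan a container image with Trivy.
# "myorg/myapp" is a placeholder; check the trivy-action docs for current inputs.
name: image-scan
on: [push]
jobs:
  scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Scan image for vulnerabilities
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: myorg/myapp:latest
          severity: CRITICAL,HIGH   # only report the serious findings
          exit-code: "1"            # fail the build if any are found
```

The point of `exit-code: "1"` is that the pipeline fails on findings, so a vulnerable image never gets pushed in the first place.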
This alone is not enough, but you should definitely be doing it; it's not that you should replace it with anything else. The second most common security assumption I hear is: "I'm using a managed cloud provider, so I must be secure by default. I mean, I'm using AWS, right? It's such a big company, they must be doing something for security. They can't be giving me insecure clusters. So I'm fine." AWS is just an example, by the way; the same applies to GKE, AKS, or any managed cloud provider. But if you read the cloud provider's security guidelines, there's a very fine print: security is a shared responsibility. In cloud provider clusters, they manage the control plane nodes; they are responsible for upgrading them, patching them, and keeping them secure. But the worker nodes, which is where your actual workloads run, are the customer's responsibility. It is your responsibility to secure all of those. Of course, the cloud providers give very extensive security guidelines on how you should run and configure your workloads, but doing all of that is not their responsibility, and if something happens, it's on you.

The next part is understanding the different layers of cloud native security. This is again taken from the Kubernetes docs, where you have the code, container, cluster, and cloud levels. For each of these layers there are different security best practices you can follow, and there are different teams in any organization involved in securing these different layers. Each layer builds on the one outside it. For example, if you allow root users in your containers, it doesn't matter how secure your code is; your code is still vulnerable, because you're not securing your containers enough.
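To make that root-user example concrete, here is a minimal sketch of the kind of container hardening meant here. The pod and image names are placeholders; the securityContext fields are standard Kubernetes.

```yaml
# A pod spec that refuses to run as root and drops extra privileges.
apiVersion: v1
kind: Pod
metadata:
  name: hardened-example     # placeholder name
spec:
  containers:
    - name: app
      image: myorg/myapp:1.0 # placeholder image
      securityContext:
        runAsNonRoot: true             # kubelet refuses to start the container as UID 0
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]                # drop all Linux capabilities
```

With `runAsNonRoot: true`, even an image built to run as root will simply fail to start, which is exactly the container-layer protection that a secure codebase alone cannot give you.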
Similarly, if your cluster has loopholes, it doesn't matter how much you secure your containers, because the outer level is already compromised. Each of these layers builds on the next, and we'll see how different teams are involved in securing each of them.

In any development or deployment workflow, we have the develop, deploy, and run phases. In the develop phase, we usually have coding, building, and pushing those images to something like Artifactory; app teams or development teams are usually responsible for this phase. In the deploy phase, the DevOps teams set up the different CI pipelines and so on, and make sure your applications are deployed to the cluster or to the cloud. Finally, the run phase is where your applications actually run: you need to continuously monitor them, and if there are issues, triage them and so on. That's where the SRE or platform teams come into the picture. This is not a clear segregation; the lines may be blurred in different organizations, and different teams may have different sets of responsibilities, but it's a sort of template. Each of these teams should take care to secure their part of the system, and you have to secure every part of it. It's not that you secure the develop phase and the deploy phase is secure by default. No: at every stage, you need to have a set of best practices to follow, policies to configure, and so on.

Now that we know the attack surface, let's look at what different teams can do to achieve some of these security aspects. In the develop phase, the development teams can do container image scanning, where you scan your images for CVEs, and you can also do code scanning, by the way, which I've not highlighted here.
With code scanning, you can check your code against best practices: don't expose passwords or secrets, make sure you're using the latest dependencies in your go.mod, and so on. Then in the build phase, you can sign your container images, and when you're storing them in something like Artifactory, you still need to keep scanning for vulnerabilities. It's not that you scan once and you're good; vulnerabilities can come in at any time, so you need to keep scanning continuously.

Then in the deploy phase, teams usually use something like policy-based deployment control. These policy engines act as admission controllers, and they allow you to block or allow a deployment based on whatever best practices you've configured: based on the policies you've written, certain deployments are blocked or allowed. And in the run phase, you still need to do continuous scanning; a deployment that was secure at deployment time is not necessarily secure two months later. So you need to keep checking against your policies and security measures, and there may also be runtime security events, for which a lot of CNCF tools are available, which we'll look at next.

So, the tools. A disclaimer here: I've highlighted just a few of the tools that I know of and have used a little, but go to the CNCF security tools landscape to see the plethora of tools out there. Let's start with the develop phase, when you're coding. We have tools like Dependabot, SonarQube, and gosec. Dependabot, as you all may know, helps you keep your dependencies up to date: it automatically creates a GitHub PR for you, and all you have to do is review and merge the request. SonarQube is a static code analysis tool, and gosec helps you follow security best practices in Golang.
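For a sense of how little setup the dependency-update part takes, here is a minimal Dependabot configuration for a Go project, following GitHub's dependabot.yml schema. The weekly interval is just an illustrative choice.

```yaml
# .github/dependabot.yml: keep go.mod dependencies current automatically.
version: 2
updates:
  - package-ecosystem: "gomod"  # watches go.mod / go.sum
    directory: "/"              # location of the module root
    schedule:
      interval: "weekly"        # open update PRs once a week
```

With this in the repository, Dependabot opens the PRs on its own, and the team's only job is to review and merge them.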
All three of those tools can be part of your GitHub CI itself. Then you have Sigstore and Notary. Sigstore has a tool called Cosign, which you can use to sign container images, YAML artifacts, basically anything, and save the signature in an OCI repository. It is very important to sign and verify your images so that image integrity is maintained. Then, once you build these images and store them in an artifact registry, something like Harbor allows you to scan your container registry for vulnerabilities; you can use Trivy and Grype for scanning your images for CVEs as well.

In the deploy phase, you have policy engines like Kyverno and Gatekeeper. Both of these are admission controllers, and they allow you to write policies and rules that help in either blocking or allowing deployments. Finally, in the run phase, we have tools like kube-bench and Falco: kube-bench helps in generating CIS compliance reports, and Falco handles runtime events; I think it is based on eBPF, though I'm not an expert on it. You can use these tools at runtime to continuously monitor the state of your clusters. Like I said, these are just a few of the tools, and there are plenty out there, lots of things to explore. But you get the idea: you need tools to secure your develop phase, your deploy phase, and your run phase.

I want to close by recapping what we have spoken about so far. Going back to the commit phase, the coding phase: here you can identify misconfigurations. Applying this at the Kubernetes level, everything is a YAML manifest; even your IaC is YAML these days, policies are YAML, everything is YAML. So you can scan for misconfigurations, and you can scan your package dependencies to make sure you're using the latest versions, because those are the most likely to be free of known CVEs.
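Since the policies themselves are also just YAML, here is a minimal sketch of what a Kyverno admission policy can look like. This one blocks pods that don't set runAsNonRoot at the pod level; the policy name and message are illustrative, and Kyverno's published pod-security policies cover this ground more thoroughly.

```yaml
# Kyverno ClusterPolicy sketch: reject pods without runAsNonRoot: true.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-run-as-nonroot   # illustrative name
spec:
  validationFailureAction: Enforce  # block the request, rather than just audit it
  rules:
    - name: check-run-as-nonroot
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Pods must set securityContext.runAsNonRoot to true."
        pattern:
          spec:
            securityContext:
              runAsNonRoot: true
```

In Enforce mode the admission controller rejects any pod that doesn't match the pattern, so the best practice stops being documentation and becomes a gate.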
All of that commit-phase scanning can be set up in something like GitHub CI, or any of the CI pipelines that you use. Then comes the build phase. This is where you verify, sign, and attest: you can sign all the images, and you can sign the YAML manifests. With Cosign, I think you can sign a whole bunch of things, with different modes like key-based signing and keyless signing. So depending on what suits your organization, make sure you're also signing your container images in the build phase, not just building and pushing them.

Then, in the deploy phase, you have to validate all the images that you've signed. It's not enough to just sign them; at the time of deployment, you have to validate those image signatures. Otherwise, what's the point of signing, if you're not validating? And not just images: you can also use admission controllers like Kyverno and Gatekeeper to verify all of the different rules or policies that your organization is interested in, which your platform teams have defined for you.

Then, definitely, the run phase. You may encounter new issues, so you have to keep scanning continuously. This is especially true for finding new CVEs, because CVEs can come in at any time. And again, something like Kyverno helps with writing policies to ensure your scan reports are current. One policy we usually recommend says: my scan report should not be older than 30 days. That gives you a sense of confidence that your scan report is always fresh. You can also have policies that don't allow images with vulnerabilities above a certain severity. Then you can rest assured that, at least for the time being, the deployments and services running in your cluster are secure.

Yeah, I think that's all I had. I covered it pretty quickly, so we have plenty of time for questions. Yes? I see there are a lot of tools to use in each phase.
So how do you manage all these tools and learn about all of them? And what about licensing: are they free, or licensed? For me, the cost of managing all of these could be quite high.

Yeah, that's a very good point. Like I said, this is a very small subset of tools, my opinionated view; there are plenty more. But even this is a huge number for any single team to manage, and gaining expertise in each of these tools, in how to use them efficiently, is definitely a challenge. That said, one single team need not manage all of them: something like Dependabot and gosec, your application team may take ownership of, while Kyverno or Gatekeeper might belong to your platform team or DevOps team. So one way to go about it is to distribute them among different teams. And these are all CNCF tools, so they are open source and free to use. I think most of the tools I've listed here also have a company backing them, so there are enterprise versions available if you feel the open source version is not enough, or is too much for you to manage.

Thank you for sharing. Well, we can probably chat later, because I don't want this to be a vendor pitch, but Nirmata is mostly behind Kyverno. Other questions? All right. I just want to take a picture, because my marketing person will kill me otherwise. Thanks everyone, thank you. I will be in the hallway and happy to chat with you. Thank you.