Thanks, Chirath. Today I'll be talking to you about securing Kubernetes. In this session I'll be talking from the point of view of an attacker: how an attacker might try to sabotage a cluster, and how we can prevent it.

On the first slide I'm introducing you to the attack surface of a Kubernetes cluster. At the bottom you can see the infrastructure or cloud layer, which is basically where you have your data centers, your firewalls, your network and your servers. In the next layer we have the cluster, where we have RBAC, authentication and authorization, then admission control, and network policies. On top of that we have containers, where we have container sandboxing, image restriction, privilege escalation, supply chain, etc. And finally we have developer discipline as the code base layer.

So let's look at the first scenario: the attacker has access to your network. The attacker has somehow found a way into your network, so how can you prevent that? If you're using cloud, you can use SSH key-based auth, and you can make your Kubernetes API private so that these attackers or intruders can't discover your API and try attacking it. Then you can secure your network with a firewall, and you can have proper RBAC policies implemented in your cloud and also in your cluster.

The next scenario: the attacker has access to your Kubernetes control plane. The attacker has somehow gotten through your network and now has access to the infrastructure of your Kubernetes control plane. Before jumping into what we can do, let's go through a few slides on what Kubernetes RBAC is and who can do what. On the first slide, on the left, we have who can access. In Kubernetes we have two types: users and service accounts.
Users are for humans. A user is someone with a cert and key, and the certificate is signed by the certificate authority of the cluster. The cert is managed externally to Kubernetes, and a user is not really a resource in Kubernetes; as I mentioned, it's handled externally. Then we have service accounts. Service accounts are for processes, not for human use: when a pod or some other resource in Kubernetes is trying to talk to the Kubernetes API, let that be a worker node, let that be a pod, a CRD, whatever, it needs a service account to access the API. So these are the users of the Kubernetes API.

Then comes permission. As you can see on the right-hand side, we have two kinds of permissions: the one in blue is called a role and the one in orange is called a cluster role. The role is namespace bound, so when you create a role it only affects one namespace, whereas a cluster role is global in Kubernetes; it applies across all the namespaces in the cluster.

To bind a role to a user or a service account we have something called a role binding. There are two kinds of bindings: one is the role binding itself and the other is the cluster role binding. A role binding is also namespace bound: when you have a user, you can bind a role to that user in a given namespace, and that role can be either a role or a cluster role. A cluster role binding is more powerful and carries a lot of privilege, because it isn't scoped to a namespace; it's there for the entire cluster. So when you do a cluster role binding for a certain person on a certain permission, that person has that permission across all the namespaces in the Kubernetes cluster, which is a bit dangerous.
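As a rough illustration of the scoping just described, here is a minimal Python sketch. The dictionaries follow the general shape of `rbac.authorization.k8s.io/v1` objects, but everything else is an assumption made for the example: the names `secret-reader`, `alice`, `bob`, `team-a` and `team-b` are hypothetical, and the `allowed` helper is a toy evaluator, not how the Kubernetes API server actually implements authorization.

```python
# Toy model of RBAC scoping. Object shapes mirror rbac.authorization.k8s.io/v1;
# all names and the allowed() helper are hypothetical, for illustration only.
RBAC_GROUP = "rbac.authorization.k8s.io"

# A ClusterRole is not tied to any namespace.
SECRET_READER = {
    "apiVersion": f"{RBAC_GROUP}/v1",
    "kind": "ClusterRole",
    "metadata": {"name": "secret-reader"},
    "rules": [{"apiGroups": [""], "resources": ["secrets"], "verbs": ["get", "list"]}],
}

# RoleBinding: grants the ClusterRole's rules to alice, but only inside team-a.
ALICE_BINDING = {
    "apiVersion": f"{RBAC_GROUP}/v1",
    "kind": "RoleBinding",
    "metadata": {"name": "alice-reads-secrets", "namespace": "team-a"},
    "subjects": [{"kind": "User", "name": "alice", "apiGroup": RBAC_GROUP}],
    "roleRef": {"kind": "ClusterRole", "name": "secret-reader", "apiGroup": RBAC_GROUP},
}

# ClusterRoleBinding: grants the same rules to bob in every namespace.
BOB_BINDING = {
    "apiVersion": f"{RBAC_GROUP}/v1",
    "kind": "ClusterRoleBinding",
    "metadata": {"name": "bob-reads-all-secrets"},
    "subjects": [{"kind": "User", "name": "bob", "apiGroup": RBAC_GROUP}],
    "roleRef": {"kind": "ClusterRole", "name": "secret-reader", "apiGroup": RBAC_GROUP},
}

ROLES = [SECRET_READER]
BINDINGS = [ALICE_BINDING, BOB_BINDING]


def allowed(user, verb, resource, namespace, roles, bindings):
    """Simplified check: does any binding grant `verb` on `resource` in `namespace`?"""
    for binding in bindings:
        subjects = binding["subjects"]
        if not any(s["kind"] == "User" and s["name"] == user for s in subjects):
            continue
        # A RoleBinding only applies inside its own namespace;
        # a ClusterRoleBinding applies in every namespace.
        if binding["kind"] == "RoleBinding" and binding["metadata"]["namespace"] != namespace:
            continue
        ref = binding["roleRef"]
        for role in roles:
            if role["kind"] != ref["kind"] or role["metadata"]["name"] != ref["name"]:
                continue
            # A namespaced Role must live in the same namespace as its binding.
            if role["kind"] == "Role" and (
                role["metadata"].get("namespace") != binding["metadata"].get("namespace")
            ):
                continue
            for rule in role["rules"]:
                if resource in rule["resources"] and verb in rule["verbs"]:
                    return True
    return False


if __name__ == "__main__":
    for user, ns in [("alice", "team-a"), ("alice", "team-b"), ("bob", "team-b")]:
        print(user, ns, allowed(user, "get", "secrets", ns, ROLES, BINDINGS))
```

In this sketch the same cluster role is bound twice. The role binding limits alice to one namespace, while the cluster role binding gives bob the permission everywhere, which is the wider and riskier grant.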
So we'll look at an example on the next slide. Here we have John, who can read secrets in the foo namespace. There's a role called read-secret in the foo namespace, and you can see on the slide that there's a read-secret role binding to John, which gives John access to read secrets in the foo namespace. That is all right. Then there's Jane. Jane needs access to read and write secrets in the foo namespace; however, read-write-secret is a cluster role we have created for some reason. In this scenario we can still create a role binding from the read-write-secret cluster role to Jane, which gives her the correct access to the foo namespace only.

But now look at the admin user. We have given the admin user a read-write-secret cluster role binding, which means, as you can see from the red dotted lines, that the admin user can read and write all the secrets across all the namespaces available in the cluster. This can be pretty dangerous if you don't grant access carefully, so it's recommended that you use roles and role bindings instead of cluster role bindings.

So, hardening a Kubernetes cluster. We already spoke about RBAC policies and how to set them up. The next thing you've got to do is enable audit logging. When a disaster has happened, when someone has gained access to your cluster, your best friend would be logs. So you need to enable audit logging to figure out who did what on your cluster, so that you can trace what has happened and come to a conclusion.

The next thing is you've got to run CIS benchmarks on your cluster. This is mostly applicable to clusters that you have built from scratch in your own data centers, etc., because cloud managed clusters, GKE, AKS, EKS, already have CIS benchmark reports; it's managed for you. But if you are managing your own clusters, if you're managing your control planes, run the CIS benchmark and apply the recommendations it gives. The next one, and this is applicable
for clusters in the cloud as well, managed clusters included: use a CIS-hardened image. By default, when you get a GKE, AKS or EKS cluster, they provide you a simple node, a node with a Linux OS and the containerd runtime for containers. But you can have CIS-hardened node images, which have more seccomp profiles and AppArmor profiles installed deep within the node image itself, which would give you more security.

The next thing is to encrypt etcd. It doesn't matter what security tools you run, you might be running runtime security, sandboxing, everything, but if your etcd is not encrypted in your control plane, any attacker who has access to your control plane would be able to easily look at the key-value store that etcd provides, read from it and get your secrets.

The same goes for your secrets. The common recommendation, I mean, the twelve-factor app told us to inject configuration as environment variables. That's not the case anymore; now you should mount them as files. Attaching environment variables can leak your credentials to someone who has access to the VM, so don't do that. Mount your variables as a file and then read from that file.

Let's go to the next section: the attacker has access to a service account in your cluster. So now the attacker has come in, has control plane access, and somehow is able to talk to your Kubernetes API with some permissions. How do we stop that? First of all, let's look at network policies. As you can see in the example here, we have a web server in green, a Python backend and a database. There's no need for the web server to talk to the database; only the Python backend needs to talk to the database. And in a separate namespace we have something called a super important API. What network policies allow us to do is specifically say which pod can talk to which service. So we can specifically say
the web server cannot talk to the database; the web server can only talk to the Python backend, and the Python backend can talk to the database. So if somebody has exploited the web server and has access to it, they won't be able to write directly to the database; they won't be able to access the database at all. And if somebody gets into the Python backend, of course they then have access to the database, but because of network policies they will still not be able to attack the super important API we have in a different namespace. That's what network policies allow us to do.

And you have to use a CNI, a Container Network Interface plugin, on Kubernetes that supports these network policies. Network policies can work at many layers. Commonly it's layer 4, TCP, but there are CNIs that support layer 7 network policies, such as Cilium, etc., so it's up to you to figure out which to use.

Let's move on to the next slide. As I said, the first thing is you've got to use namespaces to isolate your tenants on the network. Then use network policies, which are the Kubernetes equivalent of firewalls. It's an added advantage if you have an API gateway; you can even make the inter-service communication go through an API gateway. If you're using a service mesh, use mTLS, or if you're using a service mesh that's powered by eBPF, network encryption etc. happens by default, so you can do that. That's great, because it means no one who has access to the infrastructure can eavesdrop on your network and figure out what's going on. Finally, there's something called admission controllers on Kubernetes. There's Open Policy Agent, Kyverno, etc., which allow you to write policies for your Kubernetes cluster, which I'll explain a bit more in a later slide.

So now the attacker has access to your code base. If an attacker has access to your code base, the attacker can do a lot of things: they can insert malware, they can insert
vulnerabilities and exploit them later. They can do a lot. To avoid this you can do a few things. The first one is, of course, static scanning in your CI pipeline. You can do a SonarQube scan, etc., and figure out if you are committing any sensitive information to Git, which is accessible by anyone if it's public. Then you can do code vulnerability scanning, as I mentioned, and you have to keep scanning. It doesn't matter that you scan today and it's all good, everything is green; there could be an exploit or a vulnerability discovered tomorrow. So you've got to scan on a daily routine, or on whatever schedule you prefer.

Then comes image vulnerability scanning. Our images might contain vulnerabilities, and again you've got to scan at an interval. You can't just scan once and, if it's okay, forget about it; you've got to scan continuously. Exploiting a vulnerable image could lead to privilege escalation, someone could get remote shell access, your information could leak, someone could launch a DDoS attack from within the cluster. I personally use Clair, but people use Trivy, and you can use any other tool that does image vulnerability scanning.

Then there's another one called configuration scanning. When you are committing your YAMLs to your GitOps repository, etc., you can easily check them with Checkov or something else and see if you are missing anything, if you are missing a trick, if you are missing any security configuration or a policy. It gives you feedback, and you can fix those issues and secure your pipeline.

So let's talk about the Kubernetes admission controller, which I promised on the previous slide. Let's see what it does. In this scenario there's a create pod request by a user. When that request goes to the Kubernetes API via kubectl, what happens is, firstly, it
figures out who you are. That's authentication. Then it goes to authorization: what can you do, can you actually create a pod? Then finally there's something called admission control. In admission control there are many policies, and we can even write our own. In this scenario, what it's going to look at is whether the pod limit has been reached on the given namespace, whether you are able to create a pod or not.

Admission control has two types of webhook that we can plug in: a validating webhook and a mutating webhook. The validating webhook is a read-only type: it scans a given request and either allows it or denies it. This is perfect for third-party policy controllers like Open Policy Agent or something else. Here you can add any additional policies, like don't pull images from public Docker registries, pull images only from your private Docker registry. You can give these policies through a third-party policy controller and the request would automatically be denied at the admission control level, or you can write your own validating webhook for it as well.

Then there's the mutating webhook, which is a bit different: it changes the payload dynamically. This is mostly used with CRDs and the controllers, the operators, that work with those CRDs. Here you give a payload and, according to some logic, it changes the payload dynamically and applies it to Kubernetes in a different manner. So that's what the Kubernetes admission controller does, and this would allow you to write your policies and secure your cluster.

The next step: now the attacker has access to your container. Somehow they have exploited your container and now they have shell access, so let's see what we can do. When it comes to container hardening, when you're building the container itself, there are some best practices that you've got to follow. The first thing is to remove bash,
so that they won't be able to access a shell remotely. Make the file system read-only, so that they won't be downloading anything or writing anything into your container file system. And make sure the container is running as a non-root user. When these are done, it's hard for an attacker to attack and exploit a container. So basically, make the container immutable.

Then there are other things that we can do. There are images which we don't have access to change; there are scenarios where we don't build the image, somebody else built it, and we only run it. In those scenarios we can use Kubernetes to enforce immutability without changing the image. We can use this thing called a startup probe in Kubernetes, which lets you run a command inside the container as it starts up, before it is treated as started. Using that we can remove bash if we want to. And in the security context we can set run-as-group, run-as-user and run-as-non-root, disable privilege escalation, and do a lot of other things. By using those we should be able to make the containers immutable at the Kubernetes level as well.

Then comes runtime security. Doing all of that is sometimes not enough and you need more. Let's look at how containers work and what this runtime security means. As you can see, in a VM, when a container is running, it's running on top of LXC, and then we have the Linux kernel below. Basically a container is a group of namespaces and cgroups, so when containers are doing things, they make syscalls to the Linux kernel. That means if there's a vulnerability in the Linux kernel, someone with the proper knowledge and tools should be able to exploit it from a container. So the first thing to do, as I mentioned earlier, is to disable privilege escalation, drop all the capabilities of a container, and then add back only the ones that are needed, using Kubernetes constructs. And there are things like AppArmor and
seccomp profiles, which restrict the syscalls that you make to the Linux kernel, so that you don't make any weird ones and try to exploit your kernel. Then there's container sandboxing. There's Firecracker, there's gVisor, there are a few out there, which largely make this issue go away, but on top of that they add some performance overhead. Still, it's great if you're running a multi-tenant system and you don't trust the images. Then there are tools like Sysdig Falco, etc., which monitor the container for anomalies at runtime and then give you alerts like "this is happening, stop this", etc. So that's container runtime security for you.

Next, you've got to conceal information. This is something I said earlier as well: I asked you not to inject environment variables but rather to use file mounts. When you're doing that, it is recommended to inject via a secret manager at runtime. Don't save it as a secret in Kubernetes itself; base64 is not encryption. So use HashiCorp Vault, Azure Key Vault, AWS Secrets Manager, Google's; there are so many secret managers out there. Use one of them and then inject your sensitive information as files into the container at runtime. You can very easily do that with those technologies. And as I already said, make your container's root file system read-only. And this one is developer discipline: do not log sensitive information.

So that comes to the final thoughts. Kubernetes security is still new, vulnerabilities get discovered every day, and they get patched frequently. So update your clusters as soon as you can, if you're running your own, or even in the cloud; just keep upgrading your clusters. When it comes to managed Kubernetes clusters in the cloud, security is mostly, I mean about 70% of it, control plane security, and that's managed, so you don't have to do much there. And many organizations today need some level of multi-tenancy; they might have different teams that work on different projects and
share a Kubernetes cluster. So the things that I covered in this session would be appropriate if you can implement them in these scenarios. Keep the proper standards up, follow up, and things should be all right. That's it from me today. Thank you, and I hope you learned something today. Over to you.