Hello everybody, can you hear me all right? Cool. Welcome to the session. I work in the EMEA dev team for OpenShift. Let's get started.

First of all, a bit of context. Has anybody here played with shodan.io before? Yeah? All right. It's a platform where you go and hunt for vulnerable servers and devices. At any time of the day you get to see some thousands of clusters in there, Kubernetes clusters, with etcd cluster data that you can access at will: read, write, delete, whatever. And some of them are really production.

Can you hear me all right? All right, cool.

You probably all know of the Tesla incident last year. They opened up their console to the world, with AWS credentials on it, and Tesla's platform was used for crypto mining. And you all know of the Kubernetes vulnerability last December, one month ago: just by sending an unauthenticated proxy upgrade request to the API server, you would get unlimited access to your Kubernetes objects, to everything.

Now, what do these kinds of attacks have in common? Not much, practically. I mean, they're very unsophisticated. Think of them as doors accidentally left open. These are very simple attacks that you can execute against any kind of technology. The thing is that as container platforms get more widely adopted, you'll get to see more and more advanced attacks, and trust me, there's plenty of room for improving.

Now, what this session is not about is all the things that you see on the slide. There's plenty of documentation, plenty of guidelines, plenty of reference architectures on how to harden your OS and your Kubernetes cluster.
The same goes for your image registries, or for performing static code analysis for vulnerabilities. This is not it. It's more about looking from the other side of the fence. Suppose you don't know anything about all that; suppose you just get to know the vulnerabilities and you start attacking a system. So it's really about approaching the problem from the other side of the fence. But trust me, there's plenty of good documentation, guidelines and reference architectures on how to harden your systems.

Now, chaos engineering in its purest, original, traditional form is a way to bring chaos to your system in order to test its resiliency. It usually consists of four steps. First, you define the desired state of your system. Second, you assume that this desired state will persist. Third, you interfere with your system and introduce tasks, activities, exploits, in order to bring stuff down. Fourth, you look at the actual state of the system after having introduced those activities and see whether your hypothesis holds. If it doesn't, if the system is not acting as desired, then you act and improve.

What does that third step mean in its traditional form? In Kubernetes language, doing traditional chaos engineering means bringing down pods, bringing down some nodes, and then seeing how your applications survive it. With security, all you have to do is insert vulnerability scans into that third step.

Now you could say: isn't this a bit of penetration testing?
Yeah, all right, but not the way that we know it. It's not like getting somebody in every two or three months, paying a lot of money, and having them do some penetration testing and give you some reports. No, I'm talking about a repeatable process, which should ideally run in a production or production-like environment. Automated: automate everything out of it, but in a way that you can control the damage. If, by interfering with your system from a security point of view, you're breaking things that are irreversible, it's better not to start with it. You have to control the damage, you have to report it, and you have to repair stuff.

Now, before starting to write or implement vulnerability exploits for your Kubernetes cluster, you have to know where these vulnerabilities come from. Vulnerabilities can come from a myriad of places. I'm not talking about host infrastructure vulnerabilities, which you have to deal with, with or without Kubernetes, or container runtime vulnerabilities, Docker vulnerabilities, whatever. This is a 25-minute thing, so I have to be very brief, and they deserve sessions of their own. What we're talking about here is Kubernetes and up.

Roughly speaking, you have three groups of vulnerabilities. First of all, Kubernetes itself: your cluster, the core components of your cluster. Your controller, your API server, your etcd on the masters, and your kubelet and kube-proxy. If these components are not configured correctly, then you have a large gate open to attacks, and vanilla Kubernetes deployments have that by default. This is the first section within the cluster group of vulnerabilities.
The second section is your cluster configuration files, the static pod definition files, plus access to those files and data directories, and ownership of those files and directories. Getting access to those files and data means getting access to everything on your Kubernetes cluster. And then you have the security primitives in the cluster itself, like roles: how do you assign admin or cluster-admin roles to people, the role bindings, how do you do the namespacing (because you can do many things wrong with it), how do you deal with the admission controllers, or the network policies, and everything. So this is inherent to the cluster itself.

On top of it, you have your workloads. Don't forget, Kubernetes is a platform to build platforms, so it's about the stuff that you put on top of it. Now, these are typical things that you would do wrong while building applications: forgetting about the seccomp profiles; not using SCCs properly; not using PSPs properly; having vulnerable code in your application; giving your application more access to data than it really needs; having unnecessary syscall capabilities that let your application do stuff.
Stuff it's not supposed to do. And then root, I mean privileged, containers. We should not be having them there in the first place. But if you have to have privileged containers, because some technologies that we deploy on Kubernetes explicitly ask for that and would otherwise not function, then you have to compensate for that with RBAC, PSPs and everything. This is tough, and not many people do it.

A third category of vulnerabilities is the ecosystem of tools that you use together with your Kubernetes. Start with third-party solution or product providers: a database, an application server, a middleware component, an off-the-shelf thing. There are plenty of vendors who were not there yet with containerization of their workloads and just put something up there, so as to say that they have a containerized version of their product. From a security point of view there are huge gaps in there, and unfortunately it's something you don't have much control over.

Then you have companies relying on traditional security and monitoring solutions. People will say, "Well, my F5 will do everything." Forget about it. It will not, because it's not aware of microservices, it's not aware of workloads. It does not even understand what a microservice is. All it knows is IPs, which are ephemeral in Kubernetes.

And then you have another source of vulnerability when it comes to the ecosystem of tools. Not a very long time ago, security was done by security guys, networking was done by network guys, and the same for databases, and now you have people doing all of it.
All right, they're working in teams and all that. But if you're not paying attention to who's doing what, how they're doing it, and which access rights they get on namespaces, on objects, in RBAC, it's a mess. You're opening the door to trouble.

Now, the thing is that you will have to concentrate on something, to start with something, and that's the workload vulnerabilities part. In the long run you will have to delegate the first part to tools and technologies that can harden Kubernetes for you. When it comes to workloads, there's a lot of stuff that you have to do yourself. And when it comes to the third category, you have to rely on vendors and other guys.

We just mentioned these vulnerabilities. In addition to that, we see a lot of stuff that customers do wrong. Take this for granted: a lot of customers do this shit. They fail in deploying and managing stuff correctly. So if you want to attack, take this into account.

First of all, this zero-trust network security thing that you hear about everywhere: it's not in place. What does it mean? In the traditional way of doing things, everything inside my network is trustable and everything outside is not. Zero-trust network security says you cannot trust anyone. So how can you trust someone? Based on a combination of attributes, like metadata, network identifiers, layer 7 request IDs. With those you can identify a workload and be 100% sure that you're communicating with that workload, not just with some workload coming from that IP address. And you don't see this very much in current deployments.

The second thing that is done wrong is a lack of metadata-driven deployments. You know how it is in Kubernetes.
Everything is about metadata. You label, you annotate, you tag. And this is also connected to the first point: in order to create multiple identifiers for your workloads, you first have to label your workloads properly, and this is not happening anywhere. So it's very simple to spoof your workload as the workload that somebody wants to communicate with.

Then secret management. We know how that happens in Kubernetes. Ideally your secrets need to be encrypted at rest, encrypted in transit, and decrypted only in memory. Only particular users or particular containers should have access to particular secrets. This is not the case. So many times, if you just access a namespace, you get the secrets for everything; you use them for some other things, you create other pods with elevated rights, and whatever. And then there's this one thing, one-way hashing: theoretically speaking, you should be setting secrets, not reading them. Well, you can read them, you know. So take this for granted: if you want to attack, use this.

All right, I'll try to go fast, because I have to come back to this.

Now, legacy integration and infrastructure service deployments on cloud. We talked about tagging and metadata and everything. The problem is that when you integrate with a back-end system, for example a database somewhere that is not incorporated in your Kubernetes cluster, or you're using some cloud services off the shelf, or you're using infrastructure-as-a-service components like virtual machines on cloud, you should be stretching this metadata-driven approach to deployment.
So you should take it outside the cluster too, to create these hybrid workloads, which are identified as unique components of your system that you can trust. This is not the case most of the time.

Platform hardening. We're not going to be talking about platform hardening here, but you have a lot of people, a lot of companies, a lot of customers that you go to, and you ask these guys: "Harden it, please. Lock it. Lock the node." No, they don't want to lose control, and they leave many things open. And then you can start messing with the nodes, and when you start messing with the nodes, you've messed with everything. So it's a good area to test, to scan and to attack.

Configuration externalization not applied correctly. You remember we talked about these native components of Kubernetes: the controller, the API server, the kubelet, whatever. Most of the time they're deployed with command-line flags. You can read those from the process, and you get a lot of information on how the components have been initially instantiated. There are not many people externalizing these values to configuration files, golden configuration files, and placing them somewhere nobody can access them, with the right access and ownership rights. This is not the case most of the time.

Then lack of logging and monitoring. Trust me, most Kubernetes deployments don't have meaningful, durable logging data on every layer: container, application, cluster level. It's something that you can really exploit. Or auditing. You can audit at API server level, by configuring your auditing policies. First of all, nobody does it. And even if they did, there's not much information you would get from it.
When you're running with elevated rights, for example, or doing something else like that, it's not there. So whatever you're doing there, you can be fairly sure that nobody is watching you.

All right, I have to be really quick now. If you're building vulnerability scans to test the shit out of your platform, stick to these patterns.

Enter a pod. Typically as an authenticated user, you've got access to a pod. What can you do? A lot of stuff. You can fork processes. You can abuse storage, network and compute resources if quotas and limits haven't been done right. You can do some crypto mining in there, inside the pod.

Pod to pod. This is typical; this is challenging the multi-tenancy of Kubernetes: going from your namespace into another and then doing whatever you want. It's a very nice thing to keep in mind. I'm saying it again: when building vulnerability exploits, make sure that all of these are covered.

Pod to node. You start from the pod, maybe end up with elevated rights, and you get to the directories and data on the node, /etc/shadow or whatever, and get it. This is a nice test to do.

Intra-node and node to node are grayed out because you're supposed to do them with or without Kubernetes. Intra-node means: inside your node, try to get elevated root access. Node to node: try to jump from one node to another, be it directly or via a bastion host. This is not very Kubernetes-specific, which does not mean that you don't have to do these kinds of tests.

Node to pod. You start from the node, from the infrastructure. I don't know how you've got in there, maybe not legitimately, but you have access. And then you can try to get into your pods.
Get into the pod, elevate your rights, and then get back to the OS with root access, because you have pods with elevated rights running in the cluster.

External to cluster. These are the typical attacks coming from outside. Everything that is publicly available on the internet is a vulnerability, be it Kubernetes endpoints or workloads that you have in Kubernetes. And cluster to external means having access to your internal cluster and using the system to go outside, to do some kind of DDoS thing to anybody. It's perfectly possible.

Some example security exploit tests. We're not going to go through them; I don't have much time. How many minutes do I have?

These are vulnerability tests that you can write. Reading metrics: you can get plenty of information from the metrics if they're open. Go to shodan.io; my God, there are thousands of endpoints and clusters that you can get information from. Do port scanning, for example for ports which are not needed but which are accessible. Access the API server, access the console. Try to access etcd from outside, try to push some key-values or delete some, just for the sake of it. Not on your production system, by the way. Try to do some man-in-the-middle attacks. Try to inject sidecar containers into pods that you might get access to, and use them for something. Do a NodePort attack, try privilege escalation. So there are plenty of things. Or access the metadata services of your cloud provider: it's very easy to get IAM roles and attributes in Azure and AWS, because most of the time it's not done right.

All right, I'm going to skip this.

Now, good practices. Don't start doing this before you have completed some basic hardening of your system. Otherwise it doesn't make any sense; you'll get exploits from everywhere. So start with hardening your system first. Don't do it ad hoc.
Everybody is building continuous delivery pipelines in their system. Just make some place for these jobs, that's it. It doesn't have to be complicated.

Do passive and active testing. Passive is more about reporting on vulnerabilities. Active means breaking the shit out of your system and saying: well, I broke it, so fix it. I would suggest starting with the passive ones and then making some place for the active ones.

Include layer 3, 4 and 7 attacks. Don't confine yourself to the pod network; concentrate on attacking the service mesh, or any OWASP vulnerabilities on layer 7. Then you can drill down to the pod, and you can drill down to the node.

If you want to test vulnerabilities of products that you're deploying, like a database, PostgreSQL for example, where PostgreSQL would need, I don't know, some kind of root access, then for these exploits go to exploit-db.com and get the scripts, because you have scripts there. Incorporate them and test your workloads.

Now, this is very important: a live policy violation and alerting system. I put Prometheus and Grafana there because they're more or less the default ones. Any time you run these things, scrape the results, build some alerts, report them somewhere, and make sure that they're actionable.

This kind of vulnerability scanning you will do at every deployment, or also at intervals, with scheduled jobs or something. And don't try to fix stuff while you're doing it; it doesn't make any sense. Do it later.

Now, how to do it? If I said there are plenty of open tools, libraries or whatever, that would not be true, because there are not plenty. There are just a couple. If you look at the CIS, the Center for Internet Security, they have benchmarks for every kind of technology, including Kubernetes.
Now, to satisfy the criteria or conditions put forward by the CIS when it comes to Kubernetes deployments, you can use kube-bench, which is up for grabs out there, or Kubernetes security benchmark. It's not that you have to add something to it; it's there. So you have a very large list of requirements, and you have these scanning tools available for them. You can, of course, add some value to it.

Then you have vulnerability scanning for Kubernetes workloads, the active hunting part. You can use kube-hunter from Aqua, or kubeaudit, and there's something from Heptio as well. The thing is that they are very limited in scope. So when I talked about these vulnerability scans that you can build, it literally means adding your own snippets of test code to these libraries.

As for the other things, you probably already know Kali or the like. If you can think of any kind of tests that you want to do in addition to pure Kubernetes and Kubernetes workload testing, feel free. You've got containerized versions of these distros with a lot of built-in features in there, or you can write your own from scratch.

Now, how do you do it? As I said, you either use your CI/CD toolchain in the cluster itself, or you use CI/CD outside the cluster. We've done it at a financial institution where they were using Spinnaker outside Kubernetes, and they were doing deployment stuff on infrastructure level with it as well, so they were using something outside. If you're not using any CI/CD toolchain, just rely on cron jobs.
Everything is an object in Kubernetes. You have the CronJob objects of Kubernetes, which you can assign per workload or per namespace, or you just create a central namespace and schedule these scripts against any other namespace from that central namespace in the Kubernetes cluster.

All right, that's it. Do you have any questions?
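
Editor's sketch: the talk closes by suggesting CronJob objects, scheduled from a central namespace, as the simplest way to run scans at intervals. The Python below shows roughly what generating such an object could look like. It is only an illustration under stated assumptions: the function name `build_scan_cronjob`, the namespace `security-scans`, the image `registry.example.local/scanner:latest` and the script path are all hypothetical and do not come from the talk, and the `batch/v1` API version assumes a reasonably recent cluster (older clusters served CronJob under a beta version).

```python
import json


def build_scan_cronjob(name, namespace, schedule, image, command):
    """Return a Kubernetes CronJob manifest (as a dict) that runs one scan script.

    All argument values are supplied by the caller; nothing here talks to a cluster.
    """
    return {
        # Assumption: the cluster serves CronJob under batch/v1.
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {
            "name": name,
            "namespace": namespace,
            # Label the job itself, in line with the metadata-driven approach.
            "labels": {"purpose": "vulnerability-scan"},
        },
        "spec": {
            "schedule": schedule,           # standard cron syntax
            "concurrencyPolicy": "Forbid",  # do not start a scan while one is running
            "jobTemplate": {
                "spec": {
                    "template": {
                        "spec": {
                            "restartPolicy": "Never",
                            "containers": [
                                {
                                    "name": "scanner",
                                    "image": image,
                                    "command": list(command),
                                }
                            ],
                        }
                    }
                }
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical values: a central namespace that hosts all scheduled scans.
    manifest = build_scan_cronjob(
        name="nightly-passive-scan",
        namespace="security-scans",
        schedule="0 2 * * *",
        image="registry.example.local/scanner:latest",
        command=["/scripts/passive-scan.sh"],
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and applied like any other manifest. Starting with a passive, report-only script, as the talk recommends, keeps the damage under control before any active tests are scheduled.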