So let's talk about security priorities and how we can detect them easily. I'm going to present some interesting research we've been doing over the last three months together with Intel security. So do you know this guy? I guess you've heard of him. In the next picture, I'm going to ask you to point him out. Can you find him? Everyone looks similar, right? Sometimes it's really hard to pick out the real one, the real things that bother us when we're looking for a target. In our case, the target is improving security in the cloud-native world, and with today's tooling it's sometimes really hard to improve security posture. We have a lot of great tools giving us a ton of information. And I can say this here, because we are not at KubeCon, right? We are at CloudNativeSecurityCon. Even among security professionals, it's sometimes really hard to pick out the real issues. So this is what this talk is going to be about today.

I'm presenting this talk with Arie, who couldn't make it here, so we pre-recorded him. He is part of the Intel Assert team, an offensive security team inside Intel that helps product security teams improve their products from a security perspective. And Kubescape is a sandbox project in CNCF for Kubernetes security posture. We are going to show you research we've been doing for a while on how to prioritize security issues in Kubernetes using what we call attack chains. And we're going to show you a demo of how it's been used inside the Intel Assert team in the last month.

So a few words about me, for those who don't know me already. I just had a talk in the next room, so some of you might have already heard this. I'm Ben, CTO and co-founder at Armo.
Armo is a Kubernetes security company, a startup based in Israel, but we have people around the whole globe. In my previous stints, I worked as an offensive security engineer for a long time, then moved to the defensive side, then into R&D and creating products, and then to the startup. I'm fluent in a lot of languages; English is not one of them, so I'm sorry if I'm making mistakes. And you can find me around CNCF a lot.

Hello everyone, I am Arie Haenel, principal engineer at Intel, where I lead an offensive security research team called Assert. I have 25 years of professional experience in different domains of software and security engineering, and I teach software security at the Jerusalem College of Technology. At Intel, my team primarily conducts proactive research and has different modes of operation, from long engagements to focused efforts on various kinds of technologies. While some of the targeted efforts may be short term, it is crucial for the project team we are working with to be well prepared in order for us to deliver the best results. Our role is not to replace the standard security assurance practices of the product team, such as the secure development life cycle, the well-known SDL, but to complement them by providing an external attacker's perspective. But to be effective, we aim to ensure that any bugs that could, and in fact should, have been detected by automatic tools have been addressed; we refer to these as low-hanging fruit. The challenge we face is that when searching for this rotten low-hanging fruit using most automatic scanners, we must sift through a large number of other fruits also reported by these tools, many of which are not actually relevant to the project. Scanners report issues based on rules, and those rules may return alerts that are not applicable to the specific project.
Additionally, it is important to consider the severity and risk level of the reported alerts: some may be critical, while fixing others may provide only minimal protection against a remote risk. While we advocate for defense in depth, prioritization is essential in real-world scenarios. In order to effectively identify and address security vulnerabilities in our Kubernetes-based systems, we needed a tool that could not only discover issues, but also categorize and prioritize them, provide explanations, and even suggest remediation steps. After evaluating various options, we found that Kubescape was a good fit for our needs. We have since worked closely with the Kubescape team to further improve the tool and incorporate new features, such as the ability to focus on identifying the most critical areas of the cluster from an attacker's perspective. My team is not the primary customer of this tool, and it shouldn't be. However, it is important for us as a security team to ensure that such a tool is easily adoptable by developers and validation teams as part of their workflow. This way, it can be used to catch and report potential vulnerabilities and weaknesses in the cluster in a way that is easy to understand and actionable for non-security professionals.

So, a few words about Kubescape. Kubescape was born as a Kubernetes security posture tool, which gives you both compliance and security posture information. The project started a year and a half ago as a side project at Armo, and this is the first time I'm attending a CNCF event where I can say publicly that we became a sandbox project: Armo contributed Kubescape to CNCF. Today it gives you two major kinds of information. One is configuration scanning: scanning through your Kubernetes configuration, in other words the API server, the objects, and so on, in order to find different security issues.
The other is vulnerability scanning: we enable you to scan for vulnerabilities inside your cluster and see posture information about your container images. Now, the interesting thing, and one of the things which was very helpful for Intel, as you could hear from Arie, is the ability of Kubescape to integrate not just into your cluster and its live information, but also into your development workflows, through VS Code integration, GitHub integration, and so on. So DevOps and dev engineers can already apply these things during their work, and not just get to production and then find out about issues. This is a very, very helpful point.

Now, the configuration scanning part of Kubescape is based on what we call controls. Think about them as tests: every control is a test, and it tests a single property, I would say, of Kubernetes objects. I'm simplifying to make it easier to explain; it can be more complex, because a control can take into account contextual information from multiple Kubernetes objects. It is based on Rego from the OPA project, and we define these controls in Rego. Every control has its severity, and every control covers a single thing: whether the container inside the pod is running as root, whether the service account token mounted into the pod has actual RBAC privileges attached to it, whether the workload, in this example, has a critical vulnerability inside, and so on. So we have this concept of controls, and the Kubescape output shows you which resources failed on which controls, so you can either fix them or decide that you're ignoring them.
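To make the idea of a control concrete, here is a minimal sketch in Python. It is not how Kubescape implements controls (the talk says those are written in Rego); the `Control` class, the `runs_as_root` check, the `scan` helper, and the severity value used with them are all hypothetical names and numbers chosen for illustration. The root check is also deliberately simplified: it treats a container as potentially root unless `runAsNonRoot` is true or `runAsUser` is set to a non-zero value.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Control:
    """A control is a single test with a severity (hypothetical structure)."""
    name: str
    severity: int                      # assumed numeric scale, not Kubescape's
    failed: Callable[[dict], bool]     # returns True when the workload fails


def runs_as_root(workload: dict) -> bool:
    """Simplified check: True if any container may run as root."""
    pod_spec = workload.get("spec", {}).get("template", {}).get("spec", {})
    pod_ctx = pod_spec.get("securityContext", {})
    for container in pod_spec.get("containers", []):
        # Container-level settings override pod-level ones.
        ctx = {**pod_ctx, **container.get("securityContext", {})}
        if ctx.get("runAsNonRoot") is True:
            continue
        if ctx.get("runAsUser", 0) == 0:
            return True
    return False


def scan(workloads: list[dict], controls: list[Control]) -> dict[str, list[str]]:
    """Map each workload name to the names of the controls it failed."""
    return {
        w["metadata"]["name"]: [c.name for c in controls if c.failed(w)]
        for w in workloads
    }
```

Calling `scan(workloads, [Control("runs-as-root", 7, runs_as_root)])` on a list of parsed Deployment manifests would return, per workload, the list of failed control names, which mirrors the "which resources failed on which controls" output described above.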
But the main problem is that, in the case of Intel for example, they're running a cluster of nearly 100,000 workloads, and running a single scan and taking in its output is really, really hard. It's overwhelming, and the security team wants to know which are their most risky workloads, to understand what they have to fix before the others. Therefore, we started this project. I'm going to explain the concept and the algorithm now, and then show the results.

So again, we have these things called controls, and we want to get to a state where we can prioritize the output and show you, for each workload, which one is the most risky and which one you have to fix first. Building something in between the input and the output has to take different things into account, and the way we decided to do this is what we call attack chains. What we have done is model these controls into a framework we've created, trying to show which controls affect others, and whether two controls failing on the same resource create a bigger effect, a bigger risk, than each of them alone.

In order to do that, we've taken Microsoft's work on Kubernetes security. I think two researchers at Microsoft released this framework about two years ago. They took the MITRE ATT&CK framework, which I guess most of you already know, and adapted it to the Kubernetes world. They took the same categories of initial access, execution, persistence (I'm not going to read everything out for you), the original MITRE categories, and put different Kubernetes issues under each of those categories.
We decided to categorize the controls we have in Kubescape under these categories, and that brought us to our next step, which is creating a graph of these categories, where we see that one category can affect another. For example, look at the initial access category: the different issues under it tell you whether the attacker has access to the container. If the attacker has access to the container, it might gain execution, which is another category, so one category can lead to another. If the attacker has execution on a container, it can try to escalate its privileges beyond the container, or try to hack into the kernel, and so on. If the attacker has execution, it can try to access credentials inside the container, or credentials which are bound to the container. If the attacker has execution, it can do discovery of the environment and lateral movement, and so on.

So we created this graph of categories and how they affect one another, and we said that this enables us to create a sequence of a potential attack inside the container. If Kubescape sees an NGINX controller which can be accessed from the public internet, this workload fails in the initial access category. If it has a critical vulnerability which is a remote code execution vulnerability, the attacker can gain execution. If the same workload, which again is an NGINX controller, needs access to the Kubernetes API and therefore has an access token with a role attached to it, then it is vulnerable to API access, and so on. If the container runs as a root user, the attacker has more potential to exploit different bugs in the kernel.
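The category graph and the way failing categories line up into potential attack sequences can be sketched roughly as follows. This is an illustration under stated assumptions, not Kubescape's implementation: the edges in `CATEGORY_GRAPH` are only the ones mentioned in the talk, the category identifiers are hypothetical, and `find_chains` is a made-up helper that walks the graph from initial access through failing categories only.

```python
# Assumed edges: "a failing category on the left can lead to those on the right".
CATEGORY_GRAPH = {
    "initial_access": ["execution"],
    "execution": ["privilege_escalation", "credential_access", "discovery"],
    "privilege_escalation": [],
    "credential_access": ["impact_k8s_api"],
    "discovery": ["lateral_movement"],
    "lateral_movement": [],
    "impact_k8s_api": [],
}


def find_chains(failed: set[str],
                graph: dict[str, list[str]] = CATEGORY_GRAPH,
                start: str = "initial_access") -> list[list[str]]:
    """Return every maximal path from `start` that visits only failed categories.

    A lone failing start category is returned as a one-step chain; callers
    that want multi-step chains only can filter on length.
    """
    if start not in failed:
        return []
    chains: list[list[str]] = []

    def walk(path: list[str]) -> None:
        successors = [n for n in graph.get(path[-1], [])
                      if n in failed and n not in path]
        if not successors:
            chains.append(path)      # nothing further fails: the chain ends here
            return
        for nxt in successors:
            walk(path + [nxt])

    walk([start])
    return chains
```

For the NGINX controller example, a workload failing in initial access, execution, privilege escalation, credential access, and Kubernetes API impact would yield two chains: one ending in privilege escalation and one ending in impact on the Kubernetes API.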
So if we take this and put it on our graph, you will see all these red categories, which are the categories failing in a given workload, and this shows that we can build up chains. We can say that since there is initial access, execution, and privilege escalation (because this was a root container), the attacker has a potential attack chain. I'm not saying that it's 100% exploitable, because there are other factors, but for the sake of prioritization, this is a potential attack chain. That is the first chain. The second chain can be initial access, execution, credential access, and impact on the Kubernetes API, because the attacker has this flow inside the workload.

So, defining this algorithm more formally: for every workload, Kubescape has already calculated the controls. It assigns each failed control to the graph, calculates the potential chains inside this graph, and for each chain calculates a score based on the severity of each control that failed on it. Then it sums up all the chains together, and this is the priority score of the workload. At the end, it can take all the workloads together and create a priority list based on this score. So again, looking at the scores: the first chain had three failed controls, and each control has its own severity. We multiply the severities of the controls together, and that gives the chain one score. The chain two score is likewise the product of the severities of each control it failed on. Then the priority score of the workload is chain one plus chain two.
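The scoring step described here (multiply the severities along a chain, sum the chain scores per workload, sort the workloads) can be written down as a short Python sketch. The function names and the severity numbers in the usage note are illustrative assumptions; the talk does not give Kubescape's actual severity scale or code.

```python
from math import prod


def chain_score(severities: list[int]) -> int:
    """Score of one chain: the product of the severities of its failed controls."""
    return prod(severities)


def workload_score(chains: list[list[int]]) -> int:
    """Priority score of a workload: the sum of the scores of all its chains."""
    return sum(chain_score(chain) for chain in chains)


def prioritize(workloads: dict[str, list[list[int]]]) -> list[str]:
    """Workload names ordered from highest to lowest priority score."""
    return sorted(workloads,
                  key=lambda name: workload_score(workloads[name]),
                  reverse=True)
```

With made-up severities, a workload whose first chain has severities `[7, 9, 6]` and whose second chain has `[7, 9, 8, 5]` scores 378 + 2520 = 2898, so it would rank ahead of a workload with a single chain `[7, 9]` (score 63). One consequence of multiplying is that longer chains of severe findings dominate the ranking, which matches the intent of surfacing workloads where several weaknesses combine.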
So now Arie is going to demo the results he got using this.

In the last few months, we have evaluated the usefulness of Kubescape in the various stages of development across different teams. The feedback so far has been very positive. The findings, reported by Kubescape in a user-friendly manner, should help them integrate it into their workflow and focus on the most important areas of potential risk. As you can see on this slide, different types of output formats are available. The Kubescape report showed different namespaces with different misconfigurations, along with a direct link to the documentation explaining each issue and examples of remediation steps to fix it. I clicked on one of the links in the report regarding a vulnerability with a cryptic name. I was taken to a page with the vulnerability description, additional resources, and even recommended remediation steps. Very convenient, isn't it?

As we previously mentioned, there can be a lot of noise when it comes to security threats. And even after categorizing and filtering out the less relevant ones, the product team may still want to prioritize their effort. In this video, we demonstrate the use of Kubescape on a large namespace. Despite the vast amount of information and useful details, we still found it necessary to prioritize, because the initial report can contain a lot of noise. So here we focus on one of these workloads, a database one. In this view, as a user, I try to understand what attack can be mounted by exploiting the vulnerabilities found in a specific workload. In the bottom part, I'm presented with an attack tree. It describes a full chain of exploitation using different findings, all regarding this very namespace.
This makes the work of explaining the possible consequences of these vulnerabilities much easier, especially when you have to justify more work from the development and validation teams to management. This is an example of how Kubescape prioritizes the workloads in the namespace I've been working on in the previous slides. The list shows the workloads that Kubescape suggests fixing first, as they are considered to be the most risky. As you can see, my two database instances are prioritized ahead of the others because they both have persistent volume claims connected and multiple replicas. Additionally, the controller manager, which is fourth on the list, has access to the Kubernetes API. This prioritization order provides guidance for the development team on where to start fixing issues within the namespace. Although this feature is still in the alpha phase, its value is already evident. This is different. At the end, you can access a comprehensive list of what has been tested, the vulnerabilities found, their severity, and a risk assessment based not only on the vulnerability, but also on its context. This is important because some potential vulnerabilities may not be relevant in the specific context of this namespace, and especially when the project is on a tight schedule, it's crucial to be able to prioritize your resources. It's not perfect yet. There are still many areas that need improvement, and we're working with the maintainers to develop new features to make Kubescape even better. The maintainers are very responsive: for example, the last time I reported an issue to Ben, a new version was released with a fix the very next day.
So, to wrap this up: from the beginning, we set ourselves a very hard goal of creating a good prioritization engine which will really help you understand which are your most critical workloads. We are still in the early steps, and we are working with TAG Security to try to align this concept of attack chains with the other CNCF projects and to standardize it. But what we've seen at Intel, and the enthusiasm the Kubescape team got from Intel, made us think that it's a very promising start. As Arie said, we still have a few issues to work out and a few missing parts. For example, vulnerability information is not really connected today to this prioritization engine. I can tell you that even without it, the engine is already giving valuable input, and hopefully we're going to work out this issue of missing vulnerability information in the next month or two, so you will have this information as part of the prioritization. We are also missing a lot of testing and feedback, and therefore I have a call to action here: we are looking for input and feedback on this, and also for early adopters who would be ready to test it. Although this is already in the main release of Kubescape, it is hidden right now under a flag, so it is not part of every report. If someone wants to try it out, I would ask him or her to reach out to us, the maintainers.
On the CNCF Slack you can find the Kubescape and Kubescape dev channels, where you can always reach us. We are really looking for input from all of you: whether you are a vendor, a user, or just a security enthusiast, we'll be happy to get any feedback from you, and also ideas on how to make this even better. I can tell you that this week we also started to show it to another company. I don't want to disclose their name because it's too early, but they also started to use it, and they were very, very happy about their first results. So, we started with this picture. I can assure you that Kubescape cannot tell you which one is the real Waldo here, and therefore my punchline is that there is no punchline. But I think that this is a really great project, and if you want to contribute to it, you are more than welcome. Please rate this talk. Thank you very much, and if you have questions, I'm here.

So, good question. Kubescape controls work either on live Kube API data or on YAML files. This is just a tool, and you can feed it from both directions, so you can already do this prioritization on your Helm charts in your Git repository and get a view of how they relate to one another. Honestly, if you are scanning a lot at once, you will get this prioritization, but I think it's more interesting inside the cluster, because there you have all the workloads together and you can compare them to each other. Still, looking at the attack chains and the attack tree itself can also be valuable offline, even before you deploy anything in your cluster. As of today, the only two kinds of data which go into this engine are the Kubernetes objects themselves and vulnerability information, which is lagging a little bit, but it's going to be there.
Anything else? One, two, three. Thank you.