Hello everyone, I'm Yossi Weizman, a Senior Security Researcher at Microsoft, and with me is Ram Pliskin, a Principal Security Research Manager at Microsoft. In this session we will talk about the threat matrix for Kubernetes, which is a knowledge base of the security threats that target Kubernetes. This matrix is one of the first attempts to systematically map the threats to Kubernetes, and we will see how we can use it to improve the security of our environments and measure our security coverage against potential attacks. Ram will start with some background on this project, so Ram, please, the stage is yours.
Thanks, Yossi. Hi everyone, I'm Ram. I lead the security research team for Azure Defender at Microsoft. Please allow me to outline what you can expect to get from today's talk. First, we will introduce the problem space and what led us to focus on the orchestration layer for containers. Second, the threat matrix for Kubernetes, which is the outcome of our research into the Kubernetes threat landscape. We will also go over an attack we witnessed and unfold it using the matrix. Then we will present an example of how organizations can leverage this knowledge base and harness it to better secure their Kubernetes workloads. Last, we will discuss the collaboration with the folks from MITRE to establish MITRE ATT&CK support for containers, and we will wrap up by highlighting the differences between the two matrices.
To set the stage, we will start by introducing the problem space and how we got into it. So before Kubernetes was a thing, before Docker was a thing, folks were running distributed systems largely either on bare metal or in virtual machines. When containerization started to take off, it provided a way to get consistent and repeatable deployments. With Kubernetes being so widely adopted, around two years back we were tasked with building a plan to protect users running Kubernetes workloads. That's basically how our journey started.
With Kubernetes built as an extremely robust orchestration layer, we knew we would need to adjust our security perspective. Normally, as we are natively integrated into the Azure backbone, we tend to leverage any internal signal we can get to strengthen our security offerings. But with Kubernetes being platform agnostic, we aligned on the core only and shifted our focus toward the Kubernetes layers. Let's linger for a second on the diagram on the right. We split it into three different levels. Starting at the bottom, the cloud layer, which is often referred to as the platform layer, consists mainly of the control plane of the platform and the nodes' operating systems. Both of these fronts already have an extensive knowledge base, mainly the MITRE ATT&CK Enterprise matrix. So we knew we would have to dive into the Kubernetes APIs, both the ones that serve the users and also the southbound-facing ones, toward the pods themselves or internal Kubernetes services. So our goal was to map threats targeting Kubernetes. And when we started looking into the Kubernetes landscape, we discovered a variety of attacks, each of which touches a very different aspect of Kubernetes' built-in mechanisms. We figured we should map those different areas so we could keep track of the interfaces we would like to explore further. And with MITRE ATT&CK being largely embraced by the community as the go-to place for learning about and evaluating security coverage, we figured we should leverage the ATT&CK matrix structure. And so we started defining our own Kubernetes-centric TTPs. This later turned into Yossi's great publication from April 2020, the first threat knowledge base for Kubernetes. Later in this talk, we will also present the outcome of partnering with MITRE, which resulted in the release of the ATT&CK matrix for containers.
Thanks, Ram. So as Ram said, we saw that there is a gap when it comes to understanding the unique security threats that target the orchestration level of Kubernetes.
And that brought us to build the threat matrix for Kubernetes, which focuses on this specific layer. In this slide, you can see the second version of the matrix, which we released earlier this year. Like MITRE, we split the matrix into tactics, which are the columns in the table, in dark blue. In each tactic there are techniques, which are the specific methods that attackers might use. We won't go over all the techniques right now, of course, because there are many. You can read the full matrix at the link below. But let's see three examples: one is access managed identity credentials, the second is CoreDNS poisoning, and the third is images from a private registry.
The first one is access managed identity credentials. In some cloud providers, you can allocate identities to cloud resources like virtual machines. For example, in AWS you have EC2 roles. In Azure you have managed identity, or managed service identity in its former name. You can access the token of the identity from the virtual machine itself, and also from the containers that run on that virtual machine. So if attackers gain access to a container, they can potentially grab the token of the identity that is attached to the underlying node. With this token, they can later access other cloud resources, depending on the permissions of this identity, of course. By the way, in some cloud providers you can prevent containers from accessing this identity token and basically mitigate this threat.
Another example is CoreDNS poisoning. The technique is about poisoning the cluster's DNS, not by traditional DNS poisoning techniques, but by modifying the Corefile, which is the configuration file of CoreDNS. This configuration is stored as a ConfigMap in the cluster. So if attackers have access to this ConfigMap, they can poison the DNS in the cluster. You can see that it's not a traditional DNS poisoning that affects any service. This is a unique threat that targets Kubernetes specifically.
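As a minimal sketch of how a defender might watch for the CoreDNS poisoning described above, the Python below compares the current Corefile text against a known-good baseline and reports any added lines. It assumes the Corefile has already been read from the CoreDNS ConfigMap by some other means. The function names, the sample Corefiles, and the domain names are hypothetical illustrations and were not part of the talk.

```python
import hashlib
from typing import List


def corefile_fingerprint(corefile: str) -> str:
    """Return a SHA-256 hash of the Corefile, ignoring blank lines,
    comment lines, and leading/trailing whitespace on each line."""
    lines = [ln.strip() for ln in corefile.splitlines()]
    meaningful = [ln for ln in lines if ln and not ln.startswith("#")]
    return hashlib.sha256("\n".join(meaningful).encode("utf-8")).hexdigest()


def added_lines(baseline: str, current: str) -> List[str]:
    """Return non-empty lines present in `current` but not in `baseline`."""
    known = {ln.strip() for ln in baseline.splitlines()}
    return [
        ln.strip()
        for ln in current.splitlines()
        if ln.strip() and ln.strip() not in known
    ]


# Hypothetical known-good Corefile (illustrative only).
BASELINE = """.:53 {
    errors
    health
    kubernetes cluster.local in-addr.arpa ip6.arpa
    forward . /etc/resolv.conf
    cache 30
}"""

# Hypothetical tampered Corefile: a rewrite rule redirects a registry name.
TAMPERED = BASELINE.replace(
    "    cache 30",
    "    rewrite name registry.example.com attacker.example.net\n    cache 30",
)

if corefile_fingerprint(TAMPERED) != corefile_fingerprint(BASELINE):
    for line in added_lines(BASELINE, TAMPERED):
        print("unexpected Corefile line:", line)
```

A real deployment would more likely alert on write operations to the ConfigMap itself; the diff here only illustrates what a poisoned configuration could look like once it has changed.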
Another example is images from a private registry, which is about how attackers might steal images from private registries once they have access to a container in the cluster. For example, they can leverage the cloud managed identities of the worker nodes that we talked about earlier, because this identity often has permissions to pull images from the registry that is used by the cluster.
So besides being aware of the various threats to Kubernetes, what else can we do with the matrix? The answer is that we can use it to measure our coverage against real-world attacks. Let's see how we can do it. Here is an example of a very simple attack that happened earlier this year and targeted Kubeflow. Kubeflow, which some of you are probably familiar with, is a very popular framework for machine learning tasks that runs on top of Kubernetes. I think there are also some sessions at this conference about Kubeflow. Some of the functionality of Kubeflow is exposed via CRDs and some via a centralized dashboard. In some configurations, this dashboard doesn't require any authentication. In such configurations, if the dashboard is exposed externally to the internet, it basically allows free access to a management interface of Kubeflow, which is obviously a bad thing. Earlier this year, we saw a large-scale campaign that targeted internet-accessible Kubeflow workloads. The attackers used exposed dashboards to deploy a malicious pipeline. Kubeflow Pipelines is a service of Kubeflow that allows users to create machine learning pipelines, and it is based on Argo Workflows. Here you can see that by using the dashboard, you can simply create a new pipeline, and that's what the attackers did. Using pipelines, the attackers deployed crypto mining containers. Those containers used a legitimate TensorFlow image from Docker Hub. To increase their gains, the attackers also leveraged the fact that in some cases Kubeflow workloads use GPUs for machine learning. Here you can see the attack flow.
The attackers accessed the Kubeflow dashboard. Using the dashboard, they deployed a new pipeline, and the pipeline controller eventually created a new container. The container, which ran a legitimate TensorFlow image, fetched a crypto miner from GitHub and ran it. You can read the full report at this link.
So we saw an example of an attack. How can we now use the threat matrix to measure our coverage against this attack? Step one is to mark the relevant techniques. Here, in red, we marked the relevant techniques for this attack. The attackers used an exposed sensitive interface, which is the Kubeflow dashboard in this case, so this is the initial access. The attackers used containers for execution and persistence. They used legitimate names to hide their activity, so this is defense evasion by pod or container name similarity. As we said, they leveraged the pipeline controller's permissions to deploy a new container, so this is the container service account. And the impact is resource hijacking, which is the cryptocurrency mining. So we marked the relevant techniques.
Step two is to evaluate our coverage of the techniques that we just marked. For exposed sensitive interfaces, we can monitor the exposure of services to the internet. For example, we can monitor the creation of LoadBalancer services. Ingress objects are also relevant. In OpenShift, it could be Routes, et cetera. I really want to emphasize this point. A very large portion of the attacks that we see against Kubernetes clusters start with an exposed sensitive interface. It affects everyone, literally everyone, from small organizations and individuals to huge organizations with very large information security departments. Misconfiguration in Kubernetes is a serious problem. It is absolutely crucial to monitor your environment for such misconfigurations and, when you find one, to act immediately, because once it happens, your cluster will become compromised very quickly. So that was initial access.
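The exposure monitoring described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the input is a list of Kubernetes objects already loaded as dictionaries (for example, from exported manifests), and the function name and sample objects are hypothetical. It flags LoadBalancer services and Ingress objects as candidates for review; whether a flagged object is really reachable from the internet still has to be checked in the environment.

```python
from typing import Dict, List, Tuple


def find_exposure_candidates(objects: List[Dict]) -> List[Tuple[str, str, str]]:
    """Return (namespace, name, reason) for objects that may expose a
    workload outside the cluster and deserve a manual review."""
    findings = []
    for obj in objects:
        meta = obj.get("metadata", {})
        namespace = meta.get("namespace", "default")
        name = meta.get("name", "")
        kind = obj.get("kind", "")
        if kind == "Service":
            # A Service without an explicit type defaults to ClusterIP.
            svc_type = obj.get("spec", {}).get("type", "ClusterIP")
            if svc_type == "LoadBalancer":
                findings.append((namespace, name, "Service type LoadBalancer"))
        elif kind == "Ingress":
            findings.append((namespace, name, "Ingress object"))
    return findings


# Hypothetical sample objects (illustrative only).
SAMPLE_OBJECTS = [
    {"kind": "Service",
     "metadata": {"name": "dashboard", "namespace": "kubeflow"},
     "spec": {"type": "LoadBalancer"}},
    {"kind": "Service",
     "metadata": {"name": "db", "namespace": "prod"},
     "spec": {"type": "ClusterIP"}},
    {"kind": "Ingress",
     "metadata": {"name": "web", "namespace": "prod"},
     "spec": {}},
]

for namespace, name, reason in find_exposure_candidates(SAMPLE_OBJECTS):
    print(f"review {namespace}/{name}: {reason}")
```

The same check could be extended to other exposure paths, such as the OpenShift Routes mentioned above, by adding another `kind` branch.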
For execution and persistence, we should monitor new containers. That means monitoring the images that are used, monitoring entry points and arguments, and monitoring configurations such as privileges, capabilities, volume mounts, et cetera. For name similarity, we can monitor the images in our workloads and make sure that only known and legitimate images are running. We once observed an attack on a Kubernetes cluster, not this attack, another one. The attacker used a malicious container as well. We notified the victim and told them, hey, your cluster is compromised, you should be aware. And they actually argued with us and told us that these were their end-to-end testing containers and it was fine, it was legitimate. We showed them that the container ran a malicious image and that the attackers had just used a legitimate name from their environment. So it's really important to monitor those things. For container service accounts, we should monitor suspicious operations by service accounts. This can be done by using the Kubernetes audit log or, as some people call it, Kube audit. This is a very powerful tool that gives you great visibility into what happens in the Kubernetes control plane. You can see basically all the operations in the cluster: who did it, when it happened, from which IP, and more. So it's very useful when we want to monitor the activity in the cluster. For resource hijacking, we can monitor at two levels. At the orchestration level, we can once again monitor the containers, images, entry points, arguments, and also exec commands. At the node level, we can monitor the running processes, memory consumption, CPU consumption, et cetera. So this was an example of how we can use the threat matrix to measure our coverage against a real-world attack. Now Ram will speak about MITRE's project of adding coverage for containers to MITRE ATT&CK, and about the differences between MITRE ATT&CK and the threat matrix. Ram, please.
So let's talk a bit about the MITRE matrix for containers.
So in December 2020, based on community interest and, if I may add, with some inspiration from Microsoft's threat matrix, MITRE started to work on an ATT&CK matrix for containers. And in April 2021, the containers matrix was added in ATT&CK v9. Okay, so if up until recently we had no solid source of information for learning about possible attacks targeting Kubernetes, now we have two great sources. So let us pause here for a moment to highlight why these matrices are different. First, ATT&CK is focused on in-the-wild adversary behaviors, meaning that MITRE considers a TTP only when it has already been witnessed. Their investigation is focused specifically on gathering intelligence about what adversaries are actually doing, versus what researchers and red teamers can do. Clearly, our motivation is different. We are research oriented, and so we do not have the luxury of waiting for the community to share real-world attacks. The second main difference is that the ATT&CK matrix is built on existing techniques. MITRE's motivation was to leverage existing TTPs. So in case there was an equivalent technique, for example from the Linux matrix, they preferred to supplement it with a container tag, and therefore names and descriptions are oftentimes different as well. Also, in a few cases, I think we had some disagreements. Each stakeholder had a different interpretation of a certain activity, and that also led to a few misalignments. With that being said, it's important to also highlight the core similarities. Please note that each matrix combines adversary behavior techniques for both the orchestration level and the container level. Secondly, both matrices can be considered an abstraction layer. For example, if an adversary installs a cron job within a container, that behavior would be covered by the Linux matrix instead of the containers matrix.
To learn more about the evolution of the matrices and the process behind building them, please read the joint publication by MITRE and Microsoft. At this point, let's recap the attack that Yossi just laid out and see how it is reflected in each of the matrices. We can use the arrows to map between the techniques. From the initial access tactic, exposed sensitive interfaces is mapped to External Remote Services in MITRE's matrix. From the execution tactic, new container is mapped to Deploy Container on MITRE's side, and pod or container name similarity is mapped to Masquerading. And resource hijacking actually remained the same. The discrepancy comes from the backdoor container technique and the container service account. Let's look into the backdoor container technique. Attackers can run their malicious code inside a container within a cluster, and by using Kubernetes controllers such as DaemonSets or Deployments, the attackers can ensure that a constant number of their malicious containers will be running. We chose to represent this technique under both the execution and persistence tactics. For execution, it's obvious why it fits there. For persistence, we mainly referred to the nature of Kubernetes controllers, which will make sure the attacker's payload keeps running. MITRE chose to have this technique only under execution. Another discrepancy was the container service account. According to MITRE, lateral movement hasn't been observed in real-world attacks on containers, so they decided not to include it.
To wrap up, I will share my key takeaways. First, I hope we demonstrated the importance of building a knowledge base and always continuing to challenge it. We also showed how organizations should measure their existing capabilities against it. Number two, defenders should close ranks. Microsoft's collaboration with MITRE is only one instance of how defenders and blue teamers across the industry can join hands.
It's crucial against the challenge we are facing. I just wanted to call it out. With that, thank you. Back to Yossi.
Thank you, Ram. So, my final words. Kubernetes is evolving in all aspects, including security. Kubernetes today is much more secure than it was several years ago, or even last year. But as Kubernetes gains more and more popularity, it also gets more and more attention from attackers. So the threats are also evolving. That's why it's really important to stay up to date. We also update the matrix over time to keep pace with those changes and new threats. We already released a second version of the matrix this year, and we plan to release more versions. When we released the first version of the matrix last year, we were very excited to see how the security community and also the Kubernetes community adopted it. It showed us that there was a real need for such a knowledge base, and we were very happy about it. We really hope you enjoyed the session and understood what the threat matrix is and how you can use it to make your organization more secure. Here are some useful links, which are the links that appear in the deck. So thank you very much.