All right, thanks for the introduction. So I'm Leonard Cohen. I work at Edgeless Systems as a security software engineer. A fun fact about Germany: "engineer" is quite a protected job title there, so I can call myself an engineer because I actually have an engineering certificate. And I'm going to talk about how we secure Kubernetes traffic with WireGuard encryption.

First of all, I wanted to pose the question: why do you encrypt? If you ask people this question, you get essentially two kinds of answers. You either have people saying, our company requires us to do it, or a government agency tells us to do it — for instance, if you are in a regulated industry such as finance or health care. Or you have people citing zero trust, which came up a lot today; defense in depth is another keyword. But in the end it boils down to: I don't want to leak any sensitive traffic. And of course, compliance should somewhat enable security — whether that's really the case is another talk. If you go into the USERS.md of the Cilium repository, at least 10% of adopters say they use Cilium encryption, and a lot of them just say they use Cilium generally, so I imagine the percentage is a bit higher.

So why do we encrypt at Edgeless Systems? Why do we use Cilium? We don't trust the cloud provider. We don't trust the infrastructure, the network, the nodes, and so on and so forth. So let's do a bit of threat modeling. For us, this is how we see the world: threats everywhere in the cloud, ranging from the connection to the Kube API, to the nodes themselves, and of course data in transit and in storage. So how do we go about securing against all these threats? We need to secure data in use, at rest, and in transit. I'm just going to go briefly over how we solved the first two points.
Data in use is, I guess, the most interesting point. What we do is we use confidential computing technology, like Intel TDX and AMD SEV, to shield your node from the cloud provider. The hypervisor can't even access the node's memory, because it's encrypted and isolated, and you can also verify that your node is running inside a trusted execution environment. I don't know if you've heard of the Confidential Containers project — they also have a talk at this KubeCon and at the last KubeCon, and they've been around for a few years. It's a similar technology, but those guys put single pods or single containers inside a trusted execution environment, while we put the full node inside the trusted execution environment. That's the difference.

Securing data at rest is sort of a solved problem. The key difference between what we do at rest and what some of you might do, if you trust, for instance, a cloud provider, is that you might say: AWS, please encrypt my disk — choose a key and encrypt it. What we do instead, for instance with the disks, is mount them with dm-crypt and dm-verity, with a key that lives only inside our trusted execution environment, which itself, of course, is not available to the cloud provider. So we encrypt within our trusted execution environment, and thereby extend our security boundary.

And how do we secure data in transit? This is the most interesting point, which I'm going to expand on. For the Kubernetes components themselves — your kubelets and so on — you have your TLS configuration, which you should all have done. And for the workloads themselves, we use Cilium; how we do that is what I'm telling you about. So this is Constellation, our Kubernetes distribution.
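The at-rest setup described above can be sketched roughly like this. This is a hedged illustration, not Constellation's actual implementation: the device path and the key file location are assumptions, and the point is only that the key never leaves the trusted execution environment, so the cloud provider only ever sees ciphertext on the block device.

```shell
DISK=/dev/sdb              # state disk provided by the cloud provider (assumed name)
KEYFILE=/run/tee/disk.key  # key material held inside the TEE (hypothetical path)

# Format the disk as a LUKS2 volume using the TEE-held key
cryptsetup luksFormat --type luks2 "$DISK" "$KEYFILE"

# Open it; everything written to $DISK from now on is ciphertext
cryptsetup open --key-file "$KEYFILE" "$DISK" state_disk
mkfs.ext4 /dev/mapper/state_disk
mount /dev/mapper/state_disk /var/lib/state
```

Because the key is generated and kept inside the TEE, even a hypervisor-level attacker reading the raw disk gets nothing useful.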
What's also interesting about how we use Cilium is that we need to run on all cloud providers, and even on OpenStack and everything else our customers want to run Kubernetes on. We have heard about Cilium being cloud agnostic, and this is what we leverage here. Why we chose Cilium — I guess these are the main points, and why we are all sitting here, I think: it's fast, observability is great, it has great policy features, a proven track record, a strong open source community, and of course it has recently graduated as a CNCF project.

Let me go briefly over nomenclature and the traffic types you encounter in your Cilium network, just so we are all on the same page. You could have pods talking to each other, both of them in the pod network. You could have a pod talking to a host pod, or the other way around, a host pod talking to a pod in the pod network. And of course host pods talking to each other — host pods being essentially the same as the nodes talking to each other. Those are the endpoints, and we name the traffic after the networks the endpoints are in: if it comes from the pod network and goes to the pod network, it's pod-to-pod traffic, and so on and so forth — pod-to-node, node-to-pod, node-to-node. Of course, pod-to-node and node-to-pod are essentially the same traffic. If you use encryption, these are the key aspects you want to be aware of: the first case is only covered by pod-to-pod encryption, and all the rest are generally covered by node-to-node encryption.

And maybe let's compare. Cilium, of course, offers IPsec and WireGuard. There used to be an IPsec node-to-node implementation, but it was unmaintained; you can find the discussion in the issue and in the slides, which I've also uploaded, so you can read more about that.
So I guess if there's anyone willing to implement and maintain it, they are welcome. What this talk is about is strict pod-to-pod encryption, and I'm going into much more detail about why we need this. There's also something like strict node-to-node encryption being talked about currently; the issue is also linked there. This is also in comparison to most other CNIs: if you actually dive deep into their docs or into their implementation, most of them offer just pod-to-pod security. So if you have any node-level component talking to a pod, this will not be covered by IPsec or WireGuard.

Now we have a little quiz, and the question I want to pose is: is this traffic encrypted? For that, I'm going to switch over here. This is how I installed Cilium: I enabled encryption — we use WireGuard encryption — and we don't use any node-to-node encryption for this demo. Let's go inside this cluster. What we have here is just a normal cluster running healthily. I'm going to go into the namespace I've prepared; we have some host pods and some pods in the pod network. What I'm going to do is ping the pod on the second worker node from the first pod. And how am I actually going to observe whether this is encrypted? For this, I'm using just a basic tcpdump, for simplicity reasons.

So please raise your hand if you think this traffic will be encrypted — remember, I've installed Cilium with encryption enabled. This here is just some background noise; I'll go over the ping. What do we see? We see that the ping leaves the pod interface and goes out over the WireGuard interface. So this traffic is actually encrypted. Now, a similar case.
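The demo setup above looks roughly like this. The Helm values follow the upstream Cilium chart, but verify them against your Cilium version; the pod name and target IP are placeholders for the demo pods.

```shell
# Install Cilium with WireGuard pod-to-pod encryption enabled
cilium install \
  --set encryption.enabled=true \
  --set encryption.type=wireguard

# On the source node, watch which interface the traffic egresses on.
# Encrypted traffic leaves via the WireGuard device (cilium_wg0);
# plaintext tunneled traffic would leave via cilium_vxlan instead.
tcpdump -n -i cilium_wg0 icmp &

# Ping a pod on the other worker node from the first pod
kubectl exec -it pod-1 -- ping 10.0.2.42   # placeholder pod IP
```

Seeing the ICMP packets on cilium_wg0 (and not on the wire in cleartext) is what the quiz answer hinges on.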
Again, we have the same setup, the installation is the same, but the one thing that is different — I will show you — is that someone let the operator crash, deployed it wrong, or something happened, so the operator isn't there. As a side note, since I wanted to scale my cluster in the future, I'm also using Cilium endpoint slices, so the endpoints are listed in a CiliumEndpointSlice, which is much more performant to consume. So: Cilium endpoint slices enabled, operator crashed, but pod-to-pod encryption is enabled.

Now I'm going to do the same again: ping the pod on the second worker node. This one — I'm going to ping it, and please raise your hand if you think this traffic will be encrypted. Raise your hand — okay. What we actually see is that it's still egressing the interface of the pod, but it's leaving via the cilium_vxlan interface, so it's just being tunneled and not encrypted. And if the cloud provider is sniffing on the wire between the nodes, the cloud provider can see the full traffic.

So why is this happening? These are the backup slides of what we saw. This is again just a reminder: the same traffic from pod 1 to pod 3, essentially. We saw it leaving the LXC interface either over the cilium_vxlan interface — in which case it was sent out unencrypted — or over the WireGuard interface. If we look at the eBPF code attached to the LXC interface of the pod, you have this call stack, and somewhere below you have your wg_maybe_redirect_to_encrypt function. What does this function do? It redirects the packet, maybe, to be encrypted. This is a simplified version — this is an even more simplified version — and what I wanted to point out is, here below on the redirect, it's redirected to the WireGuard interface. This is what we want to happen. But this is only the case if the destination has a key set. And where does the destination come from?
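The failure mode from the second quiz can be reproduced with a few commands. This is a sketch under the assumption of a default kube-system install with CiliumEndpointSlices enabled; the pod name is a placeholder.

```shell
# Stop the Cilium operator so that new CiliumEndpoints are never turned
# into CiliumEndpointSlices, and thus never reach other nodes' ipcache.
kubectl -n kube-system scale deployment cilium-operator --replicas=0

# Create a fresh pod; its encrypt key is now never propagated
kubectl run victim --image=nginx

# On the node of the sending pod: traffic toward the new pod now egresses
# the VXLAN tunnel in plaintext instead of the WireGuard interface.
tcpdump -n -i cilium_vxlan
```

With the operator down, the window in which traffic to new pods goes out unencrypted becomes effectively infinite, which is exactly what the demo exploits.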
The destination — the remote endpoint — is looked up in the ipcache. You can list the ipcache in the Cilium agent with cilium bpf ipcache list, and you'll see whether the encrypt key is set or not. For all the pod IPs here, for instance, we have an encrypt key set. But how do those values get into the ipcache? The values get into the ipcache when a new pod is created: the Cilium agent pushes the CiliumEndpoint to the Kube API, and if you have Cilium endpoint slices enabled, the operator consumes the CiliumEndpoints and puts out CiliumEndpointSlices. Only then are they consumed by the Cilium agent on the other node, and the ipcache is updated.

The full red path here is the critical time window in which traffic to the newly created pod will be sent out unencrypted. What's also important to mention: I let the operator crash in the second example just to stretch this critical window essentially to infinity — you saw the operator had been crashed for four days. But the critical window exists even with Cilium endpoint slices disabled, because you still have some delay between the endpoint being pushed and the ipcache on the other node being updated. This is of course not what you want if you have encryption enabled and a regulator saying you must encrypt your traffic — and quickly recall the threat model: we have someone sniffing on the wire.

Our solution was essentially a strict mode, which looks at the traffic in the eBPF programs either on the WireGuard interface or on the native network device, and does additional filtering there. You can use a filter to decide whether to send out this traffic or not. And how have we implemented this? This is a simplified version — this is an even more simplified version — and we essentially do a CIDR check.
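Checking the ipcache yourself looks like this. The command is the one mentioned above; the example output below is illustrative (IPs, identities, and exact column layout will differ per cluster and Cilium version).

```shell
# Inspect the ipcache from inside a Cilium agent pod. Entries for reachable
# pods should carry a non-zero encrypt key; a zero/missing key means traffic
# to that IP is NOT redirected to the WireGuard interface.
kubectl -n kube-system exec ds/cilium -- cilium bpf ipcache list

# Illustrative output:
#   IP PREFIX/ADDRESS   IDENTITY
#   10.0.1.23/32        identity=12345 encryptkey=1 tunnelendpoint=192.168.0.2
#   10.0.2.42/32        identity=23456 encryptkey=0 tunnelendpoint=192.168.0.3  <- would go out unencrypted
```

This is a quick way to audit, per destination, whether the datapath will actually encrypt before trusting that "encryption is enabled" cluster-wide.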
So if the destination and the source address are in the strict CIDR, this function decides, based on that, whether the packet is sent out encrypted. For instance, we could install Cilium like so, and if both the destination and the source are in the pod CIDR — you can choose any CIDR you want, for instance the pod CIDR — and we see this traffic wanting to leave the VXLAN interface, then we simply drop it.

I'm going to show you how that looks. Let's go over here. So what happens now? As I said, we have essentially the same setup as before: the operator is crashed. Let's go into the namespace, and we do exactly the same as the last two times — ping the pod on the other node, just to show you that it works. What we already see is that the ICMP response is not being received. But that alone could just mean that we leak traffic and the other node simply decided to drop it, which would of course still leak the traffic if it's being sent out. So just to prove to you that it's actually dropped: we see the traffic just leaving the LXC interface and not going over the VXLAN interface. So it's not sent out at all — it's not leaked in any way.

Now, this is actually an update: I looked at the Cilium repository this morning, and they decided to change the default. I just showed you that a decision is being made whether the traffic is sent over the tunneling interface or over the WireGuard interface, and there's a new pull request running here. Let's see if it's already merged — no, but it was opened 10 hours ago. So this is the improvised part of the talk. But the great news is that Cilium will be secure by default when you're using tunneling. Because what then happens, right?
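Enabling the strict mode described above looks roughly like this. The Helm value names follow the upstream chart for the WireGuard strict-mode feature targeted at Cilium 1.15 — treat them as an assumption and check them against the docs of the version you deploy; the CIDR is a placeholder for your pod CIDR.

```shell
# Install Cilium with WireGuard encryption plus strict mode: traffic whose
# source and destination fall inside the given CIDR is dropped unless it
# can be sent encrypted, closing the unencrypted window shown in the demo.
cilium install \
  --set encryption.enabled=true \
  --set encryption.type=wireguard \
  --set encryption.strictMode.enabled=true \
  --set encryption.strictMode.cidr=10.0.0.0/16   # placeholder pod CIDR
```

With this in place, a missing encrypt key in the ipcache results in a drop rather than a plaintext packet on the wire.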
What then happens is double encapsulation: you route over the tunneling interface, then over the WireGuard interface again, and only then does the packet go to the other node. That double encapsulation was, I think, one of the reasons why we had the other architecture before — but I don't know; you'd need to ask the maintainers why they decided to change the routing here.

So what about the CIDR filter now? I talked to the engineer who opened the pull request, and this feature will still land as an upcoming feature in 1.15, as of right now at least. Why you would still want the CIDR filter is if you use native routing. Of course, using native routing and WireGuard encryption on all your traffic is somewhat questionable, because you're not really routing natively — you're tunneling over WireGuard, which does the tunneling for you. But in this architecture you only encapsulate your packet once, not twice as in the other architecture. So there's still a problem that the strict mode solves in this architecture.

To wrap up: first of all, if this was news to you, carefully read the docs — this is documented for all the versions in which the old architecture applies. Think about using the strict mode or upgrading to 1.15 in the future, depending on which architecture actually fits your needs. If you are okay with double encapsulation, then you're fine; if you have requirements for native routing plus encryption, then you might be looking at the strict mode. If you have other requirements, of course use an architecture that works for you. And the last point is: engage with the community. As I said, we implemented that feature for our threat model, for our security model, because our architecture before was a tunnel over WireGuard that we handcrafted, and we shared the keys over etcd because we hooked into that.
And that is of course not a great architecture. So we now use Cilium for the WireGuard setup and for the key distribution and all that, and we just contributed a feature to make the security fit our needs. And yeah, that was the summary. Are there any questions?

Let's give Leonard a big round of applause while we think up some questions. I see a question over there — do you want to come to the mic in the middle?

What's the performance overhead of using encrypted traffic between pods?

So we have some basic performance numbers for our Constellation distribution, and there are of course also performance numbers from Cilium itself, which you can look up in the documentation. It's not that huge — I don't know off the top of my head, actually — but for us it's much more about the fact that, since we don't trust the cloud provider, we have no choice besides encryption. So we have performance numbers you can look at in our docs, and Cilium has performance numbers, which are of course a bit more fine-tuned. I guess it's a few percent, depending on what metric you want to look at.

Any more questions?

As you're trying to debug and make sure that this encryption is happening, this approach of "hey, I'll do TCP dumps and sniff on everything" — are there tools you found that accelerate that or make troubleshooting a little bit easier?

What I did was a bit of test-driven development: in the first iteration I implemented the tests in the CI test suite beforehand and then just ran them until they were happy, doing essentially the same thing I did here — scaling down the operator, scaling it up — and they actually also use tcpdump for that. Other introspection tools are also good; I haven't used Hubble for that, for instance, but if you want to actually look at the interfaces and how traffic is routed, I think going with the old-school tools isn't that far off.
So that's what I used.