Hi. This talk is about a secure connection that might not be encrypted, and a few other stories. My name is Liz Rice. I work at a company called Isovalent, which is the company that originally created the Cilium project. How many of you have heard of Cilium? Excellent. Hopefully you all know that Cilium recently graduated in the CNCF, so that's really exciting for us. I think that deserves a round of applause. Thank you.
Today I want to talk about a beta feature for mutual authentication that's in recent versions of Cilium. Let's take a step back and talk about what we mean when we say a connection is secure. I think one of the best ways to think about this is to think about what you want when you're talking to your bank over the internet. You absolutely need to authenticate identities. You need the bank to know that you're you, because you don't want to be paying money into somebody else's account. You want to know that you're talking to the bank, because you don't want to be paying your money to some random person. So establishing the identity at both ends of the connection is critical. It's also important to then encrypt the traffic that's flowing between those endpoints, because you don't want everybody knowing how much money you've got in your bank account, and you also don't want them monitoring that traffic to, for example, get your password.
In the environment where we're talking about a web browser talking to a bank, we're probably talking about TLS. I'm sure you're all familiar with the little padlock icon that you get in the browser when TLS has been established. We don't need to go through all the details of how that TLS handshake happens, but the core parts are these. It starts with a TCP connection. So there's a client end that initiates a TCP connection. That SYN/ACK flow is the beginning of every TCP communication.
Then the client initiates this handshake, asking the server for its identity in the form of an X.509 certificate. So this is the bit that proves that the bank is really the bank. The client can validate that certificate and proceed. There's also some exchange of information that allows both ends to derive a key that they can use for encrypting traffic for the duration of that session.
In a cloud-native environment, we typically talk about mutual TLS between two workloads. When you're talking to your bank, you probably do TLS to establish that it's the bank, and then you use some other authentication method, like a username and password, to confirm to the bank that you are who you say you are. In a Kubernetes cluster, for example, we probably have two peer workloads that can exchange certificates that have been issued by a common certificate authority that they both trust. The mutual TLS handshake is very similar to the TLS handshake. We've just got an additional certificate flowing and being validated. Mutual TLS starts with TCP, upgrades it to be authenticated, and then uses that TCP connection for encrypted data.
We often hear about the requirement to have mTLS in a cluster for that encryption stage, but mTLS isn't the only way to solve the problem of encryption. It's really a core principle of zero trust that we want to see all the traffic in our cluster being encrypted. Another way that we can do that is with transparent encryption. Transparent encryption uses either WireGuard or IPsec to set up secure tunnels between each pair of nodes. The traffic between pods on those nodes automatically gets encrypted as it flows through that tunnel. The pods are blissfully unaware that the traffic is being encrypted. The identities used for that encryption, or rather the certificates for the authentication and then the key exchange that allows them to do encryption, represent the nodes rather than the individual workloads.
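To make the earlier distinction between TLS and mutual TLS a little more concrete, here is a minimal sketch using Python's standard `ssl` module. It is only an illustration of the idea, not how Cilium or any service mesh implements it. The function names are made up for this example, and a real deployment would also load certificates and keys (the commented-out calls), which are left out here so the snippet runs without any files.

```python
import ssl


def server_context(require_client_cert: bool) -> ssl.SSLContext:
    """Build a server-side TLS context.

    With require_client_cert=False this is ordinary TLS: only the server
    presents a certificate. With True, the server also demands and verifies
    a client certificate, which is the 'mutual' part of mutual TLS.
    """
    ctx = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
    if require_client_cert:
        ctx.verify_mode = ssl.CERT_REQUIRED
    # In a real setup (paths are hypothetical):
    # ctx.load_cert_chain("server.crt", "server.key")
    # ctx.load_verify_locations("common-ca.crt")
    return ctx


def client_context() -> ssl.SSLContext:
    """Build a client-side context; clients verify the server by default."""
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH)
    # For mutual TLS the client would also present its own certificate:
    # ctx.load_cert_chain("client.crt", "client.key")
    return ctx


plain_tls = server_context(require_client_cert=False)
mutual_tls = server_context(require_client_cert=True)
print(plain_tls.verify_mode, mutual_tls.verify_mode)
```

The only difference between the two server contexts is whether the peer's certificate is required, which matches the point that the mutual TLS handshake is the TLS handshake with one additional certificate being validated.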
If you don't trust your node identities, you may have some other issues. If you think your nodes have been compromised, you probably don't want to be running workloads on those nodes. For a lot of circumstances, transparent encryption ticks the box that people need for their encryption requirement within their cluster.
WireGuard and IPsec were both originally designed for use in VPNs. They're quite similar. They tunnel your IP traffic, encapsulated and encrypted, in the form of UDP packets. IPsec maintains these tunnels over a longer lifetime between each pair of peers. IPsec allows you to specify the cryptographic ciphers you're going to use, and you can choose ciphers that are FIPS compliant, so you can run IPsec and be FIPS compliant.
That said, a lot of people consider WireGuard to be more secure. It's a newer alternative. You configure a WireGuard interface, so as well as having the eth0 interface out of your node, you'd also have a wg0 interface where, if you put a packet into it, it's going to get encrypted. You basically just configure that interface, set up your private key and the public key for each peer you're going to communicate with, and then you just send packets into that WireGuard interface. It just so happens that the cryptographic protocols that WireGuard uses are not deemed to be FIPS compliant, because somebody hasn't paid the money to get it assessed or something like that. It's really more about money than it is about technical merits, but for whatever reason, it's not considered FIPS compliant. That might put you down the IPsec route if that is something you need. The other thing WireGuard does that's really nice is that it automatically rotates keys for you.
These are really nice solutions for that encryption part. If I have two droids in a galaxy far, far away and they want to communicate with each other, they don't need to know anything about whether that traffic is being encrypted or not.
I have a cluster in the galaxy far, far away. It has a couple of kind nodes in it. Is that big enough to see? Anybody? Yeah, I see some thumbs up. Good. Okay. So if I ask C3PO to talk to R2D2, we can see that works. In the bottom half here, I've got Hubble showing all the packets that are flowing in the galaxy far, far away, and we can see lots of green. Everything's going well.
Now, this is going to be one of those demos where you're just going to have to take my word for it. If I run tcpdump on the eth0 interface of the node where R2D2 is running, I shouldn't be able to see that traffic. I do have transparent encryption turned on. Let me just show you that. So if I look for encrypt, we should see encryption is enabled using WireGuard. And if I look for the beep message on that interface, nothing shows up. If I didn't have encryption on, you would see the message appearing. But I don't have time today to turn it off and turn it on again, so you're just going to have to take my word for it that it is encrypted over the wire.
That's the encryption part, but it doesn't tell us anything about the identity of who's communicating with whom. In Cilium, for a long time, we've used network policy and identity based on Kubernetes labels to restrict traffic. How many of you here are actually using Cilium already? Quite a few of you. So you will probably be familiar with network policies like this. This policy applies to the namespace far, far away. It applies to endpoints that have the class of droid, so a label class equals droid. And it's saying that, for all those droids, ingress traffic must come from endpoints that have the label org equals rebel alliance. So this should only permit traffic from pods that have that rebel alliance label. Now, the labels that you apply to your pods are used by Cilium to construct what's called a Cilium identity.
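The policy described above might look roughly like the following, built here as a Python dictionary and printed as JSON (which `kubectl apply -f` accepts as well as YAML). This is a sketch, not the exact file from the demo: the label keys `class` and `org` come from the talk, but the policy name, the namespace spelling `far-far-away`, and the label value spelling `rebel-alliance` are assumptions.

```python
import json


def droid_ingress_policy() -> dict:
    """A label-based ingress policy: droids accept traffic only from the
    rebel alliance. Name, namespace, and value spellings are assumed."""
    return {
        "apiVersion": "cilium.io/v2",
        "kind": "CiliumNetworkPolicy",
        "metadata": {
            "name": "droids-from-rebel-alliance",  # hypothetical name
            "namespace": "far-far-away",           # assumed spelling
        },
        "spec": {
            # Which endpoints the policy applies to.
            "endpointSelector": {"matchLabels": {"class": "droid"}},
            # Which sources are allowed to send traffic to them.
            "ingress": [
                {"fromEndpoints": [{"matchLabels": {"org": "rebel-alliance"}}]}
            ],
        },
    }


policy = droid_ingress_policy()
print(json.dumps(policy, indent=2))
```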
So any pod that's got the same set of labels will get the same Cilium identity. Let's just have a quick look at my endpoints. I'm going to look at the Cilium endpoints. There's quite a few different craft in my galaxy. You can see that almost all of them have a different identity. This identity ID column is the Cilium identity. You can think of it as a security identity, if you like. I've got a couple of pods that represent the Death Star, operating a Death Star service. They have unique endpoint IDs, but they share the same Cilium identity because they represent the same service.
Okay. So that Cilium identity is derived from labels. When we want to make a network policy decision, it uses that identity value. The traffic has come from a certain pod that has a certain security identity, and the question is whether that security identity matches the set of labels required by the policy, which determines whether the traffic is allowed or not. So traffic from a certain IP address has a certain security identity, which either passes or fails the policy.
So if I set up a policy, I just need to comment out this bit and comment back in this bit. I'm going to apply that policy, which is basically exactly the same as what I just showed you. This is going to continue to allow C3PO to talk to R2D2, because they're both droids. But if I try to communicate from, let's say, a TIE fighter, this is going to hang, hopefully. And if we look at the Hubble flows, we'll see traffic here is being dropped between the TIE fighter and R2D2 because the policy denied it. The policy verdict is denied. Okay. Let's just stop that.
Right. But what if I'm Darth Vader and I can use some kind of Jedi mind trick to spoof IP addresses? If I, as Darth Vader, can pretend that I have the same IP address as C3PO, and that IP address has a certain security identity, then I'm going to pass the network policy. And that would be a bad thing.
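The chain of reasoning above (labels give an identity, an IP address maps to an identity, and the policy checks that identity's labels) can be sketched as a toy model. Everything here is illustrative: the class and function names, the identity numbers, and the IP addresses are invented for the example and are not Cilium's internals.

```python
from dataclasses import dataclass


class IdentityAllocator:
    """Toy allocator: one numeric identity per distinct set of labels."""

    def __init__(self, first_id: int = 1000):
        self._next_id = first_id
        self._id_by_labels = {}
        self.labels_by_id = {}

    def identity_for(self, labels: dict) -> int:
        key = frozenset(labels.items())
        if key not in self._id_by_labels:
            self._id_by_labels[key] = self._next_id
            self.labels_by_id[self._next_id] = dict(labels)
            self._next_id += 1
        return self._id_by_labels[key]


@dataclass
class Packet:
    src_ip: str       # the source address the packet claims
    real_sender: str  # who really sent it; the policy check cannot see this


def ingress_allowed(packet, ip_to_identity, allocator, required_labels) -> bool:
    """Label-based check only: source IP -> identity -> labels -> policy."""
    identity = ip_to_identity.get(packet.src_ip)
    if identity is None:
        return False
    labels = allocator.labels_by_id[identity]
    return all(labels.get(k) == v for k, v in required_labels.items())


alloc = IdentityAllocator()
c3po_id = alloc.identity_for({"class": "droid", "org": "rebel-alliance"})
tie_id = alloc.identity_for({"class": "tiefighter", "org": "empire"})
deathstar_1 = alloc.identity_for({"class": "deathstar", "org": "empire"})
deathstar_2 = alloc.identity_for({"class": "deathstar", "org": "empire"})

ip_to_identity = {"10.0.0.5": c3po_id, "10.0.0.9": tie_id}  # made-up IPs
required = {"org": "rebel-alliance"}

honest = Packet(src_ip="10.0.0.5", real_sender="c3po")
tie = Packet(src_ip="10.0.0.9", real_sender="tiefighter")
spoofed = Packet(src_ip="10.0.0.5", real_sender="darth-vader")

for p in (honest, tie, spoofed):
    print(p.real_sender, ingress_allowed(p, ip_to_identity, alloc, required))
```

The two Death Star pods end up with the same identity, and the spoofed packet passes because the check only ever looks at the claimed source address. That gap is what the authentication step in the rest of the talk is meant to close.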
In fact, what can potentially be even worse is that, if you don't have transparent encryption turned on, traffic is flowing in the clear and, if you're using tunneling, it will have the security identity embedded in the header. So somebody observing that traffic would be able to see the security identity, and they could also match on the security identity. It's kind of unlikely. IP spoofing is not that common, and if you are seeing things spoofing IP addresses in your cluster, you probably have other issues. But nevertheless, it's something we want to prevent.
So let's come back to our network policy. We're going to add two lines to the network policy. We're going to say that, as well as matching the label-based identity, we require that this communication is authenticated, that you prove you are the security identity you say you are. The existing check is still in place. We need to make sure that the IP address matches the security identity and the security identity conforms to the labels that we require in the policy. But also, Cilium on this node where R2D2 is running, for example, knows where that IP address should be.
So I've got a pod here for C3PO. If I look at the Cilium endpoint for C3PO, and describe it, we can see all the labels that apply to C3PO. It is a droid. We need that. It's a member of the rebel alliance. Good. We can see its IP address, and we can also see the node that it's running on. So this information is available to the Cilium agent where R2D2 is running. We want to be able to check that the traffic coming from this particular IP address is known to the node where that IP address is supposed to be running. In other words, if we get traffic that says it's from C3PO, we want the node where C3PO is supposed to be to confirm that C3PO is in fact there. And we do that with a handshake.
We're going to use the same handshake that you would use in an mTLS communication as we set up a connection, but it's between one Cilium agent and another Cilium agent, on behalf of those pods. Because this is an ingress policy at the end where R2D2 is receiving the traffic, it's going to be initiated by the Cilium agent where R2D2 is running. That agent is going to get hold of an X.509 certificate representing R2D2. We'll come back to that. It's got that certificate. It sets up a TCP connection to the Cilium agent where C3PO is known to be running, and it initiates the same handshake as we'd use in mTLS. There is an exchange of certificates so that both ends have authenticated each other. And now we're passing this authentication required check that's part of the network policy.
So let me add that into the policy. I'm just going to turn on this authentication mode required and apply that. Okay. So C3PO should still be able to communicate with R2D2. There we go. That started. We can see here ingress is allowed, and we've got this new authenticated by SPIRE indication in the Hubble flow. That probably gave a very big clue, if you didn't already know, that we're using SPIRE to distribute those certificates.
And this is another thing that's transparent. We don't need the applications to get involved. In fact, we don't really need to do any configuration other than to turn mutual authentication on and say, please use SPIRE. Cilium will automatically start running the SPIRE services. Has anybody used SPIFFE and SPIRE before? Okay. A smaller number of hands. Right. So SPIFFE is a way of securely providing identity management to workloads, and those workloads need to be registered. SPIRE is an implementation of the SPIFFE protocol. You register a workload with SPIRE. You tell SPIRE something about that workload that only that workload can do. It's called a selector.
And then a workload that can meet the selector conditions can retrieve this SPIFFE ID. It's called a SPIFFE Verifiable Identity Document, or SVID. Really, the ones we're interested in are in the form of an X.509 certificate. So Cilium is going to register the workloads on behalf of the pods and retrieve the corresponding certificates on behalf of the pods. And the SPIFFE IDs that we have correspond to the Cilium security ID that we had before.
So if I look in my cluster, I can see Cilium automatically created a cilium-spire namespace, and we can see the SPIRE server and some agents running in there. I have a little abbreviation here, an alias for a long command that gets me into the SPIRE server. If I use this to show the SPIRE entries that have been registered, there should be an entry in here for every Cilium security identity. So if I get the Cilium identities, I'm going to do it with the labels. You know what, I'm going to grep for C3PO. So 52452 is C3PO's Cilium identity. And we should have in here in SPIRE, there we go, the SPIFFE identity for C3PO. The Cilium agent will be able to retrieve that on behalf of C3PO, without C3PO needing to know anything about it, in order to perform that handshake. Okay. And we saw already that, when that handshake has been achieved, we can see authenticated by SPIRE as part of the policy verdict that we see in Hubble.
Okay. So I've shown you that the authentication can work. But what about a situation where the authentication wouldn't work, like the IP spoofing situation? It's pretty hard to set up an actual IP spoofing situation, but I am going to show you something kind of equivalent. As well as my kind cluster running Kubernetes, I've also got this extra node here that's on the same underlying Docker network, and it can connect to my cluster. I've called it Darth Vader over here.
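The correspondence between a Cilium security identity and a SPIFFE ID, mentioned above, can be sketched as a simple mapping. A SPIFFE ID is a URI of the form `spiffe://<trust domain>/<path>`. The trust domain `spiffe.cilium` and the `/identity/<number>` path used below are assumptions for illustration, so check the entries in your own SPIRE server rather than relying on them. The function names are hypothetical.

```python
from urllib.parse import urlparse

TRUST_DOMAIN = "spiffe.cilium"  # assumed trust domain, for illustration only
PATH_PREFIX = "/identity/"      # assumed path layout, for illustration only


def spiffe_id_for(identity: int, trust_domain: str = TRUST_DOMAIN) -> str:
    """Build a SPIFFE ID string for a numeric security identity."""
    return f"spiffe://{trust_domain}{PATH_PREFIX}{identity}"


def identity_from_spiffe_id(spiffe_id: str) -> int:
    """Recover the numeric identity from a SPIFFE ID in the assumed layout."""
    parsed = urlparse(spiffe_id)
    if parsed.scheme != "spiffe":
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    if not parsed.path.startswith(PATH_PREFIX):
        raise ValueError(f"unexpected path: {parsed.path!r}")
    return int(parsed.path[len(PATH_PREFIX):])


c3po_spiffe_id = spiffe_id_for(52452)  # 52452 is C3PO's identity in the demo
print(c3po_spiffe_id)
print(identity_from_spiffe_id(c3po_spiffe_id))
```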
Now, there's no Kubernetes running on Darth Vader, so it doesn't know anything about things like Kubernetes services. I can't curl to R2D2 as a service, but I could curl to the IP address, provided it knows how to get to that IP address. In order to make that possible, I've added this additional routing rule. So 10.244.1.16 is R2D2. And in fact, if I show the worker nodes, we should be able to see that kind-worker, which is the node where R2D2 is running, has this IP address ending in zero three. So what I've done here is just add a route to say, if you want to get to R2D2's IP address, you've got to go via that node, because that's where it's running.
So I'm just going to delete that policy, because otherwise it might not work. I should be able to curl to that IP address. There's nothing preventing it from happening right now. I've deleted the policy. Come on. Oh, I did. Thank you. Yeah. It's a good job you're all paying attention. Thank you. All right. Okay. So there we go. Darth Vader was able to send a curl. So we know that IP address is reachable from Darth Vader.
Now what I'm going to do is create a policy. I don't want it to fail on the labels part, because we already know that works, right? And we know Darth Vader doesn't have any labels, because it's not in the Kubernetes world. So instead of matching on labels, I'm going to say all entities are allowed to communicate, provided they can give me authentication. In other words, it's not going to fail on that first network policy check. It should fail when we come to check the authentication. So let me apply that. Okay. And if we try the curl, it doesn't look like it's working. And if we look in Hubble, we can see that it's failing because authentication is not in place. Authentication is required. It's not there. And therefore, we're just going to keep dropping the packets.
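The two-stage decision in this demo (first the label or entity match, then the authentication requirement) can be sketched as a toy verdict function. This is a simplified model under stated assumptions, not Cilium's datapath: the verdict strings, the function names, and all IP addresses other than the idea of an in-cluster versus out-of-cluster source are invented for the example.

```python
def handshake_succeeds(src_ip: str, cluster_endpoints: dict) -> bool:
    """Toy stand-in for the agent-to-agent handshake: an agent can only
    vouch for a source IP that belongs to an endpoint the cluster knows."""
    return src_ip in cluster_endpoints


def verdict(src_ip: str, policy: dict, cluster_endpoints: dict) -> str:
    # Check 1: does the source match what the policy allows?
    if not policy["source_allowed"](src_ip):
        return "dropped: policy denied"
    # Check 2: if the policy requires it, has the source been authenticated?
    if policy["auth_required"] and not handshake_succeeds(src_ip, cluster_endpoints):
        return "dropped: authentication required"
    return "forwarded"


# Endpoints known to the agents (IP values are made up for this sketch).
cluster_endpoints = {"10.244.2.7": "c3po"}

# "All entities are allowed to communicate, provided they can authenticate."
policy = {"source_allowed": lambda ip: True, "auth_required": True}

c3po_verdict = verdict("10.244.2.7", policy, cluster_endpoints)
vader_verdict = verdict("172.18.0.9", policy, cluster_endpoints)
print("c3po:", c3po_verdict)
print("darth vader:", vader_verdict)
```

Because the first check allows everything, the only reason Darth Vader's traffic is dropped in this model is the missing authentication, which mirrors what the demo is set up to show.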
So hopefully that convinces you that you need to be authenticated by a SPIFFE identity if authentication required is in place in your network policy. So that is Cilium's next-generation mutual authentication.
There are a couple of other things to note about this. First of all, that handshake is happening independently of the traffic that it then permits. So although the handshake starts with a TCP connection between the Cilium agents on the two nodes, the traffic that is subsequently permitted between those endpoints doesn't have to be TCP. If you're using mTLS today in, let's say, a service mesh, it's only going to allow you to authenticate and encrypt TCP traffic. Whereas with this approach, we will be able to authenticate any traffic that goes over IP using this handshake, and then encrypt, again, anything that goes over IP using IPsec or WireGuard.
The other thing that happens, because we're able to have the handshake independent of the encryption, is that you don't necessarily have to do encryption at all. Now, at the beginning we said that for a connection to be secure, it needs to be authenticated and encrypted. But there are some environments where we have users who have particularly strict latency concerns. For example, maybe they're doing high-frequency trading, and the amount of time it takes to encrypt and decrypt traffic can have a material effect on how fast they can trade. In those environments, they don't actually want the encryption, but they still want the security of the authenticated connection. They still want the security of the network policy that only permits traffic between those authenticated endpoints, with the security identities that have been proven by SPIFFE IDs. So that's the situation where a secure connection doesn't necessarily have to be encrypted.
All of this is basically made possible using the power of eBPF. How many of you have heard of eBPF? Excellent, good.
So this is a new documentary that's going to have its world premiere. Today's Tuesday, right? So tomorrow, 6.15, here at KubeCon, we are seeing the premiere of the eBPF documentary. I've seen it. It is amazing. It's a really good story. Maybe I'm biased. I don't know. I think it's well worth seeing. And if you want to learn more about eBPF or Cilium, do come by. There's a Cilium project booth. There's the Isovalent booth. We have a Cilium experience centre where you can come and get hands-on with Cilium. We have lots of books and resources that you can get from the Isovalent website, and you can also come by the Isovalent booth at five o'clock today, where we'll be doing some book signing of the Learning eBPF book. I hope you are all really fascinated by eBPF, because I think it's amazing. So with that, that's all I have, other than if you've got questions. So thank you very much.
Anyone got any questions? Oh, it's Andre. He's going to ask me a hard question.
Yeah, I have a quick one. When you showed us the authentication, was it between pod and pod, or was it between pod and service? I'm not sure, I may have missed it.
So the authentication is, you can see it in a couple of ways. Let me try and bring back the, actually that will do. That will do for the diagram. So the handshake is happening between two different nodes essentially, between the Cilium agents on those two different nodes. And what they are asserting is that the Cilium ID exists on that node, and it's where the other node expected that IP address to be. Does that make sense? Does that answer the question?
Yeah. Thanks for the talk.
I was curious. So currently you're using mTLS to do the handshake, but then you said that you're not using it for the encryption, that you're using WireGuard or IPsec. But there are now some options for tunneling over mTLS itself, including UDP, with some of the new tunneling options that are coming out. Have you considered that as an option, and would you be open to taking that sort of contribution?
I think, answering the second question, if there are use cases for it and good engineering reasons, we are absolutely always interested to hear about use cases and to accept contributions. Yeah, totally. It's not something I've heard about. My immediate thought, well, not a question, is that I think one of the benefits we have is this idea that if we remove mTLS from the equation, we can support other kinds of IP traffic as well. So I'm not sure if I've understood that correctly, whether that would be a replacement, but it might be an extra layer or an extra alternative way.
Right. Yeah. I think it would be an alternative to those other two. And then you could tunnel IP over, like, H3, so that you get the unreliability that you sort of want with IP or UDP traffic.
Okay. Yeah. Definitely an interesting conversation. Thank you.
Hi. Thanks for the talk. I had a question about the SPIRE usage. You were showing the registration entries for the different workloads, and they all had the same selector value there. Conventionally, in SPIRE, you would want to have unique selector sets per pod. So I'm curious if there's any special usage of SPIRE in Cilium. If the workloads were able to fetch an identity from SPIRE, might they be able to get the identities of other pods if they all shared the same selectors?
Yeah. So I think part of your question was, is there anything special we're doing? Not that I'm aware of.
Someone can correct me if that's wrong, but I think we're using SPIFFE and SPIRE in a very standard way. The reason why we're using the same selector is, I guess, twofold. One is because it's the Cilium agent that's going to do the retrieval on behalf of any pod, so it has to be able to do it on behalf of any pod. And also, we don't know in advance where that pod's going to be. So we couldn't tie it down to one agent, because we need all of the agents to be able to retrieve the identity for any given workload. I think if that were compromised, again, it's one of those situations where you have a much bigger problem. If you can compromise your Cilium agent to the extent that you're going to be able to get a different identity, you could just change the eBPF programming. There would be a lot of other issues that I would be concerned about, in addition to the ability to retrieve a wider set of SVIDs.
Yeah, I think I see how it's being used, with the Cilium agent kind of acting on behalf of the workloads here. I guess my question was whether the workload itself could still get its identity from the SPIRE agent in this deployment.
Oh I see, yeah. Oh, Nick knows the answer. I see, okay. So the answer there was that the identities can only be retrieved by the Cilium agent rather than by the workload. And I guess if the workload wanted to register an identity, it could register its own identity.
All right. Okay, thanks.
I was really interested in the WireGuard part of this. Out in the world, we now have things like Tailscale and others that have brought WireGuard to the masses, more or less, for your phone and laptop. Do you have any opinions on what we should be deploying as the WireGuard part of this diagram in, let's say, EKS, GKE, etc.?
Sorry, was the question should we be deploying WireGuard?
Oh no, what would it actually look like from a deployment standpoint?
Can Cilium do the WireGuard bit, or would that be a separate service?
Yeah, no, Cilium can do that for you. So you just turn transparent encryption on in Cilium, and it will set up the interface for you and make decisions to put packets through that interface for you.
Oh, got it. So we don't need a third service running around the square down at the bottom.
There doesn't need to be a third service. Yeah, that square is really just supposed to be a tunnel.
Oh, I got it. It makes a lot more sense now. Thank you so much.
All right. Thanks so much for your questions, everyone.