My name is Mark Han. I'm a security solutions architect with Qwallis. If you want to follow along with the slides, you can take a look at the URL there in blue at the bottom. You can pull that up on your phone if you want to keep the slides up while we're talking. It's got speaker notes. If you press sort of at the bottom of the MRAP presentation, you'll see speaker notes. And this is... Hi, I'm Ted Han. I'm his son. We are a father-son team, and we like to do these presentations together and have some fun with it. I'm a site reliability engineer, a consulting SRE. I help a number of small startups do cloud native right: security, AWS, all of the stuff around Kubernetes clusters, from the basic setup to running large distributed apps on them.
Cool. So this is our presentation on Kube TLS. It's a tool that helps you achieve mutual TLS by injecting certificates into every pod. In 2002, we built the original version of this. It's an admission webhook which mutates the pod. And we've updated it to take advantage of changes that are coming in newer versions of Kubernetes. In particular, there's a feature called cluster trust bundles that allows you to set the scope of trust for a cluster.
So let's start off with: what is a trust bundle? I imagine some of you have been to the various SPIFFE talks, so you probably have some idea, but at the same time, I don't think any of them defined trust bundle particularly well. A trust bundle, at its most simple, is a bunch of CAs. You probably already use a trust bundle: your browser's web CA store. That's a trust bundle. But that is a very broadly scoped trust bundle. We want much more narrowly scoped trust bundles. Typically, we want a trust bundle per organization, or even two or three for your organization. So essentially, you're familiar with this. This is how web certificates work. This is how HTTPS TLS certificates work.
So at the bottom there, there is a workload certificate. That workload is probably a web server, and that web server's certificate is signed by some intermediate cert. The certificate that you get back from your signing authority typically contains both your workload cert and the intermediate certs that chain up to a CA. And then at the top level there is the root certificate. On most systems that you're running, your phones, your laptops, the Linux servers that are deployed in your cloud architecture and on your clusters, that root CA store contains about 160 different CAs. Some of which are just, well, next slide. So some of which you know, and some of which are the Peruvian central authority. Go ahead.
So what do trust bundles allow you to do? Trust bundles allow you to verify that somebody is saying something, right? Web certificates contain claims, typically SANs, and the certificates let you verify that someone in the trust bundle has said these things. What trust bundles will allow you to do is, well, make your Docker images smaller, and they'll allow you to update trust bundles such as the web root of trust much more rapidly. Currently you build the web root of trust into the very bottommost layer of your Docker image. So you've got all of those things, and they're sometimes even the majority of a Docker image: the Google distroless image has just four things in it, of which the trust bundle is the largest by far. That's not true for most Docker images, but that is one of the bottommost layers. And sometimes that trust bundle changes, right? Just a few months ago TrustCor was removed from every web CA store, or is in the process of being removed. I think they have like two or three more days before they're disallowed. But that is still embedded, right? That certificate is still embedded in all of the Docker images you built last week.
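To make the scoping idea concrete, here is a minimal Go sketch of what "a trust bundle is a bunch of CAs" means at verification time. It is not Kube TLS code. It builds two throwaway root CAs in memory (the names `internal-ca`, `some-public-ca`, `greeter.default.svc`, and `cafe.example` are invented for illustration), puts only the internal one into the bundle, and checks which workload certificates chain up to that bundle.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newCert creates a certificate for name. With a nil parent it is a
// self-signed root CA; otherwise it is a workload cert signed by parent.
func newCert(name string, parent *x509.Certificate, parentKey *ecdsa.PrivateKey) (*x509.Certificate, *ecdsa.PrivateKey, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(time.Now().UnixNano()),
		Subject:      pkix.Name{CommonName: name},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(24 * time.Hour),
	}
	signer, signerKey := parent, parentKey
	if parent == nil { // root CA: signs itself
		tmpl.IsCA = true
		tmpl.BasicConstraintsValid = true
		tmpl.KeyUsage = x509.KeyUsageCertSign
		signer, signerKey = tmpl, key
	} else { // workload cert: the name goes into the DNS SAN
		tmpl.DNSNames = []string{name}
		tmpl.KeyUsage = x509.KeyUsageDigitalSignature
		tmpl.ExtKeyUsage = []x509.ExtKeyUsage{x509.ExtKeyUsageServerAuth}
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, signer, &key.PublicKey, signerKey)
	if err != nil {
		return nil, nil, err
	}
	cert, err := x509.ParseCertificate(der)
	return cert, key, err
}

// trustedBy reports whether leaf chains up to a root in bundle and is
// valid for dnsName.
func trustedBy(bundle *x509.CertPool, leaf *x509.Certificate, dnsName string) bool {
	_, err := leaf.Verify(x509.VerifyOptions{Roots: bundle, DNSName: dnsName})
	return err == nil
}

// demo verifies one internal and one outside workload cert against a
// bundle that contains only the internal CA.
func demo() (bool, bool, error) {
	internalCA, internalKey, err := newCert("internal-ca", nil, nil)
	if err != nil {
		return false, false, err
	}
	outsideCA, outsideKey, err := newCert("some-public-ca", nil, nil)
	if err != nil {
		return false, false, err
	}
	ours, _, err := newCert("greeter.default.svc", internalCA, internalKey)
	if err != nil {
		return false, false, err
	}
	theirs, _, err := newCert("cafe.example", outsideCA, outsideKey)
	if err != nil {
		return false, false, err
	}
	bundle := x509.NewCertPool() // the narrowly scoped trust bundle
	bundle.AddCert(internalCA)
	return trustedBy(bundle, ours, "greeter.default.svc"), trustedBy(bundle, theirs, "cafe.example"), nil
}

func main() {
	internalOK, outsideOK, err := demo()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("internal workload trusted:", internalOK)
	fmt.Println("outside workload trusted:", outsideOK)
}
```

The outside certificate is perfectly valid against its own CA. It is rejected only because that CA is not in the bundle, which is the whole point of scoping.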
Cluster trust bundles can be mounted like config maps, and you will mount them like config maps at /etc/ssl to update the web root of trust dynamically, so you can keep that up to date regardless of when your Docker image was built. But more importantly, because trust bundles can be mounted like config maps, you can have a bunch of different trust bundles for all of the different little scopes that you need. And that's what we're looking for: to try and limit the scope of trust.
So as we said, that top row of root CAs is essentially the web root of trust, and it's the trust that's in most Linux distributions: Amazon Linux, Google distroless, Alpine, you name it. It's got 160 of those. What we recommend, and what our opinion is, is that on your clusters you basically want a very limited set of top-level CAs. You want essentially your internal private certificate authority to be your root of trust and no others. Maybe a partner or two, but not on every server that you've got. So what we're looking to do is provide an easy way to implement trust bundles that limit your scope to just the part in blue and not the 150 items in red, right? There's no reason that any server should have to trust, you know, a Peruvian cafe website. But your system would accept that Peruvian cafe website's TLS certificate because it's got an intermediate chain that roots up to the Peruvian root authority, which is in your trust bundle.
And part of this point is not that you shouldn't trust all of those web certs, but that you should trust them for when you're going to the web. For most applications, for most of your internal calls, you are not going to the web. You should be using a trust bundle that is scoped to your internal network. So the web root of trust will still be around. It just won't be the default for most of your applications. Right. So after the keynote this morning, we just added this slide.
The presenter, Zach Butcher, mentioned trying to get to a zero trust architecture. These were the five points that he made. Kube TLS helps to implement those first three points very easily. So we'll show you how this works. Yeah. So Kube TLS automatically injects certificates that provide workload identity onto every pod and every container in the cluster. They provide privacy by TLS encryption. They provide authentication by mutual TLS. And authorization comes very quickly from that, simply because once you have those identities, those per-pod, per-service-account identities, authorization becomes a very thin layer.
So how do you do secure networking on Kubernetes? Well, in the last talk, they went over a bunch of those ways. Any number of talks here are going to talk about different ways to do that. The two most common ways are these first two sub-bullets. One is that you're using sidecars. And we're not big fans of sidecars. They don't cover all the space. They're complicated. In enterprises they're typically managed by teams that are not your application development teams, so they're orthogonal to your application architecture. The other thing is that your application security architecture should be designed into your application. One of the 12 factors in the 12-factor app is that you minimize dependencies. And if the security of your application depends on not only your application but also these sidecars, now you've got two points that are doing that, not a single dependency.
The other choice is some kind of complex CNI thing, with Calico or something, where you're trying to limit network access and who can send packets to whom. And again, this is at the wrong layer.
It is in your network architecture, so your application security architecture is now dependent on your network architecture. The two have to change in sync, and that can be very difficult. So both of these ways are difficult to manage. Typically in most organizations you have two separate groups that do it, and so there's a lot of communication that's got to happen.
Our recommendation is that the right way to do this is for every service to speak TLS natively, so it doesn't send a packet anywhere unless it's over a TLS link. And to do that you need certificates in every pod so that you can create that TLS connection. In our demo repository, which is linked at the end of this presentation, we show different examples of how you configure your service so that it speaks TLS natively, and how to get that private key and that certificate plugged into however your particular language does web serving. And it really is usually that simple. When you are making a gRPC connection, you have to specify "I am doing this the insecure way," or you can specify "I am doing this the Kube TLS way." We provide a gRPC dial option that just picks up the right place, picks up the right certificates for your cluster. We provide a Java version of this as well. Yeah. And if you've ever Googled how to do mTLS with Java, not half of it, 90% of it is how to create the certificate, and 10% of it is "here's the Java code that simply uses that certificate and private key." We've simplified that 90% down to 5%. One of these days I'll get around to writing the Python version.
So the nice thing about TLS everywhere at the application level is that there is no chance of it being sniffed. Whenever it is leaving your pod, and in fact your application's memory space, it is TLS encrypted. TLS native: it's encrypted and it's mutually authenticated everywhere you go. So yeah.
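As a rough illustration of what "speaking TLS natively" involves once the files are in the pod, here is a minimal Go sketch that builds a mutual TLS configuration from a mounted directory. This is not the Kube TLS dial option itself, which the talk does not show. The directory `/var/run/kube-tls` and the file names `tls.crt`, `tls.key`, and `ca.crt` are assumptions made for the example.

```go
package main

import (
	"crypto/tls"
	"crypto/x509"
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// certDir is a HYPOTHETICAL mount point, chosen only for this sketch.
const certDir = "/var/run/kube-tls"

// loadMTLSConfig builds a tls.Config usable on both the client and the
// server side from a workload key pair and a trust bundle found in dir.
// The file names are assumptions, not something the talk specifies.
func loadMTLSConfig(dir string) (*tls.Config, error) {
	pair, err := tls.LoadX509KeyPair(filepath.Join(dir, "tls.crt"), filepath.Join(dir, "tls.key"))
	if err != nil {
		return nil, fmt.Errorf("loading workload key pair: %w", err)
	}
	caPEM, err := os.ReadFile(filepath.Join(dir, "ca.crt"))
	if err != nil {
		return nil, fmt.Errorf("reading trust bundle: %w", err)
	}
	bundle := x509.NewCertPool()
	if !bundle.AppendCertsFromPEM(caPEM) {
		return nil, errors.New("trust bundle contains no usable certificates")
	}
	return &tls.Config{
		Certificates: []tls.Certificate{pair},        // our own identity
		RootCAs:      bundle,                         // as a client: which servers we trust
		ClientCAs:    bundle,                         // as a server: which clients we trust
		ClientAuth:   tls.RequireAndVerifyClientCert, // refuse peers that present no valid cert
		MinVersion:   tls.VersionTLS12,
	}, nil
}

func main() {
	cfg, err := loadMTLSConfig(certDir)
	if err != nil {
		fmt.Println("no certificates available:", err)
		return
	}
	fmt.Println("mTLS config ready, certificates loaded:", len(cfg.Certificates))
}
```

With gRPC-Go, a config like this would typically be wrapped with `credentials.NewTLS` and passed through `grpc.WithTransportCredentials` on the client or `grpc.Creds` on the server. Note that there is no fallback to the system CA store here: if the bundle is missing, the sketch fails instead of quietly trusting the web root of trust.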
What is Kube TLS? At its most basic, it is an admission controller. It's a mutating webhook configuration object inside of your Kubernetes cluster. What that means is that the Kubernetes cluster, based on the filters that you set up in the YAML for this object, is going to send every pod request to our service. It makes a web call to our service, and we return data back to the Kubernetes master. We modify that incoming pod request and add secrets to it that contain the certificates, so that they're mounted on a file system visible to that pod, and all that pod has to do is pick those certificates up from a well-known location.
And so this is the building block of a trust architecture. It's got a certificate, a matching private key, and essentially the trust bundle. So it's got the root certificates that you care about for this pod in there, which is basically just your internal CA in most cases. We'll talk about the different types of patterns that you'll use. That bottom bullet will get replaced by the KEP for cluster trust bundles once that feature is added to Kubernetes, and then all you need to provide is a certificate and a key. And this is the building block of an architecture that we'll talk about.
Yeah, I think we've already mostly covered this, but the application should speak only TLS. Your organization itself is a zone of trust. There is no such thing as zero trust, but you can make your application-to-application connections part of a zone of trust. And once you have done so, adding the identity-aware part of allowing things to connect becomes very simple. Once you have the core identity and authentication provided on every request, the rest of this just falls into place.
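The mutation step described above can be sketched in Go using only the standard library. This is an illustration of the general mutating-webhook mechanism, not the Kube TLS implementation: the volume name `kube-tls-certs`, the secret name, and the mount path are invented for the example. The response shape (an `AdmissionReview` carrying a base64-encoded JSON Patch) follows the Kubernetes `admission.k8s.io/v1` API.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// patchOp is one RFC 6902 JSON Patch operation.
type patchOp struct {
	Op    string      `json:"op"`
	Path  string      `json:"path"`
	Value interface{} `json:"value"`
}

// podShape decodes only the parts of a Pod this sketch needs to inspect.
type podShape struct {
	Spec struct {
		Volumes    []json.RawMessage `json:"volumes"`
		Containers []struct {
			VolumeMounts []json.RawMessage `json:"volumeMounts"`
		} `json:"containers"`
	} `json:"spec"`
}

// volumeName is a HYPOTHETICAL name chosen for this example.
const volumeName = "kube-tls-certs"

// certPatch returns the patch that mounts secretName at mountPath in
// every container of the pod.
func certPatch(rawPod []byte, secretName, mountPath string) ([]patchOp, error) {
	var pod podShape
	if err := json.Unmarshal(rawPod, &pod); err != nil {
		return nil, err
	}
	volume := map[string]interface{}{
		"name":   volumeName,
		"secret": map[string]interface{}{"secretName": secretName},
	}
	mount := map[string]interface{}{"name": volumeName, "mountPath": mountPath, "readOnly": true}

	var ops []patchOp
	// "add" to ".../-" appends, but only works if the array already
	// exists, so create the array when it is missing or empty.
	if len(pod.Spec.Volumes) == 0 {
		ops = append(ops, patchOp{"add", "/spec/volumes", []interface{}{volume}})
	} else {
		ops = append(ops, patchOp{"add", "/spec/volumes/-", volume})
	}
	for i, c := range pod.Spec.Containers {
		base := fmt.Sprintf("/spec/containers/%d/volumeMounts", i)
		if len(c.VolumeMounts) == 0 {
			ops = append(ops, patchOp{"add", base, []interface{}{mount}})
		} else {
			ops = append(ops, patchOp{"add", base + "/-", mount})
		}
	}
	return ops, nil
}

// admissionReview wraps the patch in the response the API server expects.
func admissionReview(uid string, ops []patchOp) ([]byte, error) {
	patch, err := json.Marshal(ops)
	if err != nil {
		return nil, err
	}
	return json.Marshal(map[string]interface{}{
		"apiVersion": "admission.k8s.io/v1",
		"kind":       "AdmissionReview",
		"response": map[string]interface{}{
			"uid":       uid, // must echo the uid of the request
			"allowed":   true,
			"patchType": "JSONPatch",
			"patch":     patch, // encoding/json base64-encodes []byte
		},
	})
}

func main() {
	pod := []byte(`{"spec":{"containers":[{"name":"greeter"}]}}`)
	ops, err := certPatch(pod, "greeter-cert", "/var/run/kube-tls")
	if err != nil {
		fmt.Println("bad pod:", err)
		return
	}
	out, err := admissionReview("example-uid", ops)
	if err != nil {
		fmt.Println("encoding failed:", err)
		return
	}
	fmt.Println(string(out))
}
```

A real webhook would also have to create the secret it refers to before the pod starts, which is the CSR flow covered later in the talk.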
So this is an example of how not to do it, which is basically just the sidecar pattern. We think it is flawed because there's clear text. As the previous presenter mentioned, if somebody is running amok on your network, which you have to think of as a valid possibility, you have to assume breach. And if you assume breach, that clear text connection can be sniffed and used in various and nefarious ways. So this is why we don't like this architecture. We'd rather that the application pod speak TLS, so that it's encrypted well before it hits any of your network architecture, inside the code, inside the running application. It's encrypted before it hits the network.
And just to be clear, this is hard to sniff. This is not easy to make happen. On the other hand, Istio fails open. If your sidecars aren't injected, your application will still try to talk, the proxy won't be there, and it'll just talk to the network. That's a problem. That is actually pretty much the crux of our argument. Istio fails open. You don't want that. And there are ways to bypass the proxy. If I have a rogue app that's been deployed to the cluster, I can ignore the fact that there's a sidecar there and talk. Now you can fix that with complex CNIs, but those are complex CNIs that... This is our opinion. This is much easier to implement, in our opinion.
Although this looks as complicated as anything else, this is just a set of repeated patterns done the same way with slightly different, regular, pre-planned variations. So here's... Yeah. So this is what an app looks like. It is serving something and it is acting as a client. And it does so with certificates chaining up to your trust bundle. It has an identity on your network. That identity might say, "I am the server for X." And that identity might say, "I am the client that has this service account."
And you might just put both of those into the same certificate, as Kube TLS does. It just creates a client-slash-server certificate for each pod when it comes up. It figures out which services it should be valid for, so it adds the appropriate DNS names to it as DNS SANs. And it adds the service account in as both the common name and the SPIFFE-style ID. And probably more as we figure things out and add feature flags to make this work in interesting ways. There's still some question about how exactly those identities should be represented, though SPIFFE seems to be a pretty clear, pretty commonly used answer.
So this is another way to apply the same pattern. This would be talking to partners on the web. There are two different ways to do this. We prefer the way on the top because we think it's more robust long term. But let's talk about that. What you do is, between your internal service and their internal service, you trade trust bundles. That way, when their API server uses its internally generated certificate to talk to your server, you're going to do an mTLS, mutual TLS, authentication and validate the incoming client cert with the trust bundle that's been injected into your pod. That trust bundle in this case is your partner's trust bundle, and therefore you trust that certificate. On the calling side there, your partner's API is going to make a call to a web server. And when that web server sends back the "hello, I am server XYZ," essentially "hello, I am blue server internal API," it checks that blue certificate to see if it's valid. And it does so against your trust bundle, which is the blue trust bundle. And so it says, yes, this certificate chain matches cryptographically. So we're good to go.
We believe we're talking to the correct blue server, and with that yellow certificate that we send down the line, the blue server is able to validate that yellow server as belonging to the yellow company. The other way to do this is basically not to trade trust bundles but to trade certificates. That works just as well, because the blue server will be validating a blue certificate against the blue trust bundle. That works. The problem with that is that you want to be able to rotate your certificates pretty fast, and that means exchanging certificates with your partners on a regular basis, say every 60 days, every month, every week, whatever your aggressive certificate rotation plan is. You're probably not going to get down to seconds with that. But your CAs aren't going to change all that often. So that's something you can do yearly and then still rotate your certificates. You rotate your blue certificates as fast as you want. They rotate their yellow certificates as fast as they want. The two of you pretty much don't care what the other is doing operationally. You just care that you trust each other.
So here are the details on what a Kube TLS certificate looks like. We start by being called as a webhook. We look up the container's services. One of the flaws in our model is that whatever services the pod is associated with when it starts up are the host names that it gets. If you create the service after you create the pod, then you're out of luck. If you change the labels on either the pod or the service afterwards, and change the services it matches, then you're out of luck. Which is fine so long as you don't do that very often. It's also fine if you move away from doing the DNS SANs and start moving to a model where, rather than authenticating the services, you authenticate the service accounts that you're trying to connect to.
And that's sort of the step that you want to get to eventually. But for the moment, DNS SANs are really easy and compatible with everything. So Kube TLS then creates a CSR. CSRs are a top-level, cluster-level object in Kubernetes, and you can use them to do certificate signing and approval. It will approve the CSR, or its approval controller will. And then it attaches a secret to your pod by way of mutating your pod, and responds back to Kubernetes with that. Yeah.
So these are the key fields that we use in the X.509 certificate that we're creating and injecting into the pod. The common name is the name of the pod. The common name used to be the website that the certificate served for, but now that has moved to the SAN, the subject alternative name. We've used SAN many times without defining it, but the SAN is the subject alternative name. And that's the list against which most of your HTTP machinery will validate. Like, am I talking to server X? If the certificate that server X presented you contains the SAN that says server X, then yes, you're talking to them. And, by the way, that it cryptographically chains up to the root, right? That's what we check. So there's the name of the pod, and the subject alternative names are DNS names: the names of the services, the name of the pod, and any other DNS name that makes sense based on the flags that you give Kube TLS. We also throw in a SPIFFE ID that's based on the service. And so you can use this with your SPIFFE IDs.
We set the key usage. In an X.509 certificate, there's a field that says how you can use this key. There are a couple of values there. One is "this is a root CA." One is "this is a signing CA," typically an intermediate CA. Or "this is a web server," or "this is a client certificate." And we set both of those last two: you are a web server and a client certificate, so that you can use the same certificate on either side, on the client side or the web server side. We do the same thing that Let's Encrypt does. Yes. That's what's important.
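The fields just listed can be sketched as a certificate signing request in Go. This is an illustration under assumptions, not the Kube TLS source: the SPIFFE ID layout `spiffe://<trust domain>/ns/<namespace>/sa/<service account>` and the trust domain `cluster.local` are a common convention rather than something the talk specifies, and the `<service>.<namespace>.svc` DNS form is likewise assumed. The key usages are not part of the X.509 request here; in a Kubernetes `CertificateSigningRequest` object they are listed in `spec.usages`, using the strings shown in `usages`.

```go
package main

import (
	"crypto/rand"
	"crypto/rsa"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"net/url"
)

// trustDomain is a HYPOTHETICAL SPIFFE trust domain for this example.
const trustDomain = "cluster.local"

// usages are the Kubernetes CSR usage strings for a certificate that
// can act as both a TLS server and a TLS client.
var usages = []string{"digital signature", "key encipherment", "server auth", "client auth"}

// buildCSR creates a key pair and a PEM-encoded CSR with the pod name as
// common name, DNS SANs for the pod and its services, and a SPIFFE URI SAN.
func buildCSR(podName, namespace, serviceAccount string, services []string) ([]byte, *rsa.PrivateKey, error) {
	key, err := rsa.GenerateKey(rand.Reader, 2048)
	if err != nil {
		return nil, nil, err
	}
	dns := []string{podName}
	for _, svc := range services {
		dns = append(dns, svc+"."+namespace+".svc") // assumed naming scheme
	}
	spiffeID := &url.URL{
		Scheme: "spiffe",
		Host:   trustDomain,
		Path:   "/ns/" + namespace + "/sa/" + serviceAccount, // assumed layout
	}
	tmpl := &x509.CertificateRequest{
		Subject:  pkix.Name{CommonName: podName},
		DNSNames: dns,
		URIs:     []*url.URL{spiffeID},
	}
	der, err := x509.CreateCertificateRequest(rand.Reader, tmpl, key)
	if err != nil {
		return nil, nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE REQUEST", Bytes: der}), key, nil
}

// describeCSR parses a PEM CSR, checks its self-signature, and returns
// the identity fields a signer or approver would look at.
func describeCSR(csrPEM []byte) (cn string, dns []string, uris []string, err error) {
	block, _ := pem.Decode(csrPEM)
	if block == nil {
		return "", nil, nil, errors.New("no PEM block found")
	}
	csr, err := x509.ParseCertificateRequest(block.Bytes)
	if err != nil {
		return "", nil, nil, err
	}
	if err := csr.CheckSignature(); err != nil {
		return "", nil, nil, err
	}
	for _, u := range csr.URIs {
		uris = append(uris, u.String())
	}
	return csr.Subject.CommonName, csr.DNSNames, uris, nil
}

func main() {
	csrPEM, _, err := buildCSR("greeter-abc12", "default", "greeter-sa", []string{"greeter"})
	if err != nil {
		fmt.Println("could not build CSR:", err)
		return
	}
	cn, dns, uris, err := describeCSR(csrPEM)
	if err != nil {
		fmt.Println("could not parse CSR:", err)
		return
	}
	fmt.Println("common name:", cn)
	fmt.Println("DNS SANs:", dns)
	fmt.Println("URI SANs:", uris)
	fmt.Println("requested usages:", usages)
}
```

Because the DNS SANs are computed once, at request time, this sketch shares the limitation the talk describes: a service created or relabeled after the pod starts will not appear in the certificate.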
And then it also has the identity of the root it's signing up to. So let's do a demo. Demos are fun. Apologies to Semisonic here for the lyrics. Let's mirror this and let's make this a little bit bigger. Is that big enough? Can people read that? It probably helps if I move it up so you can actually see what's going on.
So I already have some of the pods running, but I am going to delete one so that we can see a fresh one. What we've got is a greeter service and a greeter client that we'll demo. It's the gRPC Hello World example, but modified to use TLS nicely as we want it to. So here is our pod, and we can exec into it and look at the certificate that has been generated. It helps if I get my flags quite right. Cool. So here is a certificate. You can see its issuer is our issuer, and that will match up with the CA certificate that is also provided in that file system. You can see the DNS names. It's got the service name, it's got the pod name, and then it's got a SPIFFE ID. And it's just a standard TLS client/server cert. RSA at the moment. I was puttering around with adding ECDSA, but close enough. And this more or less works.
And you can see the CSR that was generated a few seconds ago when I deleted this pod. Kube TLS went through the process of creating this CSR, approving it, and having it signed by a signer that we've set up for this. You can use cert-manager, you can use an external signer, or you can use a very basic one that is here for this demo. All of those work. It's agnostic to the actual signer, which allows you to provide your own or do something better.
And then we can just call a service. The greeter client is just a batch job that runs and exits. So the pod runs and then shuts down. It's a job, so Kubernetes doesn't restart it but saves the logs, and we can examine the logs after it's run. Yeah. And so this just connects. It connects to the appropriate name. It gets the greeting.
And it also sees the service account that we've been running as. So if I were to modify my greeter client, I can set the service account name. That has chosen a very weird indentation. Here we go. Can you stop for one second? Yeah. So scroll down a little bit. We've got some stuff coming out. But when you look at the, okay. Yeah. I don't know what you're pointing at. Yeah, no. So we can create another version of it with a different service account. It's running as a different service account, and now it has this other identity visible to it. Again, through the Kube TLS certificate. And this other identity is not our client saying that. That is the server that it is connected to that has said, "I know who you are. I can verify who you are."
So what else is there that is interesting? The other thing that's sort of interesting is that it's mTLS, right? If I override the CA and I use some other random self-signed mTLS cert... Oh, geez. How did I screw this up? There's a mix of tabs and spaces. And now I am realizing it. Let's fix this. I don't know why you have a caps lock key. We create yet another copy of this. And this one will fail because it is using a self-signed certificate. This one is actually failing on the client side because the server's certificate is signed by an unknown authority. But we can have it fail on the other side when we present it an unexpected key. And if we go look at the logs of our, okay, the debug logs are not working. It should have printed out that we got a failed connection from this other client, that we denied a connection.
So the first failure was when we saw the other side's certificate. When it sent its server hello, "I am server XYZ," along with that certificate, we looked at it and said that doesn't chain to our certificate authority. So we closed the connection from the client side. The second error was that we accepted that certificate because it did chain up.
And then we sent it an invalid certificate that didn't chain up to its cluster trust bundle. And so the server closed the connection on that side. It says "use of closed network connection," which means we tried to write, because we thought this was going to work, but the server closed down on us because it didn't trust our certificate. Cool. I think that's all we have. A couple of wrap-up slides. Yeah. So, yeah, let's move back here. Yeah, there we go.
Yeah. This makes it really easy to have certificates populated in your application's namespace, so your application can make use of certificates rather than trying to wrap it in some network layer that does this all magically. You know, there are advantages to both kinds of magic. We like ours. Yeah. We think all your developers should learn how to use TLS and mutual TLS. We think that should be the price of entry for being a programmer on cloud native. But also that we should make it simple enough that they can. Again, your mTLS demo is 95% playing with certificates. We've collapsed that down to: certificates are provided for you, just write the right code. Yeah.
And there is some agnosticism about identity policy. We intend to provide a reasonable set of defaults that you can flip on and off. And hopefully one day it will be just one. But at the moment there's still some question. Once everybody's establishing authentication through mTLS, then the authorization layer becomes relatively easy. You can see who you're talking to, so then you can pick what their actual authorization level is.
So, other future directions. Private keys should be generated on the nodes and never leave the nodes. Currently they're created through secrets. That's suboptimal. If they were generated on the nodes, then the private key material would never leave the nodes. That would be great. So we're planning on moving to a CSI model.
That is not ready for demo, but there's some work on it. Rather than mounting a secret, you mount a Kube TLS CSI volume that just generates the private key material for you when the pod comes up, and the rest of it happens pretty much the same way. But it happens on the node rather than at a central set of pods on your cluster. So that logic is on every node and the private keys stay on their node, never getting transmitted. Even though they're being transmitted by the Kubernetes control plane over TLS, we'd rather just not transport those around. Yeah. And this should be built into Kubernetes. But you know, that's our opinion. We like it. Cluster trust bundles are one step in that direction. It's a huge step. We hope that will knock down those dominoes and move things forward. Cool. Plan for world domination. Yeah.
Here is our repo. Here are our presentation slides. And here is the session feedback link. Sorry to keep you from the vendor booth crawl. There should be beer available now or soon. But thank you. We're here for questions.
Yes, though, in a branch. Yeah. I will be glad to help you run that demo on your cluster if you would like to. If you look at the git commits, it was like, I don't know, about an hour ago. Yeah. Go ahead. Yes, but we're not confident enough in the code quality yet to do that. Mostly because I wanted to make the demo work. Yeah. Cool. Thank you very much. Question in the back.
So we work with any CA that provides a Kubernetes signer. That is a very short list at the moment. That's cert-manager. But at the same time, there is also a, not even in the branch, super janky demo of how you use an AWS CA to provide a Kubernetes signer, which I spent 20 minutes on last weekend. So let's talk about it. I'm glad to do the work to make that work, and I do intend to do the work to make that work. Microsoft has a CA, like parts of their Azure CA are exposed in their AKS stuff, right? Right.
Oh, then it will just work. Yeah, it should just work. If that is already exposed, then yeah, it should just work. So I've tried it with our own CA and I've tried it with cert-manager, and I see no reason why it wouldn't work with the other CA that I've used, which is the HashiCorp Vault signer for CSRs. Anything that is a Kubernetes CSR signer will work, unless there's something really weird about it. Excellent question. Thank you. Yeah. You get an extra beer.
Yeah. Yes and no. The simple answer is that everything in a trust bundle is treated as equivalent in most SSL implementations. And that means that you want to be careful about what you put in your CA list. I do believe that you are getting at a leading question about how OpenSSL handles CA PEMs, which I, yeah. And then you add new certs. Oh, yes, of course. Yeah, and that answer is no. That answer is very clearly no. And if you start your application and then you want to rotate certificates, the answer is still no. Your application, unlike with some other things, doesn't pick up new certificates on a regular basis. We don't provide a rotation mechanism for these secrets. What should you do about that? Kill your pod. Kill your pod every 30 days. Yeah. No. Pretty simple.
Yes, but at the same time, we're going for the least common denominator here. We don't expect, yeah. We are absolutely limited by all of the limitations of your SSL library, whether that be OpenSSL, LibreSSL, Golang's own TLS implementation, whatever you're using. Bouncy Castle, whatever. Yeah, please not Bouncy Castle. Yeah. Okay. Cool. Well, we've got one minute. I think we, yeah. We're good. Yeah. Time to turn off the mics.