Hi everybody, thanks for joining our talk today. This year, of course, it's virtual, so we don't get to see all of you in person, but we're hoping this time next year we'll be at KubeCon EU when it's safe to travel, and we look forward to seeing everyone then. Today we're going to be talking about service identity and how it's fundamental to achieving zero trust in a cloud native environment. I'm Matt, I'm from Jetstack, just like Josh, who's joining me today. We're a company focused on cloud native, and we've been working in the Kubernetes ecosystem since the very early days, back to about 2014, 2015, when Kubernetes was first open sourced. Not really that long ago; it's amazing how far we've come. In that time we've worked with many companies who have adopted cloud native, from startups to some of the largest enterprises, and along the way we've contributed to open source upstream and also open sourced a number of projects ourselves. We're probably best known for the cert-manager project, which we originally created and now work on with lots of contributors in the community, and of course it's now a CNCF project as well. Josh, do you want to introduce yourself?

Yeah, hi everyone. I'm Josh, I'm a software engineer at Jetstack. I do a few things at Jetstack, but primarily I'm part of the cert-manager team.

Great, thanks Josh. So today, as I said, we're going to be looking at zero trust. Of course, zero trust is quite a buzzword, but we're going to look at it specifically in the context of cloud native: what it means for Kubernetes, and also what it means in a service mesh, how you can achieve zero trust using the identity management that a service mesh provides. We're also going to look at how you can plug in cert-manager; Josh has been working on some of the integrations that tap into managing certificates and providing those to the mesh.
And we've got a number of demos as well, so let's get going. Briefly, to start, we're going to introduce the concept of zero trust in the context of cloud native. In traditional network security, it's all very much about having layers of security, defense in depth if you like, in order to protect the most sensitive services. And the model is really based on protecting networks from external attackers. It often gets described as the castle-and-moat approach, in which everything inside is trusted: once you're in the castle, effectively, you're trusted. But this overlooks the threat of insider attack, and there have been a number of high-profile cases where breaches have occurred internally. So now let's move to the world of cloud. Does this traditional network security model still work? In the world of cloud, multi-cloud and hybrid cloud, it's really challenging. It's no longer clear where the perimeters lie, and it's very difficult to find a consistent means of drawing up those perimeters. The fact is that the traditional model of securing application networking is based on infrastructure that changes very infrequently; it's static. We rely on fixed IP addresses, ports, access control lists, and firewalls, and in a world of cloud native this becomes highly challenging. Why? Because we've got clusters in different clouds, on different networks, and even on virtual networks. And the workloads that run in those clouds and clusters are highly dynamic and ephemeral: some are very short-lived, some much longer-lived. And it's not just Kubernetes, right? We've got functions, serverless, standard virtual machines, and of course infrastructure on premise that we also need to integrate. So it's clear that this sort of perimeter-based approach just doesn't fit, and it doesn't scale.
And it's incompatible with how we're building and deploying software today. So instead, what the industry is moving to is a model where effectively the network is completely untrusted and there is no implicit trust: you cannot trust another service. That doesn't mean giving up on the perimeter defenses, of course, but now we're assuming that attackers could well be in our network; in fact, they are in our networks. And that means making sure that all communications, the service-to-service calls, are secure, meaning encrypted on the wire, but also that the services are able to authenticate to each other, and that's each and every service, each and every time. So how do we do this? Well, it means each service has to be able to identify itself, and we're talking here about a cryptographically verifiable, unique identity that it can attest to. With identity, we can authenticate between services, and once we've got that, we can also begin to make authorization decisions. We understand which service it is, and we can make a decision about what that service is requesting: for instance, can this type of request be made by this service against this particular resource? And we've then also got the ability to audit this, providing things like threat detection, and so on. Pretty much, though, the foundation of all of this is identity, machine identity as it's now being referred to by the analysts. So rather than relying on developers obtaining identity certificates themselves and implementing layers on top of that identity for things like observability and reliability, service meshes provide all of this built into the platform. And this is a capability that is programmable, so it's dynamic and is controlled with control plane configuration, while the data plane itself is made up of proxies.
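To make the authorization idea concrete: once every request carries a verifiable service identity, policy can be written against that identity rather than against IP addresses. As a sketch, this is roughly how a mesh like Istio expresses such a rule with its AuthorizationPolicy resource; the names httpbin and frontend are hypothetical:

```yaml
# Hypothetical example: only the "frontend" service account may issue
# GET requests to pods labelled app=httpbin; the mesh proxies enforce
# this before traffic ever reaches the application.
apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: allow-frontend-reads
  namespace: default
spec:
  selector:
    matchLabels:
      app: httpbin
  action: ALLOW
  rules:
  - from:
    - source:
        # The service identity, derived from the SPIFFE ID in the
        # client certificate presented over mutual TLS.
        principals: ["cluster.local/ns/default/sa/frontend"]
    to:
    - operation:
        methods: ["GET"]
```

The key point is the `principals` field: it matches the cryptographic identity in the peer certificate, not anything about the network the request came from.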
So now, rather than the services, and I've got an example here of service A and service B, communicating directly, the services communicate via these proxies, and the proxies are securely connected using mTLS with credentials which are obtained and renewed and have that identity encoded in them. We're going to talk a bit more about how the identities are obtained and what they are. But service meshes are incredibly convenient: they take away a lot of what a developer would typically have to do, and as we know, when developing things like this, it's quite easy to get security wrong. So service meshes are really convenient because they take away a lot of this difficult, complex implementation, put it into the network layer if you like, and make it programmable. Today we're going to look specifically at how service meshes can help us manage that identity and then, importantly, use that identity to do things like policy-based authorization. I mentioned mTLS, or mutual TLS, and that is how the proxies communicate in a mesh between the services that are in the mesh. Josh, you're going to talk to us about what mTLS is and how it works.

Great, so I'm going to talk about mutual TLS and how a service mesh uses mutual TLS to create a zero-trust network. As was mentioned earlier, all workloads running in the mesh, all the Kubernetes pods, have a sidecar container running next to them, and that sidecar container is a proxy. The responsibility of the proxy here is to intercept all ingress and egress traffic for the service which is backing that proxy. So what this means in terms of connections is, say service A wants to talk to service B: service A will open up that connection, it will get intercepted by service A's proxy, that proxy will then connect to service B's proxy, and then, assuming that connection is okay, the traffic will get forwarded on to service B.
Now, it's worth noting here that the way these proxies are injected means there are no code changes needed for any service. As far as either service is concerned, they're connecting to each other normally over HTTP, et cetera. So if you think about what the network flow looks like: when these services boot up, they request these crypto bundles from the control plane of the mesh they're running in. These crypto bundles contain the root of trust of the mesh, along with a signed certificate which contains their machine identity, the service identity, as well as the private key which is the corresponding pair to that signed certificate. These are stored inside the proxy and used to open connections to the other proxies. So if we say that service A wants to connect to service B, what will happen is service A's proxy will connect to service B's proxy. The server will, just like in normal TLS, present its signed certificate. The client will then verify the contents of that certificate: it's going to verify that it's been signed by the root of trust it's expecting, and also that the identity matches what it's expecting. It can also make authorization decisions here based on the identity it receives. It's also going to challenge the contents of that certificate, making sure that the service B proxy actually owns the certificate it's presenting, meaning it also holds the private key. And if it's happy, it's going to send the server its own client certificate, and that process is going to repeat. So what you get here is: in normal TLS, the client verifies that the server's certificate matches what it's expecting; here, the server also verifies that the client's certificate is what it's expecting.
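The handshake described here is standard mutual TLS configured on both ends. As a minimal sketch in Python's ssl module, with placeholder file names for the mesh root of trust and the proxy's own credentials, the two sides differ mainly in that the server explicitly demands a client certificate:

```python
import ssl

# Server side of the proxy-to-proxy connection: require and verify a
# client certificate; this is what makes the TLS "mutual".
server_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
server_ctx.verify_mode = ssl.CERT_REQUIRED  # reject peers with no certificate
# server_ctx.load_cert_chain("proxy.crt", "proxy.key")  # our signed identity
# server_ctx.load_verify_locations("mesh-root.crt")     # the mesh root of trust

# Client side: verify the server against the same root of trust and be
# ready to present our own certificate when challenged.
client_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
client_ctx.check_hostname = False  # mesh proxies match SPIFFE/DNS SANs themselves
client_ctx.verify_mode = ssl.CERT_REQUIRED
# client_ctx.load_cert_chain("proxy.crt", "proxy.key")
# client_ctx.load_verify_locations("mesh-root.crt")
```

Note that a real mesh proxy additionally checks that the SAN in the peer's certificate matches the expected service identity, which plain chain verification does not do for you.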
So this is where the mutual trust comes in. If you take a bird's-eye view of how this looks, you can see that the control plane component which I was mentioning earlier is responsible for delivering these bundles to the workloads on boot and on renewal. If we take Istio's case, which we have here, it's the Citadel component which is responsible for facilitating sending these crypto bundles. So it's Citadel that's going to be signing the certificates which contain the identity of each service. The identity of each service in the Kubernetes world is typically based on the service account. So on boot, the service A proxy, for example, sends a request to Citadel using its service account token; Citadel verifies that the token is signed by the API server, and then signs a certificate to send back to the service A proxy. The way the identity is encoded in the X.509 certificate is done either as a DNS name or, in Istio's case, through SPIFFE, as a URI. SPIFFE is a framework for encoding identities into identity documents. In the case of X.509, this is actually just a string in the URI SAN part of the certificate, encoded like you see here. You have the trust domain, so in this case it's cluster.local, but that could be europe-west1, mesh-2, what have you, followed by whatever makes sense in the context of the trust domain it's operating in. Again, in the Kubernetes world this is the service account, so it'll be followed by the namespace that the service account belongs to and then the name of the service account itself.

We're now going to look at how service identity works with a number of different service mesh implementations. We've got three, and we're going to look at them briefly and show you a bit of a demo. To start with, we're going to look at Istio.
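The encoding described above can be pulled apart mechanically. Here is a small sketch, assuming the Kubernetes service-account shape that Istio uses, `spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>`:

```python
from urllib.parse import urlparse

def parse_spiffe_id(uri: str) -> dict:
    """Split a Kubernetes-style SPIFFE ID into its components.

    Expects the shape Istio encodes into the URI SAN:
    spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>
    """
    parsed = urlparse(uri)
    if parsed.scheme != "spiffe":
        raise ValueError(f"not a SPIFFE ID: {uri}")
    parts = parsed.path.strip("/").split("/")
    if len(parts) != 4 or parts[0] != "ns" or parts[2] != "sa":
        raise ValueError(f"unexpected SPIFFE path layout: {parsed.path}")
    return {
        "trust_domain": parsed.netloc,     # e.g. cluster.local
        "namespace": parts[1],             # Kubernetes namespace
        "service_account": parts[3],       # Kubernetes service account
    }

print(parse_spiffe_id("spiffe://cluster.local/ns/default/sa/httpbin"))
# {'trust_domain': 'cluster.local', 'namespace': 'default', 'service_account': 'httpbin'}
```

This is exactly the identity an authorizing proxy compares against its policy after the mTLS handshake succeeds.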
So this is the architecture of Istio, focusing in on the components that are used for security controls. To point out here, Istio's control plane has a component called Citadel, now part of istiod, and that is effectively responsible for providing the X.509 TLS certificates to the proxies, in this case the Envoy-based proxies. Now, out of the box it's self-signed, so it has a self-signed CA, and you can really just set it up and use it as is. But if you wish to integrate this into an existing PKI, there is a means to plug your own CA into Istio, and Josh is going to show how to do this with a demo. Over to you, Josh.

Thanks, Matt. So what I have here is two certificates already installed inside the cluster. I have a root CA, which is the root of my PKI. This would typically live offline, outside the cluster, but I have it installed here. And I've also minted an intermediate certificate authority, which is going to be used as the root of trust for the Istio mesh that I'm about to install. The first step for using this custom intermediate CA with Istio is to extract the private key, CA certificate, and chain into a format which Istio accepts. So all I'm doing here is extracting those and putting them into various files. I'm then going to create a secret from those files. So when I install Istio here, since that secret is already available, istiod is going to read those files from disk, because they're mounted from the secret, and it's going to use those keys and certificates as its signing CA. What's going to happen then is that when all the workloads in the mesh request their crypto bundles from the control plane, those bundles are going to be signed by the intermediate that we've minted here. So now we have the Istio service mesh installed with our custom intermediate CA.
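For reference, the secret that istiod looks for is named `cacerts` in the `istio-system` namespace, with the key names shown below; the PEM contents here are elided placeholders:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: cacerts           # istiod looks for this name on startup
  namespace: istio-system
type: Opaque
stringData:
  ca-cert.pem: |          # the intermediate CA certificate istiod signs with
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----
  ca-key.pem: |           # the intermediate's private key
    -----BEGIN PRIVATE KEY-----
    ...
    -----END PRIVATE KEY-----
  root-cert.pem: |        # the offline root, distributed as the root of trust
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----
  cert-chain.pem: |       # chain from the intermediate up to the root
    -----BEGIN CERTIFICATE-----
    ...
    -----END CERTIFICATE-----
```

If the secret exists before istiod starts, the self-signed CA is never generated and every workload certificate chains up to your own root.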
And now I just want to verify that the identities issued by the Istio control plane are indeed signed by that intermediate. All I'm doing here is deploying a couple of dummy applications, httpbin and sleep, and opening up a TLS connection from the sleep proxy over to httpbin. I'm extracting the certificate that the httpbin proxy responded with, and if we look at the contents there, you can see at the bottom that it was indeed issued by the Istio intermediate certificate authority. And indeed, if we look at the SAN, the URI is the specific identity we're expecting. So it's similar to what we were looking at earlier: the trust domain is cluster.local, and the identity is the httpbin service account in the default namespace.

Thanks, Josh. That was great to see. We're now going to move on to Linkerd, which of course is a service mesh that's in the CNCF. It has a very similar architecture: a control plane and a data plane. It's somewhat different in that the proxy is not Envoy; it's the Rust-based linkerd-proxy. But it also has a component in its control plane, much like Istio, that's responsible for identity and for providing that identity to all of the proxies in the data plane. And so, just as before, we're going to see a demo of how to plug an intermediate into Linkerd, see it set up, and set a workload running.

Great, so next I'm going to be installing the Linkerd mesh with a custom root CA. Again, I have the root of my PKI living inside the cluster; again, this would normally live offline somewhere. I've also minted an intermediate CA from that root, which I'm going to be using as the root of trust for my Linkerd mesh. So what I'm doing here is grabbing the CA certificate and key of that intermediate and then installing Linkerd using that CA.
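The minting step in these demos can be sketched with openssl; Linkerd expects ECDSA P-256 certificates for its identity issuer, and the install flags at the end (shown commented out, since they need a live cluster) are how the trust anchor and issuer credentials are handed over:

```shell
# Mint a self-signed root CA (in production this key stays offline).
openssl req -x509 -newkey ec -pkeyopt ec_paramgen_curve:P-256 -nodes \
  -keyout root.key -out root.crt -days 365 -subj "/CN=demo-root" \
  -addext "basicConstraints=critical,CA:TRUE"

# Mint an intermediate CA signed by the root; this becomes the mesh issuer.
openssl req -newkey ec -pkeyopt ec_paramgen_curve:P-256 -nodes \
  -keyout issuer.key -out issuer.csr -subj "/CN=demo-issuer"
printf "basicConstraints=critical,CA:TRUE\n" > ca_ext.cnf
openssl x509 -req -in issuer.csr -CA root.crt -CAkey root.key \
  -CAcreateserial -days 90 -extfile ca_ext.cnf -out issuer.crt

# Confirm the chain the mesh proxies will verify against.
openssl verify -CAfile root.crt issuer.crt

# linkerd install \
#   --identity-trust-anchors-file root.crt \
#   --identity-issuer-certificate-file issuer.crt \
#   --identity-issuer-key-file issuer.key | kubectl apply -f -
```

The root certificate becomes the trust anchor every proxy validates against, while the intermediate's key is what the control plane signs workload certificates with.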
It's worth noting here that I don't need to create a custom secret, since the Linkerd install expects the same format as my cert-manager certificate that contains the intermediate CA. Great, so once again I want to verify that the identities provided to the services running in my Linkerd mesh are those we're expecting and are signed by the intermediate CA that we minted earlier. So again, I'm deploying the dummy services, and this time I'm going to do the trick the other way around: from the httpbin proxy, I'm going to open a TLS connection to the sleep pod. And if we grab the certificate which was returned, you can see here that it was indeed signed by the intermediate we're expecting, there in the issuer, and the certificate itself contains a DNS name in the form we're expecting for the identity of the pod we connected to. So in this case it's a DNS SAN, and indeed it's the sleep service account which was returned as the identity.

Thanks, Josh. So finally, we're now going to look at Open Service Mesh. This is another service mesh that is in the CNCF; it's in the sandbox, just like cert-manager. It's a project from Microsoft that also uses Envoy in its data plane. From the get-go it has had support for certificate management, with a component responsible for generating the various certificates and distributing them across the mesh. And we actually worked quite early on with the project to integrate it directly with cert-manager. So it's got its own built-in component, and I believe support for HashiCorp Vault and Azure Key Vault, as well as cert-manager. And so, Josh, you're going to demonstrate how it works with cert-manager.

So finally, we're going to install Open Service Mesh. Like the others, I have a root CA installed in the cluster and an intermediate minted from that root.
Like the others, I'm going to extract the CA certificate, which I'm going to supply to Open Service Mesh via a secret. The interesting thing with Open Service Mesh is that it integrates, as mentioned, directly with cert-manager. So I actually have an issuer installed in the cluster which is backed by that intermediate CA. What this means is that not only the control plane of Open Service Mesh but also the workloads running in the mesh will request their certificates via CertificateRequest resources, which are cert-manager resources, and the issuer that they reference will go ahead and sign those requests. The nice thing about this is that the CA behind the cert-manager issuer referenced by Open Service Mesh, in particular its private key, doesn't need to live in the cluster whatsoever; it can live somewhere else, just as long as it can sign those CertificateRequest resources. So lastly, once again, I want to verify that the identities of the services I deployed to my Open Service Mesh cluster are indeed signed by the intermediate CA that we minted earlier, and that the identities match what we're expecting them to be. So again, I've deployed the dummy services, and I'm going to be using the Open Service Mesh CLI to dump out the proxy configuration of our sleep pod here. If we grab the certificate which was dumped out, we can see again that it was issued by the intermediate we minted earlier, and the identity contained within that sleep certificate is the identity of the sleep service account. In the Open Service Mesh case it's represented as a DNS SAN, and you can see here that it's the sleep service account in the default namespace, in the cluster.local trust domain. Thanks, Josh.
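The wiring on the cert-manager side is just a CA issuer backed by a secret holding the intermediate; the names here are hypothetical, and Open Service Mesh is pointed at this issuer through its own configuration:

```yaml
# A cert-manager CA issuer that signs with the demo intermediate.
# The referenced secret is assumed to contain the intermediate's
# tls.crt and tls.key; swap in a Vault or external issuer and the
# private key never has to live in the cluster at all.
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: osm-ca
  namespace: osm-system
spec:
  ca:
    secretName: osm-ca-bundle
```

Every CertificateRequest the mesh creates then references this issuer, and cert-manager returns the signed certificate for the proxy to use.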
That was great to see the meshes in action: how it's possible to plug in an intermediate CA for the trust domains, have the mesh issue those signed certificates, and distribute them to the proxies to use as the service identities for the applications. It's really, really neat. We should probably, thinking about it, open source those demos, Josh; let's try and make that happen. Of course, it's worth pointing out that there are many other meshes out there and we had limited time today, so we've only shown three. For the most part they've got the SPIFFE identity that Josh spoke about, but there are others, and there are a number of them here that many of you will possibly be familiar with already. It's also worth pointing out that there are some mesh-like implementations out there, and specifically here we've got Dapr, the Distributed Application Runtime, from Microsoft. That's interesting because it's a runtime that sits alongside your application and provides much of the functionality of a mesh, and more. It's really interesting and worth checking out. So we're nearly at the end of our time now, in fact we're possibly likely to go over, but just to summarize: service identity really is key to zero trust with a service mesh. It all starts with identity, and getting that right provides the foundation for everything else you do with the mesh. The mesh provides the means to automate those short-lived certificates, more often than not carrying SPIFFE identities, and distribute them to the proxies in order for that mutual TLS that Josh spoke about to be established between services. And in most cases this is transparent to the application developer: it's completely built into the platform, and it's not something they need to do themselves.
A number of the meshes provide the ability to be extended, so you can actually start plugging in intermediates, as we demonstrated, meaning that you can integrate this into an existing enterprise chain of trust. There's certainly more to do in this space, and we're involved in some of that with the various projects, but it's great to see where we already are and what's possible. So we look forward to the live Q&A. Thanks everyone for joining us today. Stay safe, hope to see you all soon at an in-person KubeCon EU, and we look forward to seeing you in the not too distant future. Thanks everyone, all the best. Bye-bye.