It's good to see you all. It's great to be here together in person with our virtual attendees as well. My name is Andrew Harding. I'm a SPIFFE and SPIRE maintainer. I've been involved in those projects for the last three and a half years or so. I'm a staff engineer at VMware, where I'm allowed to continue that work and work on it full time, which is fantastic. It's awesome. Real quick, before we get into today, all these celebrations and good times: everybody, I threw my back out, so I won't be doing any breakdancing today. If you see me tilting to one side, maybe someone can run up and tilt me back into an upright position. Anyway, moving on. We've got a lot to get through today, so our agenda is fairly short, but each of these sections is going to be meaty. We're going to go over our problem statement to make sure we're all on the same page. We're going to do a very quick, hopefully just a few minutes, refresher on SPIFFE and SPIRE to set the stage for the Kubernetes controller work that we're doing, and then demonstrate accomplishing cross-cluster authentication using SPIFFE, SPIRE, and this Kubernetes controller work. Real quick, our problem statement. We're all very familiar, I think, with this sort of setup. We've got multiple Kubernetes clusters. We've got workloads running within those clusters. For the intra-cluster communication story within Kubernetes, there are things that work and that people have done to authenticate between workloads within the same cluster. When you start talking about workloads communicating between clusters, the authentication story gets a little more complicated. There are solutions out there with various pros and cons. Today, we're going to talk about solving it with SPIFFE and SPIRE. What is SPIFFE? It's the Secure Production Identity Framework For Everyone.
It is essentially a set of specifications that are all geared around how you get a workload a cryptographic identity that it can use to authenticate with other workloads, and how you get the public key material around so that receiving parties of these verifiable identity documents can verify the signatures, verify the authenticity of those documents, and get to your authentication. At the heart of SPIFFE is something called the SPIFFE ID. This is the user name for your service, your workload. This is your identity. It's a URI. Very basic, it's got basically two parts to it: you've got a trust domain part, and you've got an entity part within that trust domain. The trust domain is essentially acting as a namespace for these entities. It represents a security boundary or maybe a different administrative domain. A company could have a single trust domain, maybe it has a trust domain each for prod and staging, maybe it has a trust domain per business unit, a banking business unit and an HR business unit or whatever. The point is that these trust domains form this cryptographic boundary, this security boundary, and a trust domain typically has its own PKI, its own set of cryptographic keys, that it uses to sign identities within itself. And that's what we'll talk about next: you've got this ID, and you want to be able to assert this ID cryptographically. So we put it into a document called an SVID, and we've got both X.509 and JWT documents within the specifications for SVIDs. This is just a signed document over that ID, and it's signed, again, by an authority within that trust domain. This is what workloads present as their proof of identity. Receiving parties take something that is called the SPIFFE bundle and use it to verify those identities. Now, the SPIFFE bundle is just a collection of the public keys for the signing authorities within a trust domain. We call it a bunch of different things.
It's a trust bundle or a trust domain bundle, but the correct naming in the specification is the SPIFFE bundle. So we've got these signed identity documents, and we've got these bundles we can use to verify them. How do we get those documents to the workload itself so that it can use them? The answer there is something called the Workload API. This is an unauthenticated API; workloads don't have to bring anything to the table in order to talk to this thing and obtain credentials over it. They just stand up wherever they are, find the Workload API, and say, hey, give me my identity. The Workload API, through all the SPIFFE magic, identifies the workload, applies whatever policy is necessary, and then ships the SVID and the bundles down across the Workload API. As those secret materials change and are rotated (because with SPIFFE you can get rotation all the way from the leaf certificates up through your certificate authorities), sometimes very frequently, those updates are streamed down to the workload. They're pushed down to the workload. So the workload always has a very fresh idea of what the identity universe looks like, how it's able to authenticate other identities, and how to present its own. Now, the very last thing we'll talk about for SPIFFE: we've talked about how the workload receives these documents. The question is, how do other trust domains receive these documents? This is where we're going to talk about federation. If I have two trust domains and they want to be able to authenticate the SVIDs that are produced by each other, they need to get their hands on the SPIFFE bundle material for the foreign trust domain. Now, it's not really called the federation API in the spec; it's called the federation specification.
Inside that, there's something called the bundle endpoint, which is a very simple HTTPS endpoint that you talk to to obtain the SPIFFE bundle. There are rules in there on how you authenticate that endpoint, and a couple of different profiles that you can use to do that. But this is effectively the federation API. It's the way that a foreign trust domain is able to obtain the bundle for your trust domain so it can verify the identities, the SVID documents, that come out of your trust domain. So in a nutshell, what you get with SPIFFE is this cryptographic, verifiable, federatable, frequently rotated, namespaced, uniform identity with no secret zero. That's the crux. That's what SPIFFE is giving you, and that's what we're going to build on today, use today, leverage today, to do our cross-cluster authentication. And we're going to do that with SPIRE. So SPIRE is basically an implementation of all those SPIFFE standards that we just talked about. The goal of SPIRE is to take all the functionality you get through those APIs, all the interoperability, et cetera, and shove that into as many places as we can get it. We want to stick this thing up everywhere. Whether that's inside different hosted cloud services, Azure, AWS, GCP, whatever; whether it's on bare metal servers; whether it's running inside Kubernetes or outside Kubernetes, we want to stick this thing everywhere, because the more places you have the Workload API and the related SPIFFE specifications out there, the more powerful it becomes. The easier it is to just say, hey, I don't care where my workload's running, I'm just going to kick that thing up, it's going to be able to talk to the Workload API, get credentials, and authenticate. That's really what we're striving for: to provide workloads this identity in a way that isn't complicated for them.
And we'll show you later, when we do a little bit of live coding, how easy it is to pick up credentials off the Workload API and do this sort of thing. So let's talk real quick about the components of SPIRE. We're going to start with the workload. That's not SPIRE, but the whole reason SPIRE exists is to light up this workload and give it identity. Now, the Workload API we talked about is the thing that workloads talk to to get their credentials, and that is hosted by a component called the SPIRE Agent. The SPIRE Agent is able to identify the workload, take the policy that it has, use that policy to figure out which identity is supposed to go to that workload or not, and deliver those materials. The SPIRE Agent itself is not the signing authority for the trust domain; that is delegated, along with the policy, to a component called the SPIRE Server. This has a management API that operators can use to configure and register workloads and federation relationships and all that jazz. It's also the component that hosts the SPIFFE federation API. So in our example, we'll have these SPIRE deployments running in each cluster. The SPIRE Servers are going to be talking to each other to obtain the SPIFFE bundle for the other trust domain so we can do our federated cross-cluster authentication. So how do we typically deploy SPIRE inside Kubernetes? Ta-da, we did it. There it is. We've got a SPIRE Server running either in a Deployment or in a StatefulSet; there are a few different ways that you might decide to deploy that. The important bit is that on each node, we've got a DaemonSet running the SPIRE Agent, so that workloads that come up on that node can talk to the Workload API hosted by that SPIRE Agent to get their credentials. All right. So the next bit we're going to talk about is this new work that we're doing in Kubernetes.
Now, I say new because this is a new body of work, but it's not as if SPIRE hasn't had this integration before this point. We've actually had support for a couple of years in a component called the Kubernetes Workload Registrar. The Kubernetes Workload Registrar is something that the community heavily contributed to. There are people who are using it in production. It's got a couple of different operating modes, and it mainly solves the problem of how to automate the registration of workloads within your Kubernetes cluster so that they can talk to the Workload API and get their credentials. We've learned a bunch over the past couple of years from the features it has grown over time, and we think we can take those learnings, go even further, and make it even more turnkey to light up SPIFFE inside of Kubernetes with SPIRE. Right now we have two CRDs defined with this work: one that allows us to declare workload identity, and one that allows us to declare these federation relationships, and we'll show how that breaks down. The idea is that these controllers will look at the SPIRE Server management APIs and decide what has to change in order to reflect the declared state of the world back into the SPIRE Server. Specifically, the workload identity CRD is called the ClusterSPIFFEID CRD. It allows you to define the shape of the SPIFFE ID that your workload is going to get. It allows you to declare which trust domains that workload should federate with. It also allows you to narrow down the specific namespaces where this identity should be available. The second CRD is called the ClusterFederatedTrustDomain CRD. Super basic: it has the name of the trust domain that you want to federate with, and, per the bundle endpoint specification we talked about, the ways you authenticate and connect to that bundle endpoint.
It has all the information that you need to talk to that endpoint and authenticate it in order to receive the bundle. And that's it so far. We might have other resources that we add at later points, maybe go down the operator route. All of this is still in flux, under design and development right now. Let's get into our demo architecture. So we'll start today with our typical SPIRE deployment in Kubernetes, and then we're going to toss the controller into the mix. The controller is going to watch the Kubernetes API, get reconciliation events for these different resources, and then reconcile the state of the world with the SPIRE Server, programming the workload identity and the federation relationships that are needed. And then, of course, the SPIRE Server is going to be applying policy and whatnot, and it's going to end up at some point shipping these SVIDs and bundles down to the SPIRE Agent and on to the workload. And that's our whole picture within a single cluster. Now, between clusters: you can see here on the left we've got cluster one, on the right we've got cluster two, and we've lumped all that SPIRE infrastructure into one little box named spire on each side. And then we've got our workloads. We have a greeter server running in cluster one and a greeter client running in cluster two. The SPIRE Servers are going to be programmed using that ClusterFederatedTrustDomain CRD to federate with each other. So they're going to start talking to each other and pulling SPIFFE bundle material into each other's trust domains. And of course, the workloads are going to connect up through the Workload API, obtain those SVIDs and SPIFFE bundles over the Workload API, and then they're going to talk to each other over mTLS and do mutual authentication. All right. Let's do the demo. Okay. Let me fix my display up here real quick so I can also see my terminal. All right. You guys see that okay?
Do I need to bump up the font size? I'll go a little bit bigger. Can never be too careful. Okay. All right. So the first thing I'll show you is this workload that we've developed. We're not going to write it all from scratch; that would take way too long. So I've got most of the code in place. Let me show you what it does first. The greeter server, I'll build that real quick. And the greeter server is just going to be sitting there listening on port 8080 over plain text, because we haven't changed it to take advantage of the Workload API yet. And on this side we will start up the greeter client. This thing's going to connect to the server and issue a request and say, hey, say hello to Joe, and the server is going to respond with a message that says, hey, someone wants me to say hello to Joe: hello, Joe. The server doesn't know who the client is that connected to it. It has no idea who connected over this plain-text connection and issued the request. It doesn't know the identity of the client. Likewise, the client doesn't really know who satisfied the request. It's some server. So we're going to fix that up so that each side can not only authenticate but look at the identity of the calling party. Okay. So let's make those code changes real quick. And really the point of this is to demonstrate how easy it is to wire up the Workload API into your applications. So we'll start with the server. All right. Here's the server code. Very easy. We have a little flag to configure our listening address. We create our listener and bind our port, right? Then down here we create our gRPC server and register our implementation, and then we just serve that gRPC server forever, right? Our implementation is very trivial. It essentially receives the request, does a little bit of logging, and then responds with a little message to say hello to whoever it was asked to say hello to. We've got a couple of placeholders inside our implementation.
We've got a to-do up here to wire up the Workload API to get some SPIFFE credentials to power this gRPC server. And then down here we've got a little to-do to extract the identity of the client from the request context so that we can log it and send back the appropriate message. So we'll start here with this first one. We're going to be using a library that is provided in the SPIFFE org. It's the go-spiffe library. We've got a version two that we've been working on for quite some time, and that's what we're going to use to do this work, okay? So the first thing we're going to create is something called an X509Source. This is a source of X.509 materials that we're going to obtain from the Workload API. So we'll create that source here. I'm going to borrow some error handling from the thing above. And when we're done, we should close this source down. So now we have a source for X.509 materials, for both the SVID and the bundles. The next step is to create some gRPC credentials. You'll notice that when we created our gRPC server, we didn't configure it with any credentials. So let's create those credentials now. We're going to use a little helper package provided by the go-spiffe library to give us mTLS server credentials. And we're going to give it the source twice here. This is a very low-level and flexible library. The first parameter here is the source of the SVID. The second parameter is the source of the bundles. We're going from the same source for both materials, so we'll pass the same source twice. And then the third parameter here is a little authorization hook. Here we're authorizing any ID: as long as it is authenticated against a bundle that is provided by the source, we'll allow the connection in.
But you can imagine, if you wanted to hook in your own custom authorization libraries, if you wanted to do Rego in OPA, or something else, a flat file, whatever you want to do, this is the parameter that you'd use to influence the authorization of the connections. So now that we've got our credentials, let's pass the creds into the server. That's it. That's all we need to do to provide SPIFFE credentials to our gRPC server. Down here, to obtain the client identity, we're going to again use that gRPC credentials package as a convenient little helper. We can give it the context of the request, and it will extract the peer ID from that context. And if we're able to successfully do that, we're going to overwrite the value of this client ID variable with the identity of the peer. Let's make sure that builds. No, I did not. I missed a C. There we go. Now it should build. Okay, so the server's done. Let's go to the client. The client, again, is very basic. You can configure the address through a flag. It dials the server, creates a little client over the gRPC connection, and then sits here in a loop and issues a request to the server every 10 seconds. And down here, to issue the request, it just invokes this RPC on the client and then logs out the response. We have the same set of to-dos inside the client that we did on the server: the to-do up here to wire up the Workload API, and, on the response of the RPC, to somehow extract the ID of the server. So just like in the server, we'll start here at the top to wire up the Workload API. We're going to do the exact same thing here that we did on the server: create an X509Source, with error handling there. And this time, when we create our credentials, we're going to use the mTLS client credentials. We're going to give it the same source for SVID and bundle. And this time we'll do something cute. We're going to authorize a specific ID.
If it doesn't have this ID, we're going to reject the connection. And the ID we're going to give it is the SPIFFE ID of the greeter server in the cluster one trust domain. Hopefully I didn't mistype any of this; otherwise this is going to crash and burn. All right. So that's our first to-do... no, we're not done yet. Again, we're not providing transport credentials to this dialing operation, so let's fix that. That should be it for our first to-do. All right. We're wired up to the Workload API to get our client creds and the bundle we need to verify the server. And then down here, we're going to do the same sort of thing. Now, this time we don't have a request context; we're going to get our identity from a peer object. And I went a little backwards here: where does this peer object come from? To get that, we need to pass something into the RPC call, and gRPC will helpfully populate this peer structure with the peer information for the server that fulfilled this request. So down here, we'll extract that peer ID, and if we're able to, we'll go ahead and overwrite the server ID with the identity of the server. All right. We built. Okay. So let's go ahead and just run make. It's going to build the Docker containers and push the server into the cluster one cluster and the client into the cluster two cluster, which I have running, hopefully in a pristine state. Let's make sure this finishes. Great. I've got a tmux window here for cluster one, and I've got a similar window for cluster two. Let's go ahead and deploy all of that SPIRE infrastructure, including the SPIRE Server, the SPIRE Agent, the controller, et cetera. Okay. That's all deployed. Let's do the same thing in cluster two. All right. And like we talked about, next we're going to federate these two trust domains. And to do that, we have that ClusterFederatedTrustDomain CRD.
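For reference, a ClusterFederatedTrustDomain resource of the kind about to be created might look roughly like the sketch below. This is an illustration, not the demo's exact manifest: the API group and version, the external IP, and the endpoint SPIFFE ID are placeholders, and as noted earlier these CRDs are still in flux.

```yaml
apiVersion: spire.spiffe.io/v1alpha1   # hypothetical group/version
kind: ClusterFederatedTrustDomain
metadata:
  name: cluster2
spec:
  # The foreign trust domain we are federating with
  trustDomain: cluster2.demo
  # The other cluster's bundle endpoint (placeholder external IP)
  bundleEndpointURL: https://203.0.113.10:8443
  # How to authenticate the endpoint when pulling down the bundle
  bundleEndpointProfile:
    type: https_spiffe
    endpointSPIFFEID: spiffe://cluster2.demo/spire/server
```

Each cluster applies the resource describing the *other* cluster's trust domain, which is exactly the swap performed in the demo.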
Real quick, if we look at the services provided in this first cluster, we can see the SPIRE Server bundle endpoint service. This thing has an external IP. This is what the second cluster is going to use to talk to the bundle endpoint in the first cluster. Creating the resource by hand is a pain, so I've got a little script here that'll create it for me. And if we scroll up, there we go. We've got a ClusterFederatedTrustDomain resource. It's got the trust domain that we're going to be federating with. It's got the bundle endpoint URL. It's got the profile information: how do I authenticate to this thing to pull down the bundle? This is the resource from cluster one that we need to apply into cluster two. So I'm going to go ahead and put that in my clipboard, jump over to cluster two here, and apply that thing. And now we're going to do the same thing for cluster two: copy the cluster two resource and apply it inside cluster one. Just to see what we've got here... sorry, wrong resource. We've got this resource programmed with the right endpoint URL, and it has the profile information needed to connect to it. Okay, perfect. All right, so now, if we peek into the SPIRE infrastructure a little bit inside cluster one, we can ask the SPIRE Server, hey, give me all the federated bundles you have. And we can see that it does indeed have the bundle for cluster two. Okay, and we can do the same thing in cluster two if we want to show that it has the bundle for cluster one. Okay, so now that these things are federated, let's go ahead and deploy that greeter workload, server and client, into each trust domain. So in cluster one, we'll deploy the greeter server. That should be getting up and running, and now we'll do the same for the greeter client. And now we want to use the ClusterSPIFFEID CRD to describe the identity for these two different workloads.
So in cluster one I've got a little resource prepared here for the greeter server. You can see here the SPIFFE ID template. Now, I don't have any template parameters on this; we're just doing a fixed ID, but there's a lot that you can fold into this template to take properties from the pod or the node or whatever and reflect that in the ID. We also have a pod selector, because we're going to sharpshoot the greeter server running in this cluster as the one to whom this identity belongs. And then we've got a little federates-with clause here, and this tells SPIRE to allow this workload to obtain the cluster2.demo trust bundle over the Workload API. All right, so let's apply that, and we'll go to cluster two and apply the same sort of thing for the greeter client. I can show you that. It's very similar: got this fixed ID, we're matching on the greeter client pod, and it can federate with cluster one. All right, so now we can look at the logs (we probably should have earlier) for these workloads; they should be talking to each other. That client should be sitting in that loop, talking to the server every 10 seconds. So let's dump the logs from that greeter client, and there we go. This is a much better experience. No more "some server," "some client." We have the identity of the server right here, right? The greeter server that the client was able to authenticate over that mTLS connection. And the server was able to pull out the client ID from the request and reflect that in the message it returned to the client. There's a lot going on under the covers, but this is it. This is cross-cluster authentication using SPIFFE and SPIRE inside Kubernetes. There are a lot of details we can dig into that we don't have time for in this presentation, but I will be posting this demo online so folks can run through it at a later point if they want. So, back here: what did we do today?
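For reference, the greeter server identity resource described here might look roughly like the sketch below (the client side in cluster two mirrors it). Again an illustration: the API group and version, the label selector, and the exact field names are assumptions based on the fields mentioned in the talk (a fixed SPIFFE ID template, a pod selector, and a federates-with clause).

```yaml
apiVersion: spire.spiffe.io/v1alpha1   # hypothetical group/version
kind: ClusterSPIFFEID
metadata:
  name: greeter-server
spec:
  # Fixed ID here, but the template can fold in pod or node properties
  spiffeIDTemplate: spiffe://cluster1.demo/greeter-server
  # Sharpshoot the greeter server pod as the owner of this identity
  podSelector:
    matchLabels:
      app: greeter-server
  # Allow this workload to obtain the cluster2.demo bundle
  federatesWith:
    - cluster2.demo
```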
Well, we used CRDs to program our Kubernetes clusters with the workload identity that we wanted. We also declared our federation relationships, and using those we were able to get two clusters to perform federation with each other and exchange SPIFFE bundles so that we could authenticate SVIDs from each of the trust domains. Then, using the workload identity CRD, we were able to configure SPIRE so the workloads could connect to the Workload API, obtain their credentials, and then do mutual TLS between each other and get that strong authentication layer. If you'd like to learn more, there's a website at the top, and there are a couple of our GitHub repositories (there's more than this). This specific controller has its own repository; I'll make sure that gets published during the next week or so. And lastly, we've got our SPIFFE Slack, and this is a place where not only maintainers of the projects but users of the projects and other interested parties are there every day. It's a very active Slack. You can come there to get help, or just shoot the breeze, or figure out where SPIFFE and SPIRE can maybe help in your deployments. And that's it. Do we have any questions?
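To recap the live-coding portion, the server- and client-side wiring shown in the demo boils down to something like the sketch below, using the go-spiffe v2 library (github.com/spiffe/go-spiffe/v2). This is a condensed sketch, not the demo's exact code: it needs a running SPIRE Agent serving the Workload API, and the listen address, the omitted greeter service registration, and the server SPIFFE ID are placeholders.

```go
// Condensed sketch of the demo's Workload API wiring with go-spiffe v2.
// Not runnable standalone: it requires a SPIRE Agent on the node.
package main

import (
	"context"
	"net"

	"github.com/spiffe/go-spiffe/v2/spiffegrpc/grpccredentials"
	"github.com/spiffe/go-spiffe/v2/spiffeid"
	"github.com/spiffe/go-spiffe/v2/spiffetls/tlsconfig"
	"github.com/spiffe/go-spiffe/v2/workloadapi"
	"google.golang.org/grpc"
)

func runServer(ctx context.Context) error {
	// One source for both the X509-SVID and the bundles, fed and rotated
	// for us over the Workload API.
	source, err := workloadapi.NewX509Source(ctx)
	if err != nil {
		return err
	}
	defer source.Close()

	// SVID source, then bundle source (same source twice), then the
	// authorization hook: admit any ID that verifies against the bundles.
	creds := grpccredentials.MTLSServerCredentials(source, source, tlsconfig.AuthorizeAny())
	s := grpc.NewServer(grpc.Creds(creds))
	// ...register the greeter implementation here; inside a handler, the
	// caller's ID comes from grpccredentials.PeerIDFromContext(ctx)...

	lis, err := net.Listen("tcp", ":8080")
	if err != nil {
		return err
	}
	return s.Serve(lis)
}

func runClient(ctx context.Context, addr string) error {
	source, err := workloadapi.NewX509Source(ctx)
	if err != nil {
		return err
	}
	defer source.Close()

	// Authorize one specific server ID; anything else is rejected.
	serverID := spiffeid.RequireFromString("spiffe://cluster1.demo/greeter-server")
	creds := grpccredentials.MTLSClientCredentials(source, source, tlsconfig.AuthorizeID(serverID))

	conn, err := grpc.DialContext(ctx, addr, grpc.WithTransportCredentials(creds))
	if err != nil {
		return err
	}
	defer conn.Close()
	// ...issue RPCs with the grpc.Peer(&p) call option and read the
	// server's identity via grpccredentials.PeerIDFromPeer(&p)...
	return nil
}

func main() {}
```

The symmetry is the point: both sides build an X509Source once, hand the same source in for SVIDs and bundles, and differ only in the authorization hook they pass.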