Just talking about some of the things that are available to you. Some of the things you might already know. Hopefully some of them you don't yet, otherwise it's not going to be a very interesting talk for you. So, historically, infrastructure was reasonably static. IP addresses were statically allocated, so you could reasonably use these to represent your identity. Firewalls used IP and port combinations. IP addresses were used in X.509 certificates. Secrets and certificates were manually deployed to machines by admins. You might have used protocols like Kerberos to initiate trust between machines, if you didn't really like yourself very much. But then cloud computing fundamentally changed the way that we do infrastructure. IP addresses are no longer a useful identifier. Workloads come and go. 
Your workload will get a different IP address as it gets restarted. You could have multiple instances of your workload, so they've got multiple IP addresses at the same time. Secrets distribution becomes a much harder problem with these dynamic workloads: how do you get the secrets on there? With more services wanting to communicate with each other, the old IP and boundary-based approaches no longer work. Secrets management solutions like HashiCorp Vault, cloud secrets managers, cloud KMS, they help with the secrets distribution, but in order to access the secrets, we need a secret to get access to the secrets management solution. This is commonly referred to as secret zero, and that's synonymous with workload identity: this thing that workloads can use to prove who they are. So what do we need? We need a trusted third party issuing workloads with their identities. We need a way for these subjects to retrieve their identities in a secure manner, without the need to have a secret in the first place, and possibly without any knowledge of the identity that they're being issued. We want these identity documents to be short-lived, and we want the acquisition to be seamless and simple for these workloads. We need a way for the relying party to be able to verify the correctness and the validity of these identity documents. Ideally, the basis for that verifiability is cryptography, and the formats of choice for those identity documents will be JWTs ("jots") and X.509 certificates. So that all sounds great, but how do you get one? How do you get this secret zero, this workload identity? So we'll start simple with something that you're probably already familiar with, and then we'll build on that. So, who's running workloads on Kubernetes at the moment? Quick show of hands, help me out. Yeah, loads of you. Excellent. Who's heard of the Token Review API? That's surprising. I thought we'd have more hands for that one. 
So basically we've got two service accounts, the subject and the relying party. They've both got service account tokens, and we can use the Token Review API to verify that token and obtain some information about the subject. It's a simple API call: the subject's token goes in the payload there, and then the relying party's token in the Authorization bearer field there. We can just make this API call. If you're passing JWTs around when you're making API calls and things like that, you want to make sure you're using a secure channel to protect yourself against interception and replay attacks. Once you've made that API call, from the response you can get information back about the subject: the username, the fully qualified name of the service account there. You might use the groups, and now you can use this information to start to make authorization decisions. This is the most common way of integrating Kubernetes workloads with HashiCorp Vault, the Kubernetes authentication method. Now we can get dynamic secrets, passwords, API keys, certificates from Vault. That's pretty cool. The trust domain here is rooted in your Kubernetes cluster; Kubernetes is your identity provider. If you're building something that's going to run in a single Kubernetes cluster, you can build on this API yourself. It's not complex, and you can use this to start to make authorization decisions. If you are passing these service account tokens around, you need to be careful about this. The tokens themselves exist for the lifetime of that service account. They're replaced if you delete the token, but they're not rotated. So there are some things we can do to restrict the usefulness of a compromised token. Have you heard of bound service account tokens? They're sometimes called projected service account tokens. They use the projected volume functionality, and they're bound by audience and time. 
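The TokenReview call described above can be sketched in a few lines of Python, stdlib only. The API server URL and the tokens here are placeholders, and the response shape is the `authentication.k8s.io/v1` TokenReview status:

```python
import json
import urllib.request

def build_token_review(subject_token: str) -> dict:
    """Build the TokenReview payload: the subject's JWT goes in spec.token."""
    return {
        "apiVersion": "authentication.k8s.io/v1",
        "kind": "TokenReview",
        "spec": {"token": subject_token},
    }

def token_review_request(api_server: str, relying_party_token: str,
                         subject_token: str) -> urllib.request.Request:
    """POST to the TokenReview endpoint; the relying party authenticates
    with its own token in the Authorization: Bearer header."""
    return urllib.request.Request(
        url=f"{api_server}/apis/authentication.k8s.io/v1/tokenreviews",
        data=json.dumps(build_token_review(subject_token)).encode(),
        headers={
            "Authorization": f"Bearer {relying_party_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def subject_from_response(review: dict) -> tuple[str, list]:
    """From the response, status.user carries the subject's identity:
    the username and, optionally, groups."""
    status = review["status"]
    if not status.get("authenticated"):
        raise PermissionError("token not authenticated")
    return status["user"]["username"], status["user"].get("groups", [])
```

For a service account, the username comes back fully qualified, e.g. `system:serviceaccount:<namespace>:<name>`, which is what you'd feed into your authorization decisions.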
So in this example here, if our demo service is malicious and we pass it our service account token, it can't use our token to impersonate us to another service, because it's bound by that audience there. This works within a single Kubernetes cluster. Let's say we've got Vault outside a cluster and a service account in the cluster talking to Vault. Vault needs to verify this token, but it can't use a bound service account token because it's not in that Kubernetes cluster. So now we need to create a long-lived service account token and share that with Vault in order to be able to validate our token. So now we've got a secrets management problem again. We can reduce the risk around that token by just giving it permissions to create token reviews. But can we do something so that Vault, or any other relying party, doesn't need a token to validate our tokens? Well, unsurprisingly, we can. So, since version 1.12 of Kubernetes, and stable since 1.20, the service account issuer discovery feature was added. Quick show of hands: has anybody heard of this before? Quite a few. But in my experience, this isn't quite as well known as the Token Review API. It's based on the OIDC discovery protocol, which gives you the means to dynamically retrieve the details you need to validate these tokens. And you can have it set up so that you don't need any authorization in order to get this information. So you can verify the identity from the token, and you can get basic profile information stored in the claims in your JWT. Your relying party should maintain an allowlist of trusted issuers, otherwise anybody with an OIDC-compatible issuer can authenticate. So now you can start to extend your trust domain. And in order to really understand how this works, we need to understand what's in a JWT. It's a simple serialization format: we've got a header, a payload, and a signature, Base64url encoded and separated with periods. 
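That header/payload/signature structure can be unpacked with a few lines of Python, stdlib only. Note this only decodes; it does not verify the signature, so on its own it tells you nothing about whether the claims can be trusted:

```python
import base64
import json

def b64url_decode(segment: str) -> bytes:
    """Base64url decode, re-adding the padding that JWTs strip off."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

def decode_jwt(token: str) -> tuple[dict, dict]:
    """Return (header, payload) of a JWT. The signature segment is raw
    bytes, not UTF-8, so we don't attempt to decode it here."""
    header_b64, payload_b64, _signature = token.split(".")
    return (json.loads(b64url_decode(header_b64)),
            json.loads(b64url_decode(payload_b64)))
```

Handy for poking at your own tokens locally instead of pasting them into a website.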
So we can pipe that through jq if we want to look at the claims in the payload of a token. You can't decode the signature, because it's not UTF-8 encoded, but you're generally interested in the payload: you want to see what claims are in there when you're messing about with tokens. You might be interested in the header. So there's no need for you to copy your service account tokens and paste them into jwt.io just to see what claims are in there. You're welcome. Okay, so here's a decoded token. In the header there we've got the algorithm and the ID of the signing key; we'll come back to that in a sec. And then in the payload we've got various claims for the token. We've got some around the validity of the token: the expiry time, not-before, the issued-at time. And then we've got information about the subject, and also the audience: who is this token valid for? But key to the OIDC discovery is this issuer field. So we'll go back to the previous slide. With that issuer field, you can append this well-known suffix onto there, and that will give you the discovery document. And from that, we can extract the URL to get the JSON Web Key Set, which allows us to validate the signature. And then we know it's a valid token, and then we can trust the claims that are encoded in that token. It's a fixed URL out of the box with vanilla Kubernetes, but it is customisable. And it's also protected with vanilla Kubernetes: if you tweak that URL, you need to tweak your RBAC so that you can access it, or you make it unauthenticated. But it's also now supported by the major cloud providers with their managed Kubernetes offerings. It's in preview in Azure, because they went off in a different direction initially, but they've now course-corrected and are going down the OIDC route. So what does it look like? Just to reinforce this simple protocol, we'll go through an example and look at the data structures for each of the major cloud providers. So that's the payload for GKE. 
You can see the issuer; you can tell from that that it's a GKE cluster. We append the well-known suffix on the end of that, and then we get the discovery document. We can see some metadata in there, and the key one is that jwks_uri. And if we do a GET on that, we get the key set. So you can see there the key ID we talked about a little bit earlier, the algorithm, and then we've got the modulus and the exponent for the RSA public key. And then we can reconstitute that public key and verify the signature. EKS looks the same: take that issuer field, stick the well-known suffix on the end, get the discovery document. Slightly different metadata, but the same kind of thing; the key field again is that jwks_uri. And we get the key set. And for AKS, same sort of thing. Notice the trailing forward slash on the issuer field there. If you're building something yourself, it's a simple protocol: get, extract, get, and you can validate these. And the OIDC spec tells you to check for that trailing slash so that you construct the URL correctly. So you can build yourself some simple code to get the keys that you need to verify these tokens and extract the claims. So there's the discovery document and the key set. In the key sets that we've seen so far, you've only got one key in there, but periodically these will be rotated, so you will see more keys in that list. And that's why the key ID is important: you need to make sure that you're getting the right key, otherwise your signature's not going to validate even if your token is valid. So that key ID is very important. That's all well and good; what can we do with it? Well, you can use it with anything that supports OIDC discovery. It's a widely used protocol. All the major cloud providers allow you to hook that into their web identity federation, so you can take this token and swap it for temporary cloud credentials. So you can access cloud secret stores, cloud KMS, basically all of the cloud APIs. 
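That get-extract-get flow, plus the key-ID matching just described, can be sketched in Python with the standard library. The network fetches are plain urllib GETs; actually verifying the signature would additionally need an RSA/JWT library, which is left out here:

```python
import json
import urllib.request

WELL_KNOWN = "/.well-known/openid-configuration"

def discovery_url(issuer: str) -> str:
    """Append the well-known suffix, handling a trailing slash on the
    issuer (as seen on AKS) so we don't produce a double slash."""
    return issuer.rstrip("/") + WELL_KNOWN

def fetch_json(url: str) -> dict:
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def jwks_uri(issuer: str) -> str:
    """Get: the discovery document. Extract: its jwks_uri field."""
    return fetch_json(discovery_url(issuer))["jwks_uri"]

def key_for_kid(jwks: dict, kid: str) -> dict:
    """Get: the key set, then pick the key matching the token header's
    kid. The set may hold several keys while rotation is in progress."""
    for key in jwks["keys"]:
        if key["kid"] == kid:
            return key
    raise KeyError(f"no key with kid {kid!r} in key set")
```

The `kid` from the token header is the lookup key; picking any other entry would make a perfectly valid signature fail to verify.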
So now you can start to run your infrastructure-as-code pipelines on your Kubernetes clusters with temporary cloud credentials. You're not having to put long-lived cloud credentials into secrets in your Kubernetes cluster; you can exchange your token for these temporary credentials. And you can go cross cloud providers as well: so, for example, you could run in GKE and you can access AWS. So, IAM Roles for Service Accounts from AWS builds on this OIDC discovery and this web identity federation. There's a mutating webhook that AWS provides. You can install this into your GKE cluster, and then you annotate your service account with the IAM role that you want to get credentials for. And then when you do a deployment, it will mutate your pod, automatically adding the bound service account token for you, and also the environment variables required to authenticate to the AWS APIs. So this is all seamless to your workload; just with that webhook in there, your workload can just use the AWS APIs without having to worry about it. It's pretty cool, right? You can use OPA to enforce access control based on bearer tokens in your API calls. So you can verify the token and make policy decisions based on the subject, the audience and the issuer. As I mentioned earlier, you should probably not just trust any issuer; you probably want to maintain a list of trusted issuers and check against that. So we can get cloud credentials, and if we're passing tokens between services we can do authorisation using OPA. But now we're in the OIDC space, we can start looking outside of Kubernetes, so your identity provider doesn't have to be Kubernetes. It could be GitHub. So we can run our infrastructure-as-code pipelines in GitHub and get temporary cloud credentials and create infrastructure. 
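Those policy decisions on subject, audience and issuer have roughly this shape. This sketch is plain Python rather than OPA's Rego, just to show the checks involved, and every issuer, audience and subject value in it is made up:

```python
# Hypothetical allowlist of issuers we federate with -- illustrative
# URLs, not real cluster issuers.
TRUSTED_ISSUERS = {
    "https://container.googleapis.com/v1/projects/p/locations/l/clusters/c",
    "https://token.actions.githubusercontent.com",
}

def audiences(claims: dict) -> set:
    """The aud claim may be a single string or a list; normalise to a set."""
    aud = claims.get("aud", [])
    return {aud} if isinstance(aud, str) else set(aud)

def authorize(claims: dict, expected_audience: str,
              allowed_subjects: set) -> bool:
    """Accept only tokens from a trusted issuer, minted for our audience,
    and from a subject we've explicitly allowed. Assumes the signature
    and expiry have already been verified."""
    return (claims.get("iss") in TRUSTED_ISSUERS
            and expected_audience in audiences(claims)
            and claims.get("sub") in allowed_subjects)
```

In OPA you'd express the same three checks as Rego rules and let Envoy or your API gateway query them per request.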
So in this example we're configuring GCP to get temporary credentials. You need to create a service account, a workload identity pool and an OIDC provider, and then you do some mapping between the claims in the GitHub token and bind that to the GCP service account that you want to impersonate. And then Google provide a GitHub auth action that allows you to exchange that token for credentials: you can get an access token to use in the Authorization bearer field for making API calls, you can get a credentials file, and in this case we're just getting an ID token, which is the same as the Kubernetes JWTs that we've been looking at, just so that it kind of looks the same. And if we pipe that through jq, we can see a similar set of claims to what we were getting from the Kubernetes cluster. That subject field there, that long numeric string, that's the OAuth2 client ID for your service account, and you can look in the console and verify that that's correct. What if you need more than this? What if you want a meaningful name in that subject field, unlike that numeric value we just looked at, or the fully qualified name of your Kubernetes service account? You might have multiple Kubernetes clusters and you want to simplify validation of those tokens between workloads. You might want to use the same workload identity across multiple clusters. Your clusters might be repaved regularly, and that issuer field might change, and you don't want to have to keep changing that configuration for the OIDC provider. You might want to extend your trust domain over multiple Kubernetes clusters, or you might want to have separate trust domains for each Kubernetes cluster but federate them, so that you can simplify verifying those tokens across those clusters. You might not even be running in Kubernetes: you might be running on VMs, you might have IoT devices. So we need something a little bit more; we can't just rely on Kubernetes being the identity provider. And that's where SPIFFE and SPIRE come in. So SPIFFE defines the standards for an identity framework. The SPIFFE ID is basically the format, the representation of that identifier. It's a URI, and it's made up of the spiffe:// scheme, the trust domain (in this example acme.com), and then an arbitrary string on the end as your workload identifier, billing/payments here. The SPIFFE Verifiable Identity Documents (SVIDs) are short-lived and rotated frequently. The trust domain itself provides you with a bundle, which is basically the root keys that allow you to verify that SVID, and it could be a JWT, but it could also be an X.509 certificate. The Workload API is how your workloads get these identity documents, and other workloads can get the trust bundles to verify those identity documents. And SPIRE is the implementation of these SPIFFE standards. SPIRE has a SPIRE server, then you've got agents that run on nodes, and they attest themselves to the server, and then the workloads attest themselves to the agent running on the node that they're running on. You use the registration API to map selectors (data depending on how they attest themselves) to SPIFFE IDs, and when you're creating the SPIFFE IDs for your workloads you map them to the SPIFFE ID of the agent. So now you're tying your workload identities to the nodes that they can run on. And then the workloads just use the API to get their SVID; they don't need any secrets up front. Just by virtue of the attestation, they can get these identity documents. So we've got a list of the plugins in the documentation there. The node ones, agent to server: the top three there are pre-shared secrets, the next two are for Kubernetes (the SAT is a regular service account token, the PSAT is a bound service account token), and then the next three are cloud identity documents. So you can use information that the node already has, or that you've pre-populated and stored on there. And then for workload attestation, three key ones there: UNIX, Docker
and Kubernetes. The selectors: for UNIX, for example, the selectors might be the user ID, username, group ID, group name for the process that's running that workload; it could be the path to the binary on the file system, or the digest of that binary itself. For Docker it'll be things like the image ID, labels, environment variables. For Kubernetes it's your service account, your namespace, the image, things like that: just metadata about these running workloads. And then when the workloads come up and talk to the nodes, they're just attesting themselves; the node's checking for this metadata, mapping it, knowing which identity to assign. So in this example deployment, we're running on a single Kubernetes cluster, using the projected service account token for the node attestation. Now, under the covers that's the Token Review API that we saw at the start of the talk, same as the Vault authentication. The workloads, because we're running on a Kubernetes cluster, are just using the Kubernetes workload attestation. The OIDC Discovery Provider is an optional component, and that exposes the endpoints to retrieve the discovery document and the JWKS. Let's have a look and see what that looks like. Unsurprisingly, the tokens look the same, the discovery document looks very similar, and the key set looks very similar. So, we talked about X.509 SVIDs as well. We focused primarily on JWTs before, and the reasoning for that was because you can get these from your Kubernetes cluster; you don't need to install anything else. If you're running Kubernetes, you've got some administration of that Kubernetes cluster, but with the service account issuer discovery stuff you can start to use the tokens that your Kubernetes cluster is issuing to authenticate to things outside of your cluster. It's not free, because you've got to manage your Kubernetes cluster, but it's freer than installing a SPIRE server and agent somewhere else. So that's just some of the information in the X.509 SVIDs. You can see in the URI SAN field there we've got the SPIFFE ID: trust domain, arbitrary workload identifier. That's kind of different to service account names; it's a nice meaningful identifier, so that's kind of useful. We can see that we can use the certificate for server and client authentication, so we can now start to use this for mutual TLS. We can log into Vault with this X.509 certificate, or anything else that supports certificate-based authentication. SPIRE itself implements Envoy's Secret Discovery Service (SDS). So if you're running Envoy, kind of as a lightweight service mesh, it can retrieve these X.509 SVIDs and use those for your mutual TLS. And now you're using the same identity for your certificate, for your TLS, that you'd be using if you were going to pass your JWTs around. Whereas if you're running a service mesh like Istio, Istio is issuing different identifiers for your certificates than Kubernetes would be issuing for your JWTs. And again, you can plug Envoy into OPA for policy-based authorisation decisions. So, I did say something about hardware root of trust. The rest of this deck gives an indication of where I think workload identity is heading: trying to tie that trust down to the hardware level. There's been a couple of interesting talks already this week. On Wednesday there was a talk about the Confidential Containers project, running Kubernetes workloads in trusted execution environments, and there was a talk this morning about running Kubernetes in trusted execution environments. Really interesting stuff, right down to hardware-level attestation and security. This is something I'm just starting to look at, so please don't ask me any difficult questions at the end about this, but I wanted to give you some information about some of the things that you might want to go away and look at and keep an eye on. It's an evolving field, and some of this stuff isn't quite production-ready, so don't think you
can go home and start to run this stuff, but it's definitely worth keeping an eye on. So, trusted execution environments: I'll leave that to the guys who know what they're talking about much more than I do, who have already spoken about that this week. Actually, I've just scuppered myself there, because I'm making up some of this stuff as I go along. So, TPMs. Quick show of hands: who knows what a Trusted Platform Module is? That's interesting: more people know about that than know about the Kubernetes Token Review API. So you can't make assumptions when you come and do things like this. So a TPM is a cryptographic device. It does things like secure generation and storage of keys, and these keys can't leave that TPM in an unencrypted form. That's a really useful property for tying things down to a specific machine. So earlier we saw that you could use an X.509 certificate to attest a node to SPIRE, but if I get hold of that certificate and walk off with it and put it on another machine, I can impersonate that node. With the keys from a TPM, they're encrypted outside of that TPM, and you have to load them into the TPM to decrypt them and use them. So I can't steal your certificate and pretend to be you; I have to be you, with that TPM. So that's quite a nice property. PCRs are special registers: you can read from them, but you don't directly write to them, you extend them. They store a hash of something, so the new hash is the old hash plus the new measurement. And Keylime is really interesting to look at. This uses the TPM with the Linux IMA subsystem to do remote attestation. So you can use that to scan the files on your node, stick the hash of that into a PCR, and then Keylime can remotely check that. That gives you the ability to continuously verify the integrity of your remote machines, and on the Keylime site they've got a nice little demo of removing a compromised node from a cluster. So we can use this with SPIRE. Now, I only came across this last week; I've been frantically playing around with it, trying to get it to work so I could get it in here, but it's pretty cool. So you've got a device ID, which is a public and private key and a certificate, provisioned out of band, and they come out of the TPM, TPM-encrypted. Then in the attestation process, the agent loads them into the TPM, and the server will go through a challenge process to see if you've got the keys in the TPM, decrypted. So now we can provide harder guarantees that that node is the node that we thought it is. The SPIFFE ID uses the fingerprint of the certificate, so it's very predictable, just like the X.509 and the SSH certificate attestation methods. I only found it last week because it's not on the main documentation site for the SPIRE stuff, but it looks pretty interesting. And that's it. So, key takeaways from my rambling this afternoon. You might already have a workload identity: if you're running on Kubernetes, you've got your service account token, and depending on what you're trying to do, your Kubernetes cluster might work as your identity provider. If you're running on Kubernetes and just talking to Vault on the same Kubernetes cluster, then the Token Review API might be fine. If you're just running Kubernetes workloads and you want to get temporary cloud credentials, then maybe the OIDC stuff works, to just swap your token for those. If you're going into the wider field, where you want to use the same identity in your JWT and your X.509 certificate, then you might want to think about SPIFFE and SPIRE. You've got more complexity and an operational and administrative overhead to run SPIRE, but if you need the extra capabilities that it gives you, it's certainly worth thinking about. So yeah, consider SPIFFE and SPIRE, see if your needs really require that, and investigate things like TPMs, Keylime and trusted execution environments. And that's me done.
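As a small footnote to the SPIFFE section above: the SPIFFE ID format described there (scheme, trust domain, workload path) can be pulled apart with a few lines of Python. This is just a sketch of the URI structure; the IDs shown are made-up examples:

```python
from urllib.parse import urlparse

def parse_spiffe_id(spiffe_id: str) -> tuple[str, str]:
    """Split a SPIFFE ID into (trust_domain, workload_path),
    e.g. spiffe://acme.com/billing/payments."""
    parsed = urlparse(spiffe_id)
    if parsed.scheme != "spiffe" or not parsed.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id!r}")
    # netloc is the trust domain; the path is the arbitrary
    # workload identifier assigned at registration time.
    return parsed.netloc, parsed.path
```

Relying parties can key authorization rules off the trust domain, the path, or both.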