I am CEO and co-founder at Control Plane. We are a cloud native security consultancy: we do security audits, pen testing, and consulting, and we love to run capture-the-flag events. We run those at KubeCon every six months or so; if you find yourself at CloudNativeSecurityCon prior to the upcoming KubeCon in Detroit, please come and play one of our carefully constructed games. I've done various things: I'm proud to be involved with the CNCF TAG Security, and I also do some work for OpenUK, advocating for open source and trying to keep up with the pace of open source awareness that the US Biden administration is showing great leadership in at the moment. I've also recently written a book with my esteemed friend and colleague Mr Michael Hausenblas. The first half of the book is available as a free PDF download, which slightly edges out the printed book because you can just copy and paste the examples. It is written with hacks and YOLO techniques for your Kubernetes clusters, and details everything from inside a pod out to compromising a node, a cluster, and ultimately your cloud environment.

So today we are going to talk about trust in the context of secrets management and access control, focusing primarily on machine identity: what it is, and how we get one. We'll talk about some things that hopefully will be interesting and slightly new, and hopefully things that you can take back and use in the context of your day jobs. So with that: where did we come from, before cloud?
without needing to possess any primary secret material or any knowledge of their own identity. We want identity documents that are short-lived; the acquisition and replacement of them has to be simple; and we want this to be seamless for our workloads, of course. Finally, we need a way for the relying parties to verify that these things are correct and that the identity documents are valid. Ideally, the basis for this is cryptography. The formats of choice for these identity documents in the modern age are JWTs ("jots") and X.509 certificates.

So now we're sold on this concept: how do we get one? We start with something simple that we probably already know. Who's running workloads on Kubernetes? Yeah, the majority, good. And who has heard of the TokenReview API? A few less, yeah. Okay, so we'll dig into this now. Here we have two service accounts; each one has a token. The subject makes a request: it sends that token in an Authorization header, and it uses a secure channel like TLS to protect against interception and replay. The relying party then makes a call to the Kubernetes API server, sending in that cryptographic-looking string.

Does anybody know the serialization format for JWTs? It's a bit bizarre: when you base64-decode the whole thing, it doesn't quite spit out properly. There is a header, a payload, and a signature, each base64url-encoded, separated with periods. So we can pipe this through jq and have a look at the payload or the header. We can't decode the signature, as it's not UTF-8, but generally we're only interested in the payload and possibly the header. And no more pasting tokens into jwt.io; it's probably not a good idea. So this is the decomposition: we see here the header.
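That split-and-decode step can be sketched in a few lines of Python. This is a hedged illustration: the token below is a made-up, unsigned example built in place, not a real service account token.

```python
import base64
import json

def b64url_decode(segment: str) -> bytes:
    # JWTs use base64url without padding; re-add padding before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

def decode_jwt(token: str):
    # Compact serialization: header.payload.signature, period-separated.
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    payload = json.loads(b64url_decode(payload_b64))
    signature = b64url_decode(signature_b64)  # raw bytes, not UTF-8
    return header, payload, signature

def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

# A made-up, unsigned token purely for illustration:
tok = ".".join([
    b64url_encode(json.dumps({"alg": "RS256"}).encode()),
    b64url_encode(json.dumps({"sub": "system:serviceaccount:default:demo"}).encode()),
    b64url_encode(b"\x01\x02"),
])
header, payload, _ = decode_jwt(tok)
print(header["alg"], payload["sub"])
```

Note that decoding tells you nothing about validity; verifying the signature against the issuer's published keys is the relying party's job.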
We have the algorithm. Then we see metadata about the pods, the service accounts, and the subject there at the bottom. We have a lot of other metadata in here as well, including the JSON Web Key Set URL: this points to a set of keys containing the public keys used to verify any JWTs issued by the authority. The EKS payload looks very similar indeed, of course, because it's a standardized format, but we have a different issuer. That is not static; it will change. There's the discovery document that goes with it, and again the EKS JSON Web Key Set; nothing particularly different in there.

One difference here with SPIFFE, as mentioned: SPIFFE defines a standard for an identity framework, and SPIRE is a concrete implementation. These technologies have been going round for a number of years now. Istio implements some form of the SPIFFE standard; it's neither complete nor absolute, but it still goes in the right direction: exchanging identity for cryptographic material that can be used not only for authorization, but also for mutual TLS.

For SPIFFE, we define the format of an identifier as a URI. You can see here the SPIFFE identity, spiffe://acme.com/billing/payments for example. This is baked into an SVID, a SPIFFE Verifiable Identity Document, which is then used to generate, for example, an X.509 certificate or a JWT. That means it's cryptographically verifiable in various different ways, and it can be short-lived and rotated frequently. As I say, there are multiple different mechanisms for that SVID to be bundled into the trust domain. The trust domain provides the bundle and the root keys to verify the SVID; the Workload API then defines a protocol by which to securely retrieve and verify the SVID. This gives us the ability not only to mint and generate these certificates, but also for those certificates to be used to mutually identify each other when there is a shared root of trust or a shared public key. So how does this work? There are various different implementations. One of the ways is to run a node agent.
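Before getting into the node agent: the SPIFFE URI shape described above can be pulled apart with nothing more than the standard library. A minimal sketch, assuming the canonical spiffe://&lt;trust-domain&gt;/&lt;workload-path&gt; layout; the example identity is illustrative only.

```python
from urllib.parse import urlparse

def parse_spiffe_id(spiffe_id: str):
    # A SPIFFE ID is a URI: the authority is the trust domain and
    # the path identifies the workload within that domain.
    u = urlparse(spiffe_id)
    if u.scheme != "spiffe" or not u.netloc:
        raise ValueError(f"not a SPIFFE ID: {spiffe_id}")
    return u.netloc, u.path  # (trust domain, workload path)

trust_domain, path = parse_spiffe_id("spiffe://acme.com/billing/payments")
print(trust_domain, path)
```

Anything sharing the same trust domain can, given the trust bundle's root keys, verify an SVID carrying such an identity.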
So: a trusted process sat running as root on a node, observing the workloads that are created as unprivileged processes, or processes owned by unprivileged users, on the same node, and then attesting their identity against the server, exchanging metadata about those processes with the server. That metadata is minted into a certificate, which is returned to the process and used to bootstrap that workload identity and also to maintain encrypted communication. Workloads attest themselves to the agents, and the registration maps selectors, containing metadata specific to the attestation, to SPIFFE IDs; a parent SPIFFE ID for the workload maps workloads to the node, so that we know which workloads are allowed to run on which nodes. Workloads can then use the Workload API to get their SVIDs.

This is not the only way that we can attest to identity. On the left-hand side we have the node-based attestation between agent and server: the top three mechanisms utilize shared secrets, but we also have Kubernetes, and we also have cloud identity documents, so this concept of running a privileged agent process. Incidentally, this is all threat modelled and the security properties published: TAG Security did a review of this a couple of years ago, including the blast radius of various different types of compromise. It's all available to download in the TAG Security archives. The principle of having a trusted node agent means that the compromise of that node means an attacker can then mint certificates based upon arbitrary, falsified metadata; so there are other ways to do this as well. Of course, that limits the blast radius to the workloads that are permitted to run on that node type. But by extension, some of these cloud provider integrations use metadata about a virtual machine.
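The selector-to-SPIFFE-ID mapping just described can be sketched as follows. This is a hypothetical model, not SPIRE's actual data structures: registration entries pair a SPIFFE ID with a set of selectors, and a workload is issued the first identity whose selectors are all present in its attested metadata. The selector strings and entries are invented for illustration.

```python
# Hypothetical registration entries; selector strings are invented,
# loosely in the style of Kubernetes and Unix workload selectors.
REGISTRATIONS = [
    {"spiffe_id": "spiffe://acme.com/billing",
     "selectors": {"k8s:ns:billing", "k8s:sa:payments"}},
    {"spiffe_id": "spiffe://acme.com/frontend",
     "selectors": {"k8s:ns:web"}},
]

def match_spiffe_id(attested):
    # An entry matches only if every one of its selectors appears
    # in the metadata the agent attested for this workload.
    for entry in REGISTRATIONS:
        if entry["selectors"] <= attested:
            return entry["spiffe_id"]
    return None  # unknown workload: no identity issued

print(match_spiffe_id({"k8s:ns:billing", "k8s:sa:payments", "unix:uid:1000"}))
```

The requirement that all selectors match is what makes falsified metadata on a compromised node so dangerous: whoever controls the metadata controls which identities can be minted.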
So instead of having a trusted node attestation process on every node, on every virtual machine, on every kubelet in the Kubernetes cluster, when attesting to cloud infrastructure we run that on one dedicated node, which uses the AWS or GCP APIs to retrieve metadata about the virtual machines. That metadata is then used in the selectors to generate the certificates, and that's what's pushed to the machines to be used as the basis of identity. This can be done in other ways, for the sake of simplicity: I think especially the way that Istio implements this, the style is again not with a node attester but going directly to the Kubernetes API. So there are multiple different ways to skin and slice this; ultimately, the implementation that makes the most sense is dependent on your threat model, of course.

We can also do individual workload attestation here. We can use Unix identifiers like user IDs or process names, group IDs and names, the path to a process's binary, or the digest of that binary. For a container we can use the image ID, labels, and environment variables if we so choose, and of course in Kubernetes: service accounts, namespaces, images, etc. Then on a single Kubernetes cluster, by extension, we can use the projected service account token node attestation. We're back to the TokenReview API at this point, which is the same as the Vault authentication method earlier. We then also have the OIDC discovery provider, exposing endpoints to retrieve the discovery documents and the JWKS URL. As it's OIDC discovery,
it just looks the same as the previous examples, and we can use it in the same way as a Kubernetes service account JWT.

So far we have focused just on the JWTs. The motivation for this was to demonstrate that we can do a lot with Kubernetes as our IDP, our identity provider. SPIRE gives us extra capabilities, but with an administrative and operational overhead; you get these for free on Kubernetes. Again, there's a balance here of operational cost versus risk, and the specific use case will determine your need. But the assumption for some of this is that a service mesh is used for secure communication, or at least a mutual TLS exchange, or indeed just a one-way TLS exchange, as long as there is encryption that prevents the provisioning of these tokens from being intercepted, obviously.

What if we want the same workload identity in the JWT and the X.509? Well, we can also provide the SVID as the X.509 document, where the SPIFFE ID is in the URI SAN, the Subject Alternative Name. Then we can use it for server and client authentication: the certificate can be used for mutual TLS, to log into Vault, or for anything else that supports certificate-based authentication. When we're using this to log into Vault, the question of having a key to unlock the box to get the second key out is fixed, because the workload is provisioned a cryptographic identity with which it can go to Vault and say, "Hi, I'm allowed to do this." Vault can then go out of band to the Kubernetes API server, or indeed just trust the fact that this is encrypted and encoded into a certificate with a shared public key that's trusted, and give out that secret zero. This is easier and more stable, in my opinion, than a lot of the other Vault unlock mechanisms, and doesn't necessarily require so much sidecar mounting as we may have seen.

And as we say, SPIRE also implements the Envoy Secret Discovery Service. So from an Envoy perspective, and then by extension a service mesh and Istio, we can integrate with Envoy for TLS and just push that
directly into Envoy, and optionally Open Policy Agent for policy-based access control using the SPIFFE IDs.

Okay. We've looked at the software; now perhaps on to the hardware. We'll go through this briefly, and I realize there are people in the room for whom this is a specialist subject, so I won't go too deep into it: just an indication of where it looks like workload identity is heading for the rest of us. This is towards trust at the hardware level. Ideally, what we're looking at from a supply chain perspective is anchoring from the git commit that we make, and the cryptographic material that we attach to that commit, through into our build system; through the interstitial, inter-build-stage gathering of signatures with something like in-toto (Tekton Chains will do some of this for us today); and bundling up those signatures into an artifact which we can then sign with something like Cosign (Notary v2 will do some of these things as well). That then gets pushed to a Kubernetes cluster, where admission control says: okay, I trust this signature, and I can also unbundle all of the other signatures that have been pushed as metadata alongside this container. Then we've got a chain of trust from the GPG key, or the SSH key, that the developer used initially. I think signing git commits is a good thing, but I realize our benevolent leader doesn't necessarily agree. But yes indeed: signing all of those commits, validating all the signatures, and then we have an artifact that is immune to some forms of attack. Obviously a malicious insider is still not prevented by this approach, but we are preventing build-stage tampering, and we're increasing the trust and the veracity of that build as it moves to production and is finally deployed.

But what about that workload as it sits in production? We have Trusted Platform Modules; we have cryptographic devices.
We have trusted execution environments that take us further into the space. So we're looking at secure generation and storage of keys that never leave a Trusted Platform Module unencrypted. This property allows us to tie things to a specific TPM, as compromised keys cannot be reused in another TPM. We also have this concept of PCRs, Platform Configuration Registers, which can be read but not directly written. As values are extended, we get a new hash comprised of the previous hash plus a measurement, and that changes the register value. This goes a long way towards preventing faking and tampering. It's likely that PCRs could well be used for attestation in the future, and Keylime uses IMA (Integrity Measurement Architecture) measurements stored in those PCRs. This gives it the ability to continuously verify the integrity of remote machines, and in this case we can do things like automatically remove a compromised etcd node from a cluster when we detect that its integrity has been compromised in some way.

It is worth noting that the trusted execution environments and enclaves that were shipped in consumer-grade CPUs are being discontinued in the main; Intel have pulled them from the latest generation of processors. Compare and contrast: Google put TPMs into every VM that they have on Google Cloud; I think they announced that maybe four or five years ago. So this is something that's actually more difficult to do with a modern CPU than it is with a cloud provider, which is strange for something that's hardware-anchored, but there we go.

Okay, so we also have Trusted Platform Module device ID (DevID) attestation. These are provisioned out of band. Attestation uses the encrypted keys which are loaded into the TPM, and includes a challenge that can only succeed with the correct keys on the correct machine. It's tied to the TPM so it can't be impersonated, even if the keys are compromised, and it uses a fingerprint of the certificate, like X.509, so
the default SPIFFE ID is predictable.

This dream of anchoring everything end to end, from commit through to runtime attestation, anchoring the process to a specific node through the TPM, still has a bit of a way to go; there are lots of moving parts. But everything you've seen here is open source, with the exception of the microcode in the TPM, and we're certainly moving towards a brighter future where, hopefully, we can see these kinds of things in production within the next six to eighteen months, perhaps.

So with that: you might already have a workload identity that satisfies your use case. There are plenty of other flavours, and Kubernetes could well be your IDP if you wish it to be. Service account tokens, and bound service account tokens, are becoming the default. There is a nuance to bound service account tokens: because they're temporally bound, because they have this timeout and these usage criteria, they're pushed into a pod on a tmpfs partition by the kubelet and then rotated. If your application does not re-read them from disk every time they're used, your process will start, read its configuration, and when that key is rotated it will keep on using the old cryptographic information that it pulled at boot time. So applications need to be updated in order to use bound, projected service account tokens. There are various different implementations: you could just read the token from disk every time, or you could have an inotify watch on the file and pull it when it changes. But it's unfortunately not just a drop-in change. And of course, a secure channel for communication is de rigueur.

Consider SPIFFE and SPIRE; and the next generation of trusted platform modules, and Keylime, are very exciting, and I'm looking forward to deploying them. And with that, thank you very much for your attention.