Okay, I think I have the green light. So thank you everyone for coming. My name is Billy Lynch, and today we are going to be talking about identity-based source integrity. So a little bit about myself: I'm a software engineer over at Chainguard, where we do all things software supply chain security. I'm also a maintainer for some open source projects like Tekton Chains, as well as Sigstore's Gitsign, which we'll go into a little bit more today.
So this talk is going to be all about supply chain security. It's a big hot topic at the moment, but what does that mean? We like to think it's all sunshine and rainbows, and that we have this nice pipeline where we start from our source and go to our build, and then from our build we package it up and send it to our production servers. In reality though, it's a little bit of a mess. We can look at that same graph and see all of these different points where things can go wrong, and it's a bit chaotic, but for this talk what we're really going to be focusing on is this source component on the left. So what are some of the compromises there? How can we make improvements with code signing, stuff like that?
So this is a real problem. We've seen many instances in the wild where people are impersonating commits, pushing malicious packages, source repositories being compromised. We want to be able to detect and mitigate these attacks with software signing. Code signing is something that's existed in Git for years and years now, so whenever we have these problems it comes up: could code signing have solved these? And there are a lot of reasons to do code signing. Normally the data that's stored in your Git commit is really just text. What's stored in the author, what's stored in the committer? It's a name and an email, but you can change those just by modifying your Git config.
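To make the "it's really just text" point concrete, here is a minimal sketch that is not from the talk. It assumes Git's standard SHA-1 object format and builds two root commits that differ only in the author and committer strings. The names, emails, and timestamp are made up, and `commit_object_id` is a hypothetical helper, not a Git API.

```python
import hashlib


def commit_object_id(tree, author, committer, message):
    """Return the SHA-1 object ID Git would assign to a root commit with these fields."""
    body = (
        f"tree {tree}\n"
        f"author {author}\n"
        f"committer {committer}\n"
        f"\n"
        f"{message}\n"
    ).encode()
    # Git hashes "commit <size>\0" followed by the raw commit text.
    header = f"commit {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


# Well-known ID of Git's empty tree, used so the sketch needs no repository.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
STAMP = "1700000000 +0000"  # arbitrary timestamp and timezone

alice = f"Alice <alice@example.com> {STAMP}"
bob = f"Bob <bob@example.com> {STAMP}"

honest = commit_object_id(EMPTY_TREE, alice, alice, "Initial commit")
forged = commit_object_id(EMPTY_TREE, bob, bob, "Initial commit")

print(honest)
print(forged)
```

Both IDs describe equally well-formed commits. Nothing in the object itself shows whether the person who wrote "Bob" controls Bob's email address, which is the gap that signing is meant to close.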
So what commit signing allows you to do is basically have additional metadata about the commit, backed by a public/private key pair, so that you can then verify the data and make sure it hasn't been tampered with. By doing this you have some information that's more trustworthy than just that simple string, and what's also useful is that you're storing information in a secondary source that's separate from your repo. So if you want to attack a signed commit, not only do you have to compromise the source repo, you have to compromise the keys as well. Having those two work in tandem gives you a lot more benefits from a security standpoint.
One thing I will note here, which we won't go into too much detail on in this talk: code signing does not stop a malicious user who is authorized from making a bad change. So this is, you know, a disgruntled open source maintainer who decides, hey, we're going to start pushing malware into our libraries now. That is also a concern for the software supply chain, but not something we're going to be focused on here today.
So a quick overview of how this works, if you haven't used commit signing before. There's a dash uppercase S flag to cryptographically sign commits. This is not to be confused with dash lowercase s, which is for DCO sign-off and has a different purpose. You configure Git and tell it what key pair to use. Typically this has been GPG, but in recent years you can also use SSH keys as well as X.509 certificates to sign. Git also has commands to verify those commits. You can just say git verify-commit, give it the revision, and it will check the signature and tell you pass or fail.
So this is cdCon, GitOpsCon. So one question we want to ask ourselves is, hey, how do we use this in our GitOps pipelines? Being able to commit back to a repository is something that's very common within GitOps workflows.
You can imagine hydrating manifests, populating some data, some configuration information, resolving those, and pushing that back to an infrastructure repo before we then deploy it out to our production systems. So it would be nice if we started signing our commits so we have more verifiable metadata about where these things came from.
So there are two approaches I see most commonly for signing in GitOps pipelines. One of them is bring your own key. This is where you provision a long-lived key, usually in a key management system (Vault, a cloud KMS, Kubernetes secrets), and you make that available to your CI/CD pipeline via some mechanism.
The other thing that we see pretty commonly is GitHub only, though I think there are some plans to adapt this for other source forges as well. GitHub has this special key called web-flow.gpg that you may or may not know about. Whenever you do any sort of API operation or UI operation, like when you hit that squash button in the UI, GitHub will actually sign your commit data with its own key. It can't use your private key, because GitHub shouldn't have access to your private key. So what they do instead is authenticate you and say, OK, you're signed into GitHub, we know who you are, we're going to sign it with our own key so you have that verified check mark and that signature attached to your commits.
There are a lot of challenges when it comes to commit signing. It is very possible to do this all correctly, but there are a lot of steps that you need to take in order to have a good security stance. So things like: are you encrypting your keys, or when you generated it, did you just hit Enter, Enter, Enter and move on? Are you rotating your keys frequently? Ideally, you should be doing this every couple of months. How often are you rotating in practice? If a key of yours is out in the wild, how would you know about it? How would you detect this?
If you have a case where you do need to revoke a key, what's that process? How do you notify your downstream clients to no longer trust that key? And then finally, there's something interesting that's a little more philosophical, which is the whole idea behind this talk: a lot of the time we make this assumption that a key equates to identity. You have your GPG key, we have our production release key, but is that always true? And is that a strong assertion of identity?
So if we look at the bring-your-own-key model, where are you storing this key? Who has access to it? If it's stored in your KMS, do your developers have access to this key? Could they just download it and then sign whatever they wanted, as a sort of insider threat? Again, how often is it rotated? Are you practicing good hygiene for managing your keys? And do you have a disaster recovery plan for not if, but really when it leaks? Because we always want to be prepared for the worst when we're modeling our security threats.
And the same goes for web-flow.gpg. Even though that's not a key that you yourself manage, there's a big problem here, which is that everyone uses the same key. You'll see on that verified check mark, it's like 4AEE. That is the same key that's used for everyone across GitHub. Before, I mentioned one of the benefits of commit signing: having that piece of data that's separate from your commit content. Using the web-flow key breaks down that barrier a little bit, because now we're trusting GitHub both for serving the content and for providing the key. If you want to take that as part of your trust model and say, hey, we trust GitHub, maybe that's fine. But it does open you up to a little more risk if there was ever a problem where we could no longer trust GitHub. Another thing worth calling out is that GitHub doesn't necessarily have the best visibility into identities.
Git uses emails in its commit data for authors and committers, but GitHub doesn't really know whether you still have access to an email. So for example, it was on a slide before: before I was at Chainguard, I was at Google. So I have an @google.com email still associated with my GitHub account. GitHub doesn't know I don't work at Google anymore. And there are actually some reasons why you would want that email still associated, because all of my commits from when I was at Google, we still want to be able to verify and check. But there's nothing stopping me from going and making a commit with my @google.com email even though I no longer have access to it, and I'll still get that verified check mark. GitHub has no idea. It just knows that the email was verified and associated with my account at some point in time.
And then when I was putting this deck together, I found there's also this interesting behavior when you start using some CI/CD automation, particularly with GitHub Actions. So remember from before, the web-flow.gpg key is used whenever you use an API operation and you don't include any signing metadata or material with it. So you can actually use the GitHub Actions GITHUB_TOKEN and just make a commit in GitHub Actions. By default, you will get a committer that just says github-actions[bot]. This ID is just the ID of the GitHub Actions app. This is true for any GitHub Action that you run, and you get a commit that just looks like this.
To me, this is worrying, because how can this be used in malicious ways? How do you know that this is the correct GitHub Action? Is it coming from your repo? Is it coming from another repo? Without finer-grained metadata about the identity and where it's coming from, it's much more difficult to decide: do we trust this change? Do we trust this commit? So that raises the question, how do we deal with imposters?
How do we deal with this? Software supply chain, again, is a big area. So one of the things that makes sense to do is ask, what are other people doing? This talk is focused on source, but there's also been a large focus on software supply chain for artifacts in general: deploying to prod, container images, packages. And that's where we're really going to start talking about Sigstore.
Sigstore is another open source project under the OpenSSF. Its whole goal is to make signing software and artifacts as easy as possible and lower the barrier to entry. One of the really nice things with Sigstore is a tool called Cosign, which popularized this concept called keyless signing. That allows you to sign artifacts using identities rather than just keys, long-lived keys that you need to trust for a long period of time. There are still keys under the hood; it's more like ephemeral signing. But by doing this, we can have a lot more metadata and a lot more fine-grained detail about the user identities that are being used to sign these artifacts.
We've been seeing a ton of adoption of Sigstore with Cosign. There's also been support added to PyPI, and GitHub just announced a week or two ago a public beta for npm provenance using Sigstore. We also have a ton of open source projects using Sigstore to sign their own releases. Tekton, which I'm also a maintainer for, uses Cosign to sign its releases. Kubernetes uses it, as well as CPython and a bunch of other projects. So yeah, we've been seeing a ton of adoption on the Sigstore side.
So then, going back to this chaotic graph from before: if you look at some of the source threats on the left side here, and you look at some of the threats on the packaging side here, you'll see a similarity between the two, right? On the source side, submit unauthorized change. On the package side, upload modified package.
Compromise source repo, compromise package repo. Build from modified source, use compromised package. And so the question is, can we take the model of Sigstore and what we're using for packages and artifacts and apply that to source signing? That's really where the concept of Gitsign comes from, which does exactly that. Let's apply Sigstore, but model it for Git commit signing, so we can have that richer identity metadata in order to make smarter policy decisions about our source code. Because really, when you think about it for GitOps pipelines, we want to be able to sign and verify our packages and our containers for our production workloads, and our source code is really the input to our GitOps pipelines, our CI/CD pipelines. So we should be treating it with the same level of security and signing that we do for everything else.
So, how does this work? There are a few different pieces here. It all starts from the tooling: whether this is Cosign, npm, or Gitsign, it all starts from a client. What we're really relying on for identity is OIDC, OpenID Connect, which is a layer on top of OAuth. So the tool goes off and says, hey, I need to get an ID token. If it's a human user, we just send them through the very typical OAuth 2 flow: open up the web browser, sign in with Google, sign in with GitHub, get that token, return it back.
The client will also generate an ephemeral key pair on the fly, so a brand new private key and public key. It will sign a challenge to prove that it holds the private key matching that public key. Then it will send this data to a component of Sigstore called Fulcio, which is a certificate authority that will look at the public key, look at the OIDC token, and verify that the OIDC token is valid.
And if everything checks out, it will issue you back a short-lived code signing cert that's only valid for 10 minutes. So you can sign whatever you want in that 10-minute window: Git commits, OCI images, whatever you want.
And then finally, typically when you do certificate-based signing, you can only rely on the certificate during the period where it's valid, so we need a way to verify things well after the fact, after it's expired. Anything that we sign, we put on a service called Rekor, which is also part of the Sigstore project. It's a transparency log, an append-only immutable store where we can record these signatures and this usage as well as the certificates. So later on, when we have a Git commit, even if it was from months ago, we can say, hey, Rekor, was this commit actually signed at this time? That becomes our immutable log to do that verification well after the fact.
These are open source components that you can run yourself. The Sigstore project also runs public instances for free that any open source project can use, and this is what Kubernetes and Tekton are using for all their open source deployments. Everything is auditable, so you can go and query where all the signatures are coming from, and it gives a lot more transparency into build processes and signing processes.
So I mentioned certificates. This is one example certificate, from a commit that I just pulled myself. For human workflows, really what this comes down to is, hey, what's your email, and where was this issued from? In this case, we have my email, billy@chainguard.dev, and this was issued by G Suite, so accounts.google.com. You can also look at the not before and not after here: it's only valid for 10 minutes and only to be used for code signing. So we can't use this for arbitrary web hosting and stuff like that.
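The 10-minute window and the after-the-fact log lookup described above can be sketched roughly as follows. This is an illustration only, not Gitsign's or Rekor's actual verification code: the function name and timestamps are made up, and a real verifier also checks the certificate chain, the signature itself, and the log's inclusion proof.

```python
from datetime import datetime, timedelta, timezone


def signed_while_valid(not_before, not_after, logged_at):
    """True if the log recorded the signature inside the certificate's validity window."""
    return not_before <= logged_at <= not_after


# Hypothetical certificate issued at noon UTC and valid for 10 minutes.
issued = datetime(2023, 5, 1, 12, 0, tzinfo=timezone.utc)
expires = issued + timedelta(minutes=10)

# The comparison uses the time recorded by the transparency log, not the
# current time, so the answer stays the same months after the cert expired.
in_window = signed_while_valid(issued, expires, issued + timedelta(minutes=2))
too_late = signed_while_valid(issued, expires, issued + timedelta(minutes=11))

print(in_window, too_late)  # True False
```

The design point is that the trusted timestamp comes from the log entry rather than from the signer's clock or the verifier's clock.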
So the question then becomes, what about automated workflows? We don't actually have a user email, right? Your GitHub Action doesn't have an @github.com email address that we can use. And this is where OIDC really shines. OIDC is a very flexible spec, and a lot of CI/CD providers and cloud providers in general actually provide OIDC tokens for their runtime environments. This is true for GCP VMs, Amazon, Azure, GitHub Actions, GitLab, CircleCI. They're providing OIDC tokens that you can hook into, and those give you finer-grained information about the environment that you're running in.
So this is one example of a GitHub Actions token that I pulled from the GitHub documentation. You can see here, unlike the example we saw before with the GitHub Actions email, where it's just the GitHub Actions ID at whatever noreply email, we actually have finer-grained information such as the ref, the SHA that it ran at, the repository it's running in, what run attempt this is, and what workflow file we are using. So this becomes a much richer source of data where we can start making smarter policy decisions on individual CI runs, rather than just on broad emails.
And Sigstore will handle these as well. This is an example of a certificate that was generated from a GitHub Actions workflow, and we can see some of the fields that are pulled out. We extract the issuer and where it's coming from, so we can say: was this a push, was this a pull request, was this a manual run, what was the workflow name, the ref, the branch, things like that. And this gives us a lot more control.
So, a quick demo. Let me kick this off, because it takes a little bit to run. I'm going to show a GitHub Actions run, but while that runs I can also show the local workflow.
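As a rough illustration of the richer claims described above (not from the talk), the sketch below builds a sample unsigned token and reads its payload. The claim names follow GitHub's documented OIDC token format, but every value here is made up, and `decode_claims` is a hypothetical helper that deliberately skips signature verification, which a real verifier must never do.

```python
import base64
import json


def b64url(raw):
    """Base64url-encode bytes without padding, as JWT segments do."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def decode_claims(token):
    """Return the JWT payload as a dict. This does NOT verify the signature."""
    payload = token.split(".")[1]
    payload += "=" * (-len(payload) % 4)  # restore the stripped padding
    return json.loads(base64.urlsafe_b64decode(payload))


# Made-up values; a real token is minted and signed by the CI provider.
sample_claims = {
    "iss": "https://token.actions.githubusercontent.com",
    "sub": "repo:octo-org/octo-repo:ref:refs/heads/main",
    "repository": "octo-org/octo-repo",
    "ref": "refs/heads/main",
    "sha": "0123456789abcdef0123456789abcdef01234567",
    "event_name": "workflow_dispatch",
    "workflow": "release",
    "run_attempt": "1",
}
token = ".".join([
    b64url(json.dumps({"alg": "RS256", "typ": "JWT"}).encode()),
    b64url(json.dumps(sample_claims).encode()),
    "signature-omitted",  # placeholder, not a real signature
])

claims = decode_claims(token)
print(claims["repository"], claims["ref"], claims["event_name"])
```

Compared with a bare noreply email, each of these fields is something a policy can match on for an individual CI run.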
So it's pretty easy to get started. All you really need to do is install the Gitsign binary, add it to your path, and then set two required settings that basically say, hey, Git, please use Gitsign as my signing program, and this is an X.509 type certificate signing model. There are a few other config settings you can see here that configure the behavior of Gitsign, but those are optional, not really required.
And then all we need to do is make a commit as normal with a meaningful message. What's going to happen is it's going to open up a browser window. You're not going to see it here because it's in the background, and since I'm already signed in it already knows, hey, you're signing in with Google. It makes the commit, the signature, everything else, and we can see here that we have this transparency log entry added.
And so git verify-commit will work. However, git verify-commit only looks at the keys. It doesn't actually look at identity information, so what we also have in Gitsign, with gitsign verify, is the ability to do finer-grained verification not only of the key itself but also of the account information included here. Here, for a human user, it's just what is the issuer and what is the identity. We can run that and verify it, and it will do all the checks against Rekor and everything else. The idea is that it's very easy, very simple, and this just generated a new key under the hood on the fly.
And so now, this is the Sigstore auth page that opened up in the background. And now we have a GitHub Action that has finished, and all it really did was make a commit, nothing too special. But what we can do here is grab the commit SHA, and Sigstore has this really nice UI where you can just query the Rekor log.
Theoretically you can monitor the log for every single instance where your email is being used, or where your production workflow is being used to sign things. We can just look up by commit SHA what that run just made, and we can say, hey, this was created two minutes ago. This hash is actually a hash of a hash, which isn't super useful, but we can see here in the certificate issued by Sigstore that it's still technically valid. As far as Gitsign is concerned, though, once it signed that commit, it threw that key away. That key never even hit disk. It's only stored in memory, used for that one event, never to be used again. So as far as I'm concerned, it's lost, it's rotated, we don't need to worry about it.
And we can see all of the information here. It's a little more verbose than what we saw before in that small example, but it's the same information: issued by token.actions.githubusercontent.com, here is the actual individual workflow that was run, including the ref, the commit information, the URI, the fact that it was run from a workflow dispatch, stuff like that.
And so what this allows us to do, if we git pull the latest and run gitsign verify... I had this in my history. All right, this looks a little gnarly, but it is hopefully correct. I had it here somewhere. There it is, yeah.
So we can verify the same thing, but now we're doing more than just the email identity. We can actually have a policy that says, hey, this came from refs/heads/main, this came from a workflow dispatch. And if we wanted to do something like, hey, did this come from a push event (typo, but it doesn't really matter), this will actually fail validation. So we can start making smarter decisions about push versus pull request versus manual run versus anything else. And because this is all bound to OIDC identities, what's really nice is that I as a developer can't fake this unless I have direct access to GitHub's OIDC token minting service.
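The kind of policy decision shown in the demo can be sketched as a plain field comparison. This is a simplified illustration, not Gitsign's implementation: the field names `issuer`, `ref`, and `trigger` and the helper `policy_violations` are hypothetical stand-ins for the identity data carried in the certificate, and real verification happens against the signed certificate and the transparency log.

```python
def policy_violations(identity, policy):
    """List every policy field whose expected value differs from the signer identity."""
    return [
        f"{field}: expected {want!r}, got {identity.get(field)!r}"
        for field, want in policy.items()
        if identity.get(field) != want
    ]


# Hypothetical identity fields extracted from a workflow-issued certificate.
identity = {
    "issuer": "https://token.actions.githubusercontent.com",
    "ref": "refs/heads/main",
    "trigger": "workflow_dispatch",
}

manual_run_policy = dict(identity)                   # accepts this manual run
push_only_policy = {**identity, "trigger": "push"}   # only accepts push events

print(policy_violations(identity, manual_run_policy))
print(policy_violations(identity, push_only_policy))
```

An empty list means the commit satisfies the policy. The push-only policy reports the trigger mismatch, which mirrors the failed validation in the demo.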
So if I had the ability to run arbitrary things in my CI workflow, maybe I could get around it, but normally as a developer I need to go through my normal processes, and this is really, really powerful as a policy control tool for CI/CD.
Cool, so just to review some of the pros and cons, I'll start with the cons. A common piece of feedback we get is that if you use Gitsign in practice, you actually won't see the green verified check mark. It is something that we do talk to GitHub and GitLab about quite a bit. Really what it boils down to is that how Gitsign approaches signing and verification is very, very different from what's traditionally been done. Normally for GitHub, when you want to associate a key with your account, you go into your profile settings and you add the key. But in this case, with keyless, there is no key to add. We actually don't know what the key is going to be, so we can't do that. So there are some changes that need to be made to verifiers in order to do this more identity-based verification, and those just aren't in production at GitHub or GitLab at the moment.
Another thing worth calling out: you saw some of the metadata that was included in the certificates. All of that goes onto the public transparency log. So if you're using the public Sigstore instance and you're uncomfortable with that data being present, it may not be the best fit for you, if you're concerned about repository names or branch names or identities like that. That might mean you need to run your own instance. But again, all these components are open source, so you do have the tools to run your own instance if you need to.
But there are a ton of pros, right? So again, no private keys. I don't need to worry about protecting this long-lived key. I don't need to worry about it leaking out.
And even if it does leak out, because that key is bound to a certificate that's only valid for 10 minutes, that drastically reduces the time in which the key can be exploited. So even if you detect, oh hey, oops, we accidentally leaked our key, as long as 10 minutes have passed you can be confident that key can no longer be used to sign valid data. That just puts you in a much better position in terms of responding to security events whenever they come up.
Signatures are tied to runtime identities. The fact that you need to go and get a fresh, valid OIDC token in order to sign anything means that the email case we saw before, where I can still sign things as @google.com, doesn't work in this workflow, and I can't easily go get a key for my CI/CD workload. So that gives you a little more trust in that identity, rather than just assuming that a key maps one to one to an identity.
And then finally, since all of its usage is uploaded to the transparency log, you can monitor that. If there's ever a case where something pops up, like, oh hey, your production identity is being used for an artifact or a signature that you don't actually know about, that's a signal that maybe something funky is going on, and it gives you some telemetry for where to investigate and where to look.
So that's all I had. Here's some contact information if you want to reach out. Here is the link for the Gitsign project if you want to get involved or have any questions. Stars are always appreciated. Thank you, and I'm happy to take any questions.
So with keyless signing, suppose you suspect that at some point someone managed to get hold of your identity and got your password, and let's say you did the bad thing and you don't have 2FA enabled on your email or things like that, and they were able to start signing things with your identity. What can you do in that situation?
Yeah, so that's a great question.
One of the nice things is that you're getting a fresh key every time. You could try to revoke every single key, but the safer thing to do is basically say, okay, we don't know when this was compromised, so we're not going to trust anything before this point in time. You can have that as a policy that basically says any commit, anything made with this identity before this point in time, we're no longer going to trust.
The other thing you can do, because keys are now unique per signing event, is revoke the artifacts. You can just remove the artifacts that they signed, because you know what they're tied to based on the signatures. That's easier said than done for things like OCI images. Git commits are obviously much harder because Git itself is a Merkle tree. But yeah, you would look at revoking or identifying the artifacts that those signatures are tied to. At least you have that point in time and you know exactly which artifacts those would point to.
But you do raise a good point, that keyless signing is predicated on account security and having two-factor auth enabled. So there is a weak point there, but the argument is that people do a much better job securing their own personal email accounts, and there are pushes like GitHub now requiring 2FA for all accounts. It's much easier to lower the barrier to entry and require those things for safer account and identity management than it is to say, hey, make sure you do all these things for all your GPG keys and SSH keys and everything else. So it is a trade-off. Thank you.