Someone is going to record the talk. Anyway, I'm just going to start. Hi, everyone. My name is Astra, and I'm a software engineer at Google on the Google open source security team. And I'm here with my co-presenter, Laurent. I'm also on the same team, the Google open source security team. And today, we're going to be talking about authenticating supply chain metadata and how to build remote code attestations on GitHub. So this is going to be a fairly general talk. I know you've been listening to provenance and attestations and that sort of thing all day, so I'll hopefully try to keep it a little bit new. To go over what I'm going to talk about today: first, I'm going to introduce the problem space a little bit, with some software supply chain attacks as motivating exploits. Then I'll talk a little bit about what remote code attestations are in general and the way that we are going to be thinking of them, and then go through two ways that you can achieve them on GitHub: reusable workflows and GitHub Actions. And we'll have a demo here, and we'll have some prizes, so make sure you stay awake. I might not. So anyway, as you've probably seen throughout yesterday and today, if you were at OpenSSF Day, there are lots and lots of supply chain attacks and compromises that have been going on in the past two years, if not obviously longer, and they will probably continue to happen in the coming years. In one of these cases, there was compromised build infrastructure. This was the Webmin 1.890 attack, where a compromised build server pulled source files that were not actually from the source control repository. This actually caused a remote code exploit present in their release, so it was a pretty high severity bug, and a lot of news happened as a result.
What I want you to take away from this is: had the build server in this particular case done some sort of check on where it was getting those source files from, or had clients who were using that utility done some kind of check, they may have been able to detect where that source was coming from. Again, lots more attacks here. In this case, not a build infrastructure attack, but a dependency attack. And dependencies have such a big attack surface, as you know, and there are lots of different ways that malicious dependencies can be injected. Dependency confusion is a really interesting one. This blog post was a pretty cool attack. Basically, lots of big companies are using a mix of both public and private packages that are hosted on public and private registries. And this dependency confusion attack exploits that particular mix and confuses private packages with public ones. An attacker can squat a public name, and the package resolution tool that you're using will resolve to the malicious public package that the attacker has squatted. So again, the takeaway here is that a lot of times when you are building something, when you're pulling packages, you might not have the information that would have told you: what am I actually getting, what am I using to build, what am I actually running? So again, the more information, the better. Another one that you will definitely have heard of is the Codecov attack. In this one, a malicious attacker used leaked credentials to upload a malicious binary to their GCS bucket. And users would often just pull that binary in directly, without any sort of verification, and run it. People were using this in environments that had permissions and tokens exposed, and that was not good. So again here: had there been some kind of proof of where that binary was built before it was uploaded to GCS?
Perhaps people would have been able to detect that. And likewise, if you were a consumer of that binary and you had pulled it in, you might have been able to detect that this is not something that you expected. And likewise, because dependency attacks are ubiquitous, the key takeaway here is we need more trustworthy information out there about what's in our software artifacts to help secure our pipelines. What are we using? What are we using to build? What dependencies are we using to build? What source are we using to build? All these sorts of pieces of information can be used to verify that later and make sure we're getting what we expect. So, all right, what do I mean by a remote code attestation? I don't like this slide, because it's a government-issued, I don't know, formal statement of what an attestation is, and it's very long. But what they say is that an attestation is the issue of a statement, based on a decision, that fulfillment of specified requirements has been demonstrated. All I mean to say is that an attestation is a piece of data representing a proof of an event. So an attestation doesn't necessarily have to be in the world of SLSA, which you may have been hearing about before. It is a general piece of information that can be used to attest to anything you want. And for that kind of event, you might be concerned about the environment of the event: where was it happening, when was it happening? Or the materials: what is being used; recall the Webmin exploit where the incorrect source was being used. The recipe: what steps were used to build, what configuration was used to build. And then, what happened as a result of the event: what was the output, the subjects. Finally, the ideal goal here is: can you use that attestation, that proof piece of data, to trace that subject back to the materials, with the recipe, in the environment?
So the goal is: can you get all the information you need to retrace, re-derive, and verify that you're getting exactly what you expected, whether you're concerned about the materials being used or about the recipe being used. All right, so like I said before, we didn't really want to constantly say build, SLSA, SLSA, SLSA. You can also attest to events like code scanning or code commits or releases or vulnerability disclosures. So you may not always need a SLSA provenance. Perhaps you also need an attestation that you performed a vulnerability scan. Or perhaps you're an organization that reviews code and provides some kind of security audit, and you want to make a transparent attestation that you've reviewed a particular package or a particular piece of software, and share that with the world for people to use in their policy decisions. So again, they can make code attestations and codify what they are actually attesting to. You can use all these sorts of attestations for different things. All right, so how would they have helped? In the exploits that we saw before, the main idea is that we'll be able to trace the artifact that you're consuming back to the source code, through that recipe and through the materials in that environment. This way, you're actually able to ensure that you're getting the expected source, that you're not getting any backdoors inserted, and that maybe your dependency list is something that you've curated and expect, protecting yourself from the supply chain attacks that we saw before. And also, like I said, if you're looking at code attestations not specifically related to builds and to things like SLSA, you may be able to say, OK, I have some extra assurance that XYZ party did a security audit of this particular package.
So maybe in five years we build up a common language for these attestations, which we are starting to have, and we grow the technology and tools for people to share these attestations with the world and make policies and decisions based on them. All right, so now I get into the big crux of the problem here, which is trust. Step one, it's great to have availability of data, but step two, it's probably great to also trust that data. So how can one trust an attestation? What exactly do I mean by trust? Maybe you're considering: do I trust the producer of the attestation? Do I trust the organization that created and generated that attestation? Or do I trust the process generating the attestation? Second, and this is a more nuanced one, which I think is a little bit trickier to solve: do I trust that the attestation was not interfered with when it was produced? This is a particular point that I will hopefully drive home later in this talk, and I'll describe what I mean by interference. You may not only want to, let's say, create an attestation of something; you may want proof that that attestation was created in an environment where nothing could have interfered with it. And that something could be the build itself, or that something could be the owner of the package. Especially when you consider people who are creating self-attestations, attestations on something that they themselves are doing, how can they actually provide you with some trust or guarantee that they didn't meddle with it for their own benefit? Maybe they don't want to get alerts, or they don't want people to send them a letter saying that they didn't properly do something. And last, do I trust that the attestation was not altered? This one should be a little bit easier to think about. Whenever you think of altering or tampering with data, generally you might think of digital signatures.
So definitely that will be a part of my talk. All right, now that we have what I mean by trust out of the way, I'm going to talk a little bit about how we can actually do this in practice. What I really like is a feature GitHub released, I think sometime in October, late Q3 or Q4 of last year, called reusable workflows. We've been playing around with them a lot and found a bunch of really cool features about them that we want to exploit. Anyway, probably most of you are familiar with GitHub workflows already. They're a pretty standard way of running CI builds, actions, tools and so on natively in GitHub, and many times people actually just run releases on GitHub as well. These are all defined in your source repository under a particular folder, .github/workflows, and you can use them to run third-party actions, to run your own commands, to, I don't know, maybe create attestations. They can also be triggered on different events, and those events can be manual or something like pushing a tag. Anyway, they're awesome; I'm sure you've all seen them before. They look something like this: they're defined in YAML, they have a particular trigger involved, and then they can be used to run some sort of arbitrary code. They're built with a sequence of jobs, and each of those jobs consists of a sequence of steps. Inside those steps you can run whatever code you want. So for the duration of this, let's just think: I want to create an attestation of that thing that I ran. I guess it cuts off for everyone, yeah. Sorry about that. There's probably no information over there that's really that relevant. Right, so how do we create a verifiable attestation about what was run in that GitHub workflow?
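As a concrete sketch, a minimal workflow of the kind described here might look like this (the file path, names, and commands are illustrative, not from the talk's slides):

```yaml
# .github/workflows/build.yml -- a minimal example workflow.
name: build
on:
  push:
    tags: ['v*']                    # triggered by pushing a version tag
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3   # a third-party (GitHub-provided) action
      - run: echo "Hello World"     # an arbitrary command, the "event" to attest to
```

The `on:` block is the trigger, each entry under `jobs:` runs on its own runner, and each job is a sequence of steps.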
So again, if you're thinking, maybe I want to run a code scan in my GitHub workflow, or maybe I want to run a build and then perform a release in a GitHub workflow: can I actually create an attestation that I performed that in that workflow, and can I prove it to other people? For the sake of this, let's create an attestation on this particular event. All right, so a naive attempt would be: let's perform the event that we want to attest to, and then let's just record it. Let's say maybe I can pull some logs, or maybe I can just check out the repository at the particular commit that the workflow is on, and say, yes, I actually performed echo Hello World here. That might be a naive way of doing this, and it might be sufficient for some people, especially if you trust what's going on in my event and in that particular repository, but it might not be enough. For example, the event itself might interfere with the recording of the event. Perhaps there's also an incentive for the maintainer of this repository to record a slightly different version of Hello World, and that particular owner would be able to manipulate the generated attestation however they want. Maybe they do something weird, maybe they don't check out the repository, they just do whatever they want. So it's not quite an isolated, provable process to say: okay, I'm just going to generate the attestation by myself in the same workflow, I control how I'm generating the attestation, and you get what you get. You want some extra assurances that, okay, I didn't tamper with this. All right, so like I said before, there are a bunch of problems here, the first one around interference. Can any of the other steps or jobs in that workflow actually interfere with the recording in order to produce false info?
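The naive attempt described above might look something like this: the same owner-controlled workflow both performs and records the event, which is exactly the weakness (names and the attestation format are illustrative):

```yaml
# Naive self-attestation: one workflow the owner fully controls.
jobs:
  build-and-attest:
    runs-on: ubuntu-latest
    steps:
      - run: echo "Hello World"     # the event we want to attest to
      # Self-recorded claim; nothing stops an earlier step, or the repository
      # owner, from writing a false claim here instead.
      - run: echo '{"ran": "echo Hello World"}' > attestation.json
```

Anyone who trusts this attestation is really just trusting the repository owner.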
And there are build processes, there are lots of weird things that might happen, there are hooks that you might run, and also you as a maintainer might interfere with the generation of that attestation itself. The second and the third, integrity and authenticity, those two kind of go hand in hand. Integrity: let's say that GitHub workflow uploads that recorded attestation after the fact. Can I go in and just replace it with something else? I mean, maybe, right? So can I tamper-proof it, can I make sure that it's not being altered after I produce that attestation? And then the authenticity portion: can I actually prove that ownership? So they go hand in hand: can I prove that it wasn't tampered with, and that the original author was something in the workflow? These are the three types of problems that I'm going to be dealing with and trying to say, okay, yes, we have a solution. All right, so my solution here is GitHub reusable workflows, like I introduced before. What I want to show you is that GitHub reusable workflows are basically like a GitHub action, but they are a reference to a workflow hosted in someone else's repository. So the owner of this user/repo doesn't actually have access to modify the trusted builder, and can call into it like a separate workflow. With this sort of isolation step, user/repo can call into something that they don't control. And let's say that that trusted workflow over here performs the exact event that we want to attest to, and then generates the attestation away from the user repository. With this, your user repository doesn't really have a way of tampering with what's going on, in either the event or the generation of the attestation. So what I like about this is that you get that isolation for free.
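The caller/callee split can be sketched like this; the repository and file names are hypothetical, but the `workflow_call` trigger and the cross-repository `uses:` reference are the actual GitHub syntax for reusable workflows:

```yaml
# In trusted-org/trusted-builder, .github/workflows/attest.yml (the reusable workflow):
on:
  workflow_call:                  # makes this workflow callable from other repositories
jobs:
  perform-and-record:
    runs-on: ubuntu-latest
    steps:
      - run: echo "Hello World"   # the event, run outside the caller's control

# In user/repo, .github/workflows/release.yml (the caller):
# jobs:
#   attest:
#     uses: trusted-org/trusted-builder/.github/workflows/attest.yml@v1
```

The caller can only invoke the pinned workflow; it cannot modify its steps.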
And what's different about this versus GitHub Actions, which is probably going to be a question, is that by default there is no propagation of environment variables, defaults, and things like that into a GitHub reusable workflow. So there isn't really a point where a maintainer who has influence over user/repo can go and manipulate the generated attestation in the trusted builder. Laurent will talk about what steps can be done, and show you the painful process around what GitHub Actions don't give you by default, where GitHub reusable workflows do. So again, this isolation piece: we get a nice layer of isolation between the user repository and the repository that's actually performing whatever event you want to attest to. So let's generate the attestation here. Interference: taken care of. Now the second two problems, integrity and authenticity: let's tackle those together. And I'll use everyone's favorite word here, which is Sigstore. What we'll do is create signatures with authenticity, that is, signatures with an identity component baked inside the signature metadata. In this case, we're using workload identity, which is similar to SPIFFE, which you may have heard a little bit about in the previous talk on Tekton Chains. What we're going to do is use OpenID Connect, which is supported inside GitHub workflows, to get a signed certificate from a certificate authority on that identity. So our signing certificate will contain the identity of the trusted builder, the trusted performer of the event and attestation. That way we have ownership, or authorship, over who created that attestation, because that particular signing cert must have been created inside that environment. So the certificate that signs that attestation or provenance can be tied to the identity of that trusted workflow.
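In workflow terms, the OIDC token that keyless signing relies on is requested through the `id-token` permission; a minimal sketch, with the signing steps themselves elided:

```yaml
jobs:
  sign:
    runs-on: ubuntu-latest
    permissions:
      id-token: write   # lets the job request an OIDC token identifying this workflow
      contents: read
    # steps: exchange the OIDC token with a certificate authority (e.g. Fulcio)
    # for a short-lived signing certificate bound to this workflow's identity,
    # then sign the attestation with it.
```

The certificate authority, not the repository owner, vouches for the identity in the resulting certificate.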
As an example, the signing certificate returned using OpenID Connect with Sigstore's keyless signing would have a subject alternative name with the fully qualified location of the reusable workflow. This makes it really nice, because you have an identification of what actually performed the event and recorded the provenance. In addition, you get a lot of other information that may be useful to you. For example, what triggered the event: a push did. That's the 1.2 extension. Or what was the caller repository, the user repository that actually invoked the trusted reusable workflow? That's the 1.5 extension; here, my own repository. You'll also get some other information about which particular workflow invoked it. So if you use this signing cert to sign the actual provenance inside that environment, then you have that authorship, the identity component of who created the attestation. And you also get a binding to that data in a way that makes it tamper-proof: if anyone alters that provenance outside of that workflow, it would no longer be signed by this particular identity, and so it would fail verification. So, all right, putting it all together. This is the layout of how you can create these attestations. If you create, as perhaps an organization, a collection of reusable workflows, or use some that are distributed by the slsa-framework organization, which is us, then you can invoke those and create attestations that you can prove you did not tamper with and did not interfere with the creation of. Roughly all of these are going to have the same structure: perform the event, and then, in a separate job, which is an isolated VM, record the event.
And this way, when you get that information back into your calling workflow, you'll have a statement of proof that it was created inside this trusted workflow, and with that guarantee of integrity. So now we have all three: we have interference handled, integrity, and, you can't see the check mark, but I can, we also have authenticity. All right, that's basically our rough setup, and it can be used for really anything. For verifying an attestation, we're going to rewind the steps that I just talked about. First we get that integrity component by verifying the signature on the attestation. Then verify the prover identity, that is, the identity of the trusted reusable workflow; that gives us authenticity and isolation. And now you have trust in the actual attestation, that proof. Now you can go run with it, do what you want: go check the source, check the materials, check the environment, check the actual statement. All right, so now I'm going to hand it off to Laurent, who's going to talk about applications of this and also do our demo. All right, thanks. Yes, it's my turn to speak. I'm going to try to do as well as Astra. The bar is pretty high, so, let's go. So, as you might imagine, the most compelling application of reusable workflows and trusted builders is artifact attestation, also known as SLSA provenance. And here by artifact, I mean anything that is the output of a build pipeline. It could be a binary, it could be a package, an npm package; it could also be something else like an SBOM, because an SBOM is also the output of a build, and you might want to prove to a third party that the SBOM is authentic, right? And how you created it might be important for a consumer.
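The identity check at the heart of verification can be sketched as a policy comparison against the certificate's subject alternative name. Here the SAN value is inlined as a hypothetical example rather than extracted from a real certificate, and the trusted prefix is an assumed policy:

```shell
# Hypothetical SAN, as issued for a reusable workflow via keyless signing.
san="https://github.com/slsa-framework/slsa-github-generator-go/.github/workflows/builder.yml@refs/tags/v1.0.0"

# Policy: only accept attestations signed from the trusted builder organization.
if printf '%s\n' "$san" | grep -q '^https://github\.com/slsa-framework/'; then
  echo "signed by the trusted reusable workflow"
else
  echo "untrusted signer" >&2
  exit 1
fi
```

A real verifier performs the same comparison after cryptographically verifying the signature and certificate chain.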
So, I'm going to repeat a little bit of what Astra said and what you've heard about SLSA, but essentially, with the attestation we can create a strong link between the artifact and the original sources that you think the binary or artifact is coming from. In particular, we can tell you which repository the source came from and at which commit hash it was built, and we can also report all the steps that were performed during the compilation of the build. Say you want all your builds to have a special flag when they're being compiled, maybe something like CFI, with the latest pointer integrity support; then you can use artifact provenance to check for this before you deploy the binary into production. So I want to give you some of the use cases that we think are really interesting and that are enabled by having artifact attestations. The first use case is the GitHub dependency graph API. GitHub has an API where you give it two commits, two different SHAs, and it gives you in response the list of dependencies that changed between those two commits. So here on the slide you see that there's one npm package that has been added, and its name is helmet, all right? And the API also returns the source repository that was used to create this package. The source repository, unfortunately, is not authenticated; it's taken from the manifest file. So we don't really have a strong binding between the package and the original source code that was used to create it. With artifact attestations, we can fill this gap and create this strong binding. And once we have all those attestations, we can start creating policies, and you might be able to enforce policies at different times in your supply chain. You could have them, for example, up front in the control plane, when you want to deploy to a cluster.
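For reference, the dependency diff described here is exposed through GitHub's REST API; a sketch of the request follows. OWNER, REPO, and the SHAs are placeholders, and the endpoint path is my best recollection of the public docs, so double-check it against the current documentation:

```shell
# Build the request URL for comparing dependencies between two commits.
BASE=abc1234
HEAD=def5678
URL="https://api.github.com/repos/OWNER/REPO/dependency-graph/compare/${BASE}...${HEAD}"
echo "$URL"
# To actually query it (requires a token with read access to the repo):
# curl -s -H "Accept: application/vnd.github+json" \
#      -H "Authorization: Bearer $GITHUB_TOKEN" "$URL"
```

The response lists added and removed packages, each with the (unauthenticated) source repository taken from the manifest, which is the gap the talk says attestations can close.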
You would be able to check that your container is coming from the right source repository, for example. You might also want to enforce policies at build time instead of doing it at the last minute. So let's say you're creating a container image. You could have a policy that says: fail unless the base image is coming from, say, Distroless, built from this repository. And then you could also have policies at installation time, when you run npm install or pip install or apt-get install and all that sort of stuff. Another interesting use case is yet another GitHub dependency API. I think two days ago, GitHub released this new API for maintainers. The motivation behind it is that for certain ecosystems, it's pretty difficult to parse the dependencies statically, and in some cases you can't actually resolve the dependencies until you have built the final artifact. So they created this API where maintainers can build and then publish their exact dependencies, and thereafter GitHub can give you custom, more accurate alerts about the vulnerabilities that have been found in those dependencies. You can think about it basically as an SBOM sort of API, where maintainers push their SBOM to GitHub. Now, in the context of the supply chain, as a maintainer I might want to prove to someone else who's consuming my artifact that my SBOM was generated without cheating, and that I'm not hiding vulnerabilities in dependencies that might have them. Using artifact attestations, or SLSA provenance, you can prove to a third party that your SBOM is authentic, and you can prove to them how you created it. And as Astra mentioned throughout the talk, you can really use these kinds of attestations for any kind of metadata. Maybe you're running a static analysis tool and you want to report the results; Codecov could do something like this, for example, to report.
If you want to prove to a third party that you have, say, 30% coverage on your unit tests, you would be able to do this with a reusable workflow. All right, so as Astra said, we've been playing around with those reusable workflows for a few months, and today we are actually releasing a builder workflow for the Go programming language, a v1 version. It's ready today; you can use it. We have everything working, and we have a verifier that you can run to verify the attestation. So this is ready to use today, and I'm going to give you a demo right now. The builder is SLSA 3 compliant, meaning that the provenance information is non-forgeable, using all the techniques that Astra described earlier in the talk. All right, so some demo time. As an example, I'm going to take the Scorecard project, which is another project from the OpenSSF, and I'm going to show you how to use the builder. As you can see, version 4.4.0 of Scorecard already has the binary and its corresponding attestation, or SLSA provenance, that you can download from the website and verify. So first, let me show you how you can use the builder that we have written for Go. It's really simple; it just takes two steps. First, you create a config file describing all the flags and arguments that you want us to pass to the compiler. So it's pretty simple. And second, you define a workflow. Here we're just calling the trusted builder from a reusable workflow at line 32. And that's really all it takes to start building and generating SLSA provenance, which is non-forgeable, using GitHub Actions. Now let me show you how we can verify this provenance. I have already downloaded the binary and the provenance. All it takes is to use a project that we have on GitHub called slsa-verifier. We plan to have it available as Linux packages to make it easier for people to install.
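The two steps described here can be sketched roughly as follows. The config file name, its fields, and the builder's workflow path are from memory of the slsa-github-generator-go project and may differ from the released version, so treat them as illustrative:

```yaml
# Step 1: a config file in the repository (e.g. .slsa-goreleaser.yml)
# describing how the Go binary should be built:
#   goos: linux
#   goarch: amd64
#   flags:
#     - -trimpath

# Step 2: the project's workflow calls the trusted builder as a reusable workflow.
jobs:
  build:
    permissions:
      id-token: write   # needed for keyless signing of the provenance
      contents: read
    uses: slsa-framework/slsa-github-generator-go/.github/workflows/builder.yml@v1
```

Because the builder runs as a reusable workflow, the calling repository cannot alter how the binary is built or how the provenance is generated.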
But for now, you have to install it by downloading the binary or using go install to get it on your machine. It's pretty simple: it takes a path to the binary, a path to the provenance, and then you give it the source repository you believe this binary is coming from. And then we have optional arguments, such as the tag. As you saw earlier, this was the v4.4.0 version. So here it's succeeding. If I were trying to perform a rollback attack and give you a binary from a previous version, then it would fail. And similarly, if you give it a different repository, it fails, all right? So this is great. Now let's take a look at the content of the provenance. There's lots of stuff, so I'm just going to focus on the things that might be interesting. Obviously, we have the commit hash that was used to compile this binary, and the repository, ossf/scorecard. Then here we have the steps. As you can see, we have a first step where we ran go mod vendor, and then the second step is this long command, blah, blah, blah, and the list of environment variables that were used to compile this binary. Using this, you can just rebuild and replay that, and maybe, if you use the same Go compiler, you'll get the same result. Although I'm not sure that's entirely true, but that's a different question. Other interesting information that might be here: the GitHub actor, basically the person who triggered the build. In this case, it was me. And the event name: this was a build triggered on GitHub with a push event. All right? So let's go back to the presentation. Right, oops, that doesn't work very well. All right, and we have more information. How many, five? Oops. Yeah, so I highly encourage you to try out those builders and let us know on GitHub.
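The verification run described here might look roughly like this; the flag names match the early slsa-verifier CLI as I recall it and may have changed since, so check `slsa-verifier --help` for your installed version:

```shell
# Sketch: verify a downloaded Scorecard release against its provenance.
slsa-verifier \
  -artifact-path scorecard-linux-amd64 \
  -provenance scorecard-linux-amd64.intoto.jsonl \
  -source github.com/ossf/scorecard \
  -tag v4.4.0
# Exits non-zero if the binary, source repository, or tag do not match,
# e.g. a rollback to a binary from an older version, or a different repository.
```
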
I also want to mention that the OpenSSF reward program rewards developers who install these kinds of trusted builders, which improve the supply chain of critical repositories. So, unfortunately, I don't have a lot of time to go through this, but there's an interesting question that we asked ourselves a couple of months ago: can we actually build remote code attestation without a reusable workflow, just using a GitHub action? Remember what Astra said: if you just use a GitHub action, you don't have isolation, so the maintainer can really meddle with everything that's happening in the build. And it turns out that we can. I'm going to skip over some of this because we don't have time. Basically, the problem statement is the following. We have what I'll call the Scorecard GitHub action, because that's something that we use in the Scorecard project. We have a Scorecard action, and we want to prove to a Scorecard server that the results are genuine and haven't been tampered with by the maintainer. What we do to solve this problem is, again, reuse Sigstore and the OIDC token. We get a certificate from Sigstore which indicates the repository name, the commit hash, and the workflow that's currently running, and then we sign the results of Scorecard with this certificate. Server side, after verifying the signature, we fork that repository at the exact hash that is present in the certificate. That gives us an exact copy of the workflow that was run. And then we inspect the source code of the workflow. By inspecting, I mean we validate that the workflow is just calling Scorecard and hasn't added any additional steps to try to tamper with the results. So we're going to look for things like: is it running on GitHub-hosted runners or self-hosted runners?
We're going to look at whether there are additional scripts that are run, or additional jobs declared and run, or specific services started, or containers used. We're going to check all this, and if everything checks out and we trust the source code, then we know that what's being run by GitHub is actually the Scorecard action, and therefore we can trust the results that we are receiving on the server side. So, to conclude: we can achieve remote code attestation on GitHub with reusable workflows, but also with GitHub actions. However, I wouldn't recommend using a GitHub action, because it's pretty tricky to get right. You have to parse the workflow and be sure you're not forgetting anything when you validate it, and you also have this additional round trip where you have to fetch the workflow file from GitHub to verify it. So, all in all, I don't encourage you to use GitHub actions for remote code attestation, and I highly encourage you to use reusable workflows. They're really simple to use, and they give you isolation for free. So, yeah. I'd like to reiterate that today we've just launched the general availability of our trusted builders for Go projects. I encourage you to take a look, give it a try, give us some feedback. Astra also created a special repository for you to try out how to verify provenance, get used to its content, and learn how you can use it in your own projects. So give it a try and let us know if this is helpful. And that's the end of our talk. Thank you very much. Any questions? That's a good question. So right now we're using the in-toto attestation format for SLSA provenance as well, but in terms of what we can do to generically support more of these, maybe it would be interesting to create helper actions that can be invoked inside reusable workflows to help create some of those attestations in that format.
So that's something that we were thinking about anyway, for scaling up our builders and modularizing the components, and perhaps creating some tooling around GitHub actions that you can invoke for creating attestations would be really cool. That's a great idea. Yeah. Also, yeah, go ahead. Yeah, Brandon has also worked on a source attestation for SBOMs. He has a special predicate, and I think he has a proof of concept, so we can easily use it for SBOMs. And in fact, in the next version of our builder, we want to support SBOM generation with an attestation attached to it. Go find the Easter egg. We'll give you some swag. Yeah, we have shirts. If there are no further questions, thank you again.