Hello, everyone. Thank you for coming to the talk. So about four weeks ago, the npm package manager released a public beta of our provenance feature using the Sigstore project. And today we're going to go through and talk about what that looked like, what we learned along the way, and what we hope to see in the future for provenance in the open source ecosystem. I don't think we could fit the dozens or maybe hundreds of people who participated in this release on this stage. So unfortunately, it's just me here today, but I wanted to start out just by saying that this work would not have been possible without the Sigstore open source project. It's key to the npm provenance feature, as we'll describe later in the presentation. But this particular implementation that we've launched so far is for the npm package registry and works on GitHub Actions. We are working with other partners to expand this across the open source ecosystem. And then of course this work would not have been possible without the work that happened beforehand with the in-toto project and also the SLSA specification, which really inspired this concept of build provenance, build integrity, and supply chain security.
Okay, this is what it looks like. It's a very short talk. No. So npm build provenance ties an npm package back to the exact source code, the exact commit, and the exact build instructions that make up that package. So here, this is the example for the sigstore JavaScript package. And this is what it looks like if you go to the npm registry and look on the website. This feature works when you build on a cloud CI/CD platform; in this example, that's GitHub Actions. And we also publish the information to a public transparency log for monitoring and auditability purposes. So we're going to talk a little bit about what is provenance, what we built, and then what we learned along the way, starting of course with what is provenance.
So the somewhat visceral analogy that we like to use is: you're walking down the street and there's a hamburger lying on the ground. Are you going to eat it? Maybe you would, maybe you wouldn't, but the first question you're going to ask is, where did it come from? Right, you want to know the provenance of the hamburger. So that's exactly the case for open source projects as well. We already have this concept of maybe looking for social proof. How many stars does the project have? Are there other companies we respect that use it? Maybe we're looking for a privacy policy or a security policy. We're trying to determine the trustworthiness of an open source component. Today in open source, there really isn't a connection between what a package claims it contains and what it actually contains. Sure, someone can put in the README, here's a link to the source code, here's how I'm building this, but they can't prove it. And so the reason we undertook this work is to have these verifiable links, again, back to the source code and the build instructions.
Fortunately, we did not have to invent this concept ourselves. The SLSA specification has evolved as we've been working on this project. This is the 1.0 specification that came out about a month ago. It's been a busy month, by the way, in open source supply chain security, in a very good way. You can see it for yourself if you go to slsa.dev. But these are the different levels for build security. And so if you're unfamiliar, the SLSA framework is an attempt to break the problems of supply chain security down into a framework describing the sort of concerns that you might have, and giving you a sort of capability model to say, crawl, walk, run, how would you increase the security of these different elements? So for build, L1 is that any sort of provenance exists. Again, this could be a link in a README. It's just, you know, someone says this is the source code. It's better than nothing.
L2 is that it runs in some sort of trusted environment. This is something we're going to dive much deeper into in a little bit here. And then L3 is where it really starts to get interesting. We are saying that not only does the provenance exist, but it's verifiable. It's very difficult for an attacker to forge. And this is really what we want to get to. It's not enough to say, okay, yeah, it's probably the source code, it's probably built in this way. We want to be able to independently verify that people are doing the things that they claim to be doing.
I like this slide for a couple of reasons. So this is just a different view of provenance. So of course, there's the nice shiny box on the npm website. That's great. But if you look on the transparency log, this is what you will see in terms of provenance. All the same information is there. The package, the hash, a URI linking to the source code. The entry point is the build instructions. But it's just, you know, this is what you would interact with programmatically to receive this information. And then the reason I like to show this slide is because it's showing here again how we're building upon existing standards that have been developed in the community. This is an in-toto document. We're using the 0.2 version of SLSA. We will update to 1.0. We just haven't had enough time to undertake that work yet. But we're not trying to, you know, come up with this on our own. We're really trying to leverage the standards that are part of the Linux Foundation, part of the OpenSSF, and bring together this concept of provenance to be an industry-wide capability.
Okay. In detail, what does this look like? If you own a cloud CI/CD platform, or if you own a programming language package manager, how would you provide this functionality? I'll admit it's a little complicated.
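As a rough illustration of the provenance document described above, here is a minimal Python sketch of an in-toto statement carrying a SLSA v0.2 provenance predicate. The field names follow those two public specifications as best understood here; every value (package name, digests, repository URL, builder ID, build type, workflow path) is a made-up placeholder rather than a real published attestation, and the `summarize` helper is hypothetical.

```python
import json

# Hypothetical values throughout: the package, digests, repository, builder,
# build type, and workflow path are placeholders, not a real attestation.
statement = {
    "_type": "https://in-toto.io/Statement/v0.1",
    "subject": [
        {
            "name": "pkg:npm/example-package@1.0.0",
            "digest": {"sha512": "0" * 128},
        }
    ],
    "predicateType": "https://slsa.dev/provenance/v0.2",
    "predicate": {
        "buildType": "https://example.com/hypothetical-build-type/v1",
        "builder": {"id": "https://example.com/hosted-ci-runner"},
        "invocation": {
            "configSource": {
                "uri": "git+https://github.com/example-org/example-package@refs/heads/main",
                "digest": {"sha1": "1" * 40},
                "entryPoint": ".github/workflows/release.yml",
            }
        },
    },
}


def summarize(stmt: dict) -> dict:
    """Pull out the fields listed in the talk: package, hash, source URI, entry point."""
    subject = stmt["subject"][0]
    source = stmt["predicate"]["invocation"]["configSource"]
    return {
        "package": subject["name"],
        "package_digest": subject["digest"],
        "source_uri": source["uri"],
        "source_commit": source["digest"]["sha1"],
        "entry_point": source["entryPoint"],
    }


if __name__ == "__main__":
    print(json.dumps(summarize(statement), indent=2))
```

The point of the sketch is only the shape: the subject names the package and its hash, and the predicate links it to a source URI, a commit, and the build instructions (the entry point).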
And I forgot to mention on the earlier slide, my role in all of this was as a part-time program manager, even though I'm an engineer by training. So I'm going to do my best to walk through the description of what this looks like.
Going back in time a little bit, before we undertook this work, a lot of cloud CI/CD providers were adding in an OIDC token to uniquely identify a workflow, a build that was happening on their platform. The reason they did this wasn't for provenance. It was so that when your build was finished, you could authenticate into a cloud provider and do a deploy. But what we, the ecosystem, noticed is that a lot of the information contained in that OIDC token is exactly what we would want for build provenance, with a few exceptions, and so we worked with the GitHub Actions team to add in that additional information.
And then the other thing that we wanted to be careful about was ensuring that this was as easy as possible for people to adopt. So the npm publish command already existed. What we did is we added a flag to it, --provenance. And so when you include that flag, it will query the cloud CI/CD provider to get this OIDC token. It's signed by the cloud CI/CD provider, so it's non-falsifiable. That token is then sent to the Sigstore Fulcio project, along with a public key. That's a very important step. So once the build completes, you generate this short-lived public-private key pair. You use the private key to sign the build and then you throw it away. You take the public key and you send that along with the OIDC token to the Fulcio component of Sigstore. That returns to you a signing certificate, and burned into the OIDs of that X.509 signing certificate are all of the build provenance properties that we care about, along with the public key, along with a hash of the build artifact. All that information is uploaded to Sigstore Rekor.
That's the transparency log, to ensure that these open-source builds are happening in the open and they're auditable and monitorable. Then we send it to the registry. The registry validates the provenance attestations and then it makes it available to end users. Admittedly, it's a little bit complicated. There are a lot of different moving parts here. The real value of the Sigstore project is that it's creating this distributed source of trust that's independent of a specific cloud CI/CD provider, that's independent of a specific programming language package manager. This architecture is going to allow the concept of provenance to propagate through the open-source industry. That's our hope and dream. This is the first step towards that goal.
Often we get questions about this approach. Again, it is a little bit complicated. Are we getting anything for that complexity? Yes, we're getting quite a few things. A couple of other programming language package managers tried to deliver this concept of provenance in a different way. Instead of looking at the workload identity of the build script, they were focused more on human identity. I don't want to say that approach is wrong in every case, but it becomes very difficult. First, you have to strongly authenticate humans. Maybe you want government-issued IDs, and then how are you validating those? Then you have to ensure that the humans are securing their accounts. Maybe you have to enforce multi-factor authentication. Maybe now you require that they use passkeys, if your platform supports that. Then you have to determine that the human was a member of the project, that they had permissions to do this action, and then determine their intent. Was it malicious or not? No, we skip all of that. We're really focusing here on the build workflow. We don't really care who's pushing the button.
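To make the workload-identity idea concrete, here is a small Python sketch of the kind of information a CI/CD OIDC token carries. It builds a sample token locally and decodes its payload, then keeps only the claims that describe the build rather than the person. The claim names are taken from GitHub's documented Actions OIDC token as best understood here; all values are invented, the `provenance_view` helper and the chosen claim list are assumptions for illustration, and the sketch deliberately does not verify the token signature, which a real verifier such as Fulcio must do against the provider's published keys.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in JWT segments."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def decode_claims(token: str) -> dict:
    """Decode a JWT payload WITHOUT verifying the signature (illustration only)."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


# Claims that describe the build workflow rather than the human who triggered it.
# This selection is an assumption made for the sketch.
PROVENANCE_CLAIMS = ("iss", "repository", "ref", "sha", "workflow_ref", "runner_environment")


def provenance_view(claims: dict) -> dict:
    """Keep only the workload-identity claims relevant to build provenance."""
    return {name: claims[name] for name in PROVENANCE_CLAIMS if name in claims}


# Invented sample values; a real token is issued and signed by the CI/CD provider.
sample_claims = {
    "iss": "https://token.actions.githubusercontent.com",
    "repository": "example-org/example-package",
    "ref": "refs/heads/main",
    "sha": "1" * 40,
    "workflow_ref": "example-org/example-package/.github/workflows/release.yml@refs/heads/main",
    "runner_environment": "github-hosted",
    "actor": "some-maintainer",  # human identity: present in the token, ignored here
}
sample_token = ".".join(
    [
        b64url(json.dumps({"alg": "RS256", "typ": "JWT"}).encode()),
        b64url(json.dumps(sample_claims).encode()),
        "signature-placeholder",
    ]
)

if __name__ == "__main__":
    print(json.dumps(provenance_view(decode_claims(sample_token)), indent=2))
```

Note that the `actor` claim is dropped: the properties of interest are the repository, the commit, and the workflow file, not who pushed the button.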
They still have to authenticate into npm and have the permissions available to publish, but the provenance attestations themselves don't really care about which individual human is running this workflow. That simplified the process considerably for us.
As I mentioned earlier, we're no longer maintaining durable keys long-term. You're generating a short-lived key pair. You're signing the build, and then you're throwing away the private key. This is in contrast to people provisioning a long-lived API token for their project. If someone leaves the project, they have to rotate it. It's just much easier for us to have the entire key lifecycle live within a single build instead of trying to worry about, how long should this key exist? Is it still valid? What if someone leaves the project? That sort of thing.
And then third, and very important, we're distributing trust. The npm registry already was signing packages that were uploaded to it. However, in the event of a compromise, if someone compromised the npm registry sufficiently, they could gain access to the signing service. They could upload malicious packages. They could sign them. That sort of attack would be very difficult to detect. Now that we've distributed this trust between Sigstore Fulcio and npm, even if you compromise one of those systems, you can't forge the provenance. So you could forge provenance certificates if you compromised Fulcio, but then you wouldn't have an authenticated npm user to be able to submit them. Or you could break into npm and put your malicious package in there, but you wouldn't have any of the provenance certificates from Sigstore. So by making use of these two separate systems, we've made it much more difficult for someone to sneak malicious code into these public good instances.
Okay, what did we learn along the way? I already touched on some of these a little bit, but we needed to figure out sort of what our first toehold was in this problem space.
So we narrowed it down to the specific cloud CI/CD provider, GitHub Actions, and the specific package registry, npm. That is not the long-term goal. The long-term goal is that this works for many cloud CI/CD providers and it works for many programming language package managers. To that end, we've been working on taking the idea of build provenance and codifying it. So you can go to the Sigstore Fulcio project. You can look at the docs, and we outline exactly what a cloud CI/CD provider needs to have in order to be considered for providing build provenance. The whole point of this is that over time we're going to add support for more ecosystems. This is going to become a new capability in the broader open source ecosystem.
So again, some of the pushback that we've received, people are like, well, okay, but why do I have to use a cloud CI/CD system? I've always built on my laptop. Well, it's complicated. So my first answer is, you can build on your laptop, but you can't do that with provenance, right? If you are just getting started in open source, if you're just starting a new open source project, you don't have to publish provenance on day one. We are not requiring provenance on the npm registry. However, as you grow, as your project grows, as more people come to rely on it, you should really strongly consider having higher trust signals, just like you would go and update the README, or ensure your project has a license. As you get larger and larger, ensure that you have a security policy; maybe you want to have a vulnerability reporting system. Adding provenance to your project is a step of maturity as you grow. And it's not that I don't trust any of your individual laptops. I'm sure you're all fine people, but in aggregate, we have thousands, tens of thousands, hundreds of thousands, millions of laptops that are participating in build systems for open source. Long term, this isn't a sustainable way to have trust in our ecosystem.
Oh, no.
Okay, the emojis didn't come across, but the second thing we learned is that machine identity is greater than human identity. So imagine a robot with two heart eyes, and then there's a human below it crossing their arms saying no. I touched on this before, but it gets really messy when you start trying to figure out how to strongly authenticate a human and determine their trustworthiness over time. People join projects, people leave projects, their accounts are compromised if they're improperly secured. An important fact about build provenance is that we aren't saying that a package doesn't contain any malicious code. We're saying that we know exactly what the package contains. It's still up to you to go and audit that source code, to audit those build instructions, to ensure that they don't include malicious instructions. However, now that we have this link, we can do that. And then we no longer need to worry about, how do we strongly authenticate users? What if they're in a different geopolitical jurisdiction? We don't care. The source code is what matters. The build instructions are what matter. And having a link to that is giving us this new security capability that didn't exist before.
So the last thing we learned, and an area where we still need to improve our story, is that it's great that we're adding all this extra information, but provenance is really only as valuable as its verification story. So today we extended npm audit, such that when you run this on your project, it will go through your dependency graph, and now it will additionally check if there's any build provenance for those packages. It will ensure the validity of that build provenance and then report on it. Really though, if you think back to that in-toto document, probably you're going to want to write some sort of stronger policy. As an ecosystem, we haven't really defined what this looks like, but maybe you want to ensure that your dependencies are coming from certain organizations.
Maybe you want to ensure that the projects that you're using have a common build workflow, or at least known build workflows that you've had the opportunity to audit. So today our story on how we verify provenance is a little weak, but this is an area that we're looking to do further work in, so that we're not just publishing additional information, but we're actually validating it before the deploy. That is the goal here: to ensure that the code that we're deploying is more secure.
Okay, I snuck in a fourth question. How can you get involved? I'm so glad you asked. You might remember this slide. So again, there's a lot going on here. If you own a cloud CI/CD system, we would love to talk to you. I believe that I can say publicly, because the Fulcio repo is public, that our friends at GitLab just landed some support in Fulcio, I think 1.4, maybe 1.3. So stay tuned for future updates about other platforms that might support provenance. Of course, if you own a package manager for a programming language ecosystem or otherwise, we would love to talk to you about how to add in this capability.
But speaking of bugged slides, the OpenSSF was supposed to be in the middle of this. So another way to participate, of course, is to participate in these conversations that are happening out in the open. The OpenSSF has a working group, Securing Software Repos. I find the name a bit confusing, but by repos they mean programming language package managers. And again, these specifications are happening in the SLSA organization, in the in-toto organization. These groups have public community meetings. And then last but absolutely not least is the Sigstore project, of course.
So we reached our 1.0 release a few months ago. So we're done. No, we're nowhere near done. There's quite a bit that we want to do to extend this. As the SLSA specification evolves, they're very interested in not just build properties, but also source control properties.
Think about: were all your commits reviewed, were all your commits signed? You know, we would love to figure out how to get that into the in-toto specification as well, and support those sorts of properties. There's a ton of work to do here. And we would love your help. And just a final thanks to all these projects for all their support in helping us launch this feature.
Yes. I think there's a microphone next to the projector. And then what I learned is near the bottom, there's an on-off switch that might help.
All right. So you mentioned that the provenance isn't validating the contents. And you had your slide about saying, like, why can't I build on my laptop? Well, it's not validating that either, because npm publish is just uploading a tarball. So I can build on my laptop, upload the tarball from anywhere, and then publish with provenance. And it works just fine. So that tells me that the only thing provenance is determining is where I published from. And so I'm curious if you could elaborate on the value of knowing where I published from, since the GitHub Actions logs expire or can be deleted, like, that's not really useful for investigations. Are you aware of any npm incidents that provenance would have prevented?
Okay, there's a lot in that question. So let me start with why you can't build on your laptop. So there are a couple of different modes that Fulcio supports. One is users authenticating from their laptop through, like, an OIDC web login flow. This is really popular for gitsign, where people want to do commit signing without managing their keys for a long period of time. The providers for build provenance are not those, though. It's only cloud CI/CD systems. And the reason for this is because we don't have a way today to attest to the security of your laptop, but we do have the ability to say, okay, this cloud CI/CD system exists. It has these security properties. We know it's using a fresh VM per build.
We know that these build instructions are what's executing. With a laptop, we don't have those guarantees.
Right. What I mean is I could make a publish workflow on workflow_dispatch. I could dispatch it with a tarball URL, and then in the GitHub Action, it would publish that tarball, whose contents come from anywhere I choose, and that would be marked correctly as having provenance on npmjs.com. So I can build from anywhere I like, currently, as long as the publish is done in a GitHub Action. So I'm asking, since that is literally the only thing it's guaranteeing, where the publish happens, what's the value there, and what incidents could that have prevented?
Yeah, so then when you went to the provenance for that package, you would see, when you followed the link to the build file, that it was just uploading some tarball and saying that, yep, here's my log.
The log of that expires after three months, and I could manually delete it, so that link wouldn't work.
The build logs would be deleted. The build script would not. You could manually delete it from the repository, and then we would remove the provenance from the website.
But on GitHub Actions, the log expires after 90 days?
That's correct.
So does the provenance disappear after 90 days?
No, you're differentiating between the build logs and a link to the build scripts. The build script lives in your software repo.
Right, so it's a file that you commit, yeah.
In terms of incidents that this would have prevented, I do not work on the npm team full-time. And even if I did, I don't think we talk publicly much about our incidents. We did recently publish a threats and mitigations page, and I was going to include a slide to that effect. Broadly speaking, what we're looking to do here is make it easier to detect that malicious behavior has been added. And that is one of the common threat vectors for npm: people are embedding malicious code, either in their build instructions or in the package itself.
But I don't have a specific instance to refer to.
You had the slide up with, do you want to go back to the flow? Yeah, and so I'm familiar with the Fulcio and Rekor flow in other attestation situations. I see the provenance attestation getting uploaded to Rekor and also added to the registry. Is there a reason for the duplication of that attestation?
That's a great question.
Couldn't you just get away with having it stored next to the package?
Yes. So I think this is kind of glossing over the fact that there are two separate artifacts here. There's the signing certificate from Fulcio, which has the properties burned into it and is an X.509 certificate that you can validate. And then there's the attestation document, which also has the properties in it. But offhand, I don't know the way that that's validated. So I believe what's happening is that it's cross-referencing the properties in the signing certificate with the attestation document to ensure that they are valid. So you kind of need to keep the signing certificate around in order to ensure that the provenance attestations are valid. But then when you're interacting with it, not everyone wants to parse the X.509 certificate, so it's a little bit easier to interact with the in-toto document. I thought there were maybe some other questions behind you.
So in the unlikely, well, the undesirable event that the cloud provider has a security issue, do you have the knowledge to be able to invalidate that period of time's provenances and say, actually, don't trust these ones?
Yes. We do have the ability to invalidate provenance for a specific package if there was some known security issue in that period of time. So we don't have the ability to add provenance if it doesn't exist, but we do have the ability to take it away. That's certainly an ability we would not use lightly, because it has the potential to impact everyone's, sort of, supply chain security.
But in the provenance attestation, you know the cloud provider that was used and the time period that that build was done.
Yeah, that's correct.
Okay, cool.
Okay, I'll be here. Thanks, everyone.