I'm Zach. I'm a research scientist at Chainguard. This is Marina. Hi, I'm a PhD candidate at NYU. And today we are here to rant about why not all that's signed is secure. The subtitle of this talk is "Verify the Right Way with TUF and Sigstore." We're going to talk about some cool existing technologies, we're going to talk about how in theory they can be used together really nicely, and then we are going to ask you all in the audience to help make that integration smooth and easy to adopt for everyone.

So the problem we're dealing with is that people love to sign things. People love to say that once you've signed things, okay, they're secure, right? And the short answer, as you might infer from a title that's phrased as a question, is no: just because something is signed does not mean it is secure. In particular, signatures are only going to be helpful when you verify them correctly, and what "correctly" means depends entirely on your use case. So we're going to work through a couple of different use cases. But one really obvious anti-pattern I have seen in the wild is people verifying that something is signed without checking at all who signed it. You can imagine how that could go poorly.

So what are we going to talk about today to actually fix that? We want to enable flexible and smart policy enforcement. Policy is how you check those signatures, and when you check those signatures correctly, you're going to get a lot of value out of them. We say flexible because different settings have different needs. You're going to check the signatures on something you pull from an open source package repository in a different manner than you would if you're a government agency checking signatures on internally developed software, right? Different parties are signing those, and the way you figure out who's supposed to be signing what is going to be different. But we're going to use existing secure solutions that have worked through all the kinks. They're going to protect you against sorts of attacks that maybe you didn't even know were going to be a problem. And by using these off-the-shelf solutions, you're going to get protection in a way that something cobbled together without that theory would lack.

And then we're going to work through a couple of different examples. We'll work through the open source package repository setting: how are you signing in that environment, especially if the signatures are coming from people all around the world who you're not going to be able to ship public keys for? And we're going to talk about internal container registries inside an organization. From that, I think you're going to be able to extrapolate to a whole bunch of different settings with, again, the same building blocks.

So we have our obligatory collage-of-headlines slide. If you've been at this conference for the past two days, odds are pretty good that you've heard someone say the words "software supply chain security." It's in the news a lot. But fundamentally, all of these problems boil down to someone trying to run some software, and it's not the software they thought they were running, and they get an unpleasant surprise for it. So we're getting interest from the government, we're getting interest from the news media when attacks happen, we're getting academic interest. These things are important.
And so the next question is: does signing software, the subject of this talk, actually help? Armchair opinionists on sites with orange headers like to say, oh, just sign the software, and then it'll be secure; you're having all these vulnerabilities because you're not signing the software. And I think there's a little more nuance we need than that. Signing software is part of the solution. If you download software from the right place but it's not what the author intended, those are the kinds of attacks that signing software can help with. So compromised accounts, compromised build processes, even compromises of the package repositories themselves: we can deal with those by signing software.

It's not going to stop all attacks. It's not going to stop ordinary vulnerabilities, where someone published software honestly and just made a mistake; how would a signature help with that? It's not going to defend against underhanded contributions, where I send a PR to a project that looks safe but actually introduces a really bad vulnerability. It's not going to defend against me threatening an open source maintainer with a hammer until they publish a Bitcoin miner in place of their npm package. But in situations where you know who's supposed to be signing a package, we are going to be able to help with the kinds of compromise mentioned on the previous slide. And that's a really big "if." We're going to come back to that later; it's doing a lot of heavy lifting there.

So, Sigstore, from the talk title: really exciting technology. We're not going to have time, unfortunately, to do a deep dive into how it works. But the high-level overview is that it enables really easy signing of software, for containers and for other software as well, and it enables workflows without key management. For a long time, we've known that people struggle to manage long-lived signing keys. I'm sure some of the folks in this room have a GPG key on a hard drive in a basement or a garage that is unencrypted and that they don't know how to access anymore. Key management is really, really hard, and this has been validated in the academic literature; there's a lot of usability research on this. For some people it works great, and we're not coming for your guns, and we're not coming for your GPG keys either. But we do want to enable workflows for folks who don't want to be managing keys but are comfortable managing single sign-on identities.

I have, at last count, eight GPG keys that I've used at various points in my life that I no longer have access to. My Gmail account I made when I was, well, I'm not going to date myself, but I made it at an earlier stage in my life, many years ago, and it has remained compromise-free since then. I've put all my eggs in one basket, and then I watch that basket really, really carefully. And we're not advocating for single points of failure; we're going to talk about policies later on that prevent bottlenecking too much on a single point of failure. But we are saying a lot of people have a much easier time managing a single sign-on identity, like a Google account, a Microsoft account, a GitHub account, than they do a GPG key pair. Similarly, when you're talking about machine identity, say I'm running a job in GitHub Actions or GitLab pipelines: you don't want to link that build job to a particular long-lived key pair, because if that key pair ever leaks, all hell breaks loose, basically, looking backwards at everything it ever signed.
And so Sigstore has a nice feature where I can be running a job in GitHub Actions and get a certificate issued that identifies the build job as the build job it's running. So it can say, I'm running the official npm build job, and then sign off on the software artifacts, and now you're convinced that the artifact produced is the output of this known job. So: identity, much easier for humans, much easier for machines.

And then finally, Sigstore brings in an element of transparency. What do we mean by transparency? It means that we post all of the activity in a big public log so we can detect misbehavior. So if Sigstore itself is misbehaving, we'll be able to notice that. If Sigstore is issuing certificates in my name to someone else, I can get a text message alert, if I set up a nerdy script to do that, just like the text message you get when someone logs into your Amazon account. So we're doing some centralization here, which is scary, especially to a lot of people who are used to the 90s cypherpunk model of crypto where everything is decentralized. But to mitigate the impact of that, we're asking the central parties to be fully transparent in everything they do, so we can double-check them.

So how does it work? Again, we're going to skim; this is not the point of this talk, and I would love to talk your ear off afterwards about the details of these components. But there are a few components. One is Fulcio, which is a certificate authority, and it issues short-lived certificates in exchange for OIDC credentials. So think "log in with Facebook." You basically log into this certificate authority and it gives you a certificate that lasts only about ten minutes. Then there's a log called Rekor, which gives you a timestamp on that signature, so you know it happened during that short validity period, and records some metadata about the signature itself. And finally there's a client called Cosign, which is really useful in a cloud native context for sticking those signatures in OCI registries and tying these components together.

Okay, so you've got your Sigstore, you're doing the signing. How does verification work? One way you could imagine doing this: you're a user, you go to your container registry and you say, give me the latest nginx image. It gives you back nginx at a particular hash, and it also gives you a signature. You verify that signature and life is good, right? Well, what if the container registry is evil? Then they can give you a signature, and the signature could come from someone with the, I guess, apt name "Evil Hacker," and you verify it just the same. That's not how we want to be verifying these signatures.

So we need a policy of some sort. A verification policy helps us interpret the signatures. It answers the question: what do I mean when I sign something? Again, people say "sign your container image," but what does it mean to sign the container image? Did you dump all the layers in a hex dump and check each byte? Probably not. What you're really claiming when you sign something is much more specific than that. So being explicit about what you mean when you sign something, making a claim instead of having the signature just be over an artifact, helps us compose policies that make a little more sense. We'll give some examples of that.
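To make that concrete, here's a minimal sketch in Python of the difference between "is it signed?" and "is it signed by who I expect?". Everything here is illustrative, not from any real library: SignedArtifact, verify_signature, and TRUSTED_SIGNERS are all made-up names standing in for whatever your tooling provides.

```python
from dataclasses import dataclass

@dataclass
class SignedArtifact:
    digest: str           # e.g. the sha256 of the image
    signature: bytes
    signer_identity: str  # e.g. an email or workload identity from the certificate

def verify_signature(artifact: SignedArtifact) -> bool:
    """Stand-in for the cryptographic check done by your signing library."""
    raise NotImplementedError  # hypothetical helper

# Anti-pattern: "it's signed, ship it" -- any signer at all is accepted.
def verify_naive(artifact: SignedArtifact) -> bool:
    return verify_signature(artifact)

# Better: the verifier, not the registry, decides who may sign what.
TRUSTED_SIGNERS = {"nginx": {"release-team@nginx.example"}}

def verify_with_policy(name: str, artifact: SignedArtifact) -> bool:
    expected = TRUSTED_SIGNERS.get(name, set())
    return verify_signature(artifact) and artifact.signer_identity in expected
```

The point is that the set of expected signers comes from the verifier's own policy, not from whoever served you the artifact.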
Yeah, so the simplest policy is exactly what I told you not to do: the idea of a universal signer. There's one party you trust, and if they sign a particular binary, then we trust the binary. There are more sophisticated things we can check in a verification policy. One is the idea of ownership: package P came from Alice, who I happen to know, and again, we'll get into how we know that, is the maintainer of React or whatever. She made package P, she signs it, and she asserts that she owns it and that the source is good. We can also make assertions about build integrity: machine M built a particular artifact. But what's extra powerful is when you combine both of these notions together. The idea is that your trusted build machine signs a claim like "I built package P from source code S," and Alice says "I audited source code S," and if you trust Alice, these two things together give you a much more powerful claim than either of them on its own (there's a sketch of this combination below).

So policies are good. How do you get a policy? You may recognize some of these figures from the previous slide, and we're going to go through a pretty similar exercise. Okay, so I request the latest nginx image. It gives me a signature and the image at a particular digest. Then I go back to the container registry and say, by the way, who's supposed to sign that container? And the container registry says, oh, Marina Moore, and then I verify it and life is good. That can go wrong in much the same way it went wrong last time. Same thing, except now the signature is coming from the aptly named Evil Hacker. I ask, by the way, who's supposed to sign that container? And the container registry says, oh, Evil Hacker, of course. And then I verify, boom, check, and life is not good in this situation. So we have to use some slightly more sophisticated technology to help us with that, and that's where I'm going to turn it over to Marina.

Great, thanks. So yeah, as those previous examples showed, you have to know what you're running and the context in which you're running it, and this context tells you who it is that's supposed to sign each thing. And you have to be able to get this context from a secure source, a source that's separate from where you're getting the image and the signature, so that it actually provides an additional layer of verification. So I'm going to talk a bit about this project called The Update Framework, or TUF, which really focuses on this idea of secure distribution. It can be used to securely distribute not just signatures of artifacts, but also who is supposed to sign those artifacts, and it can express these kinds of trust relationships. And this framework has the additional property of compromise resilience: even if any single key, or even a single repository in the system, is compromised, there are ways to recover from or prevent these attacks.
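Here's a minimal sketch of that build-plus-audit combination, with made-up claim types rather than any real attestation format; it assumes each claim's signature has already been verified, as in the earlier sketch.

```python
from dataclasses import dataclass

@dataclass
class BuildClaim:
    builder: str  # identity of the build machine
    package: str
    source: str   # digest of the source it built from

@dataclass
class AuditClaim:
    auditor: str  # identity of the human reviewer
    source: str   # digest of the source they audited

TRUSTED_BUILDERS = {"machine-M"}
TRUSTED_AUDITORS = {"alice@example.org"}

def release_is_trusted(pkg: str, builds: list[BuildClaim],
                       audits: list[AuditClaim]) -> bool:
    # Trusted only if a trusted machine built the package from some
    # source S *and* a trusted auditor signed off on that same S.
    for b in builds:
        if b.package == pkg and b.builder in TRUSTED_BUILDERS:
            if any(a.source == b.source and a.auditor in TRUSTED_AUDITORS
                   for a in audits):
                return True
    return False
```

Neither claim alone says the artifact is safe to run; the policy value comes from joining them on the same source digest.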
And we're also going to very briefly cover in-toto, which covers those combinations and lets you talk about more detailed relationships across your entire supply chain, not just the distribution step: going back from distribution, who actually built your software, who tested it, how do we know who's supposed to build it and who's supposed to test it? It puts all of that together in a single place where you can check that all the steps in the supply chain were done by the proper people, and then distribute information about who's supposed to sign those in a secure fashion.

As with Sigstore, unfortunately, this is also not an in-depth TUF talk, so a lot of the details about how the things we're going to talk about work will not be in this presentation. We have lots of great content on the TUF website, including the talk listed there, if you're interested in the low-level details of how TUF achieves the properties we're going to discuss. We're going to focus today just on what those principles are.

TUF is a CNCF graduated project. It was originally based on peer-reviewed academic research by my advisor, Justin Cappos. It's used in production by organizations like Fuchsia, Datadog, various automotive OEMs, and many others. And it's based on a few principles to achieve these security properties. These principles include separation of responsibility, which is how we achieve compromise resilience: if any one thing is compromised, it only compromises the particular responsibilities of that entity, and no entity has too much responsibility. There's also multi-signature trust, which we'll get into in a minute, and explicit and implicit revocation, to make sure that any party in the system can be revoked if they're compromised, and of course secure recovery when a compromise happens.

So the first property we'll talk about is delegations. You can have a centralized entity that's kind of the root of trust for the system, if you will. In the case of an organization like PyPI, the Python Package Index, the thing that hosts the Python packages downloaded by pip and other installers, it controls the space of packages uploaded to PyPI, and it can say: okay, I know that the NumPy package is owned by Alice, and Alice should be the only person uploading the NumPy package. So Alice has permission specifically for the NumPy package, but no other package on PyPI. The SciPy package, on the other hand, Bob is trusted to upload; Bob is trusted for this package and nothing else. And you can express more complicated relationships too, of course: multiple packages per person, separate namespaces, all those kinds of things. (There's a small sketch of delegations below.)

TUF also includes explicit revocation. Say in this example we have OpenSSL version 1.0.1, which we think is totally fine, and then something changes: we learn that there are some issues with this version of the software. We can respond to this vulnerability by explicitly revoking trust in that version of the package, while leaving trust in future versions intact. This is done using TUF's metadata system, and it ensures that when a version of a package turns out to have a vulnerability, or you learn more about it, it can be explicitly revoked and it won't be used anymore.
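Here's the delegation idea as a sketch. The structure mirrors TUF's delegations conceptually, but the data layout and names are illustrative, not the python-tuf API.

```python
import fnmatch

# The repository's root of trust delegates specific package paths to
# specific signers; a key is only meaningful inside its own delegation.
DELEGATIONS = [
    ("numpy/*", {"alice-key-id"}),
    ("scipy/*", {"bob-key-id"}),
]

def keys_trusted_for(package_path: str) -> set[str]:
    for pattern, keyids in DELEGATIONS:
        if fnmatch.fnmatch(package_path, pattern):
            return keyids  # first matching delegation wins
    return set()  # no delegation: nobody is trusted for this path

# Bob's signature on a NumPy artifact is rejected even if it's
# cryptographically valid -- his key isn't delegated numpy/*.
assert "bob-key-id" not in keys_trusted_for("numpy/numpy-1.26.tar.gz")
assert "bob-key-id" in keys_trusted_for("scipy/scipy-1.11.tar.gz")
```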
And every client will know that they have a version of the index that's up to date and that lists those outdated packages as removed.

We also have implicit revocation. No key is good forever, so every key in the system has a lifetime. If it's lost or stolen, even without you knowing, it still has an eventual expiration, so there's a limit on the time in which an attacker can take advantage of a lost key. This helps with undetected compromises and makes sure that people keep track of the keys they have.

We also have signature thresholds in TUF. This is one of the really powerful ways we achieve compromise resilience: you can require that, for one package, multiple different parties attest that they are okay with this particular version, so you have several different people signing the same package. For example, you can have a developer team and a security team: the developer team wrote the package, the security team checked it, therefore we trust it, and we won't trust it if either of those signatures is missing.

And finally, we have one remaining issue with TUF, which is how we loop this back into our TUF-and-Sigstore combination: how do you detect if a key is used by an attacker? If someone manages to get your key without your knowledge, how do you know that key was used on the TUF repository? And are you seeing the same signatures on these packages as everybody else is seeing? This is where you really need the property of auditability. And as you guessed, this is where TUF and Sigstore can come together. This is actually already supported today: you can upload TUF metadata to the Rekor transparency log in Sigstore to get this auditability, where you know not just that this is the TUF metadata you're seeing, but that it's the same TUF metadata everybody else is seeing for this particular package. This lets any signer in the system audit every time their key is used. And you get global consistency: any client knows they're seeing the same version of the TUF repository as every other user of the system, even in the event of a key compromise or any other change to the system.

So now we're going to talk a little bit about what this looks like in practice and what it actually gives you, to ground this in some more concrete things rather than high-level security ideas. This particular example is still in the realm of design docs and good ideas, but we'll get to some others that are a little farther along in a minute. Say you're at a company that has a container registry for all the images you make and give to your customers. With this combination of TUF metadata stored in Sigstore, and all of that metadata uploaded to an OCI registry, you can create a fixed policy, because as an internal company you know that your dev team should sign every image. So you can say, okay, we're going to create TUF metadata that says this is the team trusted for these images, with the particular keys they've made. And we also know that this image should be built by GitHub Actions, so that can be an additional signature on the image: you know the workload identity for that build job, and you can say that's the identity that should be signing those images. (A sketch of a policy like this follows.)
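A sketch of that fixed internal-registry policy, requiring both signatures, which is really a threshold across two distinct roles. The field names and the workflow identity string are made up for illustration, and the sketch assumes each signature was already cryptographically verified.

```python
REQUIRED_DEV_KEYS = {"dev-team-key-1", "dev-team-key-2"}

# Hypothetical CI identity, written in the style of a CI workflow ref:
REQUIRED_CI_IDENTITY = (
    "https://github.com/example-org/images/"
    ".github/workflows/build.yml@refs/heads/main"
)

def admit_image(signatures: list[dict]) -> bool:
    # Each entry describes one already-verified signature on the image.
    has_dev = any(s.get("keyid") in REQUIRED_DEV_KEYS for s in signatures)
    has_ci = any(s.get("cert_identity") == REQUIRED_CI_IDENTITY
                 for s in signatures)
    return has_dev and has_ci  # threshold: both parties must have signed
```

An admission controller would evaluate something like this before letting an image run, with the trusted key set and CI identity delivered via TUF metadata so they can be rotated and revoked.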
And for free, because you're using TUF, you get the revocation property: if any of your developers lose their key, if you change your GitHub Actions workload identity, or if you switch from GitHub Actions to some other service entirely, you can revoke and change the keys. You also get freshness, so you can make sure that everyone is seeing the most up-to-date set of images at any given time. And you can enforce all of this as policy in something like a Kubernetes admission controller or another intake system.

Another idea for combining these systems, and this one is a little further along, is the public package repository: someone like PyPI, npm, RubyGems, all those folks. Unlike the previous slide, where you knew your dev team and exactly who was supposed to be creating the images, with these public package repositories anyone on the internet can upload to them, so the policy looks a little different. You can have a default policy created by the package repository, where they just map packages to the person who holds the account on PyPI or whatever the service is, and say, I know this account is tied to this package. So we can create a policy that says this package has to come from the account associated with it. And another addition you can use in this case: you can use those Sigstore signatures, the short-lived, easy-to-manage keys we talked about before, in place of a long-term key pair within the TUF system, to make things easier for the developers.

So you have this default policy, but because any random person on the internet can create an account on PyPI and then upload packages, you might want an option for a more paranoid policy. Security vendors, companies, or anybody else can create a separate policy for more paranoid users that allows only particular trusted uploaders, and even creates that map between those trusted uploaders and the packages they're known to maintain. So you can have a subset of everything on PyPI that's part of your trusted policy, and you can enforce this in your package manager, in pip or anything else, with layers on top of that so you can use your specific allowlist. (There's a sketch of this layered policy below.) And for free, as before, you get the revocation, curation, and freshness that come with TUF. In addition, you get protection from repository compromise, because all of these signatures are made by developers and not by the package repository, which means that even if the package repository is compromised, users won't install a malicious package, because it's signed by the wrong people.

And there are all kinds of other places you can use this combination of TUF and Sigstore. Think of app stores, which have a particular use case in that there's a smaller, known set of folks who hold the accounts to be trusted developers, which is a little different from the public package repositories.
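Here's the layered policy as a sketch, with made-up packages and accounts: the repository publishes the default package-to-account map, and a more paranoid consumer overlays their own curated allowlist on top.

```python
# Default policy, published by the repository: which account owns what.
DEFAULT_POLICY = {
    "numpy": "alice@example.org",
    "leftpad": "rando@example.org",
}

# Paranoid overlay, curated by a security team or vendor you trust.
PARANOID_ALLOWLIST = {"numpy"}

def allowed(package: str, signer: str, paranoid: bool = False) -> bool:
    if DEFAULT_POLICY.get(package) != signer:
        return False  # signed by someone other than the owning account
    if paranoid and package not in PARANOID_ALLOWLIST:
        return False  # the curated subset doesn't vouch for this package
    return True
```

The default layer catches packages signed by the wrong account even if the repository itself is compromised; the paranoid layer additionally narrows the world to packages someone you trust has curated.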
You could have more curated package repositories that carry additional signatures from security teams and really exclude those other packages from just anyone on the internet: these are the packages we actually trust, and our policy will only include those trusted packages. And finally, there's the case of a single-product updater, like your operating system updater or any application updater, which can mitigate the compromise of a distribution server. This kind of man-in-the-middle, which I think got autocorrected on the slide there. No, sorry, that's MiMi, which was a chat app. One of the headlines was about a chat app popular in China called MiMi: they had their CDN basically owned, and the attacker substituted an evil version of MiMi in place of the normal chat app. So it's an example of a man-in-the-middle through an owned CDN, but anyway. Yes. So yeah, that's some other places we can use this combination.

And now there's a lot more to do, a lot more future work, and this is where we're hoping all of you can come in. There are some additional pieces here that we need to finish solving and make better, and we really want to improve the usability of this so that this is the obvious way to do it: doing updates and distribution the correct way should be the easiest way, so that all of our software can be more secure. Things in this category include revocation and things that end up on a transparency log: how do those entries get deleted or changed? Scalability: there are a lot of pieces of both TUF and Sigstore that have interesting scalability challenges, especially as you look at these really huge public package repositories. Some of the particular roles in TUF, like the snapshot role, which I didn't go over, keep a list of every package, which scales linearly, so we're looking for ways to do that more cleverly.

There's in-band key rotation. The delegation structure is great for saying this particular package should be signed by this person, but what if that person leaves the team or wants to hand it off to some other developer? How can we do that in-band, without having to involve volunteer repository administrators who don't have time to respond every time someone wants to change their key? (There's a sketch of the handoff idea below.) What do we do about quantum computers breaking our signing algorithms, and moving to post-quantum algorithms? As many of you may be aware, when quantum computers become a reality, today's signing algorithms are one of the critical pieces of internet infrastructure that will be broken. So how do we design these systems so that, if and when that happens, we can quickly change every signing algorithm and keep everything secure? And then, of course, there's shifting left: signing everything from the source all the way to distribution. Right now a lot of this is focused on distribution; how do we incorporate more with projects like in-toto and others to get this all the way back to the source? And then there's the whole usability piece: really simplifying the setup of these TUF repositories. We have a couple of really cool ideas about how to do that, including creating a kind of public-good shared TUF root that could be used especially by all of these public software repositories, so they don't have to have the infrastructure and the teams to manage this themselves: one centralized place for all of the public open source things on the internet.
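One way the in-band handoff could look, mirroring how TUF already rotates root keys (the new key set is signed by the old one). This sketch is a design idea under those assumptions, not an existing spec feature; all names are illustrative.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rotation:
    old_keyid: str
    new_keyid: str
    signature: bytes  # made with the OLD key over (old_keyid, new_keyid)

def apply_rotation(trusted: set[str], rot: Rotation,
                   verify: Callable[[Rotation], bool]) -> set[str]:
    # Only a currently trusted key can hand its delegation to a successor,
    # so no repository administrator needs to be in the loop.
    if rot.old_keyid in trusted and verify(rot):
        return (trusted - {rot.old_keyid}) | {rot.new_keyid}
    return trusted
```

The open questions are around lost keys (where the old key can't sign anything) and making sure a stolen key can't silently rotate itself to an attacker, which is where the transparency log's auditability comes back in.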
And of course in-toto, which I mentioned before: integrating the security of the entire software supply chain into this is going to be a key piece. If you want to get involved, we would love to hear from you. These are links to various projects within TUF and Sigstore where you can get involved, including Slack channels to chat with us. Also, if you are on the end-user side rather than the implementation side, and would like to see how this can work in your system, we would love to chat with you about how we can make that happen. So yeah, and there's a QR code for feedback. We'd love to take any questions, thanks.

All right, yeah, we did leave, I think, almost ten minutes for questions, so we'll take those. I think I saw your hand first. Go for it.

Yeah, so the question was, have there been discussions with Linux OS package repositories, so Apt, RPM, and so on. The short answer is, it hasn't been an area of focus, largely because I think the problems are much bigger for community-operated package repositories, where you have developers coming and going. And so while I think there's a fair argument that package signing inside Linux distributions has a number of issues, they're in a much better position at the moment than, say, PyPI, which has had GPG signing enabled for 10, 15 years with adoption in the low single digits. Whereas for Debian, you actually do see every package signed. Again, there are some issues, and I think there are things we could learn from the TUF project and bring into those implementations, but for the most part they're in a better situation because they have a small number of trusted developers who can manage keys. So the revocation story could be better, especially with some of this implicit revocation and so on, but it hasn't been the most active area of focus. It's certainly something on the TUF project's radar, and it's a place we'd love to have conversations. I'll take a moment to plug the OpenSSF: there is a working group called Securing Software Repositories, which has been convening a number of folks from various package repositories. We have the best representation from language-ecosystem package repositories like PyPI, npm, and so on, but we have some others: there's someone from Gentoo, someone from Homebrew. And so as we scale that up, we'd love to hear more. So I guess the short answer is, if you have a friendly Debian maintainer you'd like to send to that working group, I think we could benefit a lot from having more voices in that room.

I think I saw a hand over here. Yes. Yeah, so there's a couple of reasons. So, oh yeah, sorry. The question is about deleting transactions from the Rekor log, and there are a couple of use cases here. Some have to do with privacy concerns or other reasons for deletion. And the other part of the question is, if a package has a known vulnerability, what do you do to make it clear that this entry in the log is not the one you want to be using, and that there's a later entry that's the better one? So there are kind of two angles on that.

Sure, so the question is about the scalability issues and whether we can highlight a couple of those. Yeah, so I didn't want to get too much into the weeds, but I guess I can do that now, about the TUF scalability side.
So, TUF has four different, what we call roles, that provide different specific things as part of that separation of responsibilities. One of those roles is responsible for the consistency of the TUF repository. We call it the snapshot role, because it has a snapshot of the current state of signatures and artifacts in the repository, and it makes sure that any signature that changes, anything that's removed, is then incorporated in this snapshot. However, the way that's done in TUF today, and there are proposals to change this, of course, is that you list every metadata file and version number in the repository. If you have a lot of artifacts, something in the millions of artifacts in your repository, that list can actually get fairly long and become noticeable when it comes to downloading things. And so we're looking at more clever ways to store that than a linear list, like trees, to make it logarithmic instead. But there are some security properties we want to maintain while doing that, which is the challenge. I don't know if there are any others you want to talk about. Yeah, that's the big one that comes to mind. And grab us offline if you want to hear the gory details.

When it comes to Sigstore scalability concerns, some of those are around revocation, and this is something we see in the web PKI: revocation is actually quite tricky because the scale is so massive. When you're maintaining a certificate revocation list for the entire web, that winds up getting too big to distribute to all of the interested clients. And similarly in Sigstore, if we have a revocation solution that's bottlenecked on that central party, then the scalability problems of a revocation list become pretty immense. There are other concerns, like having one log for every sort of related universe: one log that covers both npm uploads and PyPI uploads, and maybe down the road Debian and every OCI registry in the world. You start to run into challenges there, so can we maintain some separate namespaces and have them roll up into one log? Because you benefit from scale in security here: having one central place means there's only one thing you're going to have to monitor. So that scale is good, but it comes with challenges, of course, as to whether that log is going to be able to handle the volume of traffic it experiences. So far it's been fine, and there are some plans to shard that out, and they're going along quite well. But if Sigstore is as successful a project as we hope it is, then that's going to become a big issue pretty soon.

That's a good question. So the question is whether there's a plan to create a public-good TUF instance. This is definitely something we are interested in. I think we even have some potential interest from folks who could actually host it, but I don't want to promise anything on behalf of folks hosting it until it's a little bit farther along. All right, and that's probably going to have to be the last question. All right, thank you everyone.