Welcome everybody, and great to be around real people again. I can't remember the last time I went to a conference, and I've met a lot of folks here that I've worked with for one or two years and it's the first time we've met in person. I think that's probably common for quite a few of you as well. We're going to be delving into the software supply chain. A quick introduction: I work at Red Hat, in the office of the CTO, and I'm the engineering lead for a security team. We have about seven engineers who work predominantly on upstream open source projects in the security domain. All of the vulnerabilities that are found and reported in Kubernetes come to myself and several other people in that team, and we then triage those vulnerabilities. Some of you might have heard of the OpenSSF, the Open Source Security Foundation; I'm on the technical advisory council there. And last of all, I helped found a project called Sigstore, which has started to get a fair amount of recognition recently. Okay, so in this talk we're going to look at the current state of play, and then at what's next: what's coming along the pipeline, and what we anticipate as exciting progressions in this ecosystem. It's worth doing a quick round-up of what a supply chain is. Most of you are experts here, but this gets everybody onto the same page, so to say. A supply chain is a mix of humans and machines, and essentially we're moving from the left to the right here.
We'll typically have a developer. They'll be working on some code, they'll commit that code, and they'll push a pull request to something like GitHub or GitLab, some sort of version control system. Then other humans will review that code. They'll make assumptions about the developer's identity; they'll think, well, that's Bob or Jane, I know them, I can see they've made a pull request. They'll review that pull request and approve it. At the same time, machines start to get involved and run integration tests, unit tests, and various automated actions. There might be staging environments and, like I say, integration tests that kick off. From there, some sort of artifact might be generated: a container image, a tarball, a package of some sort. That artifact is then perhaps pushed out to other systems; an OCI registry would be a typical example. Then we have this middle juncture where we have a package, an artifact, and it could in fact be a library, so somebody else pulls that code into their own CI flow. And then eventually you have your end users, and again we're looking at a mix of people and machines here. We obviously have humans that are going to use software, but there are lots of different vertical industries that now use open source software: energy, telco, military, public sector, government systems. So there's a very wide pool of users and machines that interact with this supply chain. Now, it's not news to you, but one of the problems with these supply chains is that they're very susceptible to attacks. There are a lot of attacks, so I'm not going to go into the details, the meat of what these attacks consist of, but they occur throughout the chain.
So humans: their identity can be compromised. An individual might have a pair of cryptographic keys stolen somehow, or their identity credentials stolen. Somebody maybe gets hold of their single sign-on access, which allows them to log into their GitHub or GitLab account, and so on. So humans can be compromised, and are compromised quite often. Maintainer accounts are another thing that are commonly taken over. Build systems can be compromised. I think the interesting thing with open source is that when you look at a software project, its build flows are completely open for anybody to read. You can go and look at the YAML, look at machine IP addresses in there, or any sort of detail that would allow somebody to perform an attack. Typosquatting is something that happens very regularly, especially with open source packages: somebody will change a single letter to make a package look like a common library, somebody makes a typo when they're listing their dependencies, and then they're compromised. Another one, which we've started to see recently and can all relate to, is just burned-out developers. You're maintaining a project, it's often a thankless task, and sometimes people get to the edge where they just act out and do something to their own package. A lot of the time it's relatively harmless stuff, like spamming out a message on the terminal to make a point, but it does happen, and we're starting to see it happen more and more. We have open source developers with projects that are critical and widely consumed by other projects, and they get burned out and stressed. So that's something that can happen quite a lot.
Vulnerabilities not being patched, systems not being updated, and then we get into things like replay attacks, which package managers can be quite prone to. So there's a lot more here, and if you've been following the news you'll know that more and more attacks are coming out every week. Log4j was one of the big ones recently. So, just a snippet of the increase we're seeing: a few people have started to use this graphic recently, and it shows a 650% increase. I do remember the report from the previous year; I can't remember the actual amount, but there's been a seismic rise. And this is centred on open source projects. So we've got a clear problem here. A lot of the time security people like to be a bit of a scaremonger and make things seem worse than they are, but there is a real problem here. Just to further my point, this is a topic being discussed in the White House. I think it was just last week there was a meeting at the White House, and there's this executive order around software supply chains, zero trust, and so forth. So one of the things that I and several others started to do is look at the gaps. What's the low-hanging fruit that we can try to fix and resolve in a software supply chain? One gap that was very clear was a significant lack of signing adoption. Essentially we're talking about somebody having some cryptographic keys and signing a given object: some code, an artifact, a commit, a container image. Very, very low adoption. There's no credible, trustworthy breadcrumb trail of provenance. So when you're getting something, do you really know who you're getting it from? Is it who you expect it to be?
And the other gap is that quite a lot of the time, when you do have something that gives you some sort of provenance, the thing that created it is itself software. So you're trusting software to tell you what software you can trust. So these are some gaps; there are many, many more, in many other areas, code scanning and all sorts of things. But this was an area where we saw some clear gaps and could forge a path, make some traction, to help resolve these issues. So the first thing I'm going to look at is digital signatures. What does software signing get us? Well, first, you can verify the integrity of content. If anybody changes a single bit in the artifact or code or whatever it is you're ingesting, it completely breaks the cryptographic structure, which invalidates the signature. So you get tamper resistance; you've got assurance, some guarantees, around the integrity of an artifact. Non-repudiation: generally I can assume that if somebody has a private key, they were the one that signed that artifact. Now, I can't know for certain that it's the actual individual who purports to have that key in their possession, because keys can be compromised, but you do have non-repudiation that a particular key pair signed an artifact. Authentication: I can assume that a particular individual or machine was the one that performed that action, and then I can grant them some sort of access, or some automated flow can follow from that. And if the signature contains a timestamp, which is something we're starting to do a lot more commonly now, you have guarantees around when that signing event happened.
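That integrity property comes from the hash underneath the signature: flip a single bit and the digest, and therefore the signature check, changes completely. A tiny sketch (the artifact bytes here are just an illustrative stand-in):

```python
import hashlib

def digest(data: bytes) -> str:
    # SHA-256 digest: the value a signature is actually computed over
    return hashlib.sha256(data).hexdigest()

artifact = b"example-artifact-contents"
tampered = bytes([artifact[0] ^ 0x01]) + artifact[1:]  # flip one bit

d1 = digest(artifact)
d2 = digest(tampered)
print(d1 == d2)  # False: one flipped bit yields a completely different digest
```

Because the signer signs the digest rather than the raw bytes, any verifier recomputing the digest over tampered content will see the signature fail.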
Timestamps can really help against replay attacks, where you go to update and a system tells you, this is the latest version. You think, great, I'm covered now, I've got the latest version. But there is actually a more recent version, and you've been spoofed into believing you're getting the latest one, when what you have contains a load of nasty CVEs. That's a common sort of replay attack. OK, so we did some due diligence around the open source ecosystem to look at who was signing, and it's a very mixed picture. The Linux kernel uses something called TOFU, trust on first use. This is pretty similar to SSH: when you SSH to a new box, it sends a fingerprint, you type yes, and the key is imported into your ~/.ssh/known_hosts file. From there on, you trust that machine. That's trust on first use. A lot of the time people will store a public key somewhere, or circulate a public key; you import that key and trust it from there on. There's no later junction where you reconfirm that trust to make sure that the individual is still a valid and trustworthy entity. Again, it's a very typical picture. We started to notice a lot of keys stored on GitHub repositories and websites, all mediums that can be hacked and have been hacked. Take a WordPress site with a public key on it: all somebody needs to do is compromise that site and swap out the key, and then people are copying and pasting that key off the website and using it to trust software. So again, this is very prone to attacks. It's the same picture for Node.js, Python, and OpenSSL. Interestingly, Kubernetes was also one that had no signing of its release artifacts and so forth. But I've flagged that one green because just last week the SIG Release team brought Sigstore in. So they're actually green now; I've got to update that one.
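The trust-on-first-use model mentioned above can be sketched in a few lines. This is a conceptual sketch, not SSH itself: `known_hosts` here is an in-memory stand-in for the `~/.ssh/known_hosts` file, and the "keys" are arbitrary bytes.

```python
import hashlib

known_hosts: dict[str, str] = {}  # host -> pinned key fingerprint

def fingerprint(public_key: bytes) -> str:
    return hashlib.sha256(public_key).hexdigest()

def connect(host: str, presented_key: bytes) -> bool:
    fp = fingerprint(presented_key)
    if host not in known_hosts:
        known_hosts[host] = fp      # first use: trust blindly and pin the key
        return True
    return known_hosts[host] == fp  # later uses: only compare against the pin

connect("build.example.com", b"key-A")        # first contact: key gets pinned
ok = connect("build.example.com", b"key-A")   # same key: accepted
bad = connect("build.example.com", b"key-B")  # different key: rejected
print(ok, bad)  # True False
```

The weakness the talk describes is visible in the first branch: the initial trust decision is unconditional, and nothing ever re-validates that the pinned key still belongs to a trustworthy party.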
And a lot of these communities are actually working with us, I'll add, to integrate this Sigstore system; more on that a bit later. Same again for package managers. I'm not talking about RPM here; we've been doing this for quite a while at Red Hat, and we've got a very robust, established signing system. But with a lot of the open source package managers, some of them have signing systems, but the adoption is incredibly poor, typically two to three percent, and the trust models are questionable. Some have nothing at all. I do quite a lot of work in Rust; I'm quite a fan of Rust. But with crates, you pull in everything completely untrusted. Rust has a lot of nice things around memory safety, with a very strict ownership model in the compiler, so it's a very good language for security and for not suffering from the usual memory attacks. But as I say, all of the modules are pulled in completely untrusted, and sure enough, there was a hostile package found just last week on crates. So why is nobody really signing? We know we've got this technology, it's been around for quite some while, and it provides all of those guarantees. But the adoption is quite poor. Why is that? Most people find managing private keys to be a challenge, and there is some expense in this. If you want to do it right, you need to buy something like a YubiKey and set it up. Now, there are some folks in the audience who will be very comfortable doing that, but you're the minority. The majority of people just don't know how to do that, or how to leverage it in their everyday development environment. Handling key rotation is difficult, as is knowing what to do when there's a key compromise. If you ask most people what they would do if their GPG key was compromised, they wouldn't know; they'd have to start searching Google and panic. As I said, key compromise is scary. And then there's the matter of tying identities to keys.
So really what we had was something called the web of trust, and I'm talking specifically about GPG here. Generally you would meet people in person; you'd have these key-signing parties where you'd all sign each other's keys and show each other your driving licence. That was the mechanism, and obviously it did not scale when you have things like a global pandemic and social distancing and so forth. So the lack of adoption is largely down to the tooling, then. People find it cumbersome to use. They're concerned about how to react when there's a key compromise. How do I securely store my keys? Do I keep them on my laptop? What happens if my laptop's stolen, or somebody hacks it? How do I trust other people's keys? How do I use my keys on somebody else's system, say if I want to run a GitHub Action or some sort of automated flow? And the general consensus among people a lot smarter than me is that it's broken and it's time to move on. Now, the ISRG started to work on Let's Encrypt because, generally, the proportion of websites on the internet running HTTPS was very, very low. Amazon.com and various banks used to run on plain HTTP; this is going back quite a while. Adoption obviously improved within the enterprise sector, but for your everyday user who had a website or was running some sort of system, there was a very, very low rate of adoption. Most of that was because it was a cumbersome process, and there was a monetary cost. If you set up a personal website for your blog, maybe you had to go to a CA provider. You had to scan your passport and send it to them. You had to create keys, use OpenSSL to generate a bundle, and they would then return a certificate to you. You then had to work out: how the hell do I get this to work in nginx? What's wrong with my config? It's a really cumbersome process.
Let's Encrypt came along and said: let's make it free, let's make it easy to use. You can just run a tool and it will set everything up for you. What happened after that was that the utilisation of HTTPS just rocketed. That allowed the browsers to start to circle the wagons around sites not running HTTPS. Now when you go to a site that's not running TLS, it feels a bit dirty, like you're going to get a virus, and the browser gives you lots of angry warnings to stay away. Our idea was: what if we could do the same for software? When you pull a container image, or import a library, or take any particular action, the general social construct becomes: if it's not signed, I don't want to use it. That becomes the default model. At the moment the default model is that very little is signed and everybody is using it anyway. That was our vision for this project and for it being a success. Coming back to Sigstore: what is Sigstore? It's a group of projects, all open source, founded by Red Hat. There's a lot of media attention around Sigstore at the moment, but it's a project that grew in Red Hat. It's a collection of projects that come together to provide a service, an infrastructure, for software signing; we'll go into what that consists of shortly. With Sigstore there are two models, which I'll cover again a little later, but it's important to point them out. One of them is what we call a public good service. We're looking to launch, in fact we're in soft launch of, a public service. This is free to use, and there's tooling to make it easy to sign things and to verify things. We've modelled this on the successful Let's Encrypt rollout. If we jump into the different projects: we have Fulcio, which is our CA. This is what provides the software signing certificates. I'll cover this a little more later, but we leverage something called OpenID Connect, which is a key part of Fulcio. And we have Rekor, which is the transparency log.
It's one of the first projects we worked on. You can think of it like a ledger: it's an immutable store, and anything recorded in there cannot be tampered with. It's also publicly auditable; this transparency log is open for anybody to audit cryptographically. If any of you are curious about the protocol, it's based on something called a Merkle tree, which is pretty similar to what Git uses, and things like OSTree and so forth. We then have various clients. One that's become very popular is Cosign, a container signing tool. There are clients for Maven, Ruby gems, Python; I've been working on a Sigstore Rust library; and there are many more projects coming online. So you have your infrastructure, your service, and then you have your client implementations which sign and verify using that service. I'll try not to be too hand-wavy here, but one of the key technologies we have is something we call keyless signing. This leverages our code signing CA, Fulcio, plus OpenID Connect and Rekor, the transparency log. What you can do with keyless is generate a cryptographic key pair where the keys never even touch disk; they're held in memory, so they're very, very short-lived. What happens then is you connect to our system, and we'll see this happening a bit later with a demo. It sends you back an OpenID Connect login screen, so you can log in with GitHub, or with your Red Hat or Google account, or an e-mail address. We then have an exchange: there's an ID token, and we do a challenge. What happens then is that you're given a code signing certificate, which your client can use to sign an artifact. All of that is stored into the transparency log: the whole event, that this individual or machine was in possession of this key pair at this particular time, because it's timestamped, and that they signed a particular artifact, identified by its digest.
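The flow just outlined can be sketched conceptually. This is not the real protocol: stdlib Python has no X.509 or OIDC machinery, so an HMAC with an ephemeral key stands in for the Fulcio-issued certificate and real signature, and a toy Merkle tree stands in for Rekor; the identity string and artifact are made up for illustration.

```python
import hashlib, hmac, secrets, time

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves: list[bytes]) -> bytes:
    """Fold leaf hashes pairwise up to a single root, as a transparency log does."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:  # duplicate the last node on odd-sized levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

log: list[bytes] = []  # stand-in for the transparency log's entries

def keyless_sign(identity: str, artifact: bytes) -> bytes:
    key = secrets.token_bytes(32)                  # ephemeral key, memory only
    sig = hmac.new(key, artifact, "sha256").digest()
    # Record who signed what, and when, keyed to the artifact's digest
    entry = f"{identity}|{int(time.time())}|".encode() + h(artifact) + sig
    log.append(entry)
    del key                                        # discard: nothing left to steal
    return entry

entry = keyless_sign("dev@example.com", b"my-container-image")
keyless_sign("ci-bot@example.com", b"another-artifact")
root_before = merkle_root(log)

# Tampering with any logged entry changes the root, so auditors can detect it.
log[0] = b"tampered-entry"
print(merkle_root(log) != root_before)  # True
```

The point the talk makes next follows directly from this shape: once the entry is on the log, the private key has no further role, so discarding it removes it as an attack target.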
The real bit to keep in mind here, if you've not followed what I've just outlined, is that once this process completes, you can discard the keys. You do not need the private key any more, so even if anybody managed to take it off your machine, it's of no use to them. Five minutes left, really? Wow, okay, that's thrown me off a bit; I'll probably have to drop the demo. So, this is a view of what we capture. In the top one you can see that a GitHub Action has run; this is actually stored in an X.509 certificate, which is also recorded in the transparency log. You see another example as well where we have a SAN of my e-mail address: I signed something using my e-mail address. I'm going to have to skip the demo. I was going to show you commit signing; we've actually got this new project, released just a few days ago, where you can now sign commits with Sigstore. Okay, so we don't just do keyless. We do lots of integration with KMS systems, so AWS, Azure, GCP, Vault. You can use local key management if you want to. We support PKCS#11, a crypto standard, so you could use a YubiKey or an HSM; this is more leveraged by your enterprise-type customers. Community adoption has been widespread. I'll skip over this quite quickly, but like I said, Kubernetes has just come online, and there are lots of other communities bootstrapping and planning to go into production with Sigstore. This is a bit I really wanted to make sure I covered: what are we doing, and how does this relate to OpenShift? How are we going to leverage this technology in OpenShift? Presently, in Advanced Cluster Management, ACM, and ACS, you can sign Kubernetes objects, so YAML files, with Sigstore, and then an admission controller will validate that signature before the object is allowed to execute. The Podman team, so Daniel Walsh and folks, are integrating the container signing parts into Podman, using Sigstore and part of the old Red Hat simple signing protocol. Then you'll be able to verify signatures.
They also have a policy implementation that they're working on. Quay has shipped with support for Sigstore: as of version 3.6, you can sign a container and store the signature in Quay, from where it can be retrieved and verified. We've got another project in Tekton called Chains. Chains allows you to sign a pipeline or a task. You can also generate something called an SBOM, a software bill of materials, which will list the various actions that were part of that task or pipeline; Chains will sign it and store it into the transparency log. So all of these signature events are stored into the transparency log, all of the time. This is experimental, but somebody on my team has recently been working on a Koji plug-in, so we're looking at putting RPM signatures into the Rekor transparency log. Last of all, coming back to what we said earlier, we have two models. One is the public service, which you can think of like Let's Encrypt. The other model is to deploy this behind a firewall: if you're a customer of Red Hat and you want to run your own internal Sigstore for some reason, you can do that. It's all cloud-native technology, and it's relatively easy to set up and run. Or you could run a hybrid model, where you utilise the public upstream service when you're consuming software or artefacts from upstream, and run your own internal instance for the code and artefacts that you're generating. So that brings me to the end. Happy to take questions, and if anybody wants to come and grab me afterwards, do so, and I can even show you the demo. Last of all, there is a Sigstore booth in the marketplace, so you can also come there and find some more information about the community. And a very, very last one, sorry: if you want to integrate and hack on this, come and see me. We really are encouraging people to start to innovate on top of this platform. Thank you.