So, hello everybody. My name is Axel Simon. I am part of the Office of the CTO at Red Hat, in a group called Emerging Technologies, and we look at future tech and how it's going to affect our customers, but also the company and, more generally, the open source community. So we're very much focused on open source projects versus products. Today I'd like to talk about policy compliance using Sigstore, in the more general context of the software supply chain, and we'll get more into that, but as you've probably heard, there have been quite a few problems with our software supply chain. So my initial question would be: who here has thought about or investigated the software supply chain issues we've been hearing about? Right. And who has ever heard of Sigstore? All right. So I will go into some detail about Sigstore, but first I'd like to start with the general context. It's been said, though I've never been able to actually find the quote, but it's probably true, that Kubernetes relies on about a thousand dependencies. So that means when you're using Kubernetes itself, it's built on another thousand open source components further upstream. This raises the immediate question: do you know what those thousand pieces are? Do you trust them? Do you think they have any issues? Basically, do you know what's coming into your system? That's a really hard question to answer, and most of us can't answer it properly. I'm talking about this in the context, as I was saying, of software supply chains. Almost all companies now use software, and we're all dependent on software from third parties. The software we receive and use, whether we run it in production or build our software on it, can contain vulnerabilities, and that puts you at risk, it puts everybody at risk. And the software supply chain has increasingly become a way for attacks to get in. 
You may have heard of the SolarWinds incident, where the US government basically got attacked by running an update. Running updates is generally a good thing, but in this case, if the update itself is compromised at the vendor level, then you're bringing malicious code into your organization. There are many types of supply chain attacks. I'm not going to go into all of them, but this is just to give you an idea that it's a huge problem, and I'm going to try and give you a brief overview of where an attacker can strike, and basically the answer is everywhere. On your left here, in green, we have code as it's being written by programmers and developers. In the yellow part, we have where it's being packaged: a software factory, a build system. And on the right, in purple, is anybody who consumes software. And you can attack basically everywhere. Really: you can attack a developer's machine and try and compromise the code there, you can compromise the way the build system builds the packages, you can compromise the container registry to serve bad containers, you can compromise the dependencies your developers are using, you can compromise the system your reviewers are using to do their reviews. It really is sort of everywhere, and that's a huge issue. 
So thankfully, no, thankfully we'll get to that in a sec, but the one thing I want to add first is that this is a problem that concerns open source, and some say it concerns open source even more, because everybody uses everybody else's software as a basis. Open source has this very virtuous circle of allowing people to build on other people's code; you basically stand on the shoulders of giants. That's a very positive thing, it has brought the open source community forward by leaps and bounds, and it makes it possible to build really impressive and useful things very quickly, but it means everybody relies on everybody else's code, and so this makes it a big problem in open source. But not only there, because, as was said earlier today, and I think we are all now aware of this, a lot of proprietary software, even internal proprietary software, also relies on upstream open source. Open source is what we've been calling the innovation engine; everybody is basing their code on it, and so this concerns everybody, it's not just a purely open source community problem. However, the open source community is trying to tackle this and is working towards solving it, and it has been for a while, because supply chain attacks are not something new. They've been around for a while; they're just getting a lot of attention now because the problem has become even more obvious, as open source has become even more of an obvious thing to use. One thing you can do, which has been done for a while, is signing software: you can use cryptography and create cryptographic keys to sign your software, and that will enable you to prove who it came from, and to prove that it hasn't been tampered with. The only problem is that it's hard. It's hard because you have to create a key, but more importantly you have to maintain that key. Creating a key is easy, we can do this on this laptop in three minutes, but 
maintaining that key, making sure it stays secret and that nobody takes it from you and can then sign in your name, that's hard. And maintaining a key over a long period of time, or having a good system to rotate keys and to manage keys for multiple developers, has always been hard, and it's still hard. I'd like to show you the state of a few major projects and how they do software signing to try and protect their users. The Linux kernel uses PGP signing, so asymmetric cryptography, which has been used for a long time; they use what is called TOFU, trust on first use: we assume that the first time we see a signature it comes from the right person, and then we check after that whether it's changed, whether it seems to come from someone else. But a lot of projects basically just have keys on their website. So to check whether something is coming from the Python community, for instance, all you can do is go on their website, check that the key matches what you're seeing and matches the signature, and hope that their website hasn't been compromised. And you have to do it differently again if you want OpenSSL, for instance; there is no overarching system, everybody has a different system here. If you look at package managers, especially programming language package managers, the situation is even worse. The Python package index has an optional signature system; Maven, the Java one, is one that really uses signatures, that's good; Go doesn't have it, Rust doesn't have it. So when you pull code into your Rust program, it's not even verified, it's not even checked that what you are pulling really is what you think it is. This is a great way to slip in attacks and to basically put people at risk. So this is where I'd like to tell you a bit more about Sigstore, which is a solution to this problem. Sigstore is a project that was born out of the minds of two people, Dan Lorenc at Google and Luke Hinds at Red Hat, who's my team leader, and the point of Sigstore is to 
try and make this easier, to make software signing, which we know is useful but hard to do, a lot easier. Basically it tries to make it simple, easy and automatic. Those are very desirable qualities when you're developing software, because the easier and more automatic it is, the more people will use it, and the more you will have this built-in guarantee that signing provides in your code. The idea really is to make it what we call invisible infrastructure: it's just there, you don't need to think about it, software gets signed, and therefore software can get verified pretty easily. So, not going into all of it, we don't need to read all of this, but basically Sigstore is two things. It's a collection of open source projects that you can use on their own or together, and they're all designed to be cloud native, so they all run under Kubernetes; they were designed with that in mind. And it's also, and this is in progress, a public good service, just like Let's Encrypt. If you know Let's Encrypt, it helped generalize the use of TLS certificates for websites by making it very easy to get a certificate, and it automated it. Automated and easy: that's what we're trying to replicate, in time, with the Sigstore public good service. So you'll be able to use the Sigstore service to check and to store signatures for software, and it also comes with what we call a root of trust, which I'll go into a little bit. A root of trust is a way for everybody to agree on a fundamental starting point for making trust decisions, and we hope that the Sigstore root of trust can be used by the overall open source community, and by everybody who needs it, to check and verify trust decisions, essentially. 
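To make "checking and verifying" concrete: at the bottom, these verifications come down to comparing cryptographic digests and signatures. Here is a minimal Python sketch of the digest half (the artifact bytes are invented for illustration); note that a matching digest only proves integrity, not who published it, which is exactly the gap that signing, and Sigstore, is meant to close:

```python
import hashlib

def sha256_digest(data: bytes) -> str:
    """Hex SHA-256 digest of a blob of bytes."""
    return hashlib.sha256(data).hexdigest()

def verify_download(artifact: bytes, published_digest: str) -> bool:
    """Compare an artifact against the digest its publisher announced.
    Integrity only: anyone who can edit the website can edit the digest too."""
    return sha256_digest(artifact) == published_digest

tarball = b"pretend this is a source tarball"
good = sha256_digest(tarball)
print(verify_download(tarball, good))         # True
print(verify_download(tarball + b"!", good))  # False -- tampered artifact
```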
These are the different projects that make up Sigstore. Without going into too much detail: we have a certificate authority, we have a signature transparency log, and we have a tool to sign and to verify. Initially it was for containers, which is why it's called cosign, but it really enables you to sign pretty much anything you need to sign, and to verify it against the Rekor signature transparency log, using the Fulcio certificate authority to check. I'm not going to go into too much detail about the technical aspects, it's all on the website if you're curious; you'll see a slide in a sec with an overall diagram, but I'm not going to dive into it because it's a bit complicated and I don't think it's necessary, but we can go back to it later if there are questions. So this is the diagram; yeah, we'll just skip it for now, because there are a lot of moving parts here and I'm not sure it's necessary right now. But what Sigstore is based on is what are called Merkle trees and the SHA-256 secure hash algorithm, which enables you to take a piece of binary data and always get the same digest from it, and that's a good basis on which to build a transparency log, which is what we're doing here. It's what's used in blockchains, in Git, and in certificate transparency systems, and it makes it very easy to detect any malicious change. So we have this public log where people can send in their signatures: they sign a piece of software, they send that signature into the public log, and if anybody were to try and tamper with that, it would be very evident. So we can have this public, auditable log where it's easy to send information but very hard to make it lie, which is what we want for a public transparency system. So, in summary, what is Sigstore? It's something to easily sign and verify software, and it's entirely open source, 100% open source, including the tooling around it and the configurations. 
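The tamper-evidence property described above can be seen in a toy Merkle tree. This is a simplified sketch, not Rekor's actual tree construction (real transparency logs add domain separation and careful odd-leaf handling), but it shows why silently rewriting a logged entry is detectable: any change to any leaf changes the root that every monitor has seen.

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root(leaves):
    """Toy Merkle root over a list of log entries."""
    level = [h(leaf) for leaf in leaves]
    while len(level) > 1:
        if len(level) % 2:              # duplicate the last node on odd levels
            level.append(level[-1])
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

log = [b"sig:alice:pkg-1.0", b"sig:bob:pkg-1.1"]
root_before = merkle_root(log)

# Appending is fine -- verifiers simply learn a new root...
log.append(b"sig:carol:pkg-1.2")
root_after = merkle_root(log)

# ...but rewriting history changes the root, which monitors can detect.
tampered = [b"sig:mallory:pkg-1.0", b"sig:bob:pkg-1.1", b"sig:carol:pkg-1.2"]
print(merkle_root(tampered) != root_after)  # True -- tampering is evident
```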
It's already active, it's being developed all the time, it's a very fast moving ecosystem, and it's already under soft launch, as we say: the public good service is not entirely ready, but you can already start playing with it. There are no guarantees that it's going to stay the same, but you can already start playing with it. It's a non-profit run under the Linux Foundation, and as I was saying, it was started at Red Hat, Google, also Purdue University, and anybody who wants to join us; this is a fast growing ecosystem and we encourage people to join. Right, so we've talked about Sigstore. Another thing I want to talk about is software policy: basically, how you choose to allow software on your systems or not, and what makes that decision. That's a tricky question, because just because you know what a piece of software is doesn't mean you should necessarily deploy it in production: it could have vulnerabilities, it could have licensing issues in your specific situation. One thing to think about when we're making these decisions around software is that human action raises the chance of error. The more you rely on people to make decisions and take actions, the more likely it is, when it gets repetitive, that people will make a mistake. I mean, humans are great, I like humans, I'm a human myself, but we don't like doing repetitive stuff, and that's better left to machines. Machines are very good at repetitive stuff, computers are good at that; that's why they're annoying, but also why they're very good at the repetitive stuff. 
So basically this means that when you're building software, as a way to avoid mistakes that people might make, you want to automate your systems. You want to automate building software, and you do that by using what's called CI/CD, continuous integration / continuous deployment systems. But you can also do it for policy: your policy decisions that say, for instance, "only allow this software on my Kubernetes cluster if three out of five authorized people have said it's okay." That's a policy decision, and it is also good to try and automate it and make it visible, and there are policy tools for that. What does this mean in more practical terms? Because it's a bit abstract to say "use CI/CD" or "use policy tools." If we're talking about building software, starting from a piece of code and turning it into something that other users can actually use and deploy on their systems, we want to use a CI/CD system. If you're using Kubernetes, for instance, the one we're looking at is called Tekton Pipelines, which runs in Kubernetes quite simply, and it basically takes your upstream code and turns it into a build, a packaged piece of software. And we have an extension to it called Chains, which makes it possible to sign. We were talking earlier about signing software: this is what Tekton Chains can do, it can sign a piece of software for you. So that's helpful, we'll keep that in mind. 
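As a toy model of that "verify input, build, sign output" loop, here is a self-contained Python sketch. Everything in it is invented for illustration, and it uses HMAC with per-party shared secrets as a stand-in for real asymmetric signatures (Tekton Chains and Sigstore use proper key pairs and certificates), just so the sketch runs with only the standard library:

```python
import hashlib
import hmac

# Stand-in for real asymmetric signing: HMAC with a per-party secret.
# A real pipeline would use key pairs (or Sigstore's keyless flow) instead.
KEYS = {"upstream": b"upstream-secret", "builder": b"builder-secret"}

def sign(party: str, artifact: bytes) -> str:
    return hmac.new(KEYS[party], artifact, hashlib.sha256).hexdigest()

def verify(party: str, artifact: bytes, sig: str) -> bool:
    return hmac.compare_digest(sign(party, artifact), sig)

log = []  # append-only record of (who, digest, signature) -- Rekor's role

# Upstream publishes a source tarball and signs it.
source = b"pretend source tarball"
log.append(("upstream", hashlib.sha256(source).hexdigest(),
            sign("upstream", source)))

# Build step: refuse to build unless the input's signature checks out.
who, digest, sig = log[-1]
assert verify(who, source, sig), "untrusted input, abort build"
binary = b"built from: " + source
log.append(("builder", hashlib.sha256(binary).hexdigest(),
            sign("builder", binary)))

# Deploy step: verify the builder's signature before admitting the binary.
who, digest, sig = log[-1]
print(verify(who, binary, sig))  # True
```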
And for making policy decisions, what we're using here is a thing called OPA, which stands for Open Policy Agent. It's a tool that enables you to describe the policy you want for your servers, as I was saying, for instance "only allow this on my servers if three out of five people have signed it," and turn that into code, which makes it very reproducible: you can rely on it to always be the same and always give the same results. You can create complex policies if you want, but you can automate the policy, and that's really the point. And it would be really convenient if we could apply that policy to our Kubernetes cluster and say, only allow this on the cluster, as I was saying, if three out of five people have signed it. Well, this is where we have OPA Gatekeeper. It's a gatekeeper, as its name implies; in Kubernetes terms it's what's called an admission controller, so it will only allow something onto your Kubernetes cluster if the situation matches the policy. So now we've automated building software, and we've automated the policy around deploying it, or even around building it if we want. 
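In OPA itself you would express such a rule in its Rego policy language; purely as an illustration of the "three out of five signers" idea, here is the same admission logic as a Python sketch (the reviewer names and the threshold are invented for the example):

```python
# Hypothetical set of reviewers authorized to approve deployments.
AUTHORIZED = {"alice", "bob", "carol", "dave", "erin"}
THRESHOLD = 3

def admit(signers: set) -> bool:
    """Allow a workload only if at least THRESHOLD authorized
    reviewers have signed it (unknown signers don't count)."""
    return len(signers & AUTHORIZED) >= THRESHOLD

print(admit({"alice", "bob", "carol"}))    # True  -- admitted
print(admit({"alice", "bob", "mallory"}))  # False -- only two count
```

An admission controller like Gatekeeper evaluates exactly this kind of predicate on every request, so the decision is automatic and identical every time.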
So why would we want to do this? Well, I like to look at it this way: here are some principles we want to apply that give us some good benefits. One of the things you want to do is try and make the tasks in building software small. The smaller they are, the easier it is to reason about them. If you have a long, complex task of "pull the software from this place, then apply this patch, then do this, then do that," it gets hard to understand, but if you keep each step very simple, it's a lot easier to reason about, and you get better visibility into what's happening in your build pipeline. Also, each step being smaller, small step atomicity as it's called, means each step is less risky: you reduce the attack surface of the step, and if something goes wrong, you reduce what's nicknamed the blast radius. So if somebody manages to attack your build system and compromise it, well, if they can only compromise a tiny bit of it, then the damage they can do is much smaller. By keeping things small you basically have better control over what happens in your system, you have better visibility, and so this improves the control you have over your infrastructure. Keeping tasks small, reasoning in small batches and small steps, gives you essentially better risk management: better control over attack surface and over blast radius. So yes, keeping things small is a very good approach, both for your policy, to reason about it, and for your build systems. Right, so we've talked about Sigstore, and we've talked about automating things, trying to automate both policy and build systems. Now, this is a bit more of a vision, a bit more experimental; this is something we've been thinking about in the Office of the CTO at Red Hat: how can we use these different things to try and protect us against the problems we were talking about earlier, which are, remember, the graph with 
all the potential attacks. They can happen everywhere, but we could also potentially check things everywhere: if we know that every time we receive a piece of data from the previous step in our build process it's been signed, and we can verify it, that makes it very hard to start cheating if you're an attacker. So what can we do to protect against this large breadth of attacks? Well, we need to be able to verify the software all the way from upstream, from the people contributing to purely open source projects, all the way down to the moment you're putting it on your servers in production. This whole big range of open source software, this whole chain if you will, we'd like to be able to verify and check at each step. And the good news is that we do have the tools at this point to build that, and we're thinking about it with Sigstore in mind a lot. So here's the idea, and I think this is a shared vision with other people in the open source community thinking about software supply chain problems: we want to be able to attest our whole software supply chain, from the upstream commits all the way down to the production runtime. At each step we want everything to be cryptographically signed, because that's extremely hard to forge, and to be measured, so we can basically check that we're getting what we think we're getting and not something else: we can verify who it comes from and that it is what we think it is, that we're not getting a compromised version of a library but the one its author intended. We want to make sure we're using a common root of trust, so we can all relate back to this common thing we trust to make our trust decisions. And we basically want to put all of that in a log that is append only, so you can't modify it: all the decisions, all the signatures, everything, they're in a log that can only be added 
to, so it can't be compromised, it can't be changed. I won't go into all the tools too much, we've mentioned some of them. We've got Sigstore, which as I was saying basically gives you the ability to sign and verify containers and binaries, but also configurations, for instance. We talked about Tekton Chains, which enables you to sign your builds when you're running a CI/CD pipeline. OPA is the one that enables policy. And I'll just briefly introduce Keylime. Keylime is a tool we've been working on for a while which makes it possible to trust a remote server that you don't have physical control over. Modern servers come with a chip called a TPM, which stands for Trusted Platform Module, and by using that little chip in the server we can basically verify that it boots normally, and that it boots into something we trust, a known good operating system. That means we can start servers remotely but have a high degree of confidence that they started the real version of, you know, Fedora or RHEL or Ubuntu or whatever operating system you're using; we know it's starting the good version of the operating system, not a modified version that we might not trust. So, going back to our software supply chain: as I was saying earlier, in green we have the upstream source code, in yellow we have the part where we build the software from the source code, and in purple the deployment runtime. What we want, conceptually, is to pass a signature between the upstream source code authors and the build system, and then another signature between the build system and the deployment systems. We want to do this all along the way; these are just examples of where your upstream source code comes from, how it's being built, and what it runs on in the end, so there's a variety of solutions. In practice this means we get an upstream piece of code, say a tarball of source code, we sign it with 
Sigstore, and then we store that signature in the Sigstore component called Rekor. Okay, we've got a signature. Then you can take code from either an open source upstream or your own internal code, both go into the same build system, and here's what we would like: when the build system receives this thing to build, it checks it again using Sigstore, then it builds it automatically, this is the Tekton Chains bit, and it checks against the policy, so this is where OPA comes in again, and it builds on a system that we know is as trusted as possible, because it uses Keylime. And then when it's finished building, it again uses Sigstore to sign the result itself. So we've got another signature that says: I took it from a known good source, I did my job, I did my job in a good environment, and here's the result, and I'm signing it so you can check it later. The next step is the deployment, our actual production servers. Again, these will be systems that are measured using Keylime; as they receive the software, they will check that it's valid, that it has not been compromised, using Sigstore again, and they'll choose to deploy it or not depending on a policy, which is again where OPA comes in. And lastly, we can do an extra neat little thing, which is that Keylime can check your server continuously, to verify that it's not being modified while it runs. So the software you've deployed, you can check that it's actually not being modified while it is running. That's a very interesting capability, and it makes sure that your systems, once they've been started in a good state, stay in a good state; otherwise you can start making decisions about them. So, brace yourself, there's a lot of stuff on the next one, but it's just to give you an overview of what the overall picture looks like once you start combining policy, automating the builds, and signing everything. There's a lot, obviously, because there's a lot 
of moving pieces, but it's essentially what we've been talking about. You get your code from the upstream, you sign it, you send all these signed measurements, all the signed software, into your build systems; your build systems check that it conforms to policy, and once they've checked the signatures and everything, they can sign the result themselves; and eventually you deploy this on servers that you have strong guarantees about, in terms of integrity, because they're using Keylime and running good operating systems that you know. And with all this, what really matters, I think, are the bits at the bottom; that's what I would most like to convey to you. We were saying in the beginning how hard it is to sign and how hard it is to do all of this. What we really want, for this to work for everybody, is for the software signing and verification part to be easy for developers; to be automatic in the middle, so all the systems that build it, all of that, is automatic and most people don't have to worry about it; and on your production servers, to make it as safe as possible. With that, you really can catch a lot of the attacks we were looking at in the beginning, because you've got signatures pretty much everywhere throughout your system, which makes it very hard to cheat and very easy to verify. So we really can prevent a lot of attacks here. And in more business terms, what does this give us? Well, it reduces the attack surface, so it makes it harder to attack you; your attack surface is smaller because we've removed a lot of possible attacks, and so this improves risk management. We have better risk management because there are fewer attacks, we have better cost management because we don't have to respond to so many attacks, and it enhances your business capabilities, because you have to worry less and you 
can build more, essentially. It also gives us better traceability: it's easier to go back and see what happened, because everything is logged and everything is signed at every step. That means faster analysis, and if, or rather when, you get compromised, it's easier to remediate, because it's faster to figure out where something went wrong. Again, that gives us risk management and cost management, and it improves the trustworthiness of your systems overall, because you have a better idea of what's running, why it's running, how it's running. This is true both for build systems and production systems, and again, in business terms, this gives us risk management and cost management. How far are we from actually achieving it? As I was saying, this is a vision, something we've been working on and thinking about, trying to combine all these open source tools to make it work. Some of these tools have been in production for quite a while; Keylime, for instance, is in production: IBM has a cloud that is using Keylime, so all the machines are verified as they boot. Some are still growing; for instance, Sigstore is a very active project, things are being built all the time. There's also something we don't have yet: a common standard, a format, a way of exchanging data between each of these steps that all the tools can understand. There's a project called in-toto which is designed to pass attestations about software as it's being built; it's a likely candidate, in-toto is quite cool, and we'll see if that becomes the standard format or not. And we have a slight problem, for instance, that some tools can sign but can't verify yet. So it's great for future reference, but they can't yet check a signature and then make a decision, to build or not, according to that. That's the current state; it's a technical challenge, but hopefully it'll change. And as I was saying in the beginning, everyone is concerned; this is not just an 
open source community problem; the open source community is huge, but it's not limited to it, it's important to everybody, because everybody relies on open source. This is why we want to build this as a community, and we want to invite everybody to join and take part in this. If you want to go and have a look at the different projects, I've listed two here, which are two of the biggest ones: you can go to sigstore.dev or keylime.dev to check out those tools, and they're both on GitHub, of course, they're open source projects. They all have chats, so you can easily find the Keylime Slack or the Sigstore Slack; the Keylime Slack is on the CNCF, the Cloud Native Computing Foundation, Slack. So feel free to join if you've got questions or if you want to try and build something with it; Sigstore is being used by a lot of people to build new, interesting and cool things, so this is very community oriented, please don't hesitate to come and ask questions. And lastly, here's my contact if you'd like to talk to me specifically, and with this I'd like to open up to questions, if anybody has any questions. [Audience question about signing a Rust package.] So you could, for instance, if it's going to be an internal thing, you could sign your own, you could use cosign to sign your Rust package, and that will upload the signature to Sigstore as a service. Then somebody else, if they have your package, so you'll have to distribute your package in another way, but if they get your piece of Rust code, then they can check with Sigstore that the signature is all good, and then, as long as they trust you, they can trust that piece of Rust code. And we're working on it; eventually the goal is, in time, for Sigstore to be integrated automatically in Rust and in Ruby and in Python and in npm, so that when somebody writes a piece of Rust code and they want to put it on crates.io, it's automatically signed for them. And what this means by "automatically" 
is that they write the code, they say "okay, I'm sending this to the repositories," and then a pop-up comes up on their computer and says "who are you? sign this." So you sign into maybe your Google account, maybe your GitHub account, maybe your company's internal account; you sign in and it proves it's you, and that's enough, that's all you have to do. You don't have to maintain a key; that's the whole point of Sigstore, you don't have to maintain a long-term key. You sign, that's it, you don't have to worry after that. So Sigstore effectively takes away the whole management of key rotation. [Audience question: how does Sigstore keep a log of that internally?] So the way we do it: one of the hardest things in signing software is maintaining keys, so the core question Sigstore asks, which is quite clever, is: what if we didn't have keys to manage? What if we create a key, we sign, and then we throw the key away? Then we don't have to manage it, and it can't be stolen, because it's already been destroyed. And so all you need to do, it sounds easy, all you need to do at that point is to check that the person who signed that thing really had that key at that point in time. So going back to this much earlier slide, let's see if I can find it briefly, the more technical one, here: what happens is that when you want to sign, you get a certificate. You tell Fulcio you have a key, and in exchange, if you can prove, and that's what I was talking about with the pop-up, if you can prove at that moment in time that you do own that address, using what's called OpenID Connect, OIDC, then Fulcio will do two things. It'll give you a certificate that says "this email address has this key" (you'll get to throw the key out after), and with that certificate you can sign, and it gets stored in Rekor. So your signature, the signature on the software, gets stored in 
Rekor, and your certificate gets signed and stored by Fulcio. So when someone comes along later, they find the signature, they ask "what key was this?", they find the key, they ask "did this really belong to this person?", they go and look in Fulcio, and they find a certificate that says "at this point in time, this person owned this key." They don't have it anymore, but it doesn't matter, because when they signed, they did, and so you can check that it's good. Yeah, it's what they call keyless signing; there is a key, but it's just kept in live memory and then destroyed, so you really don't need to worry about it. Yes? [Audience question about reproducible builds.] Ideally this is all done with reproducible builds; to me it's sort of a parallel track to this. You should be doing reproducible builds anyway if you can, you really should. A lot of people working on this sort of open source software supply chain thinking, let me try to go back to that slide, I know it might not be the best one to have in mind, but a lot of people who are thinking about this overall picture are thinking a lot about reproducible builds, if only because, if we have reproducible builds, we get the same hashes on the same artifacts, which means it's much easier to verify. Because if you and I have different builds, checking that they were okay is very hard, which is obviously why you're asking about reproducible builds. Reproducible builds, for those who might not know, means that two people building the software independently will arrive at the exact same result, which means it's much easier, for instance, to check a signature: once you go and ask the Rekor transparency log whether this piece of software with this hash is okay, if we all have the same reproducible build, then we can all check the signature much more easily, because we're all checking against the same exact package. So, 
that's a bit of a long answer, but the shorter answer is: this doesn't include any specific thoughts about reproducible builds, but it does sort of take them for granted; it's just kind of obvious at this point that you should be doing that, but there's nothing specific about it here. Are there any other questions? All right then, well, thank you very much for coming, and, yeah, cheers.