Hello, thank you for joining our talk. My name is Raúl, and here with me is my colleague Víctor. We are both software engineers at SUSE, and today we're going to talk about how to enforce a secure supply chain in Kubernetes. We took this image from the SLSA security framework. As you can see, attacks can happen at every link in a typical software supply chain. If we start looking at this picture from the left, the first thing we need to protect is our source code. Imagine we're using a source control management system such as Git, and our Git credentials get compromised. Someone can push malicious code without us being aware of it. Our build pipeline will eventually trigger a build containing this malicious code: it will fetch the code and its dependencies, compile it, and build a container image. Then we push this image to a container registry, and our production cluster will eventually pull this malicious image into production. So it's not just our source code: we need to protect everything in our supply chain. That's our build system, our dependencies, and the place where we store our packages or container images. Imagine our container registry gets compromised and someone pushes a malicious image. We don't notice that this image didn't come from our build system. If we pull it into our production cluster, that's it: we already have malicious code running in production. That's what we want to avoid. If you go to the SLSA website, you will see real attacks that happened at each of these points. Today we want to focus on two of them, which happened between the build and the package phases. The first one is the attack that happened to Codecov. The credentials to their Google Cloud storage got leaked, and a malicious actor managed to upload a tampered artifact. Users then downloaded this artifact without being aware that it wasn't really coming from Codecov. So how could they have prevented this from happening?
By using some kind of provenance check on the artifact, they would have known that this artifact didn't come from Codecov. The second one is an attack that happened on package mirrors. Someone started running mirrors for several popular package repositories, and these were used by users who thought they were pulling from the right repository and didn't notice. Again, same as before: they didn't have any way to prove the provenance of the artifact, so a provenance check could have prevented this from happening. This is what we're going to learn today: how to implement this provenance check. For that, we're going to sign our container images, and then we're going to verify this signature before we deploy our pods in our Kubernetes cluster. And for the signing, let me introduce Sigstore. Sigstore is a combination of open source technologies that allows us to handle signing, verification, and provenance checks. We can sign almost everything: not just container images, but anything that's stored in an OCI registry, binary blobs, software bills of materials, and more. We're seeing a lot of increasing adoption of Sigstore in the open source community. Even Kubernetes itself has started signing its own artifacts, starting in version 1.24. So let's talk about some of the tools that come with Sigstore. First, Cosign. Cosign is the tool that we use for the actual signing and verification. If we are signing an artifact that's stored in an OCI registry, such as a container image, the signature will be stored as a separate object in the same OCI registry, with a predictable name derived from the digest of the image. There are two ways of signing with Sigstore. We have the traditional key-pair way, where you need a key pair: you can bring your own if you have one, or you can generate one with Cosign. You then need the private key alongside its password for signing, and you need to distribute the public key for verification.
And then we have keyless, which doesn't involve managing any keys at all; it uses OIDC for authentication, as we will see later. We will actually see examples of both of them. So let's first see how we can sign using a key pair, and for that I will create a new key pair with Cosign: cosign generate-key-pair. We need to enter a password, and we need to keep this password secure, as well as our private key. Cosign has support for some key management systems, such as HashiCorp Vault, AWS KMS, and Google Cloud KMS, so that can help you with key management. Okay, so we have our key pair; let's sign. I created a small project for this demo, and I already pushed some container images to GitHub, so we will use those for signing. We just need to pass the key we just created to Cosign, along with the name of the image, and enter the password. And that's it: our image is signed. As you can see here, the signature was pushed to our container registry, so let's see that in GitHub. Okay, so this is the image we signed, and this is the tag. Actually, we are not signing a tag; Cosign always signs the manifest digest. We can see here the digest, which matches the name of the signature that was pushed just now. The signature lives alongside the image in the same OCI registry. Okay, so that's how you sign. Now let's see how we can verify. To verify, same as before, we need to provide the key, but this time it's the public key, together with the name of the container image. And that's it: it was successfully verified. We see here information about our image, and we didn't see any error, which means the verification succeeded. Okay, so this is how you sign with a key pair. But remember, you need to handle the keys: you need to store your private key and the password securely, and you need to distribute the public key for verification. If you don't have the public key, you cannot verify the signature.
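A minimal sketch of this key-pair flow. The image name is a placeholder, and the cosign invocations are printed rather than executed here, since running them for real needs cosign installed and push access to a registry:

```shell
#!/bin/sh
# Hypothetical image reference; substitute your own.
IMAGE="ghcr.io/example/demo-app:v0.1.0"

# The three steps of the key-pair workflow, as the commands you would run:
echo "cosign generate-key-pair"               # writes cosign.key + cosign.pub, asks for a password
echo "cosign sign --key cosign.key $IMAGE"    # prompts for the password, pushes the signature
echo "cosign verify --key cosign.pub $IMAGE"  # succeeds only if the signature matches
```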
So before we go into the keyless mode, let me introduce two more tools that come with Sigstore. The first one is Fulcio. Fulcio is the root certificate authority that issues the certificates we will use for signing our artifacts. It issues short-lived certificates, valid for under 20 minutes, based on an OIDC email address. OIDC, or OpenID Connect, is an identity layer built on top of OAuth 2.0, and it allows third-party applications to verify the identity of an end user. The way it works is: we request a certificate from Fulcio, we authenticate with one of our OIDC providers, and we pass the OIDC token to Fulcio. Fulcio gives us a short-lived certificate, and it publishes the certificate to a certificate transparency log. There is a public instance available, operated by the Sigstore community, but it's still experimental at this moment. Okay, so now let's talk about Rekor, the transparency log. Rekor is an immutable ledger of metadata: you can append new records, but you cannot modify or delete any existing record. It provides a RESTful API for querying the data, so you can use it for verification. There's also the rekor-cli available, which you can use to query the transparency log. And, as with Fulcio, there is a public instance available, but it's still experimental; you can also run your own instance if you want, and that's totally fine. Okay, so let's put everything together and see how we can sign using the keyless mode. Keyless uses OIDC for identity, so that's it: we don't need keys, we just need authentication from an OIDC provider. That's all we need to prove our identity. Then we use the short-lived certificate issued by Fulcio for the actual signing, and we push this certificate alongside the signature to the Rekor transparency log. And that's how we verify: we check the Rekor transparency log, together with the timestamp, and then we know whether we can trust a signature or not.
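As a sketch of how you might query the public Rekor instance, two equivalent lookups are shown below. The log index is a made-up placeholder, and the commands are printed rather than executed, since they need rekor-cli and network access:

```shell
#!/bin/sh
# Hypothetical log index returned by a previous signing operation.
LOG_INDEX=1234567

# Look up a single entry by index with rekor-cli:
echo "rekor-cli get --log-index $LOG_INDEX"
# Or query the RESTful API of the public instance directly:
echo "curl https://rekor.sigstore.dev/api/v1/log/entries?logIndex=$LOG_INDEX"
```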
It also has support for automated environments, such as CI/CD pipelines, and we will see an example of that later. I know this keyless mode can be a bit confusing at the beginning, so let me explain with this diagram, which hopefully will help you understand better how it works. Let's start with the developer's workflow. We want to sign an artifact. The first thing we do is request a certificate from Fulcio, and for that we need to provide an OIDC token, so we have to authenticate with an OIDC provider. We authenticate, we get back a token, and then we pass this token to Fulcio. Fulcio verifies our identity based on this OIDC token, issues a certificate that we can use for signing, gives us back this certificate, and also publishes it to the Rekor transparency log. So now we're back here: we have our certificate, we sign the artifact, and we publish the signed artifact. And that's it: our artifact is signed. Now let's see the other user story. We are an end user, and we want to verify the signature. So we're here, as the end user. The first thing we need to do is find and download the artifact from the registry, and then we query the Rekor transparency log. We check the signature, we check the certificate, we check the timestamp, and we check that the signing party is present in the log. With that information, we can successfully verify the signature. That's it: we don't need anything else. You don't need a private key for signing, and you don't need a public key for verifying. And for monitors, it's even easier. Everything is stored in the Rekor transparency log, and remember, we mentioned that Rekor is an immutable ledger of metadata: no one can modify or delete any existing record. Everything is there and cannot be modified, so it's perfect for monitors to just query the log.
It also has a RESTful API and a CLI, so you can easily query the transparency log. Okay, so this is how the keyless workflow works. Let's see an example of how we can sign with keyless. Same as before, let's sign the same container image, but now the other version, v0.2.0. We need to pass the COSIGN_EXPERIMENTAL environment variable, because at this moment keyless is still experimental. Okay. Now my browser opened, and it's asking me to log in using one of these OIDC providers. For this demo I will use GitHub, so I click on GitHub. And that's it: the authentication was successful, and we can close this. If we go back, Cosign successfully verified the token and created a transparency log entry with this index. We could use this index to query Rekor using rekor-cli, and we would see the entry there. It has also pushed the signature to the registry, so let's check that. Same as before, the signature is here. This was the first example, with the key pair, and now we have the keyless one, which is this tag. This is the digest, and this is the name; as you can see, they match. So this is how you sign with keyless, and as you have seen, there are no keys: I don't have to store any key pair, any key, any password. Now let's see how we can verify the image. Again, we don't need anything: no public key, nothing at all. What Cosign is doing here is checking the Rekor transparency log for the signature and for the certificate. It says the verification was successful, and these were the issuer and the subject that came from the OIDC provider. So here we can say: okay, GitHub was the provider, and this is the email address. I trust this email address, so I can trust this signature. That's how you verify using the keyless workflow. Okay, this is great, but it still requires human interaction.
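The keyless demo boils down to two commands. A sketch with a placeholder image name, printed rather than executed since it needs cosign, a browser, and registry access:

```shell
#!/bin/sh
IMAGE="ghcr.io/example/demo-app:v0.2.0"   # hypothetical image reference

# Keyless signing and verification, as the commands you would run
# (at the time of the talk both still need the experimental flag):
echo "COSIGN_EXPERIMENTAL=1 cosign sign $IMAGE"    # opens a browser for the OIDC login
echo "COSIGN_EXPERIMENTAL=1 cosign verify $IMAGE"  # prints issuer and subject on success
```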
As you can see, I typed cosign sign with the experimental flag in my terminal, and it opened a browser where I had to manually log in, which is not great for a real environment, because I'm pretty sure you're all using CI/CD pipelines. Cosign has support for that: you can directly pass the token to Cosign using the --identity-token flag. And if you're using GitHub Actions or Google Cloud, there is automatic discovery, which means that if you're running inside one of these environments, Cosign is smart enough to detect it; it will fetch the OIDC token for you and pass it to Fulcio. So everything happens transparently to the end user. Let's see how we can do that with GitHub Actions. Let me go back to the example project I created. This is the workflow: it says that every time you push a tag that starts with a v, this job runs. We're giving it permission to publish a package, which is important because we need to publish the signature and the container image as well. And the id-token permission is also very important: this is the permission for the OIDC token, because we need to fetch an OIDC token from GitHub. Then there's our standard login to the GitHub registry, we retrieve the tag name, and we build the image. Here is where Sigstore comes in: we install Cosign, and then we sign. As you can see here, we just need to set the experimental environment variable, and that's it: sign. There are no keys again, nothing at all; there's not even a call to the OIDC provider. That's because Cosign detects that we are running inside a GitHub Actions workflow, and it fetches the OIDC token for us. Everything happens automatically, which is really cool; there's nothing to be done. So let's see how this ran. Let's open this workflow run and go to the logs. Okay, this is the installer step, and this is the signature.
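A minimal sketch of such a workflow. The image naming, action versions, and job layout are illustrative assumptions, not the exact workflow from the demo; the script just writes the manifest so its shape can be inspected:

```shell
#!/bin/sh
# Write an illustrative GitHub Actions workflow for keyless signing.
cat > sign-workflow.yaml <<'EOF'
name: build-and-sign
on:
  push:
    tags: ["v*"]        # run on every tag that starts with a v
permissions:
  packages: write        # push the image and its signature
  id-token: write        # fetch the OIDC token from GitHub
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: docker/login-action@v2
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - run: docker build -t "$IMG" . && docker push "$IMG"
        env:
          IMG: ghcr.io/${{ github.repository }}:${{ github.ref_name }}
      - uses: sigstore/cosign-installer@v2
      - run: cosign sign "$IMG"
        env:
          COSIGN_EXPERIMENTAL: "1"
          IMG: ghcr.io/${{ github.repository }}:${{ github.ref_name }}
EOF
echo "wrote sign-workflow.yaml"
```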
Okay, same as before: it created a transparency log entry. Again, we could query the Rekor transparency log if we wanted to, but cosign verify will do this for us, and it pushed the signature again to the same registry. So that's it: that's how you integrate with GitHub Actions in a CI pipeline. It's pretty cool; everything happens automatically. So let's recap. We have learned how to sign container images with Cosign. You can use the key-pair workflow, but again, you need to handle the private key, you need to keep the password secure, and you need to distribute the public key for verification. Or you can use the keyless mode, as we have seen, which is still experimental, but hopefully won't be for much longer. Keyless is really cool because you don't need to handle any keys, and you don't need to distribute public keys for verification. Then you can verify container images. We have seen how you can verify images with the CLI, but of course you're not going to verify images manually: you want to do that inside the Kubernetes cluster, just before you deploy a pod. So how can you do that? You can do it with dynamic admission control, and for that I will hand over to Víctor, who will explain how dynamic admission control works. Many thanks, Raúl. So, as my colleague Raúl was saying, let's see how dynamic admission control works in Kubernetes. In this slide, we have a diagram of a Kubernetes cluster. On the left side, we have our happy user, or maybe the tooling that is hitting the cluster: CI/CD, things like that. In the middle, the big blue box is the API server. We can see that it has four distinct phases in this case. The first one is authentication and authorization, where we check that users are who they say they are, and then whether they are authorized to perform the request they want to perform; say, for example, the user wants to create a pod.
In the second phase, we have mutating admission, where Kubernetes exposes some webhooks that we can connect to and run our own logic. In this case, we may want to mutate the request: say we get a pod and we want to change what's inside of it. We do it, and then we go to the next phase, which is schema validation, where we validate that the request, which is JSON, matches the schema; in this case, that it matches the spec of a pod, for example. If that goes well, we go to the next and last phase, which is validating admission. Here, we check for whatever we want, because again we can run our own logic by connecting to those webhooks, and then we give a binary response: either we validate or we don't. If everything goes fine, we validate, and the object gets persisted to etcd, where it will get picked up by the reconcilers of the cluster, and things will get created or deleted, or whatever we want. Let's see it with a specific example. Let's say that we want to change all pods that go into the cluster so they contain an annotation, and that annotation contains the string "prod". Then, as a user, we just do a kubectl apply of a pod; that's YAML, but it's just JSON in the request. We go through authentication and authorization, to see if we are authenticated and if we are authorized by RBAC to create that pod. Perfect. Then we go to mutation. Here we run our own logic, and we just add the "prod" annotation into the JSON of the pod. Pretty simple. Afterwards, we go to schema validation, where we get the JSON of the pod with our "prod" annotation inside, and we check that the JSON matches the pod spec. Perfect. Then we go to validating admission, where we run our own logic again: we check if the pod contains the "prod" annotation. In this case it does, because we added it in the mutating phase, so we validate. Perfect.
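Under the hood, the mutating step answers the API server with an AdmissionReview response carrying a base64-encoded JSONPatch. A minimal sketch of what that patch could look like for the "prod" annotation; the annotation key is made up for illustration:

```shell
#!/bin/sh
# JSONPatch that adds the annotation; the key "example.com/env" is hypothetical.
PATCH='[{"op":"add","path":"/metadata/annotations","value":{"example.com/env":"prod"}}]'

# The webhook returns it base64-encoded inside the AdmissionReview response:
ENCODED=$(printf '%s' "$PATCH" | base64 | tr -d '\n')
echo "{\"response\":{\"allowed\":true,\"patchType\":\"JSONPatch\",\"patch\":\"$ENCODED\"}}"
```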
It goes to etcd, gets persisted, and gets reconciled and created by the rest of the cluster. Perfect. So with this, we can use this mechanism to change and validate things happening in the cluster, and with that, we can use a policy engine to secure our cluster with Sigstore and so on. How does that work? Let's see a specific example. In this case, we have selected Kubewarden, which is a CNCF project. It's a policy engine that my colleagues and I are working on, so we are fairly used to it. It also has some properties that are great for this Sigstore use case; let's see which ones. As a normal policy engine, it monitors and enforces policies in the cluster, and validates or mutates requests. It's installed via Helm charts, and it provides three CRDs: the PolicyServer, which runs the policies, and then two kinds of policies, ClusterAdmissionPolicies, which are cluster-wide, and AdmissionPolicies, which are namespaced. The policy server is written in Rust, which for us meant that we needed to improve Sigstore's Rust support: there was no Sigstore Rust crate, so we created one and donated it to Sigstore upstream, and it has taken on a life of its own; we are super happy with that. One thing that is important and relevant for us is that policies are Wasm modules: any language that can compile to Wasm can be used to write a policy and run it in the cluster. Why is that important? Let's see what the benefits of WebAssembly are for us. For those who don't know WebAssembly: it's a binary instruction format that provides a common target architecture for several languages, and it's very small. Once compiled, the binaries, in our case, are going to be super small: policies are one, two, three megabytes big, which is quite small, and that makes pulling and pushing policies fast.
It's polyglot, so every language that compiles to Wasm works for us, and more languages are added every day: we have, for example, Rust, Go, Swift, and TypeScript, and Rego policies for OPA can also be compiled to Wasm, which allows us to run policies already written for other policy engines. That's great. Wasm is also secure: it runs in an isolated sandbox runtime, so the code cannot escape the sandbox without going through the defined APIs. This comes from the web world, because that's where Wasm was created, so these APIs are well defined and pretty secure. It has memory safety, control flow integrity, protected call stacks, a lot of nice things: no buffer overflows, so it's pretty secure. We interface with the host through a POSIX-like interface; in this case we use WASI, which is a POSIX-like system interface providing all the things needed by the policy, and it's quite portable: you just need to compile once and run it wherever you want. All of this is nice because it kind of removes layers: you compile once, and there's a lot you can take out of the stack. So if at any moment you want to simplify your stack in general, Wasm is something to keep an eye on. With all that said, what are the benefits for us in the policy world? Well, maybe you are already used to writing things in Go, and you know your Go tooling and so on. Then you can keep using it for writing your policies: you choose Go, you use your favorite libraries, your favorite tooling, linters and so on. You don't need to change how you work: you use Git, you use the CI/CD that you already have, which is great. And apart from that, Wasm modules have first-class support in OCI registries.
Wasm modules live in OCI registries at the same level as container images or Helm charts, with first-class support, which is great, as we will see. And thanks to this, we can run the policies outside of the cluster: we can iterate on the policies and on the signatures outside of the cluster, and once we are done, we go to the cluster. Okay, back to Sigstore. How do we sign and verify a policy? Well, we said that policies, being Wasm modules, have first-class support in OCI registries, the same as container images. My colleague Raúl has explained how to sign and verify container images with Cosign, and the same is true for Kubewarden policies: we just do cosign sign and cosign verify. We can see here an example of cosign verify; it's a keyless one, which is pretty simple. Of course, you can do it locally, and we also provide the tooling in kwctl, which allows us, for example, to verify the signatures of a policy locally, run the policy locally, or pull it locally, which is pretty nice. Okay, where do we find policies? You can find them in the hub that Kubewarden has, which is hub.kubewarden.io, where you can also see whether the policies are signed via Sigstore or not. And Kubewarden is a CNCF project, which means that we are present on artifacthub.io. So if you go to artifacthub.io and look for the Kubewarden policies kind, all the policies that have been submitted there will be listed. Those policies that have been signed with Sigstore have a little tick there, as you can see in the screenshot: "signed". So that's great; many thanks to Artifact Hub for this feature. Okay, so now we know about Kubewarden and policies, and we know that they are Wasm modules. How do we secure the cluster with Sigstore and Kubewarden policies? We have to do two things.
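A sketch of that local kwctl workflow. The policy URL is illustrative, and the exact flags are assumptions from memory of the tool; the commands are printed rather than executed, since they need kwctl and network access:

```shell
#!/bin/sh
# Hypothetical policy reference on an OCI registry.
POLICY="registry://ghcr.io/kubewarden/policies/verify-image-signatures:v0.1.0"

echo "kwctl pull $POLICY"                                # fetch the Wasm module locally
echo "kwctl verify --github-owner kubewarden $POLICY"    # check its Sigstore signature
echo "kwctl run --settings-path settings.json --request-path request.json $POLICY"  # evaluate locally
```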
First, we have to trust the policies, and once we trust the policies, we deploy one policy that secures the cluster. How do we ensure that all policies are trusted, then? We configure the policy server with the settings that we want. Here we have an example: the PolicyServer has a verification config field in its spec, and it needs to point to a ConfigMap that contains the verification config. You can obtain a default verification config by using kwctl scaffold, as you can see here, and it's pretty simple. You can see that it has allOf and anyOf in the config. allOf is an array of all the signatures that need to be present, and in this case we only have the one for Kubewarden: since we're talking about policies, we want everything to be signed via GitHub Actions workflows, with the owner kubewarden and the repo kubewarden. That's it. Why are we not listing the issuer and subject here? Well, this is part of the best practices one should follow. The thing is that the OIDC certificates that GitHub issues for workflows provide an issuer and a subject, but there could be a problem: if you use reusable workflows, an attacker could reuse your workflow and pretend that the artifact was signed by you. For that reason, by design, GitHub provides another extension in the certificate that lists which specific job was run for signing, and that's the one we also need to look at. We have implemented this in Kubewarden, so we are looking at that specific X.509 certificate extension, checking exactly what we need to check. So if we configure the policy server with this, we will require all the policies to be signed, with that signature performed in GitHub Actions, from a job that ran inside the kubewarden organization. It's an array, so you could add your own entries: maybe you have your own policies, maybe you are re-signing policies, maybe they come from your own CI, and so on.
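An illustrative verification config with that shape. The exact field names are assumptions sketched from the talk; `kwctl scaffold verification-config` prints the authoritative default. The script just writes the file so its structure can be inspected:

```shell
#!/bin/sh
# Write an illustrative Kubewarden verification config (field names may differ
# from the real scaffold output; treat this as a sketch).
cat > verification-config.yaml <<'EOF'
apiVersion: v1
allOf:
  - kind: githubAction
    owner: kubewarden   # signatures must come from a workflow in this org
    repo: ""            # optionally pin a specific repo
anyOf: null
EOF
echo "wrote verification-config.yaml"
```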
You would just fill in what you need here. Once we've done that, we've ensured that the policies we deploy are trusted. Now we need to ensure that all container images in the cluster are trusted. How do we achieve that? With a policy. In this case, we are using the verify-image-signatures policy from Artifact Hub. This is a policy that we, the Kubewarden team, have written; you could write your own, since we provide SDKs for several languages, but if you want to use this one, you can just go with it. It is written in Rust: about 200 lines of Rust that inspect the JSON a bit, check that things are there, and then pull the signatures and check whether they are signed by the people you want them to be signed by. That's it: 200 lines of Rust, 400 lines of unit tests, and a bit more for end-to-end tests. Not much. How do we deploy this policy in the cluster? By instantiating a ClusterAdmissionPolicy. We can see one here: a ClusterAdmissionPolicy where we set spec.module to the Wasm module of this verify-image-signatures policy, and then we have the rules. In the rules, we can see that this policy is going to run every time a pod gets created or updated, so every time a request tries to change the spec of a pod. Then you can see that it says mutating: true; we are going to see why that is important. And then we have the settings of this policy, which are pretty similar to before. Here we are checking that all images that come from the GitHub container registry goreleaser namespace are signed via a GitHub Actions workflow run from inside the goreleaser organization, and also that all the images that come from the GitHub container registry kubewarden namespace are signed via a GitHub Actions workflow run inside the kubewarden organization.
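A sketch of such a ClusterAdmissionPolicy. The module URL, version, and settings keys are assumptions reconstructed from the talk, not the exact manifest shown on the slide; the script writes the file so its shape can be inspected:

```shell
#!/bin/sh
# Write an illustrative ClusterAdmissionPolicy manifest (a sketch, not the
# exact one from the demo).
cat > verify-image-signatures-policy.yaml <<'EOF'
apiVersion: policies.kubewarden.io/v1
kind: ClusterAdmissionPolicy
metadata:
  name: verify-image-signatures
spec:
  module: registry://ghcr.io/kubewarden/policies/verify-image-signatures:v0.1.0
  mutating: true            # needed to append the digest to the image tag
  rules:
    - apiGroups: [""]
      apiVersions: ["v1"]
      resources: ["pods"]
      operations: ["CREATE", "UPDATE"]
  settings:
    signatures:
      - image: "ghcr.io/goreleaser/*"
        githubActions:
          owner: goreleaser
      - image: "ghcr.io/kubewarden/*"
        githubActions:
          owner: kubewarden
EOF
echo "wrote verify-image-signatures-policy.yaml"
# Applying needs a cluster with Kubewarden installed:
#   kubectl apply -f verify-image-signatures-policy.yaml
```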
Pretty simple. So how does the policy work with these settings? Let's see. We have the JSON request on the left, which is for a pod, and then the policy logic runs. The policy checks all the containers inside the pod: containers, init containers, ephemeral containers. For each of those, it checks if the image is signed, and if it is, whether the signature matches what we have listed as trusted; if not, we reject the pod. If the image is signed, and signed by the people we want, we can approve. Perfect. But wait: normally, in the pod we reference the container image by URL and tag, and tags are mutable. Maybe we have goreleaser 1.0, and maybe it's signed, but someone could just push another goreleaser 1.0, overwriting it, because the tag is mutable. So that's not enough, because an attacker could change in the future what the tag points to. What we should do is mutate the pod and append the checksum digest of the image after the tag. That way, we are pinning the tag to exactly that image digest, and nothing can be changed. Then we can approve, and that's it. To apply the policy, we just do a kubectl apply of the policy we have seen before, and it's going to take a bit, maybe a minute. Then we can see, for example with kubectl get clusteradmissionpolicies, that the status is active, at the end on the right. With that, the policy is enforcing and protecting the cluster. You could also use the condition of the custom resource: in this case, we would wait for the PolicyActive condition, and once it's met, we know the policy is enforcing. Okay, let's see how that looks. Let's try to instantiate a pod with an untrusted image.
Maybe this pod's image is signed, maybe it's not, or maybe it's signed by somebody we don't trust. Here we have a pod, an nginx pod, whose image is not signed, and we can see that when we try to instantiate it, it doesn't work. We see that nginx is not accepted: verification of image nginx failed, no signatures found for image. Perfect, that's what we wanted. What happens if we try to instantiate a pod with trusted images? In this case, GoReleaser. GoReleaser is signed via a GitHub Actions workflow, and this particular version is signed, so we just run it, the policy checks it, perfect, and we can see that it has mutated the pod. Once it has been instantiated, if we do kubectl get pods for the GoReleaser pod and we look at the container image, it not only has the tag, but also the checksum digest appended at the end. Perfect, that's what we wanted. Okay: with this, making sure that the policies are signed and trusted, and that every workload resource in the cluster, pods, jobs, replicasets and everything, runs only trusted container images, we are done. What's next? My colleague Raúl and I have talked about how to secure the cluster with Sigstore, but we have mainly focused on our own workload resources. Normally you depend on other things as well, and we have not talked about those. How do you solve that problem? That is solved using software bills of materials, where we list the dependencies of everything, and for those dependencies you would run checks similar to the ones we have run. But this topic is a bit complex, it falls outside of this specific talk, and we look forward to talking about it in the future. Talking about dependencies: everybody depends on something, so if you take one thing from this talk, it is to sign and verify everything.
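The whole demo, as the commands you would run. They are printed rather than executed, since they need a cluster with Kubewarden and the policy deployed; the manifest filename, condition name, and image versions are assumptions:

```shell
#!/bin/sh
POLICY_NAME="verify-image-signatures"   # name used in the sketched manifest

# Apply the policy and wait until it is enforcing:
echo "kubectl apply -f verify-image-signatures-policy.yaml"
echo "kubectl wait --for=condition=PolicyActive clusteradmissionpolicy/$POLICY_NAME"

# An unsigned image is rejected by the admission webhook:
echo "kubectl run nginx --image=nginx    # rejected: no signatures found for image"
# A trusted, signed image is admitted, with the digest pinned by mutation:
echo "kubectl run goreleaser --image=ghcr.io/goreleaser/goreleaser:v1.9.2"
```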
If you can, sign and verify everything: this is a community effort, and we should all be able to sign and verify everything that we create and use. Specifically for Kubewarden, what's next for us in the Sigstore area is, among other things, expanding the CI integration. We have shown how one can use public keys and how one can use GitHub Actions for signing and verifying, but what happens with other CI providers? For those, one needs specific best practices for their certificates, and we look forward to adding those to Kubewarden. Speaking of Kubewarden, or if you want to talk to us about Sigstore in general, please come find us in the Kubewarden Slack workspace, or reach out to us in the kubewarden GitHub organization. If you are interested in seeing the Kubewarden policies, go to kubewarden.io or Artifact Hub and have a look, and don't forget to talk to us on Twitter if there's something you would be interested in. With that said, many thanks for your time. We hope that you have learned something and enjoyed the talk, and see you around!