So yeah, I'm going to be talking about securing your supply chain by building with OpenSSF FRSCA. All right. I'm Mike Lieberman, just a few facts about myself. I'm a co-founder and CTO at Kusari, a supply chain security startup. My main focus is as an architect around cyber security and cloud native security, but most recently supply chain security. I am a SLSA Steering Committee member, for folks who are familiar with SLSA, the supply chain security framework. I am a Cloud Native Computing Foundation Technical Advisory Group Security lead, and I recently co-led the CNCF Secure Software Factory reference architecture. So let's first talk a little bit about the threats that we're trying to protect against with a tool like FRSCA. This diagram comes from the SLSA website, and the main focus of a tool like FRSCA, which is a build-related tool, is mostly between C and F up there. So what are we worried about? We're worried about things like what happened to SolarWinds. We're worried about pulling in bad dependencies and all those sorts of things. Some of the big issues we're trying to solve are: one, are you pulling from the source code that you're supposed to be? Did somebody intercept the source code and inject something else? Is the build itself compromised? Like we saw with SolarWinds, one of the issues was the builder itself got compromised and was signing stuff with SolarWinds' key. There's also concern about pulling in bad dependencies, as well as, when you go to publish a package, is the package you're publishing to your package repository actually the package you built? One of the other common attack vectors is: sure, you're publishing built packages, but somebody else who has access to your package registry is also pushing out software.
And because you're not, let's say, verifying signatures on that software, it gets compromised there. So there are a couple of ways we can start to protect against this. When we look at C, we want to make sure that the only pipelines running are pipelines that have been approved. For folks who might be running tools like Jenkins or GitHub Actions, one of the common questions we get asked is: how do I know whether a developer, or somebody who's not supposed to be able to modify a pipeline, modified a pipeline? Did somebody push a new pipeline that wasn't approved, those sorts of things? When we look at the build process, there are certain things we can do to verify that the build is operating securely. One is: can we cryptographically verify everything that's running inside of the build process? And within that build process, what's actually happening at the execution layer? Are we looking at runtime, at what's actually running at build time? And then when we go to upload a package, are we sure that what got built did not get changed while it was being published? So really, what we need to do is establish provenance from where you get the source to where you publish the package. That's where the Secure Software Factory came in originally, which, once again, is part of the Cloud Native Computing Foundation. The Secure Software Factory was a project intended to develop a tool-agnostic approach to how things should be built in a cloud native way, done securely with supply chain security in mind. This is what that looks like. I'm not going to get too deep into the architecture; there are a couple of talks I've already given, and I'll have links at the end of this talk.
But to briefly highlight the core components: as you can see to the right, there's a thing called Pipeline Framework and Tooling. First things first, you want to make sure that when you orchestrate builds, you're only orchestrating pipelines that meet your governance requirements and your policy requirements. You don't want just any developer to say, "hey, it's great that you want me to do a security scan; my pipeline doesn't have that security scan." So you want to make sure that the only pipelines allowed to run are ones that match policy, match governance, and are approved. That's maintained by the tooling, as well as an admission controller, in this case inside of Kubernetes itself: we have an admission controller to make sure that only pipelines that are approved and match our policy are allowed to run. That helps with the orchestration piece. But what about when something actually gets spun up? Once again, this is cloud native, so you can imagine this is going to be Kubernetes. When actually running the workloads, you want to make sure the workloads themselves are not modified while running. As you can imagine, if somebody has admin or superuser access to your Kubernetes cluster, they can modify pods as they're executing; they can say, "you're running on this container image, actually run on this other container image." That's where workload and node attestors come in. They essentially enforce that if something were to change, if you told it to run using build image X and now it's using build image Y, it would no longer match, the attestors would block it from continuing to run, and you would no longer get a signed build. In this case, that's using SPIFFE/SPIRE, and I'll get into that a little bit later.
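To make that admission-control piece concrete, here's a rough sketch of what such a policy could look like, written as a Kyverno-style ClusterPolicy (Kyverno is the admission controller the project uses, as covered later; the policy and pipeline names below are hypothetical), expressed in CUE since that's the data language used throughout the framework:

```cue
// Hypothetical sketch: a Kyverno ClusterPolicy that only admits Tekton
// PipelineRuns referencing a pre-approved Pipeline. Names are illustrative.
approvedPipelinePolicy: {
	apiVersion: "kyverno.io/v1"
	kind:       "ClusterPolicy"
	metadata: name: "approved-pipelines-only"
	spec: {
		validationFailureAction: "Enforce"
		rules: [{
			name: "require-approved-pipeline"
			match: any: [{resources: kinds: ["PipelineRun"]}]
			validate: {
				message: "Only pipelines matching governance policy may run."
				// Reject any PipelineRun that does not reference the
				// approved Pipeline resource.
				pattern: spec: pipelineRef: name: "approved-build-pipeline"
			}
		}]
	}
}
```

Exporting this CUE yields the YAML the cluster actually consumes, which is exactly the generation pattern the rest of the talk builds on.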
You also don't want the build environments themselves to be reporting their own output, because the build environment can lie to you. You want something that exists in a separate security boundary that is looking into the build environment, observing it, and reporting the results, as opposed to just trusting what comes out of the build environment itself. And then finally: we've looked at the pipeline framework, which secures the orchestration of the builds, and at the workload and node attestors, which attest that only approved workloads are allowed to run on approved hardware. How do we make sure that a container that is running a build isn't being tampered with while it's running? That could be a malicious build image itself, malicious code running inside of the build image, or somebody trying to exec into the container and mess with what's going on in there. That's where runtime visibility comes into play, and you can imagine this is eBPF; that helps out there as well. Once again, it's not about any one of these tools; it's about a combination of these tools, and also integrating with other things like in-toto and other processes to essentially verify that what you intended to run is actually what's getting run. The thing that's highlighted in red there is what we're going to be focused on for this talk. All right, so now that we know what the Secure Software Factory is, what is FRSCA? FRSCA is mostly two main things. The first is it's a set of secure tools and services configured into a secure build service.
That means, as we'll see in the next slide, it's a bunch of different components like SPIFFE/SPIRE, Tekton and Tekton Chains, and a bunch of other tools configured securely. That's one big piece, but it's only one piece of a shared responsibility. It's not just about running the build service securely; you also want to make sure the builds you're running are secure. Are they running the tasks you expect them to, and so on? That's where the second piece comes in, which is going to be the highlight of the latter part of this talk: abstractions to actually run builds securely. How do you make sure that somebody can't just say, "yeah, we know we're required to have an SBOM, a software bill of materials, but I'm not going to run that step"? How can we enforce that that step actually does get run? The key principles for FRSCA are: one, we want to make it as simple as humanly possible for the end user. We want to apply zero trust at every stage we can, so that if an individual component is compromised, it doesn't necessarily mean the whole service is compromised. We must run securely, but we want to be flexible otherwise: if you are required to have an SBOM, but we don't necessarily care whether it's CycloneDX or SPDX, great, you should be allowed to use what you want to use. And the other thing is automate as much as we can; we want this to be simple and straightforward so that folks don't have a huge maintenance burden in managing all this. A few extra facts on FRSCA: FRSCA is an acronym that stands for the Factory for Repeatable Secure Creation of Artifacts, inspired a little bit by SLSA; salsa fresca is a type of salsa. It's an implementation of the Secure Software Factory reference architecture.
We're taking a holistic approach to meeting SLSA and these other security requirements. And just as a reminder, FRSCA is an OpenSSF project underneath the Supply Chain Integrity Working Group. The idea behind FRSCA is that there's a tamper-evident seal on every piece of your workflow, so if something does get compromised, you have evidence that it was compromised. In addition to that, we are building out a pipeline framework that abstracts security and policy away from the developer, so they don't have to worry about checking the box themselves, asking "is my pipeline correct?" Instead, your pipeline should automatically get generated based on just what you need, and all of the governance and policy should be generated for you. I'm going to briefly talk about how FRSCA is set up at an architectural level, but I'm not going to get too deep into it, since there have been a couple of other talks on this. FRSCA runs inside of Kubernetes, and it's an opinionated set of tools: tools like Tekton and Tekton Chains, which you can see at the top there with triggers, pipeline runs, and task runs. There is an admission controller; in this case we're using Kyverno, but we plan to make that pluggable, so if you wanted to use OPA or something else, you'd be able to plug in there. We also use SPIFFE/SPIRE to enforce workload identities as well as node identities, so you can enforce: yes, is this build happening on software that has been registered, or did somebody somehow get into our Kubernetes cluster and add a malicious node? If somebody tries to mess with a pipeline as it's running and tries to change out elements of the pod, it would get detected by SPIRE, and the build would not get signed.
We're using HashiCorp Vault along with the transit secrets engine to do a lot of the signing, and also to enforce that your individual jobs and individual pipelines can get access to the secrets they need, and only the secrets they need, based on their SPIFFE/SPIRE workload identities. We're using runtime observability: right now we're using Tetragon to look at what's happening inside the containers as they're running, and if we detect something nefarious, like a build all of a sudden trying to reach out to malware.com and download something suspicious, we'll be able to detect that. The idea is that all of this generates metadata, like SLSA attestations, signatures, and SBOMs, that all gets signed; then, when you go to production, you can use admission controllers that essentially verify: did it go through all the right steps? Assuming it has, the admission controller should allow it through. Once again, the area we're going to focus on here is mostly what's highlighted there in red. At this point I probably need to make it a little bit clearer. The stuff that's in blue is mostly the build itself: Tekton and Tekton Chains. The stuff that's in purple is supporting tools and supporting infrastructure, like the admission controller essentially enforcing which pipelines are allowed to run, or which images are allowed to run inside of the build. And the stuff that's in red is supporting infrastructure that's external to the actual build cluster. You wouldn't want to run SPIRE inside of the same cluster that you're running everything in.
You would want to have it in a separate trust domain, and that would have access inside there. Okay, so as I alluded to before, the primary focus of what I'm going to talk about here is the pipeline framework and tooling. Why is that needed? Why do we want an abstraction on top of something like Tekton and Tekton Chains and all these things? One of the key reasons is that there's a shared responsibility here. From our project's perspective, FRSCA is saying: our responsibility as a build service is to provide you a set of secure tools and services that are spun up secure by design, with least privilege and all that good stuff. But when you actually run your jobs on top of that, we can't determine exactly what you're going to run; what we can do is provide you with a set of tools that make it super easy to operate on this secure build service. Another reason is that everybody's going to have different needs. Very large enterprises, and I've worked at multiple large banks, including Citi and MUFG, might be running tens of thousands of different pipelines, and they're going to want a whole hierarchy for their pipelines, with different departments wanting different restrictions on what's allowed to run. Whereas a startup might say, "no, we're relatively small, we want to be very flexible." Different people are going to have different needs. You also want defense in depth: you don't want to just rely on FRSCA as a build service to enforce everything for you. You also want to protect yourself from even potentially trying to deploy a bad pipeline.
And then the other thing is it just makes it very easy to manage multiple resources, as we'll see. When you're deploying Tekton, Tekton Chains, different Kyverno policies, different secrets and config maps and all those sorts of things, it can get very complicated very quickly, and telling the end user, or the developer responsible for the build, "hey, you have to manage all of these different Kubernetes resources yourselves," is kind of difficult. So the way we're doing this is with a language called CUE, and the thing we're doing here is we can have a single FRSCA configuration that produces multiple Kubernetes and custom resources. Okay, so now a little bit of information on what CUE is and why we use it, for folks who are maybe not familiar. CUE is a language for generating and validating data. It's similar to JSON and to GCL, if anybody's familiar with that; it was actually written by one of the folks who built GCL at Google. A key aspect is that it's not Turing complete, which means there's a lot more you can reason about without needing to worry about infinite loops and other nasty situations like that. The other thing that's really nice about it is that it natively supports importing Kubernetes resource definitions straight from the Go APIs, which allows you to validate things like "is this a valid ConfigMap?" without needing to write a parser or a schema validator yourself. As you can see in the center, in the example here, the municipality, you see something that looks like a schema: a normal struct.
Its name is a string, pop is an int, and capital is a Boolean. When you define this and use it in CUE, you can say: okay, great, the name is still a string, but the population of a large city is greater than five million. So now I've implemented a constraint, and CUE makes it very easy to define constraints. And capital defaults to true, or it could be any Boolean. Then when we look at something like Moscow: it's a large city, its name is a string, in this case an actual string literal, and the population is 11.92 million. Okay, great, that's greater than five million, so yes, it is a large city. If for some reason this was less than five million, this would fail and say, "no, that doesn't match." And when it actually gets generated, because of the asterisk before true on the right-hand side, CUE's default marker, it defaults to being a capital. So now let me talk a little bit more about where those shared responsibilities come in, and why this is important. I realize that some of the blue and purple are blending together; I apologize. The stuff that's in blue, like the scripted build as well as the parameterless build, are responsibilities that are purely on the end user, the developer, the person responsible for the build. No build service is going to tell you how to script your own build; we can provide some hints and so on, but it's up to you to say, "yep, I'm running Go, or I'm running ko, or I'm running buildpacks," or whatever. The stuff that's in red here are responsibilities of the build service itself: things like ephemeral and isolated environments.
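For reference, the slide's example corresponds closely to the one from the CUE documentation, which looks roughly like this:

```cue
// Schema-like struct: a name, a population count, and a capital flag.
municipality: {
	name:    string
	pop:     int
	capital: bool
}

// A large capital narrows the schema: population must exceed five million,
// and capital defaults to true (the "*" marks the default).
largeCapital: municipality & {
	pop:     >5M
	capital: bool | *true
}

// Concrete data unifies with the constraints: 11.92M satisfies >5M,
// and capital falls back to its default of true.
moscow: largeCapital & {
	name: "Moscow"
	pop:  11.92M
}
```

If `pop` were below five million, unification would fail with a conflict rather than silently producing invalid data, which is exactly the failure mode described above.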
The build service itself is saying: yeah, we are providing you with ephemeral environments, we're providing you with isolated environments, we are generating provenance via the service, and we are providing some guarantees that that provenance is non-falsifiable. Those are things that are the responsibility of the build service itself, in this case FRSCA. Then the stuff that's in purple is shared. There's build as code: FRSCA provides a mechanism for you to run build as code, so that's one piece of it, but it's up to you to actually write the build as code. And then for the provenance, there's "available" and "authenticated": it's up to FRSCA to actually generate this stuff, but it's up to you to provide the keys or secrets that you're using to sign it. Okay, so now I'm going to briefly talk about what this pipeline framework looks like. You can imagine there's going to be some stuff in green, which is what is being brought to you by FRSCA, with some basic configuration and all that good stuff. Then you have things like your organization's config, which maybe further constrains what you're getting out of FRSCA. So you might imagine FRSCA says you can use any build that you want that meets the basic secure requirements, but you might say, "well, my organization only runs Python and Go." So if somebody then tries to build a pipeline that is Rust: sorry, that's not allowed; we only support Python and Go at this organization. You could also enforce things like naming constraints and all that good stuff.
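A minimal sketch of how that org-level restriction might be expressed in CUE (the field and definition names here are hypothetical, not FRSCA's actual schema):

```cue
// FRSCA-level schema: any build type is allowed, as long as it's a string.
#Build: {
	buildType: string
	...
}

// Org-level config narrows the same field to an allowed set via disjunction.
#OrgBuild: #Build & {
	buildType: "python" | "go"
}

// This unifies cleanly:
ok: #OrgBuild & {buildType: "go"}

// Whereas {buildType: "rust"} would fail evaluation with a
// "conflicting values" error, since "rust" matches neither alternative.
```

Because the narrowing is plain unification, a department config could tighten the org's set further, but could never widen it back out, which is what makes the hierarchy safe.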
All of this makes it really easy to enforce a bunch of different constraints on how you're actually building your stuff. The same goes for security constraints that might be global to the org, or infrastructure constraints, where you might say, "hey, I don't want to run a million dollars worth of infrastructure for every build, so I have a constraint that says this is the only thing that's allowed to run." What it actually does here, and the stuff in orange once again is the user, is that all of that configuration pulls from a shared library; you have your organization config, whatever hierarchy you have set up, and you're maybe pulling in some additional vendored dependencies, say different tasks from the Tekton Task Catalog. The FRSCA tooling then takes all of that, generates a bunch of Kubernetes YAML out of it, and the output inside the actual FRSCA environment is stuff like Tekton tasks and Tekton pipelines, Kyverno policies, Kubernetes config maps, different security rules, monitoring, and secrets.
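The fan-out from one small config to many resources is just CUE evaluation followed by an export step; a simplified, hypothetical sketch:

```cue
// One user-supplied project config...
project: {
	name: "foo"
	repo: "https://example.com/foo.git"
}

// ...fans out into multiple Kubernetes resources derived from it.
configMap: {
	apiVersion: "v1"
	kind:       "ConfigMap"
	metadata: name: "\(project.name)-build-config"
	data: repo: project.repo
}

pipeline: {
	apiVersion: "tekton.dev/v1beta1"
	kind:       "Pipeline"
	metadata: name: "\(project.name)-pipeline"
	// tasks would be filled in from the shared library here
}
```

Running `cue export` over this emits the ConfigMap and Pipeline as YAML or JSON, so the user maintains one small file while the tooling owns everything derived from it.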
This is what an example layout might look like of how the code is all structured, and I'll show this in the demo in a second. You might have a baseline, and the baseline has some constraints and includes, say, Go and buildpacks; the org maybe pulls in some user-defined things; you might have different overrides where you say, "hey, I'm pulling in FRSCA's Go pipeline, but I'm adding some additional things to override it"; and maybe I'm ingesting the vendored Git task. There can also be helper libraries, like a hash validation function that can help you ask, "is this a valid hash?", that sort of thing. So now let me switch over to a demo real quick; I'll show what's actually happening, and then I'll get a little bit into the config. I just triggered a build, which creates a PR in Gitea. Oh, I realize this is probably hard to read. Is that better? Okay. So while this is running: I just executed a pipeline, and I'll show what that is right here. I apologize while it's running. This is the CUE for that pipeline. This pipeline is calling the FRSCA library's pipeline and filling in the blanks of what that pipeline is, and I'll show what's actually happening here in a second. Pretty much, it takes in some inputs, and the output is a whole bunch of different Tekton and Kubernetes resources. And if I show you what actually ran here, it pretty much just runs, hold on one second.
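In CUE terms, that layering is just package imports plus unification; a hypothetical sketch (the module path and definition name below are made up for illustration):

```cue
package org

// Hypothetical import of FRSCA's baseline library.
import frsca "example.com/frsca/baseline"

// Start from the upstream Go pipeline and tighten it: unification lets the
// org layer add constraints without copying the baseline definition.
goPipeline: frsca.#GoPipeline & {
	// e.g. require every task image to be pinned by digest
	tasks: [...{image: =~"@sha256:[a-f0-9]{64}$"}]
}
```

Any field the override adds is checked against everything the baseline already requires, so an override can never accidentally relax the baseline's guarantees.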
It runs this CUE command, which actually generates all of that Kubernetes YAML; right now I just generated a PR, which then pulls all of this stuff in. But let me first show off, since it's pretty much done now, what it actually did. I'm just going to use a couple of magic commands here to pull in the image URL; I'm pulling the information directly out of Tekton, and some of this will go away once we get a FRSCA CLI. If I go in and I run `crane ls` on that repo: here's the image, and you see that there's a .att and a .sig. Those are a SLSA attestation, which I'll show off in a second, an SBOM, and a signature. Now I can pull information about that from there, and I can essentially verify: was this signed by the right key, and all that good stuff? So yes, we were able to verify that it was signed by the right key. We can also check: is it valid SLSA provenance? Yes, and that verifies here; you can see the cosign claims were validated. The other thing I can do is show you the SLSA provenance that got generated. The actual attestation was base64 encoded; now that it's decoded, you can see the builder was Tekton Chains, the builder image was this, and so on. And I'll just briefly show you what the SBOM looks like real quick. This is what the SBOM looks like; I'm not going to get too deep into that, but it generated an SBOM. The thing that's nice about this is that because all of this happened inside of CUE, CUE actually generated a whole bunch of stuff, and if I go over here, I can show you what some of that looks like.
One of the things here is an example of what the configuration could be. To walk through this briefly: you might imagine that the images I generate should follow a naming scheme based on the organization I belong to, dash the department I belong to, dash the team, dash the project. If I actually run that, and show you behind the scenes what's happening, it generates all of this Kubernetes YAML. It generates a policy for verifying that image, and as you can see here, it actually shares all the variables, so the image that it wants to verify is going to be org X, department Y, team Z, dash foo, where foo is the name of the project. And the org can constrain stuff: I can say, "hey, this is the name of the org, this is what the image names should be," and these are the allowed builds: buildpacks and Go. Then there's some additional information in there about generating what that configuration looks like. If I look over here, the actual input for the end user is just a basic YAML: the name of their project, what type of build it is, their repository, and so on. The nice thing about this framework is it lets you be flexible: you can give folks access to all sorts of different pipelines, or you can give them access to just a handful of things. And if I go in and say, "hey, this is not buildpacks or Go," and I try to run this, you can see here it gives me an error and says, "wait, I'm expecting buildpacks or I'm expecting go."
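The naming scheme itself is easy to express with CUE string interpolation; roughly like this (all identifiers here are hypothetical):

```cue
// Org-level facts, set once in the org config.
org:        "orgx"
department: "depty"
team:       "teamz"

#Project: {
	name: string
	// Image names are derived rather than hand-written, so every generated
	// policy and pipeline automatically agrees on the same value.
	image: "\(org)-\(department)-\(team)-\(name)"
}

foo: #Project & {name: "foo"} // foo.image is "orgx-depty-teamz-foo"
```

Because `image` is computed, a user cannot supply a name that violates the org's scheme; supplying a conflicting `image` value would simply fail unification.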
And I recognize the error messages could be a bit nicer, but generally what this is saying is: "I expected buildpacks, I got python; I expected go, I got python; it's not one of the accepted values, I'm not going to run that." So now you have the ability to restrict what people are actually running in there and what they're allowed to run. And in addition to this, all of the configuration, the policy, the pipelines, the tasks, et cetera, are all generated automatically. And now to end this in a second: the next steps for us. From the community standpoint, once again, this is an OpenSSF community project underneath the Supply Chain Integrity Working Group; we're looking for more use cases, more collaboration, and contributions; it's an open project. From the CUE perspective, we're hoping to speed some of this up; once you have very complicated pipelines, it does get a little slow. We also want to make it simpler and easier to use. And then for FRSCA itself, we want to work with TUF and in-toto more, so that we can generate a lot of the TUF metadata automatically and all that good stuff. We want to build a FRSCA CLI tool, and we also want to take that framework I just showed you and turn it into something a lot more complete and secure by design. All right, and up here are some additional resources; there are a couple of QR codes with links to some other stuff: a link to a deeper dive into the actual architecture itself, the FRSCA deep dive, and the Secure Software Factory reference architecture paper over to the left there. And if anybody has any questions? So right now we allow for any sort of SBOM. It could be, what was that? Oh, sorry, the question was: what types of SBOMs can be generated by this tool?
So we allow for any sort of SBOM, but we also allow, through the pipeline framework, for you to restrict it. If you say, "we only generate SPDX, and we only generate SPDX through the Syft tool," great, you can actually restrict that in the library and do that. But the pipeline framework that we're trying to build should support pretty much any sort of tool you would want to use there. Yeah, so right now the example I showed was using Syft after the fact to scan the image, but there's nothing to prevent something else; the question was, yeah, that we could use whatever. Is there anybody else? "Thank you for the presentation. Is there any capability to extend the generated Tekton pipelines to insert users' custom tasks?" Yep, so the point of all of this is to allow for that, and to allow the individual organization, department, or team that might be using it to restrict it how they want to. If they want to allow developers to build whatever tasks they want, great, you can do that in this framework. If they want to say, "yes, you can use any task, but that task must include images that are pinned by hash," you can do that as well. You can say, "yes, we're allowing tasks, but the tasks must be pre-approved," or you can pick and choose. Pretty much anything that you'd want to be able to do in Tekton, you can do through this, while also sharing the data between them. You can essentially enforce that different teams have different restrictions on what tasks they're able to run or what tasks they're able to create. And in addition to that, you can say, "you can have any pipeline you want, as long as you have an SBOM generation task and as long as you have a security scanning task." Stuff like that you can do as well. Any other questions? "Thank you for your presentation."
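That "must include an SBOM generation task" rule can be sketched with CUE's `list` package (the task name `generate-sbom` and the definition names are hypothetical):

```cue
import "list"

#Task: {name: string, ...}

#CompliantPipeline: {
	tasks: [...#Task]
	// Hidden field: evaluation fails unless some task is the SBOM step,
	// because unifying false with true is a conflict.
	_requireSBOM: list.Contains([for t in tasks {t.name}], "generate-sbom") & true
}

good: #CompliantPipeline & {
	tasks: [{name: "git-clone"}, {name: "build"}, {name: "generate-sbom"}]
}
```

Dropping the `generate-sbom` task from `good` would make the whole evaluation fail, so a non-compliant pipeline never even reaches the cluster.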
"Anything else in terms of where you see this going from here? How does it evolve, and how does it bridge the gap between the OpenSSF and the CNCF organizations?" Yeah, so that's actually an interesting question. This tool is an amalgamation of tools that came from the CD Foundation, the CNCF, and the OpenSSF. In the spirit of collaboration, and Brian's over there, we're looking to work across the Linux Foundation to see where we can be more collaborative. And in fact, the thing we're trying to be is opinionated, because we recognize that, at least right now, you have to be opinionated to show how some of this stuff works. There are certain things where, yes, you might still be able to do similar sorts of things with, say, pure baseline Jenkins, but it might require a ton of configuration. We're trying to say, "hey, this is a right way to do it." It's not the only way, but it is a right way to do it. And we're hoping for two things: one, to make it more easily adopted, and second, to maybe help inspire people: "oh, I didn't realize that in order to think about these things, I need to think about SPIFFE/SPIRE and workload identities; okay, so for the next generation of my CI/CD tool, I need to implement that." Anything else?