Hey everyone. My name is Aditya. I'm a PhD student at New York University, and I'm going to be talking about in-toto, attestations, and more for software supply chain security. Whenever we talk about in-toto, we start with a quick primer on what software supply chain attacks are. The software supply chain, as I'm guessing most of you in this room are intimately familiar with given everything that's happened over the last couple of years, is everything, the people, the artifacts, and all of the different systems, that comes together to produce a piece of software. An attack is a compromise of any of the many things that came together in creating a software artifact, whether that's a human in the loop who was part of creating the software or a compromised build system, and the compromise is targeted at the consumer of whatever was being built. We've seen a 742% increase in these attacks; that number comes from Sonatype's latest State of the Software Supply Chain report. The number will probably go up, at which point we'll update these slides, probably by KubeCon North America, which is around when the next report comes out. We've seen a lot of these attacks over the years, and the in-toto team used to track them in what's called a catalog of software supply chain compromises, which we donated to the CNCF TAG Security supply chain security working group. It's really helpful, and I highly recommend you all check it out. It's been kept up to date by TAG Security over the years, and we used a lot of these incidents to drive the development of the in-toto framework. So yeah, definitely check this resource out.
And as a response to all of these software supply chain security incidents, we've seen many solutions emerge, which I think we can broadly categorize into three buckets. First, we've got projects focused on generating various types of evidence of what happened during your software supply chain's execution, whether that's provenance of a build step that you'd get with something like SLSA provenance generated from your CI system (I threw the npm logo up today because of npm's announcement yesterday that you can now generate provenance during npm builds), or other forms of evidence like your software bill of materials, whether it's CycloneDX or SPDX. The second category is the various tools and systems that make it possible to discover this information and to gain insights from the evidence that was gathered during the supply chain's execution. And finally, you've got tools and systems that help you create policies and verify them as your software artifacts are being consumed at some point. in-toto is the common link between all of these different categories of supply chain security solutions. You can express your different types of evidence as in-toto metadata in a consistent manner, so that it's easy to consume and easy to make sense of depending on what you're using it for. All of this evidence is then fed into information discovery systems: projects like Sigstore's Rekor can store in-toto attestations, GUAC uses in-toto metadata to gain insights about software artifacts, like what components are in an artifact, and so can Archivista, for instance. All of these can then help you develop the policies you want to be enforcing when you're actually taking that software artifact and sending it to your end user, an admission controller, or whatever. In the in-toto world, that's what we call a software supply chain layout.
And that's how we see in-toto connecting all of these different categories. The end user, the admission controller, whoever it is depending on the context, receives not only the thing they were trying to get, the artifact itself, but also the in-toto layout and all of the evidence that was generated as the final product was created. Before actually using, installing, or deploying that software artifact, they run the verification workflow to check that all of the evidence stacks up against the policies expressed in the layout. So let's dive further into in-toto. We're an incubating project, and we're targeting CNCF graduation at some point this year. To that end, I want to talk about two major changes that are happening within the framework. First up, I'm going to focus on the evidence part of what I just described. The in-toto specification, before and up until version 1.0, only really had one way to express evidence of what happened in your software supply chain. It was called link metadata. It tried to be very generic and agnostic of what was happening: it captured the input artifacts that went into a particular supply chain step, it recorded what came out of that step, and it was cryptographically signed. But we realized that there were several other types of operations happening that couldn't always be mapped to that input-artifacts-and-output-artifacts approach. You see that with things like software bills of materials, which can be expressed as the new, contextual in-toto attestations. You also see that with other very common operations in software supply chains, like running a test of your software or getting a code review. The in-toto Attestation Framework allows you to express all of this more contextual information.
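To make the classic link metadata concrete, here's a small sketch of its shape in Python. The file names and abbreviated digests are placeholders, and this plain dict stands in for the signed JSON that the in-toto implementations actually produce and verify:

```python
# Sketch of classic in-toto link metadata: inputs in, outputs out.
# Digests are truncated placeholders for real SHA-256 values.
link = {
    "_type": "link",
    "name": "build",
    "materials": {              # artifacts that went INTO the step
        "main.c": {"sha256": "aab3cd..."},
    },
    "products": {               # artifacts the step PRODUCED
        "app": {"sha256": "f10e2e..."},
    },
    "command": ["make"],        # what the functionary ran
    "byproducts": {"return-value": 0},
    "environment": {},
}

# Every artifact on either side is identified purely by its hash,
# which is what lets the next step's materials be matched against
# this step's products during verification.
material_names = sorted(link["materials"])
product_names = sorted(link["products"])
```

The key point is that nothing here is build-specific: the same materials/products shape was used for checkouts, builds, and packaging alike, which is exactly the generality (and the limitation) described above.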
As of a couple or maybe three weeks ago, the Attestation Framework reached the version 1.0 milestone. It was developed as a sub-project within the in-toto umbrella so that the focus could be purely on how to express the claims alone, without breaking existing users of link metadata. So I'm going to give a super quick introduction to what an attestation looks like. The outermost layer is typically a signing envelope of some variety; the framework recommends the use of DSSE, the Dead Simple Signing Envelope. The attestation proper begins at what's labeled the statement layer, which identifies the one or more subjects, the resources the attestation applies to, the type of attestation it is, the predicate type, which is the contextual part, and then the predicate itself. So in the case of an SPDX document, you'd have the corresponding predicate type and the contents of the SPDX document as the predicate. In the case of SLSA provenance, you'd have that as the predicate type and the provenance as the predicate itself. We also have a number of what we call vetted predicates for things like, I'll pick a couple I haven't named so far, capturing the results of runtime traces of some operation in your software supply chain, or test results. SLSA's verification summary is also captured as an in-toto predicate. And we've got a couple of new ones in review for things like code review results. There's a group of people, the in-toto Attestation Framework maintainers, who help with reviewing and vetting these predicates to ensure that they fit a broad variety of use cases and can be useful for different consumers across organizational boundaries. And I should note that, like I said, the Attestation Framework was a sub-project within the in-toto umbrella.
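To make the statement layer concrete, here's a minimal sketch in Python that assembles an in-toto v1.0 Statement as plain JSON. The subject name, the digest (the well-known SHA-256 of the empty string, used purely as a placeholder), and the provenance contents are all hypothetical; in a real pipeline this statement would then be wrapped and signed inside a DSSE envelope:

```python
import json

def make_statement(subject_name, sha256_digest, predicate_type, predicate):
    """Assemble an in-toto v1.0 Statement as a plain dict."""
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [
            {"name": subject_name, "digest": {"sha256": sha256_digest}}
        ],
        "predicateType": predicate_type,
        "predicate": predicate,
    }

# Hypothetical subject and SLSA provenance contents, for illustration only.
statement = make_statement(
    "app-1.0.0.tar.gz",
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855",
    "https://slsa.dev/provenance/v1",
    {"buildDefinition": {}, "runDetails": {}},
)
serialized = json.dumps(statement)
```

Swapping the `predicateType` and `predicate` (to an SPDX document, a test result, a runtime trace) is all it takes to express a different kind of claim about the same subject, which is the whole design point.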
It was introduced through our formal process for sub-projects that modify the specification: we call these in-toto Enhancements, or ITEs. They're the in-toto equivalent of a PEP or a TAP. ITE-6 is the one that actually introduced the Attestation Framework, and it was accepted a couple of weeks ago, right around when we hit version 1.0 with the framework. So that was the major change that has already happened and stabilized with version 1.0: how we express claims about what's happening in our software supply chain. Now I'm going to talk a bit more about what we're doing to enhance in-toto layouts to support verifying attestations. Going back to what we had up to and including version 1.0: because the specification primarily supported the use of link metadata, and not all of the contextual in-toto attestations, the layout was also tailored to that, with a set of capabilities for identifying and verifying all of the steps that needed to be performed in a particular supply chain and the functionaries, which is the general term we use to describe the people or bots performing each of those steps. It focused on verifying the flow of artifacts as they moved from one step to the next. So with Carol, who's performing the build step in this example, you could verify that Carol received the right set of sources from Bob, who checked out something from the version control system. And you could also verify that as Carol built the software, maybe she just ran make, she only modified the files she was supposed to, or created the files she was supposed to, and so on.
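The artifact-flow checks just described are written down in the layout as artifact rules attached to each step. As a hedged sketch under the classic layout format (the key IDs and file patterns are invented for this example, and a real layout is a signed JSON document, not a Python literal), the Bob/Carol flow might look like this:

```python
# Sketch of the "steps" section of a classic in-toto layout for a
# checkout -> build flow. Key IDs and patterns are placeholders.
layout_steps = [
    {
        "_type": "step",
        "name": "checkout",
        "expected_materials": [],
        # Bob may only create source files; anything else is rejected.
        "expected_products": [["CREATE", "src/*"], ["DISALLOW", "*"]],
        "pubkeys": ["<bob-keyid>"],
        "threshold": 1,
    },
    {
        "_type": "step",
        "name": "build",
        # Carol must receive exactly the sources Bob produced.
        "expected_materials": [
            ["MATCH", "src/*", "WITH", "PRODUCTS", "FROM", "checkout"],
            ["DISALLOW", "*"],
        ],
        # She may only create the final binary.
        "expected_products": [["CREATE", "app"], ["DISALLOW", "*"]],
        "pubkeys": ["<carol-keyid>"],
        "expected_command": ["make"],
        "threshold": 1,
    },
]

# The MATCH rule is what ties Carol's inputs to Bob's outputs.
match_rule = layout_steps[1]["expected_materials"][0]
```

The `MATCH ... WITH PRODUCTS FROM checkout` rule is the piece doing the cross-step work: during verification, the digests Carol recorded as materials are compared against the digests Bob signed as products.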
With the enhancement I just talked about, we're working to update layouts to support all of the same capabilities I just described, but we're also adding capabilities within the layout itself to set other kinds of constraints against different artifacts in the supply chain, or against the steps as they're happening. I mentioned the runtime trace predicate as an example of a vetted predicate a few minutes ago. So you could not only capture link metadata and SLSA provenance for your builds, but at the same time also capture a runtime trace of your build, and then set a constraint examining that runtime trace attestation to see if any network calls were made during the build process. You could say: my runtime trace tells me that no network calls were made, so I know that particular build was performed in an environment without outbound calls. This is a work in progress. It's being introduced as another in-toto Enhancement, ITE-10. It's in the open-PR state, so I won't call it a very early state, and we're actively working with a number of stakeholders to ensure that it correctly bridges all of the requirements and remains backwards compatible with the layouts we had before. So post version 1.0, the idea is that the in-toto specification will use the Attestation Framework, the sub-project, as its mechanism for expressing claims about software supply chain execution, and it'll use the enhanced capabilities from ITE-10 for layouts that allow you to verify all of the different things I just talked about, which are captured in your attestations. We also maintain a number of implementations and libraries, in Python, Go, Java, and Rust.
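As a sketch of the kind of constraint an ITE-10 layout would let you express, here's a hypothetical check in Python over a runtime-trace attestation predicate. The predicate shape used here (a `network` list of recorded calls) is invented purely for illustration and is not the vetted predicate's actual schema:

```python
def build_was_network_isolated(trace_predicate: dict) -> bool:
    """Return True if this (hypothetical) runtime trace predicate
    recorded no outbound network calls during the build."""
    return len(trace_predicate.get("network", [])) == 0

# Hypothetical trace predicates for a hermetic and a non-hermetic build.
hermetic = {"files": ["main.go", "app"], "network": []}
leaky = {"files": ["main.go"], "network": [{"dst": "198.51.100.7:443"}]}

assert build_was_network_isolated(hermetic)
assert not build_was_network_isolated(leaky)
```

A layout-level constraint would express this same predicate inspection declaratively, so the verifier, rather than ad hoc scripts, decides whether a build counts as hermetic.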
They're feature complete to varying degrees, because we've had a number of enhancements over the years that haven't been implemented across all of the different implementations. But if you want to see some capability in a particular implementation, we're always open to pull requests and happy to work with you to make that happen. I also want to highlight Witness, which is a community-driven open source in-toto implementation. It was developed after the introduction of the in-toto Attestation Framework, so it's focused purely on in-toto attestations. It also has a derivative of the in-toto layout schema called witness policies. With ITE-10, we're trying to have the new layout schema bridge witness policies as well as the original in-toto layouts, so that they all remain compatible, and we're working with the folks at TestifySec, who built Witness, to ensure that Witness is also going to be compatible with the ITE-10 layout schema. With that, I want to play a quick demo from Cole Kennedy of TestifySec, one of the developers behind Witness and an in-toto steering committee member. Hope this works. Okay.

Hey, thank you. One thing we're really trying to do with Witness and Archivista is make the whole process of creating and consuming attestations much, much easier. So I'm going to show you two tools that we recently developed and released to help to this end. The first tool is called the witness-run-action. This GitHub Action allows you to configure and run Witness as part of your software supply chain very easily, by just defining some variables. The second tool I'm going to talk about today is called the policy tool. This will eventually make its way into the Witness binary; right now it's pretty experimental and has some rough edges, but it still provides a ton of value.
One of the biggest complaints that we hear about in-toto is how hard it is to create layouts and enforce policy on the attestations that we create. We've done some work with the Witness policy language to help with that, and there's additional work going on with ITE-10 that should alleviate more of those issues as well as make it into the official spec. We're really excited to be working with Aditya on ITE-10 as we continue that process. But anyway, what this policy tool does is take attestations that were created in previous pipeline runs and let you define specific variables and attributes that you want to stay the same on all future pipeline runs. There are certain things, like values in the JWT, the owner of the repo, et cetera, that are really easy low-hanging fruit and can add a lot of security to your supply chain artifacts. We've also started working with the Hewlett Packard Enterprise team that is developing Galadriel. This is part of the SPIFFE/SPIRE project: SPIFFE/SPIRE offers federation, but it's really tough to manage, and they're trying to solve some of the issues around that. But let's go into exactly what we're doing on the supply chain security side. I'll go ahead and make this a little smaller here. So we'll go into the Actions tab, and you can see we have a couple of pipeline runs going on here. Let's just pick one. We'll go into the release, go down into the workflow file, and we can see exactly how we're running GoReleaser, instrumented with Witness. It makes it really, really easy. Look at this: you can see we're using the latest version of the witness-run-action. Make sure you have the latest version; there are some bug fixes we included recently. But I want to key in on these `with` statements here. The first one enables Sigstore.
If you're going to create an attestation, you need some way to trust that attestation. Unless you want to deal with key management, you should probably use Sigstore or some other key provider that may be out there, but Sigstore right now is really the tool to be using if you're going to be doing keyless signing, so that's what we use here. Secondly, we're going to enable Archivista. We need some place to store, manage, and discover all this attestation data that we're creating. We tried using Rekor for this, but there are some issues with discoverability, and it's really not designed for this use case: it's designed more for verifying that an attestation is valid rather than discovering a bunch of attestations. Third, we're going to set trace to true. That means we're going to run a ptrace trap on the command that we execute and grab all the files that were touched or written during this process. In future versions of Witness, we'll also be grabbing all the network calls. Then you have the step: we've got to name our step, and that name goes into our policy. And finally we have the command that we're going to run. Nothing special there; it's the exact same command you'd be running on this build machine if you were not using the witness-run-action. Okay. Now that we've done that, let's drop down to the command line and start doing some stuff. The first thing I want to do is paste in some variables here to set the stage. You can see we have these gitoids, these identifiers, and these are going to reach out into Archivista for the different steps during our build process: we have a commit step, we have a scorecard step, a scan step (you may have seen that Trivy scan), and then we have a container build step as well as a binary build.
We're building this in two different ways: we're using GoReleaser, and then we're actually using ko for the container build. All right, let's go ahead and do this. The next thing we need to do is get certs from Sigstore, so we're going to go ahead and download those; I wrote a little script to do that, for the sake of time. Next, we're going to use that policy tool I talked about. You can see we're passing in those Sigstore certificates and those attestation IDs. What this policy tool is going to do is reach out into Archivista, grab all those attestations we're defining here, inspect some of the values in those attestations, and create a policy based on that. Bam, now we've got a policy. One thing I forgot: let's actually go and look at what the config YAML looks like. Right here we're defining that in this type of attestation, the GitHub attestation, we want these values to stay the same from pipeline run to pipeline run. If they change, that means we've probably got to go look at something. So let's go ahead and I'll show you what that policy actually looks like. The tool actually encoded some Rego policies within that policy document; we're going to use jq to look at those real quick. Bam. Now you can see these are the things that we want to stay the same for that artifact from pipeline run to pipeline run. So we go ahead and do that. Now we have a policy file, and we're actually going to need to sign it. So we're going to generate some keys here, and then we're going to use the witness sign directive to actually sign that policy. So now let's actually go and verify the commit that we're on.
We'll go ahead, and you can see git rev-parse: this commit is already part of that previous pipeline run, so we actually have a record of the steps. This is a valid commit that's gone through every single step that we'd defined in our software supply chain, and you can see that evidence right there. One of the things we did do in that pipeline is build a container image. You can see we're passing in the image ID of the container, not the manifest, but the image ID, as a subject. That's a SHA-256 hash that uniquely identifies that container image. And you can see: verification succeeded. All right, let's go ahead and do a different verification. When we ran GoReleaser, we created a bunch of binaries, so now we're going to look at everything that's in our dist folder. I downloaded these ahead of time, but this is from that release, from that build page in GitHub; we unzip that into this folder. And now we can verify every single one of those files against the policy we created. You can see they all succeed. This is something your customer could do, something you could put in an admission controller before you let these workloads go into a Kubernetes cluster, or you can add it as a last step in your CI pipeline. That way we know that none of this stuff has been tampered with in between the individual steps, and it gives us assurance that artifacts meet the policy we specify and weren't built on some dev's machine sitting underneath their desk, or on a build farm that we don't trust that may be malicious. Thank you very much.

Cool. So, a couple of quick updates on what's been happening on the in-toto community side of things. The biggest piece of news is a change in the governance model of the project.
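The file verification in the demo ultimately boils down to recomputing each artifact's SHA-256 and matching it against a subject digest carried in a verified attestation. Here's a minimal sketch of just that matching step; the subject list is a stand-in, since the real Witness verification also evaluates signatures and the Rego policy:

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def matches_some_subject(path: Path, subjects: list) -> bool:
    """Check a file's digest against attestation subject digests."""
    digest = sha256_of(path)
    return any(s.get("digest", {}).get("sha256") == digest for s in subjects)

# Demo with a made-up payload standing in for a GoReleaser binary.
payload = b"hello supply chain\n"
demo = Path("demo-artifact.bin")
demo.write_bytes(payload)
subjects = [{
    "name": "demo-artifact.bin",
    "digest": {"sha256": hashlib.sha256(payload).hexdigest()},
}]

assert matches_some_subject(demo, subjects)
assert not matches_some_subject(demo, [{"digest": {"sha256": "0" * 64}}])
demo.unlink()
```

This is also why the container image ID works as a subject: any content-addressed identifier can be compared the same way, whether it names a tarball, a binary, or an image.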
Previously, we had what we call a consensus builder, who was Santiago Torres-Arias, who you can see on the screen up there. Earlier this year, he proposed that in-toto move to a steering committee model. I also recognize that I introduced Cole as a steering committee member before I actually told you about the in-toto steering committee, sorry. So the in-toto community voted on the folks who were nominated, by the community or by themselves, and picked five people to form the first steering committee earlier this month. We have a healthy mix of academia and industry, which I'm personally very pleased about. We've got representation from folks who are using in-toto today in their pipelines, from folks who are doing research that goes into in-toto at universities, and from folks building cool things like Witness, which is where Cole comes in. So yeah, really pleased to introduce all of these folks as our steering committee. And I want to give a shout-out to the in-toto community in general. We've had a number of invaluable contributions from so many folks over the last several years: multiple in-toto Enhancements, and multiple contributions to the various implementations and integrations we maintain. As the Attestation Framework came to be, a number of the predicates were proposed by members of the community. All of this has been really great, and it has directly translated to a large number of integrations and adoptions. We've got integrations with other CNCF projects like Keylime, SPIFFE, and TUF. We have predicates that support open standards like SBOMs, both SPDX and CycloneDX, as well as SLSA, where SLSA recommends the use of in-toto attestations for expressing claims: currently provenance, possibly more in the future.
We work with the Reproducible Builds project, where you can use in-toto metadata to verify that two isolated rebuilders built the same bit-for-bit equivalent packages, and to verify that the rebuilders you trust actually signed off on those builds. Through that, we work with the Debian and Arch Linux folks. At NYU, we maintain a rebuilder for Arch Linux packages, and at Purdue I think there's a Debian rebuilder similarly being maintained, and both of them emit in-toto metadata. So if you're a user of one of those distributions, you could also plug this into that. There are a large number of open source projects and systems that either use in-toto or that we have integrations for: popular CI systems like Jenkins, Tekton Chains for Tekton, and integrations in GUAC, which consumes in-toto metadata. All of this has also meant that we've been adopted by a number of organizations, whether it's Datadog, who use both TUF and in-toto in their pipelines, or Toradex, an adoption we're actively working on right now in the embedded and Internet of Things space, generating attestations for the images they develop for their boards. We're also formalizing how we talk about all of these. It's nice to be able to show off all of those logos, but we also want others in the community, possibly people who are new to it, to learn how a particular integration works or how someone is using in-toto in their pipeline. So we're trying to collect brief descriptions of how some of the organizations and projects you just saw use in-toto, or how you can use in-toto within a system like that. If you were listed on the previous slide but don't see yourselves on this one, please feel free to reach out with a brief description of how you use in-toto, and we can talk about that integration.
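The rebuilder check described above essentially compares the product digests recorded in link metadata signed by independent rebuilders. A hedged sketch of that comparison (the link dicts here are simplified stand-ins for real, signed in-toto links, and the package name and digest are invented):

```python
def rebuilds_agree(link_a: dict, link_b: dict, package: str) -> bool:
    """True if two rebuilders recorded identical digests for a package."""
    prod_a = link_a["products"].get(package)
    prod_b = link_b["products"].get(package)
    return prod_a is not None and prod_a == prod_b

# Hypothetical links from two independent rebuilders of the same package.
pkg = "pkg-1.2-x86_64.tar.zst"
rebuilder_one = {"name": "rebuild", "products": {pkg: {"sha256": "deadbeef" * 8}}}
rebuilder_two = {"name": "rebuild", "products": {pkg: {"sha256": "deadbeef" * 8}}}
tampered = {"name": "rebuild", "products": {pkg: {"sha256": "0" * 64}}}

assert rebuilds_agree(rebuilder_one, rebuilder_two, pkg)
assert not rebuilds_agree(rebuilder_one, tampered, pkg)
```

In the real setup, a threshold in the layout requires that enough trusted rebuilders signed links with matching product digests before a package is accepted, so one compromised rebuilder can't vouch for a package alone.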
Finally, in-toto is still quite actively maintained and developed at academic institutions like Purdue, NYU, and NJIT. We've also been active participants in Google Summer of Code for the last several years, so we have a history of mentorship. I can't remember a time in the last several years when we haven't had an undergraduate or master's student working on various aspects of in-toto and contributing to it alongside all of the other members of the community and industry. So if you're new to the space and you'd like to start contributing, feel free to reach out. We're very welcoming, even if I say so myself. So join us: we meet on the first Friday of every month for a community meeting. We're on the CNCF Slack; #in-toto is our main channel, and we've got a few others in there for sub-projects and the like. We also have a very non-active presence on IRC; if you do want to reach out by IRC, we will respond, but not a lot of people do that. You can also join our mailing list, and you can find us on GitHub at the in-toto organization. Thank you. Questions?

Hi, thanks for the presentation. Can you elaborate on how TUF and in-toto integrate? I know a little bit about what TUF does, but not so much about this. What is your common picture?

Sure. I want to start off by saying that TUF and in-toto are sister projects, and a lot of the same people were instrumental in developing them, especially in the early days. in-toto uses TUF to securely distribute the layout, the policies I was talking about, and the keys used to verify that layout. So it essentially allows you to use TUF as a root of trust for your in-toto layout and all of the other metadata. You can also use TUF to associate a particular set of in-toto metadata with the artifact you're distributing from your TUF repository.
We've got a couple of fairly detailed write-ups on this as in-toto Enhancements, and I'm happy to share the links if you come up to me afterwards. One of them also details Datadog's deployment of exactly that setup.

Okay, cool. So the TUF layout is kind of published in Archivista?

They're building TUF support into Archivista as well, but I don't know all of the details of that particular roadmap, and the current deployments don't use it with Archivista.

Okay, thank you.

Hi, I have a question here. Thank you for your presentation. I entered this room having little knowledge about cosign and Sigstore, about Connaisseur, which can verify the signature of a container at runtime. You also have the Sigstore Policy Controller, which does the same. I know that software bills of materials are becoming a big deal. I know that in-toto can help me protect myself against supply chain attacks. But I think I'm missing the connection between all the projects. The previous question was about The Update Framework, am I right? TUF. You mentioned Witness. I think I need a clearer view. So how can I get this clearer view of this constellation of projects?

That's a good question, and I'm going to take it at two levels. At a higher level, one of the things we've been discussing is more of a landscape approach to explain how all of these different projects come together, because a lot of what you just talked about are complementary projects that you probably want to use together. At a lower level, I'll take the example of Sigstore: you can use Sigstore to sign your in-toto metadata, and you could also store your in-toto metadata on Rekor if you so choose.
So Sigstore really helps with the key and identity management, and to an extent the metadata storage aspect, but you'd still need in-toto for the end-to-end, every-step-in-my-supply-chain kind of verification.

Okay. And, for example, does it make sense to say I want to verify the in-toto attestation at runtime, like I would verify the signature of a container at runtime, just before Kubernetes schedules your container? You can add a webhook to say, if this is not signed, I do not want to instantiate it. Does that make sense?

Yeah. I'd need to take a look at it, it's been a while since I looked at that particular integration, but we do have an admission controller hook that verifies in-toto metadata prior to deploying something.

Thank you very much for your answer.

Anyone else? Cool. Thanks. Feel free to come up to me; there are a number of other in-toto folks here. Happy to chat.