Hello, everybody. Can you hear me well? We are going to start talking about enforceable supply chain policies and attestations using in-toto. I'm really excited to be all the way across the pond to talk about a project that's been dear to my heart for, I think, ten years this year, which is very exciting. I'm Santiago, I work at Purdue, and this is Alan. Hi, everyone.

So in order to understand what in-toto is for and what it is about, we first need to understand what a software supply chain attack is. And actually, does everybody know what a software supply chain is, first of all? I assume that if you're in this room, you have some passing knowledge, and you're interested in securing the supply chain rather than understanding what it is. But a software supply chain is essentially a collection of steps that are carried out in order to produce a software product. The common understanding nowadays is that in order to make a serious piece of software, really any software, you're going to rely on a bunch of different things: devices, servers, organizations, and people that will help you produce the software that you want to produce.

A software supply chain attack is when somebody attacks that particular aspect of software development in order to carry out something malicious. This could be stealing your intellectual property, or introducing a Trojan into your software, or even denying your ability to deliver the software to your users.

The software supply chain security ecosystem nowadays is pretty mature. There's a lot out there, and sometimes it's a little bit daunting when you look at all of the solutions that exist. Something that I want to help you contextualize is this model, where you have three major things that you want to do in order to protect the software supply chain. On the left, you have everything that's about gathering evidence.
Things like SLSA, Tekton Chains, Witness, CycloneDX, and SPDX are ways for you to say: hey, this is what I know about this particular step or element of my software supply chain. Once you know everything about the software supply chain, you need to communicate it to your users, so you may want to use a system that allows you to verifiably transmit this information to your clients. You have things like GUAC, or GitBOM, or OmniBOR. You have Archivista. You have Grafeas, SCITT, Sigstore. All of these systems provide transparency for the evidence that has been gathered through the supply chain.

Now, giving this information to the user is not enough. They want to know whether they can trust your piece of software based on the evidence that you collected. So you have things like CUE, or Open Policy Agent, or Kyverno. There are really many things out there that can help you verify this information, and they will contact these transparency services in order to verify the state of the supply chain.

Now, you may be asking yourself, does this have anything to do with in-toto? And the answer is yes. in-toto is part of all of these solutions in some way or another, and it really is meant to allow you to carry out this flow of gathering, delivery, and verification in a unified framework. You may notice in-toto is involved in other things that you are very familiar with. For example, SLSA: SLSA and in-toto play very well together. If you have seen Docker image attestations, they are piggybacking on top of in-toto to communicate the information about the images. There are npm and GitHub attestations, a very exciting public beta that allows you to verify that a build on GitHub Actions is actually providing provenance for the package that you're about to install on a different platform, which is npm. I'll talk a little bit more about those in detail in a minute.
Now, more explicitly, in-toto is a way to manage all of the supply chain metadata and deliver it verifiably to your consumers. So for example, you collect information about your software supply chain as you are carrying out source code development, or you're building on a CI/CD system like Travis, or you are using an independent build farm with Meson or something like that, or you're packaging on a Linux distro, and you want to communicate information about each individual step. Each individual step in this case is an attestation, and it's literally a small signed receipt that says: I did the source code development, I did the building, I did the testing, I did the packaging.

in-toto comprises two fundamental things. We have attestations, as I just pointed out, and we also have layouts and policies that tell you what those attestations are supposed to look like. This looks a little bit like this: you have an attestation that consists of an envelope, a predicate, and a subject. At a very high level, the envelope combines the predicate and the subject. It says this particular piece of software, say a package or a piece of source code, has this information attached to it; that is the predicate. To put it in more concrete terms, you may have something like: this particular build has SLSA provenance attached to it. Now you know that this SLSA provenance is applicable to this particular artifact that was built. If you wanted to check that there's also a vulnerability scan, then you can find an attestation that talks about the exact same piece of software, but with a different predicate that will be the result of something like Trivy. When you combine enough attestations, you're able to essentially visualize the software supply chain and the different steps as they connect to each other.
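To make that subject/predicate structure concrete, here is a minimal sketch in Python. The top-level field names follow the in-toto attestation framework's Statement layout; the predicate contents and artifact names are made up for illustration.

```python
import hashlib
import json

def make_statement(artifact_name, artifact_bytes, predicate_type, predicate):
    """Bind a subject (the artifact, identified by digest) to a
    predicate (the claim being made about it)."""
    digest = hashlib.sha256(artifact_bytes).hexdigest()
    return {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [{"name": artifact_name, "digest": {"sha256": digest}}],
        "predicateType": predicate_type,
        "predicate": predicate,
    }

# Hypothetical build provenance for a release tarball.
stmt = make_statement(
    "app-1.0.tar.gz", b"release bytes",
    "https://slsa.dev/provenance/v1",
    {"builder": {"id": "https://example.com/ci-runner"}},
)
print(json.dumps(stmt, indent=2))
```

Because the subject is a digest, a second attestation (say, a Trivy scan result with a different `predicateType`) can point at the exact same artifact, which is what lets you connect predicates about one piece of software.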
In this case, the attestations that I showed you were displaying something like this: you take the version control system code, you put it on Travis CI to check that everything is correct, but you also want to build it on a separate build farm and create the SLSA provenance attestation for it, and then you eventually want to package it and produce an attestation for it. And you want all of these things to be in agreement, so that you know that all of these are part of the same supply chain; it's verifiable, it's trustworthy.

in-toto then lets you answer a bunch of different questions, not only at the individual predicate level, but actually at the whole end-to-end software supply chain level. So in this case, we know who did what: who's the person who tagged the release, which server was used to build the source code, which build farm is the one where this was built, what server was this run on, are they allowed to do this release, is this SBOM the one that applies to this particular artifact that also has SLSA provenance attached to it, and so on and so forth. Really, the fundamental question that we want to answer when we're using in-toto is: should I be using this piece of software?

So to bring it a little bit more to the concrete: you can use a tool like Trivy today, you can scan your software, and you can spit out information about the scan in an in-toto attestation. And then you can create the policy that says I want every single artifact that I'm delivering to my users to have a vulnerability scan. And that vulnerability scan needs to be trustworthy, so I want somebody in my company to be the one that's running this vulnerability scan. You can do things like SLSA provenance attestations, which you're probably very familiar with by now, which are: I want this to be built on GitHub Actions, or I want this to be built on my Jenkins runner.
And I only want it to be built on that Jenkins runner with a proper compiler and the proper tools and dependencies and the latest kernel with the best patches available, and so on and so forth. We'll talk a little bit more about this in the demo. You can also answer other, more open-ended questions. That's why there are open-ended attestation types that can fill in the blanks of: well, when I ran the tests, did they pass? Was there an OK at the end? Or was there a runtime trace that collected information about the build as it was happening, so I can do more granular analysis of the artifact as it was being built? The idea is that you can use in-toto as a common method to manage and analyze software supply chain attestations, and apply a policy that covers not only a particular part of the supply chain, but rather encompasses an end-to-end flow of the software supply chain.

Now, you can do this with in-toto tooling. There's a lot, so don't get overwhelmed. Something that I wanted to do this time around is to give you a little bit of an overview of all of the tools that are out there and give you a little bit of guidance on how to get started. There are a lot of in-toto reference implementations in different languages: there's Java, Rust, Python, Go. I think there's a couple of others, but I'm really focused on those ones; I think that's enough to play with. If you wanted to just go ahead and use in-toto directly in your system, you probably want to use something like Witness. Witness is a production-ready CLI that allows you to collect different types of attestations for different steps of the software supply chain, sign them, write them to a discovery platform, and so on and so forth. You can use Archivista as your discovery platform as well, so that as you generate attestations with Witness, you can deliver them to Archivista for discovery later.
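Under the hood, signing an attestation typically means wrapping the serialized statement in a DSSE envelope. Here is a hedged sketch of that wrapping: the pre-authentication encoding and envelope fields follow the DSSE spec, but the HMAC below is only a stand-in for the real asymmetric signature (e.g. Ed25519 or ECDSA) that tools like Witness actually produce.

```python
import base64
import hashlib
import hmac
import json

def dsse_envelope(payload, payload_type, key, keyid):
    """Wrap a serialized statement in a DSSE envelope. The PAE step
    binds the payload type into the signed bytes, so a signature
    cannot be replayed under a different payload type."""
    pae = b"DSSEv1 %d %s %d %s" % (
        len(payload_type), payload_type.encode(), len(payload), payload)
    sig = hmac.new(key, pae, hashlib.sha256).digest()  # stand-in signer
    return {
        "payloadType": payload_type,
        "payload": base64.b64encode(payload).decode(),
        "signatures": [{"keyid": keyid, "sig": base64.b64encode(sig).decode()}],
    }

statement = json.dumps({"_type": "https://in-toto.io/Statement/v1"}).encode()
env = dsse_envelope(statement, "application/vnd.in-toto+json",
                    b"demo-secret", "demo-key")
print(env["payloadType"])
```

A verifier reverses the steps: base64-decode the payload, recompute the PAE, and check the signature against the functionary's public key.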
Yeah, and there's also Judge, which can let you verify as well. Really, you can apply in-toto policies with layouts using in-toto tooling as well; the in-toto attestation verifier, which I think we'll showcase today, covers this really well.

Now, if you think about it, this means that there's a lot of different things that you can say about software; it's almost like a programming language or a natural language. We have a collection of different predicates that you can use that help you answer and collect the information about the most common questions on the software supply chain. If you're familiar with SLSA, well, SLSA is one of the predicates, probably the most popular one, and it's answering a very specific question, which is: how did I build this? There may be other things that you want to answer, like what was the version or the hash of the compiler that built this binary, or what was the result of a vulnerability scan on my particular artifact as it was being built? So I encourage you to take a look at the attestation repo and look at it almost as a menu of: what would I want to do in order to protect my supply chain?

To summarize the discussion about the tooling, I think this table encapsulates it very well. If you are somebody who wants to use in-toto in a very complex, very elaborate workflow, you may want to use Witness and Archivista together to collect and discover the information and apply policy. An example of this would be something like automating NIST SP 800-204D compliance. There are European Union and French and British and German standards that are very similar to this. They're trying to guide you in ways that you can collect different information, not only provenance, but also SBOMs or vulnerability scans or information about the stack that was used to build something.
If you want to automate compliance, which is a very elaborate endeavor, you may want to use something like Witness or Archivista. If you're a developer that wants to extend or integrate or add support to your project, you may want to use the in-toto libraries. If you're, for example, adding in-toto verification to a package manager, like npm did, you may want to use the in-toto reference libraries. If you're a user and you just want to get started adding some degree of in-toto verification to your pipeline, you may want to start with a CI platform. They are the ones that are the most ahead in integrating, so you may just want to enable it in your platform. GitHub Actions already has ways to enable provenance on your project. GitLab runners also have support for in-toto. There's Tekton Chains, and there's a Jenkins plugin that you can use. All of these tools are almost just flick a switch and start producing attestations that you can verify later down the line. I'm going to hand it to Alan so he can walk us through the demo. I think this will help crystallize the concepts a little bit better.

Yeah, thank you, Santiago. So for today's demo, I'm doing something very bare-bones, just a simple C application, just to demonstrate the power of attesting to certain steps and then using layouts to verify your policy and make sure that the integrity of your supply chain is intact, right? So this is a high-level overview. You take a tarball, you untar it and generate all the files, and then you compile each file to an object file and then compile it all into the binary. So let's see. If you look at this, the demo will basically take all of this and then switch out the external object file, just to simulate what an example supply chain attack would be, right? So for example, say we're running in a specific environment where, for some reason, every time you try to build the executable, it switches out the object file.
You would not be able to detect it, even if you know that the external C file is the original, right? So the way I'll demonstrate how in-toto works is basically using in-toto to secure each of these steps. These steps are encircled in the dotted lines that you can see, and between each step we have, think of it as a chain, and in that case, if the external object file is compromised, the chain will break. So hopefully the demo will demonstrate it.

So let's give it a go. First of all, I'll enter the project. I'm going to do a dry run without in-toto just so you can see what's expected. So I enter the project, I see the project directory, and then you start building the object files and run the command without any errors. As you can see, it just prints "safe hello world". Now I will inject the malicious object file and build it again. Obviously, if you capture hashes, they will be different, but if you run the executable again, it prints "unsafe unknown code". Who knows what's in there?

Now I'm going to demonstrate how it will go using in-toto, right? I wrote a small program that basically just generates a new link attestation at each step. A link attestation, at a high level, takes in the input files and captures the output files. It's pretty simple, but it gives you a lot of power to connect a lot of the steps together just using files and other artifacts. So as you can see, you basically pass it a key to sign the in-toto metadata, and then you also give it a name. And if we run through most of the steps again, you get the same executable. And now we're running the in-toto verify command, which basically takes in a layout. I'll show the layout later. But basically what happens is that it takes in all of the attestations generated and verifies each step.
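The small program Alan describes can be approximated in a few lines of Python. This is a hedged sketch, not the demo's actual code: it records the sha256 of each input (material) and output (product) of a step, which is the core information a link attestation carries. The file contents and the command are invented for illustration.

```python
import hashlib
import json
import pathlib
import tempfile

def hash_files(paths):
    """Record each file's sha256, the way link metadata identifies artifacts."""
    return {p.name: {"sha256": hashlib.sha256(p.read_bytes()).hexdigest()}
            for p in paths}

def make_link(step_name, materials, products, command):
    """A minimal link: what went into a step, what came out, and how."""
    return {
        "_type": "link",
        "name": step_name,
        "materials": hash_files(materials),
        "products": hash_files(products),
        "command": command,
    }

# Simulate the demo's build-external step with throwaway files.
workdir = pathlib.Path(tempfile.mkdtemp())
src = workdir / "external.c"
obj = workdir / "external.o"
src.write_bytes(b"int external(void) { return 42; }")
obj.write_bytes(b"fake object bytes")
link = make_link("build-external", [src], [obj], ["gcc", "-c", "external.c"])
print(json.dumps(link, indent=2))
```

In the real tooling this dictionary would then be signed with the functionary's key before being written out.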
And then if we were to inject a malicious file and build the project again, this should, and as you can see, it's the unsafe code. If you run the verify again, it fails. It doesn't print it in red, but verification failed.

So I'll show the layout right now. As you can see, we declare something called a functionary: people who are allowed to carry out certain steps. And in each step, you can see how I define what's consumed, what's produced, and what command I expect to be used. So the untar step creates all these files; build-external uses the source files, as I showed earlier, and generates the object file. Same goes for main. And then, as you can see in the final step, one of the nice things you can do is use match rules, so you can have a future step rely on the products of past steps. Yeah, that's it for the demo. It's still a work in progress, but here's the link to the demo if you'd like to look at it at a later time. I'll pass it off to Santiago to talk about real-world use cases.

Thank you. So something that I don't know if you noticed here is that in-toto works a little bit like a firewall: it was able to filter out any artifact that was not authenticated from an actor in the supply chain. This also connects to the use cases, because this is a simulation of probably the most infamous software supply chain attack in history, which was SolarWinds, in which a process was changing an object file as it was being linked. With in-toto, you can essentially filter out the object file as it's being linked, but you can also ensure that, say, different builders agree, or that different aspects of the stack are actually authenticated, like the compiler, the linker, other processes that are touching the files, and so on and so forth. I wanted to give a shout-out to GitHub and npm.
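The match rule Alan mentions is the heart of the chain. Here is a tiny Python sketch of the idea (not the real in-toto rule engine, and the digests are fake): a digest recorded as a product of one step must reappear unchanged as a material of the next step.

```python
def match_rule(prev_link, next_link, artifact):
    """A MATCH-style layout rule: the digest a previous step recorded
    as a product must equal the digest the next step recorded as a
    material. This is the 'chain' that breaks when a file is swapped."""
    product = prev_link["products"].get(artifact)
    return product is not None and product == next_link["materials"].get(artifact)

# Fabricated digests for illustration.
compile_step = {"products": {"external.o": {"sha256": "aaa111"}}}
link_honest = {"materials": {"external.o": {"sha256": "aaa111"}}}
link_tampered = {"materials": {"external.o": {"sha256": "bbb222"}}}  # swapped object

print(match_rule(compile_step, link_honest, "external.o"))    # True: chain intact
print(match_rule(compile_step, link_tampered, "external.o"))  # False: attack caught
```

This is exactly why the demo's injected object file causes verification to fail: its digest as a material of the link step no longer matches the digest recorded when it was produced.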
I think it's one of the coolest in-toto integrations out there, so let me explain how GitHub and npm are using in-toto and Sigstore and other related technologies. This is a diagram directly from the blog. Essentially, what they're doing is that every time you run npm publish with provenance, it will authenticate you using Sigstore's Fulcio, and it will give you a signing key. So thanks to Sigstore, you don't even need to reason about how to manage your keys, but rather about who is the email or the identity that's allowed to publish this artifact. Once you get the key, or use the keyless flow from Sigstore, you generate a provenance attestation. What does that mean? It will connect the GitHub repo where the source code is located to the npm package as it's being published. Once it's signed and generated, it is put together into, oh, I think I'm missing a slide. The package and the provenance are put together into the npm registry. What does that mean? If somebody were to break into the npm registry and change the packages, they wouldn't be able to also tamper with the original commit that was on GitHub, because these two things are linked together now. Does this make sense? I see a couple of people nodding. That's great.

The other example that I wanted to highlight is the post-Sunburst SolarWinds deployment. Again, the inspiration for the demo was borrowing a little bit from how SolarWinds happened. They did it a little bit differently at SolarWinds, though. They have a white paper called Trebuchet if you want to read up on it; there are a lot more details. Here's a screenshot of their diagram. What they do is a very similar thing to GitHub and npm, but instead they're leveraging reproducible builds, and they're leveraging in-toto to verify reproducible builds. What does that mean?
They separate two different build pipelines, completely disconnected from each other, with different stacks and different network configurations, so that if somebody were to hack into the system, they would have to hack into both systems at the same time, and hack them in the very same way, to produce the same malicious binary. What this really means is that in order to introduce a backdoor through the compiler, you not only need to compromise both servers, but you also need to make your compromise reproducible, in a way. Now, once each individual pipeline builds its binary, it produces an in-toto plus SLSA provenance attestation and stores it in the metadata store; that's the box that's in the middle. Other information, for example whenever they scan, also makes it into the metadata store. And eventually, they receive a signal to validate the result of the two. What does this mean? It goes and queries both SLSA provenance attestations, it queries any other associated metadata, like a vulnerability scan, and it asks itself: are these two the same binary? Even though they were built by different people, are they the exact same binary? If that passes, then the binary is released. It is pushed to the built binary artifacts and the container images so that users can download the resulting build.

To close out, I wanted to talk a little bit about how you can get involved in the project, and then jump into the questions. in-toto is a mature project; it's actually waiting for graduation. This means that even though there are a lot of mature pieces of software out there, there are also a lot of things that you can do to get involved in the project. Perhaps the most obvious one is contributing to tooling. I think a very exciting place to be working right now is the Witness and Archivista implementations. These are two recently donated pieces of software.
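The reproducible-build validation described a moment ago boils down to a simple check: do the independently produced provenance attestations name the exact same binary? A minimal Python sketch, with fabricated digests:

```python
def subjects_agree(att_a, att_b):
    """Compare the subject digests of two provenance attestations.
    Two disconnected pipelines must describe byte-identical output
    before the binary is released."""
    digests = lambda att: {s["digest"]["sha256"] for s in att["subject"]}
    return digests(att_a) == digests(att_b)

pipeline_a = {"subject": [{"name": "app", "digest": {"sha256": "deadbeef"}}]}
pipeline_b = {"subject": [{"name": "app", "digest": {"sha256": "deadbeef"}}]}
pipeline_c = {"subject": [{"name": "app", "digest": {"sha256": "0badf00d"}}]}

print(subjects_agree(pipeline_a, pipeline_b))  # True: safe to release
print(subjects_agree(pipeline_a, pipeline_c))  # False: a pipeline diverged
```

An attacker who backdoors only one of the two pipelines produces a digest mismatch, so the release is blocked.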
They're meant to work with most of the new attestation types. They're meant to allow you to create new attestation types if you think that there's no predicate that matches your use case. They allow you to integrate with other projects a little bit better; for example, if you're using Sigstore to sign, they allow you to use Sigstore to sign. And they also come with a lot of little knobs that you can use to forget about implementing in-toto and just use in-toto. Contributing to Witness and Archivista is something that I really suggest, because I believe that's the way in-toto should be used in the future.

If you are trying to use in-toto with a particular project and you feel that there's a feature missing, integrating with that particular project is also a good way to help us out. What this really means is not contributing to the in-toto project proper, but maybe working on your own project and adding in-toto support to it. Making your tool spit out or generate attestations, especially if it's collection-related, is relatively simple. I've done it for a couple of projects myself. A lot of the time, it's 15 lines of code: you use the runtime libraries, you import the right modules, you fill in a data structure, you call sign, and then you print it out on the appropriate channel. I think CubeSync is a good example if you want to take a look at how easy it would be to do it. You just add that particular line of code on an API response, and then you can start authenticating with in-toto the provenance of API questions and answers, for example, for codecs or new, emerging technologies.

The last thing, and I don't want to minimize this, because I think this is actually the most useful, is: help us integrate with your environment. That is, if you have a project in the CNCF or the OpenSSF, or really anything out there, and you don't know how in-toto could fit, we learn a lot from that perspective.
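To give a feel for the "15 lines of code" claim, here is a hedged Python sketch of a tool emitting an attestation about its own output: fill in a data structure, sign, print. Everything here is illustrative: the predicate type URL is made up, and the HMAC "signature" stands in for calling a real signing library.

```python
import hashlib
import hmac
import json

def emit_attestation(output_bytes, key):
    """Fill in a statement about this tool's own output, 'sign' it,
    and return the result ready to print on the appropriate channel."""
    statement = {
        "_type": "https://in-toto.io/Statement/v1",
        "subject": [{"digest": {"sha256": hashlib.sha256(output_bytes).hexdigest()}}],
        "predicateType": "https://example.com/my-scanner/result/v1",  # hypothetical
        "predicate": {"result": "pass"},
    }
    payload = json.dumps(statement).encode()
    sig = hmac.new(key, payload, hashlib.sha256).hexdigest()  # stand-in signer
    return {"payload": payload.decode(), "sig": sig}

att = emit_attestation(b"scan report bytes", b"demo-key")
print(json.dumps(att))
```

With a real library you would swap the stand-in signer for the library's sign call and keep the rest, which is roughly the level of effort the talk describes.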
A lot of the time, this is what drives new predicate types, or new projects altogether, that help us integrate things. If you don't know how to verify a particular part of the software supply chain, like, how do I know if this is running in an Intel SGX enclave, that allows us to drive better tooling and future innovation in the project. So don't discard that; I think that's a very good way to engage with us. And we have a more open-ended way to interact, which is to just join our communication channels. The in-toto community meeting is on the first Friday of every month. Right now it's at 11 a.m. Eastern, which is a little bit late for the EU time zone folks; we may be adding another time slot to accommodate EU and Asia time zones. You can drop by the CNCF Slack and the in-toto channel; that's the main hub. We have other channels, but that's the first place to interact and get involved. If you are old school, like myself, there is an IRC channel on Libera. I am there. It's not very chatty right now, but it's supposed to be bridged, so a lot of the time the messages bounce between Slack and Libera, and we'll find a way to answer there. We also have the public mailing list. If you're even more old school than IRC and you want to send emails exclusively to a mailing list, this is also what we use for announcements of big releases or project donations and things like that. And you can also take a look at the organization and just browse the projects, leave us feedback, open an issue, and so on and so forth. I think with this, we can open the floor for questions. We have plenty of time for questions.

Oh, I think I have this. You said runtime, or at link time? Oh, okay, so you mean monitoring once it is deployed. Yeah, so we have runtime trace attestations that you can use to actually check the state of a process. They're a little bit open-ended, and the use case is not entirely mature.
I would say that there's a distinction between runtime and, how to call it, continuous monitoring; I want to separate those two things. So for example, you may be continuously creating attestations of vulnerability scans and replacing old vulnerability scan attestations, so that if the policy stops verifying at any point, you deprovision the container. Yeah, okay, makes sense. We've acquired a new regulation in the UK for telco stuff that says thou must only run signed containers, and we're trying to work out how... Yeah, well, we should also add that to a demo. There are a couple of admission controllers that can verify signatures of your containers as they are coming in, and you can verify a policy at admission time. I think Kyverno is one that does that, with support for Sigstore and for in-toto attestations as they are coming in. I think there's also the Sigstore policy controller, which allows you to do the same in more Sigstore-native land. And then you can also do the continuous monitoring, but for that, I think better support in something like a service mesh would be ideal. Essentially, can you deprovision something that somebody marked as malicious or untrusted? I think integrating GUAC with a service mesh would be an interesting killer combo in that regard, but I think that use case is not fully developed yet. Interesting, sounds unsolved to me. Well, it depends on whether it's solved at the theoretical level, which I'm comfortable with as an academic, or whether it's productionized yet.

Hi, everyone, Nicolas from Thales. Thank you, merci, for your presentation. I have a question about environments where you are not necessarily connected to the network, or you have an intermittent network. I think yesterday was the edge day, and a lot of people were talking about intermittent connections. So what happens to the attestations when you don't have access to a remote registry with everything going on there? That is a very good question.
Discovery is agnostic to a lot of the in-toto tooling. For example, in the case that you saw here, it was copying attestations through a directory, almost like sneakernet; you could put them on a USB stick and carry them over. There is a somewhat established solution for communicating this trust information as part of a Sigstore bundle. That is, you get the regular Sigstore trust information plus associated attestations in a single sort of log that you then go deliver somewhere, and it's meant to work fully offline. For the more intermittent case, I would assume that a lot of these discovery platforms allow you to pull or batch. I know that GUAC, for example, allows you to do a batch push: you detect a connection, you send as many attestations as you can, and then you stop transmitting. But really, you wouldn't trust the artifact until all of the attestations are there. Or maybe you could trigger the deployment once you have the connection, for example with Kyverno, which is looking after it, and then when you are not connected anymore, maybe the cache would be enough to... Yes, so the verification usually pulls in the attestations, and then you have those essentially for reference on your side, which should be enough for you to, say, reprovision the container or re-verify them. Okay, so should I look at Sigstore bundles? I like them; I think they're a good solution. I recommend taking a look at them. Thank you very much.

Hi, I had a question. So if you are a consumer of a pipeline that was digitally signed and attested to, and let's say it's a base Docker image and I validated that signature and validated those attestations, is there a standardized way for me, then, in my own pipeline, to tell the consumers of my artifacts that I've verified this, whether it's inheriting those attestations or some level of verification of the entire supply chain? Yes, I love that question; it almost feels like I was wishing for it. So there are a couple of things there. There is a type of predicate for this.
You can generate an attestation that says: I've verified this collection of attestations. That's called a VSA. Running in-toto verify on a pipeline produces that attestation, but only if verification succeeds. If you want to be very transparent about it, in-toto layouts are also composable, so you can say my step of getting dependencies, for example, needed to run this verification, and I'll give you all of the attestations so you can verify them yourself if you want to. Now, the VSAs, the verification summary attestations, are useful a lot of the time because, A, you don't want to transmit petabytes of attestations, and B, you may not want to fully disclose every single detail, right? If it's a third-party, private dependency that doesn't want to reveal exactly how it is built, then there's a good compromise of: well, trust me, I verified it under a particular policy. That is a good approach, I feel. Thank you. I think we have time for one last question, or maybe not, actually, we're out of time. It was a pleasure, and I'll be around if you have any questions offline.