All right, hello, everybody. Before we get started, since people are still trickling in: [in Mandarin] if you're Chinese, please raise your hand. Okay, and if you're an English speaker, please raise your hand. Great. If you're super fluent in English but [in Mandarin] you're Chinese, go ahead and raise your hand. Okay, so we have one hesitant person there. What this means is that about two-thirds of the audience is Chinese and doesn't feel comfortable saying they're super fluent, and the rest of you speak English probably close to natively. I'm going to speak English during this talk, which is good for everybody, especially me, because [in Mandarin] I speak Mandarin poorly. So I will go a bit slower. One other thing I wanted to announce about this talk: unfortunately, my co-presenter, Ding Ma, is not able to be here for family reasons. So I'm going to be presenting his part of the talk as well, on confidential containers, something I only learned about when preparing the slides and the presentation. I'll go into much more detail on the in-toto project, which I was one of the creators of. Okay, so let's get started. Fundamentally, the Confidential Containers project, and confidential computing in general, is focused on trust assumptions. There's a natural question: what do you trust in the cloud? In general, when you run code in a container, you're trusting essentially everything else running on that system, other than maybe other containers. You're trusting that the hardware is safe and doing all the right things. You're trusting that the kernel is doing all the right things.
You're trusting all the rest of the infrastructure that's in place, containerd doing the right things. And if you're renting hosting from a provider like Amazon or Microsoft or Google, you're trusting that the people who work at that company are not tampering with your data or looking at things they shouldn't look at, right? So what's really interesting about confidential containers, and that project in general, is that it tries to ask: what trust assumptions should we have? Could we make it so we don't have to place quite so much trust in so many different people and so many different technologies? The real goal of confidential containers is to minimize trust in the cloud provider, so that even if there's an evil system administrator at your hosting provider, they can't go in and tamper with your containers and steal your data and so on. With confidential containers, the items in blue on this diagram actually have some protection against all sorts of malicious actions, which you don't usually get inside a cloud environment. Usually, if containerd, or whatever handles those aspects of the container landscape for you, is hacked, then all bets are off from a security standpoint. I'm only going to talk a little bit about how this works, because unfortunately Ding is unable to be here. But basically, confidential containers, and confidential computing in general, build on concepts from trusted platform modules. They use built-in hardware protections that exist inside many modern processors to get a root of trust and build up from there, in order to have a chain of trust and to have what are called attestations (although we're going to use that word in a different way in a little bit) that describe, in a trusted way, what software is running.
What this basically allows you to do is only have to trust your silicon vendor, your processor manufacturer, people like that, as opposed to needing to trust all of the other parties we saw before. However, confidential containers does have a lot of trust assumptions that remain, which makes sense; it's a very hard problem to do computation without a lot of these assumptions. In general, the concept is focused on boot-time and startup-time validation of the things it is running. And a real question you have in a system like this is: how do you know that the thing you're running is actually the thing you're supposed to be running? That is the focus of the second part of this talk. To really get into that, we first need to talk about how software is made. I'm just going to present a kind of silly, very simple diagram here; there are lots of different ways to draw this. But you can imagine that you have a version control system. You're probably using Git, and some form of hosting for Git, if you're like most software projects. And you might do some testing. You can do testing over your source code, like linting, or you can do testing over the things that you build, like fuzzing and unit testing. Many projects will have both of these steps, but in my very simple example here, I'm just showing one of them. Then you'll have a build process by which you turn your source code into something you're going to execute. Then you'll package it up in some way: you'll containerize it, or whatever, depending on your environment. And then you'll ship it off. A real problem is that attackers can hack basically everywhere in this diagram, and even in lots of places that I haven't described.
So I'll show you some historical examples of these. The Free Software Foundation was hacked. vsftpd was compromised. There was an interesting case where the FBI allegedly backdoored OpenBSD, which is really bad. Juniper was hacked, allegedly by the NSA, the National Security Agency in America. So even the American government and its agencies are potentially hacking into these kinds of organizations. And of course, things we all care about, like the Linux source code repo, have also been hacked over time. I want to say that if I wanted to, I could have put 30 or 40 examples of this happening; this is just a quick selection. This is not an isolated problem. It's also very possible to hack the build system, and this has been known for a long time. Ken Thompson described, in "Reflections on Trusting Trust," how to put undetectable backdoors into build systems. Apple had a big problem with XcodeGhost. Fedora had a hack happen quite a while ago, where attackers were able to get into their infrastructure, steal their signing key, and then use that key to sign a bunch of malicious packages. I actually reached out to Fedora when this first happened. I spoke with some of the Fedora team, because I had been doing work on package manager security at that time, and a bunch of the package managers were using our work, including yum. I said, if you want, I'd be happy to look at any security design you have, or give you any thoughts I have on it. And the response I got was basically: well, we could tell you about it, but we're afraid it would compromise the security of our design if we tell people what it is. As a security person, that's always a really bad sign, when people are not willing to tell you what their design is. But I did find out what it was, because they were hacked again about eight months later.
When they were hacked, they revealed what their design was. Basically, they'd used a TPM to do signing in their infrastructure. So when they got hacked the second time, rather than stealing their signing key, downloading it, and using it to sign packages offline, the hacker just uploaded the malicious software they wanted signed to the build server and signed it there. So it really didn't provide any protection. There are a lot of examples of this, including SolarWinds, another one people talk about a lot, where allegedly hackers from Russia broke in and changed code right before it went into the compiler, inserting malicious backdoors and things like that. And yes, of course, there are also a bunch of compromises that can happen in the packaging step. I'll go through this a little quickly so we have time for everything else, but there are a lot of really interesting security bugs here. Finally, testing is another thing that often gets overlooked, so I'm going to spend a little time trying to convince you that this is an important problem. The problem is that it's not uncommon for an organization to believe they have run tests, or to have a process for running tests, and then accidentally release something that didn't go through their QA process. This actually happens quite a lot. It happened to the Linux kernel. Windows had an interesting situation where Microsoft released an update that only went to certain countries that do a lot of censorship, and only to mirrors that served those countries. The security community was thinking that maybe Microsoft was pushing out a backdoored version of Windows for those users, to let them be compromised by their governments. But about two weeks after that happened, they accidentally pushed another update to a different set of update servers somewhere else.
So then people said, no, maybe it really was a mistake, as Microsoft claimed the first time. But this has happened at other companies too. Apple and other major tech companies have done this. Lots of groups have accidentally pushed out an update that was only supposed to be a beta, or only supposed to be tested, but not actually a production update. And even worse, hackers can attack all kinds of places in between steps. In between your code and your build process, a hacker could interfere and, rather than have you build the thing that came out of your version control system, have you build something different entirely. Rather than have you package the thing that was built by your build system, have you package something else. So the question is: how do we fix this? Today there are lots of different solutions that work at different points of the design space. There are a bunch of things with Git signing and other version control protection mechanisms. In fact, we actually created the Git tag signing mechanism that Git uses. We found a bunch of flaws in the way Git was doing tag signing and worked with their community to upstream some fixes, and we've been working since then to address other issues in that part of the design space. Another aspect that has a bunch of problems, but also a bunch of solutions, is the build process. There are a lot of really exciting things that people have done with build security. The one I'd like to highlight most is the Reproducible Builds project. This is the idea that you can build software on two different machines, with two moderately different configurations, and actually get bitwise-identical copies of the software.
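As a toy illustration of the reproducible-builds idea: two independent builds of the same source are trustworthy only if their outputs are bitwise identical, which reduces to comparing digests. The artifact bytes below are made up for the example.

```python
# Toy illustration of the reproducible-builds check: two independent builds
# should produce bitwise-identical artifacts.  Artifact bytes are made up.
import hashlib

def digest(artifact: bytes) -> str:
    """Hex SHA-256 digest of an artifact's bytes."""
    return hashlib.sha256(artifact).hexdigest()

def builds_match(artifact_a: bytes, artifact_b: bytes) -> bool:
    """Reproducible means: same source, different builders, same bits."""
    return digest(artifact_a) == digest(artifact_b)

official = b"\x7fELF...binary bytes..."
independent = b"\x7fELF...binary bytes..."
tampered = b"\x7fELF...binary bytes plus backdoor"

print(builds_match(official, independent))  # True
print(builds_match(official, tampered))     # False
```

A single mismatched bit changes the digest, so a backdoor inserted by one compromised builder is caught as long as at least one honest builder produces the expected bits.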
This lets you verify that the build-process part of your software supply chain doesn't seem to be backdoored, because you have, at least in theory, two different compilers, running on two different operating systems, on two different pieces of hardware, building exactly the same thing. There's also a bunch of work that's happened in the packaging step. We've done a bunch of work in this area; I mentioned I'd done work with the Linux package managers. I was also one of the creators of the TUF project, The Update Framework, which I'm not going to go into in a lot of detail, but TUF is widely used, and there are automotive versions of it used in Automotive Grade Linux and elsewhere. But that's not really what this talk is about. There's also something you may have heard of called an SBOM, a software bill of materials. Who's heard of an SBOM before? Who uses SBOMs today? Okay, a lot fewer hands, but still good; that's a lot more hands than I would have seen a year or two ago. This is an effort to try to capture information about the software supply chain process, and it's quite a useful thing. The way a lot of people think of it, it's like the ingredients label on a package of food: it tells you what's in it. But the problem is that SBOMs in practice are often not very accurate. The SBOM often doesn't match what actually ships, and there's a lot of missing information and so on. While there's a lot of work going on to improve this, there's definitely still a big gap here. So, at least today, given the way most SBOMs are generated, it's perhaps better to treat them as informational rather than as security-critical information you can rely on, especially in situations where an attacker may have compromised parts of your infrastructure. So, with all of these point solutions, the natural question is: have we fixed the problem?
And the answer is: well, really, no, we haven't. Because you don't know that these steps were all correctly followed. You don't know they were done by the right people. You're not resistant to situations where an attacker has compromised, modified, or become part of your infrastructure. So the goal of the in-toto project, and what it provides to many projects, including some work we're doing now with Confidential Containers, is to secure the complete software supply chain. We want to verifiably define the steps of the software supply chain, define who the actors are who perform those steps (in our system they're called functionaries), and guarantee that everything happens according to the way it's defined and nothing else occurs. The way we do this is we have a definition of what the software supply chain is supposed to look like, in a document called an in-toto layout. An in-toto layout consists of a series of steps. For the simple example we had before, we have our different steps, and the layout describes basically what happens in each of them, and which parties hold keys. These functionaries could be people, like a system administrator who has a key, or it could be a key inside a TPM, or something like that. Each step has products, things it produces, and a step may also have materials, things it takes in. For instance, in a lot of projects, the way you make a production release starts with you creating a tag inside Git and signing that tag. Then the build system takes that signed, tagged version of your code and builds it. The thing the build system produces is binaries and so on, and that's what actually gets packaged and shipped off to your users. The rules inside the layout link all of these different aspects together.
Then the layout itself is signed by a party in our system we call the project owner, who basically serves as a root of trust for in-toto, saying: this is the layout you're supposed to follow. And importantly, when you perform each of the steps in the software supply chain, you generate evidence called links for every step. These are also called attestations. In fact, if you've heard of SLSA attestations or other attestation formats, they use in-toto attestations underneath; SLSA is an opinionated layer on top of in-toto attestation metadata. As each of these steps runs, it creates these attestations, or links. Then, for verification, you bundle this all together: you take the layout, the attestations, and the actual thing you're shipping, send it to the end user, and the end user can check that everything is verified according to the policy. Let's see, how am I doing on time? I have enough time to talk about this. Okay, I'll go through this in a little more detail so you can understand a bit more about how this works. The layout describes, in JSON, exactly which functionaries and keys exist, exactly what is supposed to happen at every single step, and how this all fits together. It also contains other information, like expiration dates. For instance, a layout might indicate that a specific test process is supposed to return a certain exit code: when you run the unit tests, you should get exit code zero instead of negative one or something else. This is all signed by the project owner for authenticity, and it lists the public keys for all the functionaries.
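To make this concrete, here is a heavily simplified sketch of a layout in Python. The field names and rule spellings are only loosely modeled on the real in-toto JSON schema, and the check shown (expiry plus known functionary keys) is a small slice of what real in-toto verification does.

```python
# A heavily simplified sketch of an in-toto layout.  Field names and rule
# spellings are illustrative, not the exact in-toto specification.
from datetime import datetime, timezone

layout = {
    "expires": "2030-01-01T00:00:00Z",
    # public keys of the functionaries, distributed by the project owner
    "keys": {"alice-key-id": "<alice public key>",
             "bob-key-id": "<bob public key>"},
    "steps": [
        {"name": "tag", "pubkeys": ["alice-key-id"],
         "expected_products": [["CREATE", "v1.0.tar.gz"]]},
        {"name": "build", "pubkeys": ["bob-key-id"],
         "expected_materials": [["MATCH", "v1.0.tar.gz", "FROM", "tag"]],
         "expected_products": [["CREATE", "app.bin"]]},
    ],
}

def layout_valid(layout: dict, now: datetime) -> bool:
    """Check expiry, and that every step names only known functionary keys.

    Real in-toto verification does far more: it checks the project owner's
    signature over the layout, each functionary's signature over its link,
    and all of the artifact rules.
    """
    expires = datetime.fromisoformat(layout["expires"].replace("Z", "+00:00"))
    if now >= expires:
        return False
    return all(key_id in layout["keys"]
               for step in layout["steps"]
               for key_id in step["pubkeys"])

print(layout_valid(layout, datetime(2026, 1, 1, tzinfo=timezone.utc)))  # True
```

Note how each step names which functionary keys may sign it, what it should consume, and what it should produce; the project owner's signature over this whole document (omitted here) is what makes it the root of trust.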
You can do things like key rotation and revocation, change functionaries, and handle compromise situations, all by using in-toto layouts. The individual steps contained in the layout describe operations in the software supply chain; these are things like, once again, the build process and so on. They're performed by different functionaries, who provide signed link metadata as proof that the step was carried out. It's important to note that these don't have to be computational things. For instance, if you have a lawyer who performs a license review to sign off that a certain open source license is okay, they can just create an in-toto attestation saying "I've approved this," and it will be linked to the hash of the license they approved and correctly indicated in the metadata. In general, the steps limit the actions a functionary can perform. A step describes what the functionary is able to do, and also how the materials and products, the inputs and outputs of the steps, connect together. Within a step, you also limit trust through rules, saying, for instance, that the build process is able to create a binary with this specific name. That would be a very normal thing to specify, but everything is done in a least-privilege manner; we don't just blindly give privileges to steps and let them do whatever. You specify this. Usually, the way this happens in practice is that you run your normal supply chain, in-toto watches what you do and then says, "this is what I see, does this look right?", and you can take a peek and tweak it a little if you need to. This prevents situations you have today where, for some Linux distributions, the same set of keys is trusted for everything: either you're a trusted developer or you're not.
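A rough sketch of what a functionary's link metadata might record for one step is below; real in-toto links follow a precise schema and are signed by the functionary's key, which this toy version omits.

```python
# Sketch of what a functionary records in a "link" attestation for one step:
# hashes of the inputs (materials) and outputs (products), plus the command
# that was run.  Real in-toto links are also signed by the functionary's key,
# which this toy version omits.
import hashlib

def record_artifacts(files):
    """Map each artifact name to the hex SHA-256 digest of its bytes."""
    return {name: hashlib.sha256(data).hexdigest()
            for name, data in files.items()}

def make_link(step_name, command, materials, products):
    """Build unsigned link metadata for one supply-chain step."""
    return {
        "step": step_name,
        "command": command,
        "materials": record_artifacts(materials),
        "products": record_artifacts(products),
    }

link = make_link("build", ["make", "all"],
                 materials={"main.c": b"int main(void){return 0;}"},
                 products={"app.bin": b"\x7fELF..."})
print(link["step"], sorted(link["products"]))  # prints: build ['app.bin']
```

This is also why non-computational steps fit naturally: the lawyer's license sign-off is just a link whose material is the hash of the license text.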
So if you're the person who's supposed to be doing, let's say, the Pig Latin language translation, you shouldn't also have access to change the kernel source code and modify the kernel package. But in many distributions today, you do have that privilege. Not with things like in-toto, where the artifact rules limit what people are actually able to do in different steps. For example, the CREATE rule says an artifact shouldn't exist at the start of a step but should exist after it. And you link steps together using rules: MATCH says that a file has to have the same secure hash as a product that came from some other step. This is how you ensure that the things that came out of your build process are the things you package, and the things you tagged are the things that went into the build process. It does all of that, ensuring, in this case, that the VCS works this way. Then inspections can be used to verify metadata at every step. They're performed by the client, and they can use additional link metadata and other attestations. If you're doing normal SLSA things today, you're mostly using isolated pieces of link metadata along with inspections; for the most part, SLSA installations tend not to use layouts yet, because layouts require a little more work to set up. For example, an inspection would do something like ensure that all of your commits are signed, if you have Git signing turned on, or verify that only the people who are allowed to do merges to master are the ones who've done them. You can have those types of things done in inspections.
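The MATCH rule can be sketched as a hash comparison between one step's products and a later step's materials; the link dictionaries here are simplified stand-ins for real in-toto link metadata.

```python
# Sketch of the MATCH artifact rule: an artifact a step consumed must carry
# the same hash as the artifact another step produced.  The link dictionaries
# are simplified stand-ins for real in-toto link metadata.

def match_rule(consumer_link, producer_link, artifact):
    """True if `artifact` reached the consumer unchanged from the producer."""
    consumed = consumer_link["materials"].get(artifact)
    produced = producer_link["products"].get(artifact)
    return consumed is not None and consumed == produced

tag_link = {"products": {"v1.0.tar.gz": "abc123"}}
build_link = {"materials": {"v1.0.tar.gz": "abc123"}}
print(match_rule(build_link, tag_link, "v1.0.tar.gz"))   # True

# Tampering between the two steps changes the hash, so the rule fails:
evil_build = {"materials": {"v1.0.tar.gz": "deadbeef"}}
print(match_rule(evil_build, tag_link, "v1.0.tar.gz"))   # False
```

Chaining MATCH rules across tag, build, and package steps is exactly what catches an attacker who swaps artifacts in between steps.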
Okay, I'm going to wrap up, and I want to leave plenty of time for questions and back-and-forth with the audience, because when I go to a talk, I often think the most valuable thing is listening to the speaker answer questions, rather than death by PowerPoint, which is what it sometimes turns into. To wrap up, I'll mention a few things I hope you take away from this. Securing the software supply chain is really important, and confidential containers is absolutely an important use case that we're really excited to be working with on this project. in-toto is a Linux Foundation project that's widely used in industry, so you'll see it popping up all over the place. Pretty much any time you hear someone talk about a supply chain attestation, what they're really using is in-toto underneath. We were the first tool; we were doing this before it was cool. I don't know if supply chain security is cool now, but if it is, then we were doing it before it was cool, back when we still had to convince people that, no, somebody could really break into your compiler. And my former PhD student, Santiago Torres, has done an amazing job with this project. He was in my office for many, many months, almost a year, where we worked out different approaches on the whiteboard. We took in, I think, a dozen or so real-world software supply chains from different organizations and tried to work out exactly how to do a lot of the metadata signing and storage in a way that worked well for everybody. After nine months or so of finding little problems in everything we'd done, we finally settled on what we use in in-toto. And we're really pleased that, among the many people who've come after us, it's been very common for them to say: oh, well, I like in-toto, but I think I want to do the metadata this way.
And almost always, in fact I think to date it's always been, it's one of the things we thought of, tried, and found didn't work for some reason. So we say: oh, well, you'll have this problem if you do it that way, and this problem, and this problem. And then they say, well, what do you think we should do? And it's: just use our signing framework. So that's what a lot of folks have done. So I encourage you all to give in-toto a try and let us know any thoughts or questions. We're a really friendly community and always welcome participation from others. With that, I'm happy to take any questions. Yes? Okay, let me answer that question first, and then I'll listen to your second question, but I'm going to repeat your question so everybody hears it and it's on the recording. Your question is: is it critical to record things using something like a TPM as part of the build process, or do you really get any advantage if you don't have something like TPM recording, or another kind of root of trust, for your functionaries in a system like in-toto? Okay, so you do get an advantage from recording it using in-toto. You don't get the same level of advantage as by using a TPM; a TPM is a much stronger basis to do attestations on top of. But you still get a lot from just signing things with functionaries in the right places. We've actually done an analysis using the catalog of software supply chain compromises maintained by TAG Security in the CNCF, which, by the way, is another friendly community that I'm a member of, and I encourage anybody who's interested to come to our meetings and join. Anyway, we went through and looked at a lot of historical compromises, and many of them would not necessarily have been protected by TPMs or other things like that.
Things need to happen in the right place, but you get a lot just by signing with functionaries; you address a very substantial percentage of attacks. When TPMs are used in some environments, that just raises the bar and makes those systems that much harder to hack. So it adds value, but right now the bar is low enough for attackers that it's not clear how much difference there is over just signing things. [Audience] From my understanding, TPM is a very complicated device. Should we use TPM, or should we just use confidential computing technologies like TDX, with measurements like the RTMR values, to implement simpler measurement mechanisms? What do you prefer, if TPM is too complex? [Speaker] So, I will say that's a little more out of my area of expertise, but I'll take a shot at answering your question. In general, you want to use whatever technologies you have available, but everything in security has a cost-benefit trade-off. If the complexity of using a system is very high, you also have to look at what it protects against. TPMs tend to be good at giving you hashes of things at boot time, which tends to be fairly useful, but they also tend to come with a fair number of assumptions about the security of the things running at lower levels, basically that you don't have certain types of vulnerabilities and so on. So it's really hard to judge. In general, most systems that have TPMs and do these types of things haven't been hacked. But then, we also don't see a lot of insider attacks at cloud providers in general. And I think the answer to that is: let's say you heard that one of the major cloud providers had administrators looking through people's cloud environments and stealing company secrets. What do you think would happen to that cloud provider's stock price?
What do you think would happen to their user base? It would all immediately go down. So all the cloud providers are very incentivized to play well and do a nice job in general. So there's a question about how much effort and work needs to be put into that aspect of it. Putting some protection in place probably gives you such a high bar that no one is going to cross it, is my thinking. [Audience] Okay, thank you for your answer. [Speaker] Okay, happy to take any more questions if anyone has one. Just a moment, please wait for the microphone. [Audience] I think my question is very simple, because I don't know the idea of in-toto, sorry. But you and the previous speakers talked a lot about TPMs. I don't know: what is a standard TPM? [Speaker] Okay. A TPM is just a trusted platform module. The way to think about it is that it's a thing inside your hardware, like a secure processor, that can do certain things. What it can do depends on the type of TPM; there's some variation. But in general, it lets you either run code in a more secure, enclave-like environment (you might have heard of things like SGX, which are ways of doing that), or it can hold keys that it manages internally and compute secure hashes over parts of memory. And often what you'll do is provide that to a third party. Part of the dream of TPMs, at least originally, was that you would be able to walk into an internet cafe, back when that was a thing, before cell phones. Imagine you don't have a smartphone, and you want to sit down at a computer and log into your AOL account, or whatever it was at the time, to check your email. The idea was: how do you know that that computer can be trusted? Maybe you want to look at your bank information and other things.
Part of the original motivation behind TPMs was that they would boot part of the system, and then this hardware would constantly be checking to make sure no malicious code had run as part of the boot process, no malicious code had run later during different parts of the BIOS and everything else, and no malicious code was running as part of the operating system; you loaded all the correct things. Then you would know: okay, now my application, my web browser, my email client, whatever is running on here, I can trust it, because it's running all the right things. If you trust that, you still have to trust the hardware maker, but that's a much lower bar. But TPMs have really morphed into something different in the last five to ten years, where it's been a lot more about isolating bits of computation on a big running system. One of the problems with doing everything at boot time is that operating systems are very complex; they have lots of bits and pieces, lots of bugs, and lots of other issues that can make them exploitable. So being able to do more with things later in the process makes TPMs, makes trusted hardware, a lot more valuable in practice. I hope that helped. [Audience] Thank you very much. [Speaker] Anyone else have a question? Or does anyone who's a hardcore TPM nerd want to correct anything I said? Okay, looks like no more questions. All right, well, thank you all for coming. Enjoy the rest of your conference.