Hello everyone, and welcome to our talk, Securing Content Distribution with The Update Framework. My name is Lukas Pühringer. I'm a maintainer of the software supply chain security projects TUF and in-toto at New York University's Secure Systems Lab.

My name is Joshua Lock. I am the security lead in VMware's Open Source Technology Center, where I also maintain TUF.

Okay, let's see what we're going to talk about today. First we will talk very briefly about content distribution: what we actually mean by that and what challenges exist. Then we'll give a TUF primer, TUF being short for The Update Framework, and talk about the pillars TUF is based on, the principles it uses, and how it faces the challenges of content distribution. And last but not least, we'll talk about how to get involved in the project, because this is a maintainer-track talk of an open-source project, so we want you to get excited and help us out.

But first things first: content distribution. What content do we mean? Basically any digital content that needs to be kept up to date. This can be software, and primarily it is software, but it can also be any sort of metadata, legal documents, et cetera. But yeah, usually it is software. As such, content distribution is part of the software supply chain, and a very crucial part too, because it's at the user boundary in this software supply chain graph. By software supply chain I mean all the steps that are carried out in order to write software, test it, build a package, and finally ship it out. Being at the edge of this graph, content distribution needs to be done carefully. Let's put it this way: once software leaves the premises of the software producers, they will have a hard time enforcing any quality assurance on it. So they should make sure that the software they intend the user to have is actually the software that the user gets.
So that is why this is a crucial part of the supply chain, and also a very attractive target for attackers, because if attackers compromise the content distribution, they can have a huge impact on potentially millions of users. And this also happens in the real world, and Joshua will talk a little bit about when this has happened before.

Just the next slide, please, Lukas. So you might think that secure software updates are a solved problem, especially if you, like me, come from a background of using Linux distributions, where you get all of your software from a single point and it's delivered in a secure fashion. But remote update systems are regularly compromised to deliver malicious content to users. You can see a kind of logo soup of systems that have been compromised to deliver malicious content, and this continues to happen to this day as more software is produced that leverages new and different software update systems. So systems continue to be deployed with flaws that are susceptible to the kinds of compromise that enable someone to deliver malicious content.

Relatively recently, a large computer systems vendor had a piece of software on the devices they sold that was used to update the firmware and drivers and whatnot on those devices, and that software update system was compromised to deliver malicious content that was targeting specific users. The malicious content was delivered to huge swathes of users, and a subset of those users were targeted to activate that malicious content and do unkind things. And that continues to be the case: we continue to see these kinds of attacks happen against multiple types of software update systems.

And so you might think, surely there's an easy solution to that. Could you just switch to the next slide, please? A couple of things might come to mind when you're thinking of this as a relatively easy problem or a solved problem.
One solution that might come to mind is the little padlock in our browser URL bars, which we all became so used to seeing that the browsers don't even show it anymore. It used to indicate the presence of SSL or TLS certificates protecting your connection. That is a very low amount of security for a software update system. It does improve on the plain HTTP status quo of old, but it's still a single point of failure. There's a single key, which is kept online in the system's memory, which is prone to compromise. And even then, all that an SSL/TLS certificate really guarantees is that you have received content from the server that you connected to. It doesn't provide any additional guarantees about the integrity of that content, or that the server you did connect to was even the one you intended to, because your software update system isn't really indicating which server it's actually connecting to.

So if you think about it a bit more, you probably have heard of PGP, Pretty Good Privacy, I think that's the acronym, or the GnuPG implementation, GPG, and think maybe you can just use that to sign the things you're distributing. And this is a positive step, because it gives you offline keys, and that's good. The keys are not kept in memory all the time, so that's an extra level of protection. But as soon as you start to introduce offline keys, you have to worry about things like key distribution and revocation. Key management is known to be a difficult problem, as is knowing which keys to trust. GPG, or any naive signing system, only provides two security properties: that the thing was authentic at the time of the signature, and that the thing has retained its integrity, that it hasn't been tampered with. The signature would not validate if the thing that's been signed is modified.
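To make the limits of a naive signing scheme concrete, here is a minimal sketch. It is an illustration only, not how GPG or TUF work internally: an HMAC with a hypothetical shared key (`SIGNING_KEY`) stands in for a real asymmetric signature so the example runs with the standard library alone, and the release contents are invented. It shows that a signature check catches tampering, yet happily accepts an older file that was legitimately signed in the past.

```python
import hashlib
import hmac

# Hypothetical key for illustration. A real update system would use an
# asymmetric signature scheme; HMAC is only a stand-in that needs no
# third-party library.
SIGNING_KEY = b"demo-key-not-for-production"


def sign(content: bytes) -> str:
    """Return a hex 'signature' (really an HMAC-SHA256 tag) over content."""
    return hmac.new(SIGNING_KEY, content, hashlib.sha256).hexdigest()


def verify(content: bytes, signature: str) -> bool:
    """Check the tag in constant time."""
    return hmac.compare_digest(sign(content), signature)


# Two invented releases, both signed legitimately at different times.
old_release = b"package 1.0 (has a known vulnerability)"
new_release = b"package 1.1 (vulnerability fixed)"
old_sig = sign(old_release)
new_sig = sign(new_release)

# Tampering with signed content is detected.
tampered_rejected = not verify(new_release + b" plus malware", new_sig)

# Serving the old release with its old signature still verifies: the
# signature alone says nothing about whether the content is current.
replay_accepted = verify(old_release, old_sig)

print("tampering rejected:", tampered_rejected)
print("old release accepted:", replay_accepted)
```

The second result is the gap: authenticity and integrity hold, but nothing in the check tells the client that a newer release exists.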
But it doesn't help with a whole slew of attacks, including things like replay attacks, where you deliver a previously signed bit of content that's no longer valid, or mix-and-match attacks, or various other types of attacks that are covered in more detail in The Update Framework's documentation. The other thing I'd say about GPG is that it has well-known usability problems, which have been documented ad nauseam and have led relatively huge numbers of security folks to just give up on the system. So there's no really good, easy solution.

There are also new types of attack coming out all the time. Very recently there were these dependency confusion, or dependency substitution, attacks. The security researcher Alex Birsan noticed that a lot of corporations have internal package repositories that host content for their internal systems, and that the users of those internal package repositories had misconfigured package managers that would reach out to public repositories before, or instead of, reaching out to the internal repository. That meant that the security researcher was able to put packages on the public repositories with malicious content that replicated the internal names, and so companies were just blindly installing malicious software from the public repository, thinking they were installing the packages they intended from their internal systems.

I think this is interesting for two reasons. One, we absolutely have an enhancement to the TUF specification which would help protect against an attack like this. But I think it's also just indicative of how content delivery is a very attractive target for malicious software, and TUF provides a very solid foundation for protecting against known attacks and for building protections against future attacks on top of. So with that introduction, I'll hand back to Lukas to talk us through some of the principles of TUF.

Thanks, Joshua.
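The dependency confusion scenario described above can be sketched in a few lines. Everything here is hypothetical and invented for illustration: the index contents, the package names, the `PINNED_PREFIXES` mapping, and both resolver functions. In particular, the pinned resolver is only a generic illustration of tying a name to the repository allowed to serve it; it is not the TUF enhancement mentioned in the talk.

```python
from typing import Dict, Optional, Tuple

Version = Tuple[int, ...]

# Hypothetical index contents: package name -> latest available version.
INDEXES: Dict[str, Dict[str, Version]] = {
    "internal": {"acme-billing": (1, 4, 0)},
    # An attacker has published a same-named package with a huge version.
    "public": {"acme-billing": (99, 0, 0), "public-lib": (2, 0, 0)},
}

# Hypothetical policy: names with these prefixes may only come from one index.
PINNED_PREFIXES: Dict[str, str] = {"acme-": "internal"}


def naive_resolve(name: str) -> Optional[str]:
    """Misconfigured behaviour: take whichever index has the highest version."""
    candidates = [
        (packages[name], index_name)
        for index_name, packages in INDEXES.items()
        if name in packages
    ]
    if not candidates:
        return None
    return max(candidates)[1]


def pinned_resolve(name: str) -> Optional[str]:
    """Only consult the pinned index for names that match a pinned prefix."""
    for prefix, index_name in PINNED_PREFIXES.items():
        if name.startswith(prefix):
            return index_name if name in INDEXES[index_name] else None
    return naive_resolve(name)


print(naive_resolve("acme-billing"))   # the attacker's public package wins
print(pinned_resolve("acme-billing"))  # the internal package is used
```

With the naive resolver the internal name resolves to the public index simply because the attacker chose a higher version number; with the pinned policy a missing internal package fails closed instead of falling back to the public index.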
Okay, let's look at TUF more closely. What can TUF do for you? TUF is based on these three pillars. First of all, it protects your content. More precisely, it protects the freshness, consistency, and integrity properties of your content. On top of that, it reduces the impact of a successful compromise. Joshua has shown us that these compromises do happen, that keys, especially keys that are kept online, can get lost. But even GPG keys need to be revoked every now and then, which indicates that they also get lost or stolen. So TUF has mechanisms to reduce the impact of such a key loss or compromise. And speaking of recovery, TUF was built with recovery of keys in mind, so that's built into the core design of TUF.

Let's see how TUF can do all those things. Regarding content protection, TUF employs cryptographic signatures. It uses asymmetric cryptography and signs both the content and the entire repository. Signing one software package, for instance, or one container image, is pretty good. It gives you a guarantee about the integrity of that file, and that's what many or most software updaters do; it's nothing out of the ordinary. But it's also important to ensure that the entire repository is consistent, because an attacker who can control which files are served might be able to serve files that are each benign by themselves, but not if they're served together. That's why TUF signs both the individual content and the entire repository.

As for the freshness property, TUF uses implicit key revocation, also known as expiration. TUF just puts expiration dates on signatures. That means that if an attacker can intercept traffic and tell the client that there are no new updates, the client will detect this at some point, because the local signatures will expire and the client will ask for new signatures. So much for protecting the content.
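The consistency and freshness checks described above can be illustrated with a small sketch. This is a simplification under stated assumptions, not the format or workflow of the TUF specification: the signed repository listing is modelled as a plain dictionary of SHA-256 digests, signature verification is assumed to have already happened, and the function names, file contents, and dates are invented.

```python
import hashlib
from datetime import datetime, timedelta, timezone
from typing import Dict


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def is_fresh(expires: datetime, now: datetime) -> bool:
    """A signed statement is only trusted until its expiration date."""
    return now < expires


def is_consistent(listing: Dict[str, str], served: Dict[str, bytes]) -> bool:
    """Check served files against a (hypothetical) signed repository listing."""
    if set(listing) != set(served):
        return False
    return all(
        sha256_hex(served[name]) == digest for name, digest in listing.items()
    )


# Invented repository state: two files that are meant to be used together.
app_v2, lib_v1, lib_v2 = b"app 2.0", b"lib 1.0", b"lib 2.0"
listing = {"app": sha256_hex(app_v2), "lib": sha256_hex(lib_v2)}
signed_at = datetime(2021, 5, 1, tzinfo=timezone.utc)
expires = signed_at + timedelta(days=1)

# The current files match the listing.
consistent_ok = is_consistent(listing, {"app": app_v2, "lib": lib_v2})

# Serving a new app together with an old lib is detected, even though the
# old lib was a legitimate file at some point.
mix_and_match_detected = not is_consistent(listing, {"app": app_v2, "lib": lib_v1})

# An attacker who keeps replaying the same listing is detected once it expires.
fresh_now = is_fresh(expires, signed_at + timedelta(hours=1))
freeze_detected = not is_fresh(expires, signed_at + timedelta(days=3))

print(consistent_ok, mix_and_match_detected, fresh_now, freeze_detected)
```

The expiry check is what turns "there are no new updates" from an unverifiable claim into something the client stops believing after a bounded time.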
What about reducing the impact of key loss? TUF employs several principles to reduce the impact of key loss. First of all, it separates responsibilities. I've already talked about different responsibilities: consistency, integrity, freshness. For each of these, a separate role exists in the TUF design, with separate signing keys. Separation of responsibilities allows you to continue operation if one key is lost. The same goes for threshold signing, which is like separation of responsibilities within one responsibility. For instance, for the role that deals with the integrity of the content, TUF might require multiple keys to sign for that content.

Last but not least, the separation of responsibilities also allows you to balance trust, or responsibility, against risk, which is often related to the availability of keys. An example would be a key that needs to sign content every couple of minutes, which needs to be highly available and is thus kept online. It is at much higher risk of being compromised, so TUF can assign very few responsibilities to, and place very little trust in, that key. On the other hand, a key that has huge responsibilities and needs to be trusted a lot can be kept offline.

And if a compromise has happened, then the content repository needs to make sure that the client gets new keys. TUF allows you to do this with in-band mechanisms, by creating a hierarchical trust delegation tree where you have a root role, which is the root of trust, and which delegates trust to all the other roles that we already talked about. There's the timestamp role, which is responsible for freshness; the snapshot role, responsible for consistency; and the targets role, responsible for the integrity of the individual content pieces, usually individual files. Root delegates trust to all of them.

Let's take a look at a diagram of how this can look in practice. This is courtesy of PEP 458, Secure PyPI downloads with signed repository metadata. PEP stands for Python Enhancement Proposal, and this is basically a design document that describes how TUF can be used to secure the Python Package Index. At the top of this diagram you can see the root role, then below you can see the other roles I talked about. The bin-n roles are basically also targets roles, which are responsible for subsets of the content integrity. The arrows show the trust delegation relationships: root signs for timestamp, snapshot, targets, and for itself, and targets can sign for delegated targets.

I'm mostly showing you this because we already said that an easy solution won't be enough to solve supply chain security, or content distribution in particular. TUF is not an easy solution, but it is also not as hard as it looks. And this lets me transition to the next part of the talk. TUF has a very friendly community, and there are a lot of people who will love to help you get started with TUF, to better understand it, to integrate it with your package manager, or to help you start contributing if you are interested in doing so.

Exactly that. So if you have seen this brief introduction to secure content delivery, attacks on those systems, and how TUF helps protect against them, maybe you'd like to know how to get involved with TUF. This is a maintainer-track talk, after all. If you could switch to the next slide, Lukas. TUF is effectively three main projects. The primary project is actually the specification. This is the framework in The Update Framework, which describes how to implement a secure update system. Then we have a reference implementation, which is symbiotic with that specification. Being the reference implementation, it aims to always represent the state of the specification, but it is also where we prove out, as a proof of concept, any enhancements to the
specification. And we have this TUF Augmentation Proposals, or TAPs, process, similar to the Python Enhancement Proposals, or PEP, process that was mentioned for the PyPI work. This process is for documenting information about the TUF system or proposing new features for the TUF system. I'm going to talk about each of those three projects in a lot more detail.

So, the specification. If you are interested in the specification itself and how you might contribute to that project, probably the easiest way that anyone can contribute to TUF is to review the specification and suggest any clarifications. We are continuously working to improve the ease of use for adopters of the TUF specification and to make it clearer, more succinct, and more approachable. If you have any feedback on ambiguities, or things which just don't seem to make sense, then we would welcome that feedback and would like to work with you on how to improve that. You can reach out to us via the issue tracker or our Slack channel, which we'll mention later.

In the vein of helping improve the ease of use for adopters, we recently switched the specification from plain Markdown to Bikeshed-flavoured Markdown, which produces this really nice-looking specification document you can see in the screenshot here, with a table of contents, syntax highlighting, and anchors for the different sections and subsections. I think it would be really beneficial if some aspects of the specification, specifically the detailed client workflow, had more anchors, so that you could refer to specific points in the workflow, for example from a code comment or when discussing the specification, and everyone would be able to follow that link and understand exactly which part of the spec you're talking about.

Another ease-of-use change we would like to make to the specification is that today it's all written as prose, and various parts of the prose are repetitive. You
know, there are certain things that we do in the TUF specification which are very similar to other steps in the specification, just with some key words changed. So we'd like to reduce some of this duplication by following the example of the specifications that drive the web, the WHATWG and W3C specifications. They tend to break these details down into subsections that can be called out, a bit like a procedure in a programming language, and we'd like to explore doing that within the TUF specification as well. Next slide, please.

I also mentioned the augmentation proposals, and this would be a nice way to get involved in the project as well. We have several work-in-progress augmentation proposals which could do with security-minded folks reviewing them and seeing what potential problems they find. We also have draft augmentation proposals. These are ones which have been merged and approved as a good idea but still need some work to complete, so reviews of those would also be welcome. But so, too, would proof-of-concept implementations. We don't merge any changes to the specification until they've been proven in code, and some of the draft TAPs are waiting for that proof in code to demonstrate that the idea is as sound as it seems on proverbial paper. And the final way you might contribute to the TAPs process is that you might propose new augmentations to the specification, or new informational documents that help people understand and deploy TUF.

The other way you might get involved, if you're more code-minded, is contributing to the reference implementation. We're currently undertaking a refactoring effort to provide a really exemplary implementation of TUF with the Python reference implementation. This is motivated in part by us being software engineers, but also in part by the integration into the Python Package Index, where we've been
extending the reference implementation to be a little bit more modular. You can now plug in storage backends. It used to assume that you're writing files to disk; now that might be cloud storage, and it could be a number of other things with the modularity that we've implemented. We've also recently extracted out networking operations. pip, which is the Python package installer, has a very mature code base for performing network operations, and we wanted the TUF implementation to be able to take advantage of that. So we very much welcome contributors to come and help us with our ongoing refactoring efforts. We're really looking to make this a very good, Pythonic project, one that is very clearly matched to the specification and very easy to understand, so that it could become supplementary to the specification as a way to understand what is happening in TUF.

And then, if Python and specification writing are not your thing, there are multiple other implementations of TUF that I think it would be great if you were interested in contributing to. We have a largely dormant Go implementation as part of The Update Framework organization on GitHub. We'd very much welcome active contributions to that project, to bring it back to life a bit and bring it up to date with recent changes in the specification. There's a new effort to implement TUF in PHP, which is part of a collaboration between various PHP content management systems to provide secure auto-updates for those systems. And then there are two implementations of TUF in Rust. One is rust-tuf, which is automotive focused and is used in various automotive systems today. And then there's tough, t-o-u-g-h, which is the system that adds secure updates to AWS's Bottlerocket OS, which is their Linux container host operating system.

And then another way that you might care to get involved is to participate in some of the
integrations. Integrations are how we describe taking TUF and integrating it into an existing software update system. If you have a software update system you work on that you think would benefit from TUF, we'd be happy to discuss how to approach these integrations with you. But there are also some ongoing integration efforts that you might wish to contribute to. The Python Package Index is adopting the reference implementation and adding TUF to Warehouse, which is the open-source project that backs PyPI. So any contributions to the reference implementation are going to benefit that project, but you might also want to contribute to Warehouse directly, and there are various tasks left to enable that effort. I also mentioned the PHP CMSs collaborating on a PHP implementation of TUF. They have the php-tuf implementation as well as various plugins for the Composer system to enable TUF and other security-related features, and I'm sure they will welcome contributions from excited participants.

And with that, before we switch over to questions, I wanted to take a moment to thank the happy, friendly contributors to the system that Lukas mentioned earlier. If you're interested in learning more, you can go to our website, theupdateframework.io. You can join the #tuf channel on the CNCF Slack workspace, or, if you're into email like I am, you can send an email to theupdateframework@googlegroups.com. I've added some references if you'd like to go and read a bit more about TUF and some of the ongoing integrations and implementations. You can follow these links and learn some more that way. So with that, thank you very much for your attention, and I look forward to your questions.