Hello and welcome to my talk, Back to the Drawing Board: Building Containers with SBOMs. I'm Nisha Kumar and I'm a senior open source engineer at VMware.

I'm hoping you've already heard enough about SBOMs and why we need them. But if you haven't, here are a couple of analogies. If you've ever looked at the ingredients on a product's packaging, you may be surprised at what you find. For example, Twinkies are not vegetarian, and coconut milk conditioner does not contain coconut milk. As with those ingredient lists, a software bill of materials helps you make informed decisions about the distribution and maintenance of the software you consume, either as a tool or as a dependency included in the software you distribute.

You may be asking yourself: why do I need to generate SBOMs for containers specifically? Well, this is a picture of a Node.js container. As you can see, there are plenty of components included that may not even be needed by a Node.js application. One of my friends on Twitter noted that there was a LibreOffice hanging around in there somewhere. Why would Node.js require LibreOffice? Who knows?

Now think about the several hundred Node.js applications comprising your SaaS offering. The number of software components being consumed becomes exponentially large, and it gets very hard to keep track of so many components. So naturally, they get hidden behind many layers of abstraction. A Kubernetes distribution consists of hundreds of components, but it is in turn a dependency for a number of services. The abstractions are needed because we cannot keep track of so many components and so many moving parts. As developers, we focus just on the thing that we are building. We know we need certain top-level dependencies, but those dependencies need dependencies. It's turtles all the way down, as they say.

The tools we use to build containers are not repeatable and they're not hermetic. Building a typical container requires downloading many artifacts from all over the internet and several configuration steps, not to mention the volume mounts to the host and the copying of files from the host into the container. Many of these steps are not repeatable: they depend on the state of the network, the state of the internet, and the state of your host machine. You may not be able to rebuild the same container a few months after you originally built it.

SBOMs help bring some transparency to the container ecosystem, even highlighting what the known unknowns are and what the unknown unknowns are. They also provide a separation between the source code and the rest of the system required for the software to run. And you don't need to look at an SBOM until it becomes necessary to do so.

One question I get asked a lot, as an engineer working for a rapidly transforming SaaS company: why do SaaS providers need to generate SBOMs in the first place? They are hosting all of the services; they're not delivering any software to a customer. Here's the thing: service providers don't ship software, but they do take custody of the customer's software and provide all the machinery to deploy it and run it. So customers entrust SaaS providers to take good care of their assets. With great power comes great responsibility. The burden of software management now falls on the SaaS provider. Customers are entrusting their code to SaaS providers and expect that SaaS providers know their infrastructure and their deployment pipelines inside and out. They expect their assets to be secure and auditable.
They don't care about what tools and processes are used to do this, but they do care about the risk they take on by entrusting a significant amount of their business operations to the SaaS provider. So there is actually more on the line with regards to reputation, and that should mean more investigation into what exactly gets built and shipped. Can a service provider today be sure about what they use to build and ship someone else's code? They may not need to provide this information to a customer, but they certainly need it for themselves.

So why not use static analyzers to generate these SBOMs? To understand why, let's take a minute to consider how static analyzers work. In general, static analyzers look for patterns and make inferences about what's in a file based on some rules and heuristics. Typically, there is some software that reads a file, looks up a public database of known patterns, checks to see if those patterns exist in the file, and then generates a report. The security solutions that you pay for may make use of a proprietary database, may have an API and a client to query that database, and may generate some fancy reports that make the decision makers feel good about their decisions.

Now, static analysis in general works really well on a large container image, the kind you build the way you would provision a VM or a desktop. Operating systems have metadata and they come with a package manager. One can use the package manager to install some system dependencies, including a programming language toolchain. Then one can use the toolchain to install the app's dependencies, and finally make some configuration changes for the app to work. All along the way, things like package manifests, license files, documentation, and other metadata get installed onto the container's filesystem. And that's what static analyzers read in order to make their inferences about what packages are installed.

But all this metadata makes the container too large, and large containers are difficult to deploy and distribute. To reduce the size of the container, we use multi-stage Docker builds. These multi-stage builds are used to copy out just the artifacts needed for the application to run; we put those artifacts into a minimal operating system filesystem that doesn't have any of the metadata scanners expect to find (there's a small sketch of this below). Further, we use tools like DockerSlim to remove all of the files the app does not open at runtime. So by this stage in the container build process, all the metadata about the container build is gone. This process is great for reducing the attack surface of a container image, but scanners will not be able to detect any useful information about the image at the end of it.

So how do we build transparency into container builds? How many of you thought, let's use eBPF? Well, we don't have to get that complicated. The thing is, we don't need to do things like inspect the Docker build logs or watch the kernel or do nmap scans. Container builders already know some portion of what they are installing. And we already have tools that read package manifests or invoke the package manager or the toolchain to list the transitive dependencies. If container builders can create accurate SBOMs for the pieces they are installing, and reuse the SBOMs created by folks like OS suppliers, software packagers, or any other automation, then they can include them in the container build and distribution ecosystem.
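To make the multi-stage point concrete, here is a minimal sketch of the kind of build I mean. It assumes a small Go app; the image tags, paths, and the app itself are purely illustrative placeholders, not something from the talk:

```bash
# Minimal multi-stage build sketch -- names and paths are hypothetical.
docker build -t app:slim -f- . <<'EOF'
# Stage 1: a full-fat build image, with compiler, package metadata, and all.
FROM golang:1.21 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -o /out/app .

# Stage 2: copy only the runtime artifact onto an empty filesystem.
# No package manifests, license files, or docs come along -- which is
# exactly the metadata static analyzers rely on.
FROM scratch
COPY --from=build /out/app /app
ENTRYPOINT ["/app"]
EOF
```

The resulting image runs fine, but a scanner pointed at it sees a lone binary on an otherwise empty filesystem, with nothing to infer packages from.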
What that means is that we get the best of both worlds: we can reduce the size of the container images, and we don't have to sacrifice any of the metadata we collect along the way.

So today I'm going to show you a workflow that's a good starting point for generating SBOMs during container builds. We're also going to sign the container image and the SBOMs and push them to a local OCI registry. We will be making use of several tools to accomplish this. We'll use Buildah to build the container. We'll use Tern to inventory the container and generate an SBOM. We'll use the ORAS CLI to push the SBOMs to the local registry, and we'll use Sigstore's Cosign to sign all of the artifacts. Now, all of these tools make use of a concept called OCI artifacts, and I'll talk a little about that later. But first, let's get on to the demo.

OK, let's see if we can do this ten-minute demo. I have here a Vagrant box, which I have provisioned; by the way, you can provision your own Vagrant box using the tools in this repo, Containers with SBOMs, which is hosted on GitHub, and I'll share the links to it later. Part of the provisioning of this Vagrant box is creating a minimal root filesystem using debootstrap. Let me show you what that looks like. That's a Debian root filesystem, and I created it with debootstrap. Let's look at the Containers with SBOMs repo. As you can see, there's a Debian tarball right there; that's just a tar of this Debian root filesystem.

OK, we need a local OCI registry. We'll use Podman to set that up, and that's just pulling the registry's Docker image and running it as a daemon. So let's give that a whirl... and it's done. Let's check if it's running, and there we go, it's running. Let's check if we have any images. Yeah, we just have the registry.

All right, let's get to building our first image. I have a convenience script over here that I'd like to go over. What we're doing is using Buildah to build an image from scratch; that's what this is, and it will return a container. Then we'll mount this container. I'm using buildah unshare, which lets me mount without root privileges. And I need this mount point because it's what I'm going to point Tern at in order to generate an SBOM. Then I'm going to add the Debian tarball to the container. That will take a little while when it runs, but what should happen at the end is that we have a minimal Debian container, and we'll commit that container to an image. Then we will inventory the container: we point Tern at the mount point and generate the Debian SBOM in SPDX JSON format. SPDX is a standard SBOM format, and one of the sub-formats SPDX supports is JSON.

OK, so we have an image and we have an SBOM. Then we will use Buildah to push the image to the local registry, use ORAS to push the SBOM to the local registry as well, and use Cosign to sign both the image and the SBOM. And we'll do some cleanup over here. All right, let's see how this goes. So this is where we add the Debian tarball to the container. Off it goes. Now we generate the SBOM. That was done. We now have the image and the SBOM. Now we push the image, the SBOM, and the signatures. So, yay, we're done. OK, now you should see that there's... oh, I don't have the Debian SBOM here. Never mind. It is up.
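For reference, the rootfs and registry setup from the demo amounts to something like this. The Debian suite, mirror, and registry port are my assumptions, not necessarily what the Vagrant provisioning in the repo uses:

```bash
# Create a minimal Debian rootfs and tar it up (suite and mirror assumed).
sudo debootstrap --variant=minbase bullseye ./debian-rootfs http://deb.debian.org/debian
sudo tar -C ./debian-rootfs -cf debian.tar .

# Run a local OCI registry as a daemon with Podman.
podman run -d -p 5000:5000 --name registry docker.io/library/registry:2
podman ps       # check that it's running
podman images   # just the registry image so far
```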
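And here's roughly what that first convenience script boils down to. The image names, file names, and registry address are placeholders, and the Tern and Cosign flags are my best reconstruction rather than verbatim from the repo:

```bash
#!/usr/bin/env bash
# Sketch of the first build script -- names and flags are illustrative.
set -euo pipefail

buildah unshare sh -c '
  ctr=$(buildah from scratch)          # empty working container, no base image
  mnt=$(buildah mount "$ctr")          # the mount point Tern will inventory
  buildah add "$ctr" debian.tar /      # unpack the Debian rootfs tarball
  buildah commit "$ctr" debian:latest  # commit the container to an image

  # Inventory the mounted filesystem and emit an SPDX JSON SBOM.
  tern report -f spdxjson --live "$mnt" -o debian.spdx.json

  buildah umount "$ctr" && buildah rm "$ctr"   # cleanup
'

# Push the image and the SBOM to the local registry...
buildah push --tls-verify=false debian:latest localhost:5000/debian:latest
oras push --plain-http localhost:5000/debian:sbom debian.spdx.json

# ...and sign both artifacts with Cosign.
cosign sign --key cosign.key localhost:5000/debian:latest
cosign sign --key cosign.key localhost:5000/debian:sbom
```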
So the Debian SBOM is in the registry, and we're going to use it to create the next container, which we're building on top of this Debian container. Let's take a look. This is another convenience script. We'll start off by verifying the container that we built, and that should work. Then we'll use Buildah to build on top of this image. We do the same thing that we did last time: buildah from with that Debian image, then we mount it, because we need that mount point to give to Tern. Then we're going to install some stuff; we're going to install Python 3. And then we're going to commit this image as python with a tag of 3.

OK, then we're going to download the Debian SBOM, and before that, we'll verify that it's signed. Once we've verified that, we pull it using ORAS. Then we have Tern inventory the container with the Debian SBOM as context, using the same SPDX JSON format, and we produce an SBOM for just the Python bits. We can now push the Python image and the SBOM: we'll use Buildah to push the image, and ORAS to push the Python SBOM. So the registry now holds both the Debian SBOM and the Python SBOM, because we needed the first one to be able to inventory the second. And then we're going to sign our new artifacts using Cosign.

All right, let's give this a whirl. OK, we verified it. And this is building, installing Python 3. And we're storing signatures. And we've verified our SBOM. And we're done.

OK, now I want to show you the two SBOMs that we have. We'll use jq to filter them, because there's a lot of data in there. So let's do jq; I just want packages, that's a list, and I want the name of each. Hopefully jq is on the system... there it is. OK, so those are all the packages installed on the Debian filesystem. Now let's see what that looks like for the Python SBOM. OK, not that many, because along with the Python 3 that we installed, we also brought in all of these transitive dependencies, and that's what this SBOM contains.

All right, so cool. That's the end of my demo. Let's see, I have 20 more minutes. I can switch back here. Hopefully... there we go.
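To recap, the second convenience script boils down to something like this. The flags, and in particular Tern's context option for extending an existing SBOM, are my best guess rather than verbatim from the repo:

```bash
#!/usr/bin/env bash
# Sketch of the second build script -- flags and names are illustrative.
set -euo pipefail

# Verify the base image's signature before building on top of it.
cosign verify --key cosign.pub localhost:5000/debian:latest

# Verify and fetch the Debian SBOM we pushed earlier.
cosign verify --key cosign.pub localhost:5000/debian:sbom
oras pull --plain-http localhost:5000/debian:sbom

buildah unshare sh -c '
  ctr=$(buildah from --tls-verify=false localhost:5000/debian:latest)
  mnt=$(buildah mount "$ctr")
  buildah run "$ctr" -- apt-get update
  buildah run "$ctr" -- apt-get install -y python3
  buildah commit "$ctr" python:3

  # Inventory with the Debian SBOM as context, so only the newly
  # installed Python bits end up in the resulting SBOM.
  tern report -f spdxjson --live "$mnt" --ctx debian.spdx.json -o python.spdx.json

  buildah umount "$ctr" && buildah rm "$ctr"
'

# Push the new image and SBOM, then sign both.
buildah push --tls-verify=false python:3 localhost:5000/python:3
oras push --plain-http localhost:5000/python:sbom python.spdx.json
cosign sign --key cosign.key localhost:5000/python:3
cosign sign --key cosign.key localhost:5000/python:sbom
```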
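The jq filter from the demo is just pulling the package names out of each SPDX JSON document (file names assumed from my sketches above):

```bash
# List package names from each SPDX JSON SBOM.
jq '.packages[].name' debian.spdx.json   # everything in the base rootfs
jq '.packages[].name' python.spdx.json   # just python3 and its transitive deps
```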
OK, so there are certainly a lot of gaps we need to fill for this to become viable, or at least to have an easier UX than what I've shown. One of them is artifact management, and that's where OCI artifacts come in. OCI artifacts is basically using the Open Container Initiative's image and distribution specifications to store artifacts in container registries that are not container images. In order to support that, there are a number of changes that need to be made to the OCI image spec and the OCI distribution spec, and there's a working group under the OCI that aims to do this. This would help us not have to manage so many tags; instead, you'd be able to refer to all of the supplemental artifacts for a container image with just one tag.

The other thing you may have noticed is that the method we used to build the container still relied on the old package manager approach: apt-get install python3. That is not really reproducible. Repeatable builds, as containers do them today, basically just means recovering a filesystem from a cache, and that's not what we want. What we really want is the ability to redo that docker run or buildah run such that you get the same, or a very similar, set of files at the end of it every time you run it. This is a problem that Linux distro tools have already solved, and they are good places to look for ideas and inspiration on how to make repeatable container builds happen.

As for where you can get involved: I have some links to the OCI artifact references proposal that Steve Lasker has open against the Open Container Initiative. Please come and join the Tern community meetings, because we talk a lot about this ecosystem and how Tern can help. You can read up on The Update Framework, which is the backbone of Sigstore and Cosign. And of course, there's a link to our friends at Sigstore.

OK, some resources. This demo is hosted on GitHub; it's public, so you can check it out. I don't expect to have any contributions, but if you have a question or an issue, feel free to file a GitHub issue. Here are a number of blogs discussing OCI artifacts; it's a concept that is a little difficult to get your head around, and I hope these blogs will be useful to you. Dan Lorenc of Cosign fame has also written a blog about The Update Framework, which is a nice, gentle introduction; this is also a concept that is difficult to grasp, and I'm still trying to wrap my head around it. And then there are links to the Tern repo and the SPDX spec. You can talk to me on Twitter at these handles, and these slides will be available.

OK, and with that, that's the end of the talk. Thank you for listening. I'm Nisha Kumar.