Hello. This is the OCI Images for More Than Containers lightning talk, and I am John Johnson. A little about me: I have been working at Google for the last five years on various container things, starting out on the GCR team and recently focusing on the generic Artifact Registry service. Since 2018, I've also served on the OCI Technical Oversight Board, which is where I formally argue about semantics on the Internet, mostly around containers. I also maintain a couple of projects on GitHub, and on the right you can see our dog, Grace Ann. See Grace.dog for more pictures of her if you're interested.
We will eventually get to Crossplane, but first we're going to go over some OCI data structure background, the registry, and the go-containerregistry library. At the end, we'll talk about how Crossplane ties all of this together to manage packages. Just as a disclaimer, this was originally going to be a longer talk with both me and Daniel Mangum, but it has been squished down into a lightning talk, so I'm going to go really fast, and I apologize. Also, a caveat: I am not a Crossplane expert, so I will hand-wave over some of those details while focusing more on the OCI stuff.
Moving on to the image specification. I would assume most people watching this are somewhat familiar with Merkle DAGs, but if not, a DAG is a directed acyclic graph, which means there are no cycles. A Merkle DAG is a graph constructed out of hashes, which means the entire thing is immutable and also enforces that acyclic bit. This is a very nice property for a data structure. You're probably familiar with things like Git and Bitcoin, which use Merkle DAGs, but you may not be familiar with the fact that OCI images are also built on top of Merkle DAGs. On the right we have an example graph that represents an OCI image. The arrows between nodes represent hashes, so when I say digest, I mean one of these arrows: a pointer to some content that is the hash of that content.
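To make the digest idea concrete, here is a minimal sketch in Go using only the standard library. The `Digest` helper and the toy "parent" node format are assumptions made for illustration; they are not part of the talk or of any OCI library. It shows that a node pointing to a child by hash gets a different digest whenever the child changes, which is what makes the graph immutable.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Digest is a hypothetical helper that returns a content identifier in the
// "algorithm:hex" form used by OCI, always with SHA-256 in this sketch.
func Digest(content []byte) string {
	sum := sha256.Sum256(content)
	return "sha256:" + hex.EncodeToString(sum[:])
}

func main() {
	// A leaf node is just some bytes.
	leaf := []byte("hello")
	leafDigest := Digest(leaf)

	// A parent node points at the leaf by embedding the leaf's digest.
	// The "child=" format is made up for this example.
	parent := []byte("child=" + leafDigest)
	fmt.Println("leaf:  ", leafDigest)
	fmt.Println("parent:", Digest(parent))

	// Changing the leaf changes its digest, so a parent that points to the
	// new leaf necessarily has a different digest as well.
	changedLeaf := []byte("hello!")
	changedParent := []byte("child=" + Digest(changedLeaf))
	fmt.Println("parent changed:", Digest(parent) != Digest(changedParent))
}
```

Because a node's digest depends on the digests of everything it points to, a cycle would require a node to contain its own hash, which is why the acyclic property comes for free.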
The fundamental primitive of OCI data structures is called a content descriptor. This is just a simple tuple of media type, size, and digest. Together, these properties act as a strongly typed, immutable pointer to arbitrary things, which makes it a very useful generic data structure. The media type tells us the format of some bytes and how we should interpret them. The size tells us exactly how many bytes we should expect, which is helpful mostly for safety reasons. And the digest is that immutable content identifier from earlier. For this example, if you were to find this descriptor on the ground somewhere in isolation, you would know that it's talking about an OCI image manifest, which is encoded as JSON, that has exactly 7,682 bytes and has this SHA-256 hash, blah, blah, blah. Usually, though, we don't find these on the ground. They're embedded in other data structures, for example, the image manifest.
The image manifest is what most people would be familiar with as, say, a Docker image, and it is roughly equivalent to the Docker v2 schema 2 image format. Simply put, it is two things. There is a config descriptor, which points to a JSON blob with various information about how to run the image: for example, what environment variables should be set and the user that should run the process. It also contains some other metadata, like the creation time and how the image was built. Then there are layers, which are just a list of descriptors that describe the container image's file system. These are represented as special changeset tarballs that are usually gzipped, and they are flattened into a representation of the file system, usually using some union FS mechanism.
The last interesting data structure is an image index, also known as a manifest list. This is kind of like a meta-manifest that references other manifests, or really anything that you can describe with a content descriptor.
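As a rough illustration of the content descriptor described above, the following Go sketch builds one for some bytes and then checks content against it. The `Descriptor` type, `Describe`, and `Verify` are names invented for this example using only the standard library; they are not the go-containerregistry API. The JSON field names follow the descriptor fields named in the talk (media type, digest, size), and the sample config bytes are made up.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Descriptor is an illustrative stand-in for an OCI content descriptor:
// a typed, immutable pointer to some bytes.
type Descriptor struct {
	MediaType string `json:"mediaType"`
	Digest    string `json:"digest"`
	Size      int64  `json:"size"`
}

// Describe (hypothetical helper) builds a descriptor for the given content.
func Describe(mediaType string, content []byte) Descriptor {
	sum := sha256.Sum256(content)
	return Descriptor{
		MediaType: mediaType,
		Digest:    "sha256:" + hex.EncodeToString(sum[:]),
		Size:      int64(len(content)),
	}
}

// Verify checks the size first (cheap, and guards against oversized input)
// and then checks the digest.
func (d Descriptor) Verify(content []byte) error {
	if int64(len(content)) != d.Size {
		return fmt.Errorf("size mismatch: got %d, want %d", len(content), d.Size)
	}
	sum := sha256.Sum256(content)
	if got := "sha256:" + hex.EncodeToString(sum[:]); got != d.Digest {
		return fmt.Errorf("digest mismatch: got %s, want %s", got, d.Digest)
	}
	return nil
}

func main() {
	// Made-up config blob, standing in for a real image config.
	config := []byte(`{"architecture":"amd64","os":"linux"}`)
	desc := Describe("application/vnd.oci.image.config.v1+json", config)

	out, err := json.MarshalIndent(desc, "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))

	fmt.Println("original:", desc.Verify(config))
	fmt.Println("tampered:", desc.Verify([]byte("tampered")))
}
```

A manifest or index is then mostly a JSON document that embeds descriptors like this one, which is how each arrow in the graph is represented.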
In my head, I think of an image index as a folder. But the most common use by far is to distribute multi-platform images. In this case, we have a manifest list that points to a PowerPC image and an AMD64 image. When clients encounter this, they can look at those platforms and select an appropriate image for the target runtime.
Now, briefly, on to registries. The registry protocol is basically just a protocol for uploading and downloading stuff via HTTP, and it is pretty similar to the dumb Git protocol. We don't have time to get too deep into it, but roughly there are two handlers: the manifests handler, which is for structured content like the data structures we just discussed, and the blobs handler, which is for opaque content that registries don't try to understand. Things that are uploaded as blobs are usually leaves in a Merkle DAG, and things uploaded as manifests usually have outward pointers to those leaves. Registries tend to parse things uploaded as manifests so that they can do ref counting and garbage collection and enforce invariants; for example, you don't want to upload something that points to a blob that doesn't exist. Both manifests and blobs can be referenced by their digest, but manifests can also be referenced by mutable tag identifiers, which we'll get to later.
The reason we're talking about this is that registries are interesting to us. If you're running, say, images on Kubernetes, you need to pull them from somewhere. So we have this service that already stores artifacts and that we're already using from a cluster, which we'll get to later. I can demonstrate with a tool I have built what these look like in an actual registry. Here I'm showing a manifest list on Docker Hub. We can look at this first one, which is an AMD64 Linux image. This pulls up just the image manifest, which again points to a bunch of layers and some config. The config has various things.
For example, here's the PATH environment variable being set. And then the file system, again, is just a tarball, so we can look at this just like any other file system.
Moving on to my magnum opus, which is the go-containerregistry library. The reason I was asked to give this talk is that I maintain this Go library and Crossplane uses it. More than that, many other tools depend on it. For example, my own CLI, called crane, is a generic registry client, which I have just shamelessly plugged. More interesting for this KubeCon talk, though, is that there are various Kubernetes controllers that use go-containerregistry. The first of those was the Knative revision controller, which roughly just resolves image tags to their immutable image digest references. You can read more on this Kubernetes issue about why that's a useful thing to do. The second adopter of go-containerregistry was Tekton, which used the library for basically rewriting a pod's entrypoint to enable interesting features that Kubernetes doesn't allow, like ordering tasks. Christie Wilson and Jason Hall gave a really great talk on this already; you can follow that YouTube link. More recent is Tekton's use of go-containerregistry for something called Tekton bundles, which are basically just OCI images that contain a bunch of YAML describing Tekton resources. That's very relevant to this talk because that's basically the exact same thing that Crossplane does to manage packages.
So Crossplane, same thing: YAML in an OCI image, Crossplane packages. Finally, tying this together, and this is where I will hand-wave because I don't really understand it: packages come in a couple of varieties. There are providers and configurations. I don't really know what those are, but I do know it's a bunch of YAML, given that this is Kubernetes. The Crossplane controller has two reconcilers, a revision reconciler and a manager.
The revision reconciler is what actually talks to the registry. It pulls down these images, caches them, extracts the YAML, and does all the actual work of installing a package, all the business logic. The manager is quite literally a package manager. It is responsible for pinging the registry to detect updates to images and also for garbage collecting old package images that are no longer needed. Again, I'm really hand-waving over this, but I do have a couple of examples that I think make this a little clearer.
Here is the example GCP provider. There's this very special YAML at the top, and then there is also a folder of CRDs in more YAML. If we look at what this looks like in the Upbound registry, you can see that there is this one layer. It contains this one package.yaml file, and this is just all of that YAML concatenated together. So the controller just pulls down that image, extracts the YAML, and processes it.
To summarize, you can use OCI images for more things than just containers, like YAML. go-containerregistry is very cool; it makes this easier. And Crossplane packages are one example of that. There are three other projects here that do similar things that you should check out as well. Thank you.