All right, welcome, everybody. Hopefully you've lasted this long and can make it through one or two more talks before the end of KubeCon, but we're thankful you stuck around to hear about containerd. This is our maintainer update; Derek and I and other maintainers from the project are here in the room. We've rotated these around at KubeCons to give an update on what's been happening in the project, and today a good half of this talk will focus on containerd 2.0, which just went beta on Monday. So hopefully you'll get to hear more about that.

But I wanted to start with some basic updates about the project, the velocity, and the community. It's always timely that Datadog or Sysdig usually comes out with a state-of-containers report a week or so before KubeCon. This one is from last week: one of their top-ten indicators of container adoption was the use of containerd itself, and the highlight from their graph was that adoption of containerd has doubled in the last year. Much of that is probably related to the dockershim deprecation, with a lot more people moving to containerd as a runtime for Kubernetes. It's great to see that growth in usage, which adds to what was already significant use: every Docker installation includes containerd, plus other existing adopters.

Related to that, the CNCF puts out a velocity report, and I realize the chart here is incredibly busy, but containerd came out as the 13th-highest-velocity project. There's some algorithm that looks at committer, PR, and issue activity. So again, just as a stake in the ground: containerd continues to be a very healthy project with a lot of involvement across the ecosystem.

We've also been excited since the last KubeCon to have a lot of new maintainers. If you look at our project governance, we have two levels of activity: a reviewer role and a committer role. We call that whole group maintainers; they're all maintainers of the project. Most of these are new reviewers who, as they mature in their involvement with the project, can move on to become committers as well. It's really cool to see the diversity in this group; there are people from Microsoft and Google and VMware. It continues the pattern of activity in the project being very broad across our ecosystem: it's not all about one group's cloud or project or product. And I'm especially excited that one of my colleagues at AWS, Henry Wang, is now a reviewer as well. It's cool to see some of our young SDEs get involved in the project.

Obviously, one of the ways we see adoption growing is through Kubernetes distributions and managed Kubernetes offerings. I won't bore you by reading the title of every Kubernetes distro here; many of these have existed for a while, and some are newer adopters. And this is just a representation of Kubernetes; there's existing adoption elsewhere. AWS Fargate uses containerd, and it's not a Kubernetes distribution. You've got Docker's use of containerd within Docker Desktop and within the Docker Engine. So we continue to see growth in overall project adoption.

One of the things we've really focused on in the last few years, and Derek's going to talk about this some more, is building containerd for extensibility.
We focused on this core that has now been around since 2016 and was donated to the CNCF in 2017. So in many ways the core of the project is fairly mature, but as we've matured, we've tried to make sure you can add capabilities and functionality, or consume containerd, in a way that's valuable for your use case. One end of that spectrum is building your own client using the various client interfaces. The kubelet obviously uses the CRI interface of containerd as a client. You've got Docker, BuildKit, ctr, nerdctl, and a set of other clients using the Go SDK; that's an extensible way you can build around containerd. At the other end, at the back end of containerd, there's a set of built-in snapshotters, and there's been this growth and expansion into remote snapshotters, proxied in, that let you do things like lazy loading images. We have Kohei here, who created the stargz snapshotter to kick off that work, which has grown into SOCI and Nydus and the other lazy-loading snapshotters that exist today. That's one way you can expand containerd for your own use case. And then there are shims as well: one of the most recent is the runwasi shim, which allows you to use containerd to drive WASM workloads. We have an existing set of shims, some of which have been around since the start of the project, like the default runc shim. And there are other pluggable interfaces; as Derek walks through some of the architecture, you'll see other ways we're trying to make sure containerd remains a pluggable, expandable, extensible project.

I just mentioned clients, and I'll try not to belabor all the ways the client area has grown over time, to give Derek some time; you can read a lot of the words on the slide. ctr has existed since we started the project. Many of you know about nerdctl, which Akihiro, one of our maintainers, created; it gives you a more complete Docker-compatible client. The Kubernetes project has crictl, which drives just the CRI interface; any other CRI-compatible runtime also works with crictl. And the one I wanted to make sure to mention: I said Docker continues to use containerd underneath, but there's been some expansion of that. They have an experimental feature, first in Docker Desktop and also in the Moby project, that allows you to use containerd's image store. Given what I just said about extensible snapshotters, one of the values is that Docker and Moby clients can now take advantage of some of that extensibility and use some of these custom snapshotters. That work is ongoing in the Docker/Moby project. And then there are higher-layer platforms being built around that: many of you have heard of Colima, or Finch, which my team created at AWS, or Rancher Desktop coming out of Rancher and SUSE. These are higher-layer platforms building around tools like nerdctl.

I mentioned snapshotters. There's a set of built-in core snapshotters, many of which have existed since the start of the project, but blockfile is a new one. And then there are the remote snapshotters, extensible by being proxied into the project, like stargz, overlaybd, Nydus, and SOCI, a project we open-sourced; GKE also has their image streaming built around the same remote snapshotter technology. So again, if you have an interesting idea or use case, you can extend and create your own snapshotter.
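Pulling those threads together, here's a minimal sketch of what a Go SDK client looks like when it selects a custom snapshotter at pull time. It assumes the 1.x import paths (containerd 2.0 reorganizes the client packages under a /v2 module), the default socket path, and that a snapshotter registered under the name "stargz" is actually configured on the daemon; the snapshotter name here is only an example:

```go
package main

import (
	"context"
	"log"

	"github.com/containerd/containerd"
	"github.com/containerd/containerd/namespaces"
)

func main() {
	// Connect to the containerd daemon over its local socket.
	client, err := containerd.New("/run/containerd/containerd.sock")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// All client operations are scoped to a namespace.
	ctx := namespaces.WithNamespace(context.Background(), "default")

	// Pull and unpack an image, routing the unpack through a non-default
	// snapshotter (e.g. a lazy-loading one) instead of the built-in default.
	image, err := client.Pull(ctx, "docker.io/library/alpine:latest",
		containerd.WithPullUnpack,
		containerd.WithPullSnapshotter("stargz"), // example name; must exist on the daemon
	)
	if err != nil {
		log.Fatal(err)
	}
	log.Println("pulled", image.Name())
}
```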
Runtimes and shims: again, I mentioned some of these already. We've always had the default runc runtime for Linux coming out of the OCI. We also support crun; we actually have that tested as part of our CI on every PR. And then there are a few alternative and experimental runtimes that exist. Shims are also, as I mentioned, an area that lets you extend containerd for your use case, and runwasi is a non-core sub-project of containerd with a lot of activity happening there, as I said, to run WASM workloads driven through containerd. Many of these others have existed for a while, or we've talked about them at prior KubeCons. Feel free to dig deeper when you have time if you're interested in some of these shim implementations.

Talking a little bit about our releases: 1.5 is fully end-of-life at this point. 1.6 is our main long-term support release, which we announced a KubeCon ago. containerd 1.7 is the last of the 1.x line; we'll talk about that in a minute. And then containerd 2.0, as I just mentioned: we just started the beta release cycle and should have it released in early 2024, dependent on our testing and your testing and feedback during the beta and RC lifecycle. There are some question marks about end-of-life because, as you can see from this snapshot taken straight from our RELEASES.md file, there's either a max end-of-life date or it's based on a specific release plus six months. As we make releases, those end-of-life dates become clearer.

As I mentioned, we announced our first long-term support release, I'm pretty sure, last KubeCon, and many people were happy that we'd have a release that didn't have such a short lifecycle. It's supported at least until February 2025. And we've really tried to create, with some judgment calls on behalf of the maintainers, an expanded scope for it. Usually we would only backport bug fixes to prior releases; now, given we're talking about supporting this release for a few years, we want to make sure we have the flexibility to update Go versions, to update dependencies appropriately, and to keep compatibility with current and future Kubernetes releases, which sometimes requires backporting changes into the CRI plugin that are effectively new features. As we reach the end of this LTS period, we'll move it to an active release with more limited scope, because by that time other releases will carry the new features, and there will be a new LTS release.

1.7, again, is the last of the 1.x line. When we first launched 1.7, we marked a set of new capabilities as experimental. I'm not going to read through these, because Derek is going to talk about many of them in a minute. These are features we wanted to exist in 1.7 so people could start to try them out and use them, and we'd then finalize them and make them GA-ready in 2.0, and Derek will talk through many of those.

Our general release plan for 2.0: we just released beta 0 on Monday, we'll have a series of beta releases, move into an RC period late this year or early next year, and then hopefully in late January or February have a 2.0 GA. There's a set of components that we'll move from experimental to fully supported in 2.0. And one of the reasons, in semver terms, to do a new major release is to finish deprecations: many things that have been marked deprecated for many releases are now removed in 2.0.
The only one I crossed out, grayed out, is the config file version, and that's because Derek was able to create a config migration implementation. So instead of you having to rewrite your config to remove deprecated config properties and the old format, the migration will do that for you in 2.0. And now I'm going to turn it over to Derek to take us through a lot of these features that are going to be supported in 2.0.

All right, thank you, Phil. So we'll start to go through some of these features. We're not going to go through the high-level architecture that we've covered at many other KubeCons; we're going to focus on what we're adding new in 2.0 and some of the new services and APIs we've added.

One of the big ones is the new Sandbox API. Think of the Sandbox API as an API we can now use to group containers together. Traditionally that was done via the shim, but if you've played around with this at all, you'd see that the lifecycle of the sandbox was the lifecycle of a container, and there was always some complication in trying to figure that out; it wasn't very explicit before. Now we have an API that can handle a sandbox as a first-class concept. We have a few different interfaces around that, basically for creating and updating the sandbox environment, and some of the use cases are around multi-platform or VM-based runtimes. We added configuration around it so that sandboxers, as they're called right now, work similarly to snapshotters. As part of our goal of extensibility and pluggability, snapshotters have been a pretty good success in that you can have many different implementations and select one via configuration or even per runtime, and it's similar here.

So we end up with an architecture that looks like this for the sandbox environment, with two different sandboxers: the pod sandboxer, which is roughly what exists today, and a new shim sandboxer, which uses some of the new APIs. The pod sandboxer does what you'd expect: it creates a container, what you'd know as the pod's pause container in Kubernetes today, which holds open all the namespaces and everything associated with that pod sandbox environment, and it does all the task creation and task management. In containerd we have a single shim that handles that, so we'll have a shim process that actually serves the task API, and we talk to it via the shim manager. What's really new with the sandbox service is that you can now create that shim environment before you even create the container; you can alter that environment separately from making alterations to the containers, and basically have the environment without the containers and then go and create them later. It introduces a new sandbox service at the shim layer, so the shims will have a new service that the sandboxer inside of containerd can talk to. In the future, this gives us a little more flexibility in defining what a shim is and what the API between containerd and the shim is; traditionally there was a one-to-one relationship between containerd's capabilities and the task service, and now we have more room for extensibility there.
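As a mental model of that lifecycle split, here's a deliberately simplified, hypothetical Go sketch. These are not containerd's actual Sandbox API definitions (which were still experimental going into 2.0); it only illustrates a sandbox whose environment exists, and can be altered, independently of any container:

```go
package main

import (
	"context"
	"fmt"
)

// Sandbox models the first-class grouping object: it owns the shared
// environment (namespaces, a microVM, ...) independently of any container.
type Sandbox interface {
	ID() string
	Start(ctx context.Context) error  // bring the environment up before any container exists
	Update(ctx context.Context) error // alter the environment without touching its containers
	Stop(ctx context.Context) error
}

// Sandboxer is the pluggable backend, selected via configuration much like a
// snapshotter: one implementation might back sandboxes with a pause container,
// another with a VM-based shim.
type Sandboxer interface {
	Create(ctx context.Context, id string) (Sandbox, error)
}

// podSandbox is a trivial stand-in implementation so the sketch runs.
type podSandbox struct{ id string }

func (s *podSandbox) ID() string                   { return s.id }
func (s *podSandbox) Start(context.Context) error  { fmt.Println("environment up:", s.id); return nil }
func (s *podSandbox) Update(context.Context) error { return nil }
func (s *podSandbox) Stop(context.Context) error   { return nil }

type podSandboxer struct{}

func (podSandboxer) Create(_ context.Context, id string) (Sandbox, error) {
	return &podSandbox{id: id}, nil
}

func main() {
	ctx := context.Background()
	var sandboxer Sandboxer = podSandboxer{}
	sb, err := sandboxer.Create(ctx, "pod-1")
	if err != nil {
		panic(err)
	}
	_ = sb.Start(ctx)  // the environment now exists...
	_ = sb.Update(ctx) // ...and can change; containers get created into it later
}
```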
The other new area in the runtime is NRI, the Node Resource Interface. This happens a little earlier in the process: in between when CRI is handling the request and when we actually generate the OCI runtime specification. NRI has pluggable interfaces where you can take the OCI spec, call into a set of NRI plugins, and actually make adjustments to the spec. So think of anything you want to do around dynamic resource allocation; if you want to allocate parts of your GPU, for example, there's a pluggable interface for doing that. It keeps containerd pretty simple: we don't have to make changes to containerd for every type of node resource you might want to allocate to your containers. We've added support for this, and I believe some of the other container runtimes have adopted it as well, so it's something we can use across the ecosystem.

What that ends up looking like from a containerd perspective is that there are two different ways you can plug these in: you can have NRI binaries, and you can also have a socket the NRI plugin is listening on. The plugin registers itself, and when a new container request comes in, containerd sends that request down to NRI, and NRI is able to send container updates back to containerd so they can be passed down to the runtime configuration.
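To make the flow concrete, here's a simplified model of NRI-style adjustment in Go. This is not the real github.com/containerd/nri plugin API — real NRI plugins run out-of-process and exchange typed adjustment messages over ttrpc, and the hook signatures have changed across versions — but the OCI spec types from runtime-spec are real, and the loop mirrors how each registered plugin gets a chance to mutate the spec before it reaches the shim:

```go
package main

import (
	"fmt"

	specs "github.com/opencontainers/runtime-spec/specs-go"
)

// Plugin is a simplified stand-in for an NRI plugin: it inspects the pending
// OCI spec and mutates it in place.
type Plugin interface {
	Name() string
	AdjustContainer(spec *specs.Spec) error
}

// cpuPinner is a toy plugin that pins containers to a CPU set — the kind of
// node-resource decision NRI keeps out of containerd itself.
type cpuPinner struct{ cpus string }

func (p *cpuPinner) Name() string { return "cpu-pinner" }
func (p *cpuPinner) AdjustContainer(spec *specs.Spec) error {
	if spec.Linux == nil {
		spec.Linux = &specs.Linux{}
	}
	if spec.Linux.Resources == nil {
		spec.Linux.Resources = &specs.LinuxResources{}
	}
	if spec.Linux.Resources.CPU == nil {
		spec.Linux.Resources.CPU = &specs.LinuxCPU{}
	}
	spec.Linux.Resources.CPU.Cpus = p.cpus
	return nil
}

// runNRIHooks models the runtime side: every registered plugin gets a chance
// to adjust the spec before it is handed to the shim.
func runNRIHooks(spec *specs.Spec, plugins []Plugin) error {
	for _, p := range plugins {
		if err := p.AdjustContainer(spec); err != nil {
			return fmt.Errorf("plugin %s: %w", p.Name(), err)
		}
	}
	return nil
}

func main() {
	spec := &specs.Spec{Version: specs.Version}
	if err := runNRIHooks(spec, []Plugin{&cpuPinner{cpus: "0-3"}}); err != nil {
		panic(err)
	}
	fmt.Println("cpuset:", spec.Linux.Resources.CPU.Cpus)
}
```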
The next thing I wanted to talk about is the transfer service. This is something we added in 1.7. Traditionally in containerd we've had this kind of fat-client model, where everything related to pushing and pulling images was done in the client, and we have a whole bunch of lower-level services for managing snapshots, managing images, managing content, even down to pluggability for the differ: calculating the difference between layers and applying the layers to snapshots. But if you're trying to re-implement this from scratch, it's a little challenging. CRI ended up having kind of a circular dependency on our own client because of this, and it's led to some weird behavior where, if you're using a client such as nerdctl, you might have one registry configuration and another configuration in CRI, and they're not using the same implementation. The transfer service is aiming to solve that, but in a way that's fairly generic and simple.

You can see that the interface we've defined for the transfer service is extremely simple: we have a source and we have a destination, and it's fairly generic in terms of what those sources and destinations can be. If the source is a registry and the destination is your local image store, that's a pull. If you're going from an image store to a registry, that's a push. If you have an archive, like an OCI archive, and you're going into the image store, that's an import. So it generalizes these operations for transferring images to different locations, and it adds a service so we can do it in a way that's easily extensible in the future, where we can add different sources, different destinations, and different functionality. There are a few that aren't implemented today, such as registry to registry, which you could think of as a mirroring operation.

Internally, the transfer service looks like this: we have our service layer, and when the client initiates a pull, it just sends that request to the transfer service. Behind the scenes there are a few things being done here. We have another service called the streaming service, which is just a generic way of handling a stream of data — from the client to the daemon, in this case. It's able to create a stream and set that identifier in the transfer request in order to do things such as returning progress from the daemon back to the client. It's also really useful for making requests from the daemon back to the client to get credentials. That's been a sticky point for a while with daemon-side transfers: at what point do you get credentials, and what do you get them for? In this case you can configure the daemon to talk to various mirrors, or whatever your registry configuration is, and actually get the appropriate credentials when you make the request to the registry. And then we have these different objects that play a much smaller role. The registry source uses the resolver that's always existed, which contacts the remote registry, and the image store destination is responsible for things like unpack and all the content manipulation and image storage.

So if we break down what a pull looks like today, this is the simple case. We have a client; it's going to get a manifest from a registry using the distribution API. It gets the manifest, it gets the config, and once it has that config, it knows which layers it needs for the complete image. It goes and fetches each of those layers and stores them in the content store, and at that point the fetch of the image is done from a containerd perspective. Then we go through each of those layers: we prepare a snapshot and apply the diff; in this case they're just tar streams, so we untar each of them into the snapshot by reading the content from the content store, mounting, doing the diff, getting back an identifier for that layer, and then committing the snapshot. In the end we create an image pointing to that top snapshot.

Now, we've optimized this a little so you can do some of these operations in parallel. When you get the manifest and the config, you know what the final snapshot is that you're going to need, so you can check whether you already have that snapshot. If you already have it, you don't actually need to pull any more content to make that image available. When it's not available, we go through a similar process: once we know we need to create that snapshot, we can kick off the fetch operation and actually fetch all the needed layers in parallel. As they become available, we apply those diffs, and in the end we get a created image. So it's a little more optimized, as we can start doing these steps in parallel.
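Those two phases — getting content into the content store, then applying layers into snapshots — are visible in the Go client, which exposes them separately from the all-in-one Pull. A minimal sketch, again assuming the 1.x import paths:

```go
package main

import (
	"context"
	"log"

	"github.com/containerd/containerd"
	"github.com/containerd/containerd/namespaces"
)

func main() {
	client, err := containerd.New("/run/containerd/containerd.sock")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()
	ctx := namespaces.WithNamespace(context.Background(), "default")

	// Phase 1: fetch manifest, config, and layers into the content store.
	// Nothing is unpacked yet.
	img, err := client.Fetch(ctx, "docker.io/library/alpine:latest")
	if err != nil {
		log.Fatal(err)
	}

	// Phase 2: walk the layers, preparing a snapshot and applying each diff,
	// committing as we go.
	if err := containerd.NewImage(client, img).Unpack(ctx, "overlayfs"); err != nil {
		log.Fatal(err)
	}
	log.Println("unpacked", img.Name)
}
```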
Now, the real time saver is what Phil was mentioning before around lazy loading, and this is where you can almost erase the untarring completely from the picture. Once we have that config, we know what the final snapshot is that we're going to need. When we go to prepare that snapshot, if we have a snapshotter that's smart enough to know where it can get that content in the most efficient way to make the file system available — because that's all we need in the end: the snapshotter makes the file system available — it can go to the registry directly to get the content it needs and just tell containerd that it has everything required to make the file system available. containerd can just commit it and create the image, and the snapshotter can fetch all the content as needed in the background, or however the snapshotter is implemented in that case. You can see how these operations can short-circuit through pretty fast, making a pull as quick as possible without going through all those extra steps.

So for the transfer service, there are some new use cases and extensibility points here. Some of this was written for confidential computing, where we actually want a service we can put in different places, including behind a confidential or secure enclave environment. There's the OCI referrers work; I know some of the lazy-loading implementations make use of referrers, so being able to get those and make them available matters. We have some ongoing image validation work as well, and some of that's been merged and will be in 2.0. There's also more work being done on things like making credential management better. And then, obviously, we're updating the CRI plugin to use the transfer service so that we have one common implementation.

The other big change in CRI that was mentioned is user namespaces. We added user namespace support in 1.7 for stateless pods, and we have additional support coming in 2.0 for stateful pods. That's much more challenging, but you can follow along with the KEPs in Kubernetes that are making some of this work. We've had a lot of this available in containerd for a while with user namespaces, but having it plumbed all the way through CRI is new here.

And then, lastly, I want to talk about some of the future and in-development work that's going on. 1.7 was kind of an exciting release; we had a lot of experimental stuff. 2.0, I think I said this a few KubeCons ago, is not designed to be an exciting release. It's designed to be a boring release: we took the experimental stuff and now we're making it stable, making it more usable. There are still some loose ends we're cleaning up with the transfer service and with making the sandbox API GA and the default. We have some work going on for shim plugins; I mentioned earlier how the shims can now have multiple services, so we want to extend that concept and give users more ways to manage some of those. Today we have proxy plugins; shim plugins would be a way to extend that further.

And then we also have some higher-level services that might come in 2.0 around image management. Today our APIs are pretty low-level; the image service knows nothing about the snapshotter service or the container service. So you can think of a higher-level service as something that glues some of those components together and provides something that will really simplify things for clients like nerdctl and Moby, and possibly even make it easier to implement new clients. The same goes for a higher-level container service: if you try to use the container APIs to start a container today, it's challenging. There are a lot of different APIs you need to call, from creating a snapshot and creating a container to creating a task and waiting for that task. So we want something simpler and higher-level, something that can make use of other features we've added, such as the streaming service, for managing basic input and output over the API without necessarily having to touch low-level FIFOs and things like that.
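For a sense of why a higher-level container service is appealing, here's roughly the sequence of client calls it takes to run one container today. This is a hedged sketch using the 1.x Go client, assuming the image is already pulled and unpacked:

```go
package main

import (
	"context"
	"log"

	"github.com/containerd/containerd"
	"github.com/containerd/containerd/cio"
	"github.com/containerd/containerd/namespaces"
	"github.com/containerd/containerd/oci"
)

func main() {
	client, err := containerd.New("/run/containerd/containerd.sock")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()
	ctx := namespaces.WithNamespace(context.Background(), "default")

	// Look up an already-pulled image.
	image, err := client.GetImage(ctx, "docker.io/library/alpine:latest")
	if err != nil {
		log.Fatal(err)
	}

	// Create the container: a new snapshot plus an OCI runtime spec.
	container, err := client.NewContainer(ctx, "demo",
		containerd.WithNewSnapshot("demo-snapshot", image),
		containerd.WithNewSpec(oci.WithImageConfig(image), oci.WithProcessArgs("echo", "hello")),
	)
	if err != nil {
		log.Fatal(err)
	}
	defer container.Delete(ctx, containerd.WithSnapshotCleanup)

	// Create the task (the running process), wiring IO to our stdio FIFOs.
	task, err := container.NewTask(ctx, cio.NewCreator(cio.WithStdio))
	if err != nil {
		log.Fatal(err)
	}
	defer task.Delete(ctx)

	// Register for the exit event before starting, then start and wait.
	exitCh, err := task.Wait(ctx)
	if err != nil {
		log.Fatal(err)
	}
	if err := task.Start(ctx); err != nil {
		log.Fatal(err)
	}
	status := <-exitCh
	code, _, _ := status.Result()
	log.Printf("exited with status %d", code)
}
```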
So that's kind of what we have coming along in 2.0. We hope everybody wants to get involved. Even if you're not going to contribute to containerd core, there's a huge ecosystem of projects now: use those projects, try them out, propose new projects, bring your own projects. We're constantly getting new non-core projects added to containerd, which is great to see. We have our community meetings the second and fourth Thursday of every month; that's very specific, but you can look at the CNCF calendar if you want to add it to your own calendar. And as always, come open pull requests and have discussions. I think we're a pretty welcoming community, and we're always looking for more users to come in as reviewers and for more involvement. So thank you, everybody, for joining. I think we might have some time for questions. Let's see, two minutes left.

[Audience question about compatibility with WASM runtimes.]

So for compatibility with WASM: a lot of that is put on the WASM community for now, to implement the APIs we have. One of the things we're trying to do by adding more extensibility at that layer is that if WASM has brand-new functionality that maybe doesn't align with containers today, the shims can have that functionality and we can go and support it in containerd. For the most part the compatibility burden is on them, but I think as that ecosystem matures a little, we'll be able to have that feed back and define interfaces that make sense. Our task service was written many, many years ago; it's probably the oldest API we have in containerd, going back to 2015 or so when it was originally written, and it's very specific to Linux containers. As we know, modern sandboxes and WASM runtimes look different; they have many more capabilities.

Hey, a question about running Kubernetes with containerd, specifically about logs from containers. The logs from containers in this scenario land on the node's file system as files in a specific format, and that gets problematic: you need access to those files, you need the throughput on the IO, you need to parse them, et cetera. I was wondering if it's possible — I know Docker has something like logging drivers, and I'm not sure if this is part of containerd — would it be possible to tell containerd to just ship the logs via HTTP to an endpoint?

Yeah, so the way containerd treats all the IO from containers, we send it along to the next component, which in this case, I believe, would be the kubelet; that's what's responsible for how it gets processed. We made a conscious decision not to touch the container output.

Awesome, so no need to change anything in containerd, actually. Yeah, great, thanks.

Oh, another quick question for AWS. What's the plan to include containerd 2.0 in EKS? Will it be available as an option?

Where's my EKS team? They're here. It's been an option, I want to say, for a couple of releases. At first it wasn't the default, you could choose it, but now it is, I believe, for Kubernetes 1.24 and above.

No, I mean containerd 2.0.

Oh, 2.0. Well, yeah, as you can see, it's not released yet.
So step one is a release, and then, obviously, over time we have to validate that with EKS and the other services that consume containerd. So next year, check with us in Paris. All right, thank you very much.

Hey, yeah, so I'm deploying Kubernetes in a number of air-gapped environments with private registries, and what I'm doing is using containerd's mirroring to, you know...

You can come up and we can talk; we have to stop here, but we'll take questions and we'll stick around. Thank you, everybody. We'll be here.