Hello, everyone. Thank you for coming to the last talk of the day. My name is Maxim. I'm a software engineer at Apple, and I've also been a containerd maintainer for about four years.

Hi, everyone. I'm Sam. I'm a staff engineer at Google. I've been involved with the containerd project as a contributor since about 2017, and more officially as a security advisor and then later as a maintainer since about 2020. Today we're going to talk about updates in the containerd project and deep dive into some new functionality.

So I want to start off with containerd's support lifecycle. containerd has three different kinds of supported releases. We have the active release type, which is generally the most recent version; it receives bug fixes and backports fastest, for at least a year from the time that we release it and at least six months after the next active release comes out. We also have the extended release type, which is sometimes used as people transition from the previous active release to the most recent one. It contains just security fixes, so we don't generally fix other kinds of bugs in extended releases. Finally, at the last KubeCon, we announced the introduction of long-term stable (LTS) releases. This is a new type, like active, but with a longer support window: three years instead of one. And we consider an expanded range of backports, including things like staying on a supported Go version if the Go version goes out of support during the three-year support lifecycle.

So I want to talk a little bit more about LTS. We do have our first LTS release, which is 1.6. It's going to be supported until February of 2025, which is three years from the date it was released. During its longer support window it's going to get bug fixes and security patches, and we're also aiming to support the Kubernetes releases that fall inside that three-year window. That means things like updates to work with new Kubernetes versions, potentially including implementing new KEPs. For up to six months before the end of that window, so from around August 2024, it'll convert into a regular stable release with the normal backport scope: bug fixes and security patches, but no new functionality. Just looking at the timelines for Kubernetes releases and where we are with containerd, we'd expect 1.6 to support Kubernetes 1.24 up through 1.30. Depending on the release timing for Kubernetes and whether or not we end up extending this, it might support a few beyond that.

Separate from LTS, we also have another regular active release that just came out, which is containerd 1.7. This contains two major new features, both of which are experimental. It has a new Sandbox API, which enables the core of containerd to understand groups of containers, like pods, and enables new functionality when you consider a group of containers together; think about things like VM-based runtimes. There's also the transfer service, which introduces new extension points for customizing and controlling the behavior of content transfer: things like image pull and image push, but also cases where you might want to intercept those operations for signature validation, encryption, things like that. Maxim's gonna deep dive into both of those new features in a few minutes, so I'll leave that as an overview for right now. containerd 1.7 is a regular active release, which means it only has a one-year support window. Its support is going to end before 1.6's does, but 1.6 doesn't have any of the new functionality.
It's also going to be the last 1.x release of containerd. So let's talk about new features to expect in 2.0. During the last KubeCon, we mentioned that containerd 2.0 is gonna be the most boring release ever. We introduced two new experimental APIs in 1.7, the Sandbox API and the transfer service, and our goal for 2.0 is to stabilize and productionize those APIs. Another big chunk of work we're focused on is the CRI integration, which is expected to be finished in 2.0 as well. The transfer service will get deeper integration with other core containerd services, including integration with the Sandbox API itself; this way, containerd runtimes will be able to process artifacts as well. We also plan to remove a lot of deprecated code that we've accumulated since the 1.4 release, which might break backward compatibility if you still rely on those features.

The Sandbox API aims to offer better support for VM-based runtimes. There's also a new term I've heard recently, "pauseless pods", which perfectly describes what this is. To quickly recap, it's a new set of APIs that add the notion of a group of containers to containerd. And by group here, we mean really anything; it's now up to the underlying runtime how and which resources to allocate for a particular pod. For instance, in Kubernetes today, the pod sandbox reserves resources and network interfaces for a pod and runs the pause container, which acts as a parent container for all pod containers and holds all of the pod's resources. But now, with the Sandbox API, we can substitute implementations and run, for example, a hypervisor instead.

Sandbox implementations will be provided by containerd shims. We introduce this new interface called a controller, which a shim can implement in order to provide custom logic. This interface is optional: when it's implemented, containerd knows that the shim supports sandboxes; otherwise, we fall back to the default implementation. The daemon itself will remain implementation agnostic, so it has no knowledge of what kind of sandbox the runtime will run.

It's not enough to just introduce a new interface and have shims implement it; someone has to call it. In the case of Kubernetes, there is the CRI plugin in containerd, which is expected to be the primary user of these APIs. The CRI integration is ongoing, and we plan to finish it in 2.0. Currently, CRI is the largest plugin in containerd, so we plan to split it into smaller, more maintainable pieces. CRI today contains the pod sandbox logic, which runs the pause container behind the scenes. In 1.7, we started extracting that pod sandbox logic into a separate package, which became the first controller implementation. And to keep things from falling apart, and in order to avoid affecting existing users, we decided to fork the CRI plugin and keep another version of the CRI server in tree next to our existing CRI code. We do all the refactoring work there, and you can use the ENABLE_CRI_SANDBOXES environment variable to switch between the two implementations. In 2.0, we plan to enable the new implementation by default and remove the old CRI code.
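To make the controller idea concrete, here's a minimal Go sketch of what a controller-style interface looks like. The method set below is simplified and illustrative, not containerd's exact definition; the real interface is sandbox.Controller in the containerd source, with more methods and richer signatures.

```go
package sandbox

import (
	"context"
	"time"
)

// Controller is a simplified, illustrative version of the optional
// interface a shim can implement to manage sandboxes itself
// (for example, by booting a microVM instead of running a pause container).
type Controller interface {
	// Create reserves resources for a new sandbox without starting it.
	Create(ctx context.Context, sandboxID string) error
	// Start brings the sandbox up so containers can run inside it.
	Start(ctx context.Context, sandboxID string) error
	// Stop tears down the running sandbox.
	Stop(ctx context.Context, sandboxID string) error
	// Status reports the current state of the sandbox.
	Status(ctx context.Context, sandboxID string) (Status, error)
	// Shutdown releases everything held on behalf of the sandbox.
	Shutdown(ctx context.Context, sandboxID string) error
}

// Status is a minimal status record for the sketch above.
type Status struct {
	State     string // e.g. "ready" or "stopped"
	CreatedAt time.Time
}
```

If a shim doesn't implement something like this, containerd falls back to its default, pause-container-based sandbox implementation.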
Another new addition in 1.7 is the transfer service. At a high level, it's a new API to transfer artifacts from a source to a destination, where the source and destination are pluggable pieces of code that can be specified when we initiate a new transfer session. So, for instance, if we want to pull an image from an OCI registry, we would specify the registry as the source plugin and the local image store as the destination. And with this new architecture, nothing prevents us from implementing new source plugins to support arbitrary data sources. The table on the slide shows the source and destination plugins we already implemented in 1.7; we already support push, pull, and import operations. Essentially, lots of code just moved from the client to the daemon side, but now we have much more granular control over the process. The transfer service aims to be a strong foundation that will simplify new features the community often requests. To give a few examples: things like image signing and validation, credential support, image decryption, and so on are now straightforward to add with the new architecture. Some further work that we plan for 2.0 is Sandbox API integration, so a sandboxed runtime will be able to provide its own plugin for the transfer service. This will unlock new use cases, such as confidential computing, where the images we pull can be handled directly by the container runtime, bypassing the host machine.

Also worth mentioning is an important change in Kubernetes: requests to k8s.gcr.io are being redirected to the new registry.k8s.io endpoint, with a plan to sunset k8s.gcr.io in the longer term, so you should update the configuration on your nodes as soon as possible. In the case of containerd, that configuration change is simple: we just point the sandbox_image field in the CRI plugin config at the new registry instead of the old one.

NRI, the Node Resource Interface, is a way to intercept and adjust the OCI spec produced from CRI requests. It's included in the 1.7 release after being significantly reworked. We've switched from a CNI-style plugin model, in which we invoke a plugin binary and pass information on stdin, to a richer gRPC-based model. So now an NRI plugin can amend parts of the OCI spec and handle events at every step of the pod and container lifecycle. There is an experimental integration with the Sandbox API, which we plan to complete in 2.0, so NRI plugins will be capable of handling sandbox events as well. And there are a couple of new things we're exploring for 2.0: right now NRI has access to a limited subset of the OCI spec, and we want to expand what can be changed from a plugin. Another possibility being explored is getting NRI onto non-Linux platforms.

As I mentioned before, we plan to remove a lot of deprecated functionality in 2.0. In some cases this might break backward compatibility, but we try to be reasonable and minimize the impact. This table summarizes the high-level features we plan to remove and the suggested alternatives. Green checks represent items that are already removed, with the PRs merged in our repository. For example, we did some runtime cleanup: we've removed the v1 runtime API and the runc v1 shim implementation, so runtime v2 will be the only supported runtime in 2.0. We also removed the deprecated AUFS snapshotter and suggest using overlayfs instead. And also in 2.0, we switch the configuration format to version 2. Now Sam is gonna talk about the containerd ecosystem. Sam, over to you.

Thanks, Max. So I want to transition to talking about the containerd ecosystem now. One of the core ideas for containerd is to be flexible and extensible. That means containerd comes with a full set of features and implementations for those features, but many of those implementations can be swapped out for something else. That includes things like snapshotters, which are used for handling image layer storage, and runtimes, which control the container execution environment. And a substantial client library means that high-level client programs are easy to build.
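As a rough illustration of that client library, and of the transfer service Maxim described, here's a sketch of a small Go program that asks the daemon to pull an image by wiring a registry source to an image-store destination. The package paths and constructor signatures below are based on the experimental 1.7 transfer packages and may differ in your version, so treat this as a sketch rather than exact API documentation.

```go
package main

import (
	"context"
	"log"

	"github.com/containerd/containerd"
	"github.com/containerd/containerd/namespaces"
	"github.com/containerd/containerd/pkg/transfer/image"
	"github.com/containerd/containerd/pkg/transfer/registry"
)

func main() {
	// Connect to the containerd daemon over its Unix socket.
	client, err := containerd.New("/run/containerd/containerd.sock")
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// All containerd resources live in a namespace; "default" is fine here.
	ctx := namespaces.WithNamespace(context.Background(), "default")

	ref := "docker.io/library/alpine:latest"
	// Source: an OCI registry (HTTP headers and credential helper omitted).
	src := registry.NewOCIRegistry(ref, nil, nil)
	// Destination: the local image store, under the same name.
	dst := image.NewStore(ref)

	// Ask the daemon to run the transfer; with this pairing, an image pull.
	if err := client.Transfer(ctx, src, dst); err != nil {
		log.Fatal(err)
	}
}
```

Swapping the roles is how push works: the image store becomes the source and the registry the destination.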
The containerd project hosts some extensions. We have core extensions that are included in containerd by default and have the same stability and support model as containerd itself. We also have non-core extensions, which are supported as well but are not included by default and don't have the same lifecycle associated with them. There are also community projects, and there's not much difference between a community project and a non-core project other than the governance model. And there are vendor products that are launched on top of containerd, too.

We've seen really great adoption in the ecosystem, and it's really cool to see. One of the adopters, of course, is Kubernetes. Many of you will have heard that Kubernetes 1.24 removed dockershim. containerd is a well-supported runtime that can be used in place of dockershim, and many popular Kubernetes distributions are either adopting containerd now or have already adopted it. You can switch your own local Kubernetes cluster by using a kubelet flag, --container-runtime-endpoint=unix:///run/containerd/containerd.sock, to tell it to talk to the containerd socket instead of a different one.

Beyond Kubernetes, there are other ways of interacting with containerd. ctr is a command-line tool; it's our development tool for containerd. We don't really recommend that people depend on it for production use cases, as it has no stability guarantees, but it's really good for exploring the different kinds of things containerd can do and playing around with the containerd API. The code for ctr is also pretty approachable if you want to see how to interact with containerd as a client. crictl is from the Kubernetes project and is a generic way to interact with CRI runtimes; it works really well with containerd, and you can use it to simulate how Kubernetes runs pods. nerdctl is another high-level CLI that brings more familiar, Docker-like functionality to containerd, and adds extras like lazy-loading images, encryption, and signing. It's a non-core containerd project, so it's still managed by the same set of people involved with containerd, but as its own separate, non-default thing. And then we have a couple of other community and commercial things built on top of containerd, nerdctl, and another CNCF project called Lima. These aim to bridge the development experience over to places like macOS or Windows. Colima is a community project that does that, built on Lima. Rancher Desktop and Finch are both commercial offerings built on the same stack that tend to package it along with other extensions provided by those vendors.

Snapshotters are the mechanism containerd uses to store image layers. All of them are designed to present the container's file system as a union of layers. This is what enables things like image inheritance, where you can build on top of one image as a base and extend it without duplicating the storage. containerd comes with a selection of built-in snapshotters focused on different use cases. Linux options include overlayfs, which is the default and a general-purpose choice suitable for most use cases, but also the Btrfs and ZFS snapshotters, which you can use if you have a preference for either of those file systems and the way they handle snapshotting and storage. There's also the devmapper snapshotter, which is useful for some alternative execution modes where file system sharing might not be possible, for things like VM-based runtimes.
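From a client program's point of view, the snapshotter is chosen per pull. Here's a hedged sketch, assuming an existing *containerd.Client and a snapshotter (devmapper in this example) already enabled in the daemon's configuration:

```go
package example

import (
	"context"

	"github.com/containerd/containerd"
)

// pullWithSnapshotter pulls an image and unpacks it with an explicitly
// chosen snapshotter instead of the daemon-wide default (overlayfs on
// Linux). The named snapshotter must be configured in the daemon.
func pullWithSnapshotter(ctx context.Context, client *containerd.Client, ref string) (containerd.Image, error) {
	return client.Pull(ctx, ref,
		// Unpack the layers into snapshots as part of the pull.
		containerd.WithPullUnpack,
		// "devmapper" here stands in for whichever snapshotter you run.
		containerd.WithPullSnapshotter("devmapper"),
	)
}
```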
On Windows, there are two different options. There's the Windows snapshotter, which is used for regular Windows containers, and there's also the Linux containers on Windows (LCOW) snapshotter, which is used for the Linux workloads that run there. We have started to see some use of containerd on FreeBSD, and there is the option to use the ZFS snapshotter there. There's also the native snapshotter, which is supported on all of the different operating systems but has some particular drawbacks, including being slow and inefficient, so we don't generally recommend using it in production.

Snapshotters can also be extended by means of a proxy plugin. A proxy plugin lets you run a separate process that provides the functionality of a snapshotter. That means it doesn't need to live in tree with the rest of the containerd code and can be provided by vendors or by community projects. We have a few that exist within the containerd project as non-core, all of them focused on a particular remote, lazy-loading use case. Lazy loading is the idea that you can start the container before actually having all of the content present on the local file system. Within the containerd project, we manage the eStargz, Nydus, and overlaybd snapshotters as different models to achieve that lazy loading. There are also commercial approaches to lazy loading from different vendors: SOCI is from Amazon, and GKE has image streaming built into it. These are vendor-specific solutions for lazy loading. Proxy plugins don't have to be about lazy loading, though, or about remote storage either. The devmapper snapshotter, for example, didn't used to be part of containerd core: it started its life as an out-of-tree snapshotter until its use case had been validated and proved more generally applicable beyond its initial start.

So I talked a little bit about runtime environments. Runtimes and shims are the mechanism that controls the execution environment of a container, whether that's a standard Linux container or one of the alternatives. runc is the default runtime and the reference implementation of the OCI runtime spec. It's what most installations of containerd are going to use, but it's important to note that it's not actually part of containerd; it's its own independent project under the OCI, the Open Container Initiative, which is a separate foundation. There's also crun, an alternative Linux runtime that does the same thing as runc. Both of these leverage typical Linux container technologies like namespaces, cgroups, seccomp, capabilities, SELinux, and AppArmor. Beyond that, the containerd project hosts a non-core runtime, runwasi, which is focused on Wasm containers: not Linux, but WebAssembly. There's runhcs, the runtime for Windows containers, which leverages the Host Compute Service on Windows; that's what actually implements the containerization there. There's also runj, an experimental runtime for FreeBSD jails. Beyond that, there are other community-driven or commercially-driven runtimes, like Kata Containers, gVisor, and firecracker-containerd. These are all different models focused on increased isolation around Linux containers. They go beyond the typical technologies used by runc and crun, and can do things like isolate with a hypervisor or use syscall re-implementation to provide additional isolation beyond what you would get with just a regular Linux container.
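Selecting one of these runtimes from a client is a per-container choice. Here's a hedged sketch, assuming gVisor's shim (commonly registered as io.containerd.runsc.v1) is installed on the node and an image has already been pulled:

```go
package example

import (
	"context"

	"github.com/containerd/containerd"
	"github.com/containerd/containerd/oci"
)

// newIsolatedContainer creates a container that runs under an alternative
// shim instead of the default runc shim ("io.containerd.runc.v2").
// The chosen shim binary must be installed on the node; Kata's shim,
// for example, is typically registered as "io.containerd.kata.v2".
func newIsolatedContainer(ctx context.Context, client *containerd.Client, image containerd.Image) (containerd.Container, error) {
	return client.NewContainer(ctx, "demo",
		// Prepare a writable snapshot from the image's layers.
		containerd.WithNewSnapshot("demo-rootfs", image),
		// Build an OCI spec seeded from the image's config.
		containerd.WithNewSpec(oci.WithImageConfig(image)),
		// Route execution through the alternative shim.
		containerd.WithRuntime("io.containerd.runsc.v1", nil),
	)
}
```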
So I think we're going through this pretty quickly, but before we move on to Q&A, I want to point out a few ways that you can get involved with containerd. We have a couple of active channels on the CNCF Slack. There's the containerd channel, which is more focused on general use of containerd, and the containerd-dev channel, which is focused on developing containerd or extensions to it. We run a monthly community meeting on the second Thursday of each month; it's on the CNCF calendar, and that's where you should check for your local time zone. We also love to hear about adoption and about folks building new things in the ecosystem, whether that's in the community, something personal, or something commercial. It's really great to just hear about what's going on. We're very active on GitHub and use it as another primary means of communication. We have an active GitHub Discussions forum for conversation, and GitHub issues are available for bug reports, feature requests, and collaboration. If you have security issues to report, there's a private security reporting mechanism through GitHub. We also use pull requests as the primary means of code review and contribution. If you want to see what's going on, author new pull requests, or comment on active ones, we'd love to have you involved with the community.

So, I want to open this up for Q&A. If you are a virtual attendee, or we don't end up getting to you in the time we have here, please go ahead and ask in the Slack channel and tag us. We can't promise that we'll get back to you immediately, but we are going to try and continue the conversation there. One of the staff members is going to come around with a mic.

Hello, my name is Andre from London. The question would be around hardware acceleration or SSL offloading: are there any plans or thoughts on implementing more native support for these kinds of technologies in containerd? Thank you.

I can take that. I don't know that I have specific thoughts on SSL offloading, but we have been looking at how to make various things in containerd faster by taking advantage of hardware capabilities. We had a pull request recently that looked at using hardware acceleration for SHA-256, which is used as part of the digest calculation and verification for images as they're downloaded. But generally we're staying up to date with things in the Go ecosystem as well, so if hardware accelerations land in the Go standard library, we're going to pick those up too. Hopefully that answers your question. Cool.

Hello, my name is Azar, I'm from Berlin. The question is: is there any relation between Podman and containerd? Because I haven't seen any mention of it.

Yeah, those are separate, independent projects.

Then a question with regards to containerized operating systems, or rather operating systems with better container support, one example being Amazon's Bottlerocket operating system. It's like having this closer to the kernel; can you speak about how containerd supports this, and what the developments are out there? Thank you.
Yeah, so thank you for asking a question that I can answer pretty well. I used to work on Bottlerocket before I left Amazon, and Bottlerocket is a container-focused operating system. What it does is use containerd as its containerization mechanism, so it doesn't have additional isolation for the containers beyond what containerd provides. Your container execution environment is going to be the same on Bottlerocket as it is on any other Linux distribution, including other container-focused Linux distributions like what used to be CoreOS and is now, I think, something else at Red Hat. There's also COS from Google, and I think Talos is another container operating system. They're all going to be the same in that respect, and it's also going to be the same as if you just run containerd on Debian or Ubuntu.

All right, excited to hear about the LTS support, that's great. Any idea on the cadence of that, and whether the support windows for LTS releases will be overlapping?

I think we're early in the stage of trying to figure that out. 1.6 is the first LTS; we haven't tried doing one before, and we're kind of gonna see how that goes over the next couple of years. If you have feedback on that, we'd love to have it. I'm guessing from your question that you're looking for overlapping support windows; is there anything else that you'd like to see in LTS, or any other feedback for us?

It's interesting that the LTS is starting and then 2.0 is coming, so I could see that potentially in 2.1 or 2.2 there's a neat feature that would force us, or really encourage us, to move off of LTS 1.6.

Yeah, with 2.0 being a fairly large change, including some backward-incompatible breaking changes, which is why it's moving from 1.x to 2.0, 1.6 is hopefully going to provide a longer time for people to migrate, test, and make sure that they're okay with the changes that are happening in 2.0. I think that if the 1.6 LTS goes really well, we'll hopefully have something more concrete around what a cadence could be for future LTS releases, but at this point we don't really know.

Okay, great, thanks. I've seen the devmapper snapshotter hang periodically after starting a whole bunch of snapshots for microVMs; after a period of time, you're just stuck there. Have you seen that in your experience with Lambda or Fargate when you were using it at AWS? Sam, maybe this is a question for you.

I haven't, but you should file an issue and we will try to reproduce it, though I'm not sure we can comment on Fargate stuff. I don't think I have anything to add. Yeah, yeah.

So it's generally not advised to use it with a loopback file system, but that can be more convenient than formatting a separate hard drive and allocating it for the device mapper pool. Do you think that could be why, potentially?

Yes, maybe. It's sort of hard to judge from here whether it's a good idea or not, but we can follow up on GitHub and figure it out.

Yeah, thanks. It's me again. So with the pluggable transfer layer, can I, in theory, use it for something like pulling and pushing images to S3 or compatible storage?

Yeah, that's the idea behind it: you can replace the source with any plugin you like, and it's basically the plugin's responsibility to retrieve the data from S3, Dropbox, or whatever and provide it to containerd.
I think the way to think about it is that right now the only remote containerd really knows how to pull from is an OCI registry. With the transfer service, you should be able to have it pull from, or push to, whatever your plugin implements. That could be an OCI registry if you want, but it could be something else.

Cool, so maybe we'll close Q&A now, and you're welcome to come up and talk to us after. We would love to have feedback on the session, so there's a QR code up here that you can scan and leave us feedback. Thank you all for coming.