Hi, everyone. Welcome to the Kata Containers project update. Let me begin by introducing myself. I'm Archana Shinde and I work at Intel. I also serve on the Kata Containers Architecture Committee. Today I'll be joined by Tao Peng, who is also a member of the Kata Architecture Committee and works at Ant Financial. With that, let's get started. What are Kata Containers? If you're here, you're probably already somewhat familiar with Kata Containers. We started the project with the vision of making traditional containers more secure by using the isolation provided by hardware-based virtualization. Essentially, with Kata we run each container or pod inside a lightweight virtual machine and provide all the necessary plumbing so that this is completely transparent to the upper container manager or orchestration layer, such as Kubernetes. To lay out a brief history of Kata: the project was started in December 2017 as a result of the merger of the runV project and the Intel Clear Containers project. In May 2018, we made our first major release. In April 2019, Kata Containers was confirmed by the Foundation board as the second top-level project of the OpenStack Foundation. Following that, in 2020 we made our second major release, which introduced quite a few architectural improvements. It's been quite a while since our last update, so there's a lot to catch up on; a bunch of cool new features and improvements have gone into Kata since then. Our last update was at the Shanghai Open Infrastructure Summit back in 2019. There we highlighted a few things. Architecture support for s390x was added. There were some operability improvements as well, in terms of adding support for agent tracing, so that the entire Kata stack could be traced end to end, from the host side to the guest side. The TC filter was made the default network mechanism, and we also added initial support for the containerd shim v2 architecture.
There were hypervisor improvements as well: support for ACRN was added, along with Firecracker jailer support and virtio-fs work. A lot of new releases have been made since then. As I mentioned, we made our second major release in October 2020, which introduced quite a few breaking changes. While we were working on 2.x, we continued to support 1.x for a while, in order to support users who were running Kata 1.x in production. In May 2021, we made the decision to drop Kata 1.x and instead focus all our energy and effort on the 2.x releases. As of May 2022, our latest release was 2.4.1, plus an alpha release of 2.5.0. If we look at the progress we've made since January 2020, we have seen impressive growth in the number of contributions from several organizations. We're also seeing a lot of diversity in the contributions, as more and more people adopt Kata and use it in production. We've seen a good increase in the number of pull requests, as well as healthy discussions around what should go into the next Kata release. With that, let's take a look at the major changes that went into Kata 2.0. The most important architectural change was the shim v2 architecture; I'll talk about this in some detail in the next slide. Next, the Kata agent, which is the agent that runs inside the VM and handles all of the sandbox and container lifecycle management within the VM, was rewritten in Rust. This rewrite from Go to Rust gave us significant reductions in memory overhead and reduced the overall attack surface as well. Along the same lines, the agent protocol, which the runtime uses to communicate with the agent, was simplified from gRPC to ttRPC. ttRPC is a much lighter-weight protocol than gRPC. A new component called kata-monitor was added to improve observability and manageability.
This tool can be used with other tools in the Kubernetes ecosystem, such as Prometheus, to gather useful container metrics from Kata. Another developer tool, kata-agent-ctl, was added, which helps us validate the agent API as we continue to make improvements to it. Next, virtio-fs. virtio-fs is a shared file system that Red Hat came up with. In 2.0, we made the decision to make it the default, and we deprecated the virtio-9p protocol, which was no longer being maintained. With the shift to virtio-fs, we saw better POSIX compliance and also better performance in general. As I mentioned earlier, one of the major changes in Kata 2.0 was the shim v2 architecture being made the default, with support for the early architecture dropped completely. Our early architecture basically replicated the runc way of doing things. It wasn't very efficient for Kata, which uses VM sandboxes, to replicate exactly what runc does. Because of the assumptions of that architecture, we also had to have at least one host process for every container process running inside the guest. What this means is that if there were N processes running inside the guest, we had to have at least N+1 processes on the host side. The shim v2 architecture, which folks in the containerd community came up with, is much better suited for sandboxed runtimes such as Kata and gVisor. It makes no assumptions about processes running on the host, and it introduces APIs that are left to the container runtime to implement. For example, an API such as GetMetrics was left to the Kata runtime to implement, rather than assuming there is a per-container host process that would provide the metrics. With this, it became possible to have a single host process, called containerd-shim-kata-v2, running for every pod in the guest.
So this reduced the memory overhead as well as the overall complexity. We later worked with the CRI-O community to have this implemented on the CRI-O side as well. With both CRI-O and containerd supporting it, we made shim v2 the default architecture in Kata 2.0. Now let's talk about the features that were introduced in the later 2.x releases. We added quite a few features to improve performance. One of them was the Nydus image acceleration service. This is a new image service that brings container image lazy loading for both QEMU and Cloud Hypervisor; Tao Peng will talk about this in some detail in the following slides. In addition, we closed all the gaps in our VFIO device passthrough to make sure that Kata works correctly with device plugins such as SR-IOV, and also to support the DPDK use case. Moving on, we added support for directly assigned volumes. What does this mean? Since Kata uses virtual machines, passing in a block device is much more performant and efficient than passing in entire volumes, which would otherwise be exposed to Kata through a shared file system such as virtio-fs. With that in mind, we worked with other open source communities, such as Kubernetes and containerd, to make sure that volumes backed by block devices are passed directly to Kata. The responsibility for mounting these devices is left to Kata, and the devices are no longer mounted on the host side. We also added support for huge pages; this again ties into our DPDK use case. And finally, we added support for disk and network rate limiting for Cloud Hypervisor. This feature is important for performance, as it helps reduce the noisy-neighbor problem and ensures that a workload is I/O-isolated from all other workloads running on the same machine. Now, in terms of security, we made several improvements.
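To give a feel for how the shim v2 integration surfaces to users, a Kubernetes RuntimeClass that routes pods to the Kata runtime might look like the following sketch. The handler name "kata" is an assumption here; it must match a runtime handler configured in containerd (or CRI-O) with runtime type `io.containerd.kata.v2`, and names vary by distribution.

```yaml
# Sketch: a RuntimeClass pointing at a Kata shim v2 runtime handler.
# Assumes containerd is configured with a runtime named "kata" whose
# runtime_type is "io.containerd.kata.v2" (exact names vary by setup).
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata
handler: kata
```

A pod then opts into the sandboxed runtime simply by setting `runtimeClassName: kata` in its spec; pods without it keep using the default runc-based runtime.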
We worked with both the CRI-O and containerd communities to have SELinux support implemented end to end for Kata. What this entailed was adding a separate SELinux profile specific to Kata, one suitable for hypervisor-based runtimes. A lot of work went into this, and we have an excellent talk on it from Fabiano Fidêncio at the OIF Summit in Berlin. In addition to SELinux, we also added support for seccomp, as well as support for a rootless hypervisor. In 1.x, we had added support for running rootless, but this feature, the rootless hypervisor, allows the hypervisor itself to run with minimum privilege even when the runtime is invoked with root privileges. With respect to compatibility, we have added quite a few features, including support for IPv6, which helped us integrate well with Red Hat OpenShift. We also added support for nerdctl and for virtio-mem. virtio-mem is a new paravirtualized mechanism for adding memory to, or removing it from, a virtual machine. Tao Peng will talk about the rest of the topics on the slide in much more detail, including inotify support for Secrets and ConfigMaps in Kubernetes. In addition to all of the extensive feature additions, there were a few deprecations as well. I've already talked about the first two, namely virtio-9p being dropped and shim v2 being made the default. We deprecated qemu-lite as well. qemu-lite was a lightweight version of QEMU that we had introduced after stripping out legacy code and devices that are not applicable to container workloads, and we dropped this hypervisor as most of those downstream improvements were merged upstream. With that, Netmon, a network monitoring tool that we used with Docker, was dropped as well, as it was no longer required. And finally, from a developer point of view, we moved to a monorepo model: we moved all our code and documentation repositories into a single repo.
And this greatly simplified our CI and release process as well. This slide shows the dependency versions that the latest Kata version is compatible with. We try to keep up with the latest and greatest tools in the container ecosystem, so this gives a snapshot of the various components Kata works with. In terms of architecture updates for the x86 platform, we added support for Intel TDX and Intel SGX, as well as support for AMD SEV. This all ties into our confidential computing use case. With confidential computing, we have an enhanced threat model, so as to make sure that the container workload itself is protected from the host by running it inside a hardware-based trusted execution environment. For the Arm platform, we made quite a few improvements. We added support for nvdimm plus DAX, which allows the VM root filesystem to be passed in while bypassing the host page cache. Other improvements were support for vCPU hotplug and virtio-mem. For hardware accelerators, we added support for Intel QAT, Intel GPUs, and NVIDIA GPUs. An SPDK vhost-user target was also added; this helps you run high-performance user-mode storage applications. With that, I'll hand it over to Tao Peng to talk in detail about some of our key feature highlights. Thank you. Thank you, Archana. Hi, everyone. My name is Tao Peng. I'm a staff engineer at Ant Group and also part of the Kata Containers Architecture Committee. Let me continue with our recent feature highlights. During the 2.0 timeframe, we rewrote the Kata agent in Rust and replaced the communication protocol between the runtime and the agent, moving from gRPC to ttRPC. The benefit is obvious: we reduced the RSS of the Kata agent from 11 megabytes down to about 300 kilobytes. We also added a new node-wide component called kata-monitor to collect Kata-specific metrics and present them to external services.
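Since kata-monitor exposes metrics over HTTP in the Prometheus exposition format, wiring it into a monitoring stack can be sketched as a plain scrape config. The listen address below is an assumption for illustration; kata-monitor's address is configurable, so adjust the target to your deployment.

```yaml
# prometheus.yml fragment (sketch): scrape the kata-monitor endpoint.
# The target address is an assumed default, not a guaranteed one.
scrape_configs:
  - job_name: "kata-monitor"
    static_configs:
      - targets: ["127.0.0.1:8090"]
```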
kata-monitor is responsible for collecting metrics from the different Kata shim processes and presenting them to external services such as Prometheus. In the 2.x timeframe, we also added inotify support for Kata Containers watchable mounts. A watchable mount is a special mount defined by Kata Containers: for example, we identify certain Kubernetes volumes and make them watchable for the containers. This is done by having the Kata agent process periodically poll for changes in those volumes and present them through a tmpfs in the guest, so that applications in Kata Containers can use inotify to monitor for such changes. This was introduced mainly because virtio-fs, and remote file sharing protocols in general, cannot support inotify, but we wanted to provide that functionality to users, so we added it. We also added direct block volume assignment support. Without it, Kata Containers presents all volumes through virtio-fs, which is slower than virtio-blk. So folks are working on presenting block volumes directly to the applications inside Kata Containers. This is done by defining a Kata-specific mountinfo.json file and putting it inside the volume directory on the host. The Kata shim then parses the JSON file, finds the backing block device on the host, and passes it to the guest as a block device instead of going through virtio-fs. This work requires a specialized CSI driver, which will be open-sourced as a subproject in the Kata community as well. Also, we have integrated a CNCF container image lazy-pulling project called Nydus into Kata Containers. With it, we can launch Kata Containers in constant time, such as less than five seconds, regardless of the container image size, because without lazy pulling we have to pull the entire container image, which costs a lot of time when we first start the container.
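To make the direct block volume flow above concrete, the per-volume metadata file dropped on the host by the CSI driver might look roughly like this. The field names here are illustrative only, not the exact Kata schema; the point is that the file tells the shim which host block device backs the volume and how to mount it in the guest.

```json
{
  "volume-type": "block",
  "device": "/dev/vdb",
  "fstype": "ext4",
  "options": ["rw"]
}
```

The shim reads this file instead of bind-mounting the volume on the host, hot-plugs the device into the VM, and the agent mounts it inside the guest at the container's target path.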
Also, we have added a new component called runk to start standard OCI containers based on the Kata agent. This is currently an experimental feature, and as we can see, because runk is written in Rust, it is faster than runc and has a smaller memory footprint than runc, though there is obviously still room for improvement compared with crun, which is written in C. But because runk is written in Rust, we gain Rust's safety features; for example, we do not have to hunt for memory-safety bugs such as memory leaks or dangling pointers in runk. Also, some updates about Kata testing. We have a group of people working on a so-called "green CI" effort within Kata Containers. We have defined all the Kata CI jobs and assigned maintainers to the different jobs, and thanks to these folks' hard work we have quite a stable CI within the project right now. Some updates on the Architecture Committee as well. We have gone through several rounds of elections in the past few years, and the latest Architecture Committee members are Archana from Intel, Eric from Apple, Fabiano also from Intel, Tao Peng from Ant Group, and Samuel from Rivos. And we have seen developers from these companies contributing to the project. This is an expanding list of companies; currently it's a little more than 30, and we expect to see more folks from different companies come and contribute to the project. Also, some updates about Kata users. We have gathered a few use cases from different enterprise users, such as Hengu. They are running online applications and benchmarks on thousands of production nodes with Kata Containers. They claim to have achieved carbon neutrality last year, and Kata Containers was one of the key technologies that helped them achieve it. They are working on a best-practices white paper that will be published during the summer.
Also, Baidu offers AI, cloud, and edge computing services at massive scale with Kata Containers. A startup is using Kata Containers to implement a new cloud resource management technology. IBM is using Kata Containers for their CI/CD workloads, and Huawei is running Kata Containers in production in their Cloud Container Instance service and also their Cloud Container Engine service. Snowflake is using Kata Containers to deploy untrusted workloads, so they do not have to worry about container escape events and kernel vulnerabilities. Nubificus is using Kata Containers to deliver a serverless framework for cloud and edge resources. And last but not least, Red Hat has integrated Kata Containers into their OpenShift product to provide additional isolation with the same cloud-native user experience to their production users. As for our future plans, there are some major ongoing changes within the Kata Containers community. The first one is confidential computing. This is a feature that expands the Kata Containers threat model from protecting the host infrastructure to also protecting the workloads. We are also rewriting the Kata runtime in Rust. This will vastly reduce Kata Containers' resource overhead on the host and enable new use cases, such as function compute, that require high density on the host. And we are working on introducing an integrated VMM with the Rust runtime. This will be a highly customized, rust-vmm-based VMM built into the Rust Kata runtime to simplify deployment and operability of Kata Containers. We view this as a step toward virtualization for the cloud-native use case. With these major ongoing changes, we are expecting a new major Kata Containers release, 3.0, to happen in October this year. Now let's take a look at the confidential computing use case.
This uses Kata Containers as a key component to run the workloads inside a trust domain (TD) VM or an SEV VM, so that processes on the host cannot see the memory or data of the workloads in the guest. Also, with the Rust Kata runtime and the integrated VMM, we are building them into a single process, so in the future we will have the Kata runtime, the hypervisor, the guest, and the agent all running under the same process on the host. Finally, there are a lot of channels for folks who are interested in the Kata Containers project to reach out to us. We have a website; please also see our GitHub repository, and you can reach out to us via IRC, Slack, Twitter, and also the mailing lists. So if you have any questions, please reach out through these channels; we look forward to seeing more folks join the community. Thank you, everyone.