Hello everyone, I'm Yan Song, an engineer from the Trusted Native team at Ant Group. I focus on cloud-native container images, and I'm also a Rust developer. Today I will introduce my topic: toward the next-generation container image. Here is an outline of the talk. First, I will introduce the current state of container images, then talk about the drawbacks of the OCI image format and the community discussion about improving it. Finally, I will introduce our open source project, the Nydus image service.

The container ecosystem is inseparable from images. Docker, for example, is composed of containers and images. Containers are fast and lightweight and provide a certain level of isolation for the application's running environment, while the image provides an immutable data source. In more detail, a container image is a lightweight, standalone, executable package of software. It contains all the data needed to run the containerized application, such as code, runtime, system tools, system libraries, and runtime configuration.

Let's take a look at the container root filesystem structure. It consists of a read-write layer and several read-only layers. The read-write layer keeps all the data the container modifies or saves, while the read-only layers are the immutable part of the container image. These layers are combined with overlayfs or similar technologies into a complete view of the container root filesystem.

When it comes to container images, we have to talk about the OCI community. The OCI community aims to standardize all kinds of container-related technologies so that vendor implementations in the container area can be compatible with each other. OCI defines the container runtime, image format, and image distribution specs. The runtime spec specifies the configuration, execution environment, and lifecycle of a container. The image spec describes the OCI container image format. The distribution spec defines an API protocol to facilitate the distribution of images. Here, we are mainly talking about the image spec.

This is the structure described by the OCI image spec. It starts with an index JSON, which indexes one or more image manifests, one per OS and architecture. Each manifest references the container runtime configuration blob, which describes the runtime environment, and the data for each layer of the image, usually as gzipped tarballs. These layers are the read-only layers of the container root filesystem.
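To make this structure concrete, here is a minimal sketch in Rust of how these documents nest, using serde to model the index, manifest, and descriptor. The field names follow the OCI image spec, but the structs themselves are only illustrative; they are not taken from any particular library.

```rust
#![allow(dead_code)]
use serde::Deserialize;

/// A content-addressed reference to a blob, as defined by the OCI image spec.
#[derive(Debug, Deserialize)]
struct Descriptor {
    #[serde(rename = "mediaType")]
    media_type: String,
    digest: String, // e.g. "sha256:..."
    size: u64,
    platform: Option<Platform>, // present on index entries
}

#[derive(Debug, Deserialize)]
struct Platform {
    architecture: String, // e.g. "amd64"
    os: String,           // e.g. "linux"
}

/// Top-level index JSON: one manifest descriptor per OS/architecture.
#[derive(Debug, Deserialize)]
struct ImageIndex {
    #[serde(rename = "schemaVersion")]
    schema_version: u32,
    manifests: Vec<Descriptor>,
}

/// A manifest references the runtime config blob and the read-only layers.
#[derive(Debug, Deserialize)]
struct ImageManifest {
    #[serde(rename = "schemaVersion")]
    schema_version: u32,
    config: Descriptor,      // container runtime configuration blob
    layers: Vec<Descriptor>, // usually gzipped tarballs
}

fn main() -> serde_json::Result<()> {
    // Parse a tiny example manifest to show how the pieces fit together.
    let manifest: ImageManifest = serde_json::from_str(
        r#"{
            "schemaVersion": 2,
            "config": {
                "mediaType": "application/vnd.oci.image.config.v1+json",
                "digest": "sha256:b5b2b2c507a0944348e0303114d8d93aaaa081732b86451d9bce1f432a537bc7",
                "size": 7023
            },
            "layers": [{
                "mediaType": "application/vnd.oci.image.layer.v1.tar+gzip",
                "digest": "sha256:9834876dcfb05cb167a5c24953eba58c4ac89b1adf57f28f2f9d09af107ee8f0",
                "size": 32654
            }]
        }"#,
    )?;
    println!("config {} + {} layer(s)", manifest.config.digest, manifest.layers.len());
    Ok(())
}
```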
Next, let's take a look at the drawbacks of the OCI image structure. First, before a container can be created, the image must be completely pulled and unpacked into the filesystem. In fact, the containerized application will only use a small part of the image data at runtime; one report says only about 7% of image data is actually used. Yet the container must wait until the entire image is pulled down before it can be created. In addition, deduplication at the image layer level is not very effective. For example, a change to a file's metadata causes the entire file data to be saved again into a new layer. If a file changes many times, the container will only use the latest version, but the historical versions will be pulled down too. Deleted files will also be pulled, which wastes storage and network bandwidth.

Of course, the community has also tried to resolve these issues, and we are happy to see related open source projects. Dragonfly and Kraken provide P2P image services. CernVM-FS from CERN relies on a local filesystem cache. Slacker from the University of Wisconsin uses storage-layer snapshot capability. CRFS from the Google Go build team keeps the layered format but splits metadata and data within each layer. FILEgrain from NTT and umoci from SUSE break the layered format and separate image metadata from data.

Unfortunately, all of them are more or less imperfect. Dragonfly and Kraken lack on-demand loading. Slacker and CRFS only have layer-level deduplication. CernVM-FS, FILEgrain, and umoci produce too many data blobs per image. Slacker and umoci rely on storage-layer snapshot and reflink capability. And none of them verify the data when it is read by the container application.

For the issues mentioned above, we developed the Nydus image service project. It can load container image data on demand. It supports chunk-level data deduplication with a configurable chunk size. It flattens the container image into a metadata layer and a data layer, removing all intermediate layers. It also supports end-to-end image data integrity. Last but not least, it is compatible with the OCI artifacts spec and distribution spec, so it can be easily integrated into the existing container ecosystem.

This is the architecture of Nydus. It is a user-space daemon that contains a user-space filesystem and an optional local cache. Through the FUSE and virtio-fs protocols, it supports running both runC and Kata containers. The user-space filesystem stores its data and metadata in different storage backends, for example OSS, NAS, a container registry, or the Dragonfly P2P network. When a container accesses its image data, Nydus first looks in the local cache; if the data is not there, it fetches it from the storage backend.

A Nydus image is divided into two parts, a metadata layer and a data layer, and the file data is split into chunks. The metadata layer provides the filesystem tree information of the entire image: directories, files, symbolic links, and hard links. The metadata also records the location index of the file data chunks stored in the remote storage backend. In addition, a Merkle tree, also known as a hash tree, is constructed in the metadata. It stores the hashes of the file data as leaf nodes and then calculates the hash of each directory from the hashes of its children, so we can easily verify the entire file tree from bottom to top, like a blockchain.

Let's look back at how the Nydus design resolves these problems. It resolves the on-demand load problem: the container application only reads data when it is requested. Because the file data is split into chunks, chunk-level deduplication can be applied. It supports a variety of remote storage backends, with no dependency on storage-layer capabilities. And the Merkle hash tree can be used for data validation at runtime, supporting end-to-end data integrity checks.

Let's take a look at Nydus performance. The tests use the busybox, CentOS, OpenJDK, Node.js, and TensorFlow images. As we can see, as the image size increases, the OCI image pull time also increases: the TensorFlow image pull takes more than 200 seconds, while the Nydus image pull time always stays at the millisecond level, and the end-to-end container start time is only about 2 seconds.

Nydus also implements end-to-end data integrity validation. When a data chunk in the local cache is broken, perhaps because someone hacked the machine or the local storage was corrupted, Nydus can detect it by checking the data digest, fall back to the remote storage backend to fetch the data, and then refill the local cache.
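As a rough illustration of this read path, here is a minimal Rust sketch. The ChunkStore and Backend traits are hypothetical stand-ins for the daemon's real cache and storage backend interfaces, and chunks are assumed to be addressed by their hex-encoded SHA-256 digest; it only shows the cache-check, digest-verify, fall-back, and refill flow described above, not the actual nydusd implementation.

```rust
use sha2::{Digest, Sha256};
use std::collections::HashMap;

/// Hypothetical interfaces standing in for the daemon's local cache
/// and remote storage backend (registry, OSS, NAS, Dragonfly P2P, ...).
trait ChunkStore {
    fn read(&self, digest: &str) -> Option<Vec<u8>>;
    fn write(&mut self, digest: &str, data: &[u8]);
}
trait Backend {
    fn fetch(&self, digest: &str) -> Vec<u8>;
}

/// Read one chunk on demand: try the local cache first, verify its digest,
/// and fall back to the remote backend if the chunk is missing or corrupted.
fn read_chunk(cache: &mut dyn ChunkStore, backend: &dyn Backend, digest: &str) -> Vec<u8> {
    if let Some(data) = cache.read(digest) {
        if hex_sha256(&data) == digest {
            return data; // cache hit and the digest matches
        }
        // Digest mismatch: disk corruption or tampering, so re-fetch below.
    }
    let data = backend.fetch(digest);
    assert_eq!(hex_sha256(&data), digest, "backend returned a corrupted chunk");
    cache.write(digest, &data); // refill the local cache
    data
}

fn hex_sha256(data: &[u8]) -> String {
    Sha256::digest(data).iter().map(|b| format!("{:02x}", b)).collect()
}

// In-memory stand-ins, just enough to exercise the flow.
struct MemCache(HashMap<String, Vec<u8>>);
impl ChunkStore for MemCache {
    fn read(&self, d: &str) -> Option<Vec<u8>> { self.0.get(d).cloned() }
    fn write(&mut self, d: &str, data: &[u8]) { self.0.insert(d.into(), data.to_vec()); }
}
struct MemBackend(HashMap<String, Vec<u8>>);
impl Backend for MemBackend {
    fn fetch(&self, d: &str) -> Vec<u8> { self.0[d].clone() }
}

fn main() {
    let chunk = b"some file data chunk".to_vec();
    let digest = hex_sha256(&chunk);
    let backend = MemBackend(HashMap::from([(digest.clone(), chunk)]));
    let mut cache = MemCache(HashMap::new());
    let first = read_chunk(&mut cache, &backend, &digest);  // miss: fetch + refill
    let second = read_chunk(&mut cache, &backend, &digest); // hit: served from cache
    assert_eq!(first, second);
}
```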
Let's review those community projects against Nydus. Nydus has a lot to offer in terms of on-demand loading, a finer data deduplication level, reproducibility, runtime data integrity validation, no storage-layer dependency, and more.

In the future, Nydus will support more data compression algorithms, a scalable chunk size and deduplication level at image build time, deeper integration with P2P systems like Dragonfly, and better integration with different CRI runtimes, and we will actively promote Nydus as a reference implementation of the OCI v2 image spec.

That's all for my introduction. We have open sourced Nydus on GitHub as a subproject of Dragonfly; here is the link, and we welcome any contributions. There will be a Q&A session during the summit, so feel free to drop by. My colleague and I will be there to answer any questions you may have. Thanks.