Hello, I am here with Chiangpeng, and we would like to talk about building highly efficient storage infrastructure for secure containers on top of SPDK. In this session, the secure container mainly refers to Kata Containers. The talk has several parts. The first part is an overview of container storage infrastructure and secure container storage. Then comes an introduction to SPDK vhost and how to apply it to Kata. After that we show how to provide Docker volumes from SPDK and how to provide a rootfs service from SPDK. At the end we cover the progress, the limitations, and our expectations.

So what makes containers such a valuable and popular technology? One of the factors, I think, is the innovative storage usage. Containers are ephemeral in nature. Each has a file system of its own, and when the container dies, the data stored locally in that file system is gone too. Images and containers are built up from a series of layers. When you create a new container, you add a new writable layer on top of the underlying layers. Storage drivers use stackable image layers and a copy-on-write strategy. That means any modification is made to the container's own copy of a file, and the container no longer sees the read-only copy of the file that exists in the lower layer.

Based on these specific storage requirements, container storage infrastructure has several characteristics. The first one is layered storage, which serves the union file system. The second is scratch space, which serves ephemeral storage. The last one is the persistent volume, which serves stateful containers and keeps their data alive across container crash, death, or deletion.

For general containers, the storage infrastructure is built on top of kernel functionality. It is tightly coupled to the kernel of the host machine where the container is running. In-kernel storage drivers like AUFS, device mapper, and OverlayFS are leveraged for the rootfs; they handle the details of the stackable image layers and the copy-on-write strategy.
The bind-mount mechanism helps combine the various kinds of volumes and the rootfs into one consistent file-system view inside the container. A secure container has one extra layer for isolation. For example, Kata Containers is based on a lightweight VM, so it has a virtualization layer. With this extra layer, the container inside the VM cannot directly touch the host's storage resources. Virtio-block is a convenient way to access block storage within the virtual machine. Virtio-9p and virtio-fs are shared file systems; they are also virtio-based devices that let the virtual machine access a directory tree on the host. In Kata Containers, virtio-fs and virtio-9p are used to share container directory volumes, to inject config maps and other files, and to share the container rootfs on the host with the guest. Underneath, it is still the kernel storage stack, and the extra layer is a barrier for resources. As a result, these paths do not perform as well as local file systems and local block devices. But from another point of view, with this extra isolation layer, the host kernel storage service can be decoupled.

Okay, first let's talk about SPDK. So, what is SPDK? SPDK is short for Storage Performance Development Kit. It is built as a set of libraries, optimized for the latest generations of CPUs and NVMe SSDs, and it is fully open source under the BSD license. You can review the SPDK code on GitHub and find documentation via the spdk.io website. The right column lists companies that have deployed SPDK in their environments.

Here we mainly talk about the SPDK vhost-user solution, which is focused on the performance target. The SPDK vhost-user solution means we can use fewer CPU cores for I/O processing while achieving the maximum IOPS inside the guest. In the diagram on the left, the yellow part is the CPU cost for the different solutions. The first one is QEMU with the in-box virtio driver, and the next one is the host kernel vhost solution.
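To make the vhost-user path concrete, here is a rough sketch of how a guest attaches to an SPDK vhost-blk socket with QEMU. The socket path, IDs, and sizes are illustrative assumptions, not taken from the talk; the shared memory backing is required for vhost-user to work:

```shell
# Attach a guest to an SPDK vhost-blk controller listening on /var/tmp/vhost.0.
# vhost-user needs the guest RAM to be in shared memory (hugepages here).
qemu-system-x86_64 -machine q35,accel=kvm -cpu host -m 1G \
  -object memory-backend-file,id=mem0,size=1G,mem-path=/dev/hugepages,share=on \
  -numa node,memdev=mem0 \
  -chardev socket,id=spdk_vhost_blk0,path=/var/tmp/vhost.0 \
  -device vhost-user-blk-pci,chardev=spdk_vhost_blk0,num-queues=2 \
  -drive file=guest.img,if=virtio
```

The key difference from the kernel solution is that I/O requests are handled by the SPDK vhost target in user space via the shared memory, bypassing the host kernel block stack.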
The last one is the SPDK vhost-user solution. You can see that the SPDK vhost-user solution uses just one CPU core. The right part shows the performance we can get: compared with the kernel solution, the SPDK vhost-user solution can achieve over two times the IOPS while using only one CPU core.

With several years of development, SPDK already has plenty of modules and functionality, and several of these modules can be composed to support container storage. The components are layered. The lower part is the driver layer, and on top of the drivers we put components such as the block device (bdev) modules. We also have integrations into the existing ecosystem: for example, we have a Cinder plugin for OpenStack, we can integrate BlobFS into RocksDB, and we have Ceph integration. On the frontend side we have the vhost-user solution.

For the vhost-user solution, let's see how to use a block device in the Kata container scenario. At the beginning, a kernel-based block device is assigned to the Kata container: we want to add the block device into the container, with a destination path such as /dev/sda, via the docker command. The steps are as follows. First, we must have a block device in the kernel. We pass this block device in the docker command, and Docker encodes it inside the JSON configuration file. Then the Kata runtime parses this container JSON file and instructs the hypervisor to add this block device into the container VM. The hypervisor transforms the block device into a virtio block device inside the container. This is the OCI specification entry used when creating the device: the block device is described by its major and minor device numbers. Here, major 8 and minor 0 mean it is a host disk; you can list the device nodes under /dev to see the major and minor numbers of each node.
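As an illustration (a minimal sketch following the OCI runtime specification, not taken verbatim from any runtime), a `config.json` device entry for the host disk /dev/sda with major 8 and minor 0 looks roughly like this:

```json
{
  "linux": {
    "devices": [
      {
        "path": "/dev/sda",
        "type": "b",
        "major": 8,
        "minor": 0,
        "fileMode": 432,
        "uid": 0,
        "gid": 0
      }
    ]
  }
}
```

The `"type": "b"` field marks it as a block device; the Kata runtime reads exactly these major/minor fields when it decides how to plug the device into the VM.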
But this cannot work for the vhost-user solution, because we use a Unix domain socket as the transport channel rather than a kernel device node. So we added some code to make this work with Kata Containers. Currently, we get around this limitation by using mknod to satisfy the device requirement in the OCI specification. First, we create a fake node file for each vhost-user storage device. Each node file corresponds to the vhost-user device passed in the docker command. As before, Docker encodes it inside the JSON configuration file, and the Kata runtime parses this container JSON configuration file. Finally, the Kata runtime instructs its hypervisor to add the vhost-user storage device into the container VM, and the SPDK vhost target transforms the storage device into a virtio block device inside the container. So now you can use a similar mechanism to have a block device in your container, with the service served by the SPDK vhost target.

When considering building container storage infrastructure on SPDK, we have several steps to take. At the start, the most straightforward step is to provide a storage volume from SPDK vhost via the docker command. In the docker run command, the --device parameter can add a host device to the container, for example --device=/dev/sdc:/dev/vdc. Adding an SPDK device to a container generally takes these steps: first, create an SPDK bdev inside SPDK; then create a corresponding vhost block device from SPDK; then create a node file for each vhost block device; and when running Docker, add the --device parameter to assign the node file to the destination path inside the container. These steps are not very brief, so we would like to use an SPDK Docker volume plugin to simplify the operations. Since data volumes persist data independently of a container's life cycle, the life cycle of a data volume is controlled by a container operator or administrator directly.
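The manual steps above can be sketched as follows. This is a hedged illustration, not the exact procedure from the talk: the command names follow recent SPDK rpc.py conventions, and the socket name, bdev name, node path, and the major number 241 (which Kata's documentation reserves for vhost-user block nodes) are assumptions:

```shell
# 1. Create an SPDK bdev (here a 64 MiB malloc bdev with 512-byte blocks).
./scripts/rpc.py bdev_malloc_create -b Malloc0 64 512

# 2. Expose it as a vhost-user block controller; the vhost target creates
#    the domain socket (e.g. /var/tmp/vhost.0) for this controller.
./scripts/rpc.py vhost_create_blk_controller vhost.0 Malloc0

# 3. Create a fake device node so the OCI spec's major/minor requirement
#    is satisfied even though the real transport is the domain socket.
mknod /dev/vhostblk0 b 241 0

# 4. Pass the node to the Kata container via docker; inside the guest it
#    appears as a regular virtio block device at the destination path.
docker run --runtime=kata-runtime --device=/dev/vhostblk0:/dev/vda -it busybox sh
```

The fake node carries only the major/minor identity; the Kata runtime maps it back to the vhost-user socket and asks the hypervisor to plug in a vhost-user-blk device.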
For example, Docker has a set of docker volume commands to manage data volumes. A data volume is a specially designated directory that bypasses storage driver management. Here we can use the SPDK Docker volume plugin. The first step is to create an SPDK bdev pool inside SPDK. Then we can use the docker volume command set to create, delete, and inspect Docker volumes, and run Docker with the --volume parameter to assign the block volumes to the container.

The next service from the infrastructure we considered is the image and rootfs. The SPDK logical volume provides the basic functionality to meet the container rootfs requirements. For example, the SPDK logical volume has these features: thin provisioning, snapshots and clones, decouple, and inflate. So the SPDK logical volume can work in the same manner as the kernel LVM or device mapper.

How do the SPDK application components operate for the image and rootfs? We can show you in the diagram. First is downloading the container image: on the host, we need to download an image from the image registry to the local host. We create a thin-provisioned bdev from a logical volume and export it to the host via Linux NBD. Then we can unpack the image file into this thin bdev through NBD. After the bdev is prepared with the image, we clone the bdev as the prepared rootfs for container one. At some point during the runtime of container one, we may want to commit container one's content as a second image; then we take a snapshot inside the logical volume. When we want to run container two on top of image two, we can do a clone operation inside the logical volume and take it as the prepared rootfs for container two. Based on this design, we are doing rootfs service pathfinding on containerd. Containerd is a mediator between an orchestrator and a runtime engine.
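The image and rootfs flow above can be sketched with SPDK's rpc.py logical volume commands. The lvstore name, lvol names, size, and NBD device path are illustrative assumptions, and the size argument follows recent SPDK releases (MiB):

```shell
# Create a logical volume store on an existing bdev, then a thin-provisioned lvol.
./scripts/rpc.py bdev_lvol_create_lvstore Malloc0 lvs0
./scripts/rpc.py bdev_lvol_create -l lvs0 -t image1 512   # -t = thin provisioning

# Export the lvol to the host via Linux NBD and unpack the image into it.
./scripts/rpc.py nbd_start_disk lvs0/image1 /dev/nbd0
mkfs.ext4 /dev/nbd0
mount /dev/nbd0 /mnt/image1
# ... unpack the container image layers into /mnt/image1 ...
umount /mnt/image1
./scripts/rpc.py nbd_stop_disk /dev/nbd0

# Snapshot the prepared image, then clone it as a writable rootfs per container.
./scripts/rpc.py bdev_lvol_snapshot lvs0/image1 image1-snap
./scripts/rpc.py bdev_lvol_clone lvs0/image1-snap rootfs-c1
```

Each clone shares unwritten blocks with the snapshot, which is what gives the same copy-on-write behavior as the kernel device mapper path, but entirely inside SPDK.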
Containerd abstracts a snapshotter interface for the different storage drivers like OverlayFS, the device mapper driver, and AUFS. Here we implemented an SPDK storage driver under a snapshotter. The SPDK storage driver works similarly to the kernel device mapper storage driver, but it bypasses the host kernel storage stack when providing the rootfs for running containers. Here we have a link for the evaluation.

Okay, let me give a summary of this presentation. First, the progress we have made. The first item is the vhost-user block solution: it is enabled and upstreamed to the Kata Containers community. The second is the SPDK Docker volume plugin: its PoC code is ready for demo, and you can download it from the link we provided in this presentation. The third is the integration of SPDK into containerd; for this one, the PoC code is also ready for demo. The last one is some SPDK internal improvements; with continuous progress, we will keep improving other modules inside SPDK.

Of course, there are some limitations. First, the vhost-user storage volume cannot be accessed directly on the host. Second, emptyDir and ephemeral storage have not been supported yet. Third, this is all PoC code, so it is not fully functionally validated for now; we still lack large-scale verification of the features we provided. The last one is that it is not ready for CSI, that is, for Kubernetes. Finally, we would like to implement a complete storage infrastructure for secure containers. That is our expectation for future development. That's all.