Hi, I am Michael from ARM. Today, Henry and I will give you a presentation on Cloud Hypervisor, a new choice of virtual machine monitor. Cloud Hypervisor is an open source project started by Intel. It is implemented in the Rust language and is based on the rust-vmm project. In the first half of this presentation, I will introduce the development of Rust-based virtual machine monitors and rust-vmm, the architecture of Cloud Hypervisor, its virtio devices, and how it works with Kata Containers. After that, I will hand over to Henry, who will continue with the progress of Cloud Hypervisor on ARM and give a live demo to show how everything works.

The first well-known virtual machine monitor written in Rust is crosvm. It was published in April 2017 by Google. crosvm is based on KVM and is designed to run on Chrome OS. Later, in October of the same year, the Firecracker project started. In the very beginning, Firecracker was forked from crosvm, but it soon diverged, because Firecracker aimed at a different use case: lightweight virtual machines, or micro-VMs. Micro-VMs are used for running multi-tenant containers and microservices. As the two projects have some functionality and code in common, duplicated effort existed: code was frequently ported from one project to the other, and much effort went into testing and reviewing. How could this be improved? In December 2018, developers discussed ways to share the code and finally agreed to start the rust-vmm project. rust-vmm shares the common components for building VMMs from the two projects. After that, in May 2019, Intel announced a new VMM project, Cloud Hypervisor, which is based on rust-vmm.

rust-vmm is a collection of Rust crates. A crate is a building block in Rust, like a library in C. These crates are well designed and tested. To build a new VMM, you only need to write the glue code that combines and customizes these crates. Here are some examples of the crates: linux-loader is used for loading the guest kernel, vm-memory is for managing guest memory, and so on. In the future, building your own virtual machine monitor should be further simplified by using vmm-reference. It is a new rust-vmm crate introduced several months ago: a reference implementation on top of the other crates. It aims to be convenient to extend and customize, so you can choose to start your own VMM by extending this reference.

Here is a diagram showing the major components of the Cloud Hypervisor stack. Let me introduce them from bottom to top and from left to right. On top of the host Linux kernel sits rust-vmm, and on top of that sits Cloud Hypervisor. Cloud Hypervisor is based on KVM, but KVM will not be the only choice in the future. In the second half of this year, Microsoft joined the community and began to extend Cloud Hypervisor to Hyper-V, so a new crate was designed to wrap the low-level hypervisor details: the hypervisor crate. It wraps the details of KVM and provides a unified interface to the upper-layer components. Architecture-specific things are placed in the arch crate; for ARM64, the code handling the GIC and the FDT lives here. The vmm crate is the core component of Cloud Hypervisor. I mentioned that on top of rust-vmm, a virtual machine monitor should only contain customizing code; in Cloud Hypervisor, that customizing part is the vmm crate. It manages the devices, memory, CPUs, and interrupts of a guest VM. vm-allocator is a resource-management component; it is in charge of allocating and deallocating memory addresses, I/O addresses, and IRQ numbers.
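To make the "glue code on top of rust-vmm crates" idea a bit more concrete, here is a minimal, hypothetical sketch that combines the kvm-ioctls, kvm-bindings, and vm-memory crates to create a KVM virtual machine, register guest memory, and create one vCPU. It is not Cloud Hypervisor code, and the crate APIs evolve over time, so treat the exact calls as approximate.

```rust
// Hypothetical glue code combining rust-vmm crates (kvm-ioctls, kvm-bindings,
// vm-memory). Illustrative only; a real VMM adds kernel loading, device
// emulation, and a vCPU run loop on top of this.
use kvm_bindings::kvm_userspace_memory_region;
use kvm_ioctls::Kvm;
use vm_memory::{GuestAddress, GuestMemory, GuestMemoryMmap, GuestMemoryRegion};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Open /dev/kvm and create a VM with the kvm-ioctls crate.
    let kvm = Kvm::new()?;
    let vm = kvm.create_vm()?;

    // Back the guest with 64 MiB of RAM managed by the vm-memory crate,
    // starting at 2 GiB (the DRAM start used in the AArch64 layout later on).
    let guest_mem: GuestMemoryMmap =
        GuestMemoryMmap::from_ranges(&[(GuestAddress(0x8000_0000), 64 << 20)])?;

    // Register every memory region with KVM.
    for (slot, region) in guest_mem.iter().enumerate() {
        let mr = kvm_userspace_memory_region {
            slot: slot as u32,
            guest_phys_addr: region.start_addr().raw_value(),
            memory_size: region.len(),
            userspace_addr: guest_mem.get_host_address(region.start_addr())? as u64,
            flags: 0,
        };
        // Safety: the mapping stays alive for the lifetime of the VM.
        unsafe { vm.set_user_memory_region(mr)? };
    }

    // Create one vCPU. A real VMM would now load a kernel with linux-loader,
    // build the FDT, set up registers, and enter the vcpu.run() loop.
    let _vcpu = vm.create_vcpu(0)?;
    println!("VM created with {} memory region(s)", guest_mem.num_regions());
    Ok(())
}
```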
The last part is the devices area. It contains some crates for the virtio and vhost-user devices. Ideally, the virtio code should be designed to be implementation agnostic. The vision is to have a unified implementation in rust-vmm, but rust-vmm is not ready in this part yet. In the future, the virtio code in Cloud Hypervisor should be replaced by the aligned rust-vmm crates.

The conventional choice for device emulation would be QEMU, but QEMU is heavyweight: it emulates a large number of devices, including some old ones. Cloud Hypervisor focuses on cloud workloads only, so it supports a limited set of paravirtualized devices. As a VMM running in cloud computing centers, it does not care much about legacy devices. virtio is the major type of Cloud Hypervisor device. So far it supports many virtio device types, including virtio-block, balloon, console, IOMMU, networking, persistent memory, random number generator, and vsock. Some kinds of vhost-user devices are available as well.

This diagram depicts the workflow of virtio emulation in Cloud Hypervisor in general. The vm-virtio crate handles the core model of the protocol, and the various device types are emulated in the virtio devices crate. In the vmm crate, the device manager manages all the devices. Notifications from the virtio drivers in the guest kernel are carried by ioeventfds, which are handled by the epoll helper. On the return trip, the interrupt manager can inject interrupts into the guest via irqfds.

Working as a container runtime is an important use case of lightweight VMs nowadays, and Cloud Hypervisor can work with Kata Containers to achieve that. Kata Containers is an open source community working to build a secure container runtime with lightweight virtual machines. By using Kata, a user feels like they are using a container, but in fact the workload is isolated by a VM; this way, an additional layer of defense is built. In a pure container stack, typically runc sits under the runtime layer, but with Kata, it is a VM behind the runtime. The right half of this diagram is where Cloud Hypervisor works: it serves to boot the virtual machines in which the Kata agent is installed, and the real container workload runs inside the VM. That is all the slides for me. Now I will hand over to Henry for the rest of the presentation. Thank you for watching.

Thanks, Michael, for the wonderful sharing. From here, I am going to talk about Cloud Hypervisor on ARM. Currently, our contribution to the AArch64 platform can be divided into three parts. The first one is enabling Cloud Hypervisor on the AArch64 platform, where we implemented the guest VM memory layout, the architecture-specific registers, and the devices. We also created a Rust foreign function interface to the libfdt library to implement the flattened device tree (FDT); this device tree is used when booting the VM. Our second contribution is the AArch64 test infrastructure. This part firstly includes the enablement of the development container, as every development and test script is executed in the container. After Cloud Hypervisor was enabled on ARM, we then added the AArch64-specific checks as well as the unit and integration tests to the community CI. Based on this, our last key contribution to the community was some feature enablement, including VM snapshot and restore on ARM. Details about these contributions will be discussed in the next few slides.
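As a side note on the libfdt foreign function interface mentioned above: the sketch below shows, in a simplified and hypothetical form, what calling C libfdt functions from Rust looks like and why it requires unsafe blocks. This is the pain point behind the pure-Rust FDT crate discussed later in the future-work part.

```rust
// A hypothetical, simplified illustration of the FFI approach: driving
// libfdt's sequential-write API from Rust (link against libfdt to build).
// Every call crosses into C through raw pointers, so it has to sit in an
// `unsafe` block; this is the main motivation for a pure-Rust FDT crate.
use std::os::raw::{c_char, c_int, c_void};

#[link(name = "fdt")]
extern "C" {
    fn fdt_create(buf: *mut c_void, bufsize: c_int) -> c_int;
    fn fdt_finish_reservemap(fdt: *mut c_void) -> c_int;
    fn fdt_begin_node(fdt: *mut c_void, name: *const c_char) -> c_int;
    fn fdt_end_node(fdt: *mut c_void) -> c_int;
    fn fdt_finish(fdt: *mut c_void) -> c_int;
}

fn main() {
    let mut buf = vec![0u8; 4096];
    let root = b"\0"; // the root node has an empty, NUL-terminated name

    unsafe {
        let fdt = buf.as_mut_ptr() as *mut c_void;
        assert_eq!(fdt_create(fdt, buf.len() as c_int), 0);
        assert_eq!(fdt_finish_reservemap(fdt), 0);
        assert_eq!(fdt_begin_node(fdt, root.as_ptr() as *const c_char), 0);
        assert_eq!(fdt_end_node(fdt), 0);
        assert_eq!(fdt_finish(fdt), 0);
    }
    println!("built a minimal (empty) device tree blob");
}
```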
This slide shows the design of the AArch64 guest memory layout. One of the most important functions of this layout is to provide memory sections from which each kind of device is allocated, and within each section, devices are allocated from higher memory to lower memory. From bottom to top, the first 144 MiB is reserved for the GIC devices. The next region, from 144 MiB to 256 MiB, is reserved for the legacy devices, including the serial console and the RTC. The memory from 256 MiB to 1 GiB is the first part of the PCI MMIO space, where the PCI devices that only support 32-bit addressing are allocated. The first 256 MiB from 1 GiB to 2 GiB is for the PCI MMCONFIG space, and the rest is reserved for future use. Memory from 2 GiB up to the maximum physical address is for DRAM and the second part of the PCI MMIO space. Since we allocate devices from higher to lower addresses, the DRAM has a dynamic size, which depends on the number of high-memory PCI MMIO devices.

In this slide, I am going to introduce the AArch64-specific implementations that we added to Cloud Hypervisor. The code base for this part originally came from the Firecracker project; we modified it to fit the requirements of the Cloud Hypervisor project. The first part of the AArch64-specific implementations is the AArch64 registers, and this part is based on the kvm-bindings crate from rust-vmm. Similarly to the Linux kernel, we divided the AArch64 registers into two parts, namely the core registers and the system registers. The key devices for AArch64 are the vGIC device, which manages the interrupts, and the RTC device, which provides the clock. The implementation of the vGIC devices is based on the KVM ioctls; GICv2, GICv3, and GICv3-ITS were implemented respectively. The RTC device is implemented through software emulation.

The VM snapshot and restore feature has been merged into the master branch recently. It is based on the existing x86 implementation, with some modifications to fit the ARM platform. The snapshot and restore process can be summarized in a single figure, where the orange parts are the steps needed on the x86 platform, the blue parts are the steps needed on the ARM platform, and the black parts are the common steps for both platforms. In order to save the state of the VM, we need to save each component of the VM in a specific order. In the current implementation on the AArch64 platform, we save the state following the order of the CPU manager, the memory manager, and the device manager. The CPU state that needs to be saved consists of the core registers, the system registers, the KVM MP state, and the MPIDR registers, which are different from those on the x86 platform. Also, a crucial step for saving a VM on the AArch64 platform is saving the state of the vGIC device, namely saving the distributor, redistributor, and ICC registers, as well as the vGIC control (CTLR) registers. This step has to be executed between saving the memory manager and saving the device manager. The VM restore follows the order of the memory manager, the CPU manager, and the device manager. One thing that needs to be noted is that previously, at the step of restoring the CPU manager, the vCPUs were started directly after their states were restored. However, on AArch64, the vGIC is required to be created and restored before the vCPUs are started. Therefore, the original design was refactored: the start of the vCPUs was split into a separate step, and the restore of the vGIC was inserted before the vCPUs are started.
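Going back to the guest memory layout slide for a moment, the regions described there can be written down roughly as the constants below. The names and boundaries are a simplified, hypothetical rendering for illustration; the authoritative definitions live in Cloud Hypervisor's arch crate.

```rust
// A simplified, hypothetical rendering of the AArch64 guest memory layout
// described above (not the actual constants from Cloud Hypervisor's arch crate).
const MIB: u64 = 1 << 20;
const GIB: u64 = 1 << 30;

const GIC_REGION_START: u64 = 0;            // 0 .. 144 MiB: GIC devices
const LEGACY_REGION_START: u64 = 144 * MIB; // 144 MiB .. 256 MiB: serial, RTC
const PCI_MMIO32_START: u64 = 256 * MIB;    // 256 MiB .. 1 GiB: 32-bit PCI MMIO
const PCI_MMCONFIG_START: u64 = GIB;        // 1 GiB .. 1 GiB + 256 MiB: PCI MMCONFIG
                                            // (the rest up to 2 GiB is reserved)
const RAM_START: u64 = 2 * GIB;             // 2 GiB ..: DRAM, with 64-bit PCI MMIO
                                            // allocated downwards from the top

fn main() {
    for (name, addr) in [
        ("GIC devices", GIC_REGION_START),
        ("legacy devices (serial, RTC)", LEGACY_REGION_START),
        ("32-bit PCI MMIO", PCI_MMIO32_START),
        ("PCI MMCONFIG", PCI_MMCONFIG_START),
        ("guest RAM / 64-bit PCI MMIO", RAM_START),
    ] {
        println!("{name:<30} starts at {addr:#12x}");
    }
}
```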
So the previous slides have summarized the work we have done so far. Future work on our platform includes some code redesign and feature parity on AArch64. Currently, the code that generates the FDT is based on a Rust foreign function interface, which depends on a C library. Calling C functions through the Rust FFI requires unsafe blocks, which is not preferred. To address this problem, both the Cloud Hypervisor and rust-vmm communities have suggested that we could implement a dedicated FDT crate in Rust, and any design comments are highly welcome on the GitHub issue. The feature parity on AArch64 covers the improvement of the VM snapshot and restore support, namely save and restore for the GICv3-ITS, as well as some other features such as ACPI, UEFI, VFIO, and CPU and device hotplug support.

Now I am going to do some demos about the basic use of Cloud Hypervisor and about VM snapshot and restore with Cloud Hypervisor. Here we open a terminal. This machine has Ubuntu Bionic pre-installed. In this demo, we assume that the essential packages, including git, build-essential, and libfdt, as well as the Rust toolchain, have already been installed on this machine. The current stable release of the Rust toolchain is enough for this demo. Before the demo, I prepared the kernel and the disk image file for the guest VM; for the guest VM, we use Ubuntu Focal.

In the beginning, we first verify the machine architecture using uname -m. Now we can see that we are on an AArch64 host. The next step is getting the Cloud Hypervisor source code; here we clone the current master branch of Cloud Hypervisor. We go to the source code directory and build the source code using cargo. Here we build the current master branch, and we use KVM for this demo. Now the code is building; it will take about one minute to build the binary. The Cloud Hypervisor binary will be placed in the target/debug directory.

After the binary is built, we can start the guest VM. Here I will give a brief introduction to the command we will use. The --api-socket option tells Cloud Hypervisor to create a socket file for other processes to connect to. The --kernel option provides the kernel and the --disk option provides the disk file for the guest VM. The --cmdline option passes the command line to the guest kernel. The --cpus and --memory options determine the guest vCPU count and memory size. Now we run this command, and we can see that our guest VM has been started. After the guest VM has started, we can log into it and verify the number of CPUs and the size of memory. We can see that the CPU number is two, which is correct. We can also verify the memory size, which is correct as well.

In order to take a snapshot of this VM, we need to open a new terminal. Here, we will use the ch-remote binary as a tool to connect to the socket file we have just created, and we can send commands through the socket to the Cloud Hypervisor API server. To take a snapshot, we first need to pause the VM. Now we can see that the VM has been paused. Next, we make a directory to store the snapshot data and take a snapshot of this VM. After all of this, we can resume the VM. Now we can see that the VM has been resumed, and we can now power off this VM. To restore the VM from the snapshot files we have just created, we first start the Cloud Hypervisor API server and restore the VM from these files. Now we go to our second terminal and resume this VM. We can see that the VM has been resumed, and this is the exact VM we have just powered off. Thanks for watching.
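For reference, the commands used in this demo look roughly like the following. The kernel and image paths are placeholders, and the exact flags and ch-remote subcommands can differ between Cloud Hypervisor releases, so check the --help output of the version you build.

```sh
# Build Cloud Hypervisor from the master branch (KVM backend).
git clone https://github.com/cloud-hypervisor/cloud-hypervisor.git
cd cloud-hypervisor
cargo build

# Start the guest VM (kernel/disk paths are placeholders).
./target/debug/cloud-hypervisor \
    --api-socket /tmp/cloud-hypervisor.sock \
    --kernel ./Image \
    --disk path=./focal-server-cloudimg-arm64.raw \
    --cmdline "root=/dev/vda1 console=hvc0" \
    --cpus boot=2 \
    --memory size=1024M

# From a second terminal: pause the VM, snapshot it, then resume it.
./target/debug/ch-remote --api-socket /tmp/cloud-hypervisor.sock pause
mkdir -p /tmp/snapshot
./target/debug/ch-remote --api-socket /tmp/cloud-hypervisor.sock snapshot file:///tmp/snapshot
./target/debug/ch-remote --api-socket /tmp/cloud-hypervisor.sock resume

# Restore: start an API server with no VM, restore from the snapshot, then resume.
./target/debug/cloud-hypervisor --api-socket /tmp/restore.sock
./target/debug/ch-remote --api-socket /tmp/restore.sock restore source_url=file:///tmp/snapshot
./target/debug/ch-remote --api-socket /tmp/restore.sock resume
```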