Hello, my name is Feng Ziming. I work for ByteDance, where I am currently focused on a virtualization project. Today I will share my topic: KVM live upgrade with proper handling of passthrough devices. I will introduce it from four aspects.

With the development of technology, VMs have to be frequently updated and restarted to add security patches and new features. There are two existing live update methods to improve cloud availability: kernel live patching and virtual machine live migration. However, they both have serious drawbacks. Kernel live patching cannot handle complex changes, for example changes to persistent data structures. VM live migration cannot handle passthrough devices, and it may incur unacceptable downtime, because live migration needs to copy memory from the old QEMU to the new QEMU. In this topic, I will introduce our live upgrade method, which can properly update KVM and QEMU without interrupting customer VMs.

There is a difficulty for VM live upgrade: how to handle passthrough devices during the live upgrade while minimizing service downtime is a major concern of cloud providers. In this talk, we will analyze the requirements for passthrough device handling and present how we follow those requirements to properly handle passthrough devices in our KVM live upgrade implementation. We also optimize the startup and the suspend of the VM to decrease downtime during the live upgrade.

This is the framework for live upgrade. In order to live upgrade KVM, we modify the KVM module and allow it to be compiled into multiple modules, named kvm_1, kvm_2, and so on. The specific implementation is that we move most of the KVM module's functionality into a KVM internal module. To load multiple copies of the KVM internal module, we associate all of the KVM module's original global variables with the KVM internal module and make all the global functions local. In Linux, device passthrough is enabled by VFIO.
During the live upgrade, the new QEMU inherits the VFIO container from the old QEMU, and the VM's memory is shared by the new and old QEMU processes. The mapping from GPA to HPA is not changed during the live upgrade, and the IOMMU translation table remains valid. So device DMA operations can continue executing without interruption, even while the VM is stopped.

For passthrough devices, how do we ensure that interrupts are not lost during the live upgrade? This is difficult because the passthrough device is not suspended, so passthrough device interrupts can occur at any time, and we cannot completely copy the passthrough device's interrupt state from the old QEMU to the new QEMU during the live upgrade. There is an existing solution whose core idea is to inject an additional virtual IRQ. First, the new QEMU inherits the VFIO eventfds from the old QEMU. Second, the new QEMU reads from the eventfds and receives the pending interrupts. Last, it injects an additional virtual IRQ into the VM after handing over the device.

In this topic, we use posted-interrupt technology to inject interrupts. The device sets a posted-interrupt request bit in the posted-interrupt descriptor (PID) when it raises an interrupt. So we only need to ensure that the old QEMU and the new QEMU process use the same PID data; we let the PID data be shared between the new QEMU and the old QEMU during the live upgrade.

This picture shows the PID data initialization compared to the original QEMU design. In order to share the PID data with the new QEMU, we allocate memory for the PID structure in QEMU. There are three key points for PID structure initialization in the new QEMU. First, the PID data should not be re-initialized when the new QEMU is initialized. Second, the new QEMU does not need to sync posted-interrupt request data from the old QEMU, because the PID data is shared between the new QEMU and the old QEMU. Last, the new QEMU does not need to update the interrupt remapping table during the live upgrade.
Next, I will introduce how we optimize the VM downtime during the live upgrade. This picture shows the live upgrade flow diagram. In the first step, we fork a child process, execute the new QEMU binary, and the new QEMU is initialized. In the second step, we stop the old QEMU and save its state. In the last step, the new QEMU loads the state from the old QEMU and the VM is started in the new QEMU. It is obvious that the VM downtime consists of the following phases: stopping the old QEMU, saving the old QEMU state, loading the state from the old QEMU, and starting the new QEMU.

When stopping the old QEMU, we find that cleaning up the device eventfds takes a lot of time when devices have multiple queues, for example virtio-net, virtio-blk, and so on. The old QEMU process will be killed after the live upgrade; under the normal QEMU exit logic the eventfds are freed by the QEMU process, but even if QEMU does not free them, the device eventfds will be freed by the system. So the device eventfds need not be cleaned up by the old QEMU process when the live upgrade is successful.

In the normal VM startup logic, the eventfds are first initialized along with device startup, and then the vCPUs are resumed. So the VM downtime includes the initialization of the device eventfds. Inspired by the optimization of VM suspend, we can pre-create the device eventfds during the new QEMU's initialization, because the VM downtime does not include the new QEMU initialization. We can decrease the startup time in this way. Last, we use shared memory to save the old QEMU state, so loading the state in the new QEMU happens concurrently with saving the state in the old QEMU.

We use different vCPU counts and memory sizes to measure the VM downtime under different workloads. We use different benchmark tools to simulate the common use cases of a cloud service, covering computation, memory, and storage: the tools stress, memtester, and fio, respectively.
This picture shows the distribution of VM downtime when the VM is idle: we can see it ranges from 11 milliseconds to 34 milliseconds. Next, we use the stress tool to simulate a computation workload; while running stress in the VM, we upgrade the VM on the host, and we can see the downtime distribution is 12 milliseconds to 34 milliseconds. For memory, we use memtester with 4 GB of memory in the VM, and we observe a downtime distribution of 12 milliseconds to 34 milliseconds. Last, we use fio to simulate a storage workload: we repeatedly write 14 GB to disk and observe a downtime distribution of 12 milliseconds to 38 milliseconds. Based on the above test results, we find that the VM downtime is not closely related to the VM workload.

Okay, that's all. Thank you for listening to my topic. If you have any questions about this topic, please contact me by this email. Thank you.