Hello everybody. This is Jason Wang from Red Hat. Welcome to the KVM Forum. Today I will give a talk about how vDPA is supported in the Linux kernel. Here is the outline. We will first go through the virtio architecture, and then we will introduce a new type of device called the vDPA device. After that we will discuss the design and implementation of the vDPA framework in the Linux kernel, and then we will conclude the presentation, followed by a Q&A session.

So here is the virtio architecture overview. You can see that the virtio architecture can be split into three layers. In the upper layer, virtio defines several different types of devices: for example a networking device, a SCSI device, a block device, and so on. In the middle is the core device model, which can be split into the definitions of the virtqueues, the feature bits, and the config space. The lower layer defines several transports, which tie a device type to an actual bus; the spec defines the PCI transport, the MMIO transport, and the CCW transport.

Several different types of virtio devices have been implemented in software. The first device implementation was done in QEMU, but its performance could not satisfy our requirements. Then the data plane was moved into the kernel through the vhost protocol. It does better than QEMU, but it is still not sufficient. So we offloaded the data plane from QEMU to another dedicated remote process through the vhost-user protocol, and that achieves the best performance a software data plane can ever achieve.

So with the virtio specification and the software device implementations, we get good application usability, we get a unified device driver in the guest, and we even get live migration support. But there are still several drawbacks. The first is that a software implementation requires extra CPU cycles to be spent. It also requires extra management costs for settings such as CPU or memory affinity. And the last, and probably most important, one is that it cannot reach wire speed due to the software overhead, especially now that high-speed networking has become available.

In order to address the limitations of the software virtio implementations, some hardware vendors started to build hardware implementations of virtio, which means the device is fully compatible with the virtio specification in both the control path and the data path. Then it can reach wire speed, and there is no CPU overhead since the whole data plane has been offloaded to the hardware. The unified device model is preserved, and there is no vendor lock-in.

But there are several issues with a hardware virtio implementation. The first issue is that the current virtio is not designed to be virtualized: there is no API for saving and restoring the device state in the current virtio specification. This means that if you just expose a raw virtio device to the guest, it cannot be live migrated. The second issue is that virtio is sane and simple, which means it is hard to integrate with an existing hardware stack; modern real hardware is much more complicated. For example, a real NIC usually has an embedded switch, which means the vendor may have their own specific API to configure and program that switch. All of those APIs are missing from the virtio specification. Being fully compatible in the control path also means a redesign of the control path, which tends to be a challenge, and it is very hard to add vendor-specific value, which really requires vendor-specific extensions.
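As a rough illustration of the feature bits and config-space handshake that the core device model defines, here is a minimal, self-contained sketch. The status bit values follow the virtio specification, but the "device" is just a toy in-memory struct rather than a real PCI, MMIO, or CCW transport, so treat it purely as an illustration of the negotiation flow.

    /* Toy sketch of the virtio status/feature handshake defined by the spec. */
    #include <stdint.h>
    #include <stdio.h>

    #define VIRTIO_CONFIG_S_ACKNOWLEDGE 1   /* guest noticed the device */
    #define VIRTIO_CONFIG_S_DRIVER      2   /* guest has a driver for it */
    #define VIRTIO_CONFIG_S_DRIVER_OK   4   /* driver is ready, device is live */
    #define VIRTIO_CONFIG_S_FEATURES_OK 8   /* feature negotiation finished */

    struct toy_device {
        uint64_t device_features;   /* feature bits the device offers */
        uint64_t driver_features;   /* feature bits the driver acknowledged */
        uint8_t  status;
    };

    static int virtio_handshake(struct toy_device *dev, uint64_t driver_supported)
    {
        dev->status = 0;                                   /* reset */
        dev->status |= VIRTIO_CONFIG_S_ACKNOWLEDGE;
        dev->status |= VIRTIO_CONFIG_S_DRIVER;

        /* Feature negotiation: accept only the bits both sides understand. */
        dev->driver_features = dev->device_features & driver_supported;

        dev->status |= VIRTIO_CONFIG_S_FEATURES_OK;
        if (!(dev->status & VIRTIO_CONFIG_S_FEATURES_OK))
            return -1;          /* a real device may clear the bit to reject */

        /* ... virtqueue setup through the transport would happen here ... */
        dev->status |= VIRTIO_CONFIG_S_DRIVER_OK;
        return 0;
    }

    int main(void)
    {
        struct toy_device dev = { .device_features = 0x13 };
        if (virtio_handshake(&dev, 0x3) == 0)
            printf("negotiated features: 0x%llx\n",
                   (unsigned long long)dev.driver_features);
        return 0;
    }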
And the manageability is also bad, since there is no management API defined in the virtio specification. So in order to address all of those limitations of hardware virtio implementations, we introduce the concept of vDPA.

So what is vDPA? Originally it was short for vhost data path acceleration, but then we realized that vhost is just one transport for virtio data path offload, so we changed the definition to virtio data path acceleration. A vDPA device is basically a kind of device that has a virtio-compatible data path, which is defined by the virtio spec, but allows a vendor-specific control path. It is required that such a control path be functionally equivalent to, or a subset of, the virtio control path defined in the spec. In order to support live migration, the vDPA device is also required to support part of the vhost features, such as device state recovery or dirty page tracking. Note that dirty page tracking is not a must.

So from the hardware perspective, a vDPA device is a functional superset of a virtio device: it contains the virtqueues, the virtio features, the vhost features, and a vendor-specific config method which is functionally equivalent to the virtio config space. Beyond that, it is allowed to add vendor-specific features on top.

So why do we need vDPA? You can see that we gain almost every advantage of the hardware virtio implementations: for example, it has a unified data path with an open standard, and it can reach wire speed. Besides those advantages, we also gain more. We gain live migration support by supporting vhost features like device state recovery, and vendors are allowed to add their own features on top.

But if we just expose the raw vDPA device to the software, there are still several gaps for the end-to-end delivery. For example, do we really want to expose all the complexity and differences of the raw vDPA device to the upper layer? Do we want to integrate the vDPA device with the existing subsystems, or reinvent the wheel with a new dedicated subsystem? How about manageability: is there a vendor-specific management API, or could it be a unified one? And how about the drivers, heavyweight or lightweight?

So in order to answer all of those questions and concerns, we introduce the vDPA kernel framework. The main goal is to bridge the usability and manageability gap of the raw vDPA device. It is basically a framework with the following required features. First, it should hide the complexity and differences of the underlying devices and present a simple, sane device API to the upper layer. It also tries to present unified manageability to simplify the task of the upper-layer stack. It will try its best to integrate seamlessly with the existing subsystems, which means it will try its best to reuse code in both kernel and user space applications. The framework should not be designed for user space drivers only; it should serve both kernel and user space drivers. The framework should be designed to be bus and device agnostic, which means it allows any type of device, such as a non-PCI device, an FPGA device, or even a software-emulated device. And it will try to keep the drivers as lightweight as possible.

So here is the overview of the vDPA framework. You can see that at the hardware level there can be several types of vDPA devices, which all connect to the vDPA framework. Towards the upper layers, you can choose to connect the vDPA device to both the vhost subsystem and the virtio drivers.
So when connecting to the vhost subsystem, it presents a vhost device and lets applications use the vhost uAPI to control the device as if it were a vhost device. And when you connect to the virtio drivers, a kernel-visible virtio interface is presented to the kernel I/O subsystems, so the applications can choose any of the uAPIs supported by the kernel I/O subsystems to control the vDPA device as if it were a virtio device. The framework also tries to present a unified management API for the management applications.

So as discussed, the framework tries to allow several different types of devices and different types of drivers, so it is natural to consider introducing a bus. The vDPA bus is the core concept of the vDPA framework for abstracting the hardware; it allows different vDPA devices and drivers to be attached. The vDPA bus also defines the communication protocol between the bus driver and the device. This communication protocol is a set of callbacks called the vDPA config operations. A vDPA device is the device abstraction provided by the vDPA parent device driver, which has the common attributes of a vDPA device as well as the implementation of the vDPA config operations.

On top of the vDPA bus, several different types of vDPA bus drivers are allowed to be attached. Their task is to connect the vDPA device to the existing kernel subsystems and to use the vDPA config operations to talk with the vDPA device. So from the point of view of a vDPA bus driver, you only see the vDPA device and the vDPA config operations; all the complexity and differences are hidden behind the vDPA device abstraction and the bus.

So the vDPA device is the common abstraction, and the vDPA parent is the module that provides such an abstraction. It needs to provide the common attributes and to implement the config operations of the vDPA bus. The config operations usually contain several different categories. For example, they contain the virtio-specific operations such as virtqueue attribute setting, device status, feature negotiation, and so on. They also contain interrupt and doorbell acceleration methods for fast access to the interrupt and the doorbell. In order to be more generic, they also contain DMA map and unmap methods, which can be very convenient for devices that have an on-chip IOMMU or sophisticated DMA mapping logic. And they contain the vhost operations for device state recovery and dirty page tracking.

The parent can be of any type. For example, it can be a real parent device driver that talks to the vDPA device directly, or an intermediate layer on top of another device driver framework, or even a software-emulated vDPA device, or a proxy or relay of the vDPA protocol to some other module or even to user space.

So several different types of vDPA bus drivers are allowed to be attached. We will first talk about the vhost-vdpa bus driver. This bus driver is used to present a vhost device to the vhost subsystem, so it mainly serves user space virtio drivers. For example, it can serve as the QEMU vhost backend, for QEMU to present a virtio data path to the VM, or it can serve the DPDK virtio PMD for the NFV use case. The idea is to reuse as much of the generic vhost uAPI as possible for the data path setup, but it also requires some dedicated uAPI extensions for the full device abstraction, which are missing in the generic vhost uAPI.
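To make the config operations more concrete, the sketch below condenses the categories just described (virtqueue attributes, status and features, doorbell, DMA map/unmap, and virtqueue state for migration) into one illustrative struct. The real struct vdpa_config_ops in include/linux/vdpa.h has more callbacks and somewhat different prototypes, so the member names and signatures here are only an approximation of the idea, not the upstream definition.

    /* Condensed, illustrative view of the callback set a vDPA parent implements. */
    #include <stdint.h>

    struct vdpa_device;                       /* the common device abstraction */

    struct vdpa_config_ops_sketch {
        /* virtio-specific operations: virtqueue attributes, status, features */
        int  (*set_vq_address)(struct vdpa_device *vdev, uint16_t idx,
                               uint64_t desc, uint64_t driver, uint64_t device);
        void (*set_vq_num)(struct vdpa_device *vdev, uint16_t idx, uint32_t num);
        void (*kick_vq)(struct vdpa_device *vdev, uint16_t idx);   /* doorbell */
        uint64_t (*get_features)(struct vdpa_device *vdev);
        int  (*set_features)(struct vdpa_device *vdev, uint64_t features);
        uint8_t (*get_status)(struct vdpa_device *vdev);
        void (*set_status)(struct vdpa_device *vdev, uint8_t status);
        void (*get_config)(struct vdpa_device *vdev, unsigned int offset,
                           void *buf, unsigned int len);

        /* DMA mapping hooks, useful for devices with an on-chip IOMMU */
        int  (*dma_map)(struct vdpa_device *vdev, uint64_t iova, uint64_t size,
                        uint64_t pa, uint32_t perm);
        int  (*dma_unmap)(struct vdpa_device *vdev, uint64_t iova, uint64_t size);

        /* vhost-style operations for migration: virtqueue state save/restore */
        int  (*set_vq_state)(struct vdpa_device *vdev, uint16_t idx, uint64_t state);
        int  (*get_vq_state)(struct vdpa_device *vdev, uint16_t idx, uint64_t *state);
    };

A bus driver such as vhost-vdpa or virtio-vdpa only ever talks to the device through callbacks of this kind, which is what keeps the bus drivers lightweight and vendor-agnostic.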
So those uAPIs usually contain things like config space access, device state get and set, config interrupt, and so on. The traditional vhost user space applications only need very minimal changes; then they can use the vhost-vdpa bus driver to control the vDPA device as if it were a virtio device.

The second bus driver we provide is the virtio-vdpa bus driver. Its goal is to present a virtio device on the virtio bus. This pseudo or proxy virtio device can then be probed by the virtio drivers, so the virtio device becomes visible to the kernel I/O subsystems. This is done by introducing a new vDPA transport for the virtio bus. The kernel I/O subsystems can then use vDPA devices as if they were virtio devices, which means applications can use, for example, the TCP/IP stack, the storage stack, io_uring, XDP, or any other kernel I/O subsystem to transfer data between themselves and the vDPA device. The main use case for the virtio-vdpa device is bare-metal applications or containerized applications.

For the management API, we will introduce a dedicated vDPA-specific netlink protocol for vDPA device management. It mainly contains life cycle management: to create, destroy, enable, and disable the vDPA device, and also to set its attributes or provision it. The idea is to introduce a new vdpa tool that will be integrated into iproute2, and the management applications will then use either this tool or the netlink protocol directly as a unified configuration interface. All vDPA parent devices are required to implement the vDPA netlink protocol.

The current Linux kernel supports three vDPA parents. The first is the Intel IFCVF. From the hardware perspective, it is a virtio device plus Intel-specific extensions, and the vDPA is implemented through a dedicated VF, so its parent driver is simply a PCI VF device driver. The second vDPA parent is the Mellanox mlx5 vDPA parent. This device is also implemented in a dedicated VF, but it has a totally vendor-specific control path; this parent is an intermediate layer on top of the existing mlx5 core module. The third parent is the vDPA simulator, which is basically used for device testing, feature prototyping, and so on; this parent is implemented purely through software emulation. And we are working with vendors on more types of vDPA parents, such as an ADI that complies with the Intel Scalable IOV specification, vendor-specific devices such as the sub-function, or even a PCIe endpoint device, which means the vDPA is implemented via a remote SoC.

So here is basically the status of the vDPA support in the current kernel. Basic functions such as the vDPA core, the bus drivers, and the three vDPA parents have been merged, and the basic QEMU support has been merged into QEMU. We are working on, for example, the netlink-based management API, which will be posted soon, and the live migration support is also under development. For live migration, we will probably start from software-assisted live migration first. This means it does not require any device support for dirty page tracking; QEMU will assist the device with the dirty page tracking. After that, we will try to introduce new APIs for supporting dirty page tracking in hardware. You can also see that the control virtqueue work is being developed upstream, and we are working with vendors to make sure that the framework and the drivers can work for devices other than networking; we will probably start from the block device.
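As a rough userspace illustration of the vhost-vdpa path and the uAPI extensions mentioned at the start of this section, the snippet below opens a vhost-vdpa character device and issues a few generic vhost and vDPA-specific ioctls from linux/vhost.h. The device node name follows the usual vhost-vdpa naming but depends on the system, error handling is omitted, and a real backend such as QEMU or DPDK would go on to program the virtqueues, eventfds, and status.

    /* Minimal sketch of driving a vDPA device through /dev/vhost-vdpa-N. */
    #include <stdio.h>
    #include <stdint.h>
    #include <fcntl.h>
    #include <unistd.h>
    #include <sys/ioctl.h>
    #include <linux/vhost.h>        /* VHOST_* and VHOST_VDPA_* ioctls */

    int main(void)
    {
        int fd = open("/dev/vhost-vdpa-0", O_RDWR);  /* node created by the vhost-vdpa bus driver */
        if (fd < 0)
            return 1;

        ioctl(fd, VHOST_SET_OWNER);                  /* generic vhost uAPI */

        uint32_t device_id;
        ioctl(fd, VHOST_VDPA_GET_DEVICE_ID, &device_id);  /* vDPA-specific extension */
        printf("virtio device id: %u\n", device_id);

        uint64_t features;
        ioctl(fd, VHOST_GET_FEATURES, &features);    /* features offered by the device */
        /* A real driver would clear any bits it does not support before acking. */
        ioctl(fd, VHOST_SET_FEATURES, &features);

        /* Virtqueue addresses, kick/call eventfds and the device status would be
         * programmed next with VHOST_SET_VRING_* and VHOST_VDPA_SET_STATUS. */
        close(fd);
        return 0;
    }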
And for the future, there are several things in our minds. The first is that we need to finalize the documentation in the kernel source, which covers both the vDPA device API definitions and the vhost-vdpa uAPI. We also plan to cooperate with the platform vendors to support shared virtual address (SVA) or even virtual shared virtual address (vSVA). And we plan to extend the virtio specification with some vDPA-specific extensions.

Okay, so let's conclude this presentation. First, we introduced a new type of device called the vDPA device. The vDPA device has a virtio data path with a vendor-specific control path and vhost features. We introduced the vDPA framework in the Linux kernel, which tries to hide the differences and complexity of the different types of vDPA devices and present a unified device and management API to the upper layer. The vDPA framework contains the vDPA bus and the vDPA device for abstracting the device, and it allows different types of vDPA bus drivers for connecting the vDPA device to various kernel subsystems. We support both vhost and virtio drivers, to let the vDPA device be used by both kernel virtio drivers and user space virtio drivers.

With the help of both the vDPA device and the vDPA framework, we can achieve wire-speed virtio with the best usability and manageability. There is no vendor lock-in, and there will be live migration support, or even cross-vendor live migration. A unified management interface is provided to ease the task of the management stack. And we get a mature software stack in both host and guest, since we present a virtio or vhost device to the upper layer.

So here are some good references. The first is the related presentation given by Steve at a previous KVM Forum. The second is two blog series written by us, which cover almost every aspect of both virtio and vDPA; they also contain several deep dives into the vDPA kernel framework and its typical use cases. You are welcome to go through those blog series and give us feedback. It is also useful to review and have a look at the virtio specification. And if you want to ask about, or hear, what is being developed recently on the vDPA side, you are welcome to subscribe to the vDPA development mailing list.

So vDPA has come to real life; it is not just a concept on paper. You are welcome to consider deploying vDPA devices in your cloud, or building vDPA-based hardware, and you are also welcome to test and contribute to the vDPA framework. Please contact us if you have any questions, for example about hardware design, driver implementation, deployment and management issues, or even feature requests. You can drop a mail to either the virtio-networking mailing list or a private mail to me. That's it. Thanks.