Hello, my name is Yan Vugenfirer and I'm the CEO at Daynix and a contractor at Red Hat. Here with me is Annie Li, Principal Software Engineer at Oracle. Together we are going to talk about implementing SR-IOV failover for Windows guests during migration. During our presentation we are going to discuss the virtio-win drivers in general and give you an overview of them. We will cover some Windows guest terminology for the people that are less familiar with it, discuss the problem with SR-IOV migration, give a short overview of different solutions, and then discuss our solution.

So, let's talk about the virtio-win drivers first. Here we have a link to GitHub, where the virtio-win repository lives, and in this repository you can find all the major virtio drivers: we have virtio-net, virtio-blk, virtio-scsi and other drivers. In this repository there are also some guest drivers that are not related to virtio, like the pvpanic driver, and also some INF files that help define the system, for example the SMBus INF for the Q35 chipset or the QEMU PCI serial INF.

So, what are those drivers and how are they built? The network and the storage drivers are built as what the Microsoft architecture calls miniport drivers, each on top of its respective driver technology: NDIS for networking, and StorPort or SCSI port for storage. The other drivers are WDF drivers; WDF is a Microsoft framework that allows you to easily write kernel drivers.

What are the supported OSes? We support all the OSes starting from Windows XP up to Windows 10, and the same for Windows Server, from 2003 to 2019. For Windows 10 we also support the ARM64 platform.

How can you contribute? Please send pull requests. The code changes go through Microsoft certification, but don't worry, we are running CI on the upstream, so you are covered for that.

And who have been the contributors to the project over the years? The main contributors come from Red Hat, but we have also had contributions from Virtuozzo, Oracle, Google, Microsoft, AWS and others.

So now let's talk about Windows terms and how the network driver architecture looks in Windows. You'll hear the term NDIS a lot during this presentation. So what is NDIS? It's the Network Driver Interface Specification. It's also the ABI for network drivers, and the architecture of the network drivers in the kernel, and there is ndis.sys, a Microsoft driver that you can see in the Windows kernel and that implements part of the NDIS functionality.

If you go from bottom to top, at the bottom you see the hardware devices, and on top of them there is a miniport driver, usually supplied by the vendor, which drives the specific device. In the simple case, which you can see on the right side, the protocol drivers are bound directly on top of the miniport driver. The more complicated case is when we also have an intermediate driver, and then the protocol drivers bind to the intermediate driver. Towards the miniport driver, an intermediate driver exposes the ABI of a protocol driver, so the miniport thinks a protocol driver sits on top of it; towards the protocol drivers, it exposes a miniport ABI, so the protocol drivers think they are talking to a miniport driver. Why do we need such a thing? One example is the MUX driver, which can sit on top of several miniport drivers and present one virtual NIC to the upper layers.
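To make the miniport/protocol split a bit more concrete, here is a minimal sketch, assuming an NDIS 6.x protocol driver, of how such a driver registers itself and gets told about adapters it can bind to. The "Sample" names are illustrative; most handlers and all error handling are omitted.

```c
// Minimal sketch of an NDIS 6.x protocol driver registration ("Sample*" names
// are illustrative; most handlers and all error handling are omitted).
#include <ndis.h>

static NDIS_HANDLE ProtocolHandle;

// Called by NDIS for every adapter whose binding interfaces match ours;
// a real driver would open the adapter with NdisOpenAdapterEx here.
static NDIS_STATUS SampleBindAdapterEx(NDIS_HANDLE ProtocolDriverContext,
                                       NDIS_HANDLE BindContext,
                                       PNDIS_BIND_PARAMETERS BindParameters)
{
    UNREFERENCED_PARAMETER(ProtocolDriverContext);
    UNREFERENCED_PARAMETER(BindContext);
    UNREFERENCED_PARAMETER(BindParameters);
    return NDIS_STATUS_SUCCESS;
}

NTSTATUS DriverEntry(PDRIVER_OBJECT DriverObject, PUNICODE_STRING RegistryPath)
{
    NDIS_PROTOCOL_DRIVER_CHARACTERISTICS pc;

    UNREFERENCED_PARAMETER(DriverObject);
    UNREFERENCED_PARAMETER(RegistryPath);

    NdisZeroMemory(&pc, sizeof(pc));
    pc.Header.Type     = NDIS_OBJECT_TYPE_PROTOCOL_DRIVER_CHARACTERISTICS;
    pc.Header.Revision = NDIS_PROTOCOL_DRIVER_CHARACTERISTICS_REVISION_1;
    pc.Header.Size     = sizeof(pc);
    pc.MajorNdisVersion = 6;
    pc.MinorNdisVersion = 0;
    NdisInitUnicodeString(&pc.Name, L"SampleProto"); // must match the INF service name
    pc.BindAdapterHandlerEx = SampleBindAdapterEx;
    // ... UnbindAdapterHandlerEx, Open/CloseAdapterCompleteHandlerEx,
    //     receive and send-complete handlers would be filled in here ...

    return (NTSTATUS)NdisRegisterProtocolDriver(NULL, &pc, &ProtocolHandle);
}
```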
Also, in user space, we have the notify object. The notify object can get callbacks from the network configuration subsystem and act on those callbacks by changing the network configuration, removing drivers, installing drivers, and so on.

So how does the NetKVM driver for Windows look? It's an NDIS miniport driver, as we mentioned before, and the basic driver package looks like this: you have the INF file, which is an installation descriptor; a SYS file, which is the driver binary; a PDB file, which holds the symbols for debugging; and a CAT file, which is the package's digital signature.

So now let's discuss the problem, and why we even needed to do something. When we are talking about paravirtualized devices and drivers, and fully emulated devices, all the code resides inside QEMU, and QEMU also controls the data path. So when we want to migrate such a device, it is simply migrated with the virtual machine: all the device state is migrated, QEMU fully controls the data path, and so on. But when you have an external hardware device, QEMU does not control the data path; QEMU may continue to run, but we somehow need to migrate the hardware state. Therefore several solutions have been proposed over the years. We will give a small overview of them: some are more vendor-specific, we'll talk about what Microsoft did in Hyper-V, and there's a solution in Linux with the net_failover three-netdev model.

Regarding previous efforts, I think almost every year at KVM Forum we have at least one presentation about SR-IOV migration, ranging from very vendor-specific or device-specific solutions to, just now, a parallel session about a more generic solution. And now I am passing the presentation to Annie, and she's going to give an overview of the solution that Microsoft did and of our solution.

Hi everyone, this is Annie from Oracle. Today I'm going to talk about the software solutions for SR-IOV live migration in Windows. These solutions focus on switching the data path seamlessly between the VF network and the virtio network. Before initiating the live migration, the VF network adapter is hot removed and all network traffic is redirected to the virtio network data path. After the migration is done, the VF network adapter is hot added on the target, so all the network data goes through the VF network path again.

First I will talk about the existing solutions: Windows NIC teaming, the Windows MUX intermediate driver, and the Hyper-V solution. After that, I will talk about the two-netdev model in the Windows virtio driver.

Windows NIC teaming has been built into Windows since Windows Server 2012. It is similar to bonding in Linux, and it provides failover capability. The user can put the virtio network and the VF network into one team and configure the virtio network as standby. So when the VF adapter is hot removed, Windows NIC teaming sets the virtio adapter as active and switches the data path to virtio. After the VF adapter is hot added, NIC teaming sets the virtio adapter back to standby, and the data path goes through the VF adapter again. Windows NIC teaming can be configured through the GUI or through PowerShell in user space; an example is sketched below. However, we prefer a solution in kernel space that switches the data path automatically, so the user doesn't need to spend time or effort on user-space configuration.
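As a rough illustration of that user-space configuration, something like the following, using the built-in NetLbfo cmdlets, would create such a team. The adapter names here are placeholders for your actual connection names.

```powershell
# Team the VF and virtio adapters; "VF NIC" / "virtio NIC" are placeholder names.
New-NetLbfoTeam -Name "SriovFailover" -TeamMembers "VF NIC", "virtio NIC" `
                -TeamingMode SwitchIndependent

# Make the virtio adapter the standby member, so traffic normally flows through
# the VF and fails over to virtio when the VF is hot removed.
Set-NetLbfoTeamMember -Name "virtio NIC" -AdministrativeMode Standby
```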
The Windows MUX intermediate driver is a kernel-space solution. It exposes one or more virtual adapters, and based on the relationship between the virtual adapters and the underlying network adapters, it has various models. Today I will only introduce the one-to-two driver model for SR-IOV live migration, where one virtual adapter sits on top of two underlying adapters. This model is similar to NIC teaming in failover mode, and its architecture is also similar to the three-netdev model in Linux.

This slide shows the architecture of the one-to-two MUX driver model. As you can see, at the bottom are the virtio network and the virtual function network, each with its own miniport driver serving it: the NetKVM driver and the VF miniport driver. On top of them is the one-to-two MUX intermediate driver. This MUX driver exposes a protocol driver on its lower edge, to bind to the underlying miniport drivers, and it exposes a miniport driver on its upper edge, to bind to TCP/IP and the other protocols. Inside the MUX driver, the virtual miniport adapter binds to its own protocol driver internally, so NDIS isn't aware of this binding. The MUX driver has full control over the network data here, so it can switch the data path between the NetKVM and the VF miniport drivers.

The thing is, the underlying miniport drivers expose the "ndis5" binding interface; to avoid confusion between the binding interface name and the actual NDIS version, I will only use the term "ndis" here. As you can see, the protocol driver in the MUX driver exposes the "ndis" binding interface, and so do TCP/IP and the other protocols. This means the underlying miniport drivers bind to the MUX driver's protocol and also to the upper-layer protocols. However, the MUX driver only wants to expose the virtual adapter here, not the underlying miniport drivers. So a notify object is involved to unbind the upper-layer protocol drivers from the underlying NetKVM and VF miniport drivers. I will skip the details of the notify object here and go into more depth on it later.

This screenshot shows the binding details for NIC teaming and for the MUX driver. They show similar bindings, so I only pasted one for both. "Ethernet 14" and "Ethernet 5" are the VF and virtio network adapter connections; they bind only to the network adapter multiplexor protocol. The SR-IOV network connection is the one generated by NIC teaming or by the MUX driver model; it binds to all the necessary upper-layer protocols, but not to its own protocol driver.

As we know, Hyper-V supports SR-IOV migration, so let's see how Hyper-V works. The important part of the VM network is the network virtual service client, NetVSC. NetVSC communicates with the network virtual service provider (NetVSP) in the parent partition through the VMBus; that's the synthetic data path. The NetVSC driver also communicates with the VF miniport driver for SR-IOV migration. NetVSC provides two installation files: one for installing the NetVSC miniport driver, and another for installing the NetVSC protocol driver. Both drivers share the same driver binary, and normally their names are tagged with the NDIS version, for example netvsc63.

So let's see the architecture of the Hyper-V SR-IOV VF failover. As you can see, the VF miniport driver in Hyper-V exposes a binding interface whose upper range is "ndisvf", and the NetVSC protocol driver is the only protocol driver that exposes the "ndisvf" binding interface on its lower edge. This means these two drivers bind together exclusively; a sketch of how this looks in the INFs follows below. There's no notify object involved, no new virtual adapter is generated, and there's no bonding or teaming involved either.
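The exclusive pairing can be pictured through the INF binding-interface registry entries. This is only a sketch: the surrounding INF sections are abbreviated, and the exact values in the shipping Hyper-V INFs may differ.

```inf
; VF miniport INF: advertise a private upper edge instead of the usual "ndis5",
; so TCP/IP and other ordinary protocols never bind to the VF directly.
HKR, Ndi\Interfaces, UpperRange, 0, "ndisvf"
HKR, Ndi\Interfaces, LowerRange, 0, "ethernet"

; NetVSC failover protocol INF: the only component whose lower edge is "ndisvf".
HKR, Ndi\Interfaces, UpperRange, 0, "noupper"
HKR, Ndi\Interfaces, LowerRange, 0, "ndisvf"
```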
The NetVSC protocol driver sits in the same binary as the NetVSC miniport driver. Because of this, it is possible for the protocol driver to access the network data from the NetVSC miniport driver and forward it to the VF miniport driver, so it finally reaches the virtual function device, and vice versa.

Here is the network binding for Hyper-V. One Ethernet connection is the VF network connection; it binds only to the Hyper-V failover VF protocol driver. The other is the Hyper-V virtual adapter connection; it binds to TCP/IP and the other protocols, but the Hyper-V failover VF protocol is hidden from it.

So as we can see, the MUX driver model is complicated. A new virtual adapter is generated, which requires deployment of the new virtual miniport driver, and the offloads have to be restored in the MUX driver. Also, the notify object in the MUX driver model is complicated, and the installation involves installing the virtual adapter miniport driver as well as the protocol driver. This means more effort is required to deploy the MUX driver model.

The Hyper-V model is simpler, but it is only appropriate for Hyper-V. In Linux, a mailbox mechanism is implemented for the VF and PF communication; however, a different mechanism is implemented for Hyper-V in Windows. As a result, the same VF device is advertised with a different device ID, the Windows VF miniport drivers end up with different implementations too, and the binding interfaces are exposed differently. This means we cannot use the Hyper-V implementation directly in KVM.

So here the question comes: what should we do for SR-IOV live migration in Windows guests on KVM? The idea is to combine the MUX driver model and the Hyper-V model, and that's the two-netdev model in Windows.

At first, let's take a look at the regular setup: a virtio network and a virtual function network. They each have their own miniport driver serving them, the NetKVM and the VF miniport drivers, and these drivers bind to the upper-layer protocols directly.

Let's see what's new in the two-netdev model for SR-IOV live migration. A new VirtIO protocol driver is implemented here. It shares the same driver binary as the NetKVM miniport driver, which makes it very convenient for them to share data. The VF miniport driver exposes the "ndis" binding interface here, and as you can see, the VirtIO protocol driver, as well as TCP/IP and the other upper-layer protocols, also expose the "ndis" binding interface. That means the VF miniport driver would bind to the VirtIO protocol driver and also to TCP/IP and the other protocols. So a notify object is implemented here to guarantee that the binding between the VirtIO protocol and the VF miniport is one-to-one.

The notify object is a COM object that sits in a dynamic-link library. When the VirtIO protocol driver is being installed, the network transport (NetTrans) class installer registers the notify object for this VirtIO protocol driver. So when the VF device is hot added, any new binding involving the VirtIO protocol driver or the VF miniport driver is detected. If the binding is between the VirtIO protocol and the VF miniport, it is allowed; any other binding to either of them is disabled; the sketch below shows the idea. This guarantees that the binding between the VirtIO protocol driver and the VF miniport driver is exclusive.
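To give a flavor of that filtering, here is a sketch of the decision a notify object can make when the binding engine asks about a new binding path. CNetKvmNotify and IsVirtioProtocolOverVf are hypothetical names, and a real notify object implements the full INetCfgComponentNotifyBinding interface plus the usual COM plumbing.

```cpp
#include <netcfgx.h>

// Sketch only: called when a binding path involving our component is about to
// change; IsVirtioProtocolOverVf is a hypothetical helper that walks the path
// and checks that its endpoints are the VirtIO protocol and a VF miniport.
STDMETHODIMP CNetKvmNotify::QueryBindingPath(DWORD dwChangeFlag,
                                             INetCfgBindingPath *pPath)
{
    if (dwChangeFlag & NCN_ENABLE)
    {
        // Allow only the VirtIO-protocol <-> VF-miniport pairing; returning
        // NETCFG_S_DISABLE_QUERY vetoes (disables) any other new binding.
        if (!IsVirtioProtocolOverVf(pPath))
            return NETCFG_S_DISABLE_QUERY;
    }
    return S_OK;
}
```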
So the protocol driver is the important part of the two-netdev model, and I will talk about it in more detail here. First, the protocol driver behaves like a bridge between the virtual (NetKVM) miniport driver and the VF miniport driver. Also, in the two-netdev model, the VF adapter is coupled to the virtio adapter with the same MAC address. When the VF adapter is hot added, the protocol driver searches for the matching MAC address among all the existing virtio network devices. If there is a match, the protocol driver binds to the VF miniport driver and switches the data path to it. When the VF network adapter is hot removed, the protocol driver shuts down its binding and switches the data path back to virtio.

For the TX network data, the protocol driver sets the source handle of the NET_BUFFER_LISTs to the VF miniport binding handle and forwards the network data to the VF miniport through this handle. For the RX network data, the protocol driver indicates all the network data coming from the VF miniport up to the upper-layer protocols through the NetKVM miniport's handle, since the upper-layer protocols are bound to the virtio adapter. The object identifiers (OIDs) are wrapped and forwarded, and offloads are propagated in the same way; the propagation work in this case is much less than the offload-restoring work in the MUX driver.

Here are the binding details of the VirtIO SR-IOV solution. One Ethernet connection is the VF network; it binds only to the Red Hat VirtIO NetKVM protocol driver and doesn't bind to any other drivers. "Ethernet 11" is the virtio connection; it binds to TCP/IP and the other protocols, but it doesn't bind to its own protocol driver. So now you know how the two-netdev model works in the Windows virtio driver, and I will hand it over to Yan to talk about the current status of the two-netdev model. Yan, please take over.

Thank you very much, Annie. So let's talk about the status and the known issues. The code for the solution is already upstream and you can use it. The known issues that we have are the following. First of all, the support is only for the newer operating systems. Second, the statistics for the VF are missing, because they are not propagated to the NetKVM driver. We have some issues related to the order in which the devices start; one of them is a DHCP issue that we might hit. And another thing that you should know is that if you want the solution to work with a specific VF, you need to add the Plug and Play (PnP) ID of that VF either to the notify object code or to the registry.

Regarding the current solution and the virtio-net spec's VIRTIO_NET_F_STANDBY feature: we are not using this capability right now, because we are relying on the notify object to tell us about the appearance and disappearance of the devices in the system. In the future, we might use it.

There are some changes in the installation. Before, we had one INF file for the miniport driver. After these changes, we have one INF file for the miniport driver and another INF for the protocol driver definition and for the notify object, so it's kind of a dual installation; a sketch of the protocol-side INF follows below.

Regarding WHQL certification: before, we were certifying the miniport driver, and Microsoft automatically reviewed the test package. Currently it's a two-step certification. First we need to certify the miniport driver, and there is an automatic review for that; then we need to certify the whole solution and submit the test results, and it's a manual review this second time.
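To make that dual installation a bit more concrete, the protocol-side INF is a NetTrans-class INF that also registers the notify object. This is a sketch with placeholder service, CLSID and DLL values, not the shipping ones.

```inf
[Version]
Signature = "$Windows NT$"
Class     = NetTrans
ClassGUID = {4D36E975-E325-11CE-BFC1-08002BE10318}

[Install.ndi.AddReg]
HKR, Ndi,            Service,      , "netkvmp"       ; protocol service name (placeholder)
HKR, Ndi,            ClsID,        , "{00000000-0000-0000-0000-000000000000}" ; notify object CLSID (placeholder)
HKR, Ndi,            ComponentDll, , "netkvmno.dll"  ; DLL hosting the notify object (placeholder)
HKR, Ndi\Interfaces, UpperRange,   , "noupper"
HKR, Ndi\Interfaces, LowerRange,   , "ndis5"
```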
Now let's take a look at the performance numbers. Here are the numbers on a 100 Gbit/s card between the hosts, and this is VM to remote host traffic. There are several things to see. First of all, what we wanted to see is almost no degradation between using the VF directly and using the two-netdev model of our solution, and you can see that here. In order to achieve it, we had to propagate the OIDs correctly, to ensure that the offload settings are correctly propagated from the VF to the protocol driver, and also to ensure the correct settings for jumbo frames; you can see the MTU here is 9,000. This is the data for remote host to VM; again, you can see that the performance is almost not diminished when we are using our solution. And there are also results for VM to VM, with the VMs running on remote hosts.

So thank you very much. If you have more questions, please ask us in the chat or send us emails with questions and comments, and we'll be happy to answer them. Thank you very much.