that the guest sees. The guest OS has no idea that it is running in a virtualized environment. Everything the guest uses is provided by the QEMU process: when you allocate memory for the guest, QEMU allocates it within its own address space, and for each vCPU in the guest it creates a thread on the host machine. It uses other Linux functionality and resources as well, like networking. The guest runs as part of the QEMU process; it is not a separate process of its own, it is part of this one userspace process, which is the virtual machine. So if there is an exploit and the guest breaks out, what it is exposed to is that QEMU process.

For live migration, QEMU copies the guest's pages over to the destination, and then on the next iteration you only have to transfer the pages which have changed, so we keep track of which pages the guest has dirtied.

Then there is seccomp, which restricts the system calls that a userspace application can make. After QEMU starts, it can give up its right to, let's say, open a file. Because if there is a guest exploit and the guest breaks out, it might try to open files, and with the filter in place it cannot read and write files on the host.

Inside the guest, device requests look the way they do on real hardware. For example, QEMU emulates a set of standard NICs and gives each virtual machine its NICs, so the guest software thinks it actually has a real NIC and just needs to find a driver for it. Or a device can be a paravirtualized one, which means adding a new driver to the guest; new device models keep being added. And the pages these devices touch are tracked as well, so we know whether they have been dirtied and need to be transferred again. (There are small sketches of the process model, dirty-page tracking and seccomp right below.)
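To make the process model above concrete, here is a minimal, hedged sketch of the ioctl flow a userspace VMM like QEMU follows against /dev/kvm. It is illustrative only: real setup also loads guest code and initializes registers, all error handling is omitted, and the 2 MiB size is an arbitrary example.

```c
/* Minimal sketch of the KVM ioctl flow; illustrative, not QEMU's code. */
#include <fcntl.h>
#include <linux/kvm.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>

int main(void)
{
    int kvm  = open("/dev/kvm", O_RDWR);       /* talk to the KVM module  */
    int vm   = ioctl(kvm, KVM_CREATE_VM, 0);   /* one VM per QEMU process */
    int vcpu = ioctl(vm, KVM_CREATE_VCPU, 0);  /* each vCPU: a host thread */

    /* Guest RAM is plain memory in the VMM's own address space. */
    void *ram = mmap(NULL, 0x200000, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    struct kvm_userspace_memory_region region = {
        .slot            = 0,
        .guest_phys_addr = 0,
        .memory_size     = 0x200000,
        .userspace_addr  = (unsigned long)ram,
    };
    ioctl(vm, KVM_SET_USER_MEMORY_REGION, &region);

    /* A vCPU thread loops on KVM_RUN; every exit (I/O, MMIO, ...) is
     * handled in userspace before re-entering the guest. */
    int run_size = ioctl(kvm, KVM_GET_VCPU_MMAP_SIZE, 0);
    struct kvm_run *run = mmap(NULL, run_size, PROT_READ | PROT_WRITE,
                               MAP_SHARED, vcpu, 0);
    ioctl(vcpu, KVM_RUN, 0);
    printf("exit reason: %u\n", run->exit_reason);
    return 0;
}
```

Each additional vCPU is just another KVM_CREATE_VCPU plus another host thread looping on KVM_RUN, which is why guest vCPUs show up as ordinary threads of the QEMU process on the host.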
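The dirty-page tracking behind live migration is visible at the same level. A sketch, assuming the memory slot above was registered with the KVM_MEM_LOG_DIRTY_PAGES flag set in region.flags; the function name and sizes are made up for illustration:

```c
#include <linux/kvm.h>
#include <string.h>
#include <sys/ioctl.h>

/* One bit per 4 KiB page of the 2 MiB slot registered earlier. */
static unsigned long bitmap[(0x200000 / 4096) / (8 * sizeof(unsigned long))];

void send_dirty_pages(int vm)
{
    struct kvm_dirty_log log = { .slot = 0 };
    log.dirty_bitmap = bitmap;
    memset(bitmap, 0, sizeof(bitmap));

    /* Fetch-and-clear: KVM reports which pages the guest wrote since the
     * previous call, so each migration pass resends only changed pages. */
    ioctl(vm, KVM_GET_DIRTY_LOG, &log);

    /* ...walk the bitmap and retransmit just the pages whose bit is set... */
}
```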
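And for the seccomp point, this is roughly what "giving up the right to open a file" looks like with libseccomp. QEMU's real -sandbox filter is far more extensive; this is only a sketch (link with -lseccomp):

```c
#include <errno.h>
#include <fcntl.h>
#include <seccomp.h>
#include <stdio.h>

int main(void)
{
    /* Default-allow, then give up specific rights after startup. */
    scmp_filter_ctx ctx = seccomp_init(SCMP_ACT_ALLOW);

    /* Once running, the process no longer needs to open new files, so an
     * exploit that takes over the process cannot open them either. */
    seccomp_rule_add(ctx, SCMP_ACT_ERRNO(EPERM), SCMP_SYS(open), 0);
    seccomp_rule_add(ctx, SCMP_ACT_ERRNO(EPERM), SCMP_SYS(openat), 0);
    seccomp_load(ctx);

    if (open("/etc/hostname", O_RDONLY) < 0)
        perror("open");   /* fails with EPERM once the filter is loaded */
    return 0;
}
```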
Next up is PCI Express support. We only had support for PCI devices to be exposed to the guest, but now we can expose PCI Express devices to the guest. It has reached feature parity with PCI: migration works, so we can declare it to be completely supported. This is actually a requirement for IOMMU work: if you want to pass through an IOMMU, or emulate an IOMMU and expose it to the guest for nested virtualization, this is required, as are userspace drivers, which we discussed earlier, and AER, Advanced Error Reporting. So if a device has been exposed to the guest and that device encounters an error: with the PCI bus, we had no idea on the host that the device was in an error state and that we had to reset it. That is possible with the AER capability in PCI Express devices, and we now have access to that functionality.

Video inside the guest: there are three different ways of doing this. One which was recently merged is virtio-gpu, which can do 2D and 3D. Virtio-gpu is a paravirtualized video driver which uses OpenGL on the host for all the GPU rendering. vGPU is something that is in progress; without it you need a one-to-one mapping between the host GPU and the guest GPU. Intel is working on this; they call it KVMGT. It basically means the host GPU has some functionality by which you can share it between multiple guests. This is still in progress, but it is one of the ways of doing video inside guests.

The last one, of course, is device assignment, where you just assign a device from the host into the guest. The host gives up all control of the device and the guest exclusively owns it. This, of course, depends on what kind of hardware it is: if the hardware has support for multiple functions, you can assign one function to each guest, so multiple guests can use the same device. This is mostly used for compute and not for video, because these are very heavy-duty GPUs and really expensive hardware, but there have been cases where it has been used for video as well, as we'll see on this slide. Device assignment has seen several improvements too. Lots of things, and I won't go through them all, but one interesting one is IRQ bypass support, which means the device can inject an interrupt directly into the guest. The host need not be involved, so it's just that much faster.

There was an interesting video posted recently called "7 Gamers, 1 CPU". They built a machine with seven actual physical GPUs inside and used KVM to assign each GPU to a different guest, and they ran seven games simultaneously. They had a mouse, keyboard, video and hard disk for each virtual machine, and they were running games at bare-metal speeds, getting very good FPS results for all of them. Their choice of video card was not quite right, because if there was an error they would have to reboot the system, and so on. NVIDIA kind of gets this right with some of their cards, AMD not yet, and there are some quirks that still need to be added, but what's interesting is that this setup is possible. They had hundreds of GBs of RAM per guest and terabytes of SSD storage per guest, with seven guests doing extremely compute-intensive work and KVM handling it all really well. So this was really interesting; it just shows the capability of QEMU and KVM here.

The block layer in QEMU gained blockdev-backup, with which you can back up running guests: a point-in-time snapshot of a disk, and this snapshot can be taken over the network. Another feature was I/O throttling groups, where all the disks used by a guest can be made part of a group and quota restrictions can be applied to the entire group. Earlier, quota restrictions applied only to individual disks; now they can be applied to the whole group. It's mainly something for infrastructure vendors. There are also extended I/O stats, which help with understanding guest behavior and tuning guests.

Now some libvirt-specific things. Most of the changes above involve libvirt too, but some are very libvirt-specific. A new admin API was added, which can tune libvirtd itself to make things faster, or help understand behavior and act on it: gather resource usage and produce stats for it, or, for thread pools, see how many I/O threads you have running; if they are utilized to the maximum, you can add more I/O threads to the pool. You can get such stats and make decisions based on them. I/O thread pinning was added as well: like vCPU pinning, which we saw in the real-time case, where you can pin vCPUs to physical CPUs, I/O threads can also be pinned to make I/O faster. This can be done for block devices. (Sketches of thread pinning and of the VFIO plumbing behind device assignment follow below.)
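What I/O thread (or vCPU) pinning ultimately boils down to on Linux is restricting a thread to a chosen physical CPU. This is a sketch of the underlying idea, not libvirt's actual code, which drives this through its XML configuration and cgroups; the function name is made up (compile with -pthread):

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Pin the calling thread to a single physical CPU. */
void pin_self_to_cpu(int cpu)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}
```

Pinning an I/O thread close to the vCPUs it serves keeps data cache-hot and stops the scheduler from migrating the thread mid-request.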
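Going back to device assignment for a moment: in the kernel it is built on the VFIO interface. A hedged sketch of the handshake, following the flow described in the kernel's VFIO documentation; the group number and device address are placeholder examples, and error handling is omitted:

```c
#include <fcntl.h>
#include <linux/vfio.h>
#include <sys/ioctl.h>

int main(void)
{
    /* A container represents one IOMMU context. */
    int container = open("/dev/vfio/vfio", O_RDWR);

    /* Group 26 is a placeholder; the real number comes from
     * /sys/bus/pci/devices/<address>/iommu_group on the host. */
    int group = open("/dev/vfio/26", O_RDWR);

    struct vfio_group_status status = { .argsz = sizeof(status) };
    ioctl(group, VFIO_GROUP_GET_STATUS, &status);  /* group must be viable */

    ioctl(group, VFIO_GROUP_SET_CONTAINER, &container);
    ioctl(container, VFIO_SET_IOMMU, VFIO_TYPE1_IOMMU);

    /* Hand the device to userspace: from here the VMM can mmap() its BARs
     * and map guest RAM for DMA; the host driver gives up the device. */
    int device = ioctl(group, VFIO_GROUP_GET_DEVICE_FD, "0000:06:0d.0");
    (void)device;
    return 0;
}
```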
PPC64, the architecture, became a first-class citizen in libvirt. This resulted in a lot of refactoring of the code; earlier the code was very x86-centric, and now libvirt can handle multiple architectures. And with the addition of PPC you can have big-endian guests on little-endian hosts: on a little-endian PPC host you can run a big-endian PPC guest inside. Not everything was tuned for that; not everything was ready for such a scenario. For example, virtio, of course: virtio has to deal with paravirtualized I/O between the host and the guest, and endianness matters there. Virtio 1.0 addresses this, and so does libvirt; libvirt needs to deal with it as well.

Some of the other things: virtio-input is something I had mentioned. We now have paravirtualized keyboard, mouse and tablet devices, which basically gets rid of the USB dependency, so you no longer need a USB keyboard, for example. USB generates a lot of interrupts, so it's better to have virtio. Then virtio-balloon, the balloon device, which is used for overcommit. A balloon device can allocate RAM in the guest and give it to the host; the guest can't use that RAM any more, and the host starts using it, maybe giving it to other guests and so on. One of the side effects of this was that if the guest entered an OOM condition, an out-of-memory condition, the guest used to just blow up through no fault of its own, because of the RAM it had given to the host as a courtesy. Now there is support for deflating the balloon on an OOM condition, so the guest can ask the host for its RAM back and continue operating. There is memory hotplug and unplug support, so you can unplug memory from guests or plug in new memory.

A new security feature is to insert guard pages after the guest RAM, so if someone tries a buffer-overflow exploit from inside the guest, we are guarded against it. This is like adding canary values for stack-overflow protection, similar to that. (A small sketch of the idea follows at the end of this section.)

There are some architecture-specific improvements. s390 got PCI bus support; who would have expected that? But yes, s390 can do PCI now. For ARM, hosts and guests can use multiple CPUs, eight CPUs, there is virtual interrupt controller support, which makes servicing interrupts faster, and there is dirty page tracking, which is useful for live migration; all of this was added for ARM. For x86, VT-d emulation, that is IOMMU emulation, is in progress. This is used for nested virtualization: if you have a guest that can act as a hypervisor itself, it can emulate an IOMMU so that it can pass through devices of its own to the second-level guest. And there are some more nested-virt improvements. Split IRQ chip: this is a security feature we saw. The APIC was emulated by the KVM kernel module, which is part of the host kernel, and if the guest can exploit something there, it gets access to the host kernel. So to reduce the attack surface, a lot of functionality which is not necessarily performance-critical has been moved into QEMU, and only a small part now remains in the host kernel. PPC got CPU and memory hotplug, and also support for the H_RANDOM hypercall, which is similar to the RNG device. The guest side, the Linux kernel for PPC, always had support for the H_RANDOM hypercall, because a previous hypervisor which PPC worked with had support for it. So QEMU gained support for this hypercall as well, and it can pass host entropy into the guest using it.

Some of the features in progress: virtio-gpu 3D Spice integration. Spice is the remoting protocol, and 3D only works with the GTK backend right now, so integrating it with Spice is something that's coming up.
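Here is the promised sketch of the guard-page idea: map guest RAM with one extra trailing page and make it inaccessible, so a sequential overflow out of guest RAM faults instead of silently corrupting adjacent state. This is the concept, not QEMU's actual allocation code; it assumes 4 KiB pages, a page-aligned size and a made-up function name:

```c
#include <stddef.h>
#include <sys/mman.h>

/* Allocate guest RAM followed by an inaccessible guard page. */
void *alloc_guest_ram(size_t size)        /* size must be page-aligned */
{
    unsigned char *ram = mmap(NULL, size + 4096, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (ram == MAP_FAILED)
        return NULL;
    /* Trailing page: no read, no write; any access raises SIGSEGV. */
    mprotect(ram + size, 4096, PROT_NONE);
    return ram;
}
```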
The other one in progress is native Hyper-V paravirtualization. Hyper-V exposes a lot of paravirtualized devices, and KVM can now, as in very soon, start exposing those devices as well. So guests which are tuned to run under the Hyper-V hypervisor can actually run under KVM and get the same performance benefits. Blockdev-backup, something that we saw earlier in the block layer, will gain incremental backup functionality, and it will preserve state across restarts and live migration. And there are several other features; there's a lot more.

So I've reached the end of my slides. I have my email address up here and the address of my blog, where I will put up the slides in a short while. If there are any questions, I'll take them now.

[Audience] You talked about benchmarking on the slides. Do you know how KVM compares to the Xen hypervisor?

Yes. The SPECvirt benchmark is a vendor-neutral, industry-standard benchmark for measuring virtualization hypervisor performance, and companies can run SPECvirt and choose to publish or not publish the results. KVM is the hypervisor whose results consistently get published; Xen results are not published at all, so draw your own conclusions. One of the things SPECvirt really measures well is scalability: how many guests can be run at the same time, and how much work each guest gets done in that amount of time. Xen doesn't scale really well, and I don't even think Xen supports as many vCPUs as we do, or the amount of RAM that we can give to the guests. Not close, not anywhere close.

[Audience] Can you discuss some of the differences between Type 1 and Type 2 hypervisors, and the use cases where Xen would still be used versus where KVM would be superior?

I think the Type 1 versus Type 2 debate is purely academic. What does Xen do? Xen has to do its own scheduling, its own power management, its own memory management. We use Linux for all of that. So just call Linux the hypervisor, and I think both are at the same parity. I don't think it matters much, and frankly, in the number of features, the kind of security we provide, the scalability, the performance, et cetera, we beat Xen in every possible way.

Any more questions? Yes, there's one there.

[Audience asks an off-mic question about snapshots with assigned devices.]

Snapshots; are you talking about live migration? No? Okay. I guess it's mainly because you have to get the hardware back into whatever state it was in, and you cannot, because you would need to reset the hardware, which was the point I mentioned: that hardware cannot handle those resets. So it's hardware-dependent as well. As for snapshots, I don't know how useful they would be with assigned devices, because again you need to get the device into whatever state it was in, and that's out of your control.

Thanks for staying for the last session of the last day, and thanks for bearing with my voice; it's not too great. Thanks. Thank you.