Hi everybody, I'm Will Ald and I work for Intel. This work was done mostly by Shantel, who's a member of our team, and I'm presenting for him. The idea here is that we would add the capability of doing an HVM dom0, so a hardware-based dom0. What I want to cover is a little bit of history, why we are where we are, why we are interested in doing a hardware-based virtual machine for dom0, some of the technologies, and what we would like to help with, that kind of thing.

This is a standard representation of the Linux Xen hypervisor, where you've got the hardware at the bottom and the hypervisor and the domains up above. Dom0 here, the special domain, was at the beginning a para-virtualized domain, and it has the back-end drivers here that connect with the other domains and service them. Then up in the yellow part, this isn't entirely the right way to put it, but that's sort of a ring 3 kind of area where the management control stack is. The rest of these are all domU, the user domains, and they come in a number of varieties: you've got the para-virtualized as well as the HVM, the hardware-based, and they can be multi-threaded, I mean multi-core, SMP, or regular single core. Can't look up there, it's too high.

Anyway, a little history here. In trying to have Xen and the Linux OS work together, we've had a lot of problems. There's a lot of pushback every time we try to get things into the kernel to support Xen, and largely the kernel community feels like this is just an added tax on them. Because of that, they're not that interested, and that's what generates the pushback.
They see it as extra maintenance on their side, and they don't really see the need for adding extra features for things like Xen RAS, which would be important for Xen but really a no-op for the Linux community.

So why is dom0 PV? Well, originally all the domains were PV, so that's clearly the reason. But a big part of why that's the case is that the CPU architecture just didn't have what it took to do a natural virtualized environment, something that's straightforward. Because of that, the Xen project looked at doing something quite different than what's done in other places like the IBM work, where it was all para-virtualized. Then even after VT-x came out in about 2006, it was still not performant. You had the shadow page tables and so on, and that all took up a lot of your performance. Because of that, there wasn't a lot of impetus to move forward and convert over. So in response to those kinds of situations, dom0 was para-virtualized and it remained para-virtualized.

But this carries with it some limitations. This is a recurring theme: the dependence on the kernel and getting things into the kernel, having issues with being able to get special features in, and an inability to handle unmodified OSes. If you don't have the source code to an OS, you really don't have the ability to make it dom0. And you aren't able to leverage the performance improvements that you get from hardware. One of the performance issues is the system call in 64-bit OSes on x86, where you have to bounce through Xen to get to your kernel space in the domain. That sort of thing just makes it pretty slow. You also can't take advantage of superpages, and there are a number of other components here that are just not available.
So as the processor architects came on with improved hardware, and this is both ourselves and AMD, we really improved the hardware capabilities. This is in the area of the CPU that we had originally, as well as in the memory, the I/O, and the interrupt space. So we filled the holes originally with VT-x, and then later we came in and had faster paths for things like the control registers and the APIC. We added memory virtualization with EPT and things around that, then the I/O with pass-through and SR-IOV, and then also the interrupt handling improvements. All of those things really helped out quite a bit if you were using them. If you were using PV, then you didn't really see a lot of that.

So the goal here for dom0 in HVM is that we really want to remove some of those limitations and be able to leverage the VT performance and flexibility. In terms of flexibility, I'm really talking about using other OSes for dom0, things that we don't have to modify or can't modify.

Now, there are a couple of options. One option, I think, was talked about at this conference last year: PVH. The idea there is you put dom0 into a hardware container like you would in HVM, but you leave everything else, the main core of the interface, as para-virtualized. So you get some enhancements from things like EPT, but you continue to have the issue of needing an OS that you can modify. Then there's the full HVM type of thing, and it's the one that we opted for. You have to change some of the interfaces a bit, but you get all the performance that you get with a regular HVM that's used for domU.

So with this modified domU, here we can run all the management control software, QEMU and virtio, up in ring 3. And then we can install the drivers, the back-end driver and of course the native drivers, into the OS as you normally would, without modifying it.
And then we just work through this Xen layer to communicate between the various domains, so that dom0 can still support the other domains in their I/O needs.

Okay, so some of the benefits. You can use any OS; you don't have to have a specific OS with the source available. You don't have the issue around the 64-bit system calls. You remove the issues around dealing with the Linux community and getting things into their kernel. And then you are able to add some use cases, and this bottom one is, I think, pretty interesting. It's actually not a change in any technology. It's more of a user experience kind of thing, where you can install Xen almost as if it were a type 2 hypervisor. It remains a type 1, it just has more of a type 2 feel for the user: you log into your system, you install it, maybe reboot, and you're done. It's quite different.

So how do we make this HVM work? We use pretty much the same infrastructure that we're using for domU today with HVM. It's all the same for the CPU virtualization. The memory virtualization uses EPT and gets the advantage of superpages. In terms of the I/O, you've got VT-d to do the pass-through, and by doing that you eliminate a lot of the VM exits that you would get trying to talk to the hardware. And then the interrupt virtualization, this is an area that's a little different. The I/O APIC is controlled by dom0, dom0 gets a virtualized local APIC, and the hypervisor owns the physical local APIC.

Now, to support things like Windows, we need to use a boot sequence for EFI. I'm not going to go through the details here, I don't fully understand them all myself, but they're there if you want to glance at them. In the case of Linux and HVM, it's really the same sequence that we use today. Also, to support multiple domains, or actually what I'm talking about here is multiple OSes as dom0:
We need to have QEMU there and ready to go, and we need to have the PV drivers supported. In the case of Linux, we've got these all in place and working. In the case of Windows, it's not all there yet. We still need the userland tools and libraries ported over and working, and QEMU also needs to be completed. This is the Xen QEMU portion, and for the back-end driver we're using virtio.

We would like to help, or to work with people, on the userland tools to provide those things for other OSes, Windows being the primary one, but something like Mac OS would also be interesting. We need to ensure that we've got the communication between the front-end drivers and the back-end drivers that would live in the OSes that are being added, and then port the Xen QEMU to Windows.

Takeaways here: this enables us to use unmodified OSes. It opens up the space quite a bit, particularly for Windows, but Mac OS as well. This is an addition to the Xen Project. It doesn't remove or change any of the current capabilities, so that remains as it is, and it resolves some of the limitations that exist currently, namely performance and not being able to deal with unmodified OSes. Then it adds some usage models that we talked about, things like using Windows as dom0, creating a trusted execution environment, or the type 2 user experience that you might get while remaining a type 1. That's it.

So how do you handle things like ACPI power management, where Xen relies on a PV dom0 to interpret the ACPI information and pass it down to Xen?

Actually, I don't have an answer for that; I don't know the answer. We could do something similar, or you might be able to just have the dom0 OS control that. Today it's kind of passed through, but with dom0 we're assigning a lot of the devices to dom0. Most of those are I/O, but you could do this with other devices like the power control and that sort of thing.
But in fact we don't have a Windows version working, and so we don't have a solution there.

Has there been any discussion of this approach on xen-devel, or any patches posted, or anything like that?

Actually, I don't know. Do you know, Haitao, has this been discussed on the mailing list?

So what was the question?

The question is, has HVM dom0 been discussed on the mailing list?

I don't know. Has anyone seen anything? I don't think so, right? So probably the right thing is to start discussing that on the mailing list.

Can you go back to the graphical overview? I just wanted to take a look at that for a bit. I may have questions later.

In the Windows row before, you listed virtio as the back end, but at the same time I seem to recall that you had ported xenbus, grant tables, these kinds of things to Windows. So at that point, why not have normal PV back ends?

Actually, I'm not sure I understand enough of the details for this, but we're using virtio as a communication channel, and then the back-end drivers just sit down there as plug-in drivers, right? Does that address your question?

Not really. So the question was about why specifically virtio instead of the normal PV back ends?

The back ends are there. So they're in addition.

So virtio is going to be free and not confirmed. Is that what I'm saying?

Yeah, it's kind of a split thing here.

Have you done any performance testing on this model?

Actually, for this talk we were trying to gather the performance data and we just didn't have time.

Any leading indicators?

I haven't seen them.

Okay. Anybody else?

Doesn't that mean creating a guest operating system as another kind of dom0?

Could you repeat that?

Can you make multiple dom0s?

Oh, maybe. I think there's a control issue, and that's really the problem. It has nothing to do with HVM versus PV.

How can we know that? How can we know whether there are multiple operating systems, multiple dom0s, or not?
You would have to, you know, the dom0s would have to be able to talk to each other. And, you know, that's a different project.

Yeah. But for this application, that doesn't have to be a disaggregated domain. It doesn't have to be PV already, right?

So, okay. So that's already there. That's totally independent.

Okay. Anybody else? Questions?

So when are patches going to be available?

So I don't know the answer to that either.

Are you planning on publishing?

Yes. We are planning on that, and we will put something on the mailing list about availability, or when we would put some patches out.

Because I think that is a big hole.

Yeah.

I guess it might actually be a good start to post the slides from the presentation and just get this going before the patches, and use that to maybe bootstrap. It seems there's a lot of interest, so let's just do a raise of hands: who thinks this is a cool and good idea? Quite a few, yeah. I'm going to ask the opposite question. Who would be interested to see what it actually looks like in practice? Okay. Okay. All right. Obviously, I'm not a developer anymore, so I need to phrase my questions more carefully in future.

More questions? We have a little bit more time, I think.

So you just started to ask the opposite question. Is there anybody that thinks this is a bad idea?

Not enough. Well, there was a hand up in the back. But, you know, yeah. I guess it depends on your viewpoint. For clients and those kinds of use cases, it's actually probably very interesting.

Any more questions? No?

How long have you been working on this?

I think it's been about six months. But I'm not actually the person working on it. We probably have three or four people on this, but it's not full time, certainly. This is part of the Shanghai team that we have. Actually, Haitao and the Vax here are part of that team. Actually, I see Jack as well.

Cool. Also, one more question on that.
In that case, just one more question. Okay. Thank you. Thank you for the talk. Thank you.