OK, good song, oldie but goodie. So today we're going to continue to talk about virtualization, and we're going to pick up where we left off last time. Last time, we talked about what's known as full hardware virtualization, the goal there being to run a completely unmodified guest operating system, to the point where the guest operating system really has no conception of the fact that it's running on virtual hardware. Today we're going to talk about a different approach. And again, this is another case where an entire large industry and a lot of modern computing was at least spurred on, if not spawned, by a very seminal research paper. This one is called Xen and the Art of Virtualization. I don't remember exactly what year this was, but a bunch of these authors are still figures in the computer systems community and are quite famous, partly if not entirely because of this particular paper.

All right, so brief announcements. Please finish assignment 3. 3.2 is due on Friday at 5 PM, and then swapping is due two weeks later. I can see from the leaderboards that people are making some progress on this, but there's still a large spike at 100. So hopefully that's going to start migrating to the right. OK, so the time has come in the semester to discuss the course evaluation policy for this class, which some of you may already be familiar with. So the evaluation system is open, I think. I don't know. I can't view it as a student. It was supposed to be open. Can we have one conversation in the room, please? Thanks. It's supposed to be open today, but if it's not, it'll open up soon. I really want you to fill out the course evaluations. This is really important to me. It's a really important source of information about the class, feedback that we use to make things better in the future, or worse, depending on your perspective. This is particularly important this year.
I mean, we've made some pretty significant changes to the class, the Test 161 system and the elimination of the code reading portions and things like that. So I really want to know, get feedback, sort of anonymous feedback from you guys about what worked, what didn't, how you found the class, and things like that. And that's something that we read these very carefully. We take the feedback and we try to use it to improve things on an iterative basis. So here's what I do. So because I want you to do something for me, I provide some incentives, because I realize right now that the process is somewhat opaque. Now this is about to change. For those of you that will be around next year, I believe that our department is actually leading the way toward a public course evaluation system. So starting next year, we will actually be able to see the results of the course evaluation process. I know that doesn't seem weird, but it isn't possible right now, but that's something that we're going to change, so that's awesome. But right now, I know there's not a lot in it for you, so let me create some incentives here. So once you guys get to 70% of people have filled out the evaluation, I will start releasing exam questions. This is how this works. So the exam is going to look a lot like the midterm. It's going to have a mixture of short answer and long answer questions, but there's also going to be a medium answer section, so you can think of the short answer questions as being identical to the five point short answers on the midterm. The medium answer question on the final is like the long answer on the midterm, and then there's several long answer questions on the final that are even longer than the long answer on the midterm. The final is not something to be stressed out about. It's about twice as long as the midterm, but you have three times the amount of time to do it, and so usually people find that to be plenty of time. 
OK, if you get to 80%, I will release one of the medium answer questions. And if you get to 95%, I will release a long answer question. I would love if you would do this as soon as possible so that I don't have to write extra questions for some of these sections. If I'm releasing a question, then there may not be as many options in that section. Any questions about this? The questions will be released to the entire class, duh, over a discourse or email or something like that. So you will all have access to them at the same time. You are free to talk about these questions with each other over the forums. That doesn't always lead to good answers. I'll just warn you about that, depending on who you're talking to. But this is the system. So I think last year we got to about 97% eventually. So 95% is not impossible. I'll post it on discourse or something. But look, those are your incentives. So maybe you could post something on discourse being like, hey, fellow classmates, we could potentially release one of the long answer questions on the exam. Why don't you do it? Find your five friends in the class. And lead them to the course evaluation system. When this has worked, well, I think every year we've done this, I think there was one year maybe that they didn't get quite to the last, across the last hurdle. The classes have gotten very close. But it's usually not because of me. It's usually because people realize that it's in all your best interest to encourage you to do it. Oh, yeah. Like, I can't unrelease the question. It's like, oh, give me that question back. Oh, no, this is, yeah. So in the best case, you will arrive at the exam knowing one of the short answer questions that will be on the exam out of six, one of the medium answer questions, maybe the only medium answer question if you guys get that far, and one of maybe two long answer questions that I would ask you to complete. That's the goal. The point of the exam is not to be hard. 
The point of the exam is to get you guys to study and do the work. So probably what I'll do on the exam, which I think worked reasonably well on the midterm, is reuse some old questions from previous exams as well. It's getting to that point where it's getting hard to write good exam questions. So I just wanted you guys to prepare. I think having everybody go through the old midterm was pretty useful for you. Any other questions about this? OK, good.

All right, so let's talk about Xen. So what kind of paper is this? Remember the taxonomy of papers: papers fell into different categories, like data analysis papers and other types of papers. What kind of paper is this? It falls into maybe a couple of categories. Steve? Yeah, so again, if the paper spawns a multi-billion dollar industry, then it probably gets this moniker of being a big idea paper, and that is the case. It's also kind of a wrong way paper. It's trying to point out that a particular approach to designing a particular type of system is not ideal. What are they critical of? What's the approach that they're critical of? One answer at a time. I mean, it's Wednesday. So they're proposing a new approach to virtualization to replace what? I mean, if I forced you to make a guess, you might say, what? Well, remember, the goal of virtualization is to be able to run an operating system inside a virtual container, to be able to create multiple virtual machines out of a single physical machine. On Monday, we talked about doing this in a way that required no changes to the guest operating system. And that's hard, partly because of vagaries of the x86 architecture. So they propose doing this a slightly different way. What is the difference? Yeah, so they're critical here of full virtualization of the kind that we talked about on Monday, where the goal was to not make any changes to the guest operating system.
And unfortunately, because x86 contains instructions that are not classically virtualizable, where I can't do trap and emulate because they don't generate exceptions when they're executed without kernel privilege, that became complex and hard to get right. So here's the Xen approach. What they point out is that support for full virtualization was never part of the x86 spec. When x86 was designed, this wasn't really something that was considered or built into the architecture. Certain supervisor instructions, these are privileged instructions, must be handled by the virtual machine monitor for correct virtualization. But executing these with insufficient privilege fails silently rather than causing a convenient trap. These are the instructions that are not classically virtualizable. Efficiently virtualizing the x86 MMU is also difficult. These problems can be solved, but only at the cost of increased complexity and reduced performance. So this is one of those times where, again, it sounded like a great idea. We're going to run these unmodified operating systems, and this was going to work out well. But by the time we got done accomplishing this, you sort of look at it and you think, oh man, that turned out to be a real mess. Maybe we should revisit some of our design assumptions. They also make some other arguments against full virtualization. And these really have to do with the awareness of the guest operating system that it's being run inside a virtualized environment. There are certain notions of time, for example. If I'm running a bunch of different guest operating systems that are not aware of the fact they're being virtualized, from time to time, time is stopping for them, and then I'm running them again later. Without a notion of that, it can be hard to get timing right for certain things that require precise timing, like, for example, TCP calculations. And this can cause problems.
And there are other cases where I might want to give guest operating systems a little bit more information about the virtualized environment, information that is hidden from them by design in a fully virtualized system. Remember, the goal there is that I made no changes to the guest operating system and it has no idea what's going on. But maybe giving it a little bit of an idea would actually be useful. So this is another criticism of full virtualization. Not only is it hard, but it's also maybe not exactly the right approach. So what do they do instead? If I'm not going to try to run an unmodified guest operating system, what could I do instead? Yeah. Modify it. Make some modifications to the guest operating system. They refer to this as porting it. And what this allows me to do is the following. I trade off small changes. So yes, unfortunately, I'm giving up the ghost on this idea that I'm going to be able to run an unmodified operating system. But the argument here is that this is the right design trade-off: by making a small number of changes to the guest operating system, I can achieve big gains in desirable qualities like simplicity, performance, and some other features. So rather than doing a large amount of work to achieve the somewhat narrow goal of full virtualization, I'm going to instead do a smaller amount of work. But I'm going to work on the guest operating system and make some small changes to it, rather than having to build these really complex virtual machine monitors. So by shifting things around, rather than having all the complexity be borne by the virtual machine monitor, I'm shifting some of the complexity into the guest operating system. And that allows me to actually reduce the overall amount of work I have to do, because it allows me to build a simpler virtual machine monitor. And in many ways, a more powerful one.
So they say: we avoid the drawbacks of full virtualization by presenting a virtual machine abstraction that is similar, but not identical, to the underlying hardware, an approach which has been dubbed paravirtualization. So with full virtualization, remember, essentially what I needed to virtualize was the hardware interface. In this case, what we're doing is creating what they call a virtual machine abstraction. So it's almost this idealized virtual machine that is easier for the virtual machine monitor to support. And then I make some small changes to the guest operating system to port it to that. This promises improved performance, although it does require modifications to the guest operating system. Now the final thing that's important to understand is that they don't modify the ABI, or application binary interface. That's the set of instructions presented by the processor to unprivileged applications. Because of that, no changes to application binaries are required to run them inside Xen. Does that make sense? I have to change the guest operating system, but I do not have to change guest applications. Why is that important? Yeah, Steve? Yeah, so that's the goal, right? But why not have a different approach here that required some small changes to the guest applications? Why is that harder to pull off? Yeah? Yeah, there are just a lot more apps to change. Think about all the different binaries that are out there in the world that would now have to be tweaked. Happily, there are fewer operating systems out there. So to accomplish their goal, in this paper, I think they only modified two or three different guest operating systems to work with Xen. But doing that is sufficient, if you don't have to modify the application binaries, to now support thousands, tens of thousands of applications.
The operating system provides what is sometimes referred to as a narrow waist here. It gives you the chance to run a lot of things while making a small number of changes. So here are the design principles that they lay out in the introduction. This is a fantastic introduction to a paper, by the way. It's one of those classics. So here's what Xen needs to do. This is something that you see in papers like this, and it's important to process these. These are the goals of the system stated in English. The first one is, I need to be able to run unmodified binaries. We just said that. If I don't do that, then users are not going to migrate to Xen. Somebody's not going to come in who has a lot of binary software. Maybe they don't have the source code for it. Maybe they don't have enough manpower or woman power to make all the changes to it that they would need. So unmodified application binaries, that's critical. They want to support full multi-application operating systems. This goes back and echoes some of the arguments that we made on Monday about why virtualization is powerful and important. Now I can move not just one application, but an entire environment that could involve multiple applications, specific configurations, maybe specific operating system configurations that are designed to support those applications. And I can move that together. I can package it in the virtual machine, and I can distribute it and copy it and move it from place to place. So this is the goal. Paravirtualization is necessary to obtain high performance and strong resource isolation. I love how they refer to the x86 as an uncooperative machine architecture. It's uncooperative. What's uncooperative about x86 again? Yeah, it's not classically virtualizable.
It has instructions where what I want to happen is that when they're executed in unprivileged mode, they should trap, and they don't. Instead, they either fail silently, or they do something different, or something like that. So that's the degree to which the x86 architecture is uncooperative. I like that term. And then they make another argument here. Now x86 is quite dominant, although ARM has emerged in a bunch of places. But they're arguing that even if the architecture is more cooperative, this idea of hiding virtualization from the guest operating system is not necessarily a good idea. There are some correctness and performance problems associated with that, and we may talk about those in a second. OK, so again, with full virtualization, I had an unmodified operating system, and below that unmodified operating system was the bare metal hardware of the system. Xen introduces this idea that I'm going to port the operating system, change it in some way, and now it's going to cooperate with something else. The something else here is what's called a hypervisor. The hypervisor is in charge of implementing this idealized virtual machine environment. And what's interesting about this is that the hypervisor tries to do as little as possible. We'll talk about some of the functions of the hypervisor over the next couple of slides. Some of the traditional functionality for operating a virtual machine environment, for example, creating new virtual machines, destroying virtual machines, cloning them, checkpointing, whatever, is actually not part of the hypervisor itself. What they do here is they actually move that functionality into one of the guests. And the hypervisor, you can think of it as running below all the other systems. So this is a figure from the paper. This is Xen. Here's the hypervisor. It makes some small changes to provide a different view of hardware that's more amenable to virtualization.
But again, it makes as few changes as possible, to make the job of porting the guest OSes easy. And then here are all my guests. So here's a Linux guest. Here's a BSD guest. Here's an XP guest. As part of those, we have Xen-aware software. I've had to make changes to port them to Xen. Over here, what I have is the control plane software. So this is the machine, quote unquote, that you would log into to do things like create new virtual machines. Does this make sense? This is kind of a nice idea. They've essentially moved some of the control plane stuff out of the hypervisor itself, which makes the hypervisor smaller and simpler, while still providing you with an interface to control virtual machine creation and things like this. Any questions at this point? Yeah, Steve? Well, it is a virtual machine. It's running inside one of the Xen domains, as they're called. You could. Yeah, I don't think people usually do. I think usually it's a pretty stripped down thing. The goal there is just to kind of stay out of the way. You also want something that you're not going to mess up, because if you somehow affect the operation of that system, you're losing some of the functionality that you need. It's a good question. Like you can see here, this is implemented in Linux. So Linux is what provides the support for the tool chain that allows them to configure the rest of the machine.

So this is the summary of the changes. This is just straight from the paper: a summary of the changes that Xen made to the traditional underlying x86 interface. If you look at the paper, a lot of the time is spent up here, and we'll talk about this in a little bit more detail. A lot of the complexity and challenge here has to do with allowing the guest operating system to manage memory on the x86 architecture.
And that has to do with a couple of features of the x86 memory management architecture that we haven't talked about yet, but we'll talk about in a sec. Over here, so this is quite interesting. Like the relationship between the application and the operating system, Xen has to run at a higher privilege level than the guest operating system. So the hypervisor, or Xen, has to run at a higher privilege level than the guest operating system, and the guest operating system has to run at a higher privilege level than the applications. Seems a little confusing, but we'll come back to this. This was one of the brilliant things about Xen: to some degree, the x86 was uncooperative, but in other ways, the x86 had these latent features that turned out to come in handy. System calls, so this is kind of interesting. Remember when we talked about the virtual machine monitor? Every time the guest application made a system call, I had to go into the host operating system. In this case, this would be the hypervisor. Then the host operating system had to take that and hand it over to the virtual machine monitor, which then passed it into the guest operating system. That was kind of a mess. There are a bunch of different boundary crossings that have to take place. Instead, what Xen does to improve performance is let the guest operating system install its own system call handlers when it runs. So when I switch the machine over to a particular virtual machine to run, that guest can install its own system call handlers, and those allow system calls made by guest applications to vector directly into the guest operating system. Does that make sense? So that avoids the trip through the hypervisor, or the virtual machine monitor, as we were calling it on Monday.
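To make the system call path concrete, here is a minimal toy sketch in Python. It is not Xen's actual interface: the class, the method names, and the `entries` counter are all hypothetical, made up for illustration. The only idea it models is the one described above: the guest registers its handler once, the hypervisor loads it when that guest is switched in, and after that application system calls reach the guest kernel without entering the hypervisor.

```python
class Hypervisor:
    """Toy model; all names here are hypothetical, not Xen's real API."""

    def __init__(self):
        self.registered = {}         # guest name -> that guest's syscall handler
        self.hardware_vector = None  # what the CPU jumps to on a system call
        self.entries = 0             # how many times the hypervisor was entered

    def register_syscall_handler(self, guest, handler):
        self.entries += 1            # registering is itself a call into the hypervisor
        self.registered[guest] = handler

    def switch_to(self, guest):
        self.entries += 1            # switching guests runs hypervisor code
        self.hardware_vector = self.registered[guest]


def make_guest_kernel(name):
    """Build a stand-in for a guest kernel's system call entry point."""
    def handler(number, arg):
        return f"{name} handled syscall {number}({arg})"
    return handler


def application_syscall(hv, number, arg):
    # Models the hardware dispatch: jump straight to the installed handler.
    # Note that no Hypervisor method is called here.
    return hv.hardware_vector(number, arg)


hv = Hypervisor()
hv.register_syscall_handler("linux", make_guest_kernel("linux"))
hv.register_syscall_handler("bsd", make_guest_kernel("bsd"))

hv.switch_to("linux")
entries_before = hv.entries
results = [application_syscall(hv, 4, "hello") for _ in range(3)]
entries_after = hv.entries           # unchanged: three syscalls, zero hypervisor entries

hv.switch_to("bsd")
bsd_result = application_syscall(hv, 1, 0)
```

The thing to look at is that `entries_before` and `entries_after` are equal. The cost of involving the hypervisor is paid at registration and at guest switch time, not on every system call.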
Again, a lot of this stuff seems like, of course that's the right way to do it, but it's made possible by the fact that I'm changing the guest operating system. That's critical to keep in mind. These are fantastic features, and in a lot of ways they're the right ways to do these sorts of things, but they're not possible with an unmodified operating system in most cases. I have to change the interface slightly. Here's another example of a place where I'm providing more information than would typically be available to an operating system running in a fully virtualized environment. I can provide two notions of time. One notion of time is real wall clock time. The other notion of time is what's called virtual time. What's the difference? Anyone think they can explain the difference between these two time bases that my hypervisor is going to provide? Steve, want to try again? Right, so okay, I'll give you a hint. Real time is always ticking. It's real wall clock time. What is virtual time? Yeah, you're close. So virtual time is the amount of time that that guest has been running. So remember, I've got one CPU on this machine, and what is their goal here? They state this very early on in the paper: their goal is to run how many guests? 100. They want to take one 1990s-size machine and run 100 guest operating systems inside of it, in Xen, so that's pretty cool. What that means is that the operating system here, and hopefully this was apparent on Monday too, is like another application. There's a section of the paper where they talk about how Xen does CPU scheduling, how the Xen hypervisor is switching between different operating systems. So let me just go back to this. If I had this environment, and let's say I have one CPU, what would be happening is that I let the Linux guest run for a bit of time, and I let the Windows guest run for a bit of time, then the BSD guest, and I'm doing this constantly. I'm duty cycling.
I'm context switching, quote unquote, between the operating systems themselves. So when the operating system is running, its virtual clock is ticking, but when it's not running, that virtual clock stops. So the virtual clock measures the amount of time that the operating system has been running. The reason that it's important to have both of these is because in certain cases, again, if I'm communicating with another client and we're using TCP or something like that, that other client doesn't care that I'm on a virtualized system. When it expects a reply, it's measuring things in wall clock time. And so I need both. And then one of the things that was very interesting about Xen, and where Xen went, was that it turned out that paravirtualization was actually really nice for improving the device interface as well. So this is another place where they can make a bunch of improvements. Virtual devices are elegant and simple to access. And in some cases, they're hiding some of the complexity of the underlying physical devices here. Yeah, so that's a better question. I'm not exactly sure, right? I think there are certain cases where, well, okay, no, I am sure, right? So virtual time, what would be one use for virtual time? There's something that we talked about earlier in the semester where I'm probably going to measure time, and I want to measure it in virtual time. What's that? Make. Okay, no, that's fair, right? So yeah, if I want to do benchmarking, I want to see how long something ran, it's not make's fault that the whole machine got descheduled and some other machine ran. But what's another process that the operating system performs where it'd be really important to use virtual time? Yeah. Scheduling, of course, right? If I use real time in scheduling, then when my operating system gets descheduled, I'd come back in and I'm like, oh my gosh, your task has run way longer than it was supposed to.
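Here is a small sketch of the two clocks, in Python. Everything in it is an assumption made up for illustration (the `Guest` and `Hypervisor` classes, millisecond units, the slice lengths); it is not how Xen exposes time. It only shows the bookkeeping: real time advances whenever anything runs, and a guest's virtual time advances only while that guest runs.

```python
class Guest:
    """Hypothetical guest with its own virtual clock."""

    def __init__(self, name):
        self.name = name
        self.virtual_ms = 0   # advances only while this guest is running


class Hypervisor:
    """Hypothetical hypervisor that owns the wall clock."""

    def __init__(self):
        self.real_ms = 0      # wall clock time, always ticking

    def run(self, guest, slice_ms):
        # Running one guest advances real time for everyone,
        # but virtual time only for the guest that got the CPU.
        self.real_ms += slice_ms
        guest.virtual_ms += slice_ms


hv = Hypervisor()
linux, bsd = Guest("linux"), Guest("bsd")

# The Linux guest's scheduler notes when one of its threads starts running.
start_real, start_virtual = hv.real_ms, linux.virtual_ms

hv.run(linux, 10)   # the thread runs for 10 ms
hv.run(bsd, 30)     # the whole Linux guest is descheduled for 30 ms
hv.run(linux, 10)   # the thread runs for another 10 ms

charged_with_real = hv.real_ms - start_real              # 50 ms
charged_with_virtual = linux.virtual_ms - start_virtual  # 20 ms
```

Charging the thread using real time bills it for the 30 ms when BSD had the CPU, which is the "your task has run way longer than it was supposed to" problem. Charging it using virtual time bills it only for the 20 ms it actually ran.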
So that's the case where I have to use virtual time, right? Because remember, there are two levels of scheduling going on here. One level of scheduling is Xen scheduling the different virtual machines to run. The second level of scheduling is inside those virtual machines, where the guest operating system is scheduling threads, right? So that's the case where it's really important to use virtual time to get things right. That make sense? Yeah, that's a good question.

Okay, so let's talk about virtualizing memory. This is probably the most interesting, fun part of the paper. They point out in the paper that virtualizing memory is not particularly challenging if the memory architecture has a couple of features. One of the features is a software-managed TLB. This makes the interaction with the underlying hardware simpler, because the hardware is not doing things behind my back. A software-managed TLB is the type of system that you guys are programming for this assignment. You guys are controlling the TLB. The other feature is something called address space identifiers. This is a little more complicated to understand, but you guys are also dealing with this, maybe. If you guys are doing assignment three, and you're not using address space identifiers, and I suspect most of you are not, what do you have to do every time you change between different processes? Flush the TLB, okay? Now, this is interesting. In order to operate on a Xen system, the guest operating system is going to have to make calls to the hypervisor, I shouldn't say frequently, but from time to time. Normally, to get into the operating system, I make a system call. To get into the hypervisor, to get the hypervisor to do something for me, I make a hypercall. That's actually what they refer to them as. So hypercalls are kind of the equivalent.
Applications make system calls to get help from the operating system. In Xen, the guest operating system makes hypercalls to get help from the hypervisor. Now, the hypervisor needs to protect itself from the guest operating system, so the hypervisor has its own memory mappings. If every time I enter the hypervisor, I have to flush the TLB to make sure that there aren't any translations in there that I wouldn't want the guest operating system to see, that has a performance impact. And if the x86 had these things called address space identifiers, I could use them to make sure that the hypervisor can maintain its own memory mappings without flushing the TLB every time. It's a little bit arcane, but that's kind of why it is. And of course, they point out that x86 lacks both of these features. So address space identifiers, I think I just kind of explained. Does anyone know what the alternative to a software-managed TLB is? I don't think we talked about this this year in class. I think I just skipped it. If it's not software-managed, it's hardware-managed. So the TLB faults that you guys are handling, you handle them by looking up information in the page tables, and then reloading the TLB as needed. On the x86, they decided to get clever. When the x86 MMU hits a TLB fault, it looks up the information in the page tables itself. The MMU can walk the page tables looking for the page table entry. If the page is in memory, the MMU will load the entry automatically without interrupting the operating system. Does that make sense? So why would I do this, first of all? This sounds terrible. And there are some implications. The first implication is that the operating system has to set up the page tables in a way that the hardware understands, which is kind of interesting. But what's the reason to do this? Yeah, performance. So on a software-managed TLB system, every time there's a TLB miss, I have to enter the operating system.
The operating system has to figure it out. On a system with a hardware-managed TLB, if the page is in memory, I don't have to enter the operating system, and the MMU can do this quite quickly. Now, when do I have to enter the operating system? I still may have to trap into the operating system on a TLB fault on a hardware-managed system. When would I have to do that, Zach? If the page is not resident, right? So if a TLB fault becomes a page fault, either because the page hasn't been created yet or because it's on disk, then I trap into the operating system. So the x86 has a hardware-managed TLB, and that turns out to make things quite complicated. Again, if you want to see this in action, go Google the shadow page tables that are set up by things like VMware, and that'll really bend your mind a little bit.

Okay, so here's what Xen decided to do given these limitations. The first one is, guest operating systems are responsible for managing their hardware page tables. Remember, there's no management of the TLB that's done in software here. The way that you manage memory mappings on x86 is you manage the page tables. And what happens is the processor has an address for the currently running process that it uses to find the page tables when a TLB fault happens, to see if it can resolve it without interrupting the operating system. And there's minimal involvement from Xen. Now there has to be some involvement here, right? Because what's the danger with letting the guest operating system just manage the page tables without going through the hypervisor? What could it do? What's the property that I'm trying to create for the virtual machine that this would allow the guest operating system to violate, if I just let it make any change to the page table it wants? What's that? No, it is the kernel. Okay, so that's important, right? This is a kernel. I'm letting a guest operating system modify the hardware page tables without the hypervisor. This is unsafe because why?
Yeah. Yeah, I can modify other virtual machines. So at this point, there's some degree of metaphorical similarity between a virtual machine in this environment and an application. I don't let the application modify the page tables without the operating system's help, because it could see or modify other applications and other things on the system that it's not supposed to. Same thing here. I don't let the operating system make those changes without Xen's help, without the hypervisor's help, for the same reason. The second thing here, I find this to be an incredibly beautiful hack. They want to make sure that every time I make a hypercall to enter the hypervisor, I don't have to flush the TLB. So here's what they do to get that to work. They have Xen live, so the memory that's used by Xen, its code and its data and all that stuff, in the top 64 megabytes of every address space. So every address space on a system that uses Xen, and this requires a change to the guest operating systems, has to set aside this 64 megabyte section for Xen. You can read the paper if you want to see the details of how this is actually protected, because it's kind of interesting. But that part of every address space is where Xen lives, and it means that entering the hypervisor through a hypercall does not require flushing the TLB, because those addresses are mapped into every address space. Does that make sense? It's clever. So how do we make the system safe? The way we make the system safe is that we make sure that Xen sees all changes to the page tables. So where the underlying operating system would normally use a hardware interface to change the page tables, now we need to replace that with a hypercall. Every time it initializes a page table page, it registers that page with Xen. Oh, sorry, I forgot this, yeah.
So when it wants to create memory that's part of a page table, it has to register that page with Xen so that Xen knows that only Xen can now update it. And then in the future, if it wants to make changes to the page table, it has to make them using a new hypercall API. So this is one of the things that I can do because I'm modifying the guest operating system. And they actually have an interesting section where they talk about how they had to make changes to both Windows and Linux to get this to work. It turns out that Linux has preprocessor macros for modifying page table entries, which made it very easy for them to make these changes. Windows, on the other hand, did not, and apparently they suggest at some point that they wrote a script to modify the Windows source code to fix this. I suspect that that was disgusting. Anyway, because there were a bunch of different places where they were modifying the page table entries, rather than trying to fix all of them by hand, they probably wrote some nasty Perl script to do it. Okay, when you're writing scripts to modify your operating system code, you're in an interesting place.
Okay, so now every time I have to make a page table change, potentially I have to enter Xen and make a hypercall. And so they considered the overhead of doing this, and they have some ways to allow the operating system to batch updates. So they have some interfaces that allow the guest to say, I'm gonna make a bunch of updates, put them in this particular place, and hand them all to the hypervisor at once. Xen validates that they're safe and, if so, applies them to the page tables. Any questions about this? Where are we now? Yeah, okay. So that's the memory management stuff. Does that kind of make sense?
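To make the register-then-validate idea concrete, here is a minimal Python sketch of a toy hypervisor that checks a batch of page table updates before applying any of them. Everything in it is an assumption for illustration: the `ToyHypervisor` class, its method names, and the rule that a guest may only map frames it owns are simplified stand-ins, not Xen's actual hypercall interface or its full validation rules.

```python
from dataclasses import dataclass, field


@dataclass
class ToyHypervisor:
    """Toy model (hypothetical): tracks which machine frames each guest owns."""
    frame_owner: dict = field(default_factory=dict)        # frame number -> guest id
    page_table_frames: set = field(default_factory=set)    # frames registered as page tables
    page_tables: dict = field(default_factory=dict)        # (pt_frame, index) -> mapped frame

    def register_page_table(self, guest, pt_frame):
        # The guest hands over a frame it owns; from now on only the
        # hypervisor writes entries into it.
        if self.frame_owner.get(pt_frame) != guest:
            raise PermissionError("guest does not own this frame")
        self.page_table_frames.add(pt_frame)

    def _is_safe(self, guest, update):
        pt_frame, _index, target = update
        return (
            pt_frame in self.page_table_frames             # must be a registered page table
            and self.frame_owner.get(pt_frame) == guest    # ...belonging to this guest
            and self.frame_owner.get(target) == guest      # may only map its own memory
            and target not in self.page_table_frames       # toy rule: never map a page table
        )

    def update_batch(self, guest, updates):
        # One "hypercall" carries many updates: validate all, then apply all.
        for update in updates:
            if not self._is_safe(guest, update):
                raise PermissionError(f"unsafe update rejected: {update}")
        for pt_frame, index, target in updates:
            self.page_tables[(pt_frame, index)] = target
        return len(updates)


# Guest "A" owns frames 1-3; frame 9 belongs to guest "B".
hv = ToyHypervisor(frame_owner={1: "A", 2: "A", 3: "A", 9: "B"})
hv.register_page_table("A", 1)
applied = hv.update_batch("A", [(1, 0, 2), (1, 1, 3)])   # both targets are A's frames

try:
    hv.update_batch("A", [(1, 2, 9)])                    # tries to map B's frame
    rejected = False
except PermissionError:
    rejected = True
```

The batching is the point of `update_batch`: the guest pays for one entry into the hypervisor rather than one per entry changed, and because validation happens before anything is applied, a single bad entry leaves the page tables untouched.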
So what I've done, in summary, is I've replaced direct access by the guest operating system to the hardware page tables with a new interface supported by the hypervisor that allows the hypervisor to make sure those updates are safe. That's one of the benefits of the paravirtualization approach. I can make changes to the guest operating system.
Okay, so the story about the CPU is also kind of interesting. So let's try to get through that today. With the CPU, the problem here is that I need to make sure that the hypervisor is the most privileged entity on the system. The guest operating system should run with lower privilege and applications should run with even lower privilege. Now, in the world that I've sort of raised you in, where there are two privilege levels, clearly we have a problem. There was high and low privilege, and I've got three things now: the hypervisor, the guest operating system, and applications. I can't establish the ordering I want given only two privilege levels. So it turns out the x86 actually had a few helpful features here. And one of the helpful features that it had was this thing called privilege rings. Did we talk about this earlier in the semester? I don't remember. Yeah, so on the x86 it turns out that there are not just two privilege levels. There are four, and they refer to them as rings for some reason; you can think of them as levels. It doesn't matter. So for years and years and years, operating systems had really only been using two of them. They ran the operating system kernel inside ring zero, which is the most privileged, and they ran application code inside ring three, which is the least privileged. And then, I think they point out that, like, OS/2, has anyone even heard of OS/2? There we go, yeah, we have a real kernel dork in the crowd here. He's heard of OS/2. I don't even know what OS/2 was. I know it existed, right? I'm not exactly sure.
I think it was a competitor to DOS, right? Oh, even better. How many people have heard of AmigaOS? Okay, yeah, there we go. Couple more hands. So yeah, OS/2 I think is long dead and buried. But yeah, so OS/2 used these rings, but no one else was using them. However, it turns out to be kind of helpful. So what I do, did I get to this? Remember, I'm modifying the guest operating system. So I modify the guest operating system so that it can run inside ring one. And this is one of the ways that the hypervisor in Xen protects itself from the guest operating system. And this is possible, again, because I'm modifying the guest OS. So I take the guest OS, and I don't have to worry about it doing things that require ring zero privilege, because I've modified it to make sure that it doesn't need to do that. In cases where it needs to do something that requires ring zero privilege, what do I replace that with? So I'm going through my OS and I'm finding some part of the code that requires something that I can only do in ring zero. I replace that with what? System call? Hypercall, yeah. So I replace this with a call into the hypervisor. So the hypervisor has to provide those functionalities. The way the guest operating system accesses them is explicitly, by asking the hypervisor to do it, because the hypervisor runs in ring zero, yeah. Yeah, absolutely, yeah. So you would have, and I'm not sure this is true anymore because I think some of these changes have become more broad, a special version of your system that was modified to run on Xen. Because on a normal system, what happens? I try to make one of these hypercalls and it's like, I have no idea what you're trying to do. It just won't work. So this is almost similar to modifying an operating system to support a different machine architecture.
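The ring arrangement can be sketched in a few lines of Python. This is a toy model under stated assumptions, not real x86 behavior or Xen's API: `ToyCPU`, `PrivilegeFault`, the single `disable_interrupts` operation, and the rule that only ring one may make hypercalls are all hypothetical names and simplifications chosen for illustration.

```python
HYPERVISOR_RING = 0      # most privileged: the hypervisor
GUEST_KERNEL_RING = 1    # the modified guest operating system
APPLICATION_RING = 3     # least privileged: application code


class PrivilegeFault(Exception):
    """Raised when code attempts something its ring does not permit."""


class ToyCPU:
    def __init__(self):
        self.ring = GUEST_KERNEL_RING
        self.interrupts_enabled = True

    def disable_interrupts(self):
        # Stands in for any operation the hardware only permits in ring zero.
        if self.ring != HYPERVISOR_RING:
            raise PrivilegeFault(f"not permitted in ring {self.ring}")
        self.interrupts_enabled = False

    def hypercall(self, name):
        # Controlled entry into ring zero. Toy rule: only the guest kernel
        # (ring one) may ask the hypervisor to act on its behalf.
        if self.ring != GUEST_KERNEL_RING:
            raise PrivilegeFault("hypercalls come from the guest kernel only")
        operations = {"disable_interrupts": self.disable_interrupts}
        if name not in operations:
            raise ValueError(f"unknown hypercall: {name}")
        self.ring = HYPERVISOR_RING
        try:
            operations[name]()
        finally:
            self.ring = GUEST_KERNEL_RING   # always drop back to ring one


cpu = ToyCPU()

# An unmodified kernel would attempt the privileged operation directly.
try:
    cpu.disable_interrupts()
    direct_attempt_faulted = False
except PrivilegeFault:
    direct_attempt_faulted = True

# The modified kernel asks the hypervisor to do it instead.
cpu.hypercall("disable_interrupts")
```

The direct attempt faults because the guest kernel is in ring one, while the hypercall succeeds because the operation runs only after the switch to ring zero. That switch is the single place where the hypervisor gets to decide what a guest is allowed to do.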
So like when a new instruction set comes out or a new system comes out, I make some small modifications to support that machine, and I have a new version of the operating system, and I have some architecture-dependent code that's part of the OS, right? Does that make sense? Yeah, I think they actually did just add Xen as a new potential architectural target when you build things like that.
Okay, so this was kind of interesting here too, right? So how do you improve performance? You go after the performance problem. This is something we'll talk about next week when we talk a little bit about benchmarking and optimization. So they looked at what happens a lot, the types of things that could potentially cause performance problems: page faults, as we know, and system calls. And so for the system calls, they allow the guest operating systems to install this fast handler, right? Which essentially allows system calls to land immediately inside the guest operating system while it's running. So one of the things that would happen is when I switch from one virtual machine to another, the new virtual machine's guest operating system has the opportunity to install these handlers, and then off I go. And any system calls that happen during that time can be handed straight to that guest operating system. Any questions about this? We're almost done.
Yeah, so just to finish up, I want to bring out some of the differences here, right? Because I think it's really an interesting design argument. Because remember, our goal with full virtualization was that we weren't going to change the guest operating system. That was the goal. But it turned out, and if you look at essentially what VMware and VirtualBox and other approaches like that do, they have to. At least they had to 20 years ago; maybe this is a little bit different now.
Because particularly on the x86, the guest does stuff that's unsafe, and it tries to run these instructions that it's not supposed to, and they don't work. And so it was this great vision that I was not gonna have to change the guest operating system. But it turned out I had to do it at runtime using these binary rewriting techniques that were terrible, just really nasty and gross. And so to some degree, I had already lost, right? I didn't achieve my goal. If you look at what these other full virtualization approaches do, they cannot run an unmodified operating system because of problems with x86. So I think the beautiful argument behind paravirtualization is, if we're gonna make those changes, don't make them in some totally ugly, hacky way at runtime where you have to rewrite the binary code. Just make them in the source code itself, right? Fix the operating system in some small ways to allow it to be more compatible with the underlying system, and that's the right approach.
Yeah, Steve. Yeah. I will be happy to look into this and get back to you. I think so, right? There's one that comes to mind immediately, but I don't wanna say it in case it's wrong, because I don't want to give you guys the wrong impression. But approaches like this, right? So for example, VMware now has their own paravirtualized solutions, right? So this approach is essentially how server virtualization is done today, right? I'm not sure whether it's exactly Xen or whether it's other products and other systems that have been built later, but the Xen approach of paravirtualization is really the right way, and that's how people do this, yeah. Yeah, I can look into that, but I think flavors and variants of Xen sort of split off and went their separate ways, right? But I'll look into this and get back to you. All right, any other questions about paravirtualization? I won't bore you with the details of the results.
I mean, you can look at it, you know, the précis is that it works. Overhead is pretty low. The porting effort to make these changes is a one-time cost and it's fairly minimal. And so that's paravirtualization. All right, so I will see you guys on Friday, at which point we will talk about something, I'm not sure what yet, yeah. We'll figure it out. See you then.