I'm not really depressed yet. I'll just put you in a class mood, you know? Be sad about other things. We'll be excited about virtualization. So today we're going to start a week-long unit about virtualization, which is cool. It's a topic that seems appropriate to cover in an operating systems class. A lot of the way that we run operating systems today is in virtualized environments. And virtualization is another technology, another place where there's a neat story about sort of a co-evolution of hardware and software. So hardware requirements, software capabilities, and sort of the mixture of the two. Outside this class, I mean, how many people feel like they use some sort of virtualization technology on a regular basis? OK, what do you guys do? VirtualBox for what? There is no why, right? There's just VirtualBox. That's fine. Like just playing around. I mean, what's that? For legal torrenting? Right, exactly. You always want to do your legal torrenting inside VirtualBox, right? It's not like a security feature? For your legal torrenting? Right, exactly. You always want to be careful when you're doing your legal torrenting, yeah? That's true. It's pretty boring. We tried that. It just sits there for a while. It is very hard to do rm -rf. Like at some point, I don't know. I wonder how long it took them to add that option. It was like a joke for so long. Like, oh, it actually would be rm -rf /. And then they were like, let's add 10 different options that you have to add to rm to get it to actually do that, because that would be useful. Yeah, so I mean, virtualization technology is out there. Does anyone use container virtualization, like Docker or stuff like that? Anyone use EC2 for anything? You should. There are free accounts. So those are kind of like the three major virtualization technologies we're going to talk about. Assignment 3.3 is out. It's due on 5/5. And I'm just going to really have a hard time giving up on this 5/5 at 5 deadline. I don't know. 
Maybe there is another week in the semester, but who cares? It's just too nice. It's too poetic. There's also, like I pointed out, late credit that we're giving for assignment 3.2. So if you're still working, keep submitting. OK. Finally, there's a paper to read for Wednesday. So today, I'm going to present one approach to virtualization. On Wednesday, we're going to look at the second approach to virtualization. And this one is one that largely came out of academic work. So please read this paper. I think it's from the early 2000s: Xen and the Art of Virtualization. It's a great title. It's a very cool technology. It's something that's out there in the world. So this is another paper that gave birth to a multimillion dollar industry. If you've used EC2 and a variety of other types of server-side virtualization technologies, you've used things that started with Xen. OK. So what is virtualization? So up until this point, we've been talking about how operating systems interact with an actual physical machine. So a physical machine is a collection of real hardware resources that the operating system has exclusive access to through hardware interfaces. This is the way that the operating system interacts with hardware. Now, what's cool is that, as you see when you boot something inside VirtualBox, I can also boot operating systems inside a virtual machine. So we've been talking about how operating systems run on real physical machines. Now we're going to start talking about ways that we can get them to run inside virtual machines. And this is probably something you guys have noticed as you configure VirtualBox. So when you configure a VirtualBox machine, you're actually configuring hardware features. How many CPUs should the virtual machine pretend to have? How much memory should the virtual machine pretend to have? How many disks and things like this? 
And so the story this week is about changes to the operating system and clever software tricks that are required to get this to work. So just some terminology to start out, because now things get a little bit more confusing, particularly today when we're talking about full virtualization. Because now we have two operating systems. We have a guest operating system that runs inside the virtual machine and the host operating system as well. Did I see a question? OK. So now virtual machines are different from physical machines. Virtual machines, again, if you've configured a VirtualBox virtual machine, you've noticed that the virtual machine is sort of a subset of the available hardware. I can make sure that the host, sorry, that the guest operating system doesn't have full access to the actual underlying hardware. If I configure my VirtualBox machine to have only 1 gigabyte of memory on a 4 gigabyte system, that's all that the guest operating system should be able to use. And the other thing that's really important is I don't want the guest operating system to be able to break containment. So the guest operating system should not be able to use its features in order to see memory that I haven't allocated to it. That would be a big security problem. So we're going to talk about how this is done. But the big trick here is that in order to run inside this virtual machine, I have to take away some of the privileges that the operating system is used to having. Think about it. The operating system that runs inside that VirtualBox virtual machine is used to doing things like, for example, controlling the page tables and controlling the hardware virtual to physical address translations. If I allow the guest operating system to do that, what can it do? 
Let's say that I allow the guest operating system full kernel privileges, and I allow it to rewrite the TLB or the equivalent on x86, and essentially allow it to control the translations between virtual and physical addresses. What can it do at that point? Yeah. Well, it's not that I want the... So in order to get virtualization to perform well, I want this user program, whether it's the virtual machine monitor or the guest operating system, to communicate with hardware directly as much as possible. What's the problem with allowing it to do things like control the virtual to physical address translations? Yeah. Or it can just see every page of memory. So if I allow it full control over these mappings, which it's used to having... keep in mind, the operating systems that we started to run inside virtualized environments were used to having full control of a machine. So they were used to being able to control all of the virtual to physical mappings. If I allow them to do that, they can see any page of memory on the entire system, including memory that's outside of the bounds that I've given them. So this is a huge security problem. And this is just one example of the ways that we have to take away some of the privileges that the operating system is used to having. So the virtual machine is an abstraction, and the piece of software that provides the virtual machine abstraction is something called a virtual machine monitor. So VMware and VirtualBox are examples of virtual machine monitors. This is a piece of software that creates the illusion of a virtual machine to the degree of fidelity that I can actually boot an operating system inside it. And it will run happily and think that it has access to this entire machine, which is really a subset of the entire machine. So that's our second piece of terminology. So we distinguish between the guest and the host operating system. The host operating system is running the virtual machine monitor as an application. 
The virtual machine monitor creates a virtual machine abstraction that allows a guest operating system to boot inside of that virtual machine. And then the guest operating system can itself run other applications. This makes sense. And this is sort of what's happening. So when you start VirtualBox and you boot up something like Windows, what's happening is that VirtualBox is running as an application and that application needs some special privileges. We'll come back to that later. That application is creating an environment in which Windows can run, and then Windows, which is running, thinking, oh, I'm running on a piece of hardware, mostly, can then boot applications. Yeah. Is it using some kind of virtuality? We'll come back to that. The tricks you have to do to get this to work on x86, because remember, x86 does not have a software-managed TLB, x86 has a hardware-managed TLB, are worth an entire separate lecture that I'm not going to give. But if you Google shadow page tables, you will find lots of documentation on how this is done. It's technical, it's sort of interesting, but it's not really worth talking about here. And so this should really drive home the fact that the operating system is just another program. If I provide it with the right environment, it's used to certain things. Remember, it's used to having these privileges that allow it to control the hardware, but it turns out that I can use another piece of software to provide it some sort of simulation of those privileges that allows it to actually run, to the point where I can boot other applications. So yeah, has anyone ever tried to run VirtualBox inside VirtualBox? Does it work? It's very slow, okay. But it works, so yeah, we could run Windows inside Linux inside macOS, right? Why not? Like life is short, we should do things like that to see if they work. 
Not what I would do, but yeah, so this is fascinating, like I can potentially create multiple levels of this. Okay, so we're gonna go back and talk about how, but the first thing we should do is talk about why. Why do this? So we've been talking about operating systems all semester and hopefully I've given you some sense that these are useful abstractions that the operating system is providing, and this is how things went on for several decades of computing history, but at some point people started working on this virtualization problem and they had good reasons for doing that. So what are some of the problems with operating system environments that we've discussed? So if I can't virtualize things, if I just have an operating system running on top of bare metal... because on some level, keep in mind, the environment that an operating system is providing to applications has its own set of abstractions. So why am I booting a whole other operating system inside an application? That doesn't seem to make sense, right? It should run an application. So why do this? What are some reasons for it? Yeah, yeah, but just buy another machine and run the other operating system. Like, this is a reasonable argument. Yeah, what else? Again, get another machine, boot it up there, and the worst thing I can do is take over that machine and maybe make smoke come out of it like some of you guys are doing with sys161. Yeah. What's that? I don't know, it's not that expensive. It's certainly cheaper than the amount of work that we put into virtualization, right? Let's put it that way. If you add up all the man hours people spend on this problem, yeah. Yeah, but why? I can have multiple applications that run on the system at the same time, and they have access to hardware, right? 
I mean, applications that run inside a traditional operating system have access to memory and the disk and the network and all these other things, and the operating system is already doing a very effective, we hope, job of multiplexing those resources. So this is already being done. So let's go through some of the reasons. So yes, it can be hard if for some reason you want to run another operating system on the same machine. Maybe there's some piece of software that's trapped on Windows or some piece of software that's trapped on Ubuntu, which is probably less common, although it probably happens. But this isn't like a normal use case. This is something for developers and people that are weird enough that they're like, I wanna try out another operating system. This is just not a normal thing that people do, right? Even people that have to manage computer environments, right? So this is a better reason. If I have a machine that's been configured, so I spent a lot of time and energy configuring settings on a particular machine so that a particular application or set of applications runs well, it's very hard to transfer those setups to another machine. Has anyone ever tried to do this by hand? It's like you have a web server setup and you have another machine where you wanna run the web server and you start sort of like setting up software, but you have new versions and the configuration files are in a different format. It just becomes a pain. So virtualization allows me to take an entire machine image. And you may have done this before. Have you guys booted up machine images that you downloaded online before? That just come pre-packaged? It's like we used to do this for this class, but it's not the ideal way to do it anymore. But yeah, I can take an entire machine and I can package it up to the point where it's identical and you can boot it up in your own virtual machine monitor. You have the same machine that I have. Yeah. 
Vagrant creates its own machine and configures it later, but that's true. The box that we gave you was sort of a starting point. That was created using this technology. So the way that we created the base starting point for your OS/161 environment, if you're using Vagrant, was that I booted up a Vagrant virtual machine, I installed some software, and then I ran some commands in Vagrant that caused it to create a box, and the box is essentially this. So it's a good point. Other reasons. So it's messy to adjust hardware resources. So this is probably one of the most interesting reasons related to cloud computing. So let's say I have a website and I put it on a piece of hardware. So I made a decision at that point to run a particular set of software on this physical machine. That physical machine might be too big, which is typically the case because I would typically overprovision things. I'd say, let's pretend that the ops class website is suddenly gonna be getting a million hits a second. That's something I dream about at night. So we'll put it on a machine with six terabytes of RAM just in case. And then it sits there getting like one hit per day. That's a huge waste of resources. So the difficulty of moving things back and forth made it hard to fix this. With virtual machines, you guys may have done this. You can just stop your VirtualBox instance, add more memory to it, and restart it, and the whole process takes, I don't know, a minute or two. And potentially I could actually do that while the operating system is running if it has the right tools inside to realize what's going on. If I'm completely fooling the operating system so that it thinks it's running on bare metal, that's not usually something that it will support, because it's weird if suddenly there's more memory in the system. It's like, no, there's not. Unless you stuck your hand in there while it was running and added RAM, which I don't recommend doing. 
I don't think it works either, and you might get shocked. And so static machine provisioning is tough because the requirements for different services that I'm gonna run inside a machine change. And this is a core driver of things like EC2: this flexibility that's not only expressed in terms of duplicating and replicating services across multiple machines, but also in the knobs I have in each single machine to change how many resources they use. So another problem that operating systems have is they don't necessarily completely isolate applications from each other. Operating systems try to provide a certain type of isolation. So I try to make sure, for example, one application can't crash another application. But that doesn't mean that one application can't make the machine really, really slow. And that affects a bunch of other applications. Operating systems leak a certain amount of information about what's going on on the system through different side channels, things like /proc and stuff like that. So if I'm really, really paranoid, it's possible. And people have done research on this for years in the security community. There's all these different ways that if two processes wanna communicate without letting anyone else know, they can do these weird things and they can communicate with each other in ways that are completely invisible to the operating system. So they can collude. Multiple software packages might require different versions of libraries. And hopefully some of these things have been solved by saner development environments now, but this is still a problem. This used to happen all the time before you had things like apt-get. When I was a little kid setting up software on my Linux box, everything came in a tarball. I know, you guys are looking at me like, what's a tarball? So this is how you set things up. You got a tarball, you unpacked that tarball, there was a configure script. If you were lucky, you ran that configure script. 
That configure script generated 60,000 errors. You spent the next three weeks fixing all those problems. And then maybe if you were lucky, you could actually get it to compile. You actually had to compile things on your own machine. This is true. This doesn't happen anymore. Has anyone ever set up a piece of software this way? Oh gosh, okay, well anyway, clearly we haven't fully reached the future yet. But when you run things like apt-get, it just works, right? Cause we figured out how to handle all these dependencies. But what used to happen is, again, once you got the configure script to run, then you had to run make. Maybe if you were lucky, make completed. But most of the time it didn't, due to some sort of completely bizarre error that you had no chance of fixing. And that's what happened, right? I mean, I remember getting so deep into trying to set things up. I was like, what was I trying to do again? What was the point? I'm trying to install some library. It's like, oh, I just wanted to listen to an MP3. Okay, there's probably an easier way to do that. Certain applications can have very specific requirements. So this is something else that, if you look at the Xen paper, they use as a driver in this space. So there were actually cases where software vendors at the time would refuse to support certain software packages if they were run on a machine that ran anything else. So if you want support for Microsoft SQL Server, you know, the first thing that they would ask you is, do you have any other applications running on the same machine? If the answer is yes, then we don't make any guarantees about performance. And so, you know, it was like, okay, well, I have to buy a special machine to run Windows in order to run this one application. And at that point, a lot of the goals of the operating system are lost. I'm not running multiple applications. I have a whole machine that's just tailor-made for doing one thing. 
And of course, that goes back to the previous slide where now I have to make static upfront choices about the resources allocated to this machine and hope that they're appropriate over time. Okay. So these are some of the nice features of virtualization. These come basically from the slides that we just looked at. So I can ship entire environments around. Let's say I want to take down a particular machine. I can take all the virtual machines off that, ship them over to another machine, boot them again, you know, exactly where they were when they left off in exactly the identical configuration, and off they go. And there's some downtime associated with this potentially, but maybe not that much. If you look at how EC2 works, this is very exciting. So I can buy a big beefy machine and I can carve it up into little pieces. And those pieces can be reconfigured over time. So, you know, all the servers that I maintain are virtualized servers. So, you know, old versions of the class website, we kept them around for a couple years and at some point you hit delete and they're gone. It's like you deleted a machine from the earth and now I have some spare resources on that server. I can create a new virtual machine, bring up a new configuration, use it to do something else. It's actually really nice. It sort of turns the process of hardware configuration, which before required buying stuff and moving RAM, into software. Imagine moving RAM from one machine to another in the past. I have to shut them down, take them apart, pull a stick of RAM out, hope it's the right version to go in the other machine. And now I can do it, you know, using a software interface and a couple of clicks. So that's pretty cool. You know, so this goes with the first one, replicating an entire machine image. 
So when I have a machine image, I get the software that's running, I get the OS version, I get all the libraries, I get all the configuration, I get everything. And if you've used things like Docker, you kind of see where this goes because that's part of what Docker is trying to do with container virtualization. I don't necessarily need to duplicate an entire virtual machine, an entire machine, to get this. We'll come back and talk about this on Friday. So, you know, you guys may feel like virtualization is a new technology, but the earliest reference I could find to this was 1974. So two systems researchers came up with requirements for the piece of software that runs virtual machines, the virtual machine monitor. So there's three things that are required. The first requirement is that the software that runs inside the virtual machine, which is typically an operating system, should run the same way that it would run on real hardware. So it should not change in a big fundamental way how that software works. The second requirement is performance. So this is what distinguishes virtual machines from simulators. sys161, which you guys use to run your assignments, is a simulator. sys161 is not a virtual machine. It's not actually executing those instructions on your processor. It's, you know, using the processor to simulate this old, ancient MIPS R3000 processor. Simulators are much slower. That's sort of the trade-off. With sys161, we don't care because the machine is so simple, so usually it can run pretty close to line rate. Finally, safety. So the VMM should manage all hardware resources and not allow the guest operating system to pierce, that's sometimes the word that you see here, to pierce the virtual machine. So if I allocate a certain amount of memory to the virtual machine, any software that runs inside of it should see that memory and nothing else. Same thing with files and other things. Okay. 
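The safety requirement can be pictured with a small sketch. This is a toy model, not how VirtualBox, VMware, or Xen actually track memory: the `ToyVMM` class, its method names, and the frame counts are all hypothetical. The only point it illustrates is that the monitor, not the guest, owns the mapping from guest memory to host memory, so a guest can never name a frame it was not given.

```python
class ContainmentError(Exception):
    """Raised when a guest tries to reach memory outside its allocation."""


class ToyVMM:
    """Hypothetical monitor that owns the guest-frame to host-frame mapping."""

    def __init__(self, host_frames):
        self.free = list(range(host_frames))  # host frames not yet handed out
        self.maps = {}  # guest name -> host frames, indexed by guest frame number

    def create_guest(self, name, frames):
        if frames > len(self.free):
            raise ValueError("not enough host memory")
        self.maps[name] = [self.free.pop(0) for _ in range(frames)]

    def translate(self, name, guest_frame):
        """Every guest access goes through this check; the guest cannot bypass it."""
        frames = self.maps[name]
        if not 0 <= guest_frame < len(frames):
            raise ContainmentError(f"{name} touched guest frame {guest_frame}")
        return frames[guest_frame]


vmm = ToyVMM(host_frames=4)
vmm.create_guest("guest_a", 1)   # like giving a VM 1 GB on a 4 GB host
vmm.create_guest("guest_b", 2)
print(vmm.translate("guest_b", 1))  # a frame inside guest_b's allocation
try:
    vmm.translate("guest_a", 3)     # guest_a was only given one frame
except ContainmentError as err:
    print("blocked:", err)
```

Each guest sees its own frames numbered from zero, which is why it can believe it has a whole (small) machine to itself.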
So there's three approaches to virtualization that we're gonna talk about. So today I'm gonna sort of talk about full virtualization. So full virtualization was probably one of the earliest virtualization technologies. And the goal was to run a complete unmodified operating system inside the virtual machine monitor. And full virtualization was a technology that was largely pushed by VMware, and VirtualBox caught up at some point. So if you've used those pieces of software on a desktop machine... we have to make sort of a distinction here. VMware also sells paravirtualization software as well, which is the next category. But if you've used VMware's client to run an operating system inside your desktop computer, you've used full virtualization. Paravirtualization. So on Wednesday we'll talk about Xen. And the Xen argument was kind of cool. Xen said, look, it turns out to be really hard to run complete unmodified operating systems inside the virtual machine monitor. There's all these pretty nasty tricks that we have to play. What if we did the following? What if we said let's just change the guest operating system slightly? Let's make some small changes to it. And it turns out those small changes produce big wins in terms of how the rest of the virtualization approach works. So this we'll talk about on Wednesday. This is like Amazon EC2. If you've used Amazon EC2, this is what they're doing. And again, VMware also has a solution in that space. And then finally on Friday, or maybe on Monday if we spill over, we'll talk about container virtualization. So this is a technique that's been made popular by Docker. And here the idea is actually not to run... the term virtual machine monitor here is really a misnomer. What I'm doing is I'm creating a namespace inside the operating system that is completely isolated from the rest of the operating system. So the virtualized environment in things like Docker uses the same operating system as the rest of the system. 
But if I'm clever about how I name things, I can actually make it look like it's running in a completely separate machine. OK, so today we're going to talk about full virtualization. So the goal here is to run a complete unmodified operating system inside the virtual machine monitor, like VMware. OK, so what's hard about this? Going back to what we talked about before, what makes this difficult? I want to run an operating system next to another operating system. So I have a host OS that's already installed on the machine. It's going to run an application called the virtual machine monitor, be that VMware or VirtualBox. And I want to run an operating system inside that. An unmodified operating system. Why is this difficult? Yeah. So here's the core problem. And this is a nice way to review stuff that we talked about earlier in the semester. Remember we talked about privilege. The privilege that the host operating system has is that it gets to handle traps, interrupts, exceptions. System calls, hardware interrupts, other types of exceptional conditions are routed to the host operating system to be handled. And then the host operating system at that point uses the elevated privilege level that comes along with those handlers and its own software to handle whatever is going on. So when I have a system call, for example, the system call gets vectored to the host operating system. The host operating system uses its special powers to do whatever the application wants and needs done at that point. You guys remember this? So if I start running a guest operating system, where are the traps going to go? Traps that are generated by applications running inside the guest operating system, where are they going to end up? They're going to end up in the host OS's trap handling code. So that's one challenge. So I need to figure out a way to deal with these traps that are going to jump out of the guest operating system into the host operating system. 
The host operating system just needs to know how to vector those back into the guest operating system to be handled. Does that make sense? So if I'm running Windows inside of Linux, Windows applications will cause a trap. They want help. They're basically doing a Windows equivalent of a system call, but that lands in the Linux system call handling path, which is not going to help you handle a Windows system call. So at some point, I need to get that trap back inside the virtual machine and into the Windows code that's going to handle it. Even more problematic is that the guest operating system is going to try to execute privileged instructions. Remember, when I boot up my guest operating system inside the virtual machine monitor, what does it assume? What's typically true about the state of the world during boot? What's that? Well, we talked about the fact that when I have trap handling code, typically, that raises the privilege level. But how do operating systems get started when they boot? They typically boot in what mode? Read-write mode. I like that. Privileged mode. I mean, it's a version of read-write mode. Typically, the boot path, when I start executing the operating system bootstrapping code, is running with kernel privilege, which it then drops before it starts running applications. But that privilege is required for it to do things like touch memory that applications can't touch, touch hardware that applications aren't allowed to use, blah, blah, blah. When I start booting the guest operating system inside the virtual machine monitor, what's different? Or what should be different? Can I run it with privilege? What happens if I try to run it with kernel privilege? Well, no, but let's just say that in some way, I just decide to let the guest operating system execute its boot code with full privileges. What is going to happen to the machine? Yeah. Yeah, it's going to be like, oh, I'm in charge here. 
You decided to boot me with privilege, so I'm going to start setting up the page tables and set up my own data structures. I don't know what would happen. It would be not good. The host operating system needs to keep control over the machine when this is happening. Remember, the guest operating system is running inside a virtual machine, which is an application. It's not supposed to be able to jump out of there and take over the machine. So I need to run it without privilege. But then what happens? It starts executing all these privileged instructions because it thinks that it's supposed to be able to. So these boot paths that were set up on the unmodified operating system that I'm trying to run... oh, yes, okay, sorry. I thought my thing fell over. Okay, here we go, thanks. The boot paths expect that they're going to have privileged access. And so as soon as they start executing privileged instructions, if I didn't have special support inside the virtual machine monitor, what would happen? Let's say that the guest operating system starts running inside the virtual machine monitor and it executes an instruction that normally requires kernel privilege to execute. What would the host operating system normally do? Let's say an application tries to write to the TLB, or the equivalent on x86. What do I normally do? Right? But in this case, I can't do that. I have to do something different in order for the guest operating system to work, because the guest operating system needs to be able to manipulate the virtual hardware without having an effect on the hardware that it doesn't have access to. Okay, so we just talked about this. So if I run it with kernel privileges, then I provide the guest access to the entire machine. So this violates my safety constraint. If I run it with user privileges, I need to figure out what to do about these privileged instructions. So ideally, privileged instructions will do the following. 
So when I try to execute a privileged instruction, the CPU will trap the instruction due to a privilege violation, right? I'm trying to do something that I don't have permission to do. I'm trying to run an instruction that I need kernel privilege to run and I don't have it. And so this generates a CPU exception. So now I'm inside the host operating system. The host operating system needs to vector this back into the virtual machine monitor. So this is the first thing that has to happen. The host operating system needs to recognize that this is kind of a special case here of an application that actually needs to be able to generate certain types of privilege exceptions. And those privilege exceptions need to be sent to this piece of software called a virtual machine monitor, which is providing a virtual machine, to handle. So the virtual machine monitor will look at the exception and try to figure out what the guest is doing. And if what it is doing is legitimate, it'll update certain hardware state that's maintained by the virtual machine monitor. And then the guest OS can continue. So this is what I want to happen. This is also known as trap, because that's the first thing that happens, and emulate. The emulation happens inside the virtual machine monitor. So technically it's trap, vector into the virtual machine monitor, and then emulate. Does this make sense? This is the goal. This is what I want to happen. If an instruction or an entire instruction set has this property, we refer to it as being classically virtualizable. And yeah, I just said that. Okay. So now what happens with traps that occur inside the virtual machine? So this gets complicated because there are two types of traps that can occur inside the virtual machine. One is generated by an application. The second is generated by the operating system. The guest operating system. Does that make sense? There is like an extra level of terminology that I've just introduced. 
I remember sitting through the virtualization lectures 10 years ago when I took OS, and I had no idea what was going on. So hopefully you have some idea, because it is sort of confusing. If the trap is caused by an application, the trap needs to be handled by the guest operating system. Does that make sense? So it's like a system call. I have a Windows program running that's making a Windows system call, and somehow that trap needs to get back to the Windows code that needs to run and handle it. Make sense? Okay. If the trap is caused by the guest operating system, then the virtual machine monitor needs to handle that trap and adjust the state of the virtual machine. So this is gonna be something like, it writes to a virtual disk, for example, or it wants to change the virtual page tables, sorry, the virtual MMU data structures. I wanna create new memory mappings or change a virtual to physical mapping inside the virtual machine. So this is the reason that when you install VirtualBox, like on Windows, everything gets gray and scary. It's like, oh, it's gonna harm your computer. You need to give it some special privileges. This application usually needs some sort of kernel driver support to allow it to behave differently than other applications because, again, if a normal application did what's going on inside the virtual machine monitor, it would just be killed. Virtual machine monitors require some special operating system support in the host operating system. Remember, these are not modifications to the guest. These are modifications to the host. So that's okay, that's valid. Tim's holding his head here. He's like, what is going on? Yeah, so, and this is to make sure that traps that happen inside the virtual machine get handed back to the virtual machine monitor. So this is a special case of an application that is treated a little bit differently than normal applications would be.
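To make the two kinds of traps concrete, here is a minimal sketch of the routing just described. It is a toy model, not how any real virtual machine monitor is written: the names Trap, VirtualMachineMonitor, host_os_trap_entry, and the "tlb_write" trap kind are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Trap:
    source: str  # hypothetical labels: "guest_app" or "guest_os"
    kind: str    # hypothetical labels: "syscall", "tlb_write", ...


@dataclass
class VirtualMachineMonitor:
    # Virtual hardware state that the VMM maintains on behalf of the guest.
    virtual_tlb: dict = field(default_factory=dict)
    log: list = field(default_factory=list)

    def handle(self, trap, payload=None):
        if trap.source == "guest_app":
            # A trap from a guest application (a system call, say) has to be
            # handled by the guest operating system's own handler.
            self.log.append(f"forward {trap.kind} to guest OS handler")
            return "guest_os_handler"
        # A privileged instruction from the guest OS: emulate it against the
        # virtual hardware instead of touching the real hardware.
        if trap.kind == "tlb_write":
            virtual_page, physical_page = payload
            self.virtual_tlb[virtual_page] = physical_page
        self.log.append(f"emulate {trap.kind}")
        return "resume_guest_os"


def host_os_trap_entry(vmm, trap, payload=None):
    # Every trap lands in the host operating system first, which then
    # vectors it to the VMM rather than killing the process.
    return vmm.handle(trap, payload)
```

In this sketch, a guest application's system call comes back as "guest_os_handler", while a guest OS TLB write only updates the VMM's virtual_tlb dictionary and then lets the guest continue.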
Oh, sorry. Okay. Now, here's what makes this virtualization as opposed to simulation. Most of the time that the virtual machine is running, it's executing non-privileged instructions. So things like, for example, adds and subtracts and memory accesses to areas of memory that I've already had mapped, those proceed normally. So this is what makes virtualization perform so well. Has anyone ever run a system simulator and tried to boot an operating system in it? There are these full system simulators. So for example, there's a program called gem5. You can download gem5, you can run it on your computer, and you can boot up Android that's running on ARM inside x86. It is slow as molasses. It takes like 24 hours to just boot Android. And the reason is that it's not running on native hardware. Every time it executes a quote, unquote instruction, there's all this code that has to run to emulate the instruction. Here, the performance of the VM is good because most of the time it's using the actual hardware of the system. What does that imply? What has to be true about the guest operating system and the host operating system? Yeah, exactly. They have to be designed to use the same instruction set. I can't take an operating system that's compiled for ARM and run it inside an x86 virtual machine. It does not work. The CPU is shared between the virtual machine and the real machine. And so if I need a different CPU architecture, I'm in trouble, I can't do it. This is also the reason that virtualization performs so well. I mean, a lot of times when you're inside VirtualBox interacting with the operating system, you don't necessarily even perceive a slowdown. And that's because most of the instructions are using the CPU directly and don't require any interaction with the host operating system or the virtual machine monitor. This is the best case.
Okay, so now let's think about what happens when a guest application, an application running inside the guest operating system, makes a system call. So what happens? I think we've sort of gone through this, but let's just review. Guest application makes a system call. So what's the first thing that happens? Where does the trap go first? Yes, it goes to the host operating system. This has to go to the host operating system. This is critical so that the host operating system maintains control of the machine. Goes to the host operating system. The host operating system will hand it to the virtual machine monitor because the virtual machine monitor has special permission to handle things like this. Virtual machine monitor looks at it. It says, okay, this is a system call. It knows where it needs to send control to handle the system call inside the guest operating system's code. So it's gonna jump to that point. The guest handles it there. When the guest operating system is done handling the system call, it's gonna run this instruction called return from exception. This is on MIPS, right? You can imagine something similar happening on x86. Return from exception is also privileged. So it's going to trap back to the host operating system. That's gonna get passed back to the VMM. The VMM's gonna see what's happening. It has to keep track of where the trap came from, and it hands control back to the guest application. Oops, sorry. It hands it back to the original process that originated this. So this is an extra level of indirection that's required every time I do something that requires the guest operating system. Does this make sense? Okay. Yeah, so TLB faults can be even worse, right? Okay, so a process inside the virtual machine creates a TLB fault. Trap to the host operating system. Hand the trap to the VMM. VMM has to inspect the trap and hand it to the guest operating system to figure out what to do.
The guest operating system is gonna try to load the TLB. What happens at that point? Trap back to the host operating system. This is a privileged instruction and I'm not running in privileged mode. So the host operating system has to hand this back to the VMM. Now, the nice thing here is every time this happens, the host operating system gets a chance, for example, to examine what the VMM is trying to do. So the host operating system can say, wait, wait, hold on. You're trying to load an entry into the TLB that's outside of the memory region that I've given you. No, no, no. So at that point, the VMM would crash, right? I would kill off the VMM. Yeah. Yeah. Yes, exactly. Yeah, the guest operating system. The guest operating system maintains its own page tables for memory inside of the guest operating system. Or whatever data structure it uses. No, remember, I need to give the process the ability to use the real TLB. So when the virtual machine monitor is running and applications are running inside of it, it is allowed to use the TLB, as long as its use is checked by the host operating system to make sure it's safe. Again, I cannot run things in any way that's even close to actual real line speed if I don't give you access to the TLB. So you have to have access to the MMU, but it has to be safe. Yeah, yeah. Well, I mean, it's like any other application, right? When it stops running, the guest operating system is going to clear it and use a different set of mappings for another process. Which is a little bit different in that it actually controls the TLB in ways that other processes don't. Yeah, good question. I don't know. Yeah, I mean, handling these hardware-specific features like that is tricky, right? There's usually some clever way to get it to work, right? But I don't know. I'd have to look at exactly what the semantics of that capability are. Yeah, yeah. I mean, when the, yeah, absolutely. Well, no, I mean, sorry.
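The safety check on TLB loads can be sketched roughly like this. This is only an illustration under assumptions: the function name host_check_tlb_load, the GuestKilled exception, and the 4096-byte page size are all hypothetical, and a real host would do this in its trap handler, not in Python.

```python
class GuestKilled(Exception):
    """Hypothetical signal that the host decided to kill the VMM process."""


def host_check_tlb_load(region_start, region_size, phys_addr, page_size=4096):
    """Allow a TLB load only if the whole physical page lies inside the
    memory region the host handed to this virtual machine."""
    region_end = region_start + region_size
    page_end = phys_addr + page_size
    if phys_addr < region_start or page_end > region_end:
        # The mapping points outside the region I gave you: no, no, no.
        raise GuestKilled(f"TLB load of {phys_addr:#x} is outside the guest region")
    return True
```

So with a guest region starting at 0x100000 that is 16 pages long, a load for physical address 0x101000 passes the check, and a load for 0x200000 raises GuestKilled.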
Yeah, let me be careful here. The physical addresses that are maintained by the page table entries, or whatever address-based data structures the guest operating system is using, are real physical addresses. And they have to point into the area of physical memory that I've given the guest operating system to use, right? And as guest applications are running, those get translated and that works out normally. Yeah, no. No, that would be quite slow. Yeah. Yes, yeah, like I said before, the VMM requires special drivers in the host operating system to work, right? I can make changes to the host operating system. That's within the rules. I cannot make changes to the guest operating system, but absolutely, I have to make changes to the host operating system. There's no other way to do it, right? Okay, yeah, and then I do this. Okay, so, I think I'm about to run out of time, so I'm gonna skip this. Yeah, the interesting thing about virtualization is, on some level, we should compare what we're trying to do to virtual memory. With virtual memory, the idea was that I'm virtualizing the load and store interface of memory. The virtual machine has to virtualize the entire hardware interface of the machine, and that's what's tricky, particularly instructions that are not safe. Things like just adding two things and putting the result in a register, that's not unsafe. So I don't need to virtualize that, I just let those instructions go. There are certain instructions on each version. Yeah, so, okay, right. Yes, and this is, so I'm gonna come back to this in a second. So trap and emulate is a beautiful strategy. Unfortunately, x86, along with all of its other problems, has this problem, which is that it is not a classically virtualizable instruction set. And I don't know why. Maybe they didn't know about those things back in the 1980s when they were designing x86, despite the fact that the paper was from '74, I don't know.
So the problem is, you might say, well, I mean, shouldn't privileged instructions always generate a trap if I try to run them? No, they don't. Some of them will just fail silently. That's awesome, you know, why not, you know? It's another option. Just don't do what the instruction said and then keep running other instructions. That's awesome, right? That's a great strategy. The other thing that some of them did was that they worked differently. So if I ran them in kernel mode, they did one thing, and if I ran them in user mode, they did something else. So again, like, sad, okay. And so what made this hard on x86 was dealing with this problem. So the instructions that trap properly are not that complicated to handle. I'm not saying that a VMM is a simple piece of software. It is not. There's a great deal of complexity that it has to handle in terms of figuring out how to re-vector traps around. And again, the x86 memory management architecture was also tricky to deal with. But one of the innovative features that VMware pioneered was this idea of dynamic binary translation. So at this point, I can't rely on these non-virtualizable instructions to trap and emulate properly because they don't. I know that. So what do I do instead? As the program is running, I basically pull the next set of instructions that's going to run and rewrite them dynamically to fix this problem. So if I see instructions that don't trap properly, I rewrite them with safer versions of those instructions. And I have to do this sort of constantly. Now once I rewrite a particular section, I can cache those translations. I don't have to do it again. But this is sort of the special sauce that VMware brought to early versions of their virtualization solution that allowed them to virtualize x86. Because x86 was tricky to get right because of these non-virtualizable instructions. Okay, we're pretty much done. Does anyone have any questions? Yeah, type two virtualization.
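The rewrite-and-cache idea behind dynamic binary translation can be sketched like this. It is a toy under loud assumptions: instructions are just strings, the Translator class and the "call_vmm_emulate" replacement are invented for illustration, and this is not how VMware's translator actually works. popf is used as the example because it is often cited as an x86 instruction that quietly behaves differently in user mode instead of trapping.

```python
# Assumption for illustration: the set of instructions that do not trap properly.
NON_TRAPPING = {"popf"}


class Translator:
    def __init__(self):
        self.cache = {}             # block address -> translated instructions
        self.translations_done = 0  # how many blocks were actually rewritten

    def translate_block(self, block_addr, instructions):
        # Once a section has been rewritten, reuse the cached translation.
        if block_addr in self.cache:
            return self.cache[block_addr]
        translated = []
        for instruction in instructions:
            opcode = instruction.split()[0]
            if opcode in NON_TRAPPING:
                # Swap in a safe version that explicitly enters the VMM
                # (the "call_vmm_emulate" mnemonic is hypothetical).
                translated.append(f"call_vmm_emulate {instruction}")
            else:
                # Safe instructions pass through untouched and run natively.
                translated.append(instruction)
        self.translations_done += 1
        self.cache[block_addr] = translated
        return translated
```

Translating the same block address twice only does the rewriting work once; the second call is served from the cache, which is why this does not cost as much as it sounds like it should.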
I don't know that type system, but maybe, what is type two virtualization? Yeah, no, no, Wednesday. So on Wednesday, we're gonna come back and talk about, so again, remember the design goal here: don't modify the guest operating system at all. Once I start allowing myself to do that, there's a whole other world of solutions that opens up. And Xen was the first example of somebody saying, you know what, some small modifications to the guest operating system are worth it in terms of the performance boost and some other features I get inside the VMM, or hypervisor in this case. Bunch of things, like this is a subject that you could teach an entire class on, so certainly I haven't talked about everything today. x86 turned out to be both a blessing and a curse. So x86 had these irritating features like these non-virtualizable instructions, but it also had these bizarre features that nobody had used for 30 years that turned out to be useful. So x86 has this notion of a privilege ring. Remember when we talk about kernel mode and user mode? It turns out that x86 actually has four modes. It's like really, really user, little bit of kernel, little bit more kernel, full kernel. And what it turned out you could do is you could actually figure out how to run, this is what we're gonna talk about on Wednesday, you figure out how to run the kernel inside the guest OS inside the second to last ring. So the way that traps work is that they trap into the other ring. So if I'm running in pure user mode, I first try to trap into user kernel mode and then user kernel kernel mode and then kernel kernel mode. And so this allows me to, if I'm careful about it and I can modify the operating system properly, which we'll do on Wednesday, I can run the guest operating system inside a ring that's less privileged than the host operating system. Wait, let me just get through the list.
Shadow page tables, again, look this up online, it'll blow your mind, it's very complicated. If you understand that, you understand x86 page tables. Memory traces, there's been a, so this is actually worth pointing out, there's been a lot of work since early versions of VMware to improve virtualization on x86 systems. Some of you guys had to turn these things on in your BIOS, remember, to get VirtualBox to work properly. So there's a lot of new hardware features that are designed to make virtualization easier. And there's lots of great ways to learn more about this. Okay, so Wednesday we're gonna talk about type two virtualization, or para-virtualization, as I've always called it. And read the Xen paper, because that will help things make a lot more sense. I will see you guys on Wednesday.