Good. All right, so this is new; I discovered it today. I don't need my laptop. That's nice. Pretty soon I'll just be able to walk in here sans everything. What's up? Magic. It's called the internet. See, look at that: the internet at work. You too can view these slides. Stop it. No, no one would do that. I don't know what's going on down here, though. Don't blame me; I can't figure out why that won't go away. That's some bug with Firefox. Why is Firefox installed anyway? I feel like I'm in the past.

All right, so today we're going to talk about virtualization. I lied on Friday. I don't do that often, but I thought we'd talk about virtualization first. As I've pointed out on Discourse, the rest of the class is sort of a grab bag. We definitely want to talk about virtualization because it's really interesting. There are some fun systems aspects. It's everywhere. It's certainly a part of the modern computing environment. You guys are using it if you're using our Vagrant virtual machine.

So, Assignment 3's second part is due Friday. Scott should have the targets up today so you can start submitting, certainly today or tomorrow. And yeah, so we're going to do virtualization first, and then on the Discourse forum, please feel free: if there are things you want to talk about, if you want to talk about hyperthreading, if you want to talk about whatever, feel free to add those and we'll try to slate them. I realize, not the last week, because the last Wednesday of class we will not have class; you guys can go see this visitor from Microsoft Research on Monday. So Monday and Friday, the Friday before, are probably the classes that are up in the air at this point as far as topical material. So if you would like me to cover something, whatever it is, data center computing, I mean, it has to be systems related. I don't want to do too much networking stuff. We have a class about that, so I don't want to steal their thunder. But if you want to talk about something related to systems, propose it and we'll think about it.

All right, and then finally, please look at the paper for Wednesday. Today we're going to talk about one approach to virtualization; on Wednesday we're going to talk about a different approach. This is another very widely used technology that actually sprang from the head of a research paper. So we'll read that paper and discuss it on Wednesday. Please read "Xen and the Art of Virtualization." Again, this is probably top tier in terms of papers that were influential and created an entire industry, and it certainly created an entire approach to virtualization.

All right, any questions about material we covered last week? Any questions about RAID? Lingering doubts? All the material that we present and cover in class, including the papers, is fair game for the final exam. Now, I'm not going to ask weird arcane details about parts of the papers or the systems that we didn't discuss in class, so don't worry about that. When you're reviewing for the exam you can look at the slides, but please also refer to the papers.

Okay, so who thinks they understand virtualization? Who uses virtualization on a regular basis? Okay. Let's continue this participatory exercise. How many people have an EC2 instance that they use? Okay, all your hands should be up. It's free. You can get a free micro instance for a year. I know EC2 is kind of scary. The interface is like flying a 747, and I always feel like if I push the wrong button I'm going to get a bill next month for a thousand dollars or something terrible, but it's not that bad.
You can certainly sign up get a free instance play around with it It's not super exciting at the end of all these Button pushings and other things what you have is like a Linux server that doesn't have anything installed on it But whatever, you know, it's a useful experience to get familiar with some of these tools So go out and play with that stuff And has anyone used Docker? Okay, yeah, anyone heard of Docker? Okay, so there's a lot of excitement in this space. There's a lot of new ideas in this space We make I might try to add a lecture on On OS virtualization a lot Docker if people are interested in that because that's something that's not covered By the current two lectures we do now. So feel free to vote for that So up until this point in the class right at this point We're gonna kind of try to change your perspective a little bit and we'll talk exactly about how powerful interfaces can be So we've been talking about machines running on real physical hardware. We talked about the software hardware boundary We're talking about our actual devices that have run instructions So these are real hardware resources the operating system has exclusive access to them So throughout the first couple decades of OS research This was the assumption the operating system was the piece of software that would multiplex these resources Build abstractions on top of them and in general manage all the resources on a specific physical machine and It does this by using these lower-level hardware interfaces directly Now What's interesting about this and hopefully what you guys have figured out as you've used tools like I mean How many people have used virtual box for something? Okay, good. 
So this is clearly something interesting going on here because I can set up virtual box And I can create this thing called a virtual machine and then inside the virtual machine I can run an entire operating system like for example windows And I can run it and I can you know I'm this is all being done in software clearly the performance is not terrible, which is interesting So how does this work? How is this possible and this is really the basis? This is the building block for a lot of what you guys are familiar with this sort of the modern experience of computing So like most of the websites that you visit most of the online tools that you use are now You know, there is no machine That there is sitting there running that website on bare metal instead what there is is a data center in Which the machine that that website is running on is itself an abstraction. It's it's it's virtual This is a new illusion that we're going to create and that machine is now mixed in maybe on one large machine With like dozens of other machines that are doing other things in fact The the website that is serving these slides is a virtual machine that runs on top of another machine with a bunch of other Stuff up not a bunch, but a few other things right that my lab maintains So this is useful. This is kind of cool. And this is really how computing works at this point. 
So at this point, we've fundamentally thrown out this relationship between a computer and a physical machine. The idea of a computer has become much more strongly associated, particularly on servers, with the idea of a virtual machine.

So let's get some terminology straight, because now we have a couple of different operating systems that we need to talk about. The virtual machine is the interface that the thing running inside it experiences. The piece of software that creates this virtual machine is sometimes known as a VMM, or virtual machine monitor. Inside the virtual machine, we have a guest operating system. And the operating system that runs the virtual machine monitor we sometimes refer to as the host operating system.

So to use the example of the Vagrant VM that's used for the class: in that case, the virtual machine monitor is what? What's the piece of software that provides the virtual machine abstraction? It turns out it's VirtualBox. Don't get confused: Vagrant is kind of just a fancy set of scripts that manipulate VirtualBox and some other VMMs. It could be VMware, it could be something else; in this case, we use VirtualBox. What's the guest operating system? Ubuntu Linux; I think it's 14.04, maybe. And the host operating system is whatever you run on your machine. For me it's Mac; maybe for you it's Windows or something else. It could be Ubuntu; you could be running an Ubuntu virtual machine on Ubuntu. This is not only not weird, it's actually useful in a lot of cases for isolation and other things.

Okay. Now, in order to get this right, in order to build a virtual machine, we clearly have to meet some design requirements.
So the first thing is... and let me make one more note, which is that today we're very much talking about commercial, user-facing virtualization. When we talk on Wednesday, or maybe Friday, about Xen, that system works quite differently. What we're talking about today is this idea of virtualization where there is a host operating system: there's some operating system running on the machine, and then there's an app like VirtualBox or VMware that runs another operating system inside of it. Systems like Xen, which we're going to talk about on Wednesday, are really built more for server virtualization, and in that case there's really no notion of a host operating system. Instead, I can take one physical machine and split it up into a bunch of virtual machines. There's a different name for the small piece of software that manages the physical machine and creates the virtual machine abstraction: it's called a hypervisor. You may have heard that term. But that's a fundamentally different story. So today we're talking about this idea that I can run an operating system like an app, in a window, inside some other operating system.

So obviously one thing that's really critical here is that we can't let the guest operating system get out of the virtual machine. We need to make sure that the resources the guest operating system is managing are the resources that I've created for it, that I provided to the virtual machine monitor. If the guest operating system could suddenly take over the entire machine, then this wouldn't be very useful, or very safe. And the way that I have to do this, fundamentally, is I have to manipulate this idea of kernel privilege. So remember, before, we talked about the simple view of the world where the kernel has ultimate privilege and ultimate control of the resources on the computer and
Everything else runs in user space and has a much lower privilege level. We have to somehow figure out How to get the kernel which is used to having all of this privilege. That's the problem Kernels are built when you load Microsoft Windows inside a virtual machine It assumes it's running the actual machine like that's what it's going to try to do So it's going to do things like execute privilege instructions and do other stuff that it shouldn't be able to because if we give it Full privilege now it can take over the entire machine. So we need to figure out how to accomplish this. This is the this is the design problem So just to be clear here the virtual machine monitor is a piece of software that runs on the host operating system That can allow another operating system the guest OS To be run as an application alongside other applications This is you guys you guys know this right you guys have experiences before how many people have used like virtual box Or something else other than this class Okay, you should it's super useful right like oh you want to figure out how to set up a piece of software Like a server or something like set it up inside virtual box that way if you make mistake you can just destroy a virtual box start over Frequently, it's a lot easier to iterate inside a virtual machine Where you can do things like checkpoints and stuff like that that it is on on bare metal where every time you you know Every time you you mess up on bare metal you have to start over reinstall the whole thing whatever. That's a huge pain all right and Hopefully this also brings across the point that remember at the beginning of class We said the OS is just a program like other programs like this should make this really clear because I can literally run it alongside other programs inside the host operating system Hello, that's interesting. Oh, it doesn't look apparently firefox doesn't like my jokes. So sorry. 
They're gone There was only one of them today all right, so Why was so let's take a step back here. We're gonna talk about how we do this which is super cool But why like why why would we bother with this? So, you know, we've been talking about operating systems all semester and hopefully you guys have you know Been convinced that they're kind of cool and we're studying and they do some interesting things And there's some nice design principles at work here and there's things you can learn from them But what are some of the problems with OS? environments That are sort of now again. You guys live in a virtualized era So if you want to think about problems with traditional operating systems You could also think about things that you would use a virtual machine for So, what are some of the why did we all why did the whole industry move in this direction? Yeah It's an interesting use case. So so yeah, so I mean the There are some software packages where to experiment with them like for example sys 161 right to experiment with them It's actually a lot more effective to So imagine we wanted to give everybody the opportunity to use this great piece of software called system 161 Right, which is awesome and clearly a lot of fun and enjoyable We could do two things one is we could figure out how to get it to run on every type of system known to man Which would take a while or I can just pack it up in a VM and ship it off to you And then I can run in the fact that virtual box and VMware and other companies that make virtualization software have solved this problem, right, so that's actually a Much nicer solution in this case what we're talking about is a coupling between The operating system and software So in theory the OS is supposed to provide an interface that is general enough that any piece of software can use it In reality what happens is in many cases you have tight coupling between the operating system and a piece of software And it's easier to distribute both of 
them together Yeah, just said that that's cool So this is kind of a door this is kind of a dumb reason, right? But I mean clearly if you want to run for a different operating system This is what I would tell you to do every time someone says the word dual boot I have these like flashbacks and just say stop don't do that like don't ever set up But there's no reason to do what your computer just don't do it still Make a decision and then install virtual box right away And then you can install six different operating systems and if that's what you want to do for some reason Same thing here, right? You guys have experienced this this year I mean when you guys installed our virtual machine you got all of this environment that I set for you Now some of you guys didn't like parts of it. Sorry and maybe you changed it yourself or whatever But this means that I can control everything about the system right from the Settings for shells and SSH and stuff like that. I can distribute this complete environment that you can use Docker is probably the best example of people using that today when you install something using Docker you you not only get Whatever piece of software it is that you're using for example the the forum that we're using discourse comes on Docker Not only do you get discourse, but you get everything that discourse needs to run you get the passenger Ruby on rails whatever the heck it is. I mean Ruby on rails scares me You get like there's a web server that runs inside of it Whatever so you bundle up all of the software together and all the configurations that those tools require And all I have to do is you know type one command and the whole thing just installs. It's pretty cool This is another this is probably one of the biggest reasons so when Resource needs change. 
It's very difficult using bare metal to adjust things on the fly You know a particular website traffic is really high in the morning plummets in mid-afternoon and then picks up again in the evening I Can if I put that on its own machine There's large portions of the day where that machine is underutilized if I put it on a machine with a bunch of other things That have different usage patterns. I get some statistical multiplexing and now I can run the machine pretty well utilized Without you know by sort of you know Exploiting the fact that different websites have different activity patterns, right? And and you know these two things sort of go together With the with bare metal when you install this thing you better hope that that's the machine you need because if it's not You know now you've got to reinstall the whole thing again try to move things around. It's a real pain I mean virtualized servers can be moved from machine to machine they can have their You can I you know essentially if the websites running too slowly on ops class or I can shut it down quickly Boost the memory usage by a couple of gigabytes reboot it and I'm done right I can do that by clicking things I don't have to open up a machine buy memory stick it in there Hope I don't electrocute myself or cause the machine give the machine a static that's going to cause it to fail Whatever it's a lot nicer So here's another and and to some degree Unfortunately some of the problems that we're pointing out here are kind of like failures of OS design So for example isolation between applications So operating systems to some degree are supposed to prevent applications from interfering with each other from a fault perspective So if application a crashes it shouldn't take the whole machine down with it Nor should anything that application a does Allow application B to crash, but if I'm really talking about stricter isolation What else do I worry about when I have application a an application B running on the 
same machine? Let's say I can build an operating system, and we accomplished this several decades ago, that makes sure that there's nothing application A can do that will cause application B to crash. How is application B still not isolated from application A? What can application A do that might affect application B? Yeah, Steve. Well, yeah, they're competing for shared resources: memory, the I/O controller. So they compete for performance.

And so if you look at Xen and some of the work on paravirtualization, and the whole virtualization movement that took off, part of it was because of this. What would happen is you would buy, let's say, a database application from some vendor. And you think, okay, great, I have a high-performance database; I'm going to install it on my big server with my web server and my email server and all these other things. And then you discover, after you read the fine print of the contract, that the company guarantees the performance of their database server only if it is the only thing running on the machine. This is not made up. This actually happens. So now it's like, oops, now I have to go get a whole other machine to run this stupid thing on, and I have all the problems with utilization that we talked about just a minute ago, where if that thing doesn't keep the machine busy enough, I'm wasting a lot of resources.

Operating systems also tend to leak a lot of information between processes, which can be important for security reasons. Isn't that interesting? And this is also kind of terrible: has anyone ever had to try to set up a piece of software that just refused to install on a machine with another piece of software?
Thankfully that seems to be a solved problem in the past though it wasn't right like he started installing something from source And you realized oh no it requires this library at this other thing That's this other library and then a whole week of your life goes by right and that's not a week that you're ever going to Remember we'll look back on finally so Here's a year's another problem in certain cases the performance of an application depends a lot on the ability to tune the underlying operating system to make that application happen Databases are famous Our famous culprit here So database will say I will run really really fast if you set all of these kernel parameters exactly this way And it turns out that those settings are great for the database and maybe not so great for everybody else And so this is another case where the coupling between an application and the underlying operating system can cause a Lack of isolation between the operating system between that application and other applications because now I've done things to help that application That have hurt other applications. We'll come back and talk about this when we talk about performance next week Yeah, and I just I just pointed out okay So I think you guys sort of understand these sorts of things, right? I mean this clearly we did right you know that this is possible And it's super useful The this is also very possible. 
I can take you know Google can take their data center They can build it out with a large number of really Homogenous machines and then they can divide those machines up in a variety of different ways as needed by various applications A certain application doesn't need a lot of memory five Don't give it a lot of memory another application doesn't need a disk great Don't give it a disk I can build so if you imagine I have start with a big machine with like 64 cores and 64 gigabytes of RAM and A bunch of disks that I can configure in various ways I can use that machine to create a lot of Configurations that are appropriate for the virtual machines that are running inside. It gives me a huge amount of flexibility And of course I can do things like migration So when things fail or if I need to take a few machines down to do something or as load changes I find out that two VMs that used to work well on one machine now are conflicting with each other because of usage patterns Move it just it's not that hard I mean yes I have to transfer a fairly big chunk of stuff around but it's nothing compared to the work required to move my great configurations that run on bare metal Is that a question or just a place to put your hand? Okay, cool All right, so What do I actually what it was what does the VM actually have to accomplish? This is interesting, right? What what do I are there were these are sort of the design requirements for the virtual machine monitor? And I was very surprised to find out that these were formalized very very early, so you know It's not clear that in 1974 These guys ever thought that this stuff would be useful, but it kind of is so here are three essential requirements the first thing is that On the virtual machine software should execute roughly the same way. 
This is called fidelity. Now obviously, when I'm actually sharing resources between multiple virtual machines, there might be some timing differences and other things, but it shouldn't affect correctness. If I run the program, it should get the same result, or it should be able to do the same things.

The second one, which is sort of at odds with the first one, is performance. We're talking about virtual machines; what is a simulator? How is a simulator different from a virtual machine, does anyone know? We'll come back to this in a minute. One of the requirements for virtual machines is that they be able to run very fast, pretty much at the same speed as the underlying hardware. Simulators do not do this. I'll come back and we'll talk about that; remind me if there's not a slide about it. I think there is.

The final thing is that the virtual machine monitor should manage all hardware resources so that the guest operating system cannot get out. If I do not provide the virtual machine with a certain piece of memory, there should be no way for anything that runs inside that virtual machine to access that memory. Okay, this is pretty important, particularly if you think about all these multi-tenant environments like Amazon and Google and things like that. I mean, Amazon's running virtual machines provided by probably tens if not hundreds of thousands of different companies. So if I couldn't keep resources from one of them safe from another, I might have a problem. I might be leaking secrets from one virtual server to another.

Okay, so you can divide approaches to virtualization into two categories. The one we're going to talk about today is what's called full virtualization. In full virtualization, the goal is to be able to run an unmodified guest operating system inside the virtual machine. That's the design goal. On Wednesday, sorry, we're going to read this paper for Wednesday, we'll talk about paravirtualization. Paravirtualization is a little different. Paravirtualization says: you know what, I'm willing to trade a small number of changes to the operating system, and one of the things that's critical about paravirtualization is that the number is small, in exchange for much better performance and a much simpler system.
That's a much simpler virtual machine monitor, because it turns out that the first bullet is tough to do. People did that, and there were companies built around doing it. And then a little bit later on, somebody said: wait, hold on. We could either do all this work to run unmodified Linux, or we could make a small number of changes to Linux that allow it to cooperate with the virtual machine monitor, and then the performance would be a lot better and it'd be a lot simpler. So these are two different approaches.

But today let's talk about full virtualization. The goal here is to be able to run an unmodified operating system inside the virtual machine. VMware is probably, I mean, I don't know if this is true anymore, maybe VirtualBox has surpassed them, but VMware was really the leader in this area for a long time. They developed a lot of these solutions. I don't know if they were first or not, but they were certainly among the first to be able to do this effectively. And now there are free solutions.

So what's hard about this? I'm trying to run an unmodified operating system in a virtual machine. Why is this difficult? Steve, you have an idea? Yeah, the operating system just expects things to be a certain way. It's like some rock star that shows up and says, "I said no blue M&Ms. I just don't want any blue M&Ms in my bowls of M&Ms. That's not okay." They just expect it to be that way. And the thing is, because they expect it to be that way, they're going to run those instructions. True story.

I just gave the answer away, but I'll take it off. What's going to start happening when the guest operating system starts to run? Remember, I can't run it with full kernel privilege. Why not?
Why not just run the guest with kernel privilege? Not a problem? What could it do? Yeah: anything it wants, including get out of the virtual machine. Remember, the virtual machine is an application, and that application has access to some memory on the machine. If I allow the guest operating system kernel privilege, it can just change the TLB entries itself and see any page on the system. Bad. Cannot do that.

Okay, so I'm running it without kernel privilege. What's going to happen almost immediately? The guest operating system starts to run, and before long it's going to trap. Why? Yeah, it's going to execute an instruction that requires kernel privilege. Let's use the example of modifying the TLB, which is not an x86 instruction, but whatever. Let's say it does something like that. It's not running with kernel privilege. So what would I do to a normal application that did this? Let's say Skype tried to modify the TLB. What would I do? Boom. That's an exception. The real kernel, who's actually in charge of the system, is going to run. It's going to say: that was bad, you weren't supposed to do that. And it's going to terminate the process. This is what would normally happen, and the guest OS is going to try to execute these privileged instructions.

So here's another question. What about a process running inside the guest operating system? What's complicated about that? This is important: that process is used to running in an unprivileged way, so it's not going to have this problem where it tries to execute privileged instructions. What is it going to do, invariably? Yeah, it's going to do a system call.
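
To make the "boom, that's an exception" case concrete, here is a minimal Python sketch of what normally happens when unprivileged code issues a privileged instruction. It is a toy model, not how any real kernel is written: the instruction names in `PRIVILEGED`, the `cpu_execute` function, and the `host_kernel_run` loop are all made up for illustration.

```python
from __future__ import annotations
from dataclasses import dataclass

# Hypothetical instruction names; real ISAs define their own privileged set.
PRIVILEGED = {"tlb_write", "disable_interrupts", "halt"}


class PrivilegeTrap(Exception):
    """Raised by the toy CPU when user-mode code issues a privileged instruction."""


@dataclass
class Process:
    name: str
    alive: bool = True


def cpu_execute(instr: str, kernel_mode: bool) -> str:
    # The hardware check: privileged instructions only succeed in kernel mode.
    if instr in PRIVILEGED and not kernel_mode:
        raise PrivilegeTrap(instr)
    return f"executed {instr}"


def host_kernel_run(proc: Process, program: list[str]) -> list[str]:
    """Run a program in user mode; the default response to a trap is to kill it."""
    log = []
    for instr in program:
        try:
            log.append(cpu_execute(instr, kernel_mode=False))
        except PrivilegeTrap as trap:
            proc.alive = False
            log.append(f"killed {proc.name}: illegal {trap}")
            break
    return log
```

Running `host_kernel_run(Process("skype"), ["add", "tlb_write", "add"])` executes the first instruction and then terminates the process at `tlb_write`. A deprivileged guest kernel would hit exactly this path, which is why the default "kill it" policy has to change for the VMM.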
Okay. Now, who would normally handle that system call? The host operating system. So who should handle that system call? The guest operating system, right? The calling conventions are totally different. Until, I guess, very recently, if a Linux program tried to run on a Windows system, it just wouldn't work. The way that you trap into the kernel and where you put the arguments are different enough, I assume, between these systems that if you try to make a Linux system call on a Windows system, Windows is going to be like, "I have no idea what that is," or "those arguments are garbage," or whatever.

So there are really a couple of challenges here: what to do about the guest operating system, which is going to try to execute privileged instructions, and how to make sure that traps and system calls generated by the processes running in the guest operating system get to the right place. Because normally they would go to the host OS, and they actually need to go to the guest OS.

All right, so we just pointed this out. We can't run it with kernel privilege; it would see the entire machine. If I run it with user privilege, I have to figure out how to run privileged instructions.

So here's what would happen ideally. Remember, the deprivileged operating system is just a program, and a lot of what the OS does is not privileged. A lot of the math the operating system is doing, a lot of the instructions that an operating system will execute, are not privileged. So that's good. That means that I don't have very many special cases to handle. When it executes a privileged instruction, here's what we want to happen: the CPU is supposed to trap this. It's supposed to say, you cannot execute this instruction. You don't have the required privilege; this requires kernel privilege, and you're a user program.
I know you don't like that too bad Okay, so CPU traps Now here's the thing here's where this gets interesting this trap has to make its way to the virtual machine monitor When you guys install virtual box, do you remember something interesting about it? Why is it a little different than other applications to install? Same thing with vmware. You just double-click it. It just runs. No problem, right? What is this require? Yeah It install some kernel drivers Yeah, that is required for this to happen because normally what happens here is the guest operating system runs Sorry, the host operating system runs and kills the VMM So I need cooperation from the host operating system to get this to work the host operating system has to realize Okay, there's this one special app where when it has one of these privilege violations before I kill it I actually hand it to the VMM and I say it's possible that this is due to something that the guest operating system inside You is trying to run see if you can handle it for me If not, it's it's possible that like the VMM is bucking, right? So it's possible that this is this will be still a problem But you need cooperation from the host operating system to get this to work. That's that's important understand So now this ends up in the VMM Now most of the what are most of the privileged? Instructions that the VMM might try to run like again take the example of a TLB An operation that's trying to modify the TLB. What is that doing? Like these can be a lot of these involve doing one Yeah, modifying some special piece of hardware that the virtual machine that the operating system controls In this case that hardware is to some degree abstracted by the VMM So the VMM has to look at this and say okay. Well, yes, this particular modification to the DLB is okay and Allow it right so the VMM has to see these exceptions If this can happen we refer to this instruction. This is an instruction. 
It's called classically virtualizable. The approach here is called trap and emulate: I trap the privileged instruction, and the VMM emulates what would happen if that instruction was actually executed on real hardware. There are a gazillion details that I'm glossing over here, but this is the overall approach. In particular, if you really want to blow your own mind, try to understand x86 shadow page tables. Those are wild. x86 makes this even more complicated because it has a hardware-managed TLB, so TLB faults are not generated; the hardware handles them automatically. So anyway, Google shadow page tables, then come back a week later and we'll talk about it. But it's very cool. If you can understand shadow page tables, like, A plus. All right, this is trap and emulate. So what does the VMM do with traps that occur within the virtual machine, if the trap is caused by an application? Remember both categories of traps: system calls and exceptions, like virtual-memory-related exceptions. They're all going to jump out of the VM, have to be handled by the host operating system, and get passed back in. Once they get to the virtual machine monitor, I have a couple of choices about what to do. If the virtual machine was running in kernel mode, then this is something that I may need to handle directly. If it's a user system call, then actually what I'm going to do is hand it back into the guest kernel. So what would happen here? I'm not sure I have a slide about this. A system call made by an application running in the guest operating system will first cause the host operating system to run, then the VMM will run, then the guest operating system kernel will run, and then maybe later the application will run again. So you can see some of the performance overhead caused by this approach. It's sort of inherent.
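The trap-and-emulate idea described above can be sketched as a toy simulation. This is only an illustration under stated assumptions, not how any real VMM is written: the instruction names (`add`, `tlb_write`), the `ToyVMM` class, and the "virtual TLB" dictionary are all made up for the example, and the host operating system's role in forwarding the trap is left out.

```python
class PrivilegeTrap(Exception):
    """Raised by the toy CPU when user-mode code issues a privileged instruction."""

    def __init__(self, instr):
        super().__init__(f"privileged instruction in user mode: {instr!r}")
        self.instr = instr


PRIVILEGED_OPS = {"tlb_write"}  # hypothetical privileged instruction name


def cpu_execute_user_mode(instr, regs):
    """Toy CPU in user mode: run safe instructions directly, trap privileged ones."""
    op = instr[0]
    if op in PRIVILEGED_OPS:
        raise PrivilegeTrap(instr)
    if op == "add":
        _, dst, a, b = instr
        regs[dst] = regs[a] + regs[b]
    else:
        raise ValueError(f"unknown instruction: {instr!r}")


class ToyVMM:
    """Hypothetical monitor: runs a deprivileged guest and emulates what traps."""

    def __init__(self, guest_frames):
        self.guest_frames = guest_frames  # physical frames the VM is configured with
        self.virtual_tlb = {}             # the VM's view of the TLB: page -> frame
        self.native_count = 0             # instructions that ran directly
        self.emulated_count = 0           # instructions the VMM had to emulate

    def run_guest(self, program, regs):
        for instr in program:
            try:
                cpu_execute_user_mode(instr, regs)  # the guest never gets kernel mode
                self.native_count += 1
            except PrivilegeTrap as trap:
                self.emulate(trap.instr)
                self.emulated_count += 1

    def emulate(self, instr):
        # Only TLB writes are privileged in this toy; check them, then apply them
        # to the virtual machine's state instead of the real hardware.
        _, vpage, frame = instr
        if not 0 <= frame < self.guest_frames:
            raise ValueError("guest tried to map memory outside the virtual machine")
        self.virtual_tlb[vpage] = frame
```

Running a program such as `[("add", "r0", "r1", "r2"), ("tlb_write", 7, 1)]` through `ToyVMM(guest_frames=4).run_guest(...)` executes the add directly and routes only the TLB write through `emulate`, which is the split the lecture is describing.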
There are a lot more people involved, a lot more actors involved, when I do something like make a system call. All right. Yeah, so I just pointed this out: you need support from the host operating system here. Otherwise it would just think that VirtualBox was a buggy application and kill it immediately. So the piece of software you run as a virtual machine monitor has to install some sort of driver on the host operating system. Yeah, absolutely. Yep, those drivers would be specific to your host operating system, right? On Windows you install a particular set of drivers that cause things to get handed back to VirtualBox. On Linux it would be a totally different set. Good question. Same thing with Mac, right? So that's kind of what binds the two together. There's also something, you guys may have noticed this, that you would install inside the guest operating system to improve performance. I'm not exactly sure how that works, but I would be happy to look into it if you guys are curious. All right. Now here's the nice thing about this. There is an overhead associated with exceptions and traps, whether they're generated by the guest operating system or by applications that are running inside the virtual machine. However, most of the time the guest operating system and guest applications use the processor and the rest of the system normally. That's critical. So this creates a compatibility requirement. I'll come back to your question. If a guest operating system and the guest applications are going to use the processor normally most of the time, there's one additional requirement here that we haven't talked about that I need to satisfy to make this work. Could I run an operating system compiled for ARM on an x86 host operating system on an x86 hardware machine? No. So the hardware interface has to be compatible.
That's pretty important. So if you tried to take a native Android app and run it inside an x86 virtual machine, it's not going to work, because most of the time it's just executing native instructions. If the native instruction sets are different, I'm in trouble. If they're the same, I'm good. Okay, I think we just talked about this. When a guest application makes a system call, the host OS vectors the trap to the VMM, and the VMM inspects it and passes control to the guest operating system. Then it traps back into the VMM when the guest operating system is finished running, and the VMM passes the results back to the process that initially made the call. So there are several additional steps that make this higher overhead. So what happens, let's say, when I generate a TLB or memory-related fault in the guest? Let's walk through this. What's the first thing that happens? Trap into what? Somebody want to give a different answer? I'm not running with privilege, so this will generate an exception. Who handles the exception first? The guest OS? Would that be safe? Not the VMM either. Where does the exception go? The host operating system, right. So I trap into the host OS. Then the host OS sees, okay, this is a trap caused by an application that's allowed to handle these itself, so I hand it to the VMM. Now the VMM is going to inspect the trap, see that it was generated by the application, and then pass control to the guest operating system. Right, so now the guest operating system has to handle this TLB fault. Sorry, I think I confused myself at the beginning of this. This is a TLB exception caused by an application running inside the guest. It's a guest application. So now the guest operating system has to run, and the guest operating system is going to try to handle the TLB fault. What is it going to do? You guys are finding this out for assignment 3.
What do I have to do? I mean, a bunch of this is just manipulating my own data structures, whatever. But at the end of the day, in order to allow this translation to succeed, what does the guest operating system have to do? Write to the TLB, which is going to do what? Generate another exception back to the host operating system, because now I've executed a privileged instruction while running in user mode. Then the host hands that trap back to the VMM. The VMM sees that it was generated by the guest OS, and then the VMM will adjust the state of the virtual machine appropriately. Does this make sense? Understanding how all these chains of things happen is pretty good. All right. So what I want to point out here before we close today is that what I'm virtualizing here is actually the hardware ABI, or the hardware interface, which is really interesting. Compare this to virtual memory, right? For virtual memory the interface was load and store, and all I had to do was make sure that those instructions executed the way that I thought they should: when I load, I get the value that I last stored; when I store, I can set a new value. For virtual memory I ensured safety by translating every access, and I got good performance by caching translations. The approach with virtualized hardware is different. In this case the interface is the full hardware instruction set. From the perspective of things that run inside the virtual machine, I just need to make sure that that instruction set behaves the same way with regard to the state of the virtual machine that's represented by the virtual machine monitor. So for example, if my virtual machine is only configured to have 512 megabytes of memory, it should only be able to access 512 megabytes of memory, even if it's running on a machine that has eight gigabytes of memory, right?
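The chain of events for a TLB fault raised by a guest application, as walked through above, can be traced with a small simulation. This is a sketch under assumptions: the function and trap names are invented for illustration, the guest OS "allocates" frames by simply counting up, and the 512 MB default only mirrors the example configuration mentioned in the lecture.

```python
PAGE_SIZE = 4096


def simulate_guest_tlb_fault(vpage, guest_mem_bytes=512 * 2**20):
    """Trace who runs, in order, when a guest application misses in the TLB."""
    log = []
    guest_page_table = {}   # guest OS bookkeeping: page -> frame
    vm_tlb = {}             # TLB state as tracked by the VMM for this VM
    frames = guest_mem_bytes // PAGE_SIZE

    def host_os(trap, *args):
        # Every trap lands in the host OS first; it forwards traps from the VM.
        log.append(f"host OS: sees {trap}, forwards it to the VMM")
        vmm(trap, *args)

    def vmm(trap, *args):
        if trap == "tlb_miss":
            log.append("VMM: miss came from a guest application, runs guest OS handler")
            guest_os(*args)
        elif trap == "privileged_tlb_write":
            page, frame = args
            if not 0 <= frame < frames:
                raise ValueError("mapping would reach outside the virtual machine")
            vm_tlb[page] = frame
            log.append("VMM: checks the mapping and updates the VM's state")

    def guest_os(page):
        # The guest OS updates its own data structures, then writes the TLB.
        # It runs deprivileged, so that write traps straight back out.
        frame = guest_page_table.setdefault(page, len(guest_page_table))
        log.append("guest OS: picks a frame, executes a TLB write in user mode")
        host_os("privileged_tlb_write", page, frame)

    log.append("guest app: touches a page with no TLB entry")
    host_os("tlb_miss", vpage)
    log.append("guest app: retries the access")
    return log, vm_tlb
```

Printing the returned log for one fault shows the host OS and the VMM each running twice for a single guest TLB miss, which is where the extra overhead comes from.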
And in this case, what we do to ensure safety is, when I see these privileged instructions, I have to intercept them and make sure that they don't pierce the virtual machine. These privileged instructions have the potential to alter the underlying hardware state in a way that would allow the guest operating system to get out, and I need to make sure that this doesn't happen. How do I get good performance out of this? The performance of a virtual machine depends on what, to some degree? Remember, this is one of my requirements for a piece of software to be a virtual machine. How do I get good performance? Yeah, there's overhead that comes along with this, right, but the performance of this is dependent on what? Yeah, so okay, that's always going to be slow, right? Any privileged instructions, any traps, have this much longer path that they're going to have to follow. However, performance is not that bad because of what? Getting closer. Yeah, most instructions execute natively.
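To put rough numbers on why most instructions executing natively matters, here is a back-of-the-envelope model. The costs are assumptions chosen purely for illustration (a trapped instruction is taken to cost 1000 times a native one); they are not measurements of any real system, and `estimated_slowdown` is a made-up helper name.

```python
def estimated_slowdown(trap_fraction, native_cost=1.0, trap_cost=1000.0):
    """Average per-instruction cost inside the VM, relative to running natively.

    trap_fraction: fraction of executed instructions that trap to the VMM.
    native_cost, trap_cost: assumed relative costs (illustrative values only).
    """
    if not 0.0 <= trap_fraction <= 1.0:
        raise ValueError("trap_fraction must be between 0 and 1")
    average = (1.0 - trap_fraction) * native_cost + trap_fraction * trap_cost
    return average / native_cost


if __name__ == "__main__":
    for fraction in (0.0, 0.00001, 0.001, 0.01):
        print(f"{fraction:.5f} of instructions trap -> "
              f"about {estimated_slowdown(fraction):.2f}x native cost")
```

Under these assumed costs, a workload where one instruction in a thousand traps runs at roughly twice the native cost, while a workload that never traps runs at native speed, so how often a workload traps matters more than anything else in this model.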
Yeah, so safe instructions: if you compute pi inside of a virtual machine, it's going to run pretty much close to full speed. I mean, it competes for resources with anything else running on your host system. But as long as you're just doing math, there's nothing unsafe about adding two numbers and storing the result in a register. Who cares? That's not going to pierce the VM. So the more of the hardware instructions I can run safely on the underlying hardware, the better, and it turns out in most cases that's quite a few. This also explains why different pieces of software perform differently inside the virtual machine. Things that are mainly dependent on memory and CPU tend to run pretty well, because once they get hold of the memory they need, they're executing a lot of instructions that don't require any participation by the host operating system or by the virtual machine monitor. Applications that do a lot of I/O, that generate a lot of faults, tend to run more slowly, because I/O is another place where I have to do a lot of checking to make sure that things are safe, right? To some degree I'm letting you communicate with this virtual disk, but I need to make sure that communication is safe. All right. Now, you may wonder: this seems straightforward. Why was there this huge company and, like, decades' worth of work to get this to work? It turns out that when people designed the x86, and who knows if they were cognizant of this or not, they designed it so that it is not classically virtualizable. Okay, so remember, what did that mean? What was required for an architecture to be classically virtualizable? When I try to execute a privileged instruction and I'm not in privileged mode, what needs to happen?
Yeah, the processor has to generate an exception. That's the only way that I get control, which I can then vector back into the virtual machine monitor. Unfortunately, x86 has a variety of problems. Some instructions just don't trap correctly. They just fail silently. So there's some instruction where, if I run it in privileged mode, it does the thing it's supposed to do, and if I run it without privilege, it doesn't do that thing. This is a problem. It means I never get control so that I can make it look like that thing happened. And this is even nastier: some of the instructions would do one thing when you ran them in privileged mode and another thing when you ran them in non-privileged mode, and neither one of those things trapped. So this is what gave birth to some of the interesting work that VMware did early on. I can't apply this beautiful approach that we just discussed, trap and emulate, because that will not work on the x86. It breaks in a bunch of different places. So what VMware does, which is very clever, is they take portions of the binary of the kernel and the applications as they are running, and you can think of them as translating them so that the right instructions execute. In cases where I run into these sorts of problems, they rewrite the binary in order to eliminate any instructions that have these sorts of issues. Right, and that is just as complicated as it sounds. And then I have to worry about performance, so they do fun things where they cache: once I've translated a particular part of the program, I cache that translation.
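The rewrite-and-cache idea can be sketched in a few lines. This is a toy model only, not VMware's actual translator: instructions are plain tuples, `set_interrupt_flag` is a hypothetical stand-in for an instruction that fails silently in user mode, and `call_vmm` is an invented marker for an explicit jump into the monitor.

```python
SENSITIVE_OPS = {"set_interrupt_flag"}  # hypothetical: would fail silently, never trap


class ToyTranslator:
    """Rewrites blocks of guest code and remembers each translation."""

    def __init__(self):
        self.cache = {}             # block address -> translated block
        self.translations_done = 0  # how many blocks were actually rewritten

    def translate(self, block_addr, block):
        # Reuse an earlier translation of this block if there is one.
        if block_addr in self.cache:
            return self.cache[block_addr]
        translated = []
        for instr in block:
            if instr[0] in SENSITIVE_OPS:
                # Replace the problem instruction with an explicit call into
                # the VMM, so the VMM gets control even though nothing traps.
                translated.append(("call_vmm",) + tuple(instr))
            else:
                translated.append(instr)  # safe instructions pass through unchanged
        self.translations_done += 1
        self.cache[block_addr] = translated
        return translated
```

Translating the same block address twice does the rewriting work only once; the second call is served from `cache`, which is the point of caching translations.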
All right, so I think I'm done for today. Just let me point out a few things that we didn't talk about: privilege rings, which we'll come back to on Wednesday and which are kind of cool; shadow page tables, which I would encourage you guys to explore; and memory traces. We will talk a little bit about this on Wednesday, but one of the interesting pieces of software-hardware co-evolution that's been happening is that on architectures including ARM and x86, the hardware guys have sort of woken up and been like, oh, by the way, there's this virtualization thing going on that people are spending billions of dollars on, maybe we should help. And so there are a lot of extensions now in newer processors that provide better support for this. If you're interested in this, go start Googling. There's a lot of cool stuff. On Wednesday we will talk about Xen and a very different approach to virtualization. Please look at the paper for me. Thank you. See you guys then.