All right. So if you have virtual machines, then instead of having a bunch of lightly used physical machines you can just make them all virtual machines and run them on a single machine if you want. To clarify: say you were making a fun web application and you don't want to give yourself a headache later on. You could make a virtual machine for your database, a virtual machine for your front end, and a virtual machine for your back end. While you don't have much traffic at all, you can save a bit of money and run all of those virtual machines on the same physical machine. Once you see that machine is overloaded, you don't have that much work to do: you go out, you buy another machine, you move some of the virtual machines over to it, and you can do that no problem. If you run out of capacity again, you just go buy another server. Maybe you need more instances of the database virtual machine, maybe you need more of the others. It makes it really easy to scale and recover.

All right, where were we? Yeah, so if you do something like that, it's a lot easier to move things around and manage them.

So, like we had a process control block when we were scheduling processes, virtual machines have something called a virtual CPU. It keeps track of everything about a CPU, including all of the kernel-mode-only registers: the entire state of the CPU, instead of just whatever a user process actually has access to. So for processes it's a process control block; for a virtual machine it's a virtual CPU.

Whenever a guest isn't running, the hypervisor saves the state of the CPU into that virtual CPU, and that's what's involved in doing a context switch. If you want to resume that virtual machine, it's just like the process control block: the hypervisor loads the data back and resumes. As you might imagine, it has to do a lot more work than it would for a process control block, so context switches are more expensive. But it is possible, and you can even context switch across machines if you really want.

The guest still uses user mode and kernel mode. There are no changes at all in the guest operating system, so the Linux kernel still uses kernel mode instructions. That doesn't change.

So remember on x86, if we go all the way back to the beginning, user mode was something called ring 3 and kernel mode was something called ring 0. Well, x86 CPUs were designed back in the 70s or something like that, and virtual machines on x86 are a much newer thing, only 20 years old or so. So they needed a new CPU privilege mode that was more privileged than kernel mode, and they called it ring -1 because they ran out of numbers.
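To make the process control block versus virtual CPU comparison a bit more concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the field names, the `ToyHypervisor` class, and the tiny register set are assumptions, not how any real hypervisor lays out its state. The only point is that the virtual CPU has to carry the kernel-only state too, and that a context switch is "save the whole thing, load it back later".

```python
import copy
from dataclasses import dataclass, field


@dataclass
class ProcessControlBlock:
    """What a kernel saves for a process: only user-visible state (toy version)."""
    pc: int = 0
    general_regs: list = field(default_factory=lambda: [0, 0, 0, 0])


@dataclass
class VirtualCPU:
    """What a hypervisor saves for a guest: user state plus kernel-only state."""
    pc: int = 0
    general_regs: list = field(default_factory=lambda: [0, 0, 0, 0])
    # Hypothetical kernel-only state; a process never gets to see these.
    guest_mode: str = "kernel"
    page_table_root: int = 0
    interrupts_enabled: bool = True


class ToyHypervisor:
    """Keeps one saved VirtualCPU per guest while that guest is not running."""

    def __init__(self):
        self.saved = {}

    def save(self, guest_name, live_state):
        # Context switch out: snapshot the entire CPU state.
        self.saved[guest_name] = copy.deepcopy(live_state)

    def resume(self, guest_name):
        # Context switch in: hand back a copy of the snapshot.
        return copy.deepcopy(self.saved[guest_name])


hypervisor = ToyHypervisor()

# The "live" CPU state while guest A is running.
live = VirtualCPU(pc=0x400, general_regs=[1, 2, 3, 4],
                  guest_mode="kernel", page_table_root=0x1000,
                  interrupts_enabled=False)
before = copy.deepcopy(live)

hypervisor.save("guest-a", live)

# Some other guest runs and clobbers the live state.
live.pc = 0x9999
live.page_table_root = 0x2000
live.interrupts_enabled = True

# Guest A comes back exactly where it left off, kernel-only state included.
live = hypervisor.resume("guest-a")
print(live == before)
```

Because the snapshot is just data, nothing stops you from shipping it to another machine and resuming there, which is the "context switch across machines" idea.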
So on x86, the hypervisor mode that you need for type 1 hypervisors is ring -1. It's even more privileged than kernel mode, and it lets you control all the guests.

For type 2 hypervisors, well, the hypervisor has to essentially fake kernel mode for the guest, because it's only running in user mode. If we were trying to simulate kernel mode in a user mode process, one strategy we could use is something called trap and emulate.

So say we try to execute a privileged instruction. If you looked at the spec and figured out the assembly for, I don't know, switching the root page table or something like that, and you tried to execute it in your process, you're going to get a signal that says: illegal instruction, you're not allowed to execute that. That's part of the hardware protection. By default you would get the signal, your process would exit, and it would die. But if you were implementing a hypervisor, you could write a signal handler and simulate whatever that instruction was supposed to do.

You are going to have to simulate everything that happens in kernel mode on that particular CPU, but you can technically do it. In your signal handler you would figure out: what instruction did they try to run? Okay, I'm simulating all the page tables and the MMU and everything like that, so I'll just simulate whatever this instruction was supposed to do.

As you can imagine, it will be slow. Instead of executing the instruction directly, a signal gets sent, then I have to see what the instruction was, and then I have to simulate whatever it was doing. So it's technically possible, but it's going to be much, much, much slower. Also, this only works if you have a sanely designed instruction set architecture, which x86 is not an example of.
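Here is a minimal sketch of the trap and emulate loop in Python. It does not use real signals or real machine code: the instruction names, the `PrivilegedInstruction` exception standing in for the hardware trap, and the fields on the toy `VirtualCPU` are all assumptions made up for this example.

```python
class PrivilegedInstruction(Exception):
    """Stands in for the trap/signal the hardware would raise."""

    def __init__(self, instr):
        super().__init__(instr)
        self.instr = instr


# Hypothetical instructions that only work in kernel mode.
PRIVILEGED = {"set_page_table_root", "disable_interrupts"}


def run_on_hardware_in_user_mode(instr, regs):
    """Toy 'physical CPU' that is stuck in user mode."""
    op, *args = instr
    if op in PRIVILEGED:
        # Hardware protection: refuse and notify whoever is in charge.
        raise PrivilegedInstruction(instr)
    if op == "add":
        reg, value = args
        regs[reg] += value
    else:
        raise ValueError(f"unknown instruction: {op}")


class VirtualCPU:
    """The hypervisor's record of what the guest thinks its CPU looks like."""

    def __init__(self):
        self.regs = {"r0": 0}
        self.page_table_root = 0
        self.interrupts_enabled = True
        self.traps_handled = 0


def emulate(vcpu, instr):
    """The 'signal handler': do in software what the instruction would have done."""
    op, *args = instr
    if op == "set_page_table_root":
        vcpu.page_table_root = args[0]
    elif op == "disable_interrupts":
        vcpu.interrupts_enabled = False
    vcpu.traps_handled += 1


def run_guest(vcpu, program):
    for instr in program:
        try:
            # Fast path: normal instructions run directly.
            run_on_hardware_in_user_mode(instr, vcpu.regs)
        except PrivilegedInstruction as trap:
            # Slow path: trap, inspect the instruction, emulate it.
            emulate(vcpu, trap.instr)


guest_kernel = [
    ("add", "r0", 5),
    ("set_page_table_root", 0x1000),
    ("disable_interrupts",),
    ("add", "r0", 1),
]

vcpu = VirtualCPU()
run_guest(vcpu, guest_kernel)
print(vcpu.regs["r0"], hex(vcpu.page_table_root),
      vcpu.interrupts_enabled, vcpu.traps_handled)
# prints: 6 0x1000 False 2
```

The two `add` instructions never involve the hypervisor, while each privileged instruction costs a full trip through the handler, which is where the slowdown comes from.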
Oh, before that, here's that visually. Your guest operating system is going to be running in user mode, and it thinks it is in kernel mode. If it tries to execute an instruction that is only available in kernel mode, that generates a trap or a signal, and then you handle it: emulate what was supposed to happen, update that virtual CPU that keeps track of literally the entire state of the CPU, including all the kernel mode stuff, and then you return and just keep on executing until this happens again.

So, trap and emulate doesn't always work. On some CPUs there's not a clear boundary between privileged instructions and non-privileged instructions. On x86, virtual machines didn't exist in the 70s, so what they did is they tried to be clever and had some instructions with the same machine code that do different things depending on what mode you are currently in.

For instance, there's an instruction called popf, and it loads the flags register from the stack. What it does to the flags register is different depending on whether the CPU is currently running in kernel mode or user mode. In kernel mode it can change the kernel-only flags, like the one that enables interrupts; in user mode it only changes the user flags and silently leaves the rest alone. So the same instruction behaves differently.

So if a guest running under my type 2 hypervisor, which is only running in user mode, executes an instruction like that while it thinks it is in kernel mode, well, in this scenario it would just execute, no trap, and do the wrong thing.
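Here is a small sketch of why that is a problem, written as a toy model in Python. The two flag bit positions are the real x86 ones (the interrupt flag is bit 9, the carry flag is bit 0), but the `popf` function itself is a simplified model made up for this example: it only knows about these two flags and ignores details such as the I/O privilege level.

```python
INTERRUPT_FLAG = 0x200  # bit 9 of the x86 flags register, kernel-only
CARRY_FLAG = 0x001      # bit 0, an ordinary user flag


def popf(current_flags, value_from_stack, cpu_mode):
    """Toy popf: the same instruction, but its effect depends on the mode.

    Note that it never raises, so nothing ever tells a hypervisor about it.
    """
    if cpu_mode == "kernel":
        # Kernel mode: every flag is loaded from the stack.
        return value_from_stack
    # User mode: the kernel-only bit silently keeps its old value.
    kept = current_flags & INTERRUPT_FLAG
    return (value_from_stack & ~INTERRUPT_FLAG) | kept


# The guest kernel wants interrupts off, so it pops a flags value
# with the interrupt flag cleared (and the carry flag set).
flags_before = INTERRUPT_FLAG
wanted = CARRY_FLAG

what_the_guest_expects = popf(flags_before, wanted, cpu_mode="kernel")
what_actually_happens = popf(flags_before, wanted, cpu_mode="user")

print(hex(what_the_guest_expects))  # prints: 0x1
print(hex(what_actually_happens))   # prints: 0x201
print(bool(what_actually_happens & INTERRUPT_FLAG))  # prints: True
```

The guest believes interrupts are now disabled, the real CPU disagrees, and because no trap fired there was never a chance for the hypervisor to step in and emulate anything.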
So I have to be able to fix something like that. Since it doesn't generate a trap, and it just kind of executes and does the wrong thing, I call these special instructions, and we have to have another approach for them.

For special instructions, we need to do something called binary translation. The guest's virtual CPU is really running in user mode, but the hypervisor can keep track of what mode the guest thinks it is in. If the guest thinks it is in user mode, the hypervisor doesn't need to do anything. It just runs the instructions, because the guest and the real CPU agree with each other. But in the case that the virtual machine thinks it is running in kernel mode, then I actually need to essentially translate these instructions. All that means is the hypervisor inspects all the instructions before they execute and checks if each one is one of these special instructions. If it is, then it executes the simulation instead of the normal instruction. So you can imagine this would also be slow.

How can it do this? Well, it knows whether the guest thinks it's in kernel mode or user mode, because there's a CPU instruction that switches from user mode to kernel mode, and the hypervisor can handle that one using trap and emulate. So it knows what mode the virtual machine actually thinks it's in.

Because it has to inspect all the instructions, performance is going to suffer really badly. But in general it kind of works: if you have an overpowered machine, it's more or less adequate.

The same technique shows up elsewhere. You don't have to recompile your program or anything like that: it will essentially read the instructions your program is about to execute, and if it's about to execute a malloc or free or something like that, it will go ahead and instrument it and keep track of your mallocs and frees. That is how Valgrind works. It's also why, if you check its performance, it's something like 10 times slower, even 100 times slower, than just executing your program: because it's doing this.

So how that would look visually: my virtual machine is only running in user mode, but my hypervisor, also running in user mode, knows that this virtual machine thinks it's in kernel mode. It starts inspecting all the instructions before they actually get to execute. If it's a special instruction, it simulates or emulates what it's supposed to do, updates the virtual CPU, and then just keeps on executing more instructions.

So this is slow, and we don't want to do that. That's where hypervisor mode and actual hardware support come in.

2005, which I guess is slightly newer than some of you, is when they first standardized hardware virtualization. It was standardized as VT-x for Intel, and then AMD standardized it later as AMD-V. Across all your BIOSes, if you need to enable virtualization for some reason, it might be called different things, because they each came up with more than one name for it. Intel codenamed it Vanderpool when they were developing it, and then they ratified it as VMX. AMD's codename was Pacifica, and then they published it as Secure Virtual Machine. All six of those terms refer to the same thing: hardware support for virtualization.

So you might need to enable it in your BIOS for whatever reason, say if your virtual machine was slow. I don't think any of you had to do that. But if you do have to, or if you want to turn it off for some reason, it'll be called one of those things. They all mean the same thing. They all add the concept of ring -1, or hypervisor mode.

And how does Windows actually work, like for you with your virtual machine?
Well, Windows will see that this CPU has hardware support, and when it boots up it will claim hypervisor mode. So it will act as a hypervisor and also do normal Windows kernel things. It will run your normal applications as if it were just a kernel, and also context switch those out with virtual machines, because it's also a hypervisor. This also lets Windows set up all of the isolation for all the guests and decide what hardware gets virtualized.

All right, so to tie this in with the other stuff we learned: yes, a hypervisor has to schedule as well. If there's only one CPU on your physical machine, well, the guest doesn't know about that. You could just say, hey virtual machine, you have eight processors, when physically you only get one, and then your hypervisor gets to schedule that and do all the context switching and all of that fun stuff. But the trade-offs are a little bit different here, because context switching virtual CPUs is way more expensive than context switching between processes.

So what you might want to do is just map virtual CPUs on the virtual machines to physical CPUs. Or you could schedule them like processes if you want. In general it's easier just to map them and keep them on the same CPU so you don't have to context switch them out, but you might choose to context switch them like normal processes. The hypervisor will have hypervisor threads that deal with all the scheduling and all of that fun stuff.

One approach to scheduling these is, like I said, CPU assignment. If there are more physical cores on your machine than all of the virtual CPUs across all the virtual machines, just map them one-to-one. The host can keep using the spare physical cores to do hypervisor things if it wants.

If you have to share, that's where things get complicated. There's even a special term for it with virtual machines: it's called overcommitting. Overcommitting just means my virtual machines are using more hardware than I actually physically have on my machine. If I don't overcommit, it's a lot easier. I just kind of map things and I don't have to worry about context switching. But if I overcommit, then I have to worry about context switching and things get more complicated. Then you would have scheduling algorithms, and they would look exactly like what we use for processes.

Sometimes overcommitting also causes additional problems. For instance, say I was running directly on hardware and I had a soft real-time task with some type of deadline, and in practice I met it all the time. Then I move it to a virtual machine, and suddenly I miss my deadlines and bad things happen. That's because ultimately you lose control over the hardware if you're running in a virtual machine. In the middle of a soft real-time task, the hypervisor could context switch out the virtual machine and stop it from executing, so it misses its deadline. Then the hypervisor context switches it back in, and the task says: oh, I missed my deadline, that shouldn't have happened. And the virtual machine has no control over that.

So for some things, once you get into stuff you want to be super, super predictable and super high performance, running in a virtual machine is sometimes not appropriate, unless you do the mapping and give it direct access to the hardware without ever context switching it out. So it's only in certain situations that virtualization causes differences that are not very good.

All right, so we talked about scheduling with virtual machines. Well, they also have memory management, except it's even slightly more confusing. Before, your kernel had access to physical memory, and each process had virtual memory and thinks it can access everything. Well, guess what: once we add a hypervisor into the mix, the hypervisor has control over physical memory, and each virtual machine has virtual memory, and each virtual machine thinks that virtual memory is physical memory. Then each process has virtual memory, but in reality that's like virtual virtual memory, because it goes through two layers, right? So it's two layers, and each layer would have three levels of page tables. That's fun, right?

So that's also part of the problem that the hypervisor solves. It lets you have something called, what is it on the slide, yeah, nested page tables. That essentially takes both layers into account, along with the TLB, to make sure everything is nice and fast, so you don't have two translations going on that aren't aware of each other. It makes both translations aware of each other, makes sure the TLB doesn't get screwed up, and does all the page table management for the virtual machines, kind of for you. The hypervisor maintains the nested page table, and the guest maintains its own page table, but it's nested by the hypervisor. Then the hardware uses that to translate for the guests, and there is hardware support for that.

As you might imagine, as soon as you get into swapping, this becomes even worse. The hypervisor could swap out to disk if it runs out of memory, but the virtual machine might also swap out to a virtualized disk. So if you run out of memory, you might not actually know which one is doing the swapping. It could be the hypervisor, or it could be the kernel that you're actually running. The hypervisor would have its own page replacement algorithm, but typically the guest, the kernel running inside, knows the memory access patterns better, and that's probably what should actually be doing the swapping.

All right, similar to copy-on-write, there are optimizations in virtual machines to share memory. Just like with processes, I can share memory if the contents of the page are exactly the same and I am only reading it. But unlike with processes, with virtual machines there is no fork that kind of sets you up and lets you know which pages are going to be shared.

So what the hypervisor will do is essentially scan all of the pages and take a hash of each page, because that is much quicker than comparing every single page against every other one byte for byte. Then it checks if two pages have the same hash. If the hashes are different, the pages are definitely different. If two pages have the same hash, their contents could be the same, or they might still be different; a matching hash doesn't guarantee it. So if the two hashes are the same, then it does the expensive check and sees if they're the same byte for byte. If they are, it starts sharing that page between the two virtual machines. And if one of the virtual machines needs to write to the page, then our same copy-on-write things happen: we make a copy of that page and modify the new one, so they stay completely isolated. So the same concept of copy-on-write does apply to virtual machines. Just how you find a page to share is a bit different.

All right, other fun things. The hypervisor also provides virtualized I/O devices. You can multiplex, say, each virtual machine's network card onto one physical network card. The hypervisor could also just emulate devices that don't physically exist: some weird network card, or, I don't know what else you could emulate, some weird sound card or something like that. You could emulate that and it doesn't actually exist.

Another thing the hypervisor could do is just map one physical device to one virtual device in one VM and give that VM exclusive access to the device. But the hypervisor is still in the picture. It's still doing some translation.
It's just not doing that much. So there is a hardware solution to speed that up that removes the hypervisor from the equation. It is something called the IOMMU, and that's a newer hardware feature. What it does is directly map hardware to a virtual machine, so that virtual machine has direct access to the hardware and the hypervisor does not get in the way at all.

This is generally used for things like GPUs, which don't virtualize very well and don't like being shared. With the IOMMU, you just give a virtual machine direct access to that hardware, and then it can run it at native speed. There's nothing in between and nothing getting in the way; it uses that GPU directly. So, you know, if you were running Linux, you could create a Windows VM, map a graphics card directly to it, and suddenly you get the same performance in Windows as you would if you ran it directly on the hardware, because the most important part isn't being virtualized. Nothing's happening to it. It just goes directly to the virtual machine. This is what GPU instances will have: they just directly map a GPU to a virtual machine.

All right, so virtual machines also boot from a virtualized disk. You create a disk image, and that has all the contents of a physical disk.
It just looks like one big file. It contains partitions, and each partition has a file system, and that file system looks exactly like what you're doing right now in lab 6. So in lab 6, if you knew the contents of the kernel files and where they should be and all that fun stuff, you could set it up so you could actually boot a virtual machine off your disk image. Usually it's like what you're doing in lab 6: it's just one big file. Some formats allow you to split it up into little pieces, but the guest kernel just sees a normal disk that it has full control over. It can read it, write it, and access it, and it just looks like a disk.

So this disk image is all you actually need for a virtual machine, and that makes it easy to move. For instance, if your computer died and you had saved an image of your virtual machine, all you have to do is copy the image over to the new computer, boot it up from that, and it will look exactly the same as where you left off. So it's really easy to move.

If you use some virtual machine software like VirtualBox, sometimes they'll fool you into thinking a virtual machine is this really complicated thing that needs a lot of pieces. They even have their own file extension for it, called OVA. But really all that is, is a configuration file that says how much memory, how many CPUs, and what network card this virtual machine should have, and aside from that configuration, 99% of it is the disk image, and that's it. They just package it up so it will import the settings for you. But really the disk image is the only thing that's actually important, just like for your physical machine the disk is the only thing that's actually really important.
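To show what "one big file that contains partitions" means, here is a minimal sketch in Python that builds a tiny disk image in memory and reads its partition table back. It assumes the classic MBR layout (four 16-byte entries starting at byte 446 of the first 512-byte sector, with the 0x55 0xAA signature at the end), which is only one of the layouts a real image might use. The helper names `make_toy_disk_image` and `read_partitions` are made up for this example.

```python
import struct

SECTOR = 512
TABLE_OFFSET = 446          # where the MBR partition table starts
ENTRY_FORMAT = "<B3sB3sII"  # status, CHS start, type, CHS end, start LBA, sector count
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)  # 16 bytes


def make_toy_disk_image(start_lba, num_sectors, part_type=0x83):
    """Build an in-memory image: one MBR sector, then zeroed sectors."""
    mbr = bytearray(SECTOR)
    entry = struct.pack(ENTRY_FORMAT, 0x80, b"\x00" * 3, part_type,
                        b"\x00" * 3, start_lba, num_sectors)
    mbr[TABLE_OFFSET:TABLE_OFFSET + ENTRY_SIZE] = entry
    mbr[510:512] = b"\x55\xaa"  # boot signature
    # Sectors 1 .. start_lba-1 are a gap, then the partition itself.
    return bytes(mbr) + bytes(SECTOR * (start_lba - 1 + num_sectors))


def read_partitions(image):
    """Return the non-empty MBR partition entries found in the image."""
    if image[510:512] != b"\x55\xaa":
        raise ValueError("no MBR signature found")
    partitions = []
    for i in range(4):
        offset = TABLE_OFFSET + ENTRY_SIZE * i
        raw = image[offset:offset + ENTRY_SIZE]
        _status, _chs_start, ptype, _chs_end, lba, count = struct.unpack(
            ENTRY_FORMAT, raw)
        if ptype != 0:  # type 0 means the slot is unused
            partitions.append({
                "type": ptype,
                "start_byte": lba * SECTOR,
                "size_bytes": count * SECTOR,
            })
    return partitions


image = make_toy_disk_image(start_lba=2048, num_sectors=4096)
partitions = read_partitions(image)
print(len(image), partitions)
```

The guest kernel does the same kind of parsing when it boots, except it reads the sectors through what it believes is a disk, and the hypervisor turns those reads into reads of the file.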
You could tear out your disk, move it into another computer, and it would work.

So another nice thing you could use virtual machines for is to isolate an application. Remember with dynamic libraries, if we had an ABI change, bad things happen. Sometimes you could even just change a library's behavior and break applications. So generally, when you deploy applications in the real world, you kind of freeze all your dependencies and deploy them together. But managing it like "for this application I need version 1.2 of this library, and for this application I need version 1.3", installing them both on the same system, and making sure each one uses the right version is a pain. What's a lot easier is to create a virtual machine that has just the libraries you need, give someone the virtual machine, and say: hey, run this, it has everything you need, don't worry about it.

But the drawback to that is, well, my application might be a few megabytes, and I also have to supply a kernel with it if it's a virtual machine, and that might be a gigabyte. So that's wasting a lot of space.

There's a solution to that. Has anyone ever heard of Docker? That's the solution. Containers, like Docker, aim to have all the same benefits as virtual machines, except faster, and you don't have to give them a kernel.

Remember, the hypervisor lets you set limits on CPU time, memory, network bandwidth, all that fun stuff. Some kernels, well, not some kernels, only one kernel, the Linux kernel, lets you have all those features without virtualization. On Linux there's something called control groups, or cgroups, that support these hypervisor-like features, along with isolating processes into something called a namespace. Then you can set what that namespace has access to, like mount points or files or IPC, things like that. It runs as a normal user process, but it is isolated from the other processes that are running.

So containers: their goal is to give you all the same benefits as a virtual machine, except that they all reuse the same kernel. If you're using Docker on Windows or Mac, guess what, you're using a Linux virtual machine, and Docker is running in that Linux virtual machine. All your Docker containers are using the Linux kernel that's running in that virtual machine, and they're all sharing it. But if you're running on Linux, it just uses your kernel directly, which is why Linux is the good operating system. So everyone who uses Docker: guess what, you're using Linux.

All right, any questions about that? All right, so.

Virtual machines virtualize a physical machine. They allow multiple operating systems to share the same hardware, and they are isolated from each other. The hypervisor is the one that actually controls all the resources. A type 2 hypervisor is going to be slower because it's running entirely in user mode; it uses things like trap and emulate and binary translation in order to simulate kernel mode. Type 1 hypervisors are supported by hardware, so the guest kernel is actually still running in kernel mode. There is just another privileged mode, even more privileged than kernel mode, called hypervisor mode. There are also other techniques to speed things up, like the IOMMU, which gives a virtual machine direct and exclusive access to hardware. Then we saw hypervisors could overcommit on resources, but they're nice because we can physically move virtual machines around. And containers basically aim to have all the benefits of virtual machines without the overhead of providing a kernel all the time. They all share the same kernel.

So with that, just remember: pulling for you. We're all in this together.