Thank you. Today we're going to talk about OS virtualization. This is a topic I haven't lectured on before, and I'm not an expert on it — I'm not really an expert on any of this stuff — but it's important because it's something that's going on in the world, and I think it's a fun way to complete our discussion of virtualization. We talked about hardware virtualization on Monday. On Wednesday we talked about ways to make hardware virtualization a little bit easier by modifying the operating system to make it more cooperative with the new piece of software we created below it. And today we're going to talk about something that's actually quite different: OS virtualization is quite distinct from those other two techniques.

Okay, my usual reminder about assignment three: it's due a week from today. We have an increasing number of people who have actually finished it, which is pretty cool — something like eight or nine groups have perfect scores at this point. They're not all up here, but congratulations to those people. If you're not on this list and you want to be on it, keep coming to office hours. And if you are on this list and you have some love in your heart for your fellow classmates, also keep coming to office hours and help people out. Oh yeah, they're going to be captured on this somehow. Let me go to the next one.

Okay, so before I talk about OS virtualization, let me get these slide elements to go away and review hardware virtualization. What did we have to do to create a virtual machine? What were the things that were required for us to virtualize a piece of hardware? Remember, we talked about VMs on Monday. Nobody remembers? Just Monday. Anyone want to venture any guesses?

Well, that's a little bit of it. So I start with a physical machine — I need hardware — and then I'm going to write... sorry, this slide is driving me a little berserk; I don't know why I can't get it to advance. Wait, wait, wait. So I start with the physical machine; that's required. And then I'm going to write some software, this thing called a hypervisor. The hypervisor is responsible for taking some of the resources from the actual underlying physical machine and using them to create a virtual machine. So I'm using the underlying resources to create one or potentially more virtual machines, and then I need to isolate the software that runs inside the virtual machine. The most difficult piece of software to isolate inside the virtual machine was the guest operating system, because the guest operating system is used to managing these hardware resources in ways that could potentially allow it to get out — that was the challenge. And the VM resources I'm using are provided by the physical machine, but the visibility of those virtual resources outside that physical machine is limited. This is by design; this makes sense. This is what we talked about on Monday and Wednesday. Okay.

Now, what were some of the implications of these design choices on the virtual machine itself? Someone was just saying we want to virtualize the instruction set — is that actually true? Can we virtualize the instruction set? What's true about the virtual machine? What I'm getting at here is: what's the relationship between the virtual machine and the physical machine?
That's a consequence of how we've chosen to virtualize the machine. Who remembers? Well, in order to get good performance I want to give the guest operating system as much access to the hardware and memory as possible. But what else is true? What can I not change about the virtual machine? Yeah — I can't modify the architecture. I'm stuck with whatever underlying architecture I started with, because remember, I built the virtual machine by taking resources from the underlying physical machine. So I'm stuck with the architecture, and any features of that physical machine — the disks that are attached, the amount of memory I have, the performance of various parts of the system — those all become part of the virtual machines I'm creating, right? So the VM and the physical machine have to share the same instruction set, which means the host and guest operating systems must also be configured to use the same instruction set.

Now, the guest OS that runs inside the virtual machine can provide a different — remember this term that we first saw with Xen — a different application binary interface. So I can take my virtual machine, boot Windows inside of it even though it's running on an Ubuntu host OS, and then run Windows programs in there that were compiled expecting to communicate with the Windows kernel. And of course the challenges in getting this to work were largely related to running the guest operating system, because the guest operating system is used to having privileged access to the machine and things like that. Once I can get the guest operating system to run, running applications inside the virtual machine is not so hard. But what we've talked about primarily is the process of getting the guest operating system to work.

Okay, so let's try to apply this type of thinking to virtualizing an operating system. The terminology that seems to have caught on for a virtual operating system is a container. So how do we create a virtual operating system? For the virtual machine, I started with a physical machine. To create a virtual operating system, what do I need to start with? A real operating system — an actual operating system kernel. So I'm starting with a real operating system, and this is different. Before, we had a piece of software that was required to provide the interface, and a lot of the features were actually provided by the hardware. In this case, the functionality that I want to get access to — that I want to virtualize — is provided by an operating system that's already running on the machine.

Okay, so now I have some software that's responsible for isolating... what? Before, we were isolating the guest operating system and all the software that ran inside a virtual machine. What do I isolate in this case? And again, there's no name for this, which is kind of weird — hypervisor has become the term we use for the piece of software that provides the virtual machine abstraction, but there doesn't seem to be a name for this particular piece of software yet, although it is required. What am I isolating inside the container, if the container and the rest of the system share a common operating system? What gets isolated inside the container? Yeah — okay, well, there is that: I want to be able to allocate a certain amount of resources to it. But what runs in the container? In the virtual machine,
I could run an entire operating system and all the software that ran on top of it. Here, I already have a kernel running that I'm going to use. Yeah — programs, some sort of guest software. These are software applications that now run inside the container, and they're using the same operating system as the rest of the machine. So where the virtual machine was using the same hardware as the rest of the machine, here the virtual operating system — the container — is actually using the same operating system as the rest of the machine.

So container resources — and this is where it could seem confusing — these are operating system resources. Before, we were talking about machine resources: physical cores, the memory on the machine, access to devices, disk partitions. Now we're talking about operating system resources, so we want to think about the kinds of resources that operating systems expose to applications. Some of these are abstractions, some of them are illusions, but these are the kinds of things we now need to virtualize: processes, files, network sockets, et cetera. And again, these are going to be provided by the real operating system, but what I want to be able to do is somehow isolate them inside the container. When I run a VM on top of real hardware, I'm isolating the VM from any operating system running below it and from other operating systems running next to it in other virtual machines. Here, when I run things inside containers, I'm isolating the container software from the rest of the machine, and certainly from other containers running on this machine. Does this make sense? Is the analogy holding up so far? Okay.

So what are the implications of this on the container itself? Before, we said the guest operating system, the virtual machine, and the physical machine shared an instruction set, and that had consequences for the software I could run inside of it — for example, I have to run a guest operating system that's been configured or compiled to use the same instruction set as the host operating system and the hypervisor. Here, what are the implications on the container and the software that runs inside it? I've sort of gone up a level. Virtualization always happens across some interface: before I was virtualizing the hardware interface, now I'm virtualizing the operating system interface, and that has implications for what sits below it. So what has to be true of all the software I'm going to run inside my container? It doesn't have to be the same binaries — actually, this is one of the interesting things about it: I can run different versions of software inside the container, outside the container, and in different containers. But what has to be true about the software that runs inside the container, in all the containers that run on the machine, and on the machine itself outside the containers?

Yeah — well, okay, that's a good point: the physical machine is still the same, so they don't just have to be compiled for the same hardware interface; there's an additional requirement now, because I've moved the virtualization interface up one level. Yeah, potentially — and containers usually do need to include everything that needs to run inside them. But again, here's another term that goes back to Xen. Software that runs in multiple containers is using the same kernel, right? So what does that mean? No, no, that's not about isolation.
So before, the software had to be built for the same hardware interface because the virtual machine was provided by the physical machine. Here, the virtual kernel that I get to use inside the container is provided by the same kernel that runs the rest of the machine. So what has to be true of the software that runs inside the container, compared to the software that runs inside other containers and outside them? Yeah — what do we call this? Xen had a term for it that I used a few slides ago: the application binary interface. Compiled applications have to get the attention of the kernel in the same way.

This is really the difference, by the way, between Linux and Windows and other systems. Most of what we've talked about all semester applies to Windows in the same way it applies to Linux and other kernels. What's different is the language we speak across the kernel-user boundary. Running on Linux implies a set of agreements about how processes are going to make requests to the kernel. If I go to Windows and try to communicate with it in the same way, until very recently Windows would say, "I have no idea what you're talking about." So you can think of it as different languages. And the support they've added recently in Windows 10 for running unmodified Linux binaries is really just a way of speaking that dialect — responding to Linux system calls the same way a Linux machine would — without having to rebuild the entire kernel.

Okay, so the container and the rest of the system share the same kernel, which means they have to use the same application binary interface. What's hard about getting this to work? And what's not hard about it — what's one of the big challenges we get to avoid? The primary challenge we talked about on Monday and Wednesday in providing a virtual machine was figuring out how to run a piece of software inside the virtualized environment — in that case, the virtual machine — that's used to having control over the entire machine, and used to gaining that control through privileged instructions and things like that. Do I have to do that anymore? No, because the same kernel is providing both the containers and the rest of the system environment. That problem goes away, which is kind of nice.

What do I have to be able to do? Yeah — I need to figure out a way to isolate the resources and the names being used inside the container from the rest of the system. We'll come back and talk about this, but for example, one of the big challenges in building containers is this: let's say I want to deploy ten copies of the same container, and they all want to run a service that listens on, say, port 8080. Those containers all need to run next to each other on top of the same machine, and yet inside, they're all using the same quote-unquote port. Normally that would be a problem — if I tried to do that on top of the operating system without containers, one of those services wouldn't be able to start. But I need to be able to support this. So obviously there are some naming issues here.

Okay, so let's try to get at the differences between containers and virtual machines. True or false: you can run Windows software inside a container provided by Linux. False, right? It's using the same kernel as the host.
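To make that "language spoken across the kernel-user boundary" idea a little more concrete, here is a minimal sketch (mine, not from the lecture) of what a Linux system call looks like with the C library stripped away: a numbered request made through a well-known entry point. Every binary in every container on the machine ultimately speaks this same protocol to the one shared kernel, which is exactly why a Linux container can't host Windows software.

```c
#define _GNU_SOURCE
#include <unistd.h>
#include <sys/syscall.h>

/* A minimal sketch of the Linux system-call ABI: the "language" a process
 * speaks to the kernel. syscall() invokes a call by its Linux-specific
 * number, bypassing the usual libc wrappers. */
int main(void) {
    const char msg[] = "hello from a raw Linux system call\n";

    /* SYS_write and SYS_getpid are Linux system-call numbers; a Windows
     * kernel would have no idea what these requests mean. */
    syscall(SYS_write, 1, msg, sizeof(msg) - 1);
    long pid = syscall(SYS_getpid);

    return pid > 0 ? 0 : 1;
}
```

The Windows 10 support mentioned above is, roughly, Windows learning to answer this same set of numbered requests the way a Linux kernel would.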
What about running, say, SUSE Linux utilities inside a container that's running on top of Ubuntu? This works — the kernel is the same. Linux distributions vary, for reasons that don't make any sense to me, in terms of where they put things; one of them uses one package management tool and another uses another, because what the world really needs is more package management tools, right? But the differences between these variants of Linux are really confined to things like file system layout and tooling. The kernel you're using is the same — these are all built on top of a Linux kernel. So as long as the kernel is the same, I'm good.

What about if I run a tool like ps inside the container? Should I see all the processes running on the entire system? No, and this is important. I want to isolate things inside the container so they don't have a view of the entire world. Now, outside the container, it turns out that frequently ps will show you everything, and this is kind of a difference between VMs and containers. Why — what makes this possible? It's actually kind of nice. For example, my students have recently decided to take advantage of the fact that I have this huge beefy machine under my desk that I barely use for anything other than editing the slides for this class and sending email. They've decided to turn it into part of their processing cluster, so they have a Docker image running on it, and when I run ps, I can actually see what's going on inside that Docker container — I can see the things running in there. That seems like a nice thing. It breaks isolation in a certain way, because I can see into the container. But why is that even possible? There's no way to do this if you're running a VM. If they had used Vagrant to create a VirtualBox virtual machine and installed the software that way, I might see that the virtual machine monitor — the hypervisor — was consuming a lot of cycles, but I would have no idea what was going on inside of it. So why is this possible? Yeah — put another way, tools like ps use information provided by the kernel. The problem with peering into a kernel running in another virtual machine is that that kernel is isolated: there are two operating system kernels running on my machine. When you do container virtualization, there are not — there is only one kernel running. So if I ask the kernel outside the container to show me all the processes running on the machine, it can do that, because even the processes that run inside the container are communicating with that same kernel to get process IDs and gain access to system resources.

Okay, so here's a comparison of the difference, between hypervisor virtualization — and this is the case where the hypervisor is running on top of a host operating system, so this is type-two virtualization, I think. Is that right? Good, okay — my expert on type one and type two; he introduced me to these terms, and they're real things. Yeah, it uses the visibility provided by the kernel. Yeah, okay. So the way that namespaces work on Linux — we'll come back to this — as far as I understand it, is that they're hierarchical. When I'm outside of the container, I'm in a root namespace, and that namespace has visibility into all the other namespaces on the machine.
When I'm inside one of the namespaces, I don't, by default, have visibility up the tree. And this creates some interesting features. For example, processes that run inside a container can have a different process ID inside the container than they do outside the container, right? And that's one of the challenges of providing these namespaces. Yeah. Okay.

So with type-two hypervisor virtualization, I've got a hypervisor running on top of the host OS that's controlling the hardware, and inside each one of my virtual machines I have a guest operating system, all the binaries and libraries required to run things, and the applications that use all of that. In contrast, in container virtualization, I have one operating system. Over here I potentially have four operating systems if I have three VMs; here I have one copy of the operating system. And this, to be honest, looks a lot like running multiple processes next to each other. The real difference is that I can group processes into a single container so that they can only communicate inside the container, and so I can distribute them all together.

For this slide I went back to some of the reasons why we did hardware virtualization. One of the reasons container virtualization has become so popular is that we preserve a lot of the things we liked about hardware virtualization — we give up a few things, and we'll point those out along the way — but we get a lot of the same features at a lower overhead, because there's only one operating system. We'll come back to the overhead too. So: I can't run multiple operating systems in different containers. Boo-hoo, sorry about that — but who cares? Why are you trying to run multiple operating systems anyway? Just run Linux; problem solved. So yeah, I can't do that. If I want to run Windows containers, I think Docker has support for that, so I can use Docker to bring up Windows containers inside Windows, but I do lose out on this. On the other hand, I can still transfer software setups around, as long as those setups use the same kernel, or something close enough to the same kernel that the kernel interface is the same. Before, when I packaged up a virtual machine, I got the machine, I got the operating system, I got all that stuff. In Docker, what I get is a bunch of applications that are all running together inside the same container, configured to work together, that require the same kernel. But I can move those around, I can duplicate them, et cetera.

This is one of the reasons it's so much faster. Has anyone brought up a Docker container before? Okay. Has anyone brought up a virtual machine before? Okay. If you raised your hand both times, you probably know Docker containers are way faster to start, because they don't have to grab lots of memory and create all these other resources — they're very lightweight, yeah. Good question; I don't think there has to be. There certainly isn't — we'll come back to this in a minute — this multiple level of traps that I have to re-vector around, because the one operating system is handling all the requests. It's just using these new namespace features to isolate them so they can't see outside the container, and there are also some features to control the resources of the container. Yeah, we'll come back to that, but that's a great question. I don't see a reason why it would be slower — unless the container is throttled in some way.
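Here is a minimal sketch of the kernel feature behind the "different PID inside versus outside the container" point from earlier on this slide. It's my illustration, not Docker's code: creating a PID namespace normally requires root (or CAP_SYS_ADMIN), and real container runtimes combine this flag with namespaces for mounts, networking, users, and so on.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

#define STACK_SIZE (1024 * 1024)

/* Runs in a fresh PID namespace: getpid() reports 1 here, even though the
 * parent, out in the root namespace, sees an ordinary (much larger) PID
 * for the very same process. */
static int child(void *arg) {
    (void)arg;
    printf("inside the new PID namespace, my pid is %d\n", (int)getpid());
    return 0;
}

int main(void) {
    char *stack = malloc(STACK_SIZE);
    if (stack == NULL) { perror("malloc"); return 1; }

    /* CLONE_NEWPID asks the kernel for a new PID namespace. */
    pid_t pid = clone(child, stack + STACK_SIZE, CLONE_NEWPID | SIGCHLD, NULL);
    if (pid == -1) { perror("clone"); return 1; }

    printf("outside, that same process shows up as pid %d\n", (int)pid);
    waitpid(pid, NULL, 0);
    free(stack);
    return 0;
}
```

Run as root, the parent prints an ordinary PID while the child insists it is pid 1 — the root namespace can see down into the new namespace, but not the other way around.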
And on throttling: I can take the container and — very similar to what I can do with virtual machines — say, okay, I only want you to use this many cores, or this much CPU time, or this much memory, or whatever. In that case it would run slower, but it's running on a more limited machine, yeah. I can also potentially adjust container resources over time as things change: I can give a container more memory, I can give it less memory, and I can potentially do this online at runtime, rather than having to reboot the system. An operating system is not used to memory suddenly appearing on the machine — there are operating systems out there that do support hot-swap memory, but for a long time nobody really built that in because they figured it wasn't going to happen. Applications, on the other hand, are used to memory coming and going: the kernel took a bunch of my memory away and now I'm running slow, then it gave me some more and now I'm running fast. That's just something they're used to.

Isolation: containers can be configured to make sure they don't leak information outside. I can isolate certain things inside the container, particularly from other containers. Maybe the operating system outside the container has visibility into everything that's going on, but if I had two software setups provided by different companies that were accessing sensitive data, I could put those in different containers to make sure there's no crosstalk between them, so they can't see each other's resources. And like I said before, I can package up everything I need to run an entire application or system, put it inside a container, and allow someone to deploy it easily.

For those of you who have used Docker — what have you done with it? You're using it for AutoLab, right? Yeah. So one use case of containers is the autograding software that they're trying to deploy here — well, you have deployed, right? It works flawlessly. It uses containers to isolate tests from each other and to isolate the tests from the rest of the system. So if I run potentially malicious, buggy student code inside a container, I can make sure it doesn't destroy the entire root file system or do something else bad. We get away without doing this with test161 because we have sys161 there — that's what allows us not to have to worry as much about this. But if I'm running an arbitrary piece of code provided by someone I don't trust — don't run it on your main machine; who knows what it's going to try to do? Yeah, "rerun me as sudo right away" — that's what it's going to say. And, for example, the Discourse forum that we use for this class is in a Docker container — that's how you deploy it, and it's very easy. The reason is that it requires a bunch of different things: it requires certain versions of Ruby (sorry, I didn't write it), it requires things like Redis and Postgres databases that run inside the container, and all of that stuff is packaged up together, isolated from the rest of the system, and can be deployed very easily.

Okay, so back to the overhead question. What had to happen in a virtualized environment for an application running inside the virtual machine to make a system call? What's the first thing that happened? This is a good review. Forget Xen — Xen obviously optimizes this a little bit — but on a normal system, what happens? A guest application makes a system call. Where does that go first?
Yeah, Steven. Yeah — it has to trap to the host operating system. The host operating system has to see every interrupt and exception on the system to maintain control. It then has to re-vector it to the hypervisor, which then has to hand it back to the guest operating system. So there are a bunch of transitions there. What happens when a process running inside a container, with OS virtualization, makes a system call? What happens? Yeah — it just goes to the operating system. There are no other operating systems running on this machine; there's only one operating system that can handle it. That's it. So the overhead here is dramatically reduced, which is nice. And again, going back to this: the hypervisor has to do all this work, and potentially some of it is slow, like handling x86 page tables — one of the reasons Xen worked on that is because it actually is pretty slow. Now, as I mentioned in my Discourse post, some of the new hardware features Intel is introducing are, I think, designed to help address some of these things, so that's good. But there's still some unavoidable overhead to providing full hardware virtualization, and I don't have to pay that anymore, because there's only one operating system.

What I do need to do is sort out naming issues. This is kind of the core of container virtualization: names — getting names to work properly. So what sorts of names does the container have to virtualize? If I want things that run inside the container to not be able to peer outside, or look at other things on the system, what are some of the names that the operating system has to start to virtualize or attach identifiers to? Give me one example. Yeah — the file system. That's a huge one. If I can look at other files on the system, I'm already outside of the container. So the container has to have its own view of the file system — not necessarily its own file system; we'll come back to this in a minute. The way Docker does this, for example, is very clever: I can let the container use parts of the file system provided by the outside machine, or none of it, and force it to establish its own paths. What else? What are other examples of names? Some of these are numbers, because it's a computer program. Yeah — so memory is interesting. Memory isn't so much a name I have to worry about; I do worry about controlling memory, remember, but the operating system already naturally provides memory isolation between different processes. Is that something Docker already handles? As far as I understand, processes that run inside the container have their own address spaces. Now, in order to share memory with each other, processes have to communicate with each other to set up those shared memory mappings. And what's required to communicate with another process? Right — there's an example of another name that I have to make sure I isolate inside the container. If I can't name a process, I can't communicate with it to set up a shared memory mapping, and so there's no problem. Does that make sense? If I want to use shared memory inside the container, I can do that, but I can only do it with other processes that run inside the container — which is exactly what I want to happen. So yeah, it shouldn't. If I can prevent the container from seeing upward, that can potentially prevent it from seeing down as well.
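The point above is about naming processes, but the same idea shows up anywhere sharing requires agreeing on a name. Here is a minimal sketch of mine using a named POSIX shared-memory object — the name "/demo_shm" is made up for illustration. Two processes can only map the same pages if both can resolve that name, and a container's isolated namespaces are what keep names like this from being visible outside it.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* A minimal sketch: sharing memory means agreeing on a name.  Any process
 * that can resolve "/demo_shm" can map the same pages; a process that
 * can't name it simply can't share. */
int main(void) {
    int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    if (ftruncate(fd, 4096) != 0) { perror("ftruncate"); return 1; }

    char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    /* Visible to every process that maps the same named object. */
    strcpy(p, "hello from a shared mapping");
    printf("wrote: %s\n", p);

    munmap(p, 4096);
    close(fd);
    shm_unlink("/demo_shm");   /* remove the name when we're done */
    return 0;
}
```

(Older glibc needs -lrt to link shm_open.) On Linux these objects live under /dev/shm, so a container with its own mount namespace gets its own private set of names, which is one way sharing stays confined to the container.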
Yeah, I mean, I don't know if it's better in every way, right? That's a great question — I hadn't even thought about that. There's clearly an enormous amount of interest in container virtualization. But okay, great question: what can I not do? There were a few things on that earlier slide that I got rid of or didn't strike through. What are some of the things I can't do with container virtualization, because I'm sharing the same kernel? Yeah, right: I can't support different operating systems, so that's one thing. But I also can't do any operating system customization that's heavily tailored to one workload. For example, if I have one piece of software and I want to modify the operating system to tune its performance for that one piece of software — and this is something people do; when you set up databases, for example, Postgres makes all this noise because it wants the ability to map huge amounts of memory, and that's something you actually have to reconfigure the kernel to allow. To the degree that I want to configure kernels to tightly match the software running on them, I can't do that here. The kernel configuration is shared across all containers, so any changes I make that improve performance for the software set in one container may not help another. But you make a great point: one of the reasons people are so excited about this technology is that it does solve a lot of the problems we wanted to solve, in a much more lightweight way. Yeah.

Oh yeah, there's no magic here, right? We haven't created more physical resources. One way to think about it is that every container is like running more software on your computer — it just all comes very tightly packaged, and it has limited visibility outside the container. So it's really just like running a bunch more applications. So yeah, absolutely, container software will slow down. If you take whatever piece of software you want to run and you try to run it in a thousand containers on an underpowered machine, they're going to go slow. Yeah, Steven. Like we just said, the overhead of getting in and out of the operating system is potentially lower. And how fast things go — remember those Xen graphs we looked at — has to do with how much you're using the operating system. Anything that does heavy OS utilization on a system providing full hardware virtualization slows down because of all those traps, right? But remember, a hypervisor is scheduling — yeah, that's one thing hypervisors have to do: they have to schedule between multiple virtual machines. That's a good question; I'm not sure I know the answer. Does OS virtualization benefit because Linux knows more about what's going on inside those containers? Because again, the container just holds processes that Linux has full visibility over. Does that improve scheduling and resource allocation? It might; I'm not sure. I can see cases where it would help, and I can't necessarily see cases where it would hurt. Yeah, Dave? You also don't have... yeah, right — and that's something else, another reason people use hardware virtualization. Yeah. No, not really. I mean, look, again, you have the same number of physical resources. I've still only got four cores, you know?
Whether I have a hypervisor scheduling operating systems that are then scheduling processes, or a Linux kernel that is scheduling containers and the processes inside them, doesn't really matter, right? There are differences in overhead, but if I have a really CPU-bound virtual machine, it's stealing CPU from every other virtual machine running alongside it. And this is something you notice. Say I have a machine with eight CPUs and I'm running 32 virtual machines on top of it: when one of them starts to really hog resources, you notice, and the other ones slow down, because they have less CPU time. Yeah. All right, let me keep going and try to get through this.

File names — somebody already brought this up. For processes inside the container, I want to be able to control their visibility into the rest of the file system. So their names might resolve differently, some of the outside file system might be hidden from them entirely, and they might have files inside their file system that are not shared outside the container. User names — this stuff is actually surprisingly important. Say I have a container that has an entire web server setup in it. That setup includes the web server running as a particular user, with permissions on these files. If I want to duplicate that a hundred times on one machine, that user had better be different outside the container, right? If I try to literally reuse the same user, that's not going to work. So the user names inside each container stay the same, but outside the container those users may be different, and they certainly have to have different IDs and things like that. And then there's all sorts of networking stuff you have to get right as well. We didn't really talk much about networking in this class, so I'm not going to get into this in detail, but obviously you potentially want the IP address for a container to be different, so that you can run an actual full-fledged server environment inside it. And then you can do all this fun stuff where you set up virtual networks between containers, so multiple containers running on the same machine can talk to each other as if they were separate machines, despite the fact that they're not — they're all running on top of the same hardware.

Okay, so naming. Naming is how we keep things separate, and that's a new problem that we have with OS virtualization that we did not have before in quite the same way. Control is an old problem — control is something the virtual machine monitor needed to be able to do as well. Things I might want to control are the container's usage of resources: I might want to place limits on the amount of CPU time it gets, the amount of memory it can use, the amount of bandwidth it has access to. Again, this is stuff that the hypervisor also thinks about, and these are things you can configure when you set up virtual machine environments. They're very similar.

Okay, so the thing that's cool about OS virtualization — at least I think it's cool — is that it's built on a bunch of old tools and techniques. For example, there's a tool that dates from the 1980s called chroot. Has anyone ever used this? Am I saying it right, by the way? Okay, I've never used it. I don't know why you would use chroot directly — it seems very scary, right?
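To make it concrete, here is a minimal sketch of using chroot the way it's described next: get root, re-root the process, then drop privileges so the confined program can't chroot its way back out. The jail directory /srv/jail and the unprivileged UID/GID of 1000 are made-up values for illustration.

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <unistd.h>

int main(void) {
    /* Re-root path resolution for this process: from now on "/" means
     * /srv/jail, so a lookup of /tmp really resolves to /srv/jail/tmp. */
    if (chroot("/srv/jail") != 0) { perror("chroot"); return 1; }

    /* Move the working directory inside the new root so we don't keep a
     * handle that points outside the jail. */
    if (chdir("/") != 0) { perror("chdir"); return 1; }

    /* Drop root.  A process that stays root can chroot() again and climb
     * back out; an ordinary user can't. */
    if (setgid(1000) != 0 || setuid(1000) != 0) {
        perror("drop privileges"); return 1;
    }

    /* Run the confined program inside the jail. */
    execl("/bin/sh", "sh", (char *)NULL);
    perror("execl");
    return 1;
}
```

Note that the jail has to contain everything the confined program needs — including its own /bin/sh — which is an early hint of why container images ship with their own file systems.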
So essentially chroot allows me to change the root file system of a process. A name that would normally resolve starting from the root now resolves starting from somewhere else. Does that make sense? So /tmp is no longer /tmp — it's, say, /home/bar/tmp, whatever. I can essentially re-root the file system for a program at an arbitrary point in the file system tree. And you already know how this is done — we talked about path resolution. What did I do when I resolved paths normally? I started with the root inode and worked forward. All I have to do to get chroot to work is start at a different point. I do exactly the same thing, but rather than starting with inode two, which is actually the root, I start with a different inode number and perform the same process. Yeah — yeah, I think so. Ooh. Yeah, I think you need root to chroot, and you don't want to let the chrooted thing break back out, right? Because sometimes you're chrooting precisely because you want to limit the program. So I think the design pattern you're supposed to use is the one in the sketch above: get root privilege, chroot, and then drop it, right? Yeah — because if you keep root, you can chroot back out. Okay.

And then there's a bunch of other, lower-level tools and services that things like Docker use. Here are some of them. Over the past 15 years, Linux has been working on this idea of kernel namespaces, and this applies to a lot of the things we just talked about. Mount points: namespaces allow me to provide certain processes with a different view of the file system — I can mount different things in different places. Process IDs, networking tables — this is all the stuff we just talked about — and devices: I can create a namespace so that things running inside it can't see certain devices, or can see devices that other things don't see.

Okay, cgroups. This is the control portion. Linux namespaces are where all the naming stuff comes from; cgroups are the control aspect. The idea with cgroups is to limit and isolate groups of processes from each other. You can think of it as: I can put a bunch of processes into a group, and once they're in the group, I can do things like limit the amount of memory the group uses, limit the amount of CPU time the group uses, or assign the group to a namespace so it only has certain visibility into the rest of the system. So you can kind of see where things like Docker are coming from, right? They're using these capabilities that are part of the underlying operating system.

A process's children also remain in the same cgroup, which is important from a resource-management perspective. I didn't realize this — I never understood the double-forking design pattern until yesterday, and I wouldn't encourage you to try to understand it either, because it is sort of weird. But in certain cases, with old process-management tools, services could sort of escape by forking a few times, because they would lose their connection to the process tree. Remember: if I fork, and then I fork again, and then the middle process dies, that new process is reparented to init — it's an orphan. It's detached itself from the process tree, and in certain cases that would allow things to get away from the tools that were trying to control them. With cgroups, if you fork and fork and fork and fork again, you're still inside the same cgroup, and so the resource limits I'm trying to apply to you still apply. This can also be useful for things like web servers.
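Here is a minimal sketch of what that control knob looks like at the filesystem level. It's my illustration of the raw kernel interface, not what Docker literally does, and it assumes the cgroup v2 interface mounted at /sys/fs/cgroup plus root privileges; the group name "demo" and the 256 MB limit are made up.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Write a short string into a cgroup control file. */
static int write_file(const char *path, const char *value) {
    FILE *f = fopen(path, "w");
    if (f == NULL) return -1;
    int ok = fputs(value, f) >= 0;
    fclose(f);
    return ok ? 0 : -1;
}

int main(void) {
    /* Create a new control group... */
    if (mkdir("/sys/fs/cgroup/demo", 0755) != 0) perror("mkdir");

    /* ...cap its memory at 256 MB (this value can be rewritten later to
     * grow or shrink the limit while things are running)... */
    if (write_file("/sys/fs/cgroup/demo/memory.max", "268435456") != 0)
        perror("memory.max");

    /* ...and move this process into the group.  Every child forked from
     * here on stays in the group, so the limit follows the whole tree. */
    char pid[32];
    snprintf(pid, sizeof(pid), "%d", (int)getpid());
    if (write_file("/sys/fs/cgroup/demo/cgroup.procs", pid) != 0)
        perror("cgroup.procs");

    return 0;
}
```

The fact that children stay in the group is exactly what the web server example coming up relies on.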
So, back to that web server point: we talked in the past about whether a process can get more access to the system just by forking a bunch of copies of itself. Assuming they can work together, cgroups allow me to control the entire group of processes. So if nginx decides it wants to start 82 workers for some reason, all of those can still be controlled inside the same group. And these are things that are useful outside of container virtualization too. This is pretty cool.

So how do I provide file system access? There are a number of different technologies in this space, but one that Docker uses — or used to use — is something called UnionFS. This is a stackable union file system: a file system that provides a union of the files available in several different layers. I configure things into layers, and the way I do path name resolution is that I try to resolve a particular path in the top layer. If that works, that's the file I get. If it doesn't work, what do I do? I go down to the next layer of the stack, and I keep looking until I either find the file or there are no more layers to look at. I can also do cool things like hide parts of the lower file systems. So I can have occlusions in my union file system, where even if a file exists in some lower layer, a configuration in a middle layer says: stop path name resolution here for this entire part of the tree. For example, I can have a configuration that says I can't access anything rooted at foo in any lower layer, and so even if that file exists — for example, in another container or somewhere else — I can't get at it.

One of the things Docker did to improve performance, apparently, was apply a copy-on-write technique — did we talk about copy-on-write when we did the memory management stuff, or was that one of the lectures I missed? No? Yes. So the previous Linux container implementations did this thing where, when you created a new container, they copied a large part of the file system that the container could have access to. Because even for files that are shared between the rest of the system and the container, I may not want to see the container's modifications; I may want to keep those modifications hidden inside the container. What Docker does instead is copy-on-write: it tracks which files the container is allowed to modify, and when those modifications actually happen, it makes copies of them. So it doesn't copy anything up front. Again, this is one of the reasons it's so fast to bring up containers: there's very little upfront resource provisioning that has to be done. Okay, almost done. So — yeah, it can, but it doesn't have to, right? Put it this way: any resources that really have to be portable from machine to machine, I need to put inside the container. And this is something Docker does: Docker containers come with their own file systems. So yeah, if you want things to work perfectly everywhere, everything that's used inside the container has to be moved around with it.
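The layered lookup described a moment ago is easy to caricature. This sketch is mine, not UnionFS or Docker code: the layer directories are made up, and real implementations also handle whiteouts/occlusions, directories, and the copy-up step for copy-on-write — this only shows the top-down search order.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Layer stack, top first: made-up paths standing in for a container's
 * private writable layer, the application image, and the base image. */
static const char *layers[] = {
    "/var/lib/demo/layer2",
    "/var/lib/demo/layer1",
    "/var/lib/demo/layer0",
};

/* Resolve `path` against the layers: return the first layer that actually
 * contains it, mirroring union-file-system path resolution. */
static const char *resolve(const char *path, char *out, size_t len) {
    struct stat st;
    for (size_t i = 0; i < sizeof(layers) / sizeof(layers[0]); i++) {
        snprintf(out, len, "%s%s", layers[i], path);
        if (stat(out, &st) == 0)
            return out;        /* found in this layer: stop looking */
    }
    return NULL;               /* no layer has it */
}

int main(void) {
    char buf[4096];
    const char *hit = resolve("/etc/hostname", buf, sizeof(buf));
    printf("%s\n", hit ? hit : "not found in any layer");
    return 0;
}
```

Copy-on-write then just means: when the container writes to a file found in a lower layer, copy it up into the top writable layer first, which is why creating a container requires almost no up-front copying.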
Okay, so just to bring this together and talk about one of the more popular, sort of hot containerization technologies out there: Docker is built on a bunch of these older Linux containerization and isolation technologies, and this is an active area. Docker has gone through several iterations where they've used different versions of what was available. Early versions were based on Linux Containers (LXC), and now I think they have their own containerization library that they developed — libcontainer, I think, is what they're using now — but you can also configure it to use these other containerization libraries and approaches below it. And then within Linux, we talked about cgroups and namespaces, and there's also a variety of other Linux kernel primitives and support that things like Docker need in order to work. Does that make sense? So this is kind of a neat story of the evolution of the Linux kernel itself, plus the addition of these nice libraries and tools for doing the things Docker does.

Docker also has these nice ways of layering file systems. If you've updated a Docker container, sometimes you'll see this: essentially what it's doing is bringing in a new layer of the file system that runs inside the container. That's a nice way to update things. If I have changes I want to make to a container that's already deployed, I deploy essentially a diff that contains those changes; the container is stopped, the diff is applied, the container gets restarted, and the files inside the container have changed.

Here's an example — this is a Dockerfile. I don't understand this syntax entirely, because I've never written one of these before; my students have, though. So what this says is — let's see here — when the container is run, apt-get, for example, is a tool that it's using from outside the container, as far as I understand. Maybe — is that true? Okay. Okay, yeah. And then here's a volume that's shared outside the container, I think, and here's a command that's run when the container is started. So this is a container that runs MongoDB — it's designed to containerize Mongo. This installs the latest Mongo version — sorry, it installs the Mongo version that's configured right here, which is 2.6.6 — and when the container is booted, this runs Mongo. Docker container file systems are not persistent unless you configure them to be. So in a case like this, this volume right here specifies that this data should be located outside of the container; if it's not, it's not persistent. If you didn't do this, every time you stopped and restarted this container you'd have a fresh Mongo database with no data in it, which is probably not what you want. It's also interesting to notice that the Dockerfile says this data is going to be located outside of the container, but it doesn't configure where it's located. That's configured when you run the Docker container, and the reason for that is that if I wanted to run 10 copies of MongoDB on the same machine, I'd need to find 10 different directories outside the container to put their data in. Inside the container, the data looks the same — that's the nice thing about it. Every copy of MongoDB running inside this container thinks it's using data located in /data/db, despite the fact that you, as a system administrator, have to find different places to put that data when you run it.

Okay, any questions? So we're done with the week on virtualization. On Monday, Carl will be here, and he's going to talk about performance benchmarking. So I will see you guys on Wednesday.