All right, good morning everybody. It's like exponential decay in terms of class attendance. These are the few, the proud. OK, let's all stand up. It's Wednesday. All right, you guys can sit down. So we're done with most of the big units in the class. We've talked about how we abstract three different types of hardware resources: processors, memory, and stable storage. For the rest of the class, we're going to cover a couple of smaller but equally important topics. Today we're going to talk about how operating systems themselves are structured. Sometimes you can do this material before you dig down and talk about individual components. We've actually talked about those individual components, so now we're going to pop up the stack a little bit and look at how all these pieces fit together into a complete program, the program that you run that is your operating system kernel, and how that's structured. Then through the rest of the class we'll talk about performance, and then do some virtualization next week. Any questions about the plan going forward? As far as announcements go, there really aren't any. Guru is out of town today, so I think his office hours today are canceled; just check the calendar, but he'll be back tomorrow. Other than that, any questions about course logistics, any of the last little bits? From office hours attendance, it's apparent to me that people have realized that you guys have some work to do. So that's good. All right, so again, at this point we've talked about scheduling and process management in terms of abstracting the processor, we've talked about memory management, and we've talked about stable storage. What we haven't discussed is some of the other things that operating systems do that are probably equally, if not more, important.
So what other features or standard parts of operating systems have we not discussed? Who can name one of them? Corbina? Let's say you had an operating system that did those three things. Let's say your laptop did these three things, but nothing else. What would you be sad about, Robert? You'd have no network. Yeah, that would kind of stink. So networking stacks are implemented in the kernel. Networking protocols themselves are a little bit off topic for a discussion of core operating systems, but this is a big part of what operating systems do: they support network devices. So that's pretty important. What else? Tim? Yeah, OK, graphics and display management, that's true as well. What else? Security, right? If anybody who used the computer could read everything you did, that wouldn't be great. Maybe less important for consumer devices to some degree, but certainly on shared login machines, and on any other public machine, you would be sad if everybody could read your files or capture your keystrokes or whatever. What else? Yeah? Power management, right? This is becoming more and more important on energy- and battery-constrained devices. How do we make sure that the device is using energy appropriately? That's a good one. Sean, anything else? We've gotten a bunch of these, right? So, user permissions. We've talked about how the kernel can multiplex these resources between processes, but we haven't really talked about enforcement of resource allocation between users on a system. What's one piece of that that an operating system might care about? I have a big shared user machine. What resource might I want to divide up fairly? Yeah, so I can do permissions, and you guys are probably pretty familiar with standard UNIX permissions, but what other types of per-user control might I want? Mukta?
You guys may have run up against some of these limitations in the past. Andrew? I'm not even thinking that low-level; I'm thinking something fairly basic. Greg? Yeah, OK. So I might want some types of permissions to determine who can do what to the machine. That's a good answer. But that's still permissions, right? Amit? Sumit? Yeah, you could do per-user memory management; that's sometimes done. I'm also thinking of something even simpler, which is just per-user file system quotas. How many people have ever exceeded their quota on a shared user machine, where you start getting emails saying that somebody's angry with you and you have to free up some space? Thankfully, this stuff's going away now because storage has gotten cheap. But this used to be a big deal. I remember, when I was your age, getting my courage up to email the administrators at my local computing cluster to ask for a quota increase to 200 megabytes. I was like, ooh, I'm a power user now. Now Google gives you several gigabytes of free email. So some of this, again, is going away, which is nice. Another thing that we've totally avoided this semester is device drivers and the low-level details of how devices are supported. A lot of the code that gets written for operating systems falls into this category, and a lot of it is also pretty problematic, as we'll come back to today. So if you start to think about the operating system that we've discussed, there are components inside it that you guys would probably be pretty familiar with by this point. But there's also all this other stuff. And as you start to think about the relationships between these pieces (this is just kind of a made-up diagram, and some of the lines are guesses), this starts to get pretty complicated.
These are big, complex pieces of software, probably some of the larger and more complex pieces of software that are in active development and in wide use. OK, so I just like this clip. How many people have seen The Social Network? Yeah, so you guys may remember this part. Oh wait, sorry, hold on. So the professor here is modeled after my former advisor, Matt Welsh. Look at this slide. Did you guys see that slide? Let me go back and pause it. There it is. So it's funny: when I saw this movie and I saw this slide, I was like, they had to have made that up. There's no way. I sat in these lectures for years, and I never remember seeing a slide that complicated. Then when I started preparing lectures for this class and I was going through Matt's slides, I found that slide. That's a real slide. See, there's your MMU, there are the pages, these are different address spaces. I don't know how many hours Matt spent making that slide, but you guys can probably almost understand it at this point. Anyway, this was just to give you an idea of how complicated these things can get. Is there a good part coming up? Of course not. All right, that's just my little trip down memory lane. Let's see. So essentially, here's the design of the kernel that we've talked about up to this point. We're presenting this interface to user programs, which you guys just started to implement for assignment two. And at the lowest level, what the operating system is really dealing with are devices. What's called a monolithic kernel design, which is the first one and the most common one we're going to discuss today, is essentially this. I mean, it's a little bit more structured than this, but in a monolithic kernel, all the operating system code is loaded into a single shared kernel address space.
So within the kernel itself, the kernel isn't using any of the hardware features, like memory protection, to protect itself against itself. For example, let's make this concrete: if one of the synchronization primitives in your file system driver is buggy, it could potentially overwrite operating system structures that are used by anything, like the structures used by your system calls. Any of the parts of the operating system can corrupt each other and clash with each other. And all the operating system code runs at the high privilege level. We talked about user versus kernel privilege; essentially, all the OS code runs with identical privilege, the highest privilege level. So all the operating system code is in a position to corrupt the state of the machine, or affect the state of the machine, with those same privileges. Essentially, the operating system is just set up as this one big, potentially very ugly and complex program. If you've ever sat down and tried to hack on Linux, or if you go on and take Ken Smith's class in the summer and start hacking on BSD, you'll see that these are really big, complicated pieces of code, and if they're not designed very carefully, they can get pretty ugly. The structure of the operating system itself is essentially not helping here: all the OS code is running in one address space, all of it is privileged, and so you're not using any of the hardware features to help you. If you think about it, maybe this seems like a bad idea, but the reason it's done is because it's the easiest way to get started. It's the most natural approach: I just sit down and start hacking. I don't worry too much about how to design things, how things have to fit together, and what the interfaces between my components are. And the other real reason here is that it's fast.
When we talk about other operating system designs, you have to come back to the fact that transitions between modules in a monolithic kernel are just function calls. When you guys start working on assignment three and you start designing the various pieces of the system that you need to finish the assignment, the interaction between those pieces is done through C function calls. And that's very, very fast: much faster than transitioning in and out of the kernel, and frequently much faster than other ways of communicating, like message passing, which we'll talk about in a few minutes. But I want to point out that there are three problems with this approach that certain types of operating system redesign have tried to address, and there are also operating system features built specifically to address some of these issues. The first one is rigidity. Does anybody know what the Boy Scout motto is? Alyssa? All right, she knows the Boy Scout motto. I knew it too; I just didn't say anything. The Boy Scout motto is "Be prepared." But on the other hand, think about it, and again, you have to kind of forget about some of the OS features you know about. Say you took your whole operating system, everything you needed, you compiled it, you had one big binary, you loaded it onto your machine, and it works great, until the user tries to plug in a new device or mount a new kind of file system or whatever. At this point the monolithic kernel, if it doesn't have some sort of feature to enable this kind of flexibility, is just like, I don't know what to do. You have a new device? Well, OK. I hope you have a spare machine on which to recompile your kernel so that you can now include the driver for that device. How many people have compiled Linux on a machine before?
So when you compile Linux, if you're brave, there's this configuration step where you can launch this nice GUI that will essentially ask you, I don't know, probably up to several thousand yes-or-no questions. Actually, it's worse than that: it's yes, no, or M for module. So it asks you these thousand questions, you don't know the answer to any of them, and most of them are about things you've never heard of in your entire life. I remember doing this, and it was like, do you need support for blank blank blank? And I was like, I don't even know what that is. So I would say no, and then I would try to boot the kernel, and it wouldn't boot, because I had one of those in my machine somewhere. I just didn't know what it was. So this is kind of how these kernels were. This was back when it was really important to have a kernel that had all the device drivers you needed loaded at boot time. That isn't important anymore. But in the strictest sense, if there's no support for any kind of flexibility at runtime, you can have this problem. You buy a new device for Windows, and if Windows were a pure monolithic system with no flexibility, you'd be stuck. You'd have to buy a new version of Windows that supported this new device. All right, so programs have bugs. This cake has a bug; I don't know if anybody can see what it is. In the interest of my online reputation, I won't repeat what the cake says. But anyway, people make mistakes. And what happens if the kernel that you built on your system just has a little problem? Monolithic kernels have this issue, and it goes back to what we talked about before: all the code is running in one big shared address space, and all the code is running with kernel privilege.
So if some unimportant piece of your kernel, or a piece you might think is relatively unimportant, crashes, for example a part associated with a device that wasn't written by the people who wrote the kernel, the whole kernel just dies and the system crashes. This is kind of unfortunate, especially given the amount of code that's out there. And then finally: there was a great article in The New York Times a few years ago about the use of PowerPoint by the military. Apparently, if you join the military, one of the jobs that you might find yourself doing is full-time PowerPoint slide writer. There are people in the military whose job is literally just to spend their entire days preparing PowerPoints for military presentations. And this is an example of a slide that someone probably spent, I don't know how many man-hours on. I think it's supposed to describe the relationships between different tribal or ethnic groups in either Afghanistan or Iraq. What you can take away from the slide is that this is very complicated. I'm not sure you can take much more than that away from it, but it is pretty cool. To some degree, as operating systems get more complex, and as we saw before, the relationships between these components can get very messy, and it becomes very, very difficult for anybody to make fixes or changes. You want to change something, and suddenly you're talking about this incredibly delicate Rube Goldberg type machine, where making a single change somewhere might affect everything else in ways that you don't fully understand. So this is another thing that gets very difficult: making even one small change. Say you're hired by a company that does some Linux development, and suddenly you need to fix a bug in some piece of code that you don't understand.
And suddenly you find yourself having to understand the entire Linux universe just to make this one- or two-line change, because the implications of that change might ripple across the entire system. So today we're going to focus on these three problems: rigidity, safety, and complexity. These are not the only problems with monolithic designs; there are others, and there's been a lot of research into how to improve monolithic designs. And, to avoid ruining the story: a lot of the alternatives to monolithic designs haven't really caught on in their own right, but what they have done is provide perspectives on better ways to design monolithic kernels. You can design very well structured, very safe, very flexible monolithic kernels; operating systems are complex, but you can make them as uncomplicated as possible. That can be done, and some of the design principles from these other approaches have bled into monolithic design. All right, so when we talk about rigidity, we're essentially talking about trying to modify operating systems on the fly. Something happens to the system, a device gets plugged in, or you're trying to adjust parts of the operating system to improve performance. So what sorts of techniques have we already talked about for doing this? When you've started to poke around your own systems, you've probably realized that Linux, for example, has ways of, first of all, facilitating communication between user space and the kernel, and then also allowing certain things to be tuned at runtime. So what's an example of a way to do this? Yeah, /proc, right? /proc is a way that the operating system exposes information, and it can also actually receive information from user space. There's also something called /sys, or sysfs, which provides a way to pass parameters in or adjust the kernel's behavior at runtime.
An example of this that I just happen to know about: Linux provides different approaches for controlling energy usage on your system, and the approach in use can be changed at runtime from user space. If you have the right permissions, you can tell the kernel, please use this different approach. And assuming the kernel knows what that approach is and how to do it, the kernel will say, OK, I went from aggressively trying to save energy to trying to maximize performance. You can imagine that on mobile systems that kind of flexibility can be nice. But I would argue that this next one is probably the most important thing here, because users do this: they buy new stuff. They go out computer shopping and come home with some new doohickey, and they want to hook it up to their computer and have it work. The one primary mechanism that kernels have developed for dealing with this is what's called loadable kernel modules. Loadable kernel modules allow the kernel to adapt its behavior at runtime, and it's a very neat approach. Now, to some degree, I could get around this problem another way: when I was compiling my kernel, I could essentially tell the kernel, include support for every device known to man. I want my kernel binary to include a device driver for every single device on Earth. Every device you know how to support, include it. And on Linux, there are ways to have the kernel include these as part of the kernel binary. What's the problem with this? As I start to add more and more devices, what happens to the kernel binary itself? Yeah, the kernel starts to get huge. Because, if you think about it, most of the kernel binary is code, compiled code.
So what I'm telling the kernel is: add more and more and more code into this binary, just in case I happen to plug a ham radio or something like that into my computer. Something that I'm probably not going to use. I don't have a ham radio; do you plug a ham radio into your computer? Maybe not. I'm sure there's one that you can, and there's probably a Linux driver for it somewhere. But the point is that if you tell the kernel to be prepared for everything, the amount of stuff that it needs to include in the binary starts to get really large. So instead of doing this, what I do is create an architecture for loading and unloading what are called kernel modules. A module is a compiled piece of code. Modules are usually built alongside the kernel, and they rely on kernel libraries and kernel support. But there are several things that are nice about them. First of all, they're not part of the kernel binary; the kernel binary just has a loadable module system, and modules can be loaded and unloaded at runtime. How many people have ever done this on a Linux system, loaded or unloaded a module? OK, it's less common than it used to be, I guess. Second, they can be independently recompiled. So if there's a bug fix for a particular driver or loadable kernel module, that fix can be independently distributed. You don't have to recompile the whole kernel; you just recompile the kernel module, unload the old version, load the new version, and your system keeps going. Also, they don't take up space in memory until you actually tell the kernel to load them; at that point, they're incorporated into the kernel's address space. What this means is that they can be loaded and unloaded as devices are plugged in and detached.
So essentially, one of the things that happens when you plug a device into your computer, at a very high level, is that the kernel looks around and tries to figure out: hey, do I have any code lying around that helps me interact with this device? If it finds some, it loads it, and then the device starts to work. When you unplug the device, the kernel says, I've got this part of my address space sitting around that I don't need anymore, and it just unloads it. This is something that happens essentially transparently now on most systems when you plug in a device and remove it. Most modern operating systems, probably all of them, support this in some form or another. They call it different things; dynamically loaded libraries are, I guess, something else, but on Linux these are loadable kernel modules. I don't remember what Windows calls them, but all operating systems have support for this. It's necessary. The other thing is, where do these frequently come from? What did you use to get alongside any sort of brand new doohickey you bought? The internet's made this a little different now, but before the internet, Sarah? Yeah, a disk. And the disk essentially had this on it: some code that said, hey, this is how to operate this device. And where did that code come from? Did it come from Microsoft? No, it came from the vendor of the device. But it was designed to be compatible with the existing architecture that the operating system had set up for supporting devices. All right, so let's talk about the second problem, the safety issue. And actually, this is a nice segue from loadable kernel modules and drivers, because it has a lot to do with drivers. So: users don't like crashes.
And kernel designers are really focused on this issue, too. I mean, you'd be surprised; Microsoft is very concerned about this, and most operating system vendors are. But here's the real issue that frustrates kernel designers: these guys are hardcore, very, very practiced programmers, with years of experience, incredibly defensive programmers. They're extremely smart, they know what they're doing, and as a result, their code doesn't crash very often. So what crashes? I just said that, obviously, kernels do crash. You guys have seen blue screens, like this one. Does Windows still show a blue screen? Yeah? What's that? Light blue? Light blue, OK. So they've changed colors now. OK, so core kernel components don't crash much, and that's not that surprising, because core kernel components are tested all the time: memory management subsystems, process support, all this stuff. These get banged on every day. If you end up being one of the three people who write some of this code for Windows, you have a testing community of billions of people, so bugs just don't live that long in these core components. But what components cause problems? What components are less widely tested and potentially less adeptly developed? Yeah? So Microsoft estimates that 89% of Windows crashes are caused by device drivers. Call it 90%. And, I don't know, I've worked at Microsoft before, and they kind of have this sad-sack mentality about this, because they know that 90% of the time, it's not their fault. But who do you call when this happens to your computer? Who does your dad call?
Your dad calls Microsoft: my computer crashed again; Microsoft; Windows; piece of crap; I can't believe I spent no money on this because it came free with my computer; whatever. So Microsoft gets all of these service calls where they're like, sorry, it's not us. Now, OK, fair enough, and I think they've finally come around to: well, it's still our problem, because those people do call us. People aren't like, ooh, that looks like the device driver for this brand new Logitech mouse with eight buttons that I just bought; it must have a memory leak or whatever. No, they don't figure that out. They just call Microsoft. But again, device driver code is written and rewritten. Every time you ship a new device, you have to rewrite this stuff, and it's written by programmers who usually aren't as well trained and potentially not as experienced. And there are a lot of device drivers. My guess is that probably at least 90% of the code that runs on your machine, if you run a Windows machine, wasn't even written by Microsoft. It's written by all these external vendors, because your machine has lots of components. And one of the things that Microsoft does, to their credit, is support practically every device ever created. That's one of their challenges. If you buy an Apple computer, I mean, they're getting better at this, but there's a fairly small list of devices that are compatible with it, hard drives for example, because Apple makes the software and the hardware. Whereas with Windows, they're like, we need to support every hard drive that anybody could find in any sort of dodgy computer store anywhere on Earth, because they're going to take it home and plug it in, and if it doesn't work, they're going to be mad.
So it's a bigger challenge. I wanted to point this out because I thought it was kind of a funny story. So, device drivers: let's say it's your first day of work. Actually, I should be careful here; I don't want to get sued, so let's make up a fake device company. You start your work at FakeDevice, and FakeDevice is pushing their new FakeDevice mouse. We're going to nine buttons; it's time. It's like a two-handed mouse. And the eight-button mouse was a big hit, so your job is to develop the device driver for the nine-button mouse. What do you think you do? How do you start? They want it next week; they're about to ship the product. The guy before you just couldn't deal with nine; his brain circuitry exploded. But he wrote the eight-button mouse device driver. So what do you do? Where do you start? What's that? Yeah, you start with the code for the eight-button mouse, and you just hack on it until it kind of works for the nine-button mouse, and that's what you ship. So again, lazy programmers, or not just lazy, efficient programmers, start with earlier versions of the code and make changes until they get it to work. Cut some corners off, stretch it a little bit here, and now you have what you want. So here's the question: what happens when the company discovers a bug in the device driver for the six-button mouse? How many versions of the device driver do you think that bug is in? Yeah, so essentially, and they've actually done studies on this...
You see this generational propagation of bugs in device drivers, because the six-button mouse was the basis for the code for the seven-button mouse, which was the basis for the code for the eight-button mouse, and so on. So you find a bug in a year-old device driver, and there are now eight devices that have that same problem. And there have been academic studies and various studies in industry about this type of problem. So it's kind of funny. And you guys may be discovering this as you work on assignments two and three: don't cut and paste code you don't understand. Just take the extra week. People can wait another week for the nine-button mouse; they're still wrapping their fingers around the eight-button mouse. They can wait an extra week for a version of the nine-button mouse that doesn't have bugs in it. So, obviously, we showed some statistics about this, and it has probably gotten worse. This is a big focus of research and development work, and I'm probably not even up to date on the latest stuff that Microsoft has done in Windows 7 and Windows 8 to try to improve the device driver framework. A lot of that is because, again, you can look at it both ways: on one hand, Microsoft can say, well, this isn't our problem; but people do blame you when there's a problem, so maybe you should help fix it. And the framework that they provide for developing and testing device drivers is something that can either isolate these problems or make them worse, so I know they've done a lot of work on this.
And there's all sorts of academic work on automatically testing device drivers, verifying device drivers, and trying to prove the correctness of device drivers before they ship, and things like that. So this problem has created a lot of interesting research. And there are other approaches that try to isolate faults caused by device drivers: don't let a fault crash the whole system, just have it take down the device, and things like that. This probably deserves its own lecture entirely, but I just wanted to give you guys a sense of what the problem is. All right, so going back to safety, there are other approaches, too. We talked about a problem with operating systems, which is that they have to rely on a lot of code that they didn't write, and that's an issue. One approach is to try to isolate the parts of the code that you think are going to be buggy. Another approach is to do actual automatic type checking of kernel code. As you guys have been working on OS/161 this semester, you've been writing code in C. C is, at this point in time, a very, very low-level language. When C was developed, believe it or not, it was considered a high-level language, but as time has gone on, the levels of abstraction in languages have risen. And C is not necessarily a great choice for writing operating systems anymore, because there are a lot of things that you'd like to be able to check when you build programs that you can't check in C. So there are research projects now that look at implementing kernels in managed languages. You can think of C# as kind of equivalent to Java, right?
So can I write a kernel in Java? Because then I can do type checking, and there's real power in these types of languages: all sorts of new capabilities that you can use to try to make sure things are right when you compile them, rather than waiting for things to break later. What's one of the other nice things you get in a managed language like C# or Java or Python? (I don't think you'd ever write a kernel in Python. One can dream.) What's one thing you guys have probably been struggling with when writing C, especially if you came from a Java background? Memory management, right? Dangling pointers, 0xDEADBEEF, forgot to allocate the object, allocated it and freed it, freed it and kept using it, forgot to free it and leaked it, whatever. All these sorts of problems, and at some level Java and these higher-level managed languages are great for this. But you can imagine that it's almost like we're going back to log-structured file systems: this sounds awesome, right? Who wants to keep calling kmalloc and worrying about what happened to that pointer and where it went? That sounds awesome. On the other hand, garbage collection overhead could potentially be a problem, and this is, I think, to some degree what caused kernel designers to resist doing this type of stuff for a long time. They said, well, sounds great, and yeah, it'll make things a little bit more correct, but when the garbage collector runs, it's going to be pretty terrible: a big effect on performance and things like that. I don't know how true that is. But I think that someday, and I don't know, we could take over/under bets about how long this will be, we're going to stop writing kernels in these low-level languages.
I mean, language development has moved on a huge amount since the 1960s and 70s when these kernels started to be developed, but on some level kernels are still written in these low-level languages. So I think we'll start to see kernels that are written entirely in high-level languages; at some point the hardware is just going to get fast enough that this is worth doing. Right now you still have these expert programmers and these systems built up in low-level languages that have been tested and verified and work quite well, and the speed is kind of worth it. But I'm willing to bet that at some point within our lifetime we'll see kernels written entirely in higher-level languages. We won't keep writing things in C forever. All right, so let's go to the third issue. Let's say you're writing an assignment for the class, or doing some normal software development. How would you structure a big, complex software project? What's the classic way of doing this so that you can complete it, so that you can isolate development and effectively use a large group of people? What do you do, Peng? Yeah, so I break it into modules, right? I say, okay, these are the parts of the system; there's some logical separation between different parts of this project, and you're going to write one of them, and I'm going to write another, and a third person is going to write a third one. Before we do that, we're going to agree on what the interfaces are, so that we can use each other's code.
We're going to say, okay, here's the interface to your piece of code: here are the functions, here are the return values, here's how I access it, here are the guarantees those interfaces are going to make. Awesome, you go do that; I'm going to use your code, and I'm going to understand how to use it because of the interface, but I don't have to understand how it works. That's your problem, right? So this is the typical approach. When structuring any large software project, regardless of whether it gets compiled into one big blob of code, or has to be split up and run in multiple places, or is run as multiple subsystems, this is always a good idea. Always. But in an operating system, there's another advantage to doing this: if we can split up the system appropriately, we might be able to minimize the amount of code that absolutely has to work. Minimize the portion of the code where, if there's some sort of failure in that code, the system is actually going to crash. In security, they talk about this in a different way: they talk about minimizing the trusted code base, the portion of the code that you actually have to believe is secure. Here we could say: what's the portion of code that we actually have to believe works? Code so important that, again, if it doesn't work, it'll crash the system. With a monolithic kernel, essentially the whole kernel has the potential to crash the system, but if we break things up carefully, maybe we can reduce the amount of code that has to work. So microkernels are something you'll hear about, and you'll hear about microkernel-based designs.
Microkernels were an idea and an approach that attempted to address some of the problems we've seen with monolithic kernels. What they attempted to say is: okay, right now, here's the interface to the kernel; the application makes a system call, and inside there are all of these pieces, and it would be great if we could actually layer them like this. They're probably not even layered like that; they're blobbed up into little corners in this box. Instead, the idea behind microkernels is to move as much code as possible out of privileged mode. You can think of it as taking a kernel, throwing out as much of the code that doesn't actually need to run in privileged mode as possible, and reducing the microkernel to be as small as possible. So what you can see here is that they've said: okay, I have a microkernel, and here are the things the microkernel is going to do. I especially need to support some very, very fast inter-process communication; we'll talk about why that's important in a second. I probably need to manage virtual memory; that's a low enough level thing that I'm really the only one who can do it. And I do some scheduling. That's it. Everything else, for example device drivers, now runs in unprivileged mode: device drivers run as application-level services. File servers and file systems run as application-level services; they do not run in privileged mode. I think the Unix server here is supposed to be a piece of code that actually supports some of the POSIX interface and various Unix-type functionality.
So again, the idea is to move as much code as I can out of the trusted code base, out of the part that actually has to work, and implement everything else as user-space servers. Think about the file server: that's a file system implemented in user space as a user-space program. So let me ask you a question: why does IPC become so important? Why is it so important to have really, really fast communication between multiple processes in a system like this? Yeah, that's exactly right. So imagine I have a file system, and the file system is using a disk device driver; at the end of the day it has to actually write and read disk blocks. In our monolithic kernel, all of this happens inside the same big shared address space, and all that communication is done essentially through function calls. Here, now, I've made both of them run in user space as these servers, and so all the communication they do has to go through my microkernel. If the file server wants to talk to the disk, it actually has to send a message through the microkernel to the device driver that supports that disk, and that messaging overhead starts to become really problematic. Essentially, all that communication I was talking about before is fast because it's just done through function calls. Now all of it has to be done in this very structured way. I can't just make a function call to another component; I have to create a message and send it to the other component, and doing that involves trapping into the kernel, giving the kernel the message, and having the microkernel deliver it. So again, the stuff that goes in the kernel is just the lowest-level stuff.
Low-level VM protection, a bit of hardware control, and then, again, a goal of really, really fast and simple IPC. My user-level services communicate with each other, and with applications, over well-defined protocols using messages. So if my file system server wants to write a disk block, it doesn't just make a function call. It puts the data for that disk block into a message and sends it to the device driver supporting that disk. In theory, right? This was one of the hopes of the microkernel movement: in theory, these user-level services can now fail without causing the system to stop. So if there's a bug in my file system driver, and that driver is loaded into Linux, it's probable that the bug is going to crash your system. Now what happens instead? Well, the file system server is implemented in user space, and so it'll crash the same way a user program crashes. And what happens after it crashes? What does the operating system probably need to do? Restart it, right? Because it's probably important; I'm trying to use that file system, so I need that file system server to be running. But the idea here, again, is that I'm reusing the capabilities of the system, the fault isolation, the same stuff I use to isolate processes from each other and from the system. Remember, when we started, we talked about how one of the goals of operating systems is to protect the stability of the system from buggy user code: your process cannot crash the entire system. This is a variant of that. Because I've implemented pieces of my kernel essentially as processes, I can use the same mechanism to protect my kernel from the failure of core kernel components.
Hopefully, again, I can just restart that core code and keep going. That was the idea. So microkernels were a hot topic in the research community in the 80s and 90s. What do you think happened? Can anybody guess? This sounds like a nice idea, right? Again, what I'm doing is reusing these protection mechanisms to limit the amount of code on the system that can actually cause problems. Sounds very nice. But what do you think the fly in this soup is? Okay, so, yeah, microkernels got big. That's kind of what happened, actually. But if I just went with this strict microkernel design, what do you think starts to become the limiting factor? Why? Yeah, the overhead of message passing, and these boundary crossings required for communication between parts of the kernel that didn't have to do this before. Again, in my big monolithic kernel, anything goes: I can just make function calls, and it's very, very easy to transfer information from component to component. The overhead of doing this in a more structured way ended up making the performance of these systems uncompetitive with more traditional operating system designs. And there were two common ways to improve microkernel performance, and both of them were kind of terrible. Varun touched on one of them. So, sorry, what was your suggestion for how microkernels fail? Yeah, but what's one way to improve the performance of a microkernel? Yeah, make it more monolithic: start moving things back into the kernel. It's like, well, I moved all this stuff out and it was really nice, I had these really well-defined interfaces, and life was good and I could restart things, but now I'm slow. So I'm going to go back the other direction and start moving things back into the kernel.
So yeah, I can migrate pieces back into the kernel, which kind of defeats the purpose. What's the other way to do this? You guys probably don't know this, but let's say I'm a purist: I like my interfaces, this is good, the system's more stable. What's the other approach? What's that? So yeah, actually one of the contributions of the microkernel movement was big improvements in IPC. They came up with really fantastic ways of making IPC fast and low overhead: zero-copy IPC, where the data just stays in the process that sent it; very, very efficient stuff. But this is a variant of what? The system's too slow, so I can make the microkernel really... what? Fast, right? And one of the ways people did that was by writing microkernels in assembly language. I'm just going to optimize the heck out of my microkernel. You guys are laughing, and it does seem kind of funny now, because, well, great, but we didn't want the development of the microkernel to take ages and ages. And this was another case where there were these battles in the research community. Someone would say microkernels are terrible, they're really slow, and then somebody else would say, I have this hand-optimized microkernel that I spent two years writing in assembly language, and it's really fast. Of course, right? Anything would get faster if you spent that much time on it and wrote it in assembly language yourself. But again, to some degree the principles behind the microkernel movement are still alive and well; this is what I was talking about at the beginning of the lecture.
So: the emphasis on messaging and well-defined interfaces between components. As you guys start working on assignment three, I would encourage you to heed this advice. Just because you're building a monolithic kernel doesn't mean it has to be disgusting: come up with well-defined interfaces between components. And again, all this really efficient message passing ends up being useful in other contexts. Some of the ideas of microkernels are still influencing the research community as well. Recently there was a really neat project out of Australia where they were actually able to formally verify the correctness of a microkernel operating system. What this means, with like 80 caveats, is that largely they were able to prove the correctness and safety of a kernel. This was actually pretty big news when it happened, and it took a team of 20 or 30 people several years; it was a really huge project. But what did they formally verify? They didn't formally verify Windows, or Linux, or Mac OS. They formally verified a microkernel operating system called seL4. Why? Because it's small: it's probably the smallest kernel you can get your hands around. It still took them years, probably 100 man-years of time, but the only way they were able to get it done at all was to use a microkernel. And again, a lot of this built on the efficient messaging and IPC work we talked about. All right, I'll stop here. We won't have time to talk about hybrid kernels, but again, there were a lot of contributions from the microkernel movement, even if microkernels themselves didn't live on. And on Friday, we will start talking about performance.