All right, good morning, everybody. I hope you enjoyed the long weekend. Happy Easter, belatedly, or Passover, or whatever particular holiday you celebrate this time of year. Maybe you're just happy being done with Assignment 2.
So today, we're done with file systems, and we're going to have a bit of a grab-bag week. We're going to do a lecture today on OS structure, and then on Friday we're going to do a lecture on operating system performance. My plan next week is to start talking about virtualization, which will be the last major topic that we cover in the class.
We're still working on the grading. It's incredible that I am still saying that, and it's kind of embarrassing coming in every day and still not having Assignment 0 graded, but we're so close. I know I've said that like 50 times, so I feel like the trust level in this room is low, and that's okay. Clearly we will have grades done before May 9th or whatever it is. At some point they'll start coming after me with pitchforks, but I'm trying to prepare for that day. What is going on? If they start to dismantle the building while we're having class, then I guess we'll stop early.
All right, and then Assignment 3 should come out today. Assignment 3, to me, is the culmination of the programming work for this class. Traditionally, we've given students about three weeks to do it. I want to give you guys as much time as possible, so I'm going to try to figure out the latest possible date that we can let you continue to work on Assignment 3. Clearly that's not going to be July or something, although you might want to keep working until July. But we'll figure that out, and I'm going to try to give you as much time as I can. At this point I'm really more interested in you guys doing the assignment and having fun with it. We also have an Assignment 2 solution, which I'm guessing a bunch of you will want to use, that we're banging out a little bit. So stuff's coming along with the course assignments.
No review today. Does anyone have questions about file systems that they would like answered? Questions about operating systems in general? Meaning-of-life sorts of issues? Anything pertinent to discussion on a Wednesday morning? We're here to talk about cool computer systems.
All right. A lot of times, operating system classes start off talking about structure. I think I'm more clever than other people who have taught this class, because I waited until now to talk about operating system construction, because you guys have now seen some of the pieces of the operating system. You've seen the process and thread support. What else have we talked about? We've done a couple of major units. This is kind of a mental warm-up this morning, so I'll start in the back. We did scheduling and process management. What other big pieces of the OS have we discussed? Virtual memory. All right, yeah, we talked about memory management. That's this big hairy thing. John, what else? What were we talking about last week? Yes. I could see you paging it back in. Page faults, address translation... what were we talking about last week? File systems, right.
And there are actually a lot of other things that operating systems are responsible for that we haven't really talked about. Most of the time I think these things get excluded from these sorts of classes because maybe they're a little less interesting, but they're still important functions. What other things does your operating system do that we haven't really discussed? Alex, what's one thing you would probably be very sad about if your computer didn't do it? Boot. Okay, yeah.
I like it. He just took it straight from the top. Power on, right; we haven't talked about batteries. Boot, yeah, that's actually a great point. There's this whole boot process, there are boot loaders, and that's all kind of gross and disgusting, so we didn't talk about it. But yes, it would be sad if your computer didn't boot.
What else would it be sad if your computer didn't do that we haven't talked about? Devices, right? Yeah, you'd be sad if you got your brand new wireless pointer thing, or even better, your new special controller for playing Angry Birds, and you plugged it in and it didn't work.
What else, though? You guys are missing something big. I don't know what you guys do with your computers, but I think this one is big. What other major part of the computer have we just not even touched on? What's that? We've talked about multitasking. Ben? Networking, right. If your computer wouldn't connect to the internet, or didn't know about the internet, or didn't think it was cool to talk to other computers, then you would probably be sad. I would be sad. I don't know how to use a computer that's not hooked up to the internet anymore.
So we haven't talked about network protocol stacks. Sometimes that's covered in other classes, but typically these are implemented inside the operating system. Remember we talked about 4.3BSD, the greatest software ever released? One of the big things in that was the TCP/IP protocol stack, which has kind of spread its way into all sorts of other things.
Yeah, network protocol stacks. We don't talk about them, but network protocol stacks are really hard to design, as you would imagine. Actually, a lot of the same design considerations we're going to talk about today influence the design of protocol stacks.
What about this sort of stuff? Maybe this is a little less pertinent to your day-to-day life, but big multitasking, multi-user operating systems spend a lot of time tracking and trying to control what users are doing. If you log into Timberlake, you hopefully won't be able to make the machine slow for everybody. You might be able to make it slow for you. There are a lot of these things that go on in terms of shutting things down.
I had this happen to me on EC2 the other day. I was running some stuff to change part of the schema for the databases that we're using to store the course information, and it ran so long that it got killed. I felt like that was kind of sad, because it was on a machine where I was the only user. I was scratching my head: it was killed on whose behalf? This is my machine, just let me use it. But how many people have ever run something on a big multi-user system that got killed because it ran out of resources or ran too long or whatever?
You guys remember SETI@home? How many people remember SETI@home? How many people ever ran SETI@home? SETI@home was a system where they would distribute tasks and you could process information from outer space, looking for aliens. For a while this was a really fun example of a pretty interesting sort of peer-to-peer system. I guess maybe they'd call that crowdsourcing now.
So I had some friends at college who were competing over how many packets they could process. SETI@home used to have a bulletin board where they would post the person who processed the most packets last week. It's kind of stupid, right? "Yeah, all my machines are slow because I'm competing to process the most SETI@home packets." But people got into this, because people will compete over anything if there's a competition being held.
Some of them started to run their SETI@home jobs on the campus's central mail server. You can imagine these are really intense, CPU-intensive jobs, and they kept getting killed off. So a friend of mine came up with a clever way of getting around this. How many people have used Pine, the email client? There's a popular text-based email client called Pine. Again, this is how old I am, I guess: we used to use Pine when I was in college for checking email. The interweb was in its early stages and there weren't a lot of web-based email clients, so you'd log in to use Pine.
So he thought that he would be able to get away with running his SETI@home job, and the way he did it was that he renamed the executable to Pine 2000. This was, you know, 1999. And then he just sat there and let it run. I think it ran for a few days before the system administrators realized that Pine 2000 was in fact not some sort of really CPU-intensive email client, and was in fact just some sort of CPU hog that he was running.
So anyway, I think maybe the funniness of that story is lost on you guys, because you don't know what SETI@home or Pine are. But look, I tried.
All right, and then we talked about device drivers.
So yeah, we would like our system to actually be able to interface with devices. These are all really interesting topics. I'm not trying to pooh-pooh them, but we just haven't ever really talked about them.
So clearly there's a lot going on in the operating system. Say I start to come up with some sort of diagram of how the operating system components relate to each other. I found this diagram in like five places on the internet, and I came up with a slightly more interactive version of it. I've got all these different pieces: this protection module, memory management, some sort of accounting subsystem, maybe a command interpreter that I'm running. I start to try to figure out the dependencies that these components have on each other. Then I add in some other things that I might need, like information services such as proc, the file systems that I'm running, and the I/O subsystem to communicate with I/O devices. I try to understand how this is going to work, and it just gets grosser and grosser.
So this is the canonical mess that you get into with operating system design: it's really difficult to impose any sort of structure on this. One of the earliest multiprogramming systems was called THE, which was actually designed by Edsger Dijkstra, who is famous for a bunch of other reasons. That was back when, if you were brilliant like Dijkstra, you could contribute in systems and algorithms and analysis and other areas. Now you have to focus a little more. Anyway, he had this model of the layered operating system, which I don't have on here. And really nobody adopted that model, partly because it doesn't work.
It'd be great if your operating system had layers, but it doesn't.
I know I showed this video at the very beginning of class, but I wanted to show it again. I don't know if you guys were here on the first day, or if you missed the slide. The slide is really the most impressive thing. So did you guys catch that? Wait. Oh, I've got to play it over here. Here we go.
All right, so there we go. This is kind of what we're getting into. I think I told you this: when I saw the movie, I thought they had to have made that up. There's no way there was ever a slide this complicated that Matt put up in CS 161. Then I went back to look through the slide decks, and sure enough, that very slide is in there. I think they must have actually cut and pasted the slide itself. This is actually a virtual memory management slide, but you get the idea: this stuff can get really ugly.
Okay. Essentially what we're talking about, and what we have been talking about throughout the semester, is a kernel structural model that we call the monolithic kernel. What's a monolith? It makes me think of 2001: A Space Odyssey for some reason. This big thing; it's just one big thing. So what's our model of the world? We have user programs, and we have this system interface right here. We have devices, and we have these low-level interfaces for communicating with devices; maybe we read and write to their ports directly through memory. And then in the middle we have everything else. This is the kernel, and everything is in here. Everything can potentially cross this interface, and everything runs with essentially kernel privileges.
So this is the model that we've used throughout this semester, and this is really how a lot of operating systems are designed. Probably most of the major operating systems you guys are familiar with use this design.
So again, all OS code is loaded into the same shared address space. The kernel has an address space, and the kernel address space usually has special permissions and can access all of memory directly, or in some way, which is required so that it can multiplex memory. All the code that I load into the kernel runs with the same set of protections and the same access to all of the memory on the system.
All the OS code is potentially privileged, so it has access to the privileged CPU instructions that we talked about before. In contrast, no user code is privileged. So this is the privilege boundary here, and it's black and white: OS code is totally privileged and user code is totally unprivileged when it comes to these privileged instructions.
And what happens here is that all the OS code is structured as one program. As you guys have started to do some development yourselves, how would you characterize that big program? Anybody? Yeah, it's kind of gross. It's all just sitting in there, and unless you impose some structure on it, this has the potential to get really gnarly and disgusting. It's just C code: you can do whatever you want, and there's no need for you to actually obey interface conventions. You can do all sorts of things. You guys have been finding this out a little bit.
So does this seem like a good idea to anybody?
I mean, this kind of seems like a terrible idea. We talked throughout the class about one of the reasons to study operating systems, which is that these are really mature, elegant, well-designed systems, because people have been working on them for decades and decades. So why have people stuck with this monolithic model of how to structure the kernel components themselves? What do people think?
Given the fact that this would make you nervous. Let's say you went to work at a big software company and they said, "We are going to design," let's see, what am I going to call it, "NewOS," or Lindows, or whatever; pick your name for a new catchy operating system. And you said, "Okay, great, so what's our plan?" And they said, "We don't have a plan. Let's just start writing code."
So why would people do this? What are some of the things that seem appealing about it? Give me one. So you're just saying, "I'm just going to get right down to work." That sounds good, right? You can just get started. I can just sit down and start hacking. I don't need to plan. It's like the way you guys approach your assignments for this class, or maybe the way that you wanted to approach them until you started to learn otherwise: just sit down with the assignment webpage and start trying to write code. Monolithic kernels make it easy to do this. What's that? You don't agree with this anymore? Okay, good.
So what else?
It's actually really fast. This is probably one of the most important reasons for the enduring success of this potentially terrible model of OS structure.
Think about the interaction. Let's say you write your file system code and then you write your virtual memory management code. As we talked about, there are some dependencies there, and this is one of the things that makes it very difficult to structure an operating system as a stack. For example, the virtual memory management subsystem might want to use files for swapping, but the file system wants to use memory management for its buffer cache. So I challenge you to find the right way to stack those two modules. They don't stack; they each depend on the other.
If you write them as one big blob of code, your file system just makes function calls through the virtual memory interfaces. It can use whatever interface it wants. It can use internal interfaces. It can create its own functions to deal with memory more efficiently, whatever. So, as opposed to some of the things we're about to talk about, the overhead of transitions between modules is tiny. They're not even really transitions; it's just making another call up or down the call stack, and in C that's trivial.
So these are some of the nice reasons. But there are all sorts of problems with this approach, and there's been a lot of really interesting operating system research, and I should say research and development, trying to address some of these problems. So let's talk about them individually first, and then, if I can stay on my feet, I'll present some of the approaches to dealing with them.
Okay, so first question: how is the monolithic kernel not like the Boy Scout motto? Who remembers the Boy Scout motto?
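As an aside on the stacking challenge above, here is a minimal sketch of why the two modules don't layer. It assumes a hypothetical dependency graph (the names `virtual_memory`, `file_system`, and `disk_driver` are illustrative, not taken from any real kernel) and uses Python's standard `graphlib` to look for a bottom-up ordering. A cycle means there is no valid stack.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical dependency graph: each subsystem maps to what it relies on.
DEPENDS_ON = {
    "virtual_memory": {"file_system"},   # VM swaps pages out to files
    "file_system": {"virtual_memory"},   # FS uses memory for its buffer cache
}


def layer_order(depends_on):
    """Return a bottom-up layering, or None if the modules cannot be stacked."""
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError:
        return None


print(layer_order(DEPENDS_ON))                         # the circular pair
print(layer_order({"file_system": {"disk_driver"}}))   # a pair that does stack
```

The circular pair yields no ordering at all, while a one-way dependency yields the lower layer first. A monolithic kernel sidesteps the question by letting both subsystems call each other directly.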
Does anyone know what the Boy Scout motto is, other than me? Yeah: be prepared. The Boy Scouts had a motto and a slogan, and this is actually the motto. I used to get them confused. The slogan is "do a good turn daily," but the motto is "be prepared." Both nice things.
Okay, so how is this rigid monolithic kernel not like the Boy Scout motto? Going back to some of the things that you expect your kernel to be able to do, what's a way to challenge this monolithic kernel and expose one of the flaws in this particular design? Okay, no, we're going to get to that: be prepared for all sorts of weird stuff to happen. That's coming next. But what other events might a kernel be forced to deal with that could strain this "put everything into one blob and that's it" sort of model? You can't add to it. When would I want to add to it?
Let's say I plug in a new device. Let's say you go out to Best Buy, and this is probably a terrible example because Best Buy probably didn't exist, but it's the mid-1980s and you've come home with a brand new computer mouse. You don't have a GUI yet, but you just figured the mouse would be fun to have. So you've got this mouse and you're going to plug it into your computer, and the monolithic kernel is like, "Nah, not so much. I don't know what that is. I don't know how to talk to it. If you want to use that thing, you've got to sit down with the source code, make some changes to the kernel configuration, recompile me, and reinstall your machine. Then you can use your new device."
So this is kind of terrible. One challenge we have here is just: how do we maintain some degree of flexibility?
Now, what's the alternative approach? Even in the Boy Scouts, there's a balance between preparedness and over-preparedness. So what would the over-prepared monolithic kernel do? Say I'm sitting down to compile my kernel and I want to be ready for anything. What would I do? What's that? No, you're too modern. I'm a monolithic kernel; I have no idea of pluggable modules. We'll get to that. But I want to be ready for anything, so what do I do?
Right: I've got to include every possible device driver for every possible device that exists at the time of compilation. When I sit down to compile my code, I'm just going to compile everything into the kernel.
Now, that sounds like a great idea. What's the problem with that? What happens to your kernel as you start to include more and more code? It may get slow, but it's probably going to get what? Huge. Remember, the bootloader loads the kernel into memory. Monolithic kernels have some ability to page out pieces of their code that they're not using, but you still end up with this massive kernel executable. It would be kind of fun to configure Linux to build every possible device driver into the kernel itself. I wonder what the size of the resulting image is; probably massive. Most kernels don't do this for this very reason. So this is the trade-off here.
I can include everything, but then I'm like the proverbial Boy Scout who can barely move because he's so weighed down with extra backpacks and 16 extra shoes in case the first 15 don't work. You can imagine trying to hike with that person.
All right. I don't know if you guys can read this cake. It might be better if you can't. But if you can read the cake, how is the monolithic kernel like this cake? Can everybody read the cake? Okay.
So how is the monolithic kernel like this particular cake? What ruined this cake? This is a nice cake. It's got really nice designs up here, it's got this nice red around it, and it has a nice message. At least it wanted to have a nice message. So what ruined this cake? What's that? Okay, poor handwriting, but what else? Let's say this isn't just a handwriting problem; let's say it's a deeper issue. What's that? Oh, no, that's not the problem with the cake.
If you read the cake carefully, it's supposed to say "Best of luck, we will miss you," and there's been a typographical error here that has rendered the cake obscene. But what's the point here? There are like 20 letters in this sentence. What happened? You make one little mistake, and it can ruin the whole cake. Okay, the cake probably tasted really good, and it would probably be funny to eat it just because of what it says.
So what if your kernel has just one little problem? How many of you found a small problem in your Assignment 2 code that was having a large impact on the overall functionality? I think everybody's hand should be up. Yeah, small; just one little thing.
It's like that stupid asterisk: there should have been an asterisk there, or there should have been one additional ampersand that needed to be there. And those are little mistakes. But one little problem can really bring kernels to their knees, especially monolithic kernels, because, as you pointed out, everything's running in this privileged address space. So if anything does something bad, the whole system is just going to collapse.
Okay. You guys have started to see this message, maybe on a regular basis, so I think that's what it says. Is that what it says? I didn't actually initiate a panic to check, but I should know what it says; I've seen this. "I guess I'll just die now." It's a nice message. Okay, so we've covered this a little bit.
Clearly you guys cannot see what this is. There was a really interesting article a couple of years ago about how much time the military spends writing PowerPoint slides. Apparently there are upper-level officers in the military whose almost entire job is to write PowerPoint slides for the presentations that they give, and this is something that was included in one of these slides. Let me see if I can even read this if I get close enough. This is something that was supposed to be describing the relationships between different tribes or warring parties in Afghanistan. Someone spent a long time on it, and I'm sure there's actually a lot of information content in it, but in a PowerPoint talk it's probably difficult to see it clearly enough.
So again, what is the point?
You start off writing your kernel, and on day one it's so simple and you're just doing a few things. And then what's happened by day, you know, 150? Okay, let's say you start off writing Assignment 2 and everything seems really clear. What's happened by day 10? "I guess I'll die now," right? You see that a lot. But in the meantime, stuff gets ugly. Stuff gets nasty. You do something and you think, "Well, it's okay to do it this one way," and pretty soon you've got this ugly, disgusting mess. This can happen way too easily.
And once you've got this ugly, disgusting mess, what happens if you, or God forbid somebody else, someone who isn't familiar with your own private universe, needs to come in and make a change? To a major degree, this is why we don't use Linux for this class. It's just too complicated. I'm not saying that the Linux designers punted on this. They've done a lot to think about the structure of the code and to try to organize it in a way that makes sense. But at the same time, this kind of naturally happens eventually, even if you do your best to resist it, and the best you can do is to try to keep things as cleanly documented and as uncomplex as possible. We're talking about really complex systems, especially when we start to talk about modern operating systems.
I remember when I was working at Microsoft, I spent one really frustrating morning poring over this piece of code, trying to figure out what it did.
It was this piece in the memory management subsystem, and it was really complicated. Then I decided to let the preprocessor do some work for me. I compiled the kernel for my target platform and looked at the preprocessor output, and it turned out that none of the code I had been looking at was being included. It was all being removed by preprocessor macros. It was for something like, "if you're running on a machine with three gigabytes of memory and it's divided into pieces that are a prime number in size," et cetera. It was this really specialized corner case, but it was a huge blob of code that was really hard to understand, just to deal with that. So operating systems have had this happen.
All right. Today I just want to talk a little bit about some solutions to these problems. This is not to say that there aren't other problems with the way that we design kernels, or a lot of other interesting aspects of kernel design, but I've chosen to focus on these three. The systems community over time has tried to deal with some of these structural problems, because these structural problems have real consequences. Rigidity makes it difficult to use the system. Safety problems make the system fail. And complexity, I would argue, sometimes gives rise to those other problems. Complexity and safety in particular are very intertwined: the reason why kernels in many cases are not safe or have bugs is that they're so hard to understand, so when people sit down to try to improve them or add features, they make mistakes.
All right, so rigidity is really a catch-all phrase that could potentially encompass a wide variety of different problems.
I don't want to talk about all of these, but there's been work done on all sorts of different aspects of rigidity. Today we're really going to focus on the first one, which is rigidity in terms of what devices the system can support. When I boot up, there's a certain set of devices attached to the computer, and over time, as the computer runs, that set might change. There might be new devices that get plugged in that the kernel has never seen before, and there might be old devices that it has seen but that are not always attached.
But other aspects of rigidity have also been explored. One thing we talked about a little bit, especially when we talked about scheduling, is that the operating system has this very thin, narrow interface that it tries to preserve and use for interacting with user processes. The narrowness of that interface means that there are a lot of things that processes might want to tell the kernel that they can't: things about how they're using memory, things about how they would like to be scheduled. Kernels do their best to guess those things, given the visibility they have into application behavior, but they don't always do a great job.
And you might think, hey, there are some applications out there that might know things that would be useful for the kernel to know. "I'm never going to use this page again." "I cannot run and do useful work unless these conditions are met, so there's no point in scheduling me and paying the whole context switch overhead and paging in my memory just so that I can check these conditions and go back to sleep." So maybe you should try to help with this.
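As an aside, some systems do expose a small number of hints of this kind. The lecture does not name any, so take the following as an illustration only: a minimal sketch using Python's `mmap.madvise` wrapper (Python 3.8+) to tell the kernel that a page is no longer needed. The helper name `hint_done_with_first_page` is made up for this example, and the code assumes the platform may or may not provide `MADV_DONTNEED`, so it falls back to doing nothing.

```python
import mmap


def hint_done_with_first_page(buf):
    """Advise the kernel that the first page of buf is no longer needed.

    Returns True if the hint was issued, False if this platform or
    Python build does not expose madvise / MADV_DONTNEED.
    """
    advice = getattr(mmap, "MADV_DONTNEED", None)
    if advice is None or not hasattr(buf, "madvise"):
        return False
    try:
        buf.madvise(advice, 0, mmap.PAGESIZE)
    except OSError:
        return False
    return True


buf = mmap.mmap(-1, 4 * mmap.PAGESIZE)   # anonymous mapping, four pages
buf[:5] = b"hello"                        # touch the first page
issued = hint_done_with_first_page(buf)
print("hint issued:", issued)
buf.close()
```

The hint is advisory: the kernel is free to ignore it, which is part of what keeps an interface like this cheap to defend.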
So those are kind of the last two. And again, the real reason we have this narrow, thin interface between the kernel and user programs is safety. The kernel doesn't trust things in user space, and the smaller I can make that interface, the fewer ways there are to enter the kernel. The fewer functions the kernel has to support that processes are allowed to call to request access, the fewer holes in the fence I need to defend. If I made my interface huge, with all these new features and new ways for the application to interact with the kernel, now I've got to support all that extra code. That's difficult, and it creates more of an attack surface, quote-unquote. It's not always an attack; it might just be idiot-proofing things. Processes do dumb things and programmers make mistakes. Kernel programmers never make mistakes, but user programmers make mistakes all the time.
All right. As Ben pointed out before, one of the ways that we work on flexibility with respect to devices is simply by modularizing the kernel, and particularly modularizing support for devices. There are some components of the computer that come and go, and there are other components that don't. For example, this has started to improve, but for a long time Linux and other operating systems really had very little support for hot-swapping disk drives. Their assumption was kind of: there had better be a drive there, and if it's attached when you boot, you'd better not detach it, because I'm just not sure what I'm going to do as the kernel if you take the drive that I've got my swap file on and disconnect it.
The same thing goes for memory. I wonder if this is actually true, hot-swapping memory. How many people reach into their machine and pull out a stick of RAM while it's running because, "You know, I need this for another machine, man. My other machine needs more RAM, so I'm going to pull this out and put it over there"? Not a common use case. Same with the CPU: get in there, scrape the grease off, pop it out, it's a little warm, cool it off, move it over to another machine that needs an extra one. So these things are usually just baked right into the system. But to the degree that things come and go, we want some support for that.
The solution is something that a lot of you guys may or may not be familiar with. How many people have ever done anything that touched the loadable module system on Linux, like inserting modules, removing modules, rebooting things? Okay, a couple of hardcore Linux hackers in the audience.
What happens on Linux is simply that a lot of parts of the kernel are modularized: they're built in a way that lets them be separated from the kernel. When you configure Linux, there are a bajillion (that's not a word, but it's a word people know as being a big number) different options that you can enable or disable: support for different types of hardware, different chipsets, different features, different file systems. Normally one of your choices is to include it in the kernel, which means that it's baked into the kernel image. In some cases that's good. Take ext4: if I'm going to install a bunch of systems with ext4 on them, I might as well have that be part of the kernel image. And sometimes the kernel needs certain modules in order to boot.
But these things can also be compiled as modules. When you configure Linux, and when you use things like Ubuntu, the Ubuntu kernel ships with most things compiled as modules. What happens is that if the kernel detects that it needs that code, it loads the code into the operating system. These modules are stored on disk, and if the kernel needs them, say you insert a device that requires a certain driver to function, the kernel will look for the module and try to load it into the running address space, and the code just executes as normal. So this is a very clever idea.

Modules can also be independently recompiled, so I can actually ship new modules that correct bugs. Obviously there's some coupling between the core operating system image and the modules, so usually there are symbols that the kernel will look at in the module to determine whether it is safe to load that module into the currently running kernel. But as long as you're compiling against the same kernel source, you can fix bugs in modules and reload them into the kernel to fix problems without stopping the machine. So this is a really nice feature, and it's supported in one way or another on really every modern operating system. Linux has loadable kernel modules. The Mac has something called kexts. Maybe I shouldn't say this on camera, but I've built a couple of Hackintoshes, and I've gotten a little familiar with kexts. If you want to find out what it was like to play around with Linux circa 1998, try installing a Hackintosh, right?
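The load-on-demand flow just described (device shows up, kernel finds a module on disk, checks that it is safe to load, then runs it) can be sketched as a toy model. Everything here is hypothetical: `ToyKernel`, `Module`, the version strings, and the device ids are invented for illustration, and the single version comparison only stands in for the richer checks a real kernel does against a module's embedded symbols and version information.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Module:
    name: str
    built_for: str            # kernel version this module was compiled against
    device_id: str            # the device this driver knows how to handle
    init: Callable[[], str]   # entry point run once the module is loaded


@dataclass
class ToyKernel:
    version: str
    on_disk: Dict[str, Module] = field(default_factory=dict)  # available to load
    loaded: Dict[str, Module] = field(default_factory=dict)   # in the running kernel

    def install(self, module: Module) -> None:
        """Put a module 'on disk', replacing any older build for the same device."""
        self.on_disk[module.device_id] = module

    def device_attached(self, device_id: str) -> str:
        """Hot-plug path: find a driver, check compatibility, then load and run it."""
        module = self.on_disk.get(device_id)
        if module is None:
            return f"no driver for {device_id}"
        if module.built_for != self.version:
            return f"refused {module.name}: built for {module.built_for}"
        self.loaded[module.name] = module
        return module.init()


kernel = ToyKernel(version="5.15")
kernel.install(Module("magic_mouse", "5.15", "usb:mouse", lambda: "mouse v1 ready"))
first = kernel.device_attached("usb:mouse")

# Ship a fixed build and reload it without restarting the toy kernel.
kernel.install(Module("magic_mouse", "5.15", "usb:mouse", lambda: "mouse v2 ready"))
second = kernel.device_attached("usb:mouse")

# A module built against a different kernel is rejected instead of loaded.
kernel.install(Module("old_cam", "4.4", "usb:cam", lambda: "cam ready"))
third = kernel.device_attached("usb:cam")
```

The point of the sketch is the ordering: the compatibility check happens before any module code runs, because once the code is inside the kernel there is nothing left to protect the kernel from it.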
It's really the same user experience as it used to be, where you spend three or four days trying to get it to work, and then you make one change and you have to start all over. I've gotten better at it over time.

All right, so let's go to the second problem: safety. People hate when their operating system crashes. Again, I can't remember who said this, maybe it was Margo, but the worst bugs are the ones that give your users not only the time but the inclination to call you up and complain, or hunt you down via email or Twitter or Facebook or whatever. Crashes that stop the system and cause it to reboot, where maybe you need to repair the file system at that point, those are terrible. Those give people time to call Microsoft on the phone.

Kernels don't like to crash either. This is a major goal of operating system development. So what's the interesting factoid that's missing from this equation? What's the problem here from the kernel's perspective, if you're from Microsoft or Apple? You get all these calls, people saying "my computer just crashed," et cetera. What's the frustrating thing?

Well, okay, you might have some clue. The person may have carefully copied down all of those hex codes on the blue screen of death. When Linux crashes, this is actually what the kernel developers tell you to do: get out a pen and paper and carefully copy it down. People do this, because it's really useful information. Maybe now you take a screenshot of it, and I hope that's what people do. But if your Linux machine was your only piece of technology and it died and generated this error dump, you would carefully write it down, because that's the machine's last will and testament, right?
The fact is that kernel code doesn't cause a lot of operating system crashes. Core kernel code does not cause very many crashes. Most crashes aren't caused by faults in the virtual memory management system. They're not caused by null pointer exceptions within the kernel's system call paths, for example. So what causes them? People see this all the time. The people cause them? Yes: "I pushed the blue screen of death button." Stop pushing the blue screen of death button on your computer. We told you not to do that.

Yeah, so you had the answer: device drivers. I read somewhere that Microsoft estimated something like 90% of crashes in XP are caused by device drivers. And look, I'm not here to defend Microsoft. I think it's a company full of a lot of really nice people. But you can imagine that they find this really frustrating, because they work really hard on the core operating system components. They've got a big group of people. They've got some really talented developers. They test their stuff really rigorously. They write it and they rewrite it. One of the reasons that Vista was delayed was that they did this massive security audit that went on for several years, where almost every line of code, as far as I was told, was read and reread by several people who didn't write it, to look for buffer overflow problems and things like this. And then this yahoo that works for, I don't know, Logitech or something writes this buggy, crappy device driver. But who do you think the users call when it fails? Windows crashed, right? They don't know it's the device driver, right?
So Microsoft fields all of these calls from angry users and gets all this vitriol, because they provided a platform that allows other people (who maybe applied for a job at Microsoft and didn't get one because they couldn't write really good code) to essentially create problems for Microsoft. A lot of these issues are caused by other people, but people don't know that. And it's not the users' fault. They bought Windows and Windows crashed. They don't associate the fact that Windows crashed with, "Oh yeah, I bought that really dodgy thirty-dollar extension the other day and plugged it in, and I did download a device driver from this Russian website, and I clicked OK to eight different dialogs when I installed it that said 'Are you really sure you want to do this?', including the one that said 'This may crash your computer every time you use it.' But I clicked OK." Now it's Microsoft's problem. Anyway.

One of the things I think is interesting, and I heard this somewhere and couldn't find a reliable source for it, but it seems very plausible, is that a lot of device drivers are developed in a way that is unfortunately kind of standard for the programming community. Again, the slide kind of gives it away. Logitech comes out with Magic Wheel Mouse version 4.0, and you're the programmer who's hired to write the device driver for Magic Mouse version 4.0. So what do you do? You take the 3.0 code and you hack it until it works. Okay, but what's the interesting thing that this causes? Well, I wouldn't say you're giving birth to new code. No, but you are creating this ancestral relationship, right?
You took the version 3.0 code, and if you were using something like Git, you created a branch, or you just created your own new repository, and you edited some stuff and things like that. But what does this mean? I'll just read the slide, that's usually the simplest thing: what happens when there are bugs upstream? The 3.0 guy had some sort of really interesting little race condition, or he was using an interface improperly and making some assumptions about side effects. What happens to your code? It probably has the same problem.

I read somewhere that somebody had done this thing where they pinpointed problems in device drivers, and you could see where they originated, and then you could see every piece of code that copied from them, because they all had the same problem. So now it's like, Magic Mouse 2.0 is having some stability issues. Well, so are 3.0, 4.0, 5.0, Magic Mouse X or whatever. I don't know why we start to use letters at some point, maybe just because we got bored with numbers.

And this is something I've strongly suggested to you when writing your own code for the class. What's one potential motto here? We can have our own CSE 421 motto to compete with the Boy Scouts. What's one motto based on this experience? As a programmer, if you really want your stuff to work, and you want to know that it works, what should you do? Test it. That's a good idea. But the problem here is that a lot of times these are bugs that wouldn't be uncovered by testing, because they aren't necessarily bugs yet. A lot of this comes, I think, from bad interface design: relying on side effects that aren't part of the interface specification. My motto here is: don't use code you don't understand, right?
Certainly don't paste large blobs of code into your system without at least going through and asking, what is this doing? Okay, I'm going to speed up a little bit because we're kind of running out of time.

There's been a lot of work in this area. It's not a super sexy area, unfortunately, but these are real problems, and there's a lot of interesting stuff going on. Windows in particular has been really working with the driver community to try to improve driver quality, and on a bunch of different fronts. They have user-mode driver tools now. You can write drivers that run in user space, meaning that unlike other drivers, they don't get loaded into that big blob of monolithic kernel code, where a crash can cause the whole system to fail. And then there's also been a lot of really interesting work on static, programming-language-based tools for checking and verifying device drivers.

Ignore the virtualization bit for a moment. What would be the real magic pill here that would solve this whole problem? It would put thousands of people out of work, but what would be the magic pill that would solve this problem? Well, maybe a standard interface, but what do I really want? I come up with a new device, and let's say the hardware guys (and this is not always true) do a fantastic job of documenting the hardware features. What do I really want? Plug and play, but what does that mean for a device driver? Yeah, why not just generate the driver automatically from the hardware? I feed in the hardware spec, and I have some piece of code that just generates the device driver that does the right thing, right?
It understands whatever driver interface it's trying to target, whether it's Windows or Linux or whatever, it understands the hardware spec, and it just fills in the rest. Then if the hardware spec changes, I generate a new driver. And then where are the bugs going to be? In the code that generates the device driver. But those might be easier to fix, because that's a broader surface if more people use the tool. It's like compiler bugs: compiler bugs get discovered because many people use compilers and see the output. So if you could push a tool like this and get a lot of people to use it, you could potentially end up with a small bit of code generation logic that is, in effect, really highly tested.

Most approaches, again, use the same key technique, which is that they assume the drivers are going to make mistakes and they try to make sure those mistakes don't kill the entire kernel. And the real tension here is that drivers need direct access to hardware. So that's the struggle. All right, let me skip this slide.

Now let's step back for a minute and look at the complexity thing. Let's say you were a normal, good software developer, somebody who has taken this class and all the other great classes we have here in the department and has learned about the right way to do things. How do you structure a large, complex software project? Design it well, right. So yeah, we do a lot of design up front. But what's the usual thing you start to do when you write a big, complex piece of code? Break it down into smaller pieces. Smaller pieces are easier to test and easier to think about. You create modules, and each module supports a well-defined interface. You document the interfaces carefully. Callers know what behavior to expect, and they know what behavior not to expect, right?
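The key technique mentioned a moment ago, assuming the driver will make mistakes and making sure those mistakes don't take everything else down, can be sketched by running the "driver" in its own process. This is only an analogy under stated assumptions: `run_driver`, `GOOD_DRIVER`, and `BUGGY_DRIVER` are hypothetical names, the host program stands in for the kernel, and a child Python process stands in for a user-space driver. Real user-mode driver frameworks are far more involved.

```python
import subprocess
import sys

# Two stand-in "drivers", given as source text so each runs in its own process.
GOOD_DRIVER = "print('mouse moved 3,4')"
BUGGY_DRIVER = "import os; os.abort()"  # dies abruptly, like a driver hitting a fatal bug


def run_driver(source: str, timeout: float = 10.0) -> str:
    """Run driver code in a separate process so its crash cannot take the host down."""
    proc = subprocess.run(
        [sys.executable, "-c", source],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    if proc.returncode != 0:
        # The failure is contained: the host only observes a nonzero exit status.
        return f"driver failed (status {proc.returncode}); host still running"
    return proc.stdout.strip()


good_result = run_driver(GOOD_DRIVER)
bad_result = run_driver(BUGGY_DRIVER)
```

The same crash inside the host process would have ended the host. Here it is just a return value the host can log, restart, or ignore. The cost is the one named above: the driver no longer has direct access to whatever the host controls and has to ask for it across the process boundary.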
So yeah, this is the standard way of designing software, and it applies to operating systems too. In the case of operating system kernels, we also might want to minimize the code that can cause a fault. One way of thinking about it is that with monolithic kernels, anything in that big blob of code can potentially crash the machine. What I might want to do is refactor that code base so that I have a small amount of code that can crash the system if it's wrong, and then a lot of other code that can crash safely. Maybe this is obvious, but why do I want to minimize the amount of code that can cause a fault? What does that allow me to do? Well, it helps you find faults. But it also allows me to test the crap out of that code, to really make sure that that small trusted code base is secure.

So in the 80s and 90s there was a great deal of interest in what were called microkernel architectures. Microkernel architectures are kind of like, let's apply standard good software development practices to kernels. Here's my model of the kernel. Again, this is the everything box that I had up before. I've got hardware down here. Maybe I don't like the layering here, but I didn't create this picture. Once in a while I find a picture I like on the internet. It doesn't happen often. Here's the application, I make system calls, and then there's all this other stuff that goes on. All of this is in user mode. This is all in kernel mode, meaning that anything that fails in here can fail the entire system. My microkernel-based operating system is different. So again, what have I done?
I have minimized the footprint of the code that can cause a problem, the code that needs to run in privileged mode. Microkernels had standard ideas about how to do this, and one of them was simply the principle of parsimony: if it doesn't have to interact with hardware, it shouldn't be in the kernel. So this blue piece down here is now called the kernel. It includes some very basic IPC, which is necessary to allow all these other things to communicate with each other, maybe little pieces of the virtual memory subsystem that have to interact with hardware, and a little bit of scheduling support. But again, if I can find a way to do something outside of this trusted code base, I do it that way.

And what sits on top of here? One of the things that microkernels wanted to be able to do is offer these different personalities. Let's say I have this really low-level thing. Well, this is no longer a Unix kernel. This is a microkernel. It has a different interface. So what I have on top is a Unix server that uses the low-level microkernel interface and provides the POSIX interface on top. What might this allow me to do?
Right, so now I've decoupled some of my kernel stuff from the way the machine looks to programs. It turns out that Windows NT was based in some ways on some of these ideas. One of the things that happened with microkernels, and I know I'm running out of time, is that they influenced monolithic kernel design more than they succeeded in their own right. Windows NT described itself as a hybrid operating system. Linus Torvalds considered that term to be marketing. But one of the things Windows NT did was try to borrow a lot of the ideas. Windows NT had this thing called personalities, which meant that there was some small core of the system, which it referred to as the executive, and then the Windows interface was implemented in a trusted library that ran on top.

It turns out that NT also, for a while, had a POSIX personality. I don't think it worked out well, and I'm not sure it's supported anymore, but it had limited support for running applications that use the POSIX interface. So you could compile your Linux programs against the standard POSIX interfaces and run them on a Windows NT kernel. That's kind of funky.

All right, let's see what time it is. Okay, so I'm going to finish microkernels on Friday, and we'll also start talking a little bit about performance. Assignment 3 is coming out today. I wish it was Monday, I wish I'd lectured then, but I enjoyed the day off. I will see you guys on Friday, and we will finish our discussion of kernel structure.