It's weird starting without music. Sorry, I just sort of ran out of time today. All right, so today we're going to talk about files. On Monday, Carl introduced you to the wonderful, wild, messy, weird world of disks, these actual physical objects. And so what we're going to talk about for the next couple of weeks is this challenge that the systems community spent a fair amount of time on, which was how to build a reliable system, a file system that has some desirable properties that we'll start to talk about today, on top of this really unreliable substrate, and how to make it perform well. So on some level, I think, as hopefully Carl acknowledged, the world of spinning disks is something that's sort of retreating into the past. But we can still learn a lot from studying how this was done. And there are a lot of principles about how to build these types of systems that apply not only to flash storage, but certainly to cloud services and other types of things. I mean, failure is a reality in a variety of different types of computer systems. And some of our ideas about how to cope with failure came out of early work on file systems, right? I mean, file systems predate the cloud. They predate large distributed systems. And so this was sort of one of our first chances in the systems community to really cope with failure. OK, so Assignment 3.1, how many people are done? I know that there are some, wow. What is wrong with you guys? There's something different about this year. I don't know what it is. Maybe it's you. You can certainly think that if you want to feel good about yourselves. Clearly, you guys are doing better, so there's something. If you have ideas about what's going on, I would be happy to hear them. Are office hours working better? Office hours working well? How many people go to office hours a lot? Do they feel like office hours work well? OK, I think having them downstairs works well.
There are all sorts of issues with the one small office approach to office hours, even if you have your own office. The ninjas are awesome, right? Ninjas are good? Yeah, I don't know. I'm really very pleased. It's too bad this is my last time teaching the class. Maybe next year will be awful. OK, so hopefully, whoever I've cursed to take this over, they're going to have terrible students. OK, so Assignment 3.1 is due on Friday. If you're done, move on. Recitations have already moved on to 3.2. This is another one of those cases where there'll be no extensions for this particular part of the assignment. So get it done, or work on it later and don't get any points. Any questions about assignment stuff? Logistics? I know I've been away for a little while. Our goal is to have the midterm back by, like, we may have it available to distribute at some point next week. The TAs, we're trying to finish grading it by this week. I think that's aggressive. When they do finish it, we take it over to the printing place on campus and have them scan it for us. Once we're done with that, you guys can have the exams back. So at some point, we'll make them available in office hours. You guys can pick them up. They'll also be incorporated as part of your midterm grade. OK, any questions about disks? Everybody remembers perfectly how disks work? If we have some time at the end of today's lecture, I'll also go over some of the midterm questions. So I'll sort of run through this stuff quickly. Parts of the disk, what's a platter? What is the platter? Yeah, the platter is the part that I write stuff on. It contains the magnetic material that encodes the information that's being written down to the disk. What about the spindle? Yeah, it's the thing that I mount the platters on. It's the part in the middle that spins everything around. I really wish I was here on Monday. I love that lecture with all the videos. It's a really cool video. Did you guys watch the Saucid videos?
Is that still in there? The Saw video? Did I get rid of that one? Oh, it's too bad. Anyway, maybe we'll pop that one in later. The head, drive head. Yeah, so this is probably one of the most interesting parts of a magnetic disk. This is where all the action happens. This is the part of the disk that's responsible for both reading magnetic information off of the disk platter and altering that magnetic information during writes. Well, actually, I'm curious. How many people still have a machine with a spinning disk? OK, how many people have a laptop with a spinning disk? Oh, man, get rid of those right away. These are dangerous. I used to have one of those. I'm really glad I don't anymore. Drop it. Just a minute to say goodbye. Yeah, a lot of times, I think Carl probably talked a little bit about the capacity curves. I mean, those are still favoring large magnetic disks. So if you want something big and cheap and you don't need a huge amount of performance, has anyone ever priced out a terabyte flash drive or a two terabyte flash drive? Yeah, those numbers get scary. Big. All right, locations. So what's a track? Once we start talking about file systems, particularly old file system designs, we'll think a lot about this stuff. Yeah. Is the track like a ring around it? Yeah, think about a lane on a track. It's like a lane on a track. If I just let the head stay in one place, the track is all the locations on the disk that the head will see as the platter spins beneath it. Sector. Yeah. Yeah, so it's like a slice of the disk, right? And these are not necessarily, well, there are interesting features of these. Cylinder group or cylinder. That's put together. Yeah, all the tracks. Remember, a disk typically consists of multiple platters arranged vertically around a single spindle. So a track is on one platter. The cylinder is the same track on all the platters going down. Why is the cylinder interesting from the perspective of file system design?
Yeah. Yeah, so typically, well, it's hard to say anything categorical about disk drive design. I suspect there were disk drives where you could move the heads independently. But a lot of disk drives have all the heads for all the platters mounted on the same device, on the same, well, there's one actuator, and the heads for all of the platters are all on that. So essentially, all the heads are in the same location at any one given point in time. And so the cylinder represents all of the information that I can read from the entire disk without moving the heads very far. And why is that important? Yeah. Moving the heads is the really slow part. Moving the heads on a disk requires not only moving a physical object, which is always slow, but then stabilizing that head once it reaches the location over a very, very, very thin track on the disk. I mean, disks are one of those things. It's sort of amazing that they work at all. And this is really a very high precision instrument. If you get the location slightly wrong, you're reading a totally different set of data. So rather than the MP3 file that you wanted, you're playing some other song or something. And disks, and hopefully Carl emphasized this on Monday, I mean, disks are really different than everything else we've talked about. Disks are physical devices, so they move. They have this physical component to them that creates a great deal of latency. On some level, all of the delays in other parts of the system are really a function of the distance between data and where the data is being processed and the speed of light. Or the speed of light through copper. How fast I can get things across the chip. So if I've got registers on my CPU and I want to do some computation on memory, the latency involved, which is actually sizable from the perspective of the CPU, is really the time it takes to communicate with the memory, get the data from the memory, and move it to the CPU and into the registers in which it's going to be processed.
That's just electrons moving around. This is actual physical things moving. So this is slow on a whole different order of magnitude. And remember, from the perspective of the CPU, memory reads are slow. CPUs play a lot of games to try to minimize the latency of even a read to memory. So a read to disk is like a whole, it's like Pluto. The last thing is that disks have, so one thing that you may notice as we start talking about file systems is there are a lot of different file system designs. How many people could name at least three different file systems that they've heard of that they might use on one of their computers? Yeah, if you run Windows, is Windows still using NTFS or have they moved on to something else? Okay, so Windows uses NTFS, Macs have their own thing, HFS, I think. I don't remember what it's called. Linux has ext4. You might have heard of things like ReiserFS. I'm not sure ReiserFS works anymore. But there are actually still new file systems out there with new features that people are developing. And that's a little bit different than the other things we've talked about. So the file system is on some level a distinct piece of system software that is tightly integrated with the operating system, but also separate. So the same operating system can run multiple file systems at the same time. You don't really do this with schedulers or with page replacement algorithms or other things on the system. Those are a little bit different. So file systems have always sort of been integrated into the operating system in a slightly different way. Now obviously if you run Windows, you have sort of a tight coupling between Windows and NTFS, but Windows can mount and read data from other types of drives, right? FAT32, CD-ROM file systems, things like that. All right. So here's how we actually move things around on a disk.
I actually have to issue a command, so the operating system has to send some sort of electronic signal to the disk, using some sort of protocol to tell the disk what to do. The drive actually has to position the heads appropriately. So this is important. I have to wait for the heads to actually stop vibrating and moving around so that they're properly stabilized over that very narrow track. Then I wait until the platter rotates to the correct location, and then finally pull data off the platter and transfer it back to the system, usually to some sort of memory buffer. And we've already talked about the slow part here, which is the seek time. That really dominates everything else. Seek and settle. So, and this has sort of become a little more interesting, right? So for a long time, disks were becoming bigger, but not faster. And so people are being encouraged, you know, like you may have read about people that wear these, does anyone do this? Wear these video recording devices around all the time so that they can collect a complete trace of their entire life? Sounds, I don't know, I don't know about you guys, maybe your life's more interesting than mine, but there's a lot of boring stuff that that would be capturing, right, if you were me. But yeah, so, you know, people are generating more data or we're expecting access to more data. Disks have gotten quite a bit bigger. That curve has slowed down a little bit, right? I mean, to give you some context, when I was in college, I had a 20 gigabyte hard drive and I thought that was pretty huge. That was my big hard drive. My small, I'm serious, my small hard drive was four gigabytes. That's the one that came with the computer. And that's like, I mean, how many people have a two terabyte drive or something like that, right? Yeah, so that's a couple of orders of magnitude. You know, so you have more space than I did. You should feel lucky about that. All right, questions about disks? Let me talk about files. Yeah.
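The steps just described (issue a command, seek, settle, wait for rotation, transfer) can be added up in a rough back-of-the-envelope model. This is only a sketch: the class name `DiskModel` and every default number in it (7200 RPM, 8 ms average seek, and so on) are illustrative assumptions, not figures from the lecture or from any particular drive.

```python
from dataclasses import dataclass


@dataclass
class DiskModel:
    """Toy latency model of a spinning disk. All defaults are assumed values."""
    rpm: float = 7200.0                # assumed rotation speed
    avg_seek_ms: float = 8.0           # assumed average head movement time
    settle_ms: float = 0.5             # assumed time for the head to stabilize
    transfer_mb_per_s: float = 150.0   # assumed sustained transfer rate
    command_overhead_ms: float = 0.1   # assumed cost of issuing the command

    def rotation_ms(self) -> float:
        # Time for one full revolution of the platter.
        return 60_000.0 / self.rpm

    def avg_rotational_delay_ms(self) -> float:
        # On average the target sector is half a revolution away.
        return self.rotation_ms() / 2.0

    def transfer_ms(self, nbytes: int) -> float:
        # Time to pull nbytes off the platter once the head is in position.
        return nbytes / (self.transfer_mb_per_s * 1_000_000.0) * 1000.0

    def access_ms(self, nbytes: int, needs_seek: bool = True) -> float:
        # Sum of the steps: command, seek and settle (if the head must move),
        # rotational delay, then the transfer itself.
        total = self.command_overhead_ms
        if needs_seek:
            total += self.avg_seek_ms + self.settle_ms
        total += self.avg_rotational_delay_ms()
        total += self.transfer_ms(nbytes)
        return total


if __name__ == "__main__":
    disk = DiskModel()
    with_seek = disk.access_ms(4096, needs_seek=True)
    same_cylinder = disk.access_ms(4096, needs_seek=False)
    print(f"4 KB read with a seek:    {with_seek:.2f} ms")
    print(f"4 KB read without a seek: {same_cylinder:.2f} ms")
```

Under these assumed numbers, seek and settle account for most of the cost of a small read, which is the point about why data on the same cylinder is cheap to reach.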
So Carl mentioned third space. Yeah, maybe. As a possibility. We'll talk a little bit about that when we come back to hierarchical files. That's a good point. I mean, well, let me ask an easier question. How many people have a device that has a hierarchical file system on it? I mean, everybody's hands should be in the air. You all do, you know. How many people use that? How many people organize things? How many people create folders with subfolders? Yeah, you guys are weird though. Computer scientists do this because we've encouraged you. Because you guys use the command line and stuff like that. So you're doing things a little differently. How many people's parents do this, right? Yeah, all in downloads? Or like all in My Documents or whatever. Like who knows where My Documents is on Windows? Like it might not even be rooted with the rest of the file system. It's like, well, but that's all people know, right? Yeah, all in downloads, I like that. I went into my downloads recently, just to delete everything. But yeah, on some people's machines, that would be a problem. Yeah, so, you know, well, let me ask a different question. How many people use one of these search things on a regular basis? To find files, to locate things. I ask this every year. And I never know, does Windows still have the dog? No? It's too bad. The dog was so helpful. It was like, oh, it's really cute actually. That was one of Windows' best. You think of all the things they've created, the Clippy stuff and all that stuff. I mean, the dog was really cute. Maybe I just like dogs, right? I think they had a cat, too. It was like, no, no thanks. I'm not helping you. It just slunk off, hid in the corner. Yeah, so people are using, I don't use this stuff, but I'm, you know, I'm from a previous generation. So I turn Spotlight off. I just don't really care. It's just not how I look for things, but I know people do that, right?
And how many people are nervous about storing anything on their computer today? Like, how many people just don't feel good about having stuff stored on their computer, just on your computer? I mean, it might be there locally, but how many people want stuff in Google Drive or Box or whatever they're forcing you to use now, or Dropbox, or, you know, yeah, exactly. I mean, the biggest shift for me, just to blunder into a little bit of personal history, has been music. So how many people still have like a tree somewhere on their system with all their MP3s in it, right? That's organized very carefully by album and artist and stuff like that. What a waste of time. Just stop, right? I will give you guys $10 and you can try Google Play Music for a month. You'll never go back. I still have that somewhere, but I don't know why. I never listen to that stuff anymore. Yeah, and I spent hours and hours, probably days of my life organizing that stuff, dealing with special characters back in the day. Linux didn't like special characters. You know, these weird things when you have an accent. So don't listen to music by artists with accents in their title, right? That was the solution. Anyway, a total waste of time. But I think that's very different now. A lot of people use Spotify, Google Play Music, that's all search-based. Who knows where those files are or where they come from, but as long as I hit play and it starts playing in my browser, I don't care, right? All right, good question, though. All right, so let's talk about files. So part of the goal of files and file systems is to remove all the problems. And this is our classic attempt at the classic thing that abstractions do. So I've got this low-level disk interface. So here's what the disk understands. Read and write entire sectors, blocks. Sometimes this is bigger now, maybe 1024, maybe 2048 bytes. But that's what the disk lets me do. The disk doesn't know about files.
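The low-level interface just described, where the disk only reads and writes whole blocks, can be sketched in a few lines. The `BlockDevice` class, its method names, and the 512-byte default block size are hypothetical choices for illustration; they are not the interface of any real driver. The helper `write_byte` shows what it costs to change a single byte when the only operations available are whole-block reads and writes.

```python
class BlockDevice:
    """A disk modeled as a flat array of fixed-size blocks (hypothetical interface)."""

    def __init__(self, num_blocks: int, block_size: int = 512) -> None:
        self.num_blocks = num_blocks
        self.block_size = block_size
        self._storage = bytearray(num_blocks * block_size)

    def _check(self, block_no: int) -> None:
        if not 0 <= block_no < self.num_blocks:
            raise IndexError("no such block")

    def read_block(self, block_no: int) -> bytes:
        # The device hands back one entire block, never less.
        self._check(block_no)
        start = block_no * self.block_size
        return bytes(self._storage[start:start + self.block_size])

    def write_block(self, block_no: int, data: bytes) -> None:
        # The device accepts only whole blocks.
        self._check(block_no)
        if len(data) != self.block_size:
            raise ValueError("writes must cover exactly one whole block")
        start = block_no * self.block_size
        self._storage[start:start + self.block_size] = data


def write_byte(dev: BlockDevice, offset: int, value: int) -> None:
    """Change one byte using only whole-block operations (read, modify, write)."""
    block_no, within = divmod(offset, dev.block_size)
    block = bytearray(dev.read_block(block_no))
    block[within] = value
    dev.write_block(block_no, bytes(block))


if __name__ == "__main__":
    dev = BlockDevice(num_blocks=8)
    write_byte(dev, 1030, 0x41)       # byte 1030 lives in block 2
    print(dev.read_block(2)[:8])
```

Nothing in this interface mentions names, sizes, owners, or directories; the blocks are identified only by number.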
The disk doesn't know about names. Well, the disk does know about names. It has this notion of names as numbers. So you can kind of think of the disk as a huge array of blocks. And that's it. So all the stuff that you guys associate with using files, human-readable names, the notion that files can grow and shrink and be deleted and created and have permissions and location, it's all implemented in software. None of that exists on the disk. The data required to implement that has to persist on disk, but the structure is entirely up to the file system. So this is kind of what makes file systems fun. It took me a long time to appreciate file systems. Maybe you guys don't appreciate file systems, but essentially you're given this very simple data structure which is a big linear array. And your job is to use that to implement all of these features that you guys are used to. Start talking about some of those today. And again, and this is partly due to speed, but it's also partly just for historical reasons, more of the interesting parts of the file system are actually implemented in software. So for example, the tricks that we play to try to make file systems fast, to try to make disks fast, the file system tries to hide all of these ugly delays that the disk has to do and do things intelligently. That's all done in software. On the CPU, those memory delays that I've been hinting at, those are hidden from the operating system in these really cool ways, but they're totally invisible to the operating system. Does anyone, do you guys talk about out of order execution and pipelining and stuff like that when you do your hardware stuff? No one's ever heard about that? So cool, out of order execution will blow your mind. It's totally awesome, right? Anyway, but that's what the CPU is doing internally. From the perspective of the operating system, the CPU is executing instructions in the order in which they were presented. 
What's really happening behind the scenes is that the CPU is playing all these cool games to try to make things go as fast as possible and to hide memory related latencies. You should go look this up, it's very cool. And this is kind of what explains the design space. So one of the reasons that there's so many different file systems out there is if you guys want to, you can actually write your own file system and you can run it on your own machine. In fact, there's now pretty well developed support in Linux for user space file systems. So there are tools where you can write your file system and you can run it in a user space that doesn't even have to run with kernel permissions and you can use that to do cool stuff. And to some degree that's why all these things exist. Okay, yeah, anyway. And that's that, okay. Yeah, so Flash. So Flash seems like, I just want you guys to keep this in the back of your mind as we're talking about some of these old disk-based file systems. Flash may seem like it solves a lot of these problems. It does have certain desirable characteristics compared to disks. So for example, there's no moving parts. So you might think Flash never fails. Is that true? Does Flash never fail? Who thinks that Flash never fails? False, right? Flash fails in other weird ways. And Flash has other strange properties. So certain types of Flash drives, if I want to rewrite one bit, I have to first erase this huge chunk that surrounds it and then rewrite the entire chunk with one bit changed. So that's weird. That causes me to have to do some interesting things in my file system design. And Flash actually wears out. I do not know why this is. I have no idea. My mental model of Flash wearing out is of like, I don't know, something that's been scratched too often and it starts to like, the coating starts to come off. So it's a totally ridiculous. Like clearly there's something more subtle going on with the underlying, because this is all electronics, right? 
I have no idea why Flash wears out, but it does wear out. And it wears out as a function of usage. So if you keep writing to the same part of the Flash drive, that's the part of the Flash drive that's going to fail. And that makes things interesting. So Flash drives actually do a lot of interesting things internally to try to spread out the load across the entire Flash chip and make the whole thing last longer. So anyway, there are certainly complications associated with Flash and we might come back and talk about this a little bit later. Okay. So let me briefly, and I'm gonna try not to belabor this. Yeah, definitely, yeah. And it's sort of interesting. So a lot of the file system designs that came up around disk-based systems have been ported to Flash, both where they make sense and where they don't, right? So there are certain features of them. They usually don't, well, I shouldn't say that. Sometimes they actually hurt. Sometimes some of the things that we do on disk don't work on Flash very well. The other interesting thing that happens with Flash is that Flash typically has software that runs on the Flash chip. This is something called the Flash translation layer. That Flash translation layer is doing things internally with the Flash that are somewhat opaque to the file system. So this makes things even more complicated. Sometimes I think some of what those Flash translation layers are doing is actually trying to make the Flash look a little bit more like a spinning disk in certain ways, right? But that can complicate file system design further. But yeah, absolutely. I mean, there's definitely work on new file system designs that are responding to properties of Flash. But right now what we've done is we've sort of taken some of these older file system designs and we just moved them over to Flash and we're sort of hoping that they work well. They work okay, but they're not necessarily taking great advantage of the Flash chip.
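The idea of spreading load across the chip, hidden behind a translation layer, can be illustrated with a toy model. This is a deliberately simplified sketch: the class `ToyFTL` and its policy (send every write to the least-erased free block, then erase the block that held the old copy) are assumptions made for illustration. Real Flash translation layers are far more involved and their internals are generally not visible to the file system.

```python
class ToyFTL:
    """Toy flash translation layer: remaps logical blocks to spread out erases."""

    def __init__(self, num_physical: int) -> None:
        self.erase_counts = [0] * num_physical   # wear per physical block
        self.data = [None] * num_physical        # contents of each physical block
        self.mapping = {}                        # logical block -> physical block
        self.free = set(range(num_physical))     # erased blocks ready for writing

    def write(self, logical: int, payload: bytes) -> None:
        if not self.free:
            raise RuntimeError("no free physical blocks (toy model has no garbage collection)")
        # Wear leveling policy: choose the free block erased the fewest times.
        target = min(self.free, key=lambda p: (self.erase_counts[p], p))
        self.free.remove(target)
        self.data[target] = payload
        old = self.mapping.get(logical)
        self.mapping[logical] = target
        if old is not None:
            # The stale copy cannot be overwritten in place; it is erased
            # (which costs wear) and returned to the free pool.
            self.data[old] = None
            self.erase_counts[old] += 1
            self.free.add(old)

    def read(self, logical: int) -> bytes:
        return self.data[self.mapping[logical]]


if __name__ == "__main__":
    ftl = ToyFTL(num_physical=4)
    for i in range(100):
        ftl.write(0, f"version {i}".encode())   # hammer one logical block
    print(ftl.read(0), ftl.erase_counts)
```

From the file system's point of view it rewrote logical block 0 a hundred times; inside the toy device the resulting erases are shared across all four physical blocks instead of landing on one.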
There are also these new, does anyone have a hybrid drive on their machine? One of these newfangled ones? Yeah, so there's like this even newer thing that I think we had an exam question on a couple years ago where people are trying to combine Flash and a spinning disk in the same package to produce kind of the best of both worlds, right? So remember when we talked about using disk to back memory? We wanted the whole thing, memory plus swap on disk, to look as fast as memory, but to be larger. So that's kind of what they're trying to do with Flash, right? So you can think of these hybrid drives that way. The goal is to make the whole drive look as fast as Flash, but increase capacity more cheaply by using a spinning disk for part of the backing store. That's kind of cool. Okay, any other questions before? This part's sort of boring, so I'm just gonna try to like blow right through it because I think you guys understand this, but I think we just wanna be specific. Okay, so what do I need to do? So the file is one of our primary abstractions here and the features of the file end up dictating a lot of what the file system design requirements are. So what do I need to do to be a file? What are the requirements for a file? Not a trick question. I think you guys know these, thank you. It has to have a name. I have to be able to find it. So there has to be some name that the system understands refers to that file. Okay, what else? Yeah, okay. I mean it, yeah. Well let's go with the simple stuff first, okay? So far it only has a name, right? It's not super interesting, yeah. It has data that it stores? Yeah, it should like store data, right? That's useful. The data that a file stores is treated as if it's contiguous by the users of the file, right? So you guys think of a file as a contiguous set of bytes. It has a start, it has an end, and there's some series of bits in between the start and end.
How those bits are interpreted is one of these things that ends up being a function of what kind of file I think it is. So has anyone tried to open like an MP3 as a text file? Yeah, there are some weird characters in there, right? Like it's not a text file, so it doesn't look like text. You can open it with Vim and like play around and edit it and then see what it sounds like. I don't know what that would do. But yeah, so how we interpret those bytes is something that really happens at higher layers. But it's a set of bytes. What should I be able to do to those? Well, something else. They should persist. So this is, you know, again, this seems kind of obvious, but if you had a file system where you, you know, let's say you're working on your OS/161 assignments, you create a new file, you save the content, you come back two days later, and it's gone, right? That would not be a good file system. Like you would not appreciate that. So I need to persist that content over time. What else? What other things do I do with files? What's that? I read and write. I read and write? Yeah, so there's an interface to the file for accessing its contents. What about writes? What can I change about a file? What's that? Rename it. Right, so there's a name. The name can change, right? I can move the file around so I can change the name. What about the contents? Yeah. It can grow, it can shrink. I can delete it. I can truncate it back to zero. I can grow it up to multiple terabytes on most modern file systems. So again, think about it. You have to build this data structure on top of what's essentially a flat linear array. These are some of the design requirements. Okay, what else? So now we've talked about some of the basic properties. People had started to get into some of the other things. So what are some of the other features of a file that you might think of as more like metadata? Yeah, timestamps associated with use. What else? Ownership. Ownership, yeah.
I mean, if file systems don't get ownership right, the world becomes a very interesting place. I can log into Timberlake and send mail as you and read your mail and finish your assignments for you and stuff like that. Maybe you'd like that part, but anyway, yeah. So ownership is big. This is something that is a core file system feature. What else? Yeah, metadata stored about it. What about, well, type is interesting. Let's come back to that. Yeah, permissions in general. So UNIX has a specific set of permissions. There are other file systems that do permissions a little bit differently. What's another piece of information about a file that the file system maintains? It's kind of like permissions, but not quite. When I think of permissions, I mean ownership. But what other information do I store about the file? Yeah, that's basically the path, right? That's another feature of the path. Yeah. Yeah, that's internal file system data, right? Remember, computers like to name things with numbers. So there is something called an inode number. We'll come back to that, but there's something else. There are the permission bits, right? Can I read this file? Can I write to this file? Can I execute the file? And that's something, maybe if you guys have downloaded something from somewhere online and wanted to run it on your computer, one of the things you have to do is actually adjust the file system permissions so that you can execute it. Otherwise, the system will say, you're trying to execute this file, I don't know how to execute it. And then the last thing here is this idea of sort of how files relate to processes. And this really sort of boils down to things like what files a process has opened and where the file pointer is for those processes. I can see people already starting to doze off. I'm just gonna like try to speed up a little bit here. Okay, we already talked about this. Blah, blah, blah.
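The metadata listed above (size, timestamps, ownership, permission bits, inode number) can be inspected from user space. The sketch below uses Python's standard `os.stat` and `stat` modules; the helper name `describe` and the choice of fields to report are illustrative, and the permission behavior shown assumes a POSIX system such as Linux or macOS.

```python
import os
import stat
import tempfile


def describe(path: str) -> dict:
    """Collect a few pieces of per-file metadata kept by the file system."""
    st = os.stat(path)
    return {
        "inode": st.st_ino,                      # the file's name as a number
        "size": st.st_size,                      # current length in bytes
        "owner_uid": st.st_uid,                  # numeric id of the owner
        "mode": stat.filemode(st.st_mode),       # e.g. "-rw-------"
        "modified": st.st_mtime,                 # last modification timestamp
        "owner_can_execute": bool(st.st_mode & stat.S_IXUSR),
    }


if __name__ == "__main__":
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"#!/bin/sh\necho hello\n")
        os.close(fd)
        print("before chmod:", describe(path))
        # Add the owner's execute bit, the adjustment needed before
        # a downloaded script can be run directly.
        os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
        print("after chmod: ", describe(path))
    finally:
        os.remove(path)
```

None of these values live in the file's contents; they are kept by the file system alongside the data.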
So again, I mean these basic file expectations seem like they're pretty simple, but there are still bugs. I mean, file systems are complicated and there are still bugs in modern file systems. I'm sure that I could have found a more modern version of this bug. But, and this is, you know, Margot worked on file systems for a long time. So she used to say, you know, data loss is one of those things. It's one of the worst bugs ever because if your file system loses data, not only is the person whose data you lost angry, but they were probably trying to work on a project and you destroyed all of their data. So now they've got nothing to do, which means they have time to call you on the phone, send you nasty emails, flame you on Hacker News or whatever, like they've got a lot of time on their hands to come after you about this. So this is bad. And file systems still do this. There's also a really interesting line of work that's been going on for a couple of years about trying to prove certain properties about file systems. And that's really interesting, right? You know, can I prove, can I come up with ways of being 100% confident that my file system doesn't suffer from certain problems? Okay. Now, what's hard about this? Well, not only is it hard to build this complex data structure over this linear array, but file systems have to deal with all of these issues. So what happens if someone trips over the power cord in your data center and suddenly, you know, all the power to a rack goes out? What are my expectations about what happens after that? It may be the case that some of the data is lost. That's okay. We'll come back to this when we talk a little bit about recovery, but I don't want the entire file system to be corrupted. Has anyone ever had this happen to them? Like a total file system corruption? Yeah, so I had this happen to me. Remember that 20 gigabyte drive I was really proud of?
So I had all, remember that MP3 collection I was really proud of, right? So now these stories come together, right? So I had that all on that drive. And at some point I was starting to mess around with Linux and I accidentally overwrote the superblock on this drive. And so it's really frustrating, right? Because you haven't modified very much of the data, but you understand how data structures work, right? You modified the important parts of the data. Anyway, so I was pretty sad about this. You guys don't understand what it used to be like to get music, right? You had to do a lot of dodgy things to download music from the internet. This is pre-Napster, okay? I'm that old. So I spent like $50 on this CD of recovery software for this drive and I thought, this is awesome. So I ran it and it discovered like, I don't know, 95% of the files I had on the drive. So I'm thinking this is awesome, okay? The problem was that the other 5% of the files were intermixed in the other MP3 tracks, right? So you'd have like Britney Spears on and suddenly there'd be this like tiny little snippet of something else, right? And then you'd go back to the song. So essentially everything was ruined. It was terrible, right? I think someone made Napster at that point because everything got much easier after that. It's easier to do bad things online. All right, so we talked about this. You know what? I'm not gonna do this exercise, this is sort of boring. You guys can go through this. I used to do a little design exercise here about where to store the file metadata, but it's not really that interesting. Okay, and there's also this idea of establishing a relationship between a file and the user. You guys may have wondered, for example, why do I have to call open? Why can't I just call read and write and give it a path and where I wanna put the data in the file and have that work? And there are some reasons for this.
File systems like to observe these relationships so that they can optimize certain things, right? It's helpful if I know that a process is in the middle of using a file, because I can do some intelligent caching and other things. And I can also provide some guarantees based on open, so I can allow a process to open a file exclusively to make sure that nobody else is using that file behind its back. That could be a useful thing to do. Okay. Yeah, I think you guys know about this. This is sort of all review at this point. Actually, you know what, let's do this. Is there anything about this that doesn't make any sense to people? You guys implemented this stuff, so I think you get it. This is the interface for manipulating the shared information that's sort of between the process and the file. All right, any questions at this point? Basic file expectations. Okay, good. I've got 15 minutes to do midterm review. Okay, let's do midterm questions. So obviously the secret about this exam is it was all recycled questions. Hopefully that didn't bother anybody. I don't know why it would have, but I've written so many. It's so hard to come up with good midterm questions after you've written like thousands of them. I have this thing that happens to me now where I think, oh, that would be a great midterm question. And I start writing it up and then I look through my huge catalog of old questions and I'm like, oh no, I used that one three years ago. I just totally forgot. So that's sort of depressing. Any specific questions about stuff that was on here? Munch? You wrote in an answer? It doesn't matter. Oh, I'm sorry, you know what? I'm not scrolling the right thing. Okay, let me go over here. Yeah, everybody gets credit for this. So there used to be, there is a correct answer to this question, I think. On that was the one. Well, thinking of Bob is also, I guess. Did he say that? Does he say that in the lecture? Okay, awesome. He says, on that, right? Okay, anyway.
So look, look, everybody gets a point for this stupid question because one year someone got really mad. They were like, I'm gonna lose a point because I didn't know what song you played in class. I don't care. Yeah. Well. I mean, there were only four options, right? One out of, I'm glad that I'm not the only person with verbal tics, right? But I do, I just want to point this out. This question is very much on here out of love and respect, like the fact that Carl has been willing to lecture has been super helpful, right? And he's good at it. Okay, any other questions on the multiple choice other than the one about Carl? Yeah. Go for Jay. Jay. Right, good question. So look, categorical statement, let me just give you a test taking tip. Categorical statements like this are almost always false, right? Like I can come up with a counterexample. So why doesn't this ensure that variable foo is protected? Yeah. What if foo's also modified outside of it? Exactly, right? I can write code that doesn't acquire the lock and does stupid things to foo, right? At which point this piece of code doesn't matter, right? I mean locks only work if everybody who is modifying a variable uses them properly. And this is why in other languages they have better primitives for doing this. So in C, this is what we got. In other languages, you have ways to say, just make sure that certain methods or certain types of access to a variable are always synchronized regardless of who's doing it. That's a much better design pattern, right? Better to design a particular object so that all the accesses to it follow certain rules than hope that everybody that uses the object follows the rules properly. Makes sense? Yeah. Yeah. I have the exact same thing. As soon as I saw that it was false, I'm like, duh, he means that there's other code that would be accessing or modifying foo, then it's false, but I don't know. When I read that, I wasn't thinking that there could be other code.
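The point about locks only working when every writer uses them can be sketched in a few lines. This is a minimal illustration in Python rather than the C used in the course, and every name in it (`Shared`, `careless_increment`, `SynchronizedFoo`) is hypothetical, not taken from the exam question. It shows, first, that code which skips the lock can still change `foo` while someone else holds the lock, and second, the alternative design mentioned above, where the object itself takes the lock on every access.

```python
import threading


class Shared:
    """Hypothetical shared state: a lock sits next to foo, but nothing ties them together."""

    def __init__(self):
        self.lock = threading.Lock()
        self.foo = 0


def careful_increment(shared):
    # Follows the convention: take the lock before touching foo.
    with shared.lock:
        shared.foo += 1


def careless_increment(shared):
    # Ignores the convention. Nothing stops this from running.
    shared.foo += 1


class SynchronizedFoo:
    """Alternative design: foo is only meant to be reached through locking methods.

    Python's leading underscore is a naming convention, not enforcement.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._foo = 0

    def increment(self):
        with self._lock:
            self._foo += 1

    def value(self):
        with self._lock:
            return self._foo


# Even while the lock is held here, the careless path still modifies foo.
shared = Shared()
with shared.lock:
    careless_increment(shared)
careless_result = shared.foo

# The careful path works too, but only because it chose to cooperate.
careful_increment(shared)

# With the encapsulated design, four threads all go through increment().
safe = SynchronizedFoo()


def worker():
    for _ in range(10_000):
        safe.increment()


threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(careless_result, shared.foo, safe.value())
```

The first half is deliberately single-threaded so the outcome is deterministic: holding the lock does nothing to stop a writer that never asks for it. The second half is the "design the object so all accesses follow the rules" pattern, where callers have no reason to touch the lock at all.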
It just says the following code, which, yes. I'm not backing down on this one. I think this is an OK question. Yeah. OK, other questions so that I can get defensive and defend my solutions. This one makes sense. Yeah, so we talked about this a little bit in class. I mean, if I have a thread that figures out how to yield right before its quantum ends, essentially it can kind of keep itself in the top queue despite the fact that it's really doing stuff that's very computationally intensive. Maybe I yield, or I do like a really short sleep. I do like a tiny little bogus read to a file just so that I can look like I'm an interactive task. And one solution here is to just give threads credit for how much of the quantum they use. The goal of MLQ is to reward threads that do a little bit of work and then block again. Yeah. People know I don't think this stuff is black magic. I mean, Linux has like one scheduler and it's open source. So if you really wanted to, you could probably go look. But I don't think it's subject to this sort of manipulation. Windows and stuff, that's a good question. I don't know if there's. I suspect a lot of the stuff doesn't change very often. So I suspect a lot is known about how these things are done. And to some degree, remember you also want it to be tunable frequently. So you need to allow information to come in from the outside. Somebody who's running certain types of workloads on a Windows NT server may have tweaks that they make to the scheduling algorithm to get those workloads to perform better. What's that? I think you can also set quantum lengths and fiddle with that a little bit, right? Yeah, I think there's a few knobs usually. Depends on the scheduler. Yeah. Other questions about this? Oh yeah, there we go. So this is a nice question. This is modern, right? I mean, modern computers have more than... We passed this threshold a long time ago, right? I mean, how many people have more than 4 gigabytes of RAM on their machine?
OK, yeah, great. So clearly a 32-bit address space is starting to struggle at that point. Yeah, and we haven't gone all the way to 64. That's a long way. With 64 bits, I think it's enough to address every atom in the universe, so you're probably never going to have that much, given that it takes at least one atom to store a bit. You're probably not going to have that many megabytes of RAM anytime soon, so that's a little bit overkill. So we've gone somewhere in the middle. Some systems have eight-bit extensions that allow me to latch on to a 32-bit base address and therefore create a 40-bit address. 48 bits, I'm sure it's out there. Any questions about this sort of obvious stuff? How this is done, you could probably give a whole lecture about how we started to add wider address support to modern systems. It's ugly. A lot of the reason is because it's being bolted on to previous 32-bit architectures. So again, imagine a 32-bit chip where in addition to my 32-bit addressing, I now have one 8-bit register that sets which page table I use or something like that so that I can support wider address spaces. And a lot of this ends up being pretty hacky. OK. Yeah, this is an easy one. Illusions, any questions about this? This is sort of cool. I mean, there's probably other answers. These right answers are not intended to be exhaustive. If you guys came up with other things, we'll give you credit for them. The solution set is just examples of ways that you could have answered the questions. Questions about this one? All right. Yeah, exceptions. The thing this question is trying to get at that I hope you guys have absorbed is the fact that memory-related exceptions are not fatal and are really common. So of the exceptions that your computer experiences, the program experiences while running, probably 99.99999% of them are memory-related and totally benign. It's just address translation. That's how it works.
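To put rough numbers on the address widths mentioned above, here is a small arithmetic sketch. It assumes byte-addressable memory, so an n-bit address reaches 2^n bytes. The helper names `addressable_bytes` and `human` are made up for this illustration.

```python
UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB"]


def addressable_bytes(address_bits):
    # Assumption: one address names one byte, so n bits reach 2**n bytes.
    return 2 ** address_bits


def human(n_bytes):
    # Repeatedly divide by 1024; exact here because the inputs are powers of two.
    unit = 0
    while n_bytes >= 1024 and unit < len(UNITS) - 1:
        n_bytes //= 1024
        unit += 1
    return f"{n_bytes} {UNITS[unit]}"


# 32 bits is the classic limit; 40 is the 32 + 8 extension described above.
for bits in (32, 40, 48, 64):
    print(f"{bits}-bit addresses: {human(addressable_bytes(bits))}")
```

This is why 4 gigabytes is the threshold in the exam question: 2^32 bytes is exactly 4 GiB. Adding eight more bits multiplies that by 256, to 1 TiB.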
So I mean, how many people have seen a program crash because it divided by zero? I mean, that just doesn't happen very often. Well, OK. One that you didn't write. Sorry. Or you didn't. Well, anyway, that's a mean thing to say. I mean, hopefully we sort those out earlier. This is an exception. Yeah, Vicki, if you would wait, what's the example? Sorry. Only program. I mean, I guess that's an example of an exception that would terminate it. It's a different example because it's memory-related. But it's an example of using bad memory. So that's a great point. Using bad memory falls in the category of something that could terminate a process. But most of the time, again, unless you're a new C programmer and you're hitting all these segmentation faults, most memory-related exceptions are translated by the operating system because they're accesses to valid memory. Yeah. So you actually could use memory for both. You could say memory in a case where it's translated is an example of something that doesn't terminate the process, memory in a case of bad reference to something that I don't have permissions to use or isn't the right type of operation, again, trying to write to my code segment or something like that, that could come in. OK. Synchronization primitives. Yeah, so this is, hopefully this is something people figured out. It's kind of an interesting design choice. Any questions about busy waiting versus non-busy waiting? OK. Yep, what privileges do I need to multiplex memory? The ability to modify the TLB, access to kernel data structures. And this is how I control the memory on the system that a process can use. I mean, that is the kernel's primary objective. Make memory available to processes, but control on a page-by-page granularity what a process can do to each page of memory. And again, with as little intervention as possible. OK. Well, I'm going to answer questions. How many people did the wait time prediction question? OK.
How many people did the jumbo pages question? It's the other one. OK, it's a pretty good mix. Any questions about either one of these? Well, we'll start with the wait time prediction. Questions about this? How many people read the previous exams when they were reviewing? OK, good. I thought about telling you that this was all old stuff, but I didn't want to, like, bias how people studied. So maybe I should have done that. Now you're all mad. Yeah. Any questions about the jumbo pages? It's a nice thing about giving an exam that's all old questions. No one has any problems. Gas. I don't know. It's not clear that these questions are still better than they used to be just because I've reused them. We're good? OK. Any other general exam questions? So just looking ahead a little bit, the final exam is going to be identical in format to the previous final exams. I will probably gin myself up a little bit and actually write some new questions for it, because sometimes that's fun. So again, don't make any assumptions based on the format of the midterm. I know that the midterm is time-pressured. So that's normally reflected in the scores. That's fine. The final is not. So the final is roughly about twice as long as the midterm, but you have three hours instead of 50 minutes. So people typically do not have a problem finishing the final exam on time. OK, I will see you guys on Friday. Good luck finishing assignment 3.1.