All right, let's get started. Monday morning, everybody needs a jolt of good energy, so why doesn't everybody stand up? Actually standing up involves a little bit of flexing your knees, and then everybody can stretch a little bit, try to get the blood flowing. You guys could also stand through the entire class. That would be cool. Just kidding, sit down. I know everyone wants to relax.

All right, so today we are moving on. We spent, oh man, how long has it been? Almost exactly a month, kind of introducing... can we have one conversation now, guys? When I'm talking, you're not talking, OK? One of the fundamentals of a lecture-style class. So: we've spent about a month talking about the CPU and about standard operating system abstractions. At this point in the class, you should have some idea of what the operating system does, and a more detailed idea of exactly how the operating system abstracts and multiplexes the CPU. And at this point, we're done with the CPU.

So today we're going to start our unit on memory management, which is going to last until spring break. Looking forward: this week, next week, and then a few lectures before spring break, we're going to talk about memory management. Our goal is to get through the memory management portion of the class in that time, and then the midterm will cover this material as well as the stuff we've covered earlier. So that's the blueprint from now until spring break, which just can't seem to come fast enough. But this is a fun part of the class, one of my favorite sections with some of my favorite material. I think this will be cool.

Today, we're going to spend a lecture motivating the problem. We did this with the CPU, and we're going to do something similar for memory. We'll talk about what our goals are for multiplexing memory, we'll talk a little bit about why simple approaches fail, and I'll introduce you to the central abstraction, which we call an address space, that we're going to present to processes. And essentially, over the next few weeks (I'm going to step on your tail, buddy, if you sit right there; he'll learn), we're going to work out how to implement this nice abstraction that we propose today.

OK, so let's do the announcements. Essentially, what I decided to do was move the assignment due date to Thursday at midnight. The real reason for this was that we realized that of the recitations, one was going to fall right after the deadline and the other was going to fall the next morning. So we figured that this week, Sonali's going to do assignment one in recitations, and we'll spend more time in recitations this week going through some of the last bits of assignment one, so you guys have some help. This is a small change, and it won't affect the due date for assignment two. Assignment two is going to end up sprawling a little bit, because spring break is going to be in the middle of it, but we'll get that out later this week.
So again, at this point, despite the addition of 31 extra hours, if you haven't started the assignment one synchronization problem, my advice is to start assigning some real blocks of time to work on this stuff, because it takes time.

There are a couple of remaining little assignment one bugs that I'm planning on fixing today. These aren't big things, and I don't think they'll break anybody's code. It's just some cleanup, plus a little bit of goo so that configuring for assignment zero doesn't break, which somebody discovered it did, because of some of the extra menu items. So I'll push this out this morning. Again, I don't think this should affect any of the code you've already written; these are very, very small changes. But once that's done, I'll tell you to do a pull, and you should pull down the new stuff.

The last thing is that, in case this wasn't clear before, assignments two and three in this class are done in pairs. I think we have an odd number, so maybe we'll create one group of three and give that group a little more work to do on these assignments. So assignments two and three are done in pairs, and at this point it's time to start thinking about who you're going to pair up with. Here are my guidelines. First of all, I would prefer that 421 students pair up with other 421 students. If you would like to work with someone in 521, that's OK; the only caveat is that a mixed 421/521 group will be graded as a 521 group. And remember, as I said earlier in the class, the only real functional difference between 421 and 521 is that 521 students are graded on a different scale, with the guarantee that the same person taking the class in 421, handing in the same material, would get a better or equal grade. So it is not possible to do worse in 421 than you would have done in 521. Does that make sense? OK.

We're going to add a form to assignment one so you can submit your patch. I'm also going to ask you to enter the username of the person you'll partner with for assignments two and three. If you don't have a partner, or you don't know who that person would be, you can leave that blank, and we'll try to assign you somebody who we think is suitable, probably based on your assignment zero performance. Well, you could ask the person, yeah, but we can also add it to the form. So how about this: I'll try to set it up so that if you nominate somebody in 521, making it a mixed group, I'll at least warn you about that. We'll let you do it if you want to. Again, I think it's going to work out a little easier if we don't get too many mixed groups. On the other hand, if you want to work with someone in 521, and you think that's the person you'll complete the class in the strongest fashion with, I'd say go for it.

Any other questions about partnering or the last pieces of assignment one? What's that? Assignment zero is not finished yet; it's close. We're almost done with the code reading questions. We had 2,000 code reading questions to look over between the two TAs, so it's taken a little while: you each only submitted 20, but there are 100 of you. All right, any other questions about this stuff?

All right, so let's do our usual morning routine. Last week we talked about thread scheduling.
That was the last piece of the CPU management puzzle: the policies that we use to determine how to apply the mechanisms of context switching, preemption, et cetera. So does anyone have any questions about the scheduling algorithms that we discussed in class? Keep in mind that scheduling algorithms like these are frequently featured on midterms and final exams, because they lead to questions like: how would this scheduling algorithm schedule these threads? Any questions on this stuff? No questions.

All right, so let's do some review. What is scheduling? Anybody? Everybody? Fundamentally, what is scheduling? Choosing the next thread to run. Yeah, this is kind of a limp room this morning. Maybe we need to stand up and stretch again. You're pumped? OK, good, good. We'll be pumped about answering the questions.

All right: why do I have to schedule threads? Why is the operating system in charge of this? Well, OK, performance. That's one of my goals in scheduling threads. But fundamentally, why do I have to schedule threads? What creates this problem in the first place? Well, that's a goal of scheduling. Yes, Ben? There are more threads than there are CPUs. There are more threads than there are CPUs. And what's the other reason? So there are more threads than there are CPUs, and that's the problem. Why does the operating system have to do this? Carl? We don't trust the threads to schedule themselves. And we're the privileged process that's running on this machine. This is our job. When the user installed Windows, what they told us, despite the fact that they had no clue they were actually doing this, was: Windows, I would like you to control access to the CPU in the way that improves performance. And that's what scheduling is. So we schedule because we don't have enough cores to give each thread its own core, and the operating system schedules because the operating system is the privileged process on the system. And that privilege is there so that we can multiplex. Ah, it's up on the slide, too.

OK. When do we schedule threads? When does the operating system have to make scheduling decisions? There were four cases. Yield is one. Timer interrupt. A blocking system call. Yes, one more. Exit. All right. I've been told by several people that I don't give the front of the room enough attention, so I think today we're going to, especially the giggly corner over here. This is one of my favorite parts of the room. I'm going to get to calling on you guys. Hang out up here.

All right. So: when a thread voluntarily gives up the CPU by calling yield; when a thread makes a blocking system call and I have to sleep it until the call completes; when a thread exits; and when we decide that the thread has run for long enough. The mechanism that gives us those decision points is the timer interrupt. That's preemption.

So what are some of the goals of scheduling? Some of these came up earlier when we were talking about what scheduling was. Anybody want to list some goals? What are some things that schedulers try to accomplish? Fairness? OK, what else? Responsiveness, interactivity. We talked a lot on Friday about responsiveness and interactivity with the new Linux scheduling work. Anything else? Calvin? Don't be in the way. Don't be in the way.
Right: we want the scheduling process itself to perform well. We don't want the scheduler to consume so much CPU that the goodness of the decision it makes is outweighed by how long it took to make it. Anything else from over here? I know you guys know this. You hear this sniggering all the time; sometimes I feel a little hurt. OK: how well does it meet deadlines? So we talked about deadlines and why deadlines matter on a system. Deadlines are important to meet because they affect how interactive the system seems, and also because they affect continuous processes, like playing video and audio. If you don't meet those regular deadlines, those processes start to break down. And then the one that didn't come out here was simply utilization: how well does the way we schedule threads use all the resources on the machine? Good thread scheduling should, at some level, be able to meet all of these goals. One of the things it will do is that when there's work to be done on the disk, when there are packets to be sent over the network, when there are applications that need to use memory, all of those resources will be busy a lot. And they'll be busy a lot because we're doing a good job of letting applications wake up and making sure all of these things are happening together.

Deadlines. So, last question; I've asked you guys this a bunch of times. I'm trying to give you the one-upness over your machine. Why do deadlines win? You guys remember? What's that? Human time is more important than computer time. Your time is more valuable than your computer's. And then, coming back to the point about performance: make sure the scheduling algorithm itself doesn't run for long periods of time.

Two examples of schedulers that don't use any additional information about threads: round robin is one. What's the other one? Random. Random is, on some level, easier; round robin at least maintains some sort of fixed order.

OK, what might we like to know about threads if we could predict the future? If we could figure out what a thread was about to do, what features of what it is about to do might we like to know? Yeah, Dutchie. How long it's going to run? How long it's going to run before it blocks or yields. What else? Yeah? Maybe whether it's going to make a system call. Will it block or yield? What is it going to want us to do next? And then finally, if it's going to make a system call and I'm going to have to do something on its behalf, how long is that system call going to take? How long will it wait? How long will it spend on the wait queue? Now, it's possible that we don't know how long it will spend on the wait queue. Why? What's the case where a process could be waiting for something and we wouldn't know how long it's going to wait? It's waiting for a user. It's waiting for a user, and there's no idea how long it's going to take you to get back from your coffee break. The system doesn't know this.

So what is our typical approach when we can't predict the future? Systems can't predict the future, so we what? Use the past to predict the future. Everybody repeat after me: use the past to predict the future. That's what we do, and we're going to see this again in memory management. Same approach. Give me an example of a scheduler that does this. Multi-level feedback queues, MLFQ, all right? Final little review: who is Ingo Molnar? Anybody remember who Ingo Molnar is? What's that?
The Linux kernel scheduling subsystem maintainer and developer. And who is Con Kolivas? The Australian anaesthetist and Linux kernel developer who focused on improving interactivity while striving for simplicity and predictability, OK? So that was kind of a fun lecture. Now, I'm not going to promise that you won't see the rotating staircase on a midterm, because it's a cool scheduler, and it's not that difficult to understand. So if you're confused, go back and look at the slides again.

All right. Oh, wait, here it is. So actually, this is brand new stuff. Great. OK, rotating staircase. Who remembers the rotating staircase? OK, I've got a 5 millisecond quantum and 10 levels in my staircase. So, a thread that starts at the highest priority, level 0; let's say that the priority levels are ordered 0 to 9 and the highest is 0. What levels can this thread potentially run at? All of them, right? 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. All 10 levels. What about a thread that starts at priority level 5? Where can it run? 5, 6, 7, 8, and 9, right? Remember that the threads fall down the staircase, but assuming that they exhaust their quantum at one level, they are granted a new quantum every time they descend.

And finally, this is a good question; it comes down to the predictability of the scheduler. This is a great question. Who came up with this fantastic question? OK: at the beginning of time, we have one runnable thread at each of priorities 0, 3, 7, and 9. This is actually kind of hard. What is the longest amount of time before the thread at level 9 has a chance to run? Does someone want to do this? Just tell me: how do I answer this question? I won't ask you to do the math, because the math is a little difficult to do in your head. But how do I answer this question? Anybody want to guess? Yeah, John? You count up everything that can run before that one. Exactly, right? So I'm doing a worst-case estimate: the longest amount of time. And remember, the semantics of the scheduler, as we discussed, are that when threads fall down, they get put on the end of the level. So the thread at level 9 will be the first thread to run at level 9. But before that happens, the thread starting at 0 can run at the 9 levels above, the thread at 3 can run at 6 levels, and the thread at 7 can run at 2 levels, 7 and 8. So I can use this to bound the amount of time that it's going to take before that thread runs. (There's a worked sketch of this arithmetic below, just after the announcements.) And this is one of the nice features of this scheduler: I can very, very easily make guarantees like this.

All right, any questions about RSDL? Rotating staircase, good stuff. OK, any other questions about scheduling generally? I know this is kind of old stuff, and the weekend kind of totally erased your memory of everything to do with scheduling; kind of mine too. I tried. I tried my best. But any other questions about scheduling before we go on? Any other questions about the CPU, threads, et cetera?

All right, so one thing I should point out, and I should have talked about this in the announcements, is that at this point in the class we're done covering threads. You guys have a few more days on assignment 1. Once we get to assignment 2, the lectures and the assignments are going to start to diverge. And that's kind of inevitable, right?
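Since I said I wouldn't make you do that staircase math in your head, here it is worked out as a small C sketch. It is purely illustrative, assuming the 5 ms quantum, the 10 levels, and the four threads starting at levels 0, 3, 7, and 9 from the question above:

```c
#include <stdio.h>

#define QUANTUM_MS 5
#define LEVELS     10   /* priorities 0 (highest) through 9 (lowest) */

int main(void) {
    /* Threads ahead of the one starting at level 9. A thread that starts
     * at priority p gets a fresh quantum at each of levels p..8 before
     * the level-9 thread gets its first turn. */
    int starts[] = { 0, 3, 7 };
    int quanta = 0;

    for (int i = 0; i < 3; i++)
        quanta += (LEVELS - 1) - starts[i];   /* 9 + 6 + 2 quanta */

    /* Prints: worst case = 17 quanta = 85 ms */
    printf("worst case = %d quanta = %d ms\n", quanta, quanta * QUANTUM_MS);
    return 0;
}
```

So the answer to the question is 17 quanta, or 85 ms, and the fact that you can compute a hard bound like this at all is exactly the predictability claim.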
That divergence is inevitable; I've never been a part of this class when it hasn't happened. What I will guarantee is that, at this point, the lectures will be ahead of the assignments, which I think is better than the alternative. By the time you guys get to assignment 3, you will have seen memory management in its entirety, and actually another week or two will have gone by. What that also means is that there's going to be some tension in the recitations, because the material I'm talking about in class, the material Sonali's going to be trying to cover, is going to be, after spring break for example, stuff on storage, whereas the assignment is going to be on virtual memory. So we're going to manage that as best we can. My feeling is that the recitations are really most important for doing the programming assignments, so that's where we'll focus. But obviously, we will also use that time to address stuff that comes up in lecture, all right?

OK. So when we talked about CPU scheduling, one of the things we were doing, although we didn't really frame it this way, is what's known as time multiplexing. On systems, you can really think of two ways of dividing up a resource. I can divide up access to it by cutting it into little pieces in time and handing out those pieces to different processes or different threads. Or, if the resource is fundamentally divisible, I can divide up the resource itself and grant multiple threads access to it "concurrently," depending on whether that's the illusion of concurrency or the reality of concurrency. So again: CPU scheduling on a single-core system is the fundamental example of time multiplexing. I've got one CPU, I want to make it look like multiple CPUs, and I do that by quickly swapping back and forth between different threads.

Now, memory is something that I can actually break up with a pretty high granularity, and we'll talk more about exactly what granularity I can achieve. But imagine I've got a two gigabyte system: even if I cut it up into one-kilobyte pieces, that's still a lot of pieces. That's two million pieces I can hand out in different combinations. So this is something that I can space multiplex in a fairly flexible way.

Now, go back to the CPU example. Is the CPU always time multiplexed? Is there an example of CPU space multiplexing? Multicore systems, right? On the other hand, though, multicore systems have a fairly coarse granularity. I might have four cores that I'm dividing over the system. If I took memory and handed out four pieces, that would be weird, and that's not what systems actually do. On a two gig system, I would be handing out half-gig pieces to up to four processes at a time. And as you'll see, that's not exactly how we want to do things with memory anyway, because the cost to switch memory back and forth between different processes is much higher. So I just want to point out this distinction, because when we talk about memory, we're really always talking about space multiplexing. Now, the grants of memory that we give out to processes can change over time, right?
That is definitely true. But in general, we don't do the thing with memory where we yank the entire chunk of memory away from a process every small slice of time and hand it to another process. And the reason for that is that preserving the contents of that memory is just too expensive. It would take so long to write all that memory to somewhere I could get the contents back from later, before I did the switch, that this just isn't how we do it.

OK, so let's start with the straw man: the simplest possible way of dividing up memory. I've got a big chunk of memory in my system, I don't know, two gigs or whatever, maybe a megabyte in the old days, and I've got a bunch of processes that want to use this memory. What's the simplest possible way that I can divide this memory up? Anybody? I could use fixed sizes, right? But let's say that I just start handing out stuff. Firefox shows up and wants a certain amount of memory, and I just give it a portion of the physical memory on the system. So this is physical memory up here, and Firefox comes up and says, I need this much memory, and I say OK, and I just hand it a chunk of memory. The next process that comes up, maybe it's VirtualBox; VirtualBox needs a lot of memory, so I give it a big chunk. My terminal starts running; the terminal maybe doesn't need as much memory. And then let's say Inkscape shows up and wants some memory. Well, what do I do now? Inkscape wants more memory than the amount of memory I have left on the system. (I'll put a little sketch of this straw-man allocator below.)

So what's one possible thing I could do? Yeah? I could kill somebody, right? But that's kind of unfair to the processes that are already running. What if I want to be a little more unfair to Inkscape instead? What could I do? How about just fail? Just tell Inkscape: sorry, you can't run right now. You would launch a process on your system and it would say, sorry, out of memory; run me later or kill off other processes. So that's an option.

And then the other problem that this sort of points out (oh gosh, I'm so behind now) is that this scheme is limited to the actual amount of memory that I have. So when I run out of memory, I essentially have to start failing allocations. And what happens in this case if a process requests memory that it doesn't actually use? What was the mechanism I had on the CPU for trying to make sure this didn't happen? I can preempt it, right? But what would actually happen on a CPU if a thread just decided to sit in an idle loop, just executing no-ops, empty instructions? What would happen to that thread in most systems? It's going to start using up its quantum, it's going to look like a thread that's non-interactive, and so eventually I'm just going to be able to push it away. But it's interesting, because on the CPU, I don't have good ways of figuring this out: the system doesn't look at the instructions being executed and try to figure out if they're any good. Maybe I'm just adding up random numbers or something like that. There's no way of telling what the application is doing on the CPU and how much value that's actually creating. But with memory, I might be able to actually tell, right?
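To make that straw man concrete, here is a toy sketch of "just hand out contiguous chunks of physical memory," first-fit style. Everything here is made up for illustration (the region table, the names, the sizes); no real kernel manages memory this way:

```c
#include <stddef.h>

#define PHYS_MEM_SIZE (2UL << 30)   /* pretend we have 2 GiB of physical memory */
#define MAX_REGIONS   64

struct region {
    size_t start;   /* physical address where the region begins */
    size_t len;     /* size of the region in bytes              */
    int    in_use;  /* 0 = free, 1 = handed out to a process    */
};

/* At "boot," one big free region; allocations carve pieces off of it. */
static struct region regions[MAX_REGIONS] = {
    { .start = 0, .len = PHYS_MEM_SIZE, .in_use = 0 }
};
static int nregions = 1;

/* First-fit: return the physical start of a contiguous chunk, or
 * (size_t)-1 if no single free region is big enough. Note that this
 * can fail even when the total free memory would suffice; that
 * failure is external fragmentation. */
size_t alloc_contig(size_t len)
{
    for (int i = 0; i < nregions; i++) {
        if (regions[i].in_use || regions[i].len < len)
            continue;
        if (nregions == MAX_REGIONS)
            break;                           /* bookkeeping table is full */
        size_t start = regions[i].start;
        regions[i].start += len;             /* shrink the free region... */
        regions[i].len   -= len;
        regions[nregions++] =                /* ...and record the grant   */
            (struct region){ start, len, 1 };
        return start;
    }
    return (size_t)-1;   /* "sorry, Inkscape, out of (contiguous) memory" */
}
```

Notice what the return value is: a raw physical address. A process gets whatever addresses happen to be free that morning, and if there is no single hole big enough, the allocation fails. Both of those problems are about to come up.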
With memory, I might be able to say: OK, this is a large amount of memory that I've allowed you to use, and you don't seem to be doing anything with it. You don't seem to be actually accessing it or changing its contents. But in this model, if processes request memory that they don't use, what can I do? Can I actually revoke the memory that I've given them? Let me come back to that question.

Now let's say that Firefox exits, and now I can satisfy this Inkscape request. But what do I need to do? I have to split it in half, right? So now I have this case where, in order to do efficient allocation, I might actually have to start doing discontiguous allocation. You might say, hey, I need five megabytes of memory, and I might give you one-megabyte chunks that are sprinkled all over the entire machine. Why would that be a bad thing? I mean, who cares? Yeah: there are all sorts of reasons why this could be bad. One of the main ones is that it really complicates how you set up your process. If, when the process runs, it doesn't know whether it's going to get a contiguous five megabytes or little itty-bitty chunks of memory all over the place, how does it figure out where to put stuff? Let's say you're a programmer and you want to allocate a one-megabyte array: it's possible you don't get any chunk that's big enough to put that array in. So that's kind of terrible. But I could do this, and it would work, sort of.

Now let's try to keep adding features and think about what we'd have to support. What if processes want to allocate memory dynamically? So the terminal process exited here, and then VirtualBox wants some extra memory. What are the cases here? I can always hand it some other discontiguous piece. And I can grow the allocation up to a certain point; you can see that I've grown the allocation into the area that was vacated by the terminal program. But if for whatever reason it requests more memory, and it needs that memory to be contiguous, then I might have to fail that request as well. If this is starting to seem like a laundry list of the problems with this approach, that's because it is. This is a terrible way to do it.

And there are more problems with this. Can anyone spot any more? There are probably a dozen other things that would make this terrible. Anyone else want to guess? Right, OK: an unlimited amount of physical memory on the machine? Yeah, it's possible, right? So, normally, on systems like the ones we'll be focusing on in this class, we think about memory as being uniform. But there was, and continues to be, a lot of work on what's called non-uniform memory. Part of the reason is that there are systems where the memory abstraction has been extended to incorporate pieces of memory that may have very, very different performance and latency implications. That's what is usually referred to as NUMA, or non-uniform memory.
By non-uniform, I mean that this chunk of memory is maybe actually on my local machine, whereas this other chunk of memory, despite the fact that it looks like it's on my local machine, is actually on some distributed machine somewhere else. And that creates the performance difference you would expect between writes and accesses to those pieces of memory.

So now I also have this discontiguous allocation problem, which is annoying; it's kind of messy and gross, and it complicates the layout of my process in ways that we'll discuss in a second. What about this problem: how many people are familiar with fragmentation? Raise your hand if you've heard about fragmentation before this class. OK, it doesn't look like everybody, so we'll talk about it a little bit.

But let me talk about layout again for a second. How do processes know where their code and data are going to be located? Here's the code that you want to write as a C programmer, and you just want this code to work. But the code you write in C is filled with these symbolic references. So, for example, I declare this data array. That data array is going to be stored in memory, and it has to be somewhere; there has to be an address for it, so that I can tell the memory subsystem where to find it. And then here, once I get down and start to access data, how does the program know where this particular piece of the array is? How do I know this, especially if I have discontiguous or even varied allocations? (There's a sketch of this sort of code just below.) The other problem that we saw here is that if it's early in the morning, maybe your process gets loaded in at the very, very bottom of physical memory; if it's later in the day and a lot of things have been running, you might get a completely different set of addresses. So this starts to become kind of a pain. How do I know where my functions are? How do I know where anything is, especially if I have to keep relocating? Essentially, what I'm doing is forcing processes to do all this hard, heavy lifting to figure out where things are at runtime. When they're given a piece of memory, they have to figure out where to put things and how to rewrite all these addresses. And this, as you can imagine, would be kind of disgusting.

So the other problem we had here was fragmentation. The best definition I've seen for fragmentation is simply: when I fail a request for a certain amount of a resource, despite the fact that there is enough of that resource unused on the system to satisfy the request. And the reason that I fail requests, especially for memory, is that these requests are typically for contiguous memory, and the unused memory that I have available is broken up into little chunks. There are two different types of fragmentation. Internal fragmentation is when I have unused memory that's inside existing allocations. So, given our current model of the world, what would be an example of internal fragmentation? Let's say I hand out this big chunk of memory to VirtualBox, and then I have to fail an allocation from Inkscape because I don't have enough memory. Where is the potential there for internal fragmentation? Yeah, and why? VirtualBox may have allocated a lot of memory that it's not using, right?
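Back to the layout problem for a second: the slide's code isn't captured in this transcript, but it is the moral equivalent of this hypothetical sketch, where every symbolic name has to end up at a concrete address:

```c
/* Every name below is symbolic: the compiler and linker must turn each
 * one into a concrete memory address before (or while) the program runs. */
int data[1024];               /* where does this array end up in memory?  */

int sum(void)                 /* where does this function live?           */
{
    int total = 0;
    for (int i = 0; i < 1024; i++)
        total += data[i];     /* the machine code loads from the base     */
                              /* address of data plus 4*i, which only     */
                              /* works if the array really is contiguous  */
    return total;
}
```

If a process can be handed arbitrary, possibly scattered physical addresses at load time, somebody has to rewrite every one of those embedded addresses, which is exactly the heavy lifting I'm complaining about.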
So, picking that back up: if I could reclaim the unused memory inside VirtualBox's allocation, I could give it to Inkscape, and Inkscape could use it. So there's some internal fragmentation. Now, what about external fragmentation? External fragmentation is when the unused memory is between existing allocations. There's still the potential that I could allocate it, but it's stuck in between these other allocations. So, going back to our model: when the terminal exited and there was a little piece of memory left, what could I have done? And let's say that Inkscape came in and needed this big chunk of memory. Where's the external fragmentation there? Anybody want to walk me through an example of external fragmentation that could have occurred? Hmm, maybe this isn't a very well-posed question. Let me go back. Yeah, oh wait, sorry, back and forward, or not. All right, let's see, I think it's in here. Well, actually, here we go, this is the better example. So here, let's say that Inkscape actually needs this allocation to be contiguous: it's got some massive array that it's storing that actually needs to be contiguous. Here would be a case where I would suffer from external fragmentation, because this chunk isn't big enough to be put here or here, and yet I have enough memory if I could combine those two pieces. I feel like maybe I'm doing something terrible and just beating this dead horse too hard, but let's keep beating it for a little while longer. OK: fragmentation, internal and external.

And again, it's not always possible to split up these allocations as I might have wanted to. For example, this array on the slide (call it 4K; it depends on how big it is) has to be contiguous, because the way that an array is referenced is by offsets from the beginning of the array. If it weren't contiguous, I would have to do some really disgusting things at the instruction set level to get this to work, and I don't want to do those things.

Another problem with this that no one has pointed out so far: if I hand out addresses to physical memory directly to processes, is there any way for the kernel to enforce those allocations? Remember, one of the things I have to do when I multiplex resources is enforce my allocations. I have to be able to say: if I gave you a megabyte of memory, and you a megabyte, and you a megabyte, then you guys aren't stomping on each other. That Venu isn't saying, you know what, I'm the queen of town, I'm taking all three megabytes. She might do something like that. So I need some way of enforcing that. If I'm handing out physical addresses that map directly onto memory, is there any way to enforce those allocations? Is there any way to do it? What's that? Oh my, you're getting way ahead of us. So what do I have to do? What actually happens here? The process is executing instructions, and let's say that it executes an instruction that accesses memory that it's not supposed to be able to access. What do I have to do at that point? So first of all, how would the system even know that it's memory that the process wasn't supposed to be able to access? What do I need here?
Who do I need to help me? Hardware. Or, I mean, I could do the following: every time a process accesses memory, I could trap into the kernel and check. Every instruction that accesses memory could generate an exception, and I could trap all the way into the kernel, save all the context, look at the instruction, look at the address, say that's OK, and restart the instruction. That would be the slowest computer ever, because almost every instruction an application executes that does anything interesting touches memory, and I don't want to trap every time. That would be totally, totally terrible. So essentially I need to check every memory access, which means either the hardware has to help me, or I'm toast. There's no way that I could do this by trapping into the kernel every time; it would just be way, way, way too slow.

What about reclaiming memory? It's pretty much the same mechanism here. Once I've given a process permission to access certain pieces of memory, taking them back requires that I basically tell the hardware that the process isn't allowed to use them anymore. And I don't really have any mechanism yet for doing that.

All right. So, now that I've managed to convince you that this seemingly obviously bad idea is a really bad idea, let's talk a little bit about what our goals are for multiplexing memory in the first place. First of all, I need to be able to allocate memory. I'm the multiplexer; I need to be able to grant access to memory as a resource. And I need to do that at essentially two times. One is when the process starts. Remember, we talked about the ELF file, et cetera: the process essentially gives the kernel a blueprint for how it wants memory to look when it starts, and I need to be able to set that up. And then, dynamically, as processes evolve, I may need to be able to allocate more memory. Does anyone know when this happens? There are primarily two different ways. How do you allocate memory in C? Malloc and free, right? Malloc and free allocate memory from the kernel by asking the kernel to increase the size of what's called the process heap. We will definitely come back to this, and there's a small sketch of it just below. What's the other case, where I allocate memory dynamically in a little more of an implicit way? The stack, right? Imagine you write a recursive function. As your function recurses and the stack gets larger and larger, the kernel is essentially giving you more memory as you need it. When you run off the bottom of your stack, the kernel says, here are some more stack pages.

The second requirement is that I need to be able to enforce my allocations: some way of making sure that processes aren't using memory that's been allocated to another process, or, even worse, memory that's being used by the kernel or some other important task. And I need to be able to reclaim memory. So this is very similar to CPU multiplexing, except for the reclaim part, because reclaim you can really think of more as adjusting the amount of memory that I've given a process, after the fact.
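On the malloc point from a moment ago, here is a minimal sketch of the heap-growing dance on a Unix-like system, using sbrk(0) to peek at the current end of the heap (the so-called program break). One hedge: a modern malloc may satisfy a request from memory it already owns, or use mmap for large requests, so the break does not necessarily move on any particular call:

```c
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    void *before = sbrk(0);         /* current top of the heap            */
    void *p      = malloc(1 << 20); /* ask the allocator for 1 MiB        */
    void *after  = sbrk(0);         /* did the allocator grow the heap?   */

    printf("break before malloc: %p\n", before);
    printf("break after  malloc: %p\n", after);
    /* If 'after' is higher than 'before', malloc asked the kernel to
     * extend the heap. If they're equal, the request was satisfied from
     * memory the allocator already owned, or via mmap behind the scenes. */
    free(p);
    return 0;
}
```

Either way, the point is the same: dynamic allocation ultimately bottoms out in a request to the kernel for more memory in the process's address space.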
That reclaim piece is really a function of the fact that this is spatial multiplexing. With the CPU, I gave a thread the entire core until it was finished, and then I took it all away. With memory, I want to be able to trim and grow dynamically: I can give you a little more memory, I can take a little memory away. And finally, I need to be able to stop a process from using memory that was allocated to it, after I've told it that it can't use that memory anymore.

So again, in comparison to the CPU: what's grant? It's schedule. What about enforce; how do I enforce that a process uses only the amount of CPU that I've allocated to it? By preemption, through the timer interrupt. What about reclaim? As I said, this one is new, something a little bit different; call it adjust, or trim. And what about revoke; how do I revoke the CPU from a thread? I de-schedule it, via a context switch.

All right. So now that we've addressed the goals, let me talk about the abstraction. With the CPU, we talked about the thread abstraction. The thread abstraction was the thing that allowed us to separate mechanism from policy; it's the thing that allowed us to think about what it meant to grant and revoke access to the CPU. In memory, we have a similarly powerful abstraction, and that abstraction is called an address space. Today we're just going to talk about what it is and why it's a good idea; starting on Wednesday, I'm going to tell you how.

Remember, with the CPU, part of our goal was to give every thread the illusion that it had its own CPU. The way we did this is that we took away the CPU and let somebody else run, but when we gave it back, we made sure that everything looked exactly as the thread left it. With memory, we do something similar: we give each process a unified, identical view of memory. Here are the properties of that view; this is the illusion. The first illusion is that every process has an identical view of memory: memory starts at 0x0 and goes all the way to 0xFFFFFFFF, potentially. This view makes it look like memory is plentiful: look, I've got a four-gigabyte address range, potentially larger on newer systems that have wider addresses, but on a 32-bit system, four gigabytes of memory. And it's all mine; this is all for me. It looks contiguous, beautifully contiguous: addresses that are next to each other behave as if they're next to each other, so I can lay out my huge, massive array anywhere I want inside this address space and it looks contiguous. It's uniform: every time I'm loaded, my address space looks the same. It always starts at 0x0 and always runs to 0xFFFFFFFF, so I can essentially say that I'm always going to put my code at a particular point in memory, and that will work every time I run. And finally, it's mine; it's private. It's shared among any threads that I might create within my own address space, but it is typically not shared with any other process.
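You can actually poke at this illusion from an ordinary C program by printing the addresses of things that live in different regions. A small sketch; note that on a modern OS, address space layout randomization shifts the exact numbers from run to run, but the relative order (code low, static data above it, heap in the middle, stack way up high) typically still shows through:

```c
#include <stdio.h>
#include <stdlib.h>

int global_var;                        /* static data, near the bottom   */

int main(void)
{
    int   stack_var;                   /* on the stack, near the top     */
    void *heap_var = malloc(16);       /* on the heap, in the middle     */

    printf("code   (main):       %p\n", (void *)main);
    printf("static (global_var): %p\n", (void *)&global_var);
    printf("heap   (malloc):     %p\n", heap_var);
    printf("stack  (local):      %p\n", (void *)&stack_var);

    free(heap_var);
    return 0;
}
```

And crucially, two copies of this program running at the same time can print the very same addresses, because each one is talking about its own private address space, not about physical memory.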
Now, about it being private: there are some mechanisms that I can use to enable sharing of memory between processes, but those are elective. It's never, oops, me and another process happened to be sharing memory we didn't know about. You have to set up shared memory very explicitly. Any questions about this illusion? This is essentially what we are going to try to provide; this is the abstraction that the operating system is committed to creating.

All right. Now, as you can imagine, the uniformity of the address space makes layout really easy, at least certain parts of layout; I'm saving a little bit of the mess for later. What I can do is say that I always put my code and static variables at some address. This is just convention. Maybe my heap always starts at a particular point in my address space; this is where I dynamically allocate memory from, and when I ask the kernel for more heap, the heap grows up, toward the top of my address space. And then the stack for my first thread always starts at the top of my address space and grows down: I push stuff onto my stack, meaning that it grows toward lower addresses. This is the standard model of how processes lay out their address space. Remember, we talked a little bit about how this worked when we talked about processes and I showed you some pmap output; maybe we'll look at that again on Wednesday. This is all in the ELF file, so the ELF file determines how this stuff gets laid out.

Now, here's one interesting question, because some of these conventions exist to help programmers. Why not load my code at address zero? In this model, why don't I put my code right there? It's the beginning of the address space, there's nothing there, I might as well put my code flush against the bottom. Why not do that? What's that, kernel space? No, it is not kernel space; this is all process address space. No, the programmer knows where the code is either way. Yeah, John. Exactly: one of the most common programmer errors, null pointer exceptions. If I dereference a null pointer, it's going to land in here. Depending on what the null pointer is and how big my offset is, it's going to land somewhere at the very, very beginning of the address space, and if I make sure that that memory is not mapped: segmentation fault. How many people have ever caused a segmentation fault in a piece of C code? That's what it is. Maybe you guys have wondered about what that was. We'll talk about segmentation on Wednesday, but a segmentation fault means that you accessed a segment that you are not permitted to access, and the segment in question is right down here. (There's a tiny example below.) And if you put the code there, you still might get a segmentation fault, because usually the code is marked read-only. But it's possible on old systems that the code was marked read-write, and what you would have gotten, instead of the segmentation fault, is that you would have just written bizarre garbage over a portion of your code. And then your program might have kept running for a period of time until something weird happened.
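For the record, the canonical version of that error is just this: a tiny sketch that dies with SIGSEGV precisely because nothing is mapped at the bottom of the address space:

```c
#include <stddef.h>

int main(void)
{
    int *p = NULL;   /* NULL is address 0, in the deliberately unmapped  */
                     /* region at the bottom of the address space        */
    *p = 42;         /* write to an unmapped page: segmentation fault    */
    return 0;
}
```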
The something weird being that some part of your code eventually tries to execute the garbage data that you wrote down there at the beginning of the address space. So that's exactly why that's done.

What about the stack and the heap? Let's go back to this: the heap starts in the middle, kind of, and grows up; the stack starts at the top and grows down. Is this a problem? Are these two things going to collide? What's that? They could, but when would they collide? Let me ask you another question: how much memory would I have to allocate to this process before those two things collide? A lot, right? If the heap starts around here, I'm not going to do the math in my head, but assuming an address space that's this wide, there's actually a lot of room in here. (On MIPS, the address spaces are smaller; we'll talk a little more specifically about how MIPS does this.) This is not drawn to scale; at this scale, one heap page would be infinitesimally small. This gap is over two gigabytes of memory, and on newer machines the address spaces can be even bigger. The point is that I don't want to have this problem, and there are very, very few applications that ever need this. Sometimes big database servers will actually allocate enough dynamic memory, or map enough files into their address space, to cause problems here. But usually, for most normal processes, this is very, very unlikely. It would mean that my heap would have to be huge, because a process that tried to access that much stack would usually make the kernel decide something was wrong. Like, for example, you wrote a recursive function without a base case, yeah? Yeah, same thing, exactly. In that case, what would happen is that at some point you're out of stack, and the kernel says: you know what, I said it could use 10 megs of stack and it's asking for 10.1, so something's wrong. And normally that limit is much smaller, certainly not large enough to fill that entire area.

So again, this is also really nice, because going back to our relocation problem, now I know exactly where these things are, and they're in the same place every time. Data is always loaded at exactly the same address, and I can just use those addresses in my code. Is this strictly true? Oh, gosh, I gave away the answer: it's clearly not strictly true. Does anyone know why not? What pieces of code violate this assumption? When do I actually need to do relocation, moving things around dynamically depending on where the code was loaded? Anybody know when this happens? Dynamically loaded libraries have to be relocated, because those are pieces of code that can be put in any part of a process's address space: different parts for different processes, and maybe different parts of the same process depending on how it runs. And in that case, there's this whole mess for figuring out exactly how to remap all the symbols. That's not something we're going to talk about.

OK, so before we finish, let's talk quickly about: this sounds like a fantastic idea, right? What are the challenges here? Yeah, yeah. Anyone ever read those Newegg reviews where they make you write cons for the product, and some people write, it wasn't free, you know? Those are common.
They really liked the thing, and the con they come up with is: that was what was wrong with it, I had to pay money. Well, the con of address spaces is: how do we implement this model? This is a great idea. This would be awesome if we could get it to work: every process gets the same view of memory, a huge address space, I can put things exactly where I want them, it's contiguous, it looks nice. But implementing it is essentially what's going to consume us.

Now, what do we have to be able to do here? Can anyone point out one thing I need to be able to do to implement this? Whatever addresses processes are using, they are not what we were talking about at the beginning of this class, because process A's 0x10000 is not the same as process B's 0x10000. If they were, I would have problems. So, yes: I've broken this direct mapping down to memory, and I need to do some sort of address translation. These addresses now have a lot more meaning, and we'll talk a lot about that on Wednesday. I also need to make sure that I can protect processes from each other. These address spaces are designed to be private, so how do I implement that protection? That's my second challenge. And one more thing: remember, I've given processes this illusion that they have these massive address spaces, and what I'm doing is exacerbating the problem where I'm going to run out of memory. I'm essentially saying, hey, process, you've got four gigs of memory, go to town. I'm not really encouraging them to be good consumers of memory; I'm kind of egging them on: hey, I'll never fail an allocation, just keep asking for more. So I need to do something to figure out how to handle this problem. And essentially, this is what we're going to talk about for the next couple of weeks.

So, any questions about this stuff before we're done? All right, we'll do review on Wednesday. Next time, we're going to talk about what these brave new addresses that we've created are, all the fun stuff they're going to do, and essentially how we translate them, et cetera, et cetera. So I'll see you on Wednesday.