We are going to begin the second unit in the class, which is, I have to say, my favorite. It's on memory management. So we've talked about how we abstract the CPU. Now we're going to talk about how we abstract memory. In many ways, this is going to feel like we're reusing some of the same ideas we used when we learned how to multiplex and abstract the CPU. But in other ways, memory is quite different, and there are some really nice, very beautiful ideas that emerge when we start talking about how to handle memory as well. So this is fun. This will take us to right about where we were last year: we'll get about halfway through this before spring break, then when we come back there'll be the midterm, and then we'll need maybe a week to finish up before we go on to talk about disks and file systems.

So all the grading is done for the first three assignments, really, because what's left is just auto-graded stuff for assignment two. I don't think any of the TAs are here, but let's give them a hand. You guys are like, I'm applauding for the people who didn't give me enough points on that one code reading question. But it was a lot of grading to do, and they did it. So now you should have marks on everything up through assignment two. Yeah? Do we know the average score? No. But the website does, and I can figure that out if you're curious. I'll look that up later; I'm sort of curious to find out myself. Of course, at this point I don't know what it is percentage-wise, but I suspect that at least 60% of the points for the assignments are still on the table, maybe closer to 50%, because you have the assignment two implementation to do and all of assignment three. So there's still a lot left to go in terms of points. So get going on the assignment two implementation. I think you guys are doing way better than last year's class in terms of getting started and working, and I think that'll make you happy later. Maybe not now, but especially think about it: it's going to start getting warmer outside. I promise. It actually does happen here someday. The days are going to be getting longer, the snow is going to melt, and the last thing you want to be doing is sitting inside hacking on an operating system kernel. So the more you get done now, the more time you can spend outside hanging out, enjoying the two weeks of summer we get after summer starts in mid-July.

So, I sort of challenged people last time, and it turned out that not only one group but two groups have completed the assignment two implementation. So here's to Kartik, John Arthanaan, and Anand Sankar. Are you guys here? There you go. Stand up. Congratulations. These guys are done. And also Scott Loffer and Libbyn. You guys here? Apparently they don't come to class; they just do the assignments. All right, so congratulations to these groups. They're done. Now you know who to ask for help for the next week and a half, but please finish up. And look, we'll be done with assignment two, you guys go on spring break, and it's all going to work out very well. Very happy with the schedule this year.

OK, so I just want to briefly do a little review, although we're not going to get deep into it. Remember, on Friday we talked about Linux scheduling. Overall, this is also nice material to keep in your head for the midterm. We talked about different ways to evaluate schedulers.
One was how well they met deadlines, various types of deadlines, whether those deadlines were due to interactive performance or the regular demands of certain long-running tasks. Good schedulers put all of the resources of the system to work, and efficiently allocating resources is a proxy for getting the things you ask the computer to do done faster. So that's a good thing. And finally, the performance of the scheduling algorithm itself: how long does it take to make a decision? Because that time is pure overhead. And this is something that we'll come back to when we get to this part of our discussion in memory. Because memory is very similar to the CPU in that, towards the end of the unit, we'll talk about algorithms that are used to make decisions that have impacts on memory management. And there are a lot of parallel ideas between how we evaluate CPU schedulers and how we evaluate page replacement algorithms.

We talked about the fact that your time is more valuable than your computer's. So deadlines win, particularly maintaining a system that's interactive and that you can continue to work on. There were two scheduling algorithms that we discussed that used no information, that didn't try at all to make good decisions. What were those? One of them was random. The other was round-robin. And these don't use any information about threads, unless you count the fact that the round-robin algorithm does, in fact, have a queue. We talked about some scheduling algorithms that would try to use information about the future. And we thought, if they could, if they were an oracle algorithm, they'd like to know things like: how long will this thread use the CPU the next time it runs? Will it block, or will it yield and continue to be runnable so that I have to reschedule it? And how long will it block once it does block? These are inputs that I'd like to have. Now, obviously, I can't predict the future. Again, if this is not obvious to you, there are probably different courses for you to take at UB. But what do I do instead? Instead of predicting the future, I use the past to predict the future. And that's something we'll come back to with memory, because the memory management algorithms end up doing the very same thing.

One example of a scheduler that we talked about that does this, and there are a couple now, is multi-level feedback queues. It's kind of interesting: the rotating staircase does not do this, does it? It doesn't really use any historical information. MLFQ will move threads between queues in an attempt to reflect how interactive they are. The rotating staircase doesn't do any of that. So it's kind of interesting. It's actually even simpler than MLFQ; it doesn't use any information about the past.

So, just as a refresher, these two guys were involved in our story about Linux scheduling. Who is Ingo? What does he do? Yeah, so he maintains the Linux scheduling subsystem. And you guys hopefully remember that Con Kolivas was the Australian anesthetist who hacks on Linux in his spare time and has made multiple contributions to the development of schedulers for interactive systems.

So, a couple of questions about RSDL. Let's say I have an RSDL scheduler with 10 run levels and a 5 millisecond quantum. If I start in the highest priority level, zero (just pretend that's the highest; clearly, if zero is the highest, you can guess which is the lowest), what levels will this thread potentially have the chance to run in? All of them.
Don't you like how I very nicely wrote out zero through nine with commas in between them? I could have probably just used a dash, but yes, all of the levels. Now, again, it's not guaranteed to have a chance to run in all these levels. But if it continues to exhaust its quantum in a particular level, it will be moved down, and it will have a chance to run at that level after the threads that started at that level have had their chance to run. Same thing for a thread starting at level five: I can run at my level and all the levels below me. Again, I have the chance to run at these levels; this is not a guarantee that I will.

So let's imagine that at the beginning of the epoch, I have four threads that are ready to run at levels zero, three, seven, and nine. What's the longest amount of time that the thread at level nine will have to wait before it is scheduled? This question is even easier to answer than it used to be, because I've now got the answer correct, as opposed to last year when it was wrong. Anyone want to venture a guess? It's very simple; it doesn't require any complex math. Yeah? 20? OK, that's close. 15. Because remember, despite the fact that those threads can run at any level, I've initialized the level quantum to reflect where each thread started. And maybe this should be part of the question, actually. So initially, level zero would have a quantum of five, level three would have a quantum of five, and level seven would have a quantum of five. So regardless of the behavior of those threads, whether they block, whether they yield, whether they fall down the staircase and run in other levels, it doesn't matter: when a level's quantum is expired, everything moves down. So the most amount of time the thread in the bottom queue would have to wait would be 15 milliseconds.

Questions about RSDL or about scheduling in general? Yes? Yeah, yeah, yeah. You have to say "epic," too, by the way. I used to say "epoch" the other way. Now I'm not allowed to say it that way, and nobody else can say it that way either. Someone finally convinced me that the word I want to say that way is pronounced "epic." Epic to me is like epic fail, not the other pronunciation. Yeah, so I probably should have gone through this last time; it came up as a question after class. At the end of the epoch, threads are reset into their starting positions, all the quantums are reset, and I start over again. So if you remember our example in RSDL, what I would do at the end is go all the way back to the beginning of the example and start over. Now, obviously, the next epoch isn't going to be the same; threads are going to do different things. But that's essentially the way I reset the epochs: I put every thread at the priority level that it starts at, give them all a quantum in that level, set the per-level quantums appropriately, and then restart. That's a good question. It's a very simple thing to do, too. Any other questions about the scheduler? All right, sweet.

So as I mentioned, this is my favorite topic in operating systems. I remember studying this when I was in college. This is the moment where things sort of clicked for me, and I really thought, this is cool. And I really liked working on assignment three. I hope you guys will like working on assignment three. I know you're not there yet, but it's a lot of fun. This is a neat, neat topic. There are some really, really cool things that happen, and there are some really nice ideas embedded in here, too, that I think you can take and use in other places. I don't expect to ever look like that.
Maybe one day, though, I don't know. I can't grow up here. So the first thing to notice is that there are a lot of similarities to CPU scheduling. But one of the most significant differences is how we are going to multiplex the resource. With the CPU, we pursued time multiplexing. With memory, we're going to have to do spatial multiplexing. And the difference is, in temporal or time multiplexing, the way that I share the resource is by dividing time up into little pieces and then handing out those pieces temporarily: dividing time up into quantums and reallocating the resource periodically. And with the CPU, the goal was to do it quickly enough that it looked like the resource was being simultaneously used by multiple processes. So CPU scheduling on a single-core system is an example of something that is strictly temporal, or time, multiplexing. That's the only way I can share the resource: I have to stop and start threads, give them all a chance to make a little bit of progress, and then repeat that over and over again. Another example of this would be room scheduling. If you guys have booked a room somewhere, rooms are typically temporally scheduled. I allocate them for a block of time. You come in, you do what you need to do, you leave. Car share programs, same thing. Most programs like Zipcar, that's how they do things. You don't show up at the car and find that there are four other people there for your hour slot. It's like, I want to go to the mall. I want to go shopping. I need to pick up someone at the airport. That wouldn't necessarily work out very well.

So spatial multiplexing is this idea that I break up a resource into smaller pieces, and now I can allocate those pieces simultaneously. You might notice that CPU multiplexing on modern systems actually falls into this category as well, because I actually have more than one core. So I have this unit of a core that I can hand out, and things can actually run at the same time. Now, this is an example of something that's done at a very coarse granularity. Maybe if you're lucky, you might have a 32 or 64 core machine, so I have 64 different CPU resources I can hand out simultaneously. In comparison, with memory, if I have a machine with a couple of gigabytes of memory, and there are machines now shipping with quite a bit more than that, I can hand it out in much, much, much smaller pieces. And we will talk about this later. When we talk about memory, I'm not going to divide it up in units of bytes; that would be a little bit ridiculous and too small. But the unit of allocation for memory management is not much larger than that. It's typically four kilobytes. So I take my however many gigabytes of memory, I divide it up into these small pieces, and I can hand them out to multiple processes at once. That memory can essentially be allocated and multiplexed spatially rather than temporally. And of course, there's splitting up a cake. I don't let one person eat the cake for a little bit and then somebody else gets a chance to eat the cake. I mean, you could do that. That would be an interesting way of having your birthday party. Just make people line up and it's like, OK, you have five minutes with the cake; we'll see what's left when you're done. Instead, you divide it into pieces and everybody eats at the same time.
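To make the spatial-multiplexing arithmetic concrete, here is a minimal sketch in C. The 2 GB figure and the constant names are just assumptions for illustration; the 4 KB unit matches the page size mentioned above.

```c
#include <stdio.h>

/* Illustrative numbers only: a machine with 2 GB of physical memory,
 * handed out in 4 KB pieces ("frames"). */
#define PAGE_SIZE (4UL * 1024)
#define PHYS_MEM  (2UL * 1024 * 1024 * 1024)

int main(void) {
    unsigned long frames = PHYS_MEM / PAGE_SIZE;
    /* Roughly 524,288 separately allocatable pieces, versus the handful of
     * cores available when time-multiplexing the CPU. */
    printf("%lu frames of %lu bytes each\n", frames, PAGE_SIZE);
    return 0;
}
```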
So one thing I thought I would clarify this year that I haven't in the past about memory allocation: many of you guys are familiar with this because you've been taught, unfortunately sometimes at a really early age, about memory allocation when you learned how to program, and you've learned a little bit about where objects live. But the point is that memory allocation happens on most systems in a couple of different places. Processes are required to request memory dynamically from the kernel. So when you allocate an object in Java, or when you call malloc in your C program, or when you create an object in a language like Python, what's happening is that there is memory that the interpreter or the runtime or a library will ask for from the kernel. Those allocations are what we're going to talk about in this class. However, what happens afterwards is that there are typically libraries that take those bigger chunks of memory that the kernel is willing to hand out. Remember, I just said the unit of allocation for memory is maybe going to be 4 kilobytes, and it might be bigger than that; on some systems it's actually larger now as memory sizes have grown. I clearly don't need 4 kilobytes for my little Java object. So what I'm going to do is have a second library that takes those chunks of memory from the kernel and reuses them to satisfy smaller allocations. We typically refer to these as library allocators. Malloc is the canonical example. So if you've used malloc, malloc is not a function provided by the kernel. Malloc is provided as part of the C standard library. Now malloc does have to get memory from the kernel once in a while, but in general, most calls to malloc will not result in a system call requesting memory from the kernel. They'll be satisfied from within the memory that malloc has already requested. And again, what I think you guys are more familiar with as programmers is the second step, which is process-level memory allocation. And process-level memory allocation is really neat. It can be fun. It used to be part of assignment 3 to write malloc in user space. It's a fun thing to do, but it's not something that we talk about in great detail. Does this make sense? OK.

So before we talk about the primary abstraction that we're going to introduce to manage memory in the operating system, let's talk about some bad ways to do it. Some of you guys already know the punch line here, but just bear with me, because I want to show you why some things that you might not even have thought of don't work: ways to allocate memory on the system that are not going to work out. One way to do this is to take the physical memory on the system and just divide it up between processes. And you may think, isn't this what we're doing? And it is what we're doing. But this method gives each process the ability to use a physical address range directly. So what would happen here? Let's say I have some physical memory and I have a process that starts to run. Firefox, a little bit of a memory hog, requests a big chunk of memory from the kernel. The kernel gives it to it. Fine. We're doing fine at this point. Now let's say VirtualBox comes along. VirtualBox clearly needs a lot of memory to run. So it now requests a piece of memory from the kernel, and things are OK. Now my shell comes along and it gets a little bit of memory. But then I try to start up another program.
And now what's happened is I potentially run out of memory. I can't start this other program because there's not enough memory left on the system. And what you'll see is that by giving the applications direct access to physical hardware memory, I've tied the kernel's hands behind its back. There's very little that I can do at this point to address the situation. There are a lot of things I could think about doing. It's possible, for example, that Firefox doesn't need all that memory anymore, and maybe what I can do is take it back. It's possible that Firefox isn't even running anymore, so I should just take Firefox and get it the heck out of there. Make some room for the editor, for Inkscape, my SVG editor, which I'm trying to use. It's a terrible program. Sorry, edit that out. It's the only one that you can use to edit these things. So anyway, there are options. But because of how I've given the processes access to this resource, I can't do these things.

So one of the problems that this approach very quickly runs into is that I'm limited to the amount of physical memory that's on the machine. Once the amount of memory that processes have asked for exceeds that amount, I have no choice but to start failing allocations. So in that example, I just couldn't launch that fourth application. I would just have to say sorry. The call to exec or fork would just fail, and you'd tell the user, I'm sorry, you have to be happy with the applications you already have open. So it turns out, and maybe now that we all have 68 gajillion petabytes of memory in our computers this doesn't happen as much anymore, but it turns out most of the time the amount of memory that processes have asked for on your machine is larger than the amount of physical memory that's present. So if you added up all the memory that processes think they have access to, it's larger than the amount of memory that's there. That sounds like magic, and it is a little bit of magic. But that's what we want, because of some aspects of how processes use memory, which we'll cover in a minute.

So one of the problems is, what happens if a process requests a big chunk of memory and then doesn't use it? In the example I just gave, I'm forced to honor that request because I don't know if the process is going to use the memory. But once I've handed the process the memory, there's no way to get it back. So even if it never uses that memory, there's no way to say, hey, by the way, Firefox, you don't need that large of a buffer. He's not even going to use you for very long. He moved on; he uses Chrome now. He's just opening you this one time because he has to go to a web page that's broken. So there's no way to do that. There's no way to get memory back from processes. And this is a problem because, remember, the kernel doesn't trust processes. You may say, hey, I want some memory. The kernel says, OK, sure, here's some memory. But I don't want to actually have to give you that memory forever. I want to be able to see what you're doing with it. I want to be able to take it away later if a better process comes along that's more important to the user at that point in time. I don't want to end up in this little corner where I'm forced to be faithful to these allocations for all time. But now here's the question: before, what I was doing is just giving processes big contiguous chunks of memory, but maybe I could work around this problem by doing the following. So here's another problem.
Let's say that I close Firefox. Let's say that the system told me, when I tried to launch my next program, you can't launch that program, you're using up too much memory. So I say, OK, I'm going to free up some memory. I'm going to close Firefox. Fantastic. Now in theory, I have enough memory. You'll have to believe me that these two gray rectangles are large enough to fit the purple rectangle. I have enough memory, here and here, to run the new program. But because it's not contiguous, this creates other problems. So what I'd like to be able to do is split that allocation into two pieces. But if I'm giving programs low-level access to physical hardware memory, this creates all sorts of problems, and we'll talk about one of them in a minute.

The other problem it creates is, of course, what happens if a particular process tries to expand its allocation? So this is another problem with this approach. Like I said before, I can't take memory away from processes. So at this point, I have a running program. Now here's the thing. When a program is running along and it asks for memory, it's usually not going to handle no very well. You can say no. But you know what it's going to do? Crash. Do something buggy. Refuse to do anything. Freeze. Who knows? Something bad is going to happen. And then someone is going to complain about your operating system. I'm sure that Microsoft feels this pain. "My Windows was crashing all the time." You're like, what do you mean by that? "I clicked on the icon with the orange fox in it, and it kept crashing. So I called Microsoft for help with Firefox." Anyway, this is what happens. So I don't want to say no to processes in general. I want to say yes. I may play games behind their backs later, but I know that if I refuse their allocations, that's not going to lead to a good thing. They're not allocating just to find out whether I'll give them some memory right at that moment. They actually think they need the memory.

So if I give low-level access to the actual memory on the machine, the other thing that's going to start to happen is, if I want to do anything even somewhat smart, I'm going to have to start cutting that memory up into smaller pieces. And what a process is going to end up with are these chunks of discontiguous memory. So the process is going to think, OK, well, the kernel gave me a little chunk over there and a little chunk over there and a little chunk over there. And it's enough, but it's kind of weird. This makes things very problematic. So here's an example. I have a variable, and I just want to know, where is it? Where is this variable? You wrote this code. The compiler is in charge of turning it into machine code. And all you did, I mean, this is a reasonable thing to expect that you could do as a program: you allocate an array, and then you try to set one of the elements. But keep in mind, you wrote this code. What the compiler needs to know is: where is foo? Where is data? Where does that array live? You wrote this; the compiler needs to convert it into instructions that use actual addresses. So imagine this was how these things worked, and every time I ran, I got a different chunk of memory on the machine. Every time I run, data is in a different place. How is this even going to work? This is going to be very complicated. It would be better if every time I ran, data was in the same place. That would make my life a lot simpler.
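As a hedged illustration of the "where is data?" example above, here is a tiny C sketch. The names foo and data mirror the slide's example, and the comment about what the compiler emits is approximate rather than exact generated code.

```c
#include <stdio.h>

/* A global array the compiler and linker have to place somewhere. */
int data[100];

int foo(void) {
    /* This becomes, roughly, "store 5 at (address of data) + 3 * sizeof(int)".
     * That address gets baked into the program. If data landed at a different
     * location every run, the baked-in address would be wrong; a uniform
     * address space lets it stay the same every time. */
    data[3] = 5;
    return data[3];
}

int main(void) {
    printf("%d\n", foo());
    return 0;
}
```

Either way, the point is that the generated instructions assume data has a known, stable address, which only works if memory looks the same to the program every run.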
OK, so the other problem that this approach can have: fragmentation. Fragmentation is a problem that's going to be with us throughout this unit, but these sorts of approaches to directly allocating physical memory suffer from fragmentation problems that we have better solutions for. So in general, what do we mean by fragmentation? Memory fragmentation occurs whenever I cannot satisfy a request for memory despite the fact that I have enough unused memory available, whether that's on the machine, or in the system, or in my library, or wherever. So this is not good. It's hard enough to allocate resources on a system like this without having this problem. Imagine the kernel knows in theory it's got 78 bytes of memory available, but you ask for 45, and it says no. That's not what I want to have. So this is something that we're going to try to avoid.

We distinguish between two types of fragmentation because there are two places where this unused memory can be. So keep in mind, I have enough quote, unquote unused memory on the machine, and yet I failed your allocation. So where is the unused memory such that it's not helping me satisfy the allocation? There are two places. One is that it's inside existing allocations. So imagine every time you asked me for a byte of memory, I gave you four kilobytes of memory. Why not? Maybe you're going to do something with it, and I'm creating a little bit of extra space around your allocation. But then what happens over time is you've only been allocated eight bytes of memory, but it's taking up 32 kilobytes. And so there's lots of wasted space inside every one of those allocations that can potentially force me to fail an allocation down the road. The other place that this unused memory can be is in between existing allocations. This is probably what you guys think about when you think of fragmentation: I've got a big chunk of memory, I've got little holes in it, and if I added up all the holes together, I'd have a lot of memory, but I can't do that. The reason I can't do that is that a lot of times there are certain things that need to be contiguous. There are certain data structures, for example, like arrays, that really need to be contiguous. There is no way for C to understand how to split an array into little pieces so that it can squeeze them in between the other allocations that already exist. This block of data had better be contiguous. And part of the reason is that this is how the C compiler generates instructions to access it. It knows where the beginning is, and if it wants to find a certain element in the array, it just adds to the beginning. So that implies that the whole array is located somewhere in memory and is contiguous. Or at least looks contiguous.

So: I'm limited to the amount of physical memory on the machine. I have to hand out potentially discontiguous allocations, which can cause a lot of problems. And there's this potential for fragmentation to ruin my day and make me make less good use of the memory I already have. So here's another thing. Can I enforce my allocations? Can I ensure that a process is not using memory that's been allocated to another process? If you go all the way back to what the kernel's responsibilities were, this is something we need to be able to do. In terms of isolating processes from each other, this is, other than file-system-level stuff, probably the big one: if you go back 20 or 25 years and look at why things were crashing on your favorite operating system, this is why.
And unfortunately, for Unix-like operating systems, you have to go back maybe 40 or 50 years. But the point is, stuff crashed in the past because kernels and operating systems did not isolate one process's memory from another process's memory. Because if I start using your memory, accidentally or on purpose, you can almost guarantee that bad things will happen. And some of you guys will do this for assignment three. You'll write code that will just randomly overwrite little bits of your kernel's memory, and then just see how long that runs. Because what will happen is it'll run for a while, and then something random will happen. Because you've just randomly flipped a bit in the kernel, and you can expect that that bit was important to somebody. And nobody will find out what you did, because the kernel will just try to use that bit in some weird way, something terrible will happen, and you'll never understand what went wrong. You should give up right now. Don't do assignment three. Just kidding. Hopefully you will not have this problem. Now, one way to do this would be to actually check every single hardware memory access. So every time a process used memory, I'd have to check what memory it was using and make sure that it was memory that I had allocated to it. Turns out, as you might guess, this is incredibly slow. Imagine that every time a process tried to use memory, it had to trap into the kernel. You'd still be waiting for your laptop to boot from this morning. It's super, super slow. And finally, I've hinted at this several times: I can't easily take memory away from processes.

So having gone through one, one and a half failed designs, let's try to actually formulate some design requirements about what we want to be able to accomplish. The first thing I want to be able to accomplish with memory management is I want to be able to give out memory. That's clearly important. I want to be able to grant memory to a process on request. And there are two times when processes request memory. One time is when they're launched, although startup isn't really the right word here. So that's a great question: when, in the course of a process's life cycle, might it allocate memory? There are a couple of times. What's one system call that definitely has to allocate memory? Yeah. Open doesn't technically have to. Fork: I made a copy of myself, and that copy of me takes up space. What's another system call that might allocate memory? Another one involved in process creation. Exec. So imagine I call fork. Now with fork, I can have some estimate of how much memory it's going to allocate, because it's about the same as what the parent has allocated, because I'm making a copy. With exec, I've asked the operating system to load a new executable from a file. That new executable can have a lot of contents in it. For example, if your shell forks and then loads, I don't know, a Python interpreter, that is probably, well, I shouldn't say much, much bigger, but it might be bigger than the shell clone that it created. So imagine I have to allocate some memory for fork, but now if I run a larger executable, I have to allocate some memory again. The other time memory gets allocated is dynamically. So this is what you guys are more used to as programmers, when you create objects, call malloc, things like that. So there's some memory that's allocated implicitly when processes call fork, or when they load a new executable from a file using exec.
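Here is a minimal sketch of the fork-then-exec pattern just described, assuming a standard POSIX environment rather than OS/161; the program being launched (/bin/ls) is an arbitrary example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <stdio.h>

int main(void) {
    pid_t pid = fork();        /* the kernel must allocate memory for a copy of this process */
    if (pid == 0) {
        /* Child: replace the copy with a new executable. The kernel now has to
         * allocate whatever the new program's ELF blueprint asks for. */
        execl("/bin/ls", "ls", (char *)NULL);
        perror("execl");       /* only reached if exec failed */
        _exit(1);
    } else if (pid > 0) {
        waitpid(pid, NULL, 0); /* parent waits for the child */
    } else {
        perror("fork");
    }
    return 0;
}
```

In OS/161 the details differ (execv rather than execl, for example), but the two allocation moments are the same: fork duplicates the parent's memory, and exec builds whatever the new executable's blueprint asks for.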
And then there's also, essentially, the OS equivalent of malloc, which we'll get to in a couple of lectures, that allocates memory on demand. The second thing I want to be able to accomplish is to enforce my allocations. This is critical to safely multiplexing the resource. I need to make sure that process A is not using process B's memory, pretty straightforward. I also want to be able to revoke memory, which is something that doesn't really have an analog with the CPU, and I want to be able to reclaim memory. And I want to distinguish between these last two. Revoke is taking memory away permanently, in a way that the process understands. I should be able to tell a process that it can't use that memory anymore, or make sure that if it tries to, something bad happens, like it being killed. Reclaiming memory is different. Reclaiming memory is borrowing the memory for a period of time. I'm going to borrow it from you, and then when you need it back, I'm going to give it to you, and I'm going to make sure that it looks the same way. But you aren't necessarily going to know exactly what happened to that memory while I was borrowing it. And I'm not even going to tell you that I borrowed it. Now imagine you have some nice computer doodad. Maybe your roommate has been borrowing it. Do you know? Do you keep your eyes on it all the time? Maybe they've been taking it, using it, and then putting it back in the exact same spot. So what you do is you put a little hair, a little feather, on it, just to see if that's happening. So again, as long as I don't notice that things are gone, the kernel can do other things with them and make better use of the resource. And this is important.

So let's compare this with the CPU, because these mechanisms, or these goals, map pretty well onto what we already understand. Grant is scheduling a thread: this is how I give a thread access to the resource. The way that I enforce my allocations is via timer interrupts. Remember, I don't trust that the process is going to voluntarily yield at the exact moment I want it to. Instead, I set up the system to ensure that I get control back and I can make a decision about whether or not the thread continues to run. Reclaim and revoke: I think, depending on how you interpret these, one of them is new, one of them doesn't really have an analog, and the other one is the equivalent of de-scheduling. Because you can imagine that de-scheduling is also a little bit like reclaiming the resource. I took the CPU away from you. You didn't even know. You were in the middle of doing something. The timer fires, I grab the CPU back, I give it to somebody else. When you get to run again, I make sure it looks exactly the same. So I've borrowed the CPU from you and you never knew that it happened.

So the way that we're going to solve this problem, and the core abstraction that operating systems provide to multiplex memory, is something called an address space. The address space is the abstraction. And this is kind of a fun thing to do, because what we're going to do is design what we want, and then we're going to deal with all the consequences of trying to get it to work. So what do we actually want? We want to provide every process with an identical view of memory. And we want it to look plentiful, contiguous, uniform, and private. By plentiful, we want to make it look as if the process has access to way more memory than is physically present on the machine. With the footnote that this was true until recently.
Because recently you guys have just kept putting more and more and more memory into your machines. Nobody could stop you. And at some point this started to break down. But you can imagine, five years ago, if I gave a process a two gigabyte address space, that was way bigger than the amount of memory I had on the machine, because maybe it was only one gigabyte or half a gigabyte. Now, again, now that you guys just keep loading up your machines with memory, we've had to do some new things to get this to continue to work. But for a long time, this was certainly true. On the MIPS, we're talking about a machine that has at most 16 megabytes of memory, and yet you're supporting a two gigabyte virtual address space, so it's orders of magnitude larger than the amount of memory that's on the machine. In addition, that memory is going to look contiguous. This is also part of it being uniform: every time I start up the process, the address space that I give it looks identical. And this, as we'll show in a minute, simplifies layout quite a bit. Contiguous, again, means it's all laid out in a row. There are no gaps, there's no part of the address space that I'm not allowed to use, and that's the same every time. Uniform is essentially the same thing as identical: it's the same every time. And private: this is mine. Without my permission, or without using various IPC mechanisms, only the threads that are started in my process have access to the information that is stored in this address space. Let me show my little cartoons about this. I get four GB, or two GB. It's all mine. It looks the same every time. There are no holes. And it's mine; it is private to my process.

So I just want to point out how this solves some of the problems that we noticed before. First of all, the uniformity of the address space makes process layout very simple. Because when you compile your program, the process can come up with a way that it wants to organize things, and it can use that every time. No matter what machine it's running on, whether the machine has more memory or less memory, whether it's run many times or only once, it doesn't matter. I'm giving every process the same abstraction that allows it to view memory. And so it can do things like: I'm always going to load my code here. The heap, where I allocate dynamic things, always starts at a particular location, and the library allocator lets it grow upward. My stack always starts at the very top of my address space, say, and grows downward. So this allows the process to just make these assumptions, and for those assumptions to hold on a variety of different machines.

So here's a little example of how this would work. Remember, in here, this is a four GB address space. On MIPS, we're going to give processes two GB address spaces, because there's some memory that's reserved for the kernel to use. But in general, this is how things look from the perspective of the program. And again, it can do this every single time. Every time it gets loaded, it can put things in the exact same place. And it was funny: I saw this slide, and I was like, did we talk about ELF? We did talk about ELF. It was on the video you guys watched, but I wasn't here. So I'm not sure I remember what ELF is, but you guys are supposed to. So exec takes a blueprint of how the process wants the system to look, how it wants its address space to look. That's what's inside the ELF file. If you guys have looked a little bit at your source code, you can look at the load_elf function.
You can pretty much figure out what it's doing. So you can think of binary files on any system as a data structure that maps the contents of the file into the address space. It's not a bad way to think about it as a blueprint for a house. It tells the kernel exactly how to construct an address space for the program before it begins to execute. Most of this involves content: the ELF file has a bunch of stuff in it, like code and initialization constants for variables and things like that. But the ELF file also contains a lot of instructions about where to put stuff. Now, some of this is the result of convention. So, sadly, I gave away the answer. Most of the time, when you guys link and compile your programs, it turns out that there is no code located at address 0. There's nothing special about address 0. Remember, this is an abstraction; these addresses aren't even real. So there's nothing special about 0. 0 is just another address. But why don't I want to put code at 0? Because null pointer exceptions are one of the most common programming mistakes. So what happens is, if I ensure that 0 is not a valid address in my address space, it allows the system to help me catch my own mistakes. Because what will happen is, if I try to dereference a null pointer, that address will not be in my address space, and the program will fail. So you guys have seen this before. This is the patented segmentation fault. And I know I haven't totally explained this to you yet, because you still don't understand why it says "segmentation fault." But this is what happens, typically, with null pointer exceptions.

Usually, actually, I load the code, or whatever's at the bottom of the address space, quite a bit higher than 0. And the reason for that is to catch offsets into null structures. So if I have a struct foo, let's say I did something like this. This doesn't have to be pseudocode; I actually remember reading somebody's assignment two once and seeing something almost exactly like this. Unsurprisingly, this code was panicking. So literally, they were initializing it to null and then dereferencing it maybe a line later. There might have been a comment in between the two. So I create the structure, I don't initialize it to anything, and then I try to set a member of it. What address will this try to dereference? It's actually not 0. It depends on where bar is in the structure. If bar is the first member of the structure, then it'll be 0. But if I have some other variables up there, it might be 6, or 8, or 20. What the compiler is doing is it will actually take whatever the value of the pointer is, and it'll add enough to get past the other members in the structure. So that's why I don't just reserve 0. I reserve this big chunk starting at 0. That way, even if I have a big structure whose pointer I haven't initialized properly, so it's null, and I try to dereference a member of it, that access will still land in the area that I've reserved in my address space for catching null pointer exceptions.

By the way, I usually try not to talk about assignment stuff in class, but this is one of my favorite attempts at doing assignment two right that is also fatally flawed. Well, not fatally; actually, it will pass the tests. Let's just put it this way. We see this frequently. This is your sys_open. You get a path name. It's a userptr_t. And the first thing people do is they say, I'm just going to check it for null. Why not? It's a pointer that came in from user space.
It could be bad, right? So why not do this? There are several reasons not to. I won't ask how many people have code that looks like this in their assignment two source tree right now. Why not do this? It seems good: it could be null, and I want to check for null. I mean, I am not trying to encourage you guys not to check for null, by the way. So there are two problems. First of all, 0x0 can be a valid address. If I'm a program and I decide to set up my address space so that I ask the kernel to map 0x0, that could be a valid address. It's not usually a valid address, because by convention I reserve that space at the bottom of the address space for catching null pointer exceptions, but there's no requirement that I do so. But this is my favorite reason: there are billions of ways that this pointer could be messed up. No matter where it is in the process's address space, this could be a bad pointer. And you have now eliminated one of them, right? It's not worth it. Just pass it to copyin and be done. Don't check it. It's like, OK, I checked one off my list; I only have 2 to the 31 minus 1 different ways that this address can be wrong left to examine. So don't do it. OK.

So, by convention, the stack, which threads use to store private state, starts at the top of the address space and grows downward. If I have multiple stacks, I put them at various points near the top of the address space, I leave some space between them so they can grow, and I let them grow down. On the other hand, the heap that's used by malloc typically starts at some point about in the middle of the address space and grows toward the top. So this seems like a problem, right? I have one thing growing up, I have this other thing growing down, and both of these are dynamically allocated, so they're growing and they're growing. Can these ever collide? No, probably not. I mean, this is theoretically possible, but in order for this to happen, you'd have to have either an enormous stack, which would probably indicate some sort of bad recursion bug that would have crashed your program a while ago, because most sane operating systems have a limit on how large the stack can get, after which point they just assume that the user has done something wrong. Alternatively, that would also mean that your heap would have to be huge, so you would have to have allocated an enormous amount of memory for dynamic content.

So now that we have this nice address space model, it does allow us to solve this problem. At least the program knows where its own code and data are, and it can load them in the same place every time. However, it turns out that when you dynamically load libraries, you still have this relocation problem. If you go and look up relocation as part of linking, you will find that there's lots and lots of information about how this is done. But this is for dynamically loaded libraries. And the reason, without getting into the details, is that a dynamically loaded library can be put in a bunch of places in the address space, anywhere, really. So when I load a dynamically loaded library, I say, I want it to start right here. So those libraries have to be prepared to be put anywhere, and in order to do that, there's a bunch of extra information that you need to include in them. So: address spaces sound awesome. They sound like a great idea. What is the catch?
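Before we get to the catch, here is a hedged sketch of the struct-offset point from a few minutes ago; the struct and member names are invented for illustration.

```c
#include <stddef.h>
#include <stdio.h>

/* Invented struct for illustration; bar is deliberately not the first member. */
struct foo {
    int a;      /* offset 0 */
    int b;
    long c[4];  /* pushes later members further from the start */
    int bar;    /* some small nonzero offset */
};

int main(void) {
    struct foo *f = NULL;   /* the forgotten initialization */
    printf("f->bar would touch address %zu\n", offsetof(struct foo, bar));
    /* f->bar = 1; */       /* uncommenting this dereferences NULL plus that offset:
                             * a small nonzero address, which still faults as long as
                             * the whole low region of the address space is unmapped */
    (void)f;
    return 0;
}
```

The printed offset is how far above 0 the faulting access would land, which is why reserving a generous region at the bottom of the address space, not just address 0 itself, catches these mistakes.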
So I don't know how many people have ever seen the reviews on Newegg, where I think they must force you to write something in the cons section for every product, so for a product they really like people will say things like "doesn't cook breakfast" or "wasn't free." But this sounds like a great idea. The problem is this: can we actually implement this abstraction? So what do we need to do? The first thing that's sort of obvious when you start to think about it is that these addresses that we're handing out don't seem like addresses anymore, not the kind that we're used to. Because, let's say I have a process that calls fork. We know that every address in that new child's address space is the same as every address in the parent's address space, and yet we also know they have to point to different memory. So somehow we took this idea of a memory address, which made a lot of sense, and now it doesn't make so much sense anymore, because there could be multiple 0x10000s: one in process A, one in process B, and those have to refer to different memory. So clearly we're coupling the address now to the process that's using it. This is important to remember. I also need to do something about protection. I need to have a way of implementing this, and I haven't really addressed any of the core technical challenges here. And I also need to do something when one or several processes allocate more memory than is available on the entire machine. Remember, I'm giving every process this view; I'm allowing every process to pretend that it has two gigabytes of address space. So I'm also encouraging processes to spread out: I've got all this room in here, I'm going to put the code way over there and put the stack way over here. That sounds awesome. Not necessarily trivial to implement. So what we're going to do over the next couple of weeks is talk about how we implement this abstraction. This is the story of memory management. But this is the core idea. If you understand address spaces, you will understand what we're trying to accomplish. This is what we'll talk about next time, and I will see you guys on Wednesday.