Welcome back to the course content after a couple weeks off. Today, and a little bit of Wednesday, we'll finish up our discussion of virtual memory. Halfway through Wednesday, we'll start talking about disks. And then my plan for the leftover free time at the end of the semester is to spend some time looking at research papers in operating systems. We did this last year, and I think by the time we're done, you guys will have learned enough in this course to appreciate some recent advances in this area. There are some really neat papers out there to look at, including one that inspired one of the midterm questions and probably some others that will inspire exam questions. So that's my plan with the free space, just to let you know what's going on. I posted the exam itself and the solution set online; they should be under the exams tab. We're going to try to have them back to you next week. There's a large number of you and a small number of TAs, and as you know, the structure of the exam is pretty free-form, so these are not particularly easy to grade. This week, the TAs are not going to do office hours; the office hours will be run by the Ninjas. That'll give them a little bit of extra time to grade. Just be patient with us: we're going to try to get these back as soon as possible and do a good job. The assignment three code reading and design doc deadline is Friday at 5; that's what I said, just the usual time. Hopefully the assignment three problems and design document are a little more straightforward for you now that you've done this a few times. There aren't as many coding questions for assignment three, but the ones that are there are a little more involved.
For the assignment three design document, I would suggest you look at the rubrics and how we graded the assignment two design documents for an indication of how the assignment three rubric is going to look. Hopefully, by now, you've done this once and so are a little more familiar with what we're looking for. Recitations will cover assignment three. And of course, there's always the promise of joining the 100 club; the 8080 club has more of a ring to it. As I said on Piazza, I was looking at last year's slide deck for today, and last year's students hadn't even turned in assignment two yet. So you guys are actually way ahead of where last year's class was. The only danger here is that there's this long yawning expanse in front of you between now and when the assignment three implementation is actually due, which is mid-May, after the final. I briefly toyed with the idea of moving that date up, but I felt like I might get killed, so we decided not to do that. You can thank the TAs and Ninjas for that. On the other hand, I don't want you guys to just stop doing everything. So hopefully the code reading and design will get you interested in assignment three. Assignment three is really cool. It's probably the assignment that got me interested in doing what I do, and it's just a really, really fun assignment. There's a huge sense of satisfaction when you get it to work. So please don't think, OK, I'm doing awesome, and then just peace out and wake up in mid-May not having started assignment three, because that would be really sad. I think you guys are in a great place to do really well on assignment three, because you've come so far so much more quickly than other classes. And the assignment two grades look pretty good.
Not to say that assignment three isn't going to expose a bunch of flaws in your assignment twos, because it will do that almost immediately. But anyway, I think you guys are in a good spot, and I'm really proud of this class. You're doing well. OK, so long ago, before spring break and before the midterm, we got to the point where we described how to swap pages in and out. I'll do a little more review than usual today to try to refresh your memory, since it's been a while. But any questions before we start on swapping, paging, page replacement, any of that cool stuff we talked about before break? At this point, we've reached a stage in our understanding of virtual memory where we've pretty much talked about all the mechanisms for how to do various things, and today is the day we talk about policy: how to decide which actions to perform. OK. So swapping was this process of creating memory by moving pages transparently back and forth to some external storage, normally the disk. Disks are bigger and slower than RAM, and so by using some portion of the disk as a swap file or swap area, I can make RAM available to be used as actual memory. So who remembers: when I'm swapping well, when I'm making good decisions about how to extend memory onto the disk, how does the system behave? Yeah. When I swap well, it makes the system feel like all of that extra space on disk is actually memory, as fast as memory. It can make a system that only has a small amount of memory and is using a large swap file feel like it has a lot more memory than it actually has. However, when I'm not doing this well, what does it make the system feel like? Another way to think about this is that all of my memory is actually disk.
And we'll get to a colloquial definition of the state your computer can get in, where it feels like every time I go to memory, it requires a disk operation; in some cases, that's because it's actually true. So in the worst case, I have a system that's almost unusable, because I've taken a fast part of the system, which is memory, and made it look really, really slow, like the disk. So what do I need to do to swap out a page? What are the steps I need to take? This is a question that has been on the final, in some form or another, almost every year that I've taught the class, either this or the swap-in steps, and I'm not above reusing this one. Yeah, go. So the first thing is I had better get it out of the TLB. That's one of the first things I have to do, because I have to make sure the page isn't being modified while I do other things. So I'm going to remove the translation from the TLB, if it exists; it's possible it's not mapped in the TLB, in which case, during the paging operation, I need to make sure it doesn't end up in the TLB. Once I start moving the page to disk, it cannot be inserted into the TLB. What's the next thing I need to do? Yeah: actually copy out the page. I bring the contents on disk up to date with the contents currently located in that page of memory. And then I update the page table entry to record where these page contents are, so that I can find them later. So those are the three steps. We had a nice diagram: this is a page that's in the TLB. I remove that entry, which ensures the process can't use the page during the operation. I start writing it out to disk. This takes a while, so it's possible that other things can happen during this operation, and there are some race conditions I have to handle here.
And then I readjust the page table entry to reflect the current location of the page, and I have succeeded in creating a page of memory that is now available for some other purpose. All right. So there was a trick we were going to play to make swapping out faster. What's the thing the operating system can do to try to make this potentially very slow process a lot quicker? Yeah: when I have some idle time, synchronize the page contents with the page contents on disk. If I do that, then I can skip the very slow second step of writing out the page, because the contents on disk are already up to date. In that case, all I have to do is remove the entry from the TLB and update the page table entry, and those are the fast parts of the operation; it's the writeout that's quite slow. And the second thing: let's say a process requests some new heap. Say the process uses sbrk to ask the operating system to extend its heap. Is that going to cause the OS to actually go and find more memory right at that moment? I've given the process permission to use more memory, but did the operating system actually go and find more memory when the call was made? No. We'll come back to this in a minute: this is demand paging. I always wait as long as possible, in hopes that maybe the process allocates some heap space right before it gets shut down and that heap space never gets used. There's a variety of reasons a process might not use parts of its address space, virtual addresses and virtual pages that it has access to. And we don't allocate eagerly because many code and data pages are never used. OK, so the process of swapping in a page: this had a few more steps to it. What do I have to do here? First of all, when does this take place? When do I have to swap in a page? When the process tries to use the virtual page. Remember, the process has no idea that I've moved this page out to disk. I didn't tell it that; I just did it behind its back.
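Stepping back to the swap-out steps from a moment ago, they can be sketched in a few lines of C. This is a minimal sketch, not OS/161 code: the `pte` layout and the helpers `tlb_invalidate` and `write_page_to_swap` are hypothetical stand-ins for the real machine-dependent operations.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical page-table entry, just for this sketch. */
struct pte {
    bool valid;      /* page is resident in physical memory       */
    bool dirty;      /* RAM copy differs from the copy on disk    */
    uintptr_t paddr; /* physical address while resident           */
    int swap_slot;   /* location of the contents in the swap file */
};

/* Stand-ins for the real machine-dependent operations. */
static void tlb_invalidate(int vpn) { (void)vpn; }
static void write_page_to_swap(uintptr_t paddr, int slot) { (void)paddr; (void)slot; }

/* The three steps from lecture:
 * 1. remove the translation from the TLB (and keep it out),
 * 2. bring the disk copy up to date -- skippable if the page
 *    was already cleaned in the background,
 * 3. update the PTE so the contents can be found later. */
static void swap_out(struct pte *p, int vpn, int slot)
{
    tlb_invalidate(vpn);                    /* step 1 */
    if (p->dirty)
        write_page_to_swap(p->paddr, slot); /* step 2: the slow part */
    p->valid = false;                       /* step 3: the page now  */
    p->swap_slot = slot;                    /* lives in the swap file */
    p->dirty = false;
}
```

Note how the background-cleaning trick falls out of the dirty check: if the contents were synchronized during idle time, step 2 is skipped entirely, and only the fast TLB and PTE updates remain.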
Now it tries to use that address. So when the virtual address is used again, that's the point at which I have to make that address behave like memory. Remember, I can play a lot of games with virtual memory, but it has to obey the memory interface: I have to be able to load and store to that address. I can't do that directly to disk, and so now I've got to make this page appear back in memory. The first thing that happens is that the instruction trying to use it gets stopped: it generates a TLB exception. Why? This is technically a page fault, but the operation had better generate a TLB fault first. Why? Yeah: this page is not resident in memory, so there had better not be a TLB entry for it. There's no valid mapping in physical memory for me to point this virtual page to. The next thing I have to do is find a page. What can this cause? What thing that we just talked about might I have to do in the process of swapping in a page? I might have to swap out another page. So that's possible on the swap-in path, and this is one of the reasons operating systems try to make sure there's always some clean memory: this path is already slow, and if I have to swap out a page and wait for that IO to complete, it gets even slower. All the process did was try to use an address; it had no idea it was going to cause all this trouble and be so slow. I locate the page on disk, using my page table entry to do that. I copy the contents back into the spot I created in memory, update the page table entry, load the TLB, and restart the instruction. Any questions about these steps? A little more involved, and I won't go through the diagram in detail; it's online. All right. So remember that we now have two types of memory-related faults.
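Those swap-in steps can be sketched the same way. Again, this is a hedged sketch: the `pte` layout and the helpers `find_free_frame`, `read_page_from_swap`, and `tlb_load` are hypothetical names, not real kernel interfaces.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical PTE, mirroring the swap-out sketch. */
struct pte {
    bool valid;      /* page is resident in physical memory */
    uintptr_t paddr; /* physical address while resident     */
    int swap_slot;   /* where the contents live on disk     */
};

/* Stand-ins for the real operations.  find_free_frame() is the step
 * that may itself have to swap out another page. */
static uintptr_t find_free_frame(void) { return 0x2000; }
static void read_page_from_swap(int slot, uintptr_t paddr) { (void)slot; (void)paddr; }
static void tlb_load(int vpn, uintptr_t paddr) { (void)vpn; (void)paddr; }

/* Swap-in, triggered by the TLB fault on a non-resident page:
 * find a frame (possibly evicting someone else), locate the contents
 * on disk via the PTE, copy them in, update the PTE, load the TLB,
 * and let the faulting instruction restart. */
static void swap_in(struct pte *p, int vpn)
{
    uintptr_t frame = find_free_frame();      /* may be the slow part   */
    read_page_from_swap(p->swap_slot, frame); /* disk -> memory         */
    p->paddr = frame;
    p->valid = true;                          /* page is resident again */
    tlb_load(vpn, frame);                     /* restart will now hit   */
}
```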
There's a memory-related fault that I can handle without doing any IO; we refer to this as a TLB fault. In that case, the contents are in memory somewhere, and all I need to do is load the TLB entry. In the second case, the page is not resident, or may not even exist yet, like the first time I fault on a newly allocated section of heap. So I actually need to do some work to create the page, and that might involve writing things out to disk, and might also involve reading things back in if this is a page that was used previously. So every page fault is preceded by a TLB fault, but not every TLB fault is going to generate a page fault. Hopefully as few TLB faults generate page faults as possible. I think we went through this already. So, any questions about swapping, now that I've reloaded all of the pages that you swapped out over spring break? I have hopefully generated enough TLB and page faults for you to page this material back in. Any questions? All right, one last little bit before we talk about page replacement policies. I just wanted to mention this because it's a feature of the most commonly used architecture today, which is, well, I don't know, it's a great question: maybe two or three years ago you might have said x86 wins, but now that ARM is so popular on smartphones and other lower-power processors, maybe we have another battle going on. In any case, the x86 processor uses what's called a hardware-managed TLB. What a hardware-managed TLB does is allow the hardware itself to search the page tables maintained by the operating system: on a TLB miss, the hardware can walk the page tables and look up the proper entry to load into the TLB, assuming it exists. What this means is that there are no software-visible TLB faults on a hardware-managed TLB architecture. The pro is fairly obvious, which is that hardware is faster.
So this means I don't have to trap into the operating system at all on a TLB miss. If the page is in memory, the hardware uses an agreement it makes with the operating system about how the page tables are structured to locate the correct translation. The con is that the operating system now has very little flexibility in how the page table structures are established, because I need to set up the page tables in a very specific way so that a fixed piece of hardware can walk them and locate the valid entry. So the operating system has to implement its page table data structures in a way that the hardware understands. There's no option, because I can't change the page-walking logic that came with the processor. And as I said, the kernel never sees TLB faults on a hardware-managed TLB. So, as you guys start assignment three, don't worry: the MIPS architecture has a software-managed TLB, so you can set up your page table data structures in any way you want to, which is sort of nice. Any questions about this? I just needed to point this out because, again, this is a feature of x86 chips, and those are fairly common. All right, so now let's talk about choosing which page to evict. This is the policy question related to swapping; the mechanics of swapping we've already discussed. We know when we need to swap in a page, but when we talked about swapping out a page, one thing we failed to address is which page we should remove. This happens in one of two places: either when I'm proactively choosing a page to swap out, or because I need to swap in a page and must free a frame first. So this decision gets made frequently. So here's the cost-benefit calculation. What is the cost associated with swapping? Right, it's the disk IO: the time and the disk bandwidth required to move a page from memory to the disk.
And keep in mind, the disk, believe it or not, does not exist only to act as a page file. There's other stuff that goes on on the disk: your files are there, and there's a file system, which we'll start talking about later this week. So any bandwidth being used by the OS for swapping competes with other uses of the disk. What's the benefit? After I do this, what have I accomplished? I swap out a page and, in return, there must be some benefit, otherwise why did I do this? Right: I have a page of memory that is now free for the OS to use. But the critical thing here is that the benefit has one parameter: how long that page remains unused. That's the only thing that can vary here. There are tricks you can play to minimize the cost, and there's plenty of interesting engineering that operating systems have done in terms of how to lay out the page file and other things. But really, what we're going to focus on when we talk about these algorithms is how to maximize the benefit. And the way we maximize the benefit is by picking a page that is not going to come back in for a long time. If I pick the right page, the benefit is large, because that page remains unused for a long period of time. Another metric you'll see used when evaluating page replacement algorithms and paging systems is the page fault rate: how frequently does the system have to move data back and forth between the page file and memory? Given the same workload, a low paging rate means I'm making good decisions about which pages to evict, and a high paging rate means I'm making poor decisions. In the worst case, you end up in this situation called thrashing. How many people have ever felt like their computer was thrashing? Or experienced this, right? So maybe everybody; you guys need to do more intensive stuff on your computers.
Either that, or just tear out most of the extra memory you put in there. So thrashing is actually difficult to define quantitatively, but it's very easy to define qualitatively. It's the state your computer can get in where essentially the only thing it's doing is moving pages back and forth to disk to try to create memory. What can happen on a system that's really heavily overloaded is that the overhead associated with doing this causes all of the other legitimate functions the system is trying to perform to stop. All the system is doing is desperately trying to create memory, and every time it swaps something out, that thing, or something else, comes back in very quickly. Where the term came from, I don't really know; there's some sense of the system thrashing around without really making any progress. But on old systems that had physical disks, when a system got into this state, you could hear it, because the disk was just running constantly. And you can also experience it, because at this point the system is usually totally unresponsive. It may take you 10 or 15 minutes just to log into the system so you can try to reboot it or kill off whatever's causing the problem. So this is a very, very bad state to be in. So the idea here is: let's try to maximize the benefit by picking the page to evict that's going to remain unused the longest. So what is the best possible page to pick? We're trying to maximize the benefit of moving a page to disk. If I could predict the future, what would I do? In the back, yeah? OK, so, not quite: imagine you could predict what's going to happen to this page in the future. Using the past is a possible way of figuring out what to do with pages if we don't know the future. But if I know the future, what's the best possible page? Yeah.
A page that will never, ever be used again. In this case, the benefit is, well, not quite infinity, because at some point that process will shut down, but I get that memory from now until the process shuts down. That's fantastic; that is the best benefit I could ever possibly get. And of course, these pages do in fact exist. Again: you click on some weird menu in an app, or you do some things with your browser that you don't normally do, and some pages get swapped in to provide the code for those rarely used code paths. Once you're done, you may never use those pages again. At some point, a good system will identify that, move that page to disk, and it's gone forever; you will never use it again. It's usually difficult to identify these pages, though. So again, what is the thing we would like to know about a page while we're choosing which page to evict? This is sort of like our optimal schedule. Yeah. OK, so that's something you guys are a step ahead of me on; that's something I would actually know. What would I want to know? You're using an oracle schedule here. What's the thing I would like to know? Yeah: the next time it will actually be used. So if, when I was choosing a page to evict, I could determine the next time each page was actually going to be used, what would my algorithm reduce to? Pretty simple. I know, for each page in the process, the next time it will be used. Right: I sort them by how long it's going to be until they're used, and I pick the one that's going to be unused the longest. Maybe there's a whole group of pages that fall into the category we just described, where the time they're going to be used next is undefined, because they'll never be used again. But that is the algorithm I would like to implement, and this is the optimal schedule.
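As a concrete illustration, the oracle's choice can be written down even though no real kernel can run it: given the future reference string, evict the resident page whose next use lies furthest in the future, or never comes at all. A sketch, with `refs` playing the role of the (unknowable) future:

```c
/* Oracle victim selection: refs[0..nrefs-1] is the future reference
 * string of page numbers; resident[0..nres-1] are the pages in RAM.
 * Returns the index into resident[] of the page whose next reference
 * is furthest away.  A page never referenced again gets next == nrefs,
 * beating every page that will be used, which is exactly the
 * "never used again" jackpot from lecture. */
static int pick_victim(const int *refs, int nrefs, const int *resident, int nres)
{
    int victim = 0;
    int best = -1;
    for (int i = 0; i < nres; i++) {
        int next = nrefs;                 /* assume: never used again */
        for (int t = 0; t < nrefs; t++) {
            if (refs[t] == resident[i]) { /* found its next use       */
                next = t;
                break;
            }
        }
        if (next > best) {                /* furthest next use so far */
            best = next;
            victim = i;
        }
    }
    return victim;
}
```

With future references 1, 2, 1 and pages 1, 2, 3 resident, page 3 is never used again, so it is the victim; that matches the intuition that such pages are the best possible evictions.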
The optimal schedule is the one that maximizes the benefit. The cost is something we can't control; the benefit we can. So if I maximize the amount of time this memory is made available for, then I'm doing the best possible job. This schedule is hard, if not impossible, to implement. So what do we do? If you don't know the future, you use the past to predict the future. So we're going to talk about a couple of algorithms here, including ones that people have already suggested; really only about two. But when you're designing these systems, you have to think about a few things. First, what information am I actually going to track about each page? How do I collect this information? That ends up being tricky. And how do I store it? So let's talk about some of the trade-offs between these variables. The first is that collecting the statistics might itself be expensive. We'll talk about this in a couple of cases: for example, I might love to know how often a page has been accessed, but that may be impossible for the operating system to know without completely ruining the performance of the system. I also need to keep in mind that whatever statistics I'm going to use as inputs to this algorithm, I need to store them for every page on the system, every virtual page, not just every physical page. So whatever data structure I'm using is going to get a little bit bigger. And certainly if you argued that you need 32 more bits to store this information, that would be a very difficult case to make, because you may be doubling the amount of RAM the kernel needs to store this information, which is not necessarily a great trade-off. OK, so think back to schedulers. What is the simplest possible page replacement algorithm? I need to swap out a page; I'm going to choose the next one.
How do I do it? Random: pick a page at random. I know the process is using certain pages, I know which ones are resident, and I choose one at random. This is a simple algorithm to implement. And going back to our discussion of scheduling algorithms: you may decide for assignment three that you have a very clever new page replacement algorithm you want to try out. If it doesn't outperform random, it's going to be hard to sell anybody on your new idea. So whenever you're developing algorithms, I would always suggest coming up with a simple straw man, just to make sure you actually haven't made negative progress. If you can't outperform random, give up and try a new idea. Random is a little too simple, though; we can probably do better. So what we're going to do instead is the algorithm people have essentially been pointing at: use how long it has been since a page was used. I don't know how long it's going to be before the page is used again, but I can potentially estimate how long it's been since the page was last used. And this leads to what's called the least recently used algorithm, LRU. Rather than picking the page that's going to remain unused for the longest period of time, pick the page that has not been used for the longest period of time. It's amazing how you can change an oracle algorithm into an implementable algorithm just by changing the tense of one of the verbs: use past tense instead of future tense, and you're done. And keep in mind, I'm making an educated guess here. I'm hoping that the fact that the process has not used this page for a while is a hint that the page will remain unused for a long period of time, potentially forever. So this might be as good as we can possibly do.
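If we did have a last-use timestamp for every resident page (a big "if", as we're about to see), the LRU decision itself is trivial: evict the page with the oldest timestamp. A sketch, assuming hypothetical per-page timestamps that some mechanism has somehow kept up to date:

```c
/* LRU victim selection: last_used[i] is a (hypothetical) timestamp of
 * the most recent access to resident page i.  The victim is simply
 * the page with the smallest, i.e. oldest, timestamp.  Collecting and
 * storing these timestamps is the hard part, not this scan. */
static int lru_victim(const unsigned long *last_used, int npages)
{
    int victim = 0;
    for (int i = 1; i < npages; i++) {
        if (last_used[i] < last_used[victim])
            victim = i;
    }
    return victim;
}
```

Note that this is a linear scan over every resident page on each eviction, which is one of the implementation costs discussed below.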
And one of the things I think is really fascinating about computer systems, and about the algorithms that drive them, is how simple some of them have remained over time. How many people have taken a course on computer algorithms? Hopefully most of the people here. How many people think this algorithm is just way too simple? How could something like this perform so well for so long on so many systems? You guys are probably the people who are going to go out and try to do better than this, and you won't be the first. But it's surprising that in many cases, computer systems are still driven at their core, when doing resource allocation, by very simple algorithms. It's a sign of some of the trade-offs you have to make at this level. OK, so the biggest con here is figuring out the two things required to actually implement some form of LRU. The first is: how do I determine when a page is accessed? The second is: how do I store this information efficiently? Let's look at both of these issues. So, at what point is the operating system sure that a page has been accessed? Yeah. No; OK, so here's the problem. Remember, what do we want to be the common case when we're translating virtual to physical addresses? There's an entry in the TLB. When there's an entry in the TLB, is the operating system ever notified? No, not explicitly. Maybe the hardware will help you and keep some statistics, but it's unlikely. So remember, in the common case, the case we want to be as normal as possible, the operating system is never even notified when a page is used. So what's the only time the operating system is sure that a page is used? I don't want any special instrumentation; this is just during normal operation. When there's a TLB fault; that's the only time. So obviously this does not reflect every page access. Imagine that I start up a process.
It immediately faults in a couple of pages. For one of those pages, that may be the only time it's ever touched: there may be special startup code that the program uses to initialize itself, and that's the only time the program will ever go down that code path. It's been faulted in once, and that's all the OS knows. Another page may hold some data structure the program uses constantly throughout the rest of its life, being hit all the time. Once those pages are in the TLB and being translated automatically, there's really no way for the operating system to distinguish between them without some form of hardware support, which, as far as I know, most hardware doesn't have. And on a hardware-managed TLB, this is even worse, because the hardware won't even tell me when things are being faulted into the TLB; it just handles that for me. So in that case, all I have are the page faults. So LRU sounds like a fantastic idea; in practice, storing, or even collecting, the information required to implement it can be tricky. And clearly, again, I do not want to be on the path of every page access. That's the whole reason I have a TLB in the first place; I'm not going to get rid of it now just so I can implement some fancy algorithm. That's a terrible idea. Better to have random and a working TLB than a fancy algorithm and no TLB: the first system is much, much faster. The second problem with LRU is how much access time information we can actually store. Regardless of how you want to store time, storing even a reasonable amount of resolution requires a fair amount of space. Imagine 2^32 ticks: if a tick is something like a microsecond, that's only about 72 minutes before the counter wraps, which is not actually that long. And storing it is going to double the page table entry size.
If I use eight bits, which I might be able to jam into the page table entry with some careful engineering, then I only have 256 units of time with which to distinguish pages from each other, so the resolution of this calculation is quite a bit coarser. And then the final problem: when I'm looking for a page, remember, I'm on a path where things have already gone wrong. I'm swapping in a page, and I might need to swap out a page at that point. If I'm doing it in the background, it's not a big deal, but frequently swap-out happens on a hot code path, because I need that page for something else that's trying to get work done. And so now I may need to scan, or maintain some other fancy data structure to store this information, and it's going to have to be searched every time I evict a page. So LRU sounded like the best we could do; once you start actually talking about how to implement it in practice, it becomes a little more interesting. There are some trade-offs to make here. So let me describe a common approach to implementing LRU that eliminates several of these problems; it's a clever algorithm called the clock, known as clock LRU. Clock LRU only requires one bit of access information per page, and I'll show you how that's used in a minute. This bit is set every time a page is loaded into the TLB. So somewhere on the path of loading the TLB, I need to set this one bit in the page table entry. Now, when I'm looking for a page to evict, here's what I do. I establish an order within the pages; maybe I order them by virtual page number, for a particular process or across the entire system. If a page's access bit is clear, then I evict that page immediately; that's the page I choose. If the page's access bit is set, I clear it.
And I continue doing this until I find a page that has a clear bit. Now, can I be sure this algorithm is going to terminate? On a single-core system, the answer is actually yes, because while I'm running, nothing else is running. So as I'm going through all the pages, even if all the access bits are set when I start, by the time I finish looping through all the pages and clearing the bits, I get back to where I started and find a clear bit. On a multi-core system, I'd have to think about it a little more, but I suspect there are variants of this that ensure this property even on multi-core systems. So here's how this works. Imagine these are my page table entries, established in some fixed order, with their access bits. This one has its access bit set: I clear it and move on; I don't take that page immediately. What I'm looking for is this guy, whose bit has remained clear. This is the first page that I evict. Each time I restart the clock algorithm, I pick up from the same place where the hand stopped: the next time, I keep going, get to this guy, and this guy is now evicted. Does this make sense? Questions about how this works? So, thinking about the dynamics of this algorithm: what does it mean about the system when the clock hand is turning slowly? Imagine you could watch every page eviction decision and, overall for the system, watch the clock hand turn. What does it mean when it's turning slowly? There are a couple of things it could potentially mean. Too many page faults? Well, remember, I only have the chance to advance the clock when there is a page fault, so it's actually the opposite. When the clock hand is moving slowly, in general, this is a good thing. It either means that the system is not under a lot of memory pressure, meaning I don't have a lot of pages that I need to evict; remember, I only really get into this situation once I start overprovisioning.
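The clock sweep just described fits in a few lines. This is a single-core sketch: `accessed[]` plays the role of the per-PTE access bits, and the hand position persists across calls, so each eviction resumes where the last one stopped.

```c
#include <stdbool.h>

/* One sweep of the clock hand.  accessed[i] is the access bit of page i,
 * set by the TLB-load path; *hand is the persistent hand position.
 * Pages with a set bit get the bit cleared (a "second chance"); the
 * first page found with a clear bit is the victim. */
static int clock_evict(bool *accessed, int npages, int *hand)
{
    for (;;) {
        int i = *hand;
        *hand = (*hand + 1) % npages; /* advance the hand              */
        if (!accessed[i])
            return i;                 /* clear bit: evict this page    */
        accessed[i] = false;          /* set bit: clear it and move on */
    }
}
```

Termination on a single core follows directly from the loop: after at most one full revolution, every set bit has been cleared, so the sweep must find a clear bit by the time it returns to its starting point.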
That is, once I have allocated more virtual memory to processes than there is actual physical memory on the system. If I haven't done that, then this is not a problem. Or it could mean that even if I'm running a memory-intensive workload, I'm making good decisions about which pages to evict: the algorithm is working well for this particular workload. On the other hand, if the clock hand is spinning rapidly, this could mean one of two things. It could mean there's a lot of memory pressure: a lot of memory activity, a lot of processes running. Or it could mean that, for whatever reason, this algorithm is not a good fit for the workload that's running, and it's making bad decisions about which pages to evict, and that's producing a lot of paging activity. So in general, I want to be in the place where the clock hand is turning slowly; that's a good sign about the system. Any questions about page replacement? Yeah? Okay, so that's a great question. Is this a great idea? Let's think about it a little bit. Someone is going to point out that this is a dumb feature of the system: every time you evict a page, you write the whole page out to disk. Why do that? Why don't I just write out the parts of the page that are different? So that sounds like a great idea; what's the problem? Yeah, no, this won't break swapping at all. Here's the idea: as the page is being used, I keep track of which bytes of that page differ from the bytes on disk. Then, when it comes time to swap the page out, I only write the bytes I need to write. Yeah, what's that? No, that's not really the problem; I don't have to do any extra disk access to make that work.
There are several problems with this approach. Yeah, the proposers have thought about it a little bit more by this point. Okay, so problem A: where am I storing this information? For every 4K page, I need some amount of memory to record whether the bytes on that page have changed. Say I store one bit per byte: for each 4K page, I now need a 512-byte data structure. That's expensive; I've just increased the memory overhead of my system by 12.5%. It may be a better use of that memory to hold page contents rather than to track which bytes of the page have changed. But there's another problem, even if I could track which bytes of the page changed. So I think the proposal here is: when I'm paging out, I know that some bytes have changed and I only write those bytes to disk. There's still a 4K page on disk, and I can still ensure that that 4K page matches the contents in memory; I just do it by figuring out which bytes have changed and writing only those bytes out to disk. But what's the other problem here? Yeah, what's that? Well, even if there is a TLB fault, what does a TLB fault tell me? Yeah, no, go for it. Well, there's some overhead, but I would argue that if I can reduce disk bandwidth, that's a good thing. No, I can do the same thing I do currently with page-out: I make sure the page doesn't change once I start to swap it. I should make this an exam question. I still argue that, as proposed, this is not a bad idea; it could potentially help by reducing disk bandwidth. There's a memory overhead to it, but there's another really deep, fundamental problem with this. What does a TLB fault tell me?
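The overhead arithmetic above is worth writing out, since it's easy to misspeak in lecture: one tracking bit per byte of a 4K page works out as follows.

```python
# Cost of a one-bit-per-byte dirty bitmap for a 4 KiB page.
PAGE_SIZE = 4096                       # bytes per page
bitmap_bits = PAGE_SIZE                # one tracking bit per byte
bitmap_bytes = bitmap_bits // 8        # 4096 bits / 8 = 512 bytes per page
overhead = bitmap_bytes / PAGE_SIZE    # 512 / 4096 = 0.125
print(bitmap_bytes, overhead)          # 512 0.125, i.e. 12.5% extra memory
```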
So let's say I have a page, 4K worth of data, and before the process can use that page, it has to generate a TLB fault. At that point, we know the faulting address is inside the page. But once I've loaded the translation into the TLB, what more do I find out about accesses to that page? Nada. So when it comes time to write that page out to disk, what does the operating system know? Let's say I tracked something about the TLB fault; I only know one thing. What's that? Do I know which page it is? Sure, but I only know one thing about how the contents of that page have changed. So again, the idea here is that I'm going to keep a mapping of the bytes in the page that have changed. But how do I maintain that mapping? What's that? But still, let's say the process is going to write half the bytes of the page. How many of those writes will the OS see? At most, one. So again, I'm a process and I'm going to change half the bytes on this page, and I am not going to tell you which bytes they are. When I change the first byte, you know, because there's a TLB fault generated, and so now you know one of the bytes I changed. But the other 2047? You have no idea. You never saw those writes, because the TLB translated them automatically. So there's no way for the OS to know which bytes of the page are different when it goes to write out the page. Now, there is a way for the operating system to compute which bytes of the page differ from the contents on disk. How would I do that? Yeah. I could read the page from disk and compare the contents, but now I've done one disk I/O plus whatever I have to do to write the contents out. So that's really the problem with this idea. It's not whether it's going to work, or the state I'd have to maintain; it's that the operating system fundamentally doesn't know, and I think it's really important for you guys to understand why.
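This visibility argument can be made concrete with a toy model. The sketch below is not how a real MMU is programmed; `ToyMMU`, `access`, and `os_saw` are made-up names, and I'm assuming the simplified model from lecture: the OS traps only on a TLB miss, and once the translation is loaded, the hardware translates every subsequent read and write silently. It also demonstrates the "clever program" trick discussed next: fault the page in with a read, and the OS sees zero writes.

```python
# Toy model of TLB-mediated access: the OS observes at most the one access
# that caused the TLB miss; everything after that is invisible to it.

class ToyMMU:
    def __init__(self):
        self.tlb = set()      # pages with a loaded translation
        self.os_saw = []      # accesses the OS actually observed

    def access(self, page, offset, kind):
        if page not in self.tlb:
            self.os_saw.append((page, offset, kind))  # TLB fault: OS sees this
            self.tlb.add(page)
        # translation loaded: hardware handles later accesses silently

# Process writes 2048 bytes of page 7; the OS sees only the first write.
mmu = ToyMMU()
for off in range(2048):
    mmu.access(7, off, "write")
print(len(mmu.os_saw))        # 1

# A "clever" process reads one byte first, then writes freely.
mmu2 = ToyMMU()
mmu2.access(9, 0, "read")     # the fault happens on the read
for off in range(2048):
    mmu2.access(9, off, "write")
writes_seen = [a for a in mmu2.os_saw if a[2] == "write"]
print(len(writes_seen))       # 0: the OS never saw a single write
```

Real hardware gives the OS exactly one bit of help here: a per-page dirty flag, which says "something on this page changed" but never which bytes, which is why eviction writes back the whole page.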
Does that make sense to people? The operating system, in general, does not have much of an idea, or I would argue any idea, about which parts of a particular piece of memory are being changed, because most of those memory operations happen completely transparently. In the best case, I know one of the bytes on the page that's different. In the worst case, I know zero. Why? Can you construct a case in which the OS doesn't know any of the bytes I've changed? Remember, in order for me to use the page of memory, there has to be a TLB fault. Yeah, there we go. So my program is clever: it reads a byte from the page first. That way, you learn nothing about my writes. And then I can change all the bytes I want, and you have no idea what they are. Too bad, I just threw out a good exam question; I'll probably come up with something similar. Okay, good question. Any other questions about page replacement, virtual memory, a grab bag of everything? Let me see how we're doing on time. Oh, we are out of time. So next time we will finish with a quick little design exercise on copy-on-write, which is a very clever memory management technique, and then we'll start talking about disks.