All right, good morning everyone, happy Halloween. Everyone's all dressed up, I see, and all ready for their midterms, because they're super scary. So we have a nice, relaxing topic today, and that is page replacement. Hopefully this isn't too bad.

Lab 4 is released. So if you can, while that Lab 4 primer lecture is still fresh, you could probably start it and finish it after your midterms and be stress-free for two weeks. So that would be nice. The Quiz 2 grading is apparently happening later this week, so hopefully that comes back soon. There's still no decided day, but hopefully it'll be soon.

All right, so today we'll be talking about page replacement. Memory and storage on your device is a big hierarchy where there's always this capacity and speed trade-off. At the top of the pyramid is the fastest memory on your machine, and it's in your CPU: your registers are the fastest. That's where all the operations take place, but there's only a limited number of them and they're not very big. Below that you have your CPU caches, L1 first and then L2 as the next fastest, which are a bit bigger. They're in the megabyte range now; 32 megabytes are the big ones. After that you have memory, or RAM, that is, you know, 32 gigabytes or something on that order, and fairly fast. Below that there's non-volatile memory, or NVMe drives, which are your small solid-state devices that are really, really fast. Below that are your SSDs, or just plain old flash memory, which are going to be slightly slower and have higher capacities or be cheaper. Below that you have hard disk drives, which are just spinning magnetic disks that are even slower, but you can buy, you know, 12-terabyte ones quite easily and they're fairly cheap. And then below that there are tape drives, which are still actually used.
That's what Google uses for really, really slow archival storage. You can buy a tape that holds like 50 terabytes or something like that. If you've never seen the machines, you can YouTube it. It's like a giant, really old tape system: it'll go pick up the tape and put it in. It's slow as hell, but it's cheap as hell as well, so all the archival storage would be on something like that. It's still actually used, although likely no one in this class will ever see tape drives.

So we have this giant hierarchy, and it's only going to get worse over time because there are hierarchies within devices. Your NVMe drive will have a cache on it too, and there's just going to be this whole hierarchy hidden everywhere, and you don't want to have to deal with all of it. You would ideally like each level here to pretend it has the speed of the higher level and the capacity of the lower level, so you get the best of both worlds.

You have some physical limitations: the memory used by all the processes can't exceed the actual physical memory on the device. So what you can do is create the illusion that you have more memory by essentially using the capacity of the device below it. In this case, if you want to make memory seem like it's even larger, you could just put some pages onto the hard drive storage and use that. That's what we'll be talking about today: how you decide what pages go to disk and what the best policies for that would be.

What you do is, if there are memory pages being used, you keep them in memory, while all the unused pages you can flush to disk and just keep them around on disk. If you need them, they're still there, but it will just take some more time to access them. Then whenever you need to use that page again, you would swap it back into memory, so you would write another
page out to disk and then swap the one that was on disk back into memory. Then you can use it and have the illusion of having 32 gigabytes of RAM even though your machine only has eight or something like that.

There are four page replacement algorithms we'll talk about today. There is the optimal algorithm, which is essentially like an oracle, like you're seeing into the future. In this case you replace the page that won't be used for the longest time. I just throw out the page I know isn't going to be used for the longest, so that all the other pages I'm reusing over and over again stay in memory, and I'm essentially getting all the cache hits for free.

Then there are a few other strategies. There is a strategy of just randomness, where I just randomly replace a page, which surprisingly works quite well. There's our good old friend first in, first out (FIFO), which just replaces the oldest page first. It is a very easy decision and a very easy thing to implement, but it's probably not going to be the best thing. And then the last policy is the least recently used (LRU) policy. This is something you could actually implement, and the idea here is just to replace the page that hasn't been used in the longest time, because likely nothing's using it anymore. So I can write it out to disk, and all the other pages are being used over and over again, so I'd keep them in memory.
Yep? Yeah, so the question is: is least recently used history-based? Yes, because you're keeping track of what was least recently used. Optimal would be essentially prediction-based, or if you're actually doing the optimal solution, you just look into the future, which isn't going to be possible. So we pretty much just discuss optimal so we have a baseline to compare a bunch of different algorithms against.

So here is the scenario that we're going to be using the whole day. It's a bit of a ridiculous scenario, just to make things easier to write. Assume our physical memory can only hold four pages, and then you access the following pages. We just give each page a number, so in here there are a total of five pages, one to five. Initially, we'll just assume all pages are on disk, so we have to bring them into memory. This is what we'll use for all of our examples, and our goal here is to have the fewest number of page faults when we do all these accesses. So for every example, we're just going to record the number of page faults we have, and we'll be recording when they happen.

Let's go through a little example just to get our feet wet, and we will do the optimal example for all these accesses. These are all our accesses: we access pages 1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5.

We're assuming that all of our pages are on disk and nothing is in memory. The boxes there are what is currently in memory, and they're all currently blank. So first we have an access to one. I put the accesses above the boxes. It is not in memory, so we have a page fault and we have to bring it into memory from the disk. In red I will put the pages that we just brought into memory. So for the first access, one is not in memory: we have a page fault and we have to load it in.

Okay, and this story is going to look fairly boring for the first few. If we access page two next, then it's not in memory.
We have to bring it in. We have four slots in our memory, so we can keep page one around. We bring in page two as well, and now pages one and two are in memory. That would also be a page fault. Then we have page three, same story: it's not in memory, we load it in, page fault, it's in red. Same story for number four: not in memory, we bring it in, it's in red.

So what happens now? Our next access is to page one. I got thumbs up. So thumbs up, everything's good: it's already in memory and I don't have to do anything. In that case I would just write out the same thing. Nothing's in red, there is no page fault, nothing bad happened. It was already there; I'm nice and fast. Now we access page two, and the same thing is going to happen. It's already in memory and we're good. We don't have to reload anything, we don't have to swap anything out. We're still living the good life.

Now we have an access to page five. It's not in memory, and we have to kick something out. Between pages one, two, three and four, what should we kick out? I see a four. Everyone who thinks four? We got a lot of fours. Three? Okay, do we have any twos or ones?
Okay, so remember, for optimal you are going to replace the page that is not going to be used for the longest time into the future. Page one is going to be used in the next access right away, so we should keep that. We should keep page two because that's also going to be used in the access after that, and we should also keep page three because it's used right after that. So if we want to throw out the page that's not going to be used for the longest, we would actually kick out page four, because one, two and three are all accessed before it. So when we access page five, we're kicking out page four. We replaced the four with a five, and we wrote it in red, or a different color, or you can circle it or whatever, to indicate that it's a page fault. And that was our swap.

So now, because we looked into the future and we saw everything, when we access page one, that's a hit. It's already there. Access page two, that's a hit. Then we access page three, it's a hit. And now we have page four. So now we have to replace either page one, two, three or five. Which should we replace? Replace page one? Replace page two? Replace page three? Replace page five?

Okay, so no one wants to replace page five, and we were fairly ambivalent about the other ones. In this case there are no future accesses to them, so you can pick whichever one you hate most. Generally when I teach this, classes seem to hate number one the most, so number one gets kicked out. It doesn't matter as long as you don't kick out five. So in this case we kick out number one and we replace it with four. And now, because we didn't kick out number five, on our last access we access page five and we are all good. It's in the cache.

So the last thing we need to do is get the number of page faults. So how many page faults do we have here?
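The optimal walkthrough above can be checked with a few lines of Python. This is only a minimal sketch, not anything from the lecture itself: the function name `optimal_faults`, the list `ACCESSES`, and the choice to break ties by taking the first candidate in the frame list are all assumptions made for illustration.

```python
def optimal_faults(accesses, num_frames):
    """Count page faults when evicting the page used furthest in the future."""
    frames = []  # pages currently in memory
    faults = 0
    for i, page in enumerate(accesses):
        if page in frames:
            continue  # hit: nothing to load or evict
        faults += 1
        if len(frames) < num_frames:
            frames.append(page)  # a free frame is still available
            continue
        future = accesses[i + 1:]

        def next_use(p):
            # Distance to the next access of p; never used again counts as infinity.
            return future.index(p) if p in future else float("inf")

        # max() returns the first page with the largest distance (assumed tie-break).
        victim = max(frames, key=next_use)
        frames[frames.index(victim)] = page
    return faults


# The access sequence used throughout the lecture.
ACCESSES = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(optimal_faults(ACCESSES, 4))
```

Because this needs the whole future access list up front, it only works as an offline baseline, which is exactly how the lecture uses it.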
Six. You just count the number of red numbers, so we have six page faults. And since this is our optimal, whenever we look at any other algorithm, it can at best match this and can't go lower. Otherwise, that's kind of a weird contradiction. So six page faults is our target.

Okay, well, let's go at it with a FIFO example, or first in, first out. In this example our first four page accesses are going to look exactly the same. There's nothing in memory; everything is on disk. So if we access page one, we have to load it in, and it's a page fault. Same for page two, page three and page four. These are our first four page faults, and now our memory is full and we have pages one, two, three and four in. Now when we access page one we get a page hit, then page two, we get a page hit.

Now something more interesting happens when we have an access to five. When we have an access to five, what are we going to replace if we're doing first in, first out? One, right. The first page we loaded was one. So if we're doing first in, first out, since one was the first one in, way back here in step one, it's going to be the first one out. So when we access page five we replace page one. Now what's currently in memory is five, two, three, four.

Now we're going to access page one and, oh no, we just kicked it out. So we're going to have to swap page one back in, and what are we going to swap out? Two. Two is now the oldest one, because that was the second thing we put in. So we swap out page two for page one, and now in memory are five, one, three, four. Unfortunately, the next access is to page two, so we have to swap it back in. Now three is the first one in among the remaining pages, so we swap out three, and we have another page fault. Now in memory we have five, one, two, four.

Now we have an access to page three.
Oh no, we just kicked it out again. Now four is our oldest one, so we replace four. So now in memory we have five, one, two, three. Our next access is to page four, which we just kicked out again. You can see how this is really bad. Now five is the oldest one still remaining, so we replace five with four. Now we have four, one, two, three, and of course we are accessing page five, which is what we just kicked out. Now our oldest one is page one, so we kick that out again, and we have four, five, two, three.

So we did pretty much the worst thing we could do, and in this case we have ten page faults. We only got two page hits, and then we kept going through the cycle of accessing the page we just kicked out. An easy way to look at it when you do this is that all the page faults go in kind of a waterfall pattern, in nice diagonal lines. They should all look like that if you're doing first in, first out, which makes it a little easier to track. So ten page faults compared to six, that is pretty bad, and probably the worst we'll see all day.

Now let's see something fun. Let's do the same example. Here our memory holds four pages. If our memory held five pages, we would be pretty much okay: we would have five page faults, and then everything would be in memory because it all fits. So it stands to reason that if more page frames mean fewer page faults, then the fewer page frames we have to work with, the more page faults we have. So let's reduce it. We'll do FIFO, but now instead of four page frames in memory,
we only have three. So now we have three instead of four. On our first access, of course, we have one, and that is a page fault. We load it into memory, and now memory contains the first page. Then an access to two, and the same thing happens. Then an access to three, and the same thing happens. Now we have an access to page four. What should we replace with page four? One. If we do our waterfall thing, one is the oldest surviving one, so that's the one that should go. We replace one with four.

Now we have an access to page one, and this is looking really bad. We already have more page faults than we had before; we are zero for four so far. The access to page one is a page fault, and what are we going to replace? Two. Two is the next oldest one, so we replace two with one. Oh no, our next access is to page two, which we just kicked out. So now we load page two back in. In this case our oldest one is page three, so we kick out page three, and now we have four, one, two. Okay, yeah, this is looking pretty bad. We got six page faults all in a row. Not great.

Now we have an access to page five. It's going to be a page fault too. What are we going to replace? Four. Four is our next oldest one, so we replace four, and now we have five, one, two. Our next access is to page one. It's in there. We're looking good; we got our first hit. And now we get lucky again, because our next access is to page two, which is also in there. That's pretty good.

Now we have an access to three. What's getting kicked out? Five? One. Among all the pages that are currently in there, one was the earliest one to go in, so it is out. We replace one with three.
So we have five, three, two. Then we have an access to page four, which is also bad. We're going to have to replace something, and in this case we replace two, so we have five, three, four. Then our last access is to page five, which is a hit. How many page faults do we have? Weird. We have nine page faults, which is actually better than when we had four frames. When we had more memory, we actually had more page faults. Kind of weird.

There's a name for this: it's Bélády's anomaly. It says that for first in, first out, more page frames can paradoxically cause more page faults. This is a problem with FIFO algorithms. It does not exist for LRU, or least recently used, which is the history-based one, or for any of the stack-based algorithms, which we probably won't see. There's a paper about this, if you're really into math, that you can read. It basically says you can construct a sequence to get any arbitrary page fault ratio you want if you can craft it yourself. So that is fun for some late-night reading if you want to put yourself to sleep. Thankfully, for algorithms like LRU, increasing the number of page frames, like increasing the size of your memory, never increases the number of page faults, which is what you would expect. This is the case for FIFO algorithms, and it's this weird anomaly. Yeah?

But yeah, so in this example, if we had five frames, we only have accesses to five pages, so they'd all just load in. We'd have just five page faults, and then everything would be in memory.
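The FIFO runs with three, four and five page frames can be reproduced with a short simulation. This is a sketch under assumptions: the names `fifo_faults` and `ACCESSES` are made up for illustration, and a `collections.deque` stands in for the order in which pages were loaded.

```python
from collections import deque


def fifo_faults(accesses, num_frames):
    """Count page faults when always evicting the page that was loaded first."""
    queue = deque()  # pages in memory, oldest load on the left
    faults = 0
    for page in accesses:
        if page in queue:
            continue  # hit: FIFO does not reorder anything on a hit
        faults += 1
        if len(queue) == num_frames:
            queue.popleft()  # memory is full, so the first one in is the first one out
        queue.append(page)
    return faults


# The access sequence used throughout the lecture.
ACCESSES = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
for frames in (3, 4, 5):
    print(frames, "frames:", fifo_faults(ACCESSES, frames), "page faults")
# 3 frames: 9 page faults
# 4 frames: 10 page faults
# 5 frames: 5 page faults
```

Going from three frames to four raises the count from nine to ten, which is the anomaly on this particular sequence; with five frames everything fits and only the five initial loads fault.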
Yeah, you have to be able to swap things, and in that case it all fits, so you never have to swap anything. The paper is like: if you have to swap anything, you can construct an arbitrary sequence that makes it terrible.

Okay, so let's try an LRU, or least recently used, policy. In this case we'll use FIFO to break any ties we have. It seems a bit weird that you'd use FIFO to break ties in least recently used with only one CPU and one thing doing accesses. That would only be the case if you actually had multiple CPUs and could actually get a tie, but in these scenarios we essentially can't get any ties.

So let's go back, and we'll make our memory hold four pages again, and we'll do the same accesses. The first four are going to be boring. They're all going to be page faults, so we're going to have one, two, three, four. Then our next accesses are to pages one and two, which are currently in memory, so they are both hits. The only difference is going to be when we access page five.

So when we access page five, what should we kick out? If we do least recently used, well, that's essentially going back into history. The most recently used is two, since that was the previous access, so two is safe. I shouldn't touch two, and I also shouldn't touch one, because it was used right before two. Then going back further, I also don't want to touch four. So between one, two, three and four, my least recently used is number three. If I'm doing least recently used, I should kick out three. So when I do the access to page five, I'm replacing page three, and now in memory we have one, two, five and four. Again, this is just using the fact that generally, if you access some memory, you're going to reuse it again soon after. This is trying to take advantage of that. Luckily it paid off, because our next access is to page one and we didn't replace it. So that's good.
Our next access is a hit, because we access page one and it's there. Then we access page two, and it's also still there. That is a hit. Now we're slightly unlucky: we're accessing page three, so we have to kick something out. Again, we look back into history to see what is the least recently used. Page two, if we go back, was most recently used because it was the last access. One is also safe because it was used right before that, and then five is also safe. The page currently in memory that's not one, two or five is four, so we kick that out. We replace page four with page three, and oh no, our next access is to page four.

So for the access to page four we have to swap it back in, and we go look back at history again. Our three most recently used pages are one, two and three, so we keep all of those and replace five with four. So now we just kicked out page five, which is a bit unlucky, because our next access is to page five. When we access page five, our least recently used is going to be page one, because the three most recently used ones are two, three and four. They're all safe, so we kick out one. So that's our last page fault.
We replace one with five. Now if we go ahead and count all the page faults, we have one, two, three, four, five, six, seven, eight. So we have eight page faults. That is better than first in, first out, which was ten with four page frames and nine with only three page frames, but not quite as good as the optimal six.

All right, any questions so far? It's fairly straightforward, which is a nice break.

If you implement least recently used in hardware, you're going to have to do what we did in the example, which was search all the pages to see which was least recently used. If you actually have to implement something like this, you could use a timer or a counter on each page that keeps track of when it was last used. Whenever you reference a page, you would have to update that timer on that page to say, hey, it's been used recently, so your operating system essentially knows not to touch it. Then whenever you have to kick out a page, you have to scan through all the pages to find the one with the oldest clock, which would be slow. You might be able to speed it up a little bit by using, you know, a red-black tree or something like that, but it's still going to be fairly slow. And the worst part about this is that on every page reference you have to write to memory to update the clock that keeps track of it. If you do that for every single page access, which is essentially every single memory access on your machine, you're probably going to make it twice as slow. So that's not going to work.
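The per-page counter idea described above can be sketched in software to check the LRU walkthrough. This is a simulation under assumptions, not how any real kernel or hardware does it: `lru_faults`, `ACCESSES` and the `last_used` dictionary are hypothetical names, and the access index stands in for the timer.

```python
def lru_faults(accesses, num_frames):
    """Count page faults for LRU using a last-used timestamp per resident page."""
    last_used = {}  # page in memory -> index of its most recent access
    faults = 0
    for now, page in enumerate(accesses):
        if page not in last_used:
            faults += 1
            if len(last_used) == num_frames:
                # Scan every resident page for the oldest timestamp.
                victim = min(last_used, key=last_used.get)
                del last_used[victim]
        # Every single access, hit or fault, has to update the timestamp.
        last_used[page] = now
    return faults


# The access sequence used throughout the lecture.
ACCESSES = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
print(lru_faults(ACCESSES, 4))  # 8
print(lru_faults(ACCESSES, 3))  # 10
```

The two costs the lecture points out are both visible here: the write to `last_used` on every access, and the linear `min` scan on every eviction. With three frames LRU gets ten faults and with four it gets eight, so more memory does not hurt it on this sequence.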
It's going to be way too expensive. You could also be like, hey, I'm smarter than that: instead of doing just a counter, I could create a doubly linked list of pages that has a front and a back. Whenever I have a page reference, I move that page to the front of the list and say, hey, it is now the most recently used page. Then whenever I want to replace a page, I just pick whatever is at the back of the list, which is O(1). But the constant in this O(1) actually matters as well. For each page reference there are something like six pointer updates, and it also creates really high contention and a bottleneck if you have multiple processors. All those pointer accesses are going to have to be guarded with a mutex, so it's going to be all serial. If you have six pointer updates for every one memory access, it's going to be terrible, and even more terrible if you have to acquire a lock for it. So in practice, it's not going to work at all.

So what we're going to do is settle for an approximate least recently used, and that is generally okay because least recently used is an approximation of the optimal case anyways. An approximation of an approximation is exactly what makes good engineering. There are lots of different tweaks you could do to implement it more efficiently. Usually this one takes more time, but I guess we can end early. Next lecture we'll look at the clock algorithm, but there are also a few different variations. There's least frequently used, which is slightly different. There's a 2Q approach, and there's an adaptive replacement cache that we won't see. There's a bunch of different options, and generally, like scheduling, there's no right answer. It's just a series of trade-offs.

So any questions today, or is everything fairly straightforward? Generally page replacement is a nice little break. We have midterms today, so we can just wrap up. We are hella early. It's tomorrow?
Oh. Well, we can study, or we can take our time and start Lab 4, and you can probably finish most of Lab 4.

So for page replacement, across all the algorithms, there's only one metric you use, and it's the number of page faults, and that's what you aim to reduce. We saw the optimal algorithm, where you just replace whatever is not used for the longest time in the future. That's good for comparison, but obviously not realistic because we can't predict the future. Then there's random replacement, which we didn't see an example of because it's just random. It actually works surprisingly well. It's better than first in, first out because it avoids the worst-case scenario. FIFO is easy to implement, but it has that Bélády's anomaly, which is really weird, where reducing the memory can actually reduce the number of page faults, and there's that weird paper that says that's true. And then there was least recently used, which gets close to optimal but is expensive to implement.

So hopefully this is a nice break. Just remember: I'm pulling for you. We're all in this together.
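Random replacement never got a worked example, so here is a minimal sketch of what one could look like on the same access sequence. Everything named here (`random_faults`, `ACCESSES`, the seed value, the number of trials) is an assumption for illustration; the result depends on the random choices, so no single fault count is the answer.

```python
import random


def random_faults(accesses, num_frames, rng):
    """Count page faults when the evicted page is chosen uniformly at random."""
    frames = []  # pages currently in memory
    faults = 0
    for page in accesses:
        if page in frames:
            continue  # hit
        faults += 1
        if len(frames) < num_frames:
            frames.append(page)  # a free frame is still available
        else:
            frames[rng.randrange(num_frames)] = page  # overwrite a random frame
    return faults


# The access sequence used throughout the lecture.
ACCESSES = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
rng = random.Random(0)  # fixed seed (arbitrary) so repeated runs agree
trials = [random_faults(ACCESSES, 4, rng) for _ in range(1000)]
print("best:", min(trials), "worst:", max(trials))
print("average:", sum(trials) / len(trials))
```

On this sequence with four frames, any single run lands between the optimal six faults and FIFO's ten, since the first four accesses always fault and the next two always hit. Averaging over many runs gives a rough sense of where random sits between the two.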