All righty, welcome back to Operating Systems. OBS decided to crash, so that's fun. Today we get to talk about page replacement. Before that: midterm grades got released this morning, and it looks like some of you are already looking at them. If there are any possible mistakes on your questions, let us know. You can see the average; it's around 75.3, maybe creeping up to 75.9. We've got to start cutting grades somewhere. I'm joking, I'm not going to; it's fine, 75 is good. Someone got a hundred, and someone got around 20%, which is not great. Hopefully it wasn't too bad. The final will be more or less the same format. I didn't like the amount of writing, though, so hopefully the final will have less writing. Other than that, I thought it was marginally okay. Anyone disagree with that, or want to yell at me? If a question seemed questionable and you answered it right, talk to the TA; if the TA doesn't agree, the regrade goes up to me and I get to look at it. So if you can argue with me, great. All right, so now: page replacement. Another fun topic, and this will definitely be on the final, so we should probably pay attention.
All right, so: the computer memory hierarchy. It's all a trade-off of capacity and speed. On your CPU you have registers. There's only a limited amount of register space, but they're really, really fast. Below that we have the CPU cache, with L1, L2, L3 and all that fun stuff that may or may not be shared between cores. After the CPU cache there's memory, just RAM; that's where we have virtual memory pages and everything like that. Below that we'll have SSDs, non-volatile memory, the fast solid-state disks, and we might have slower SSDs that are just plain old flash storage. Below that we might have hard disk drives, the spinning magnetic things, and below that we might have tape drives. How many of you know what a VHS tape is? Probably like two of you. Tape drives are exactly what they sound like, although some of you may never have seen a tape. Think of an old-school movie reel, where things are written on the film; that's what tape drives are. They can store terabytes and terabytes of data, and people like Google will actually use tape drives to store large amounts of data they don't need to retrieve very fast. A tape drive can hold several terabytes and it's dirt cheap.
It's just really, really, really slow. You wouldn't use it for everyday work, but if you're Google and you're archiving insane amounts of data, you might use tape drives. For us, we're probably only used to these levels of the pyramid: the CPU cache on the CPU, which we can't really directly manipulate, then memory, then our disk. What we want, ideally, is to hide this memory hierarchy from the user. Each level pretends it has the speed of the layer above it and the capacity of the layer below it, because there's a cost-and-speed trade-off as you go up and down the layers. We also want to maintain the illusion that if processes use more memory than is actually physically on your device, they should still be able to run. And you can maintain that illusion. That's why, even if you only have eight gigs of RAM or something, you can open up a web browser and it'll say it's using up to ten gigs of RAM, which is more than you physically have, but it's not running out of memory.
It's still working somehow. The idea behind this is that we can use the disk to expand our memory. If a program is actively using some pages, we keep them in memory; otherwise, if we've context switched away and that program is not actively running, we can move some of its pages to disk, and whenever we context switch that process back in, maybe we have to bring those pages back from disk into memory.

That's what demand paging is: we use memory as a cache for the file system. The operating system wants to be lazy in order to be quick. For instance, if you try to open a file, it might just set up the page tables to map that file into memory, kind of like what we saw with that mmap call, and skip reading the actual device until you actually use that memory. Whenever you try to use that memory, it generates a page fault, and then the OS can go ahead and read it into memory from disk.

The flip side is that a page we're keeping track of might not correspond to anything on disk. If we want to create the illusion that our processes can all use more memory than is physically on the device, and a page doesn't actually map to a file on the hard drive, we can map it to a special file or partition called swap space. That means if we want, we can take a page of virtual memory and move it to disk. On Windows this lives in the root C:\ directory; you'll have something called a swap file, or page file it might be called. So if you run out of physical memory, the OS will take some pages, write them to that swap space, and free up other pages in memory to use for active processes.
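To make the lazy-mapping idea concrete, here's a minimal Python sketch using the standard `mmap` module (which wraps the same system call). This is a toy demo, not kernel code: the kernel sets up the mapping up front, and data only needs to be read from disk when a page is actually touched. The scratch file is created just for the demo.

```python
import mmap
import os
import tempfile

# Create a scratch file spanning a couple of 4 KiB pages.
fd, path = tempfile.mkstemp()
os.write(fd, b"A" * 8192)
os.close(fd)

with open(path, "rb") as f:
    # Map the whole file. The kernel fills in page table entries but,
    # being lazy, need not actually read any data from disk yet.
    m = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    # Touching a byte may trigger a page fault, at which point the
    # kernel reads that page in from disk and resumes the process.
    first, last = m[0], m[8191]
    m.close()

os.remove(path)
```

From the program's point of view the fault is invisible; it just sees the bytes it asked for.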
So that's a way to maintain that illusion. Each process has a working set: given a window of time, the set of pages your process uses during it is called its working set. Ideally, from when you context switch a process in to when you context switch it out, all the pages in its working set are already in memory, so you don't have to go get them from disk, which would be much slower.

If you can't fit your working set into physical memory, your process does something called thrashing. Thrashing is when you use your pages, get a page fault, and the OS has to read the page in from swap space on disk, which is really slow; then whatever was swapped out to make room, you're now trying to use again, so it has to swap in and out again. Thrashing is basically constantly swapping pages in and out of memory, and it's really, really slow; you don't want that. If your computer suddenly slows to a crawl and you're at the limit of virtual memory, then your processes are likely thrashing: they're swapping in and out of disk, and it's just really slow.

In order to deal with that swap space, and figure out what should be in memory and what should be in the swap file, we need some good old-fashioned algorithms, called page replacement algorithms. The optimal thing to do, given a bunch of memory accesses that a process makes (we'll just assume one process and one core), is to replace the page that won't be used for the longest time. The page I send to the disk, which is slow, is whichever one is not going to be used for the longest time; that way I maximize the number of cache hits. But that's not going to be practical, because you would need to know all the accesses beforehand; you'd need to see into the future, which isn't practical.
We use optimal just to evaluate our other page replacement algorithms. Another strategy is called random, and as the name implies, instead of making a smart decision about which page to send to disk, we just pick one at random and hope we made a good decision. After that, the first real algorithm we usually go for is first in, first out: if I keep track of the order pages came into memory, I just replace the oldest page, the one I brought in first. And in the case of least recently used, I replace the page that hasn't been used for the longest time. That's looking at the past to predict the future, which is something we can actually do.

Whenever we evaluate any of our page replacement algorithms, we typically say how many pages physical memory can hold, then give a list of accesses to pages. You assume these pages all start out swapped to disk, and you count the number of page faults that happen and the swaps you have to make. We'll use this exact sequence for a bunch of examples in the slides, just to keep things consistent so we can compare. Unlike scheduling, where we kept track of the number of context switches, the waiting time, the response time, and all that fun stuff, all we care about at the end of the day is the number of page faults, because a page fault in this context means we have to read something from disk, and that's slow; that's the main thing we're concerned about. In these examples we're assuming a single process is running and accessing pages, so on a page fault we just assume we're reading in a page that's already on the disk, and it's slow.
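As a quick sketch of this evaluation setup, here's random replacement in Python (a toy, single-process model, not how a kernel would do it): count a fault whenever the accessed page isn't resident, and evict a random victim when memory is full. The reference string is the one used throughout the lecture's examples.

```python
import random

def random_faults(refs, frames, rng):
    """Count page faults with random replacement."""
    memory, faults = [], 0
    for page in refs:
        if page in memory:
            continue  # page hit: nothing to do
        faults += 1
        if len(memory) >= frames:
            # Evict a victim chosen uniformly at random.
            memory.pop(rng.randrange(len(memory)))
        memory.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
faults = random_faults(refs, 4, random.Random(2024))
# Lands somewhere between optimal (6 faults on this string) and
# faulting on every access (12), depending on the luck of the draw.
```

Run it a few times with different seeds and you'll see the spread; that spread is exactly why random is only a baseline.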
We don't have to worry about other processes for this. It's the same idea anyway; since these are all physical pages, different processes would just use different physical pages.

All right, so for this you typically draw something like this. Assume our physical memory holds four pages. I draw a box for every access, and in the box I put what pages are in memory. If there's a page fault and we just brought in a page, I put it in red, and on top of the box I write whatever page we were accessing. Initially we assume that all of these pages are already on disk for whatever reason, and we need to bring them into physical memory. So if we try to access page one, it's not in physical memory; we have a page fault and we have to read it in from disk. We read it in, put it in red to indicate there was a page fault, and now our memory has page one in it. Any questions about that setup? We're going to use it the whole time, so hopefully it's straightforward; get used to it. You'll probably have to do it on the final. Not a hint or anything.

All right, now we access page two; all of the first four accesses are going to be boring. Page two isn't in physical memory, so we have to read it in from disk. We have room for four pages in physical memory, so there's enough room for it. We read in page two, so pages one and two are now in physical memory. All good. Everyone on the same page? Hell yeah. That's why I teach this course, just so I can make that terrible joke. All right, I'm done. Same thing for page three: not in physical memory, page fault, read it in. Same for page four. All right, what's going to happen if I try to access page one? Nothing, right? It's already in physical memory; I'm all good.
You could essentially say that's a page hit. I don't have to do anything, so I keep my box the same; nothing is in red because I didn't read anything back in from the disk. What about the access to page two? Nothing again; it's already there. All right, what about page five? This is where it becomes interesting. What do I do now?

Yeah, in this case I need to pick something to kick out so I can read something in. Depending on the algorithm, I could pick page one, two, three, or four. Here we're trying to do the optimal thing, and the optimal thing is to maximize the number of page hits, so I get to look into the future and replace the page that isn't going to be used for the longest period of time. Should I replace page one? No, because I'm going to use it next. Should I replace page two? No, because I'm going to use it after that. Page three? No. So I'm left with page four, because I've got to kick out something, and page four is the one used furthest in the future, so I kick it out. If a page is not in physical memory, I have to bring it into physical memory, so I have to make a decision right now about which page to kick out to bring in page five. In this case kicking out page four happens not to matter much at the end, but the rule for optimal is: maximize the number of page hits, so whatever is used furthest in the future, you kick out.

This is just optimal; I can't actually implement this. Unless, you know, you throw machine learning at it and try to predict; if you want to do operating systems research, you can do that. All right, we said we're kicking out page four. Kick out page four, boom: we replace page four with page five.
Page five goes in red; that was another page fault that we resolved. Now, because we did the optimal thing, if we access page one, that's a hit; two, that's a hit; three, that's a hit. What happens now if we access page four? Page fault, and I have to kick something out. What should I kick out: one, or two, or could I kick out three? At this point it doesn't matter; I could kick out one, two, or three. You know what, I hate the number one, so I'll kick out one, as long as we don't kick out five, because we're using it right next.

So in this case, how many page faults do we get? Six. It's easy: we just count the number of red numbers. We got six page faults; remember that so we can evaluate the other algorithms against it. This is optimal, so nothing should do better than this.

And for this, generally you just assume you move one page at a time. On a real system, if you wanted to, you could move multiple pages at a time or do things in the background. What most operating systems do is something called prefetching: they bring pages in when they're not doing anything useful anyway. But typically this is slow, so you don't want to do, say, three at once; you want to do one and get back to work as quickly as possible.

And no, if you get a question like this on the final, it'll be: tell me at each step what's in memory and the number of page faults. The colors are just for the slides because they make things clear. On the midterm or final, if you want to circle them or whatever, that's fine; or just not mark them and write the correct answer, in which case, if you're wrong, we can't do anything to help you. But if you write the wrong number of page faults and show the steps, we can follow it.
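The optimal (Belady's MIN) walk-through above can be checked with a short Python sketch: on a fault, evict the resident page whose next use lies furthest in the future, or that is never used again. This is only a simulator for counting faults, not something an OS could run, since it reads the whole future reference string.

```python
def optimal_faults(refs, frames):
    """Belady's MIN: evict the page whose next use is furthest away."""
    memory, faults = [], 0
    for i, page in enumerate(refs):
        if page in memory:
            continue  # page hit
        faults += 1
        if len(memory) < frames:
            memory.append(page)  # free frame available
            continue
        # Victim: the resident page not used again, or used furthest
        # in the future (pages never used again rank past the end).
        future = refs[i + 1:]
        victim = max(
            memory,
            key=lambda p: future.index(p) if p in future else len(future) + 1,
        )
        memory[memory.index(victim)] = page
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
# optimal_faults(refs, 4) gives 6, matching the hand-drawn boxes.
```

Ties (several resident pages never used again) can be broken arbitrarily without changing the fault count.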
We can at least see if you knew what you were doing; that's generally the thing with grading. If you want to write really, really, really short answers, be sure you're right.

All right, let's do the same thing with FIFO. Can I skip the first four accesses, since they'll be the same old boring thing? Access one, bring it into memory; two, bring it in; three, bring it in; four, bring it in. Now we access page one. What should happen here? Hit, we're all good. We're accessing page two next; this should also be a hit, all good. All right, now we're accessing page five. If we're doing first in, first out, what should I kick out? One. We're already experts at this, so I kick out page one and bring in page five. Oh, but now I'm accessing page one. What do I do now? Kick out two; we will see that this is also a poor decision. The oldest now is two, so I kick out two and bring in one. Now I'm accessing page two; you can see that was a bad decision. What do I kick out now? Page three. I replace page three with page two. Now I access page three, another poor decision on my part. What do I kick out? Four, right. I kick out page four and replace it with three.

And now I'm accessing page four. You can kind of see the pattern with first in, first out: your page faults look like this waterfall thing, all diagonal lines, so if you want to do it really quickly you can look at it like that. With my diagonal hack, if I'm accessing page four, what do I need to kick out? Five, right; the diagonal line has wrapped back up to the top, so I kick out page five. All right, now I'm accessing page five. What do I kick out? Page one.
So that was a pretty poor attempt, right? How many page faults did I get there? Ten. Pretty terrible.

All right, now I have a question for you. Everyone agrees that the more memory you have, the fewer page faults you should have, right? Because the more things I can fit in memory at a time, the better; that would make sense. So if I can hold five pages in memory, I should have fewer page faults than with four. Everyone agrees with that. And the other side: if I can only fit three pages in memory, I should have more page faults than this, right? Let's keep that in mind and shrink memory. Instead of four pages, we'll only have three.

So let's do it again. The first three accesses are going to be the same thing; I have to load them into memory, because I'm assuming memory starts empty. Now what happens when I access page four? You can already see this looks pretty bad. What am I going to replace? Page one. Wow, that was a bad decision, because now I'm accessing page one. What should I replace? Two. Yeah, this certainly seems like it's worse. Now I'm accessing page two. What am I going to kick out for page two? Three. So now I have four, one, and two. Now I access page five. What do I kick out for page five? Four. All right, we are experts. Now I access page one. So far I've page faulted every single time; not great. But page one, that's a hit, right? All good, don't have to do anything. Now I access page two; that's also a hit. Now I access page three. What do I kick out? Page one again, so page one is out of here. Now I access page four. What do I kick out? Page two. Now I access page five, and that's still in memory. How many page faults do I have here? Nine.
So you guys just lied to me. Before, when we could fit four frames in memory, we had ten page faults. Now we can only fit three, and we have fewer page faults; it's better. Why did you lie to me? This was lucky? Well, this shows you that if you want to optimize the performance of your computer, just take out a stick of RAM and it'll go faster.

All right, so this actually has a name. It's Belady's anomaly, and it says that if you have more page frames, you can actually get more page faults, which is an anomaly because it's not what you'd assume would happen. This is only a problem with FIFO; it doesn't exist for least recently used or the other algorithms we will look at. In fact, if you want to read a math paper: in 2010 some people who wrote a bunch of Greek letters I don't understand anymore published a paper saying that you can construct an arbitrary sequence of page accesses to get any page fault ratio you want if you use FIFO. For other algorithms it behaves as you'd expect: increasing the number of page frames decreases the number of page faults. But for FIFO, and FIFO only, there's this silly anomaly where that is not true, and making memory smaller might actually make it better in terms of page faults. Might. If you're a math person, you can construct sequences that get whatever ratio you want, and make it so that the smaller size is always better than the bigger one; it depends on the sequence. If you actually implemented this, it would be up to luck, but if you want to theorize the hell out of it, you can do whatever you want. Fun little anomaly. That's about as mathy as we get. Any questions about that?
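Belady's anomaly on this exact reference string is easy to reproduce with a FIFO sketch in Python: four frames give ten faults, three frames give nine, exactly as in the boxes drawn above.

```python
from collections import deque

def fifo_faults(refs, frames):
    """Count page faults with first-in, first-out replacement."""
    memory, queue, faults = set(), deque(), 0
    for page in refs:
        if page in memory:
            continue  # page hit
        faults += 1
        if len(memory) >= frames:
            # Evict the page that has been resident the longest.
            memory.remove(queue.popleft())
        memory.add(page)
        queue.append(page)
    return faults

refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
# fifo_faults(refs, 4) is 10, fifo_faults(refs, 3) is 9:
# fewer frames, fewer faults -- Belady's anomaly.
```

Note the eviction order depends only on arrival time, never on recency of use; that's the root of FIFO's bad decisions.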
All right, good, because I can't answer math questions. Now we can do the same thing, and this will probably get boring, with least recently used; in the case of a tie we would fall back to FIFO. We won't consider ties here, because a tie would come from multiple processes running in parallel, and I'm assuming a single process running on a single processor, so I don't have to worry about that.

Same thing: I have my four initial page accesses that I have to bring into memory, so one, two, three, four. Then I access page one: that's a hit, I don't have to do anything. Then page two: hit, do nothing. Now I access page five, and I need to replace one of the pages. Page three. Yeah; I have to replace one of pages one, two, three, or four, and since I'm using least recently used, I look backwards in time and assume that whatever I used most recently, I will probably use again. Think of a program's variables: you generally access the same variables over and over again. So I look at the past, and whatever I've used most recently, I won't get rid of. I just used page two, so I'm not going to get rid of page two. I'm also not going to get rid of page one, because I used that right before it, and I will also not get rid of page four, because I used that just before page one. The last page standing is page three, and I get rid of it. Any questions about that? Nice and straightforward, unlike, you know, multiple processes and threads.

So we replace page three with page five, and in this case, hey, it turned out we were right, because we access page one again; that's a hit. Then we access page two: that's a hit. Now we access page three, and we've got to get rid of one of one, two, five, or four. What do we want to get rid of?
Four. If we look backwards from three, we don't want to get rid of five, one, or two, so we get rid of four, which in this case was a poor decision, because now I access page four. What do I get rid of? Five. Another poor decision: I get rid of five for four, and then I access page five. In this case, what would I get rid of? Page one. All right, so how many page faults do I get in this case? Eight. Better than both FIFO runs, not as good as optimal; if it were better than optimal, that would be weird. So this looks good.

This is probably more like how it's actually implemented, right? Is there any problem with this if you were to actually implement it? Yeah: if you did one, two, three, four, five, one, two, three, four, five, over and over, and you can only fit four physical frames in memory, you're just constantly getting page faults. But without knowing the future, it's the best I can do; at that point, if I bought some more RAM and could fit five, I'd be golden.

All right, so in this case, how would you implement least recently used? A doubly linked list plus a hash map, or just a doubly linked list. Would that be a great idea? If you keep track of least recently used that way, then whenever you access a page you have to mess with a linked list. So you read a single variable, you read an int, and then you have to update a linked list. Does that sound good? What do I have to do to update a linked list? I have to read a pointer, change a pointer, write a pointer; a bunch of memory accesses. Sorry? A splay tree. Yeah, you could do that, but no matter what structure you use and how optimal it is, as long as every int access costs you more than one extra memory access, you're pretty screwed.
It's probably way too slow: over half the time, even when you just read an int, would be spent on that bookkeeping. Yeah, a counter for each page? That's a bit better; we'll go into that in the next lecture. For now we just have to argue that actually implementing exact LRU is pretty bad.

I could implement it with a counter for each page, like a timestamp: each time I reference a page, I save the system clock into its counter, and for replacement I scan through all the pages and find the one with the oldest clock. That sounds pretty bad. The scan is O(n), and I'd be updating the clock on every reference, whenever I read or write a single byte. Also very bad, also very slow. Or, if you implemented it in software without a hardware counter to lean on, you could do the classic computer science answer and create a doubly linked list of pages: for each page reference, you move that page to the front of the list, and to replace, you just remove whatever page is at the back of the list. But even updating that list requires something like six pointer updates every time you reference a page. And if multiple processes are all running at once, well, guess what: you're sharing a linked list, and now you have data races on that linked list. You'd have to do what you're doing in lab 5 and put a lock around it, which makes it even slower. So probably not a good solution in general. Any questions about that, or why?
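For what exact software LRU would look like, here's a sketch in Python using `OrderedDict` (which is a hash map over a doubly linked list, the same structure discussed above). It reproduces the eight faults from the hand run; the point is that every single reference does a hash lookup plus several pointer updates, which is exactly the per-access cost that makes this a non-starter inside a kernel.

```python
from collections import OrderedDict

class LRUPages:
    """Exact LRU over a hash map + doubly linked list.

    Fine for a cache you consult occasionally; hopeless if it had to
    run on every load and store a program makes.
    """

    def __init__(self, frames):
        self.frames = frames
        self.resident = OrderedDict()  # keys ordered LRU -> MRU

    def access(self, page):
        """Return True on a page fault, False on a hit."""
        if page in self.resident:
            self.resident.move_to_end(page)  # relink node to MRU end
            return False
        if len(self.resident) >= self.frames:
            self.resident.popitem(last=False)  # unlink the LRU victim
        self.resident[page] = True
        return True

mem = LRUPages(4)
refs = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
faults = sum(mem.access(p) for p in refs)  # 8, same as the hand run
```

Even hits do work here (`move_to_end` is a handful of pointer writes), which is why real systems approximate LRU instead of implementing it exactly.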
Clearly this isn't going to work, but least recently used still seems like a good idea. We can't implement least recently used in any efficient way that is actually practical, so we do what we always do when we want to make software go faster and can't implement the ideal as fast as we'd like: we change the problem. Instead of implementing least recently used, we implement something that kind of looks like least recently used; we just approximate it. And it seems like an okay thing to do, because least recently used is itself an approximation of optimal: I can't read the future, so I approximate it with LRU, and now I'll approximate the approximation. That's pretty much what our computing life is built on.

There are lots of different tweaks you could do to implement it efficiently. Specifically, next lecture we'll be looking at something called the clock algorithm. There's also least frequently used, which is slightly different and less fine-grained; there's something called 2Q; there's adaptive replacement cache, ARC. We won't get into detail, but it's the same idea as scheduling: there's no perfect answer in general. In this case there is an optimal answer, but you can't see into the future, so it's going to be a whole bunch of trade-offs, just like scheduling.

So, any questions before we wrap up? Otherwise we can end early on this gloomy, crappy day and enjoy the not-weather. Yeah: we saw optimal, which is good for comparison but not realistic. And then random? Well, it turns out random is actually better than FIFO, because FIFO generally does just about the worst thing and has that weird anomaly. The idea behind random, and why it works surprisingly well sometimes, is that since it's random, it's less likely to consistently make a very poor decision.
Sometimes you'll make a good decision, sometimes you'll make a bad decision; FIFO typically makes poor decisions all the time. Yeah, random in this case is completely, 100% random, although of course, if you actually wanted to implement it, you could make it random with a bias. And yeah, if you run it enough, the average case would be okay too.

Also, remember that these are page faults for people's programs, and it depends on how each program is written and how it accesses memory. What if you wrote a ridiculous program that just uses random pages all the time? Then random replacement might actually be good: if the program's random accesses meet up with the replacer's random choices, fine. And if some program uses completely random pages, least recently used isn't going to work very well either; it's probably going to be slower than just picking a number and making a quick decision. So it depends, because at the end of the day the operating system has to run your programs, and you can write some really, really poor software. Hopefully you don't. Generally software should use memory that's close together and all that fun stuff, but you can write a very non-performant program that acts like what we saw before with the TLB example: if we access memory on a different page every time, we can make the program go something like 30 times slower. So if you want job security, just spread out your variables, and then once they tell you to make your thing go faster, move them closer together. Boom, it runs 30 times faster, you look like a genius, and they probably won't notice the difference. More fun job security tips. Don't actually do that, I will probably get yelled at, but technically you could.

All right, so to wrap up: least recently used gets close to optimal but is expensive to implement. We'll see an approximation in the next lecture. Remember, I'm pulling for you.