Welcome to the 13th lecture in the course Design and Engineering of Computer Systems. In this lecture we are going to continue our discussion on memory management and paging, and we will study a topic called demand paging. So let us get started. The mental model we have built so far is this: every process has a virtual address space that is divided into fixed-size chunks called pages, and for every page, to store the contents of those virtual addresses, the OS allocates a physical frame, and this mapping is kept track of in the page table. But that picture is only an approximation; it is not exactly correct. The truth is that the operating system does not allocate memory for every page of the process up front; it allocates memory only on demand. That is, operating systems provide what is called virtual memory. You can say: okay, I have these addresses, and in these pages there is code, data, stack, heap; all of these addresses are assigned to your program. But until you use them, the OS may not allocate actual physical memory for you; until the CPU is actually running that code, you may not get a physical frame. So the OS allocates physical frames to pages only on demand, when you access the memory contents; until then, you may not be given a physical frame. And if you are not using some physical memory, the OS can also take it away from you: even though you think there is code, data, stack, or heap stored in that page, the physical frame can be reclaimed even after it has been given to you.
This concept is called virtual memory. Therefore, in your page table, for some pages, even though the page is valid and the programmer thinks there is valid data stored in it, a physical frame may not be allocated, and the page can be marked as not present in RAM until you use it. Of course, once you begin using it, once the CPU says "get me this data", it has to be in RAM; but until then it may not be, and the page table entry can be marked as not present. When such a page is accessed, the MMU will trap to the OS, because with no physical frame number the MMU cannot translate the address; this particular trap is called a page fault, and then the OS will allocate a frame. Suppose you think you have some data stored in a page over here: the page table entry marks the page as valid, you think you can use those virtual addresses, but there is no physical memory allocated corresponding to this. When the CPU accesses a virtual address in that page, it traps to the OS; the OS says, okay, if you are using it, fine, I will give you physical memory, the contents can be placed in it, and then the translation can happen. Because memory allocation is done on demand, the total virtual memory of all the processes in the system can actually be more than the physical RAM you have; the OS overcommits memory. This allows the OS to run many more processes than there is actual memory in the system, because every process is typically using only a small amount of its memory at a time: if you are executing some code over here, you do not need physical frames for all the memory you are not touching. So this is an optimization that modern operating systems do in order to support many more processes on a limited amount of RAM. Now, how is this memory allocation done on demand? Let us understand that in more detail.
Before that, we need to classify the pages in the memory image of a process into two types. One set of pages are what are called file-backed pages: these pages contain data from files on disk. For example, a page containing the code of your executable — that data is already also there on disk; you simply copy the disk contents into the page. On the other hand, there are some pages which do not have any copy on disk, for example the pages of your stack. The stack you are using at runtime, as you are pushing and popping arguments, does not correspond to any file on disk; such pages are called anonymous pages. So in your memory image, some pages are file-backed — your executable code, your language library code, operating system code; there are many such pages — and some pages are anonymous, not file-backed. For these two kinds of pages, on-demand allocation works differently. For file-backed pages, how can on-demand allocation happen? Suppose you have created a process, and some page of the process is file-backed: it corresponds to the code in the executable. When the process is created, the OS will create the virtual address space and a page table entry saying, okay, from address 0 to some 4 KB, this is your code. The CPU will start to execute this code — the program counter points to some address over here — all of that will happen, but until the CPU says "give me the instruction at this location", the OS will not actually allocate a physical frame for this page.
When the CPU asks for that code, the OS copies it from disk into some free physical frame and adds a page table entry from that page number to that frame number. So the code is fetched from disk only when it is accessed for the first time, not when the process is created. At process creation, the OS does not hand out a bunch of physical frames and copy all the code and data into them; only when the CPU starts to run and asks for an instruction at some address does the OS quickly find a free physical frame, copy the instructions into it, add the page table entry, and let the CPU access it. So file-backed pages can be fetched from disk on demand. And if the OS is running low on memory and wants to reclaim some physical memory from you, it can simply take away such a frame: the contents are on disk anyway. It removes the page table mapping, marks the page as not present in RAM, and the physical frame is reclaimed easily when it is not in use by the process. The next time the process wants to run this code, the OS will once again allocate a physical frame and recreate the page table mapping. That is on-demand allocation for file-backed pages. Now what about anonymous pages? Suppose you have a stack with some data on it that is not on disk. The OS does not give the stack an empty page when the process is created; instead, when the process accesses the stack for the first time, the OS allocates an empty page for it.
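The file-backed demand-loading steps above can be sketched as a toy model. All the names here (handle_fault, access, the DISK dictionary) are invented for illustration; this is not a real kernel API, just the logic of "fault on first access, then map and reuse":

```python
# Toy model of demand paging for file-backed pages; names are illustrative.

DISK = {0: "instr-A", 1: "instr-B"}   # page number -> file contents "on disk"

def handle_fault(page_table, page, free_frames, ram):
    """Allocate a frame on first access and copy the file contents into it."""
    frame = free_frames.pop()          # find a free physical frame
    ram[frame] = DISK[page]            # copy from disk into the frame
    page_table[page] = frame           # add the page table mapping
    return frame

def access(page_table, page, free_frames, ram):
    """Translate page -> frame, faulting the page in if it is not present."""
    if page not in page_table:         # present bit clear: page fault
        handle_fault(page_table, page, free_frames, ram)
    return ram[page_table[page]]

page_table, ram, free_frames = {}, {}, [7, 3]
print(access(page_table, 0, free_frames, ram))   # first access: fault, load from disk
print(access(page_table, 0, free_frames, ram))   # second access: no fault, same data
```

Note that the second access consumes no free frame: once the mapping exists, translation succeeds without OS involvement, which is exactly why demand loading pays off only on the first touch.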
Later, once the stack has been given a physical frame and some contents are stored in it, if the OS wants to reclaim that frame it cannot simply take it away: there is data in that RAM, and it would be lost. So a modified — also called dirty — anonymous page cannot simply be reclaimed; taking the memory away would lose the data. In such cases, if the OS wants to take away some of the physical memory given to you, it can store a copy of the frame in a special area of disk called the swap space. That is, you have certain physical frames storing the contents of anonymous pages — contents that exist nowhere else on disk — and the OS wants to free up that memory; the contents of such a frame are written to disk, into the swap space, and now the frame is free: it can be erased and given to store some other content for some other process. So dirty anonymous pages are written to the swap space when the OS wants to reclaim their memory, and when the process accesses the stack again and wants those contents back, the OS reads them from the swap space and gives the process its memory page again. In this way, on-demand allocation is done. Now, the OS has to keep track of all of this: does this page have corresponding physical memory or not? Has the allocation been done yet or not? There are some extra bits in the page table entry that help keep track of these things.
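The reclaim-and-swap-in round trip for a dirty anonymous page can also be sketched as a toy model (again, all names here are made up for illustration, not a real OS interface):

```python
# Toy model of reclaiming a dirty anonymous page; names are illustrative.

SWAP = {}                    # swap space on disk: page number -> saved contents
page_table = {5: 2}          # page 5 is mapped to physical frame 2
ram = {2: "stack-data"}      # frame 2 holds dirty anonymous stack data
free_frames = []

def reclaim(page, dirty):
    """Free the frame backing `page`, saving its data to swap if dirty."""
    frame = page_table.pop(page)   # remove the mapping: page now "not present"
    if dirty:
        SWAP[page] = ram[frame]    # dirty anonymous data must be saved to swap
    del ram[frame]                 # erase the frame contents
    free_frames.append(frame)      # the frame is free for reuse

def swap_in(page):
    """On the next access, bring the saved contents back from swap."""
    frame = free_frames.pop()
    ram[frame] = SWAP.pop(page)
    page_table[page] = frame

reclaim(5, dirty=True)       # OS reclaims the frame; data goes to swap
swap_in(5)                   # process touches the stack again: swapped back in
print(ram[page_table[5]])    # the stack data survived the round trip
```

The key point the sketch shows is the asymmetry: a clean file-backed frame could just be dropped, but the dirty branch must pay a disk write before the frame is reusable.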
Look at a single page table entry (the page table has many such entries in an array). It has space for the physical frame number, and it also has a few extra bits of information. One bit is the valid bit, which indicates whether the virtual addresses in this page are in use by the process: if the process is storing anything — code, data, stack, heap — at those virtual addresses, then those addresses are in use and the valid bit is set; otherwise it is not set. Then there is the present bit: even for valid pages, pages holding valid data, the OS may not yet have assigned a free physical frame. The present bit indicates whether a physical frame number is assigned to the page. Note that these are two different things. If a process accesses a page where the valid bit itself is not set — say you have your heap, all these addresses belong to it, this is the end of the heap, and you access an address beyond it, beyond your array — that is an illegal memory access, like a segmentation fault. On the other hand, for some valid pages there may not be any physical memory allocated yet: the valid bit is set but the present bit is not. In such cases too the address translation cannot be done, and the MMU traps to the OS; but this is not the fault of the programmer — the programmer is accessing correct memory; the OS has just not yet allocated physical memory there. In such cases, when these pages are accessed, the OS allocates memory on demand.
So these two are different: understand the difference between the valid bit, which covers all the virtual addresses the process thinks it can access, and the present bit, which indicates whether actual physical memory backs those valid addresses. With demand paging, the set of valid pages and the set of present pages are different. There are also other bits in every page table entry, set by the MMU, which indicate to the OS how the page is being used — the OS does not know, because it is not involved every time a memory page is accessed. So the MMU records some extra information for the OS: if a page has been written to, the MMU sets a dirty bit; if the page has just been accessed, read or write, the MMU sets an accessed bit. With this, the OS can know: this is a dirty anonymous page, this is an unmodified page, and so on — which is why these bits are set by the MMU. That is all the extra information, in addition to the frame number, present in a page table entry. Now let us see what happens on a page fault. What is a page fault? Any time the MMU cannot translate an address, it raises a trap: the CPU jumps to OS code, like the int n or trap instruction we have seen before. So when the MMU cannot translate a virtual address to a physical address, it traps to the OS with a page fault. This can happen for many reasons. One could be some kind of illegal access: the process is accessing memory it should not be — for example, in user mode you try to access a kernel (OS) virtual address, or you try to write to a read-only page; in any such case there is an illegal access.
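The bits described above can be pictured with a small sketch. The bit positions here are invented purely for illustration; real layouts (for example the x86-64 page table entry format) place these bits differently:

```python
# Illustrative page table entry layout; bit positions are invented, not x86.
VALID    = 1 << 0   # virtual addresses in this page are in use by the process
PRESENT  = 1 << 1   # a physical frame is currently assigned
DIRTY    = 1 << 2   # set by the MMU when the page is written
ACCESSED = 1 << 3   # set by the MMU on any read or write
FRAME_SHIFT = 4     # remaining bits hold the physical frame number

def make_pte(frame, flags):
    """Pack a frame number and flag bits into one page table entry."""
    return (frame << FRAME_SHIFT) | flags

def frame_of(pte):
    """Extract the physical frame number from an entry."""
    return pte >> FRAME_SHIFT

pte = make_pte(42, VALID | PRESENT)   # OS maps the page to frame 42
pte |= ACCESSED | DIRTY               # MMU later records a read and a write
print(frame_of(pte), bool(pte & DIRTY))
```

Packing flags into the low bits like this works because frame numbers address page-aligned memory, so the low bits of the frame field are free to carry metadata — the same trick real hardware page tables use.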
There could also be invalid accesses: the MMU looks at a page table entry and finds that the valid bit is not set — the virtual address is not even in use by the process, yet somehow the CPU is requesting it. And sometimes the address is valid — the program thinks it has put some code or data at that address — but the OS has not yet allocated a free physical frame, has not yet copied the content into memory from disk or wherever; when the MMU translates the address, it finds the valid bit set but the present bit not set, because the on-demand allocation has not yet been done by the OS. These are all the possible reasons the MMU's address translation can fail while it is walking the page table, trapping to the OS with a page fault. How does the OS handle these cases? If the program is making an illegal or invalid access, the OS can simply terminate the process — there is a segmentation fault, your program crashes. But in the case where it is not the fault of the programmer — the valid bit is set but the present bit is not — the OS will service the page fault and try to allocate a free physical frame to the process on demand. Let us understand this case in a little more detail. You have the virtual address space of a process, and the CPU has requested some virtual address that has not yet been assigned a memory frame by the OS. This is a valid virtual address — the program thinks it has some code or data or stack over here — but no memory has been allocated; or maybe memory was allocated by the OS in the past, but when it needed to free up some memory it copied the contents to disk, into the swap space, and now there is no memory corresponding to this virtual address.
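The OS's decision on a fault, based on the two bits, can be sketched as follows. This is a deliberate simplification: a real kernel also checks permission bits, copy-on-write status, and more, and the names are made up:

```python
# Classifying a page fault; a deliberate simplification of what a kernel does.
VALID, PRESENT = 1 << 0, 1 << 1

def classify_fault(pte):
    """Decide how to handle a fault from the entry's valid/present bits."""
    if not pte & VALID:
        return "segfault"        # invalid access: terminate the process
    if not pte & PRESENT:
        return "demand-alloc"    # valid but not present: service on demand
    return "other"               # e.g. a permission violation (illegal access)

print(classify_fault(0))                 # valid bit clear
print(classify_fault(VALID))             # valid, not present
print(classify_fault(VALID | PRESENT))   # translation existed; some other cause
```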
In such cases, when the page fault happens, the OS will find a free physical frame and assign it to this page. Where does a free physical frame come from? The OS usually maintains a free list of physical frames — empty locations in RAM that nobody is using — and takes a frame number from it to give to this page. Sometimes, though, all of memory is full and there is no free physical frame at all. Then what do you do? The OS will take a physical frame away from some other process's logical page and give it to this process; that is called evicting a victim page. That is, some other process has the contents of one of its logical pages — certain virtual addresses — stored in this physical frame, and the frame number appears in that process's page table. The OS can take the frame away from that process: if the page holds modified anonymous data, it writes it to the swap space; if it is file-backed data, it can simply erase it, since a copy exists on disk. Whatever it is, the OS takes this physical frame away from some page of some other process and uses it to handle the page fault. So how do you pick this frame — which poor guy do you make your victim? There is a policy: every OS has a page replacement policy that tells it which page is the best to pick. We will see what these are in a little bit.
So the OS looks up its page replacement policy, finds a victim page, and frees up that physical frame — it may have to write it to swap space if some other process has stored modified anonymous content there. Now that the frame is free, the OS fills it with the right data: if this page must hold the code of the faulting process, the contents come either from the file or from swap space; if it is meant to be an empty page, well and good, the OS just cleans the frame up and gives it to the process. Otherwise, if the page had content before, it must be read from disk — so servicing the fault may require first a write to disk (the victim page's contents, to swap space or elsewhere) and then a read of the needed content into this physical frame. Once the frame is ready with its content, the OS adds a page table entry from this virtual address to this physical frame — updates the page table mapping — and tells the MMU: now please try this address translation again; it will work, because the translation has been added to your page table. Then the process execution is restarted. This is called servicing a page fault: it involves identifying a free frame, possibly by evicting its current contents to disk, reading the contents of this page from disk if required, updating the page table mapping, and restarting the process. All of this is what is called a major page fault. In contrast, sometimes you have minor page faults: the needed frame is already in memory — maybe it is a shared library frame, or a frame holding OS content — and all you have to do is add a new page table mapping to it. You do not have to read anything from disk; the content is there, you just add the mapping.
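Putting the steps together, a highly simplified fault-service routine might look like this. Every name and structure here is invented for illustration — a real kernel's path is far more involved — and for brevity the sketch assumes any evicted victim is dirty:

```python
# Toy page-fault service: find a frame (evicting a victim if needed),
# fill it with the page's backing content, and map it. Illustrative only.
DISK = {"code": "instructions", "swap:9": "old-stack"}

def service_fault(page, src, page_table, ram, free_frames, victim_of):
    """Handle a fault on `page` whose backing content lives at DISK[src]."""
    if free_frames:
        frame = free_frames.pop()               # easy case: a frame is free
    else:
        victim = victim_of(page_table)          # ask the replacement policy
        frame = page_table.pop(victim)          # unmap the victim page
        DISK["swap:%d" % victim] = ram[frame]   # write victim's data to swap
    ram[frame] = DISK.get(src, "")              # read in the needed content
    page_table[page] = frame                    # update the mapping
    return frame                                # MMU retries; translation works

pt, ram, free = {3: 0}, {0: "hot-data"}, []     # RAM full: page 3 owns frame 0
service_fault(9, "swap:9", pt, ram, free, victim_of=lambda p: 3)
print(pt, ram)
```

Note the two disk operations in the no-free-frame path — one write (victim to swap) and one read (faulting page in) — which is exactly why a major fault is so much costlier than a minor one.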
Such page faults are easier to fix; they are called minor page faults. Otherwise, if the page is not already in memory, it is a major page fault. Now, back to the page replacement policy: it has to make the difficult decision of, if this page, these virtual addresses, must be put into RAM, which other page of which other process should be evicted out of RAM to make space. Of course, if you have free memory, well and good; but if you do not, you have to pick some victim page to evict. Which one will you pick? These policies should pick pages such that there will not immediately be another page fault. If you pick a very popular page to evict, the process whose page you evicted will immediately have another page fault, and you will have to bring those contents back — you do not want that. You want to evict pages that will not cause page faults in the immediate future; you want to reduce the number of page faults. There are many page replacement policies. A very simple one is the FIFO (first-in-first-out) policy: whichever page was assigned a frame earliest — in the order the frames were assigned — is evicted first. This page took its frame long back, so: hey, you have had enough of this frame, give it back. You can do that, but it may be suboptimal, because that page could belong to a very active process and be very heavily used. If you take its memory away and put it in swap space, it will immediately fault again — you do not want that.
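A tiny simulation of FIFO replacement — a textbook model, not OS code — shows how faults can be counted for a given sequence of page accesses (a "reference string"):

```python
from collections import deque

def fifo_faults(refs, nframes):
    """Count page faults for a reference string under FIFO replacement."""
    frames, order, faults = set(), deque(), 0
    for page in refs:
        if page in frames:
            continue                         # hit: no fault
        faults += 1
        if len(frames) == nframes:
            frames.discard(order.popleft())  # evict the oldest-loaded page
        frames.add(page)
        order.append(page)
    return faults

# Pages 1 and 2 are "popular", yet FIFO evicts them anyway when 4 arrives.
print(fifo_faults([1, 2, 3, 1, 2, 4, 1, 2], 3))
```

Tracing it by hand: the first three references fault cold, 1 and 2 then hit, but loading 4 evicts 1 (the oldest), which immediately faults again and evicts 2 — illustrating exactly the "evicted a popular page" problem described above.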
A better, more sensible policy is the LRU, or least recently used, policy: if a process has been assigned some memory but has not touched that code or data for a long time, you can say it is less likely to be used in the near future. Of course we do not know — it could be that it is needed immediately — but the probability is that memory unused for a long time will not be needed again soon either. Such least recently used pages are evicted; that is the LRU policy, and most operating systems use some version of it, because it is a sensible policy. Now, how does the OS know which page is least recently used? Your page table lists all the pages of a process, but how do you know which page has been least recently used? Note that the OS is not informed every time there is a memory access: the MMU translates the addresses and the CPU accesses the memory; the OS is not involved. There is no timestamp the OS puts on every page whenever it is accessed. So the OS has to make this difficult decision of finding the least recently used page, and there is no easy way to do it exactly. Most modern operating systems therefore implement only an approximate version of LRU, as follows. In your page table, in the page table array, there is an accessed bit: every time the MMU walks the page table and finds a page table entry, whenever it touches that entry, it sets the accessed bit. I accessed this page, therefore I set its accessed bit — the MMU keeps setting these accessed bits.
Periodically, the OS looks at all these accessed bits in the page table: these pages have been recently accessed, so they are active pages; I will find one of the pages that has not been recently accessed and pick it for eviction. So this is only an approximate LRU — it is not the exact LRU policy — but it is the best that modern operating systems can do. And of course, you can periodically reset these bits, look at which accessed bits were set over the past few intervals, and make many improvements on this, but in the end it is still an approximate LRU policy that operating systems use. You can pick pages that do not have the accessed bit set — the inactive ones — and you can also use other heuristics: if a page is dirty, you may not want to evict it, because you would first have to write it to disk. Some combination of looking at the dirty bit and the accessed bit is how modern operating systems make this approximate-LRU decision. Now we can put everything together and see what happens on a memory access. We have seen this before; now we have even more information about page faults and on-demand memory allocation. So let us revisit the story. The CPU is executing the code of a process; it periodically requests either an instruction or some piece of data — some variable at some memory location. What happens then? The CPU has made a request using a virtual address. First you check the CPU caches: on a cache hit, your data is available and goes back to the CPU. On a cache miss, what do you do? You go to the MMU. The MMU checks the TLB for a physical address corresponding to this virtual address; if the physical address is there, well and good, you take it and go to RAM.
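The periodic scan of accessed bits is often implemented as the "clock" (second-chance) algorithm. Here is a toy version — a textbook sketch, not real kernel code — in which the sweeping hand clears accessed bits until it finds a page that was not recently used:

```python
# Toy clock (second-chance) victim selection: approximate LRU built from
# the MMU-set accessed bits. Illustrative names and structures only.

def clock_victim(pages, accessed, hand):
    """Sweep from `hand`, clearing accessed bits, until an idle page is found."""
    n = len(pages)
    while True:
        page = pages[hand]
        if accessed[page]:
            accessed[page] = False           # give the page a second chance
            hand = (hand + 1) % n
        else:
            return page, (hand + 1) % n      # victim: not recently accessed

pages = ["A", "B", "C"]
accessed = {"A": True, "B": False, "C": True}   # as last set by the MMU
victim, hand = clock_victim(pages, accessed, 0)
print(victim)
```

Clearing the bit as the hand passes is what makes this approximate LRU: a page survives eviction only if the MMU sets its bit again before the hand comes back around.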
You go to main memory, access the memory contents, put the contents back in the cache, and return the data to the CPU. If the physical address is not in the TLB, the MMU walks the page table: it goes to RAM and fetches the page table — accessing multiple levels of the page table — finds the page table entry, translates the address, puts a copy of the translation in the TLB, and then accesses the memory contents. While walking the page table, sometimes there could be an error. If the address is valid and present in the page table, well and good: the address is translated, put in the TLB, all done. But if the address is invalid, or valid but not present, or some permissions do not match — in case of any such error — the MMU traps to the OS with a page fault. And if the fault is due to on-demand memory allocation not yet being done by the OS, the OS has to service it, as we have seen: find a free frame, perhaps evict that frame's contents to disk, perhaps read the needed contents back from disk or swap space — do all of this, fix the page fault, and restart. So there are many overheads in a memory access: CPU cache misses, TLB misses, and finally page faults, which involve multiple disk accesses. Imagine: the CPU can run instructions very fast, at nanosecond time scales; if data is available in the cache within a few nanoseconds, things go very fast. Otherwise you have to make multiple accesses to RAM, which again wastes hundreds of nanoseconds. And worse, if there is a page fault, you may waste several milliseconds on the various disk accesses back and forth to swap space. So a CPU that can operate at nanosecond time scale is actually waiting a very long time for this memory access to happen before it can proceed.
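The cost of these overheads can be seen with a rough effective-access-time calculation. The latency numbers here are ballpark illustrations of the orders of magnitude in the text (cache ~ a few ns, RAM ~ hundreds of ns, disk ~ milliseconds), not measurements of any real system:

```python
# Effective memory access time under misses and faults; the constants are
# illustrative orders of magnitude only, not real measurements.
CACHE_NS, RAM_NS, DISK_NS = 2, 100, 5_000_000   # 5 ms for a fault's disk I/O

def effective_access_ns(cache_hit, fault_rate):
    """Average access latency for a given cache hit rate and page fault rate."""
    miss = 1 - cache_hit
    return (cache_hit * CACHE_NS
            + miss * (1 - fault_rate) * RAM_NS
            + miss * fault_rate * DISK_NS)

# Even one fault per 10,000 cache misses dominates the average access time.
print(effective_access_ns(0.95, 0.0))
print(effective_access_ns(0.95, 0.0001))
```

With these numbers, a fault rate of just 0.01% of misses roughly quadruples the average latency — which is why keeping page faults rare matters far more than shaving cache or TLB miss costs.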
These things really kill the performance of your application. Later, when we study performance engineering, we will revisit how to reduce all of this. A very common state you notice when a lot of page faults happen is called thrashing. What is thrashing? Thrashing is when a system spends too much time servicing page faults: there is very little memory in the system, you do not have enough physical frames, and every time you access something there is a page fault — you go back and forth to disk, swap this out, swap that in, like musical chairs. You spend a lot of time just moving things back and forth from disk and very little time doing actual work; that is thrashing, and when it happens, your application's performance slows down a lot. When does thrashing happen? Normally every process has a working set: some small number of pages that it is frequently using. You have to give every process at least as many frames as it needs to store its working set. If the working set is, say, 4 pages, the OS has to give the process at least 4 physical frames. If you give it only 3, what happens? When it accesses the fourth page there is a page fault; you take out one of those frames, and when it accesses that page there is a page fault again, and again — you will have frequent page faults and frequent interruptions to the process's execution.
That is why it is a good idea for operating systems to estimate the working set of a process and give it at least as much memory as its working set size; otherwise you will have this issue of too many page faults and thrashing. And thrashing is not just a memory phenomenon: it can happen for CPU caches or the TLB too. If you actively use a lot of memory beyond the CPU cache size, or need many address translations beyond the TLB size, you also get frequent evictions there — but the term is most commonly used for memory. When thrashing happens, your system slows down a lot. In such cases, the application can reduce its working set size and try to operate with less memory, or the OS can terminate a few processes and reclaim memory — you have to do something to fix it so that the performance of your system improves. So that is all for today's lecture. We have studied the concept of demand paging and virtual memory, how page faults are handled, and what page replacement policies are. With this, over the last three lectures, we have an end-to-end view of a memory access: when the CPU accesses some instruction or data at a virtual address, what are all the things that can happen in this process, what are all the overheads, and how do you go about fixing these overheads? Later in the course, when we study performance engineering, we will revisit this topic and come up with some helpful tips to fix these overheads in real applications. A small exercise for you: pick any process in your system and look at how much virtual memory it thinks it has and how much physical memory the OS has actually allocated to it.
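The working-set effect can be demonstrated with a tiny exact-LRU simulation — a textbook model, not OS code. With frames one short of a cyclically accessed 4-page working set, every access faults; with frames covering the working set, only the cold-start accesses fault:

```python
def lru_faults(refs, nframes):
    """Count page faults under exact LRU, modeled as a recency-ordered list."""
    recent, faults = [], 0
    for page in refs:
        if page in recent:
            recent.remove(page)          # hit: will move to most-recent slot
        else:
            faults += 1
            if len(recent) == nframes:
                recent.pop(0)            # evict the least recently used page
        recent.append(page)              # page is now the most recently used
    return faults

workload = [1, 2, 3, 4] * 10             # 4-page working set, accessed cyclically
print(lru_faults(workload, 3))           # one frame short: thrashing
print(lru_faults(workload, 4))           # working set fits: only cold faults
```

Going from 3 to 4 frames takes the fault count from every single access down to just the four initial loads — a sharp cliff, not a gradual slope, which is exactly why working-set estimation matters.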
If you are on a Linux machine, the output of a command like `ps` can give you this information. Please dig around it a little more to understand the concepts of this lecture better. Thank you all, and see you in the next lecture.
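For the exercise, one way to compare the two numbers on Linux is to read a process's /proc/&lt;pid&gt;/status file, where the VmSize line reports the virtual memory size and VmRSS the resident (physically allocated) size, both in kB. Here is a small parser for that format; the sample text is fabricated for illustration, and real values will of course differ:

```python
# Parse the VmSize / VmRSS lines of a Linux /proc/<pid>/status file.
# The sample text below is fabricated for illustration.

def mem_sizes(status_text):
    """Return {'VmSize': kB, 'VmRSS': kB} parsed from /proc status text."""
    sizes = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            sizes[key] = int(rest.split()[0])   # value field is in kB
    return sizes

sample = "Name:\tbash\nVmSize:\t  225940 kB\nVmRSS:\t    5120 kB\n"
print(mem_sizes(sample))
# On a real system: mem_sizes(open("/proc/self/status").read())
```

On most processes you will see VmSize far larger than VmRSS — exactly the overcommit gap this lecture described between what the process thinks it has and what the OS has actually backed with frames.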