Okay, why don't we get started; we're a couple minutes behind already. All righty. Again, I'm Carl, and I'll be covering today's class. Jeff is still trying to get out of Buffalo, so with the weather, who knows when that will actually happen. What we're going to be talking about today is directly relevant to your upcoming project: pages and how the paging process works. It's something that, to a large extent, you're going to have to understand and implement as part of assignment 3.2. So it's a fun topic, but do pay attention to the mechanics, because this is relevant not just for the test but directly to one of your coding projects.

I assume those of you sitting here are either taking a break, which is a smart idea, or winding down on the coding; you have a couple more days left on that. If you're getting crunched for time, do speak to the staff about prioritizing how to best use your remaining time. I can also say that no matter what happens, the time you spend on assignment 2 is not wasted, because you're still going to need it for assignment 3. These projects are cumulative, and in order to pass the tests for assignment 3 you essentially need a working assignment 2. So no matter what happens, this is time well spent, and it will help you later on as you go through the rest of the course. As far as that goes, remember that Friday we are not going to have official class; it's just going to be extended office hours, and we hope to send out more information in the next couple of days on that.
Aside from that, make use of office hours tomorrow too. Then next week, presumably, you'll be dipping your toes into the white-sand beaches of Tahiti or Fiji, so enjoy that week off. The week you're back is midterms, so do make sure that at some point you start looking over the material; don't leave it for the week you get back. And again, for those of you who have been playing hooky in video land — I said this last time — if you have let the lectures pile up and at this point have a stack of lectures to watch, trying to cram through a whole bunch of them at once and have the material actually stick is going to be somewhat problematic. So perhaps that's something to do while you're dipping your toes into those white-sand beaches: watch some of Jeff's lectures so you'll be all set come midterm week. I digress from my hectoring; let's talk about the rest of class today.

Okay, what actually is a page? Remember, we've got these two concepts of memory: physical memory and virtual memory. And I didn't pause for questions — any questions before we go into today's sermon? By the way, I guess everyone gets a plenary indulgence for being here; if you are here, so much the better. Just general questions about class progress, projects? Diving right in, okay.

How can we locate page state? This delves into page faults and translating between physical pages and virtual pages. Let's take a look. By page state — again, you've heard the term state ever since 115 — roughly speaking, we mean the metadata about something. A page is simply, by definition, a 4K block of memory that the user or the kernel uses. When we talk about page state, we're asking: well, what about it? As far as the user code is concerned, is this page in real memory? Is it on disk? Does it exist at all? Has it been written to recently? Where in physical memory is it? It's information that the system as a whole needs to know about memory — it's memory about memory. And that's going to be important, because that's essentially the gravamen of your assignment 3: keeping track of memory. You're writing a memory manager. Again, it's a cool project.

So, a couple of things here in terms of the TLB. Remember the general approach of what's going on — I talked about this last Friday. We always begin with the user referring to a virtual address. Well, what do we do? The virtual address really doesn't exist, right? The virtual address goes to the MMU, the memory management unit, essentially this TLB thingamabob, and the TLB does some sort of translation. Specifically, it checks: do I have an entry for this particular virtual address? If I do, I can determine the corresponding physical address. If I don't have an entry for it, I'm going to have to ask for help from the kernel, because remember, it's the kernel that keeps track of all the information about virtual-to-physical translation. So again: the user refers to a virtual address; the virtual address goes to the MMU in hardware; the hardware checks whether there's an entry; if so, it does the translation on the fly, completely silently to the software; if not, it has to poke the OS for help — hey, I need you to tell me a little more about this one.

So from the standpoint of the operating system, we have to keep this TLB up to date. Remember back in 341 you talked a little about data caches: the idea is that we've got a small amount of memory that, in the case of a data cache, is fast, and ideally we want to keep it populated with stuff we're using all the time. Well, there's an analog in the TLB, because the TLB is not a cache of data — it's a cache of address translation information. So it's our job as a kernel, as a virtual memory manager, to keep the TLB populated with really relevant, useful translation information. Just like the schemes you talked about in 341 for data cache eviction — because we want to keep cache misses as infrequent as possible — the same thing applies to the TLB. What we're going to try to do, and what you're going to be implementing as part of assignment 3, is keep the TLB filled with the entries that minimize the number of page faults, because that increases the performance of your system. The user is going to be happy — or in your case, as a student, you're going to pass the tests a lot more easily.

So here's what we actually need: store information about each virtual page. What that means, again, is storing metadata about the virtual page. Remember what I mentioned earlier: where is the page? A virtual page is only an abstraction, but where is the underlying physical page? Is it in memory, and if so, where? Is it on disk, and if so, where? Does it even exist at all? Has it been written to? That's what we mean by storing metadata about the page. We'll look at some sample types of entries you might put into your metadata in a few minutes.

The second part of this is an issue of performance. As you can imagine, in an operating system page faults happen all the time, and as a result you need to service them very quickly. Ideally we don't want to service them at all — that's why we have the TLB in hardware — and again, this is why we want to try to minimize the number of page faults.
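To make that hit/miss flow concrete, here's a toy software model of the translation check. Everything here is invented for illustration — the slot count, the names `tlb_translate` and `tlb_install` — the real TLB is hardware, and OS/161's actual interface lives in the MIPS headers:

```c
#include <stdint.h>
#include <stdbool.h>

/* Toy model of the MMU's TLB: a small cache of
 * virtual-page-number -> physical-page-number translations.
 * All names and sizes here are illustrative only. */

#define TLB_SLOTS  16
#define PAGE_SHIFT 12              /* 4K pages */

struct tlb_entry {
    uint32_t vpn;                  /* virtual page number */
    uint32_t ppn;                  /* physical page number */
    bool     valid;
};

static struct tlb_entry tlb[TLB_SLOTS];

/* Try to translate; returns true on a hit, filling *paddr.
 * On a miss, the real hardware raises a fault and the kernel
 * has to supply the translation. */
bool tlb_translate(uint32_t vaddr, uint32_t *paddr)
{
    uint32_t vpn = vaddr >> PAGE_SHIFT;
    for (int i = 0; i < TLB_SLOTS; i++) {
        if (tlb[i].valid && tlb[i].vpn == vpn) {
            /* hit: translate silently, no kernel involved */
            *paddr = (tlb[i].ppn << PAGE_SHIFT) | (vaddr & 0xFFF);
            return true;
        }
    }
    return false;                  /* miss: "OS, help!" */
}

/* What the kernel does after servicing the fault: install the
 * translation so the next access hits. */
void tlb_install(int slot, uint32_t vpn, uint32_t ppn)
{
    tlb[slot].vpn = vpn;
    tlb[slot].ppn = ppn;
    tlb[slot].valid = true;
}
```

The page offset (low 12 bits) passes through untouched; only the page number gets translated — that's the whole trick.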
But sometimes they are inevitable, and when we do have a page fault that actually goes to the kernel, we need to be able to locate this information quickly. So performance is going to be a concern. I know with my old code that I had written for OS/161, I had a bit of a problem with this a year ago: we came out with new and updated tests for last year's class, and all of a sudden my code, which heretofore had been passing just jim-dandy on the previous set of tests, started flunking, because the new tests were more demanding in terms of time constraints. So I had to add some optimizations. Again, with page state there are two things: number one, what page state do we need to keep track of, and number two, we need to be able to get at that page state very quickly and efficiently.

Quick recap: we talked a little about how we get from virtual to physical, and a little about what page state is and why we need to get at it quickly. Questions, comments so far? Going once, going twice, sold.

All right: PTEs, page table entries. These are essentially the thingamabobs that store information about a single virtual page. The page table entry is the struct that is going to store your metadata. You've got a 4K block of data — the data itself — and then you've got however many bytes are in the page table entry, which store the metadata. Essentially, this is per virtual page: each virtual page has a page table entry. By the way, to digress — I talked on Monday about the core map and such — for the core map you also need to keep track of per-physical-page-frame metadata. That's part of assignment 3.1; I'm going to be talking about it in under an hour if you're going over to recitation in Fronczak. So again: you need to keep track of metadata for each physical page, and that's the core map, a.k.a. per physical page frame; and you need to keep track of metadata for each virtual page, and that's a page table entry in your, drum roll, page table. Sorry for the alphabet soup of terms. Does this make sense so far? Yes? No? Head bobbing? Okay.

All right, page table entries. Now, what Jeff is suggesting here — I will say, don't take it that you have to follow this model exactly. Remember Jeff's post about KISS: keep it simple. Sit down with your partner, and in terms of what you're designing to put into these page table entries, come up with what you think you might eventually need, but start simple: what do I need to get started? The same is true of the core map: what do I need at this point in time? Then think ahead so you're not painting yourself into a corner. But I will tell you right now: if you look at old blogs and the like, they'll suggest you need to do this, this, this, and this, because they're throwing in the kitchen sink of possibilities. Or if you Google this stuff, you'll see, say, Wikipedia articles about what Linux does in terms of storing metadata — far more information than what you need initially to get past the hurdle of 3.2.

So what information do we actually need to store in a page table entry? At the very least, what do we need? Remember, if we have a page fault, that means we went to the TLB, the TLB could not locate an entry translating a virtual address to a physical address, and it said: operating system, help. So what's the first thing we have? We know the virtual address, because that is the
fault address: the address on which we had the fault, the page fault, the exception. What we don't know is the other thingamabob, the physical page number. So at the very least, that's the information we need: given a particular virtual address, I need to be able to grab the corresponding physical address. And by the way, people, what do I do with that? Say I fault on virtual address 1000 and the entry tells me it corresponds to physical address 234567. Now the operating system knows that — but what does the operating system do with that information? Ideas? How did we get to be in the operating system in the first place, coming from where? User land — but what happened in user land that caused us to land in the operating system? A trap. Yeah, a page fault, right. And why did we have a trap? What piece of hardware caused it? The user referred to a virtual address, and the virtual address goes to what? We hope it goes to actual memory, but remember back to last Friday: the MMU, the TLB, exactly right. The MMU was looking for a virtual-to-physical mapping and didn't find one. That's what causes the page fault: the MMU signals it in hardware.

So that was the problem, but we want solutions, not problems. How do we repair it? We have the virtual address; we've looked at the page table entry; and we now have, say, the physical page. What can I do with it? I can put that information in the MMU — exactly right. Because remember what the MMU's TLB is: simply a cache of translation information. So what's going on is: we had a fault; from the page table entry we retrieve the corresponding physical address; and we take that physical address and update the cache in the TLB. As you can imagine, that's a privileged set of opcodes; users are not allowed to do it. This is getting into the whole policy-and-enforcement mechanism: the kernel is allowed to do it, because someone has to, right? Then ideally we return to user land, and the user is happy as a clam, doesn't even know there was a page fault — it's totally seamless to user code, at least we hope.

Now, there's other metadata here that Jeff is going to talk about, especially regarding optimizations, later on — things like permissions. One of the cool things about virtual memory — I talked about this last Friday too — is that you can mark a particular page as, say, non-executable to prevent data-execution attacks, or mark it read-only so the user doesn't accidentally stomp on executable code, stuff like that. So you want to keep track of the permissions for each virtual page. Other stuff like valid and invalid — and we can take that a step further, because there are really two questions: does the data exist anywhere at all, and if it does, where is it? In memory? Somewhere on disk? What have you.

Question? Yes — that's exactly right. If you look in the arch/mips headers, in tlb.h, there's documentation for that; those are some of the bits you have to set in the hardware registers. In other words, it's not just the virtual-to-physical translation: you also very definitely have to tell the hardware whether reads are allowed, writes are allowed, executes are allowed, and so on. Along with other things — you don't have to support this in OS/161 unless you want to — like having multiple processes with valid entries in the TLB at one time. Because again, in a production system we want to keep eviction as infrequent as possible and page faults as infrequent as possible. Does that answer your question?
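As a sketch of that fault-servicing sequence — look up the PTE, check permissions, push the translation into the TLB, return — here's some illustrative C. The names (`handle_page_fault`, `tlb_update`, the `pte` fields) are invented; OS/161's real entry point and privileged TLB write routines are different, so treat this as a model of the logic, not the interface:

```c
#include <stdint.h>
#include <stdbool.h>

/* Toy sketch of the page-fault-handling sequence.
 * All names are invented for illustration. */

#define PAGE_SHIFT 12

struct pte {
    uint32_t ppn;        /* physical page number */
    bool in_memory;      /* is the page resident in RAM? */
    bool writable;       /* permission bit */
};

enum fault_result { FAULT_OK, FAULT_NOT_RESIDENT, FAULT_PROT };

/* Stand-in for the privileged "update the TLB" operation,
 * recording what was installed so we can inspect it. */
static uint32_t last_tlb_vpn, last_tlb_ppn;
static void tlb_update(uint32_t vpn, uint32_t ppn)
{
    last_tlb_vpn = vpn;
    last_tlb_ppn = ppn;
}

/* 1. We know the faulting virtual address.
 * 2. Look up its page table entry.
 * 3. Check permissions and residency.
 * 4. Install the translation in the TLB; return to user land. */
enum fault_result handle_page_fault(struct pte *pt, uint32_t faultaddr,
                                    bool is_write)
{
    struct pte *entry = &pt[faultaddr >> PAGE_SHIFT];
    if (is_write && !entry->writable)
        return FAULT_PROT;          /* permission violation */
    if (!entry->in_memory)
        return FAULT_NOT_RESIDENT;  /* would need to page it in */
    tlb_update(faultaddr >> PAGE_SHIFT, entry->ppn);
    return FAULT_OK;                /* user never knows it happened */
}
```

Note that the happy path — entry resident, permissions fine — does nothing but one array index and one TLB write; that's the hot path we'll be worrying about shortly.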
Other questions? Okay. So again, this is more about metadata: permissions, valid, referenced. A nice thing about the referenced bit is that it's going to be cool when it comes to eviction. Remember — I know I've mentioned it, and I'm sure Jeff has mentioned it several times — the technique of using the past to predict the future. One of the things we want to keep track of is which pages get used a lot. Hence this referenced bit — and you might even want to get fancy and have more than one bit: how often does a particular page get referenced? That's going to let you make smart decisions later about which pages to evict. But anyway, this is not gospel. You definitely need the physical page address, but you're going to be adding other stuff — I can tell you there's definitely some synchronization state you're going to have to have in here too. My point is: in initially designing this with your partner, don't paint yourself into a corner, but start simple. You have to have a physical address; beyond that, don't worry about coming up with everything right away.

All right, page state. Let's see what we have here. This walks through what we were talking about earlier: we're in user land and we want to store to a particular address, so we go to the MMU. Now, at this step, if the MMU has the information, it doesn't go to the kernel, and that's ideally what we want. Think about it as a matter of policy — this is a good exam question: how could we design our system to keep this "kernel, help!" step as infrequent as possible? What we want is to make sure we do know what virtual address 10,000 maps to. But if we don't: go to the kernel. Exception, kernel.

Now, at this point, what Jeff is getting at is this: "Let's see, where did I put that page table entry? Give me a minute, it's around here somewhere — probably looks like the underside of my desk in the lab." The point being: if the kernel is having a hard time locating this information, what happens to the user experience? It's not necessarily going to crash, but the user is going to feel very ticked off. In other words, if we have a poorly designed memory manager, even if it works logically, if it's slow, it impacts the performance of the entire system. Up until now, with assignment 2, it's essentially been a matter of functionality and logic. One of the tough parts about assignment 3 — and it's probably good that you're getting to it in this class, because I know a lot of classes don't cover it, even here at UB — is that in order to get past the hurdles, especially in 3.3, there are going to be time performance constraints. In other words: how can you get this information in a reasonable amount of time? You're going to have to not just get the answer but get it quickly, and implement a bunch of optimizations. Let's actually talk about how we can do this.

So what we need here is speed. It's basically a hot path: given a page fault, we need to un-page-fault ASAP. But the other thing, too, is that we've got the classic time-space trade-off.
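Before we get into how page tables are organized, here's one way the metadata we've been listing — physical page number, permission bits, valid, written-to, referenced, a disk location — might be sketched as a C struct. This is purely illustrative, not a required layout (and remember: bit-field packing is implementation-defined, so don't count on an exact size). Start simpler if your design doesn't need all of it yet:

```c
#include <stdint.h>

/* One possible page table entry layout, pulling together the
 * metadata discussed so far. A sketch, not a prescription --
 * add fields only as your design actually needs them. */

struct pte {
    uint32_t ppn        : 20;  /* physical page number (the must-have) */
    uint32_t valid      : 1;   /* does a translation exist at all? */
    uint32_t in_memory  : 1;   /* resident in RAM vs. out on disk */
    uint32_t dirty      : 1;   /* has it been written to? */
    uint32_t referenced : 1;   /* used recently? (helps eviction) */
    uint32_t readable   : 1;   /* permission bits */
    uint32_t writable   : 1;
    uint32_t executable : 1;
    /* if !in_memory: where on disk -- plus, eventually, whatever
       synchronization state your design calls for */
    uint32_t swap_slot;
};
```

With 4K pages, 20 bits of physical page number covers a 32-bit physical address space, which is why the flag bits fit alongside it in a single word.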
We want quickness, but we also know we have limited resources, so we have to balance that — not chewing up too much memory, because then the user experience degrades in that direction instead. So let's see how we can balance this. Questions so far? Complaints? On to page tables.

All right, the data structure. We need to quickly map — basically translate — from a virtual address to a page table entry: essentially virtual to physical, plus virtual to the other metadata. That's a page table, and it holds these things called page table entries. Again, this is the guts of assignment 3.2. (3.1 is going the other way: that's your physical map, your core map — information about physical memory, and indirectly physical-to-virtual — but leave that aside for now.) A page table is information about virtual memory and how it maps to physical.

Interesting point: each process has a separate page table. Now, go back to the core map for just a minute. The core map is your mapping of physical memory — the RAM chips when you pop the cover off your computer — and the RAM chips are shared by all processes. Which tells you who uses the core map. Which process? The kernel? How about process foo? If I'm process Firefox, do I need to use it? Maybe not directly, but indirectly, yeah. How about process bar, where process bar is, who knows, the Opera browser? My point is: who uses the core map? Which process? Bingo — all of them. So it's something that has to be shared, which of course introduces some mutex issues. But wait a minute: we're talking about this mapping of virtual pages, and now we're saying each process has its own separate page table. Why do we need that? Why can't we have a system-wide mapping of virtual to physical?

Glad you asked: virtual addresses are private to a process and translated differently for each one. Remember — I forget which of Jeff's lectures I was covering — I mentioned that one of the nice things about virtual memory is that it lets the programmer in user space make simplifying assumptions: for example, that executable code always begins at address 0x400000, or wherever it happens to be. And remember we said that's not quite literally true. Think about it: process foo has its executable beginning at address 0x400000, and process bar has its executable beginning at address 0x400000. Wait a minute — how can that be? You can't have two things in the same place at the same time; it's like the Pauli exclusion principle of virtual memory. So how do we actually implement this? Each process has an entry that essentially says: my executable, which begins at 0x400000, maps to a different physical location in memory. That's how we pull the stunt off. And this is why we need the information on a per-process basis: what one virtual address maps to in one process may be something completely different in another.

How about fork? Have you ever wondered — you call this copy thingamabob and it "duplicates the address space." Well, it doesn't literally duplicate the address space; we've been slightly sloppy about that. If the child were an exact 100% clone of the parent, they would be munging on each other's memory. What's being copied — duplicated — is the virtual addresses. Underneath, they get two different address spaces, and in those two address spaces the same virtual addresses point to different areas of physical memory. So it's not duplication in the very strict sense of the term, if that makes sense. Questions before we go on to the actual implementation?

One other thing — Jeff is probably going to talk about this after break too — on virtual versus physical: in general, each virtual address has to map to a unique physical address for a given address space. Now, you can have the same virtual address in different processes — different address spaces — map to the same physical address. That's actually how you can support things like copy-on-write for fork. In other words, you can have, say, two different processes whose virtual addresses map to the same bunch of executable code, because someone called fork and we're going to be lazy and not even bother copying physical memory — we just redirect the virtual addresses in the different address spaces to point to the same physical pages. But by and large, the same virtual address in different address spaces points to different physical addresses. What you can also have is different virtual addresses in the same address space pointing to the same physical address. It sounds weird, but it's possible and it's allowed by the TLB: we can have the same block of physical memory appear at different virtual locations. What we can't do is go the other direction. If you try to put entries in the TLB saying, for the same address space, address 1000 maps to 1,000,000 and address 1000 also maps to 2,000,000, it's going to burp — which one should it use when the user says 1000, flip a coin? So essentially it's a mapping: given one virtual address, it points to one and only one physical address within a particular address space. Make sense? Okay. The last part is to talk about implementation.
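A toy model of that per-address-space mapping property — the same virtual page lands in different frames in different processes, and many-to-one within one space is allowed. All names and numbers here are invented for illustration:

```c
#include <stdint.h>

/* Sketch: each process has its own page table, so the same virtual
 * page number can map to different physical frames in different
 * address spaces. Illustrative only; 0 means "unmapped". */

#define PT_ENTRIES 2048
#define PAGE_SHIFT 12

struct addrspace {
    uint32_t pt[PT_ENTRIES];   /* vpn -> ppn */
};

/* Translate a virtual address within one address space:
 * the mapping is a function -- one vaddr, one paddr. */
uint32_t translate(const struct addrspace *as, uint32_t vaddr)
{
    uint32_t ppn = as->pt[vaddr >> PAGE_SHIFT];
    return (ppn << PAGE_SHIFT) | (vaddr & 0xFFF);
}
```

Because `pt` is an array, each virtual page number indexes exactly one slot — the "one and only one physical address per address space" property falls out of the data structure. Nothing stops two slots (or two processes' tables) from holding the same frame number, which is exactly the sharing/copy-on-write case.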
This gets a little into the mechanics of your implementation for assignment 3.2. Again, do a little thinking on this: sit down with your partner, sit down with some of the staff. There are pros and cons; there's no one right answer. I certainly have my political orientation on this, other people have theirs, and everyone else is a heretic.

So let's take a look. One way of doing this is a flat page table. Say this is our virtual address space, user space. What we're trying to do is map virtual addresses, so if this is a 32-bit system, by definition we have to handle all potential 32-bit addresses — 4 gigabytes. And by "map" I mean we have to be able to provide some sort of answer about what's there, because the user could potentially refer to any address from 0x00000000 through 0xFFFFFFFF — you get the idea. So I need a data structure that allows me, as the kernel, to spit back the status of a particular page anywhere in that 32-bit address space. Mini digression: when it comes to the core map, how much are we trying to map? On a 32-bit system we may or may not have the full 4 gigabytes of physical memory; it may well be a subset of that. So the core map is a slightly different kettle of fish — I'll talk about that in probably half an hour or so. For now, flat page tables.

We have to map all 4 gigabytes of address space — everyone see why? So, option number one: we in effect have to declare a flat page table with an entry for every single potential page, probably just declared as one gigantic array. As a matter of fact, we can be specific about how big this array is going to be. A 32-bit system, 4 gigabytes, with a 4K page size: do a little division — 4 gigabytes divided by 4K pages — and we have exactly one million, well, one binary million (2^20), entries in our page table. And this will work like a charm. In other words, say I have a bunch of page table entries: part of the code maps to somewhere in the page table, and there's its page table entry; part of the heap maps to the corresponding place in the page table, and there it is. Presumably the page table entries themselves are allocated on the heap. Think of it as analogous to the file table, which in effect pointed to file handles: same thing here. We've got a flat page table — probably an array of pointers or something like that — and each slot points to an existing page table entry. Simple to implement.

Now, the trade-offs. Speed: it's very quick, because for any given virtual address I can immediately index to its entry in the page table. All I need to do is drop off the lowest 12 bits, and the remaining top 20 bits tell me exactly which of the 2^20 array entries to go to; I can speed right over and look at my page table entry. Compactness: this is the fly in the ointment. By definition I have to map 4 gigabytes of memory — one million potential entries — so my flat page table must support one million entries. Now, if I have 4 gigabytes of RAM, that's probably not hard: I can fit it into memory. It's maybe a little bit wasteful, but workable.
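The flat-table arithmetic in C — drop the low 12 offset bits, and the top 20 bits index straight into the array. The constants match the 32-bit/4K numbers above; the function name is just for illustration:

```c
#include <stdint.h>

/* Flat page table arithmetic for a 32-bit address space with 4K
 * pages: the low 12 bits are the offset within the page, and the
 * remaining 20 bits index directly into an array of 2^20 entries. */

#define PAGE_SHIFT 12
#define NUM_VPAGES (1u << 20)      /* 4 GB / 4 KB = one binary million */

static inline uint32_t pt_index(uint32_t vaddr)
{
    return vaddr >> PAGE_SHIFT;    /* O(1): straight to the entry */
}

/* The catch: at, say, 8 bytes per entry, the table alone costs
 * 8 MB per process -- hopeless on a machine with 1 MB of RAM. */
```

One shift, one array index — that's the whole lookup, which is why the flat table wins on speed.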
But if I have, say, a smaller amount of memory — 16 megabytes, or 4 megabytes, and we're going to configure your kernel on some of the tighter tests with one or two megabytes of memory — good luck trying to fit an array with a million entries into 1 megabyte of memory. It's not even going to fit. Houston, we have a problem.

Well, option number two: if an array won't do, let's go for a list. (Sorry — two slides ahead; linked list first, hash buckets later.) So, linked list. Back to our virtual address space, and here are the page table entries, and we need to link them up. Again, we potentially have to support all 2^32 addresses, but instead of an array with an entry for all one million pages, we're just going to chain these buggers together. Say we have a page fault in some area of code: we check, is it this page table entry? No. How about the next one? No. Next one, next one — in other words, we iterate over our list, and finally we hit our match. We essentially have to iterate, and eventually we get our answer. And again, this is the old time-space trade-off. Speed: definitely not a go for a production system — you do not want to waltz like molasses through a list in a production system — but for OS/161 it is good enough. So this definitely is an option you can use for implementing your page tables, if you are fond of lists; again, this is one of those holy wars. Compactness: there's only a node in the list for pages that actually exist. Obviously when the program launches I insert some initial entries, but then as the program loads, I'm going to have a bunch of page faults to load in the executable — and insert node after node after node.
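A minimal linked-list version might look like the following — O(n) lookup per fault, but only one node per mapped page, which is the compactness win. All names are illustrative:

```c
#include <stdint.h>
#include <stddef.h>

/* Linked-list page table: one node per mapped virtual page,
 * found by walking the chain. O(n) per fault -- fine for
 * OS/161, not for production. Names are illustrative. */

#define PAGE_SHIFT 12

struct pt_node {
    uint32_t vpn;            /* which virtual page this describes */
    uint32_t ppn;            /* where it lives in physical memory */
    struct pt_node *next;
};

/* Walk the chain: "is it this entry? no; next one? no; ..." */
struct pt_node *pt_lookup(struct pt_node *head, uint32_t vaddr)
{
    uint32_t vpn = vaddr >> PAGE_SHIFT;
    for (struct pt_node *n = head; n != NULL; n = n->next)
        if (n->vpn == vpn)
            return n;        /* match: here's the metadata */
    return NULL;             /* unmapped -- or not yet allocated */
}

/* Inserting at the head keeps the faults that allocate new
 * pages cheap; returns the new head. */
struct pt_node *pt_insert(struct pt_node *head, struct pt_node *n,
                          uint32_t vpn, uint32_t ppn)
{
    n->vpn = vpn;
    n->ppn = ppn;
    n->next = head;
    return n;
}
```

Note the caller supplies the node's storage here; in a real kernel you'd allocate it, which is itself one of the two halves of "allocation" discussed below.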
Then when the program starts running and using, say, the stack and heap segments, I'm going to have to do more insertions. So I'm definitely going to have to support insertions throughout the runtime of the kernel — maybe that's the ultimate answer to your question. Did that answer it? Okay, other questions?

Thank you for bringing that up — that's actually a gotcha that came up last year. I know the University of Toronto used to always force their students to do it; well, now it's true here too, and that's actually the way real operating systems work, to be frank. In other words, you simply mark areas as valid, and then you have to support on-demand allocation. And by allocation, remember, that involves two steps: number one, creating the metadata, and number two, allocating the underlying physical memory. Again, that's part of your design document — do sit down with your partner and think it through. I know my partner and I had many, many midnight whiteboard sessions by the vending machines in Davis; it had just opened at the time. Make use of the office hours and the staff, and take it from there.

Aside from that, thanks for coming, people. Enjoy spring break next week, midterms the week after. We'll be talking about the core map in a few minutes or so. Hasta la vista.