Welcome back to 353. Thank you for joining me on this wonderful Friday, that's slightly underattended, but we'll push through it. So where we left off last lecture, we were talking about virtual memory and how you would implement it. And we figured out that, hey, we should probably just do some type of mapping. Arrays are fast, so we'll do something called a page table, where we essentially split up the address and only deal with memory in fixed-size blocks called pages: split the address, and then use the virtual page number to look up the physical page number. And then we're off to the races. But we argued that even for a 39-bit virtual address, our page tables had to be ginormous. They had to be a gigabyte each. And they'll be independent for each process. So that would be bad. So what should we do about the page table size? So one note most system designers make is that, hey, most processes or programs don't use the entire virtual memory space. So how can I take advantage of that? And the simplest answer is, well, if a huge page table is the problem, I will make a smaller page table. And then, boom, you collect your bonus check for being so brilliant. So I just make it smaller. That's the solution. So we're going to do it in a bit of a clever way. First off, I'll start with just one step. So for now, we're just going to make each page table fit exactly on a page. And it will become apparent why we do this later. But for now, we'll just assume a single page table. So if we have a single page table that fits exactly on a page, remember, our page size would be 2 to the 12, so it's 4,096. And our page table entry, well, it is 8 bytes, or 2 to the 3. So we could fit 2 to the 9 entries on our page table to fill it up. So 2 to the 9 times 2 to the 3, well, that's equal to our page size. And we'll just make a page table that, in this case, only needs nine bits to actually index it. So in other words, it only has 512 entries.
So now we'll also add this SATP, which is essentially going to point at the root page table. So the actual MMU hardware, all it needs is to know where to start doing the translation. So it needs the location of the page table. And that's just specified in a register that the kernel goes ahead and controls. So on RISC-V, it's called SATP. But it's basically the root page table. It just tells the MMU where the first page table is, or where the page table is for this process. So here, that page table has 2 to the 9 entries. And each page table entry would have a PPN and at least a valid bit. In this case, same scheme we had before, where the PPN is 44 bits, we had 10 bits for permissions, and then we didn't use the other 10 bits. And similarly, we have the offset we don't have to translate. So any questions about this? All I did is take last lecture and just make the page table smaller, such that it fits exactly on a page. OK, now we get, yeah. So what it means by it fits exactly on a page is our page size is 4,096 bytes. So I want all the page table entries to fill that up exactly, yeah. So I'm just restricting the size of that page table. All right, so why in the blue hell would I want to do that? Well, this is where our minds start to break, and why this is hard. It takes a bit to wrap around it, because this is the full picture of how we translate virtual memory. So we do something called a multi-level page table to save space for programs that don't use that many virtual addresses. So if we had a 39-bit virtual address and a single large page table, we would need a gigabyte. What we do instead of doing that huge page table is we split it up into different levels. And how does that work? In this case, we need three levels of page tables. So how it works is that at each level of page table, we break it down to a smaller size, and they each fit exactly on a page. So how translation works now is a bit different.
So we take this virtual address, and instead of just having one big virtual page number here, we split it up into nine-bit sections. So in this case, in order to get a full 39 bits, well, I need three of them. So nine plus nine plus nine plus 12, that gets us our 39 bits. So they're split up into different levels. So typically, we name the highest level some number, all the way down to the lowest level, which is L0. So in this case, there's L2, L1, and L0. And this is in no way connected to the names for CPU caches at all. So how translation actually works is there is still this root page table register that tells the MMU where to start the translation. And what it does is it points at a level two page table. So your level two page table would have two to the nine, or 512, entries. And what we had before was, okay, we look up that entry, we get the PPN, check that it's valid, and then we just use that for the physical address. Well, now we do it in a bit more of a roundabout way. So the PPN here doesn't tell us the final translation; it tells us the physical page number of the L1 page table. So the MMU jumps to that page table and knows to use it for the L1 lookup. So then we know what index to use from the virtual address, we go into here, then we figure out, okay, which entry does that correspond to? We get a PPN for that. And since we get a PPN for that, that tells us where the L0 page table is. So the MMU would then use this L0 page table, get the index from the bits of the virtual address, look it up, and then finally, in the L0 page table, the lookup is exactly the same as when we had just one giant page table. So the PPN in this page table entry, that's what we use for the physical address, and then we're done, thankfully. So we'll be talking about this all day. So any initial questions about this? We'll do some examples and stuff, yep.
Sorry, I'm getting confused with the two to the power nine page table being the same size as a page, because wouldn't it be two to the power nine entries, and then there's 64 bits per entry? And isn't the page size supposed to be two to the 12? So we do it in bytes. So it's two to the nine here, which means we can address up to 512 entries, and each entry is going to be eight bytes. So eight times 512 is 4096; that's the size of our page. Why is the offset also 12 bits? Yeah, so the offset is, you have to be able to say what byte on a page you want. So I have to say what byte on a page I want. So it has to be however many bits I need to be able to say, I want byte zero, I want byte 511, I want byte 4,000, whatever. Okay, any other initial questions with this? We'll have examples, so yep. Can I only have one, what, sorry? Yes, yeah, so you can only have one SATP, so there's only one L two page table per process. So it all starts there. So SATP in this case, since we have three levels of page tables, that will point to the L two page table to use. Yeah, and that would be set by the kernel per process. Yep, so we could have up to two to the 18 different L zero page tables. Right, because we have two to the nine choices here, and then for each of those two to the nine choices, we can further go on to two to the nine more. So if you actually look at this, we can actually map the same number of pages, but we're actually gonna waste a bit more space if we have to map every single page, right? Because to map every single page, I'd essentially need a gigabyte worth of L zero page tables, but I'd also need L one page tables to point to them, and then one L two page table. So it's actually slightly worse for space if you use everything. But the idea here is, at minimum, to only translate a single address, I just need one L two, one L one, and one L zero. Yeah, so all of these page tables will have physical page numbers in them.
It's just what they refer to. So in the L two page table, that physical page number is going to tell us the physical page that the next L one page table is at. So they're all referring to physical memory. In every page table except L zero, they tell you the page of the next lower page table, and in L zero, it tells you the page for that mapping, like the final mapping at the end. Yeah, this is literally the scheme that's used for any program. Yes, so if a process is using all 512 gigs of memory, then yeah, we're stuck. We have to keep the translations anyways, so we can't really do much better than using a gigabyte. So at minimum, we need one of each. So to translate a single address, I need one L two page table, one L one, and one L zero. So I need three pages to translate a single address. So if the program only has to translate a single address, three pages; before, when I had a giant page table, even if I need to translate a single address, I still need that gigabyte page table. So this is only good for programs that don't use all the memory, which is every single program except for Chrome. Man, picking on web browsers is easy. All right, any questions before? We'll do an example just in case. So first, let me take a brief detour, because this actually makes memory allocation really, really simple if you are the kernel. So if you are the kernel, to do memory allocation, since we separated out memory into pages, you just do memory allocation a single page, or block of memory, at a time. So whenever you boot up your system, you can just take all the physical memory, divide it into physical pages, and then the kernel just maintains a free list, which would just be a linked list of all the free physical pages, so they just link one to the other. They're not being used by the program, so they can store a pointer. And yeah, so the unused pages themselves store that pointer; it would get initialized at boot.
And then to allocate a page, really simple if you're the kernel: you just remove it from the free list and use that page for your allocation. And then whenever you need to deallocate that page, you just add it back to the free list. And since you're keeping track of the mappings, one page is as good as any other page; you don't actually care where it is in physical memory. Okay, we're fine. All right, so here is some more explanation of our insight of using a page for each smaller page table. Because if we just do that optimization, the kernel just has to allocate pages. It doesn't really care for what; it doesn't have to have a separate allocator for page tables and a separate allocator for pages that the process uses for actual memory. It can just use the same thing, just gives out a page, and then we decide what to use it for. So in this case, just to be really explicit, the reason why we have two to the nine entries is because all of our entries are eight bytes each, oops, eight bytes each in this case, and they have to fit on a page, which is 4096 bytes. So we can figure out that, hey, this is two to the 12 divided by two to the three, that's two to the nine. And the rule, no matter how many levels of page tables we have: level N always points to the page table for level N minus one, and then we just follow this blindly until we hit L zero, and that contains our final lookup, the physical page number for that virtual page number at the end. So just also to be clear, the page tables, just think of them as arrays. So if I had something like int page_table[512], quick, off the top of your head, how many bytes is that? What's the size of that array page_table in bytes? How many? 2048, right? Yeah, 2048, so an int is four bytes, and if I have 512 of them, that means it's 2048.
All right, so remember I talked about offset; it's just the number of bytes from the beginning something is. So if I'm accessing index two of this array, what is the offset at index two? It's that gang sign, what was that answer? Eight, eight, right? So an int is four bytes, so index zero is at byte zero, offset zero; the first thing's at the beginning of the array. Index one would be at offset four, so it's four bytes after, because an int is four bytes. So index two, that would be eight bytes into it, because, well, the size of an int is still four. So, oops, so think of the page table the exact same way, as an array. All I do is, instead of saying int, I just say page table entry. So in this case, if I have an array of page table entries, well, the size of this page table should be 512 times the size of whatever the page table entry is; in the case of the default 39-bit three-level page table, it is eight bytes. So these equations should always hold, even if I just come up with some random memory system. So the size of the page table should be exactly the same as the page size, and the size of the page table should just be the number of entries in the page table times the size of the page table entry. Nothing crazy, hopefully; they're just arrays, right? Any arguments about them being arrays? All right, so let's go over a translation with just one additional level. So in this case, with just one additional level, it will be a 30-bit virtual address. So say my virtual address is this long thing, yikes, that looks bad, but I divided it up here: this is the L1, this is the L0, and then this is the offset. So instead of all that, let me just switch to this and go through it. So in our scheme, we will have a 30-bit virtual address, and it'll use that good old SV39 scheme, although, well, sorry, it'd be more like SV30.
So in this case, when we write out our address, it'll still have the same sections. So it'll still have an offset, and then previously we had this whole thing as a virtual page number, but now, because we are using a multi-level page table, we're going to split it off into smaller page tables that fit exactly on a page, in which case, for each level, we need nine bits. Nine bits, and then our offset is still 12, because our page size is still 4096. So questions about that? So nine plus nine plus 12 equals 30. All right, so as long as I'm not crazy. Okay, as long as I'm not mathematically crazy. All right, so our address was something like 0x3FFFF008. So I know, thankfully, I don't have to translate the offset. The offset's a nice number: 12 bits, that's three hex characters. So I will just write it like it was 008. Don't have to translate that. The rest of the address was 0x3FFFF. So it's basically just a bunch of ones. That's how I designed it: all ones for L1, and then all ones for L0. And then the offset is the same. So far, we're good with breaking it up into those parts. Okay, so to translate this address, we need to know where the L1 page table starts. The question would probably tell you where it starts. So in this case, let's assume the SATP, or the root, starts at something like, I don't know, PPN 7. So if it starts at PPN 7, that means that the contents of that page table stretch from 0x7000 all the way to 0x7FFF. So that entire page is going to contain all of the page table entries for that L1 page table. And they're gonna go across all those addresses, yep. So the SATP is just the register that holds the root page table. Yeah, in this case it would be the L1. So this is what's in the SATP register: just physical page number seven. And that would be where the L1 page table is. Yep, that's a register in the CPU, and the MMU uses that one.
Yeah, so that's basically your page table. So if you context switch to a different process, it would just change that register so it points at that process's root page table, and then we're off to the races. All right, so since it fits on a page and we're not doing anything weird, well, what indexes L1? What L1 index do we want to use? Or how good are we at knowing the maximum number in a nine-bit number? Yeah, 0x1FF, so this is our L1 index as part of the address to use. So what is that index, I don't know, in decimal or something that we can understand? Two to the nine minus one, so it should be 511, essentially, right? So how this translation will start working is, on this L1 page table, we will use this index from the address. So we go ahead, we look up in here, we figure out what the page table entry is at, essentially, index 511, and it should tell us two things: the PPN, and then whether or not it's valid. So let's say the PPN was, like, 8, and it was valid. Then what that means is, to follow through and do the next step of this translation, we have to use the page table at page eight, which would be all the addresses from 0x8000 all the way to 0x8FFF, and then we treat it as the L0 page table. Which thing? This? Oh, so if I'm at physical page number seven, it means all of these actual physical addresses are all on the same page. So remember, for our physical address, it always looks like PPN and then an offset, right? So if I'm able to access any single byte on that page, well, PPN is seven, and then I'm just saying all of the possible values that this offset could be. So it could go all the way from the offset being all zeros to all being Fs, and you can go over that range just to double check: this is 4,096 bytes. So it kind of makes sense, right? Okay, yep. So here, the eight bytes would just be the size of this thing. So yeah, the eight bytes is the size of the page table entry. Bits? Bytes, yeah.
Memory is always byte addressable. Yeah, so memory is always gonna be byte addressable; each index is a byte. And then, yeah, that part of the address, that's bits, because we get to select what byte we want. Right, the offset is essentially what byte we want: how many bits do we need to select all the bytes we want? Okay, so where are we? So yeah, now we know where the L zero page table is. It's on physical page eight, which is addresses 0x8000 all the way to 0x8FFF. So here it's the same index, except we would look it up in this page table. And then in this page table, we'd get a page table entry at 511 as well. Maybe it's something cooler, where the PPN is, I don't know, 0xCAFE, and it's valid. So since that is our entry in the L zero page table, to get the final physical address for what we had before, 0x3FFFF008, same rule as before: we just replace the whole VPN, which is this, with the PPN, which is now 0xCAFE, and then we keep the offset the same. So our final resulting address, after all that time, is 0xCAFE008. So questions about that, yep. So the ones, they're not stored in L one or L zero. It's just how we broke up the virtual address. We're trying to access 0x3FFFF008. Yeah, so this would be your program trying to access this. So that's a virtual memory address? Yeah, so this would be the process's virtual address. And then the other components: your kernel is going to manage all of these page tables. So it would set SATP for that process, and it would actually fill these entries in for each page table, and that's it. And then the MMU would be responsible for using those page tables to actually do the translation. So we're doing the job of the MMU here. So SATP, that's just pointing to the root page table for that process. So that's all we needed, right? So, the L zero, for example? Yeah, so the only things we need to translate a virtual address like this are the root page table, and then that's it.
So you're wondering, if everything uses virtual memory, how does the kernel know where to start. So everything in the process is virtual. So the kernel can either, one, only use physical addresses — you can turn off the MMU — or the kernel has to play a fun dance where it uses its own virtual memory. So the kernel would have an L two page table dedicated to itself, and it has to make sure that when you do, like, a system call, that's part of what it has to do: it has to switch to its own virtual addresses, because it doesn't care about yours. And then do stuff, and then it manages everything. So you can see, once you get into how to actually manage page tables while juggling other processes, it gets a bit tricky, but hopefully for a single process, it's not too bad. All right, any other questions about that chicken scratch I just wrote? Oh yeah, so it'd be 0xCAFE008. Yeah, everything would have leading zeros. Yep. Well, I didn't just come up with 0xCAFE008. The physical page number came from here, right? In the L zero page table. And the offset remained the same. So the offset was always the same; that was part of the virtual address. Oh, the 0xCAFE, yeah, that was in the L zero page table. They're highlighted in green. Yep. So yeah, the eight that we got from the L one, that doesn't factor into the final PPN at all. All that told us is where the L zero page table is. That's all it told us. And then the MMU would use that to figure out where it's actually located in memory. So yeah, you might also notice that, hey, this might actually waste a bit more space. So let's assume each process has a single L one page table that could point to two to the nine different L zero page tables, right? And then each of those L zero page tables, well, they all fit on a page. So they're all two to the 12.
So our total size here is going to be two to the 12 times two to the nine for the L zero page tables, plus we would have a single page for our L one. And if we do the math on this, that's two to the 21, which, if I take it into prefixes, is the same as two to the one times two to the 20. And two to the 20 is a megabyte, and two to the one is two. So that's like two megabytes — throw a little i in there, because that means powers of two — plus four kilobytes. So I need that much space to manage all the page tables if I had to map every single address, right? Well, if I just did one big page table and split up that 30-bit virtual address, it would be 12 bits for the offset and then 18 bits for the VPN. And then, well, turns out my page table would be what? I can have up to two to the 18 entries, and each entry is two to the three. So that by itself is the same as two megabytes. So I'm actually wasting a little bit of space if I have to map every single address, right? So if I just had a single large page table, it would always be two megabytes. If I had two levels of page tables, this is my worst case, where I have to map every single entry. So it'd be slightly more, but the minimum is what we care about. So in order to translate a single address, well, it still needs two megabytes if we have a single large page table. To translate a single address, if we have multi-level page tables, we just need one L one page table that points to one L zero page table. And then that's it. So we only need essentially two times the size of a page, or eight kilobytes. So I win. Right, so it's a little bit more wasteful if we have to actually map everything. So questions about that? Yeah, I don't need them. So in this example, to translate this single address, I only actually needed two smaller page tables, right? I just had an L one that told me where to go find the L zero, and then my final translation was in the L zero.
I only used two pages here. Yeah, yeah. So if the kernel had to start your process out from actual scratch, you'd have one L two, one L one, one L zero, so you can map an address. And then, the more addresses you need, it would just fill up entries with pages until eventually that's full. Then it needs to get a new L zero, make an L one entry point to that, and then try and fill that up. When that fills up, it needs a new L one. Da, da, da, da, da. So yeah, what if you have to fork a process that already has a page table? You get to copy its page table. Yep, yeah, so just to make it clear, in lab three, you get to do the job of fork. So you get to clone the page tables. Fun, trust me, it's fun. So here, as an example for the lab, we can even go over a little example. So just to make it clear, I wrote an MMU simulator for you, so you can go ahead and try this out yourself. So here, I just checked that my page size is actually equal to 4096. And I showed the steps for how to actually go ahead and do the translation. So this would be done in your kernel, and this allocate page would return a new physical page for you. So I just wrote a user space version of it that gives us a page to simulate. So I'd have to create a new L two page table. So I'm going to go all the way back to here. Da, da, da. So going back to the scheme where we have our full three-level page table. So we have L two, L one, L zero, and then finally we get the address. So in this case, I need an L two page table. So I asked for a page, and I'll go ahead and use it for an L two page table. And then to simulate that register, that root page table, I just have a global variable here, where I just set it to say, hey, this is where the L two page table is. Then after that, I need an L one page table. I'm going to translate the address 0xABCDEF. So DEF would be the offset. And then whenever we do the translation, we would have to get the indices from this ABC part. Whoops.
And in order to save ourselves the effort of writing it out in binary, I already did it for you. So it would have leading zeros. So in order to translate that address, the L two index it would use is zero. So then I get an L one page table. Then I create an L zero page table, and in that L one page table, I make index five point to that L zero page table. So in this case, we'd have index five if we wrote this all out. And then the last part is going to be index 0xBC, or 188. So do you want me to write this out in binary to figure out where I got the indexes from? Are we good for that? Good? Okay. I will not write it out in binary. Hopefully you won't have to write it out in binary. So if I go ahead and call this MMU function, this MMU function uses the root page table, follows it all the way to L zero, and then gives us the final address if it's valid. Here I put in 0xCAFE for the PPN at the last level. So if I go ahead and run that, I should see, well, first, assuming all my indexes are correct, what is the physical address I should get for virtual address 0xABCDEF? Oh, whoops, I'm covering it up. Yeah, 0xCAFEDEF, as long as I set it up correctly, right? So if I set it up correctly, all my indexes should go all the way until I get this entry in my L zero page table. And here it just creates a page table entry from the PPN 0xCAFE. So it should keep the offset the same and just replace the entire virtual page number with the physical page number, which is 0xCAFE. So it should be 0xCAFEDEF, right? All right, let's see. So yeah, I got 0xCAFEDEF. And I have the MMU such that it will go ahead and check and see if all the entries are valid. If any entry is invalid — from L two all the way down to L zero, if it hits an invalid entry at any point while trying to figure out that address — you will get a page fault, and this is your seg fault in your program. That's what's actually happening in the operating system. The MMU is just saying, hey, I can't figure it out.
It generates an interrupt to the kernel. The kernel might pass it up to you in the form of a signal. So questions about that? So let's play with addresses. So if I did something like this, is that address going to page fault, yes or no? Yeah, it shouldn't, right? It's still on the same actual page. So if I go ahead — what the hell did I write? 94, whoops, okay, don't know how that got there. So yeah, that doesn't page fault either, because it's on the same page, right? So every single one of these kind of works, and even, like, 0, 0, 0, that will still work, because it's also on the same page. So you know how sometimes your program works even though that address is definitely not valid, like you overstepped the bounds of an array? Guess what: the address was just on the same page. That's why it didn't immediately page fault and just gave you some random crap. So that's why it's not smart enough to give you a seg fault for just random-sized arrays, because, well, you're on the same page and it happens to be valid. Neat? All right, so if I want the next one up, if I did this, is this gonna be valid? Probably not, so it should die if I try to translate that address. Yeah, it gives me a page fault. If I want it to be valid, all I have to do is — I know from this virtual address that it's just essentially going to use the next index up. So it's going to follow all the way until the L zero page table, and then in this case, it would use index 189. So if I wanna make it valid, I'll just change this to 189. And now, hey, guess what? It's valid. If I want to change what physical page it goes to, I'll just change it here. Let's have beef for dinner. So let's just make it 0xBEEF, and out comes beef, zero, zero, zero. Questions about that? So weird things you can do in your process too: since the kernel is managing the page table, and you have some say over it, you can do silly things. You can do that. So here I have two virtual addresses.
You would assume that they are completely independent, but if I go ahead and set up the page tables like that, hey, guess what? They map to the same physical address. So you can change one variable that's completely unrelated to another one because, well, the operating system decided to map them to the same physical page. What a jerk. And we can do any other mapping. So any other questions about this thing? Lots of fun to play with, yep. Yeah, yeah, so in this example, this is all in a single process. So this would be all the page tables for a single process, but you could set it up so that, hey, in one process, this address maps to this physical page, and then in a separate process, this address maps to the same physical page. So you could absolutely do that if you want. That's one way to share memory, and you're not even aware you're sharing memory. The kernel can also do a bunch of fun things. No, I shouldn't explain that yet. But yeah, any other questions about that? We're gonna also go through this again in the next lecture because, yeah, it's one of the more confusing things, I think, that people have. All right, also to note: one of our goals of virtual memory was to make it really, really fast, right? Almost as fast as just using physical memory. Does that seem fast? No, this does not save time. This is way worse. So if I had a single large page table, right, if I wanted to figure out the physical address of a virtual address, how many memory accesses do I need to figure that out? One, right? I just look up its page table entry, figure out where it is, and then do the original memory access. Oh, even that's twice as slow. All right. So if I have two levels of page tables, how slow is it? Well, here I have two lookups. So I looked up L1, then looked up L0, then did the memory access: three times slower. In the case of this, where we had three levels of page tables, oh, we have three lookups and the original memory access.
So it's four times slower. Yikes, we created another problem. So we saved ourselves space, but we actually made it slower, which we get to solve in the next lecture. So just remember, I'm pulling for you; we're all in this together.