All right, please find your seat. Let's get started. So last lecture, we started our journey into virtual memory, and we left off with a page table, which was basically a giant lookup table, right? We decided not to translate individual addresses, because that would be insane. So we decided to have big blocks of memory and translate something called pages, which are typically 4,096 bytes, and that's where we left off. So any questions about where we left off yesterday? Pretty clear, hopefully. No major questions? So the big problem we were left with last time is, well, each process would need its own page table, because each process has access to its own virtual memory. So if we have just one giant page table that supports up to a 39-bit virtual address, well, our page table would need 2 to the 27 entries, one for each virtual page, and then each of our entries would essentially need to store at least a physical page number. So that's one of the things that would need to be stored in the PTE. Does anyone remember what other things we need to store in a page table entry? One thing we definitely need to store is the physical page number, so we can actually find this page in physical memory, and what are some other things we need to store as part of the page table entry? Yeah, at least a valid bit, maybe some other flags like whether I can read or write to it, maybe whether I can execute it. So we'd have a physical page number, plus some flags, which would definitely include something like a valid bit. So this doesn't really work, because, well, if we multiply those two numbers together to get the size of our page table, well, that means the size of our page table is 2 to the 30, so 2 to the 27 times 2 to the 3, which is what 8 is. Again, generally it's better to keep everything in powers of 2, because then using exponent laws you can just add them together.
So 27 plus 3 is 30, so the size of this page table is 2 to the 30, which again is 1 gibibyte. So you might notice I write something weird. I write a lowercase i after the gigabyte, GiB instead of just GB, and that's because gigabyte means something different depending on who you're talking to, and GiB definitely means base powers of 2. So in that system, 2 to the 10 is 1 kibibyte, 2 to the 20 is 1 mebibyte, 2 to the 30 is 1 gibibyte. And then depending on who you're talking to, well, a kilobyte might mean the same thing, or maybe not. Maybe it means base 10 because we're humans, maybe it's 10 to the 3, and a megabyte is 10 to the 6, then a gigabyte is 10 to the 9, and then let's do a terabyte for fun. So a terabyte in base 2 units, a tebibyte, is 2 to the 40 bytes. In base 10 it would be 10 to the 12 bytes. And these numbers aren't that much different at kilobytes, because it's 1,024 versus 1,000, basically the same thing. But if you go all the way to a terabyte and you actually take the difference between the two numbers, it becomes somewhat significant. So the base 2 units are the ones we care about and that make sense to us. And the base 10 units are what hard drive manufacturers will sell you. So has anyone ever had a disk and then formatted it, and it's way smaller than it should be? Well, if you read the fine print, it's because their definition of a terabyte is probably different from your definition of a terabyte. And since it's actually kind of ambiguous, they choose the smaller one because it's cheaper. So that's just a fun aside. So I'll always write the lowercase i there, which means it's base 2. So that's the fun fact for the day. So that's way too big. One gibibyte per process is not going to work. So this is like the shock of the lecture. So this is what we're going to do. We are going to divide up that huge, huge, huge page table into a sequence of smaller page tables.
And the idea is, well, if I only need to translate a few pages, well then, I don't have to store one giant one-gibibyte table. I could actually just store something like, in this example, at minimum three tables. So each of these tables is significantly smaller. You'll see they only have 512 entries, so index 0 to 511. And they just contain normal page table entries. But instead of doing a direct lookup, they just point to the table below. So instead of our one big page table, we would start at L2. That's where our first page table would be. We get an index into there, which is significantly smaller. And that points to another page table. Then we do the same thing. Point to another page table until eventually we get down to the bottom here at our lowest page table. And then that just behaves as normal. That has the direct lookup we had. So at the end of the lecture, we're going to explain this. It's OK if this doesn't make that much sense right now. But after we're done, we'll come back and see if we understand it. So first, we'll dispel one kind of myth about what we're talking about. All of this page table stuff, all of the management, is done by the kernel. And it directly accesses physical memory, because it's the kernel. It needs to make virtual memory for your processes, so virtual memory doesn't exist before the kernel sets it up. So when you boot up your computer, your kernel will figure out how much physical memory is in your machine. And what it's going to do is divide all that physical memory up into pages and essentially maintain a giant linked list of all those pages. So it'll have a linked list of all the free pages, and for efficiency it would just store the pointer on the free page itself, because all that memory is not used yet. So it's basically just a giant linked list, and each free page just points to the next free one.
And initially, when you boot up your computer, the kernel will just create a link to the next one, to the next one, because nothing's using memory yet. So this would be the initialization at boot. It just creates a bunch of pointers between all the actual physical memory blocks. And then in the kernel, memory allocation becomes dead simple, because everything is the same fixed-size block. We're dealing with pages. So if you request a block and the kernel has to allocate it for you, it just removes it from the free list, whatever the head is pointing to. And then you just use that block of memory and it's allocated for you. And then when you're done with that page, it just gets added back to the free list. And that's it. So your data structure operations are: add stuff to a list, remove stuff from a list. In this type of memory allocation, that's all you need to do to implement memory allocation. You just request a physical page. You get it. When you're done with it, you give it back. So any questions about that? Yep. Yeah, this would just be a list of pages that we haven't used yet at all. OK, so our major insight is: since we're dealing with pages, and it's really easy to allocate pages, we don't want any special cases for a giant page table or a really small page table or variable sizes. So the insight of the designers of these virtual memory systems was to just use a page for each of the smaller page tables. So instead of one giant page table, I'm going to create a page table that fits exactly on a page, which in this case, and for most of our examples, will be that magical 4,096 bytes. So we can work backwards from this. We know the total size of our page table, which would be 2 to the 12. Again, remember, we've got to get really used to base 2 units to make our lives easier.
Well, if each page table entry is the eight bytes we were discussing before, well, then it's just a simple question of how many eight-byte things can we fit on a 4,096-byte page? And if you just divide, you get 512. So I could fit 512 page table entries on a single page. So that's what we do. And each of these levels of page tables just points to the level below it. So in that case from before, we had three levels of page tables. And of course, we index from zero because we like computers. So you start with L2, L2 points to L1, L1 points to L0. And then we treat L0 as our normal big page table that just has our direct lookup in it. So these smaller page tables, just to get used to them, they're just like arrays. So if I write one out as a table with two columns, index and page table entry, well, if we want to be super efficient and store it on our computer, we just store it as an array and we wouldn't have to keep track of the index. You wouldn't store the index. Everything would just be nice and sequential and contiguous. So just to review to make sure we are OK, if I had something like this in C, so I have an array of ints and I have 512 of them, what's the size of this? How big is that array? And what is the size of an int? Four bytes. Yeah, so if I want to figure out how big this array is in bytes, and remember, we do everything in bytes because memory is byte-addressable, so that's the lowest atomic unit we have. Well, there are 512 ints, so the size of this is 512 times whatever the size of an int is, and an int just happens to be four bytes. So this would be 2,048 bytes big. So any questions about that? Hopefully that's review. OK, so now if we have a variable int x and we do something like this: x = page_table[2]. Well, we brought up that word offset before, so let's review and make sure we understand it. So what would be the offset of index two, knowing that integers are four bytes? Yeah.
Yeah, so the answer was the offset of that is eight bytes. So does anyone have any questions about how I got that? Because this array will be stored at some address somewhere in memory. Generally, you don't care about where the base pointer is, where the array starts. And to use consistent nomenclature, the offset is how many bytes from the beginning of the array something is. So in this case, well, everything's index zero, so the first element of your array is going to be at byte zero, which is an offset of zero. So that just means it starts at the beginning. And because all of our integers are four bytes, well, index one would start four bytes later, so it would be at offset four. And index two would be four bytes after that, so it would start at offset eight. And the offset is basically just how many bytes away from the start something is. So in this case, page table index two would be at an offset of eight bytes. So doing this whole page table thing is the same idea. We just have to substitute a page table entry for an int. So now all I did is change int to page table entry. So now I have 512 page table entries, and everything is still the same. So if I take the size of the page table, well, that's the same as the array: it's the number of entries times the size of whatever thing I'm storing. An int was four bytes; my page table entry in this scenario is eight bytes, so it would be eight times 512, and we can double-check that against the size of our page. So 2 to the 9 times 2 to the 3, that's 2 to the 12, which is the exact same size as our page. So any questions about this? So our smaller page table, because we want to fit it exactly on a page, would have 512 entries, which would be index 0 to 511. So any questions about that before we move on? Yep, so the question is, is there a standard page table entry size? And the answer to that is it depends on your CPU.
For a given CPU there's a standard, but when you jump between CPUs, it may or may not change. For any questions, you'd be given the page table entry size, or maybe you'd have to explain some trade-offs: if I made it smaller, well, I could probably support less physical memory, but I could fit way more entries on a single page table if it were half the size or something like that. Okay, so let's consider just one additional level. So before we had a 39-bit address; now we'll just have a 30-bit virtual address. So our 30-bit virtual address is something like this: 3, F, F, F, F, and then 0, 0, 8, so 0x3FFFF008. And we're gonna have the exact same page size we had before of that magical 4,096. So if we divide up this address into offset and indexes, well, we know that since our page size is 2 to the 12, we don't have to translate the last three hex digits, because they're essentially the offset into the page. We don't need to translate that part because it stays the same no matter what page we get. And then the rest of the bit values up here are just all ones. So if we write that in binary, we would get something like this, where every hex digit becomes four binary digits. Or, if we want to break it up into different categories, it would be like this. So this is 12 bits, that's our offset that we don't have to translate, and then we group the rest into groups of nine that are used to index these smaller page tables. So if we start backwards, the L0 group index, which would be nine bits here, is all ones, and then we just go to the next group of nine bits. So the next group of nine bits is also all ones.
So if we go back and we divide up our address, we get something that looks a bit like this, where our L1 index is nine bits, our L0 index is also nine bits, and then at the bottom, what I don't have to translate is the offset, which is 12 bits, and I'll just keep it in hex and write it in red. So I don't have to translate those at all. So I just grouped up my numbers: I started at the back, chopped off the offset bits that I don't have to translate, and then went in groups of nine to figure out the indexes. And as a sanity check, this is a 30-bit virtual address. So if I add all my bits, nine plus nine plus 12, that's 30, so that's a good sanity check to make. So any questions about breaking that number up into our indexes? Okay, so now is where it gets tricky and we try to follow things. So instead of each process having its own giant page table, it would have its own root page table, which would be the highest-level one, L1. So it would start with its own L1 page table, which has those 512 entries, and that's where it starts translating this address. So somewhere your kernel would allocate a page for it, it would get a physical page number back, and it's free to add entries to that page. So for instance, in this case, if your kernel just gets physical page seven, well, now it's going to use physical page seven as the L1 page table. So because the kernel gave you back a page, well, going back to this, we essentially have 512 page table entries we could make on that page. So somewhere in there we would get a page back, and this would be our L1 page table, and it would start at physical page seven. And since this is a page, the page we get back is 4,096 bytes.
So how many page table entries can I fit on this thing? Yep. Yeah, 512, as we just saw right there. So I can fit 512 of these because my page table entry size again is eight bytes and my total page size is 4,096. So to visualize it, instead of writing out an array, we could do it in the form of a table, where it would have entries all the way from index zero, one, two, dot, dot, dot, to 511. And then let's also put another column here to say if the entry is valid. Well, to translate this address, if I look at it, my L1 index is all ones, which is 511. So to translate this address, I would use my L1 index, and I would know from the kernel that this process's L1 table is located at physical page seven. And then I would look into it and actually look at that entry, which would be 511. And I need the valid bit to be on so I can use it. And its physical page number would have to point to an L0 table to use to continue the translation. So in this case, it might have something like a value of eight. So if I want to continue translating this address, it means that the next table I should look at starts on page eight, and I should treat that as an L0 page table. So any questions about that? Or maybe hold off and we'll see what it looks like at the end. Yep, yep, yep. So this is in hex. Yeah, the offset's in hex here because I don't need to translate it, so I don't really care. I can just keep it the same. Also, I ran out of room on the page to write the 12 bits out as a bunch of binary digits. Okay, so to continue translating this address, well, I would have to look at my L1 page table. It would have an entry for that index. So the L1 index again is this one, which is 511. So in order to successfully translate this address to something, that entry would need to be valid and point to another page.
So it would point to another page, and let's draw what that would look like. So it would be another page that would start, in this case, at physical page eight. Again, this would be the same size as a page, and it would look pretty much the same. It would have its own indices, zero, one, two, all the way to 511, and it would have a valid bit and also the physical page number of the translation. So if I wanted to continue translating this address, well, this is my L0 page table. So I should look at my L0 index here, and it's the same one, 511. So I would go ahead and look at the entry at index 511. In this case, it would actually contain our translation. So let's just make up a value. Say it's 0xCAFE, just to make it more fun, and it would have to be valid. So if we follow this, well, now we can actually just do the same thing we did last time and translate the address. So what would my final address be if my virtual address was 0x3FFFF008? So when we had a big page table, what did we do to get the final address? We just kept the offset the same and then replaced the whole virtual page number with the physical page number, right? So we do the same thing. This L0 table, we treat it just like our big table: it stores the final translation. So to get our physical address, well, our offset doesn't change. It's still that 0, 0, 8, and we just replace the virtual part of the address with the physical part. So the physical part, excuse my writing, is this. So you just replace it. So our physical address would be 0xCAFE008. So any questions about how we got that? Yep. Yeah, so the question is how are the two tables connected? So each process would have its own L1 page table that stores the top level, and that just stores page table entries pointing to a bunch of L0 tables.
So you would be given something like this, that it would start at physical page seven, and this would be your L1 table. And as part of decoding the address, the hardware on the CPU, which we'll go over next lecture, will say, okay, well, I know I need to translate two levels of page tables. So this is my L1; whatever value I find here points me to my L0. It just knows, and that's how they're connected. So because we have two levels of page tables, we know we're starting at the highest level, so this is an L1 table. And then its entries all point to L0 page tables. And we know that just by knowing how many levels of page tables we have. Yeah, so the question is, is L0 shared between all the processes? So what we'll get into later, one of the little tricks you can do, is you could share them if you want to share memory. So if two different processes have the same L0 entry, well, then all those addresses are gonna translate to the same physical addresses. And that way both processes are actually gonna get the same memory and start sharing memory. So we'll see when we do translations what we can do with that, because we could have processes share memory if we want. If we don't want them to share memory, we just make sure all the L1 and L0 tables are all independent. Yep. So these are all still in physical memory, right? So this is what the kernel is maintaining, and it has to use physical memory. So all the storage of the page tables is done in physical memory, and they're all pages. So that's why we use this physical page number: we just use it to say where that page is in memory. And we can do that because everything fits on a page.
Yeah, okay, so remember, physical page number, in this case physical page number seven, it doesn't start at address seven. It would start at address 0x7000, right, and go all the way to address 0x7FFF. So that's where it would actually be stored in physical memory. So if it says physical page seven, it means that all these addresses are on that same page, which, to sanity check again, well, 0x000 to 0xFFF, the difference between them is that same 4,096 bytes. So everything kind of jives. Yep, so how big the physical page number is depends on whatever we wanna support. So in this case, if our page table entry size is eight bytes, well, what the designers of RISC-V chose is to use only 44 of those bits for the physical page number, so it can support up to 2 to the 56 bytes, a 56-bit physical address. So yeah, the size of the page table entry determines the size of physical memory you can support, while the size of the virtual address is just how much each process can actually address. Yep, yeah, so there'd be two page tables here. So the first one, the L1 page table, would live on page seven here. So L1 lives on this page, which if you translate it to physical addresses, straight-up physical addresses without caring about pages, would be 0x7000 to 0x7FFF. And then this table over here, because it starts at physical page eight, well, the addresses that represent this page are 0x8000 to 0x8FFF. Yep, sorry. So yeah, you don't store the address, you just store the physical page number. Because if my physical page number is 0x8, well, it means I can access all those addresses on that same page. And the way to think about it, from what we've been doing to translate, is that all the offset bits are for you to use for your own purposes. That's where you are within a page. So those can be whatever you want.
And the upper part of the address would be the physical page number. Yeah, so if we go back to this, this entry here, 511, that would be at a particular offset in the page, the same way that an integer is at a particular offset in memory. So in this case, this would always start at, oh God, math. Yeah, so if all of our entries are eight bytes and our page is 4,096 bytes, that means entry 511, the last one, would start at byte offset 4,088. And that offset is the same no matter what page we're on. So if this page table happened to be stored at page nine, that offset does not change. But if I were to access that entry, well, it would be at address 0x9FF8. Because 0xFF8 is the offset, that's 4,088 in hex, and 9 is our physical page number. Or was that too quick? That might have been too quick. Okay, so we have, yep, sorry. Yeah, yeah, yeah, yeah. So the question is, well, in this case, my physical page number on the left is only one digit and the other one is four hex digits. And that doesn't have any specific meaning. My PPN would be a certain size for both of them. In the case we're following, the RISC-V one, they're all 44 bits, and I just didn't write any leading zeros. So, yep. Yeah, so these index bits as part of the virtual address don't correspond to what page you actually use to do the translation. That's a separate thing that gets stored. So what would be stored here as part of the operating system is this part that says the L1 table starts on page seven. So that would be stored separately. So then you combine these two things. The virtual address tells you where to index in these page tables. Yep, so this would be: within L1, where do I index? And that would be 511. And then this would mean: within L0, where do I index?
And then the other piece of the puzzle to actually translate the address is I need to know where L1 is. So you'd be given that; in this case, to translate the address, I would have to tell you where L1 is. So in this case I said L1 starts at page seven. So then you'd use that index. So then I use my L1 index; I know to go to entry 511 to find what I'm looking for. That tells me the next piece of the puzzle: what L0 table to use. And then I go back up to my virtual address, and that answers the question: what entry in that table do I need to look at? And once I have that, that's L0. You just do the same translation as we did with our big page table, and that gives you the address. That makes sense, hopefully. Yep, yep, yep. So now we know, when we create a process, since each process has its own root page table, well, as part of creating a process, or a new process when you fork, that's independent. The operating system's gonna have to ask for another page, and then it gets another page and says, I will use this page table for this process. And then it can monkey with the entries, depending on whether you wanna share memory or keep memory independent. But all the page tables do is help you translate virtual addresses to physical addresses. Right, good, yep, yep. So let's go back to our first slide of the evening. So now we can understand this a little bit better, hopefully. So for this slide, we look up a virtual address and we slice it. In this case, all of my page tables support 512 entries. So I start dividing it up, after the offset, into groups of nine bits, since 2 to the 9 is 512. So the example we just saw was starting at L1 and going down, but now, if we were to add another level of page table, it's just the same thing, but we do it again. So as part of our address, we would have three levels of page tables.
So an L2, an L1 and an L0, and then this SATP, that essentially points to the root page table that you would use for this process. So in this case, this points to an L2 page table, and that's part of the information we'd be given, and then we can figure out the address. So we would use this L2 page table, get the index from this part of the virtual address, and then that physical page number is going to point us to an L1 page table. So it would point us here. Now we are using this as our L1 page table, and we get the index from the virtual address again, find the entry, and its physical page number points to an L0 page table. And this is our lowest-level page table, so it's like the big table again. So we would just take the physical page number, and that's our physical address, and we don't touch the offset. Whatever the offset is, we don't change it, yep. Yeah, so that's a good question. So what if I'm having so much fun with all these page tables that I want to do it again? I want four levels now. So you can do four levels. So let's see. With three levels, well, our address looks like this, and we always use nine index bits. So right now we have a 12-bit offset, then nine bits for L0, nine bits for L1, nine bits for L2. So if you wanted to add another level, because everyone's having fun, right? You just add another nine bits here. Now how big of a virtual address can we support if we add all those bits together? Sorry? 48. So if we had so much fun that we want another level of page table, well, if we add another level, essentially we can have a 48-bit virtual address now. That additional level lets us translate more addresses. So now we can support a 48-bit virtual address, which is massive, right? Because a 39-bit virtual address, well, it means I can address 2 to the 39 bytes, which is 512 gibibytes. What about for 48 bits?
How much virtual address space, what's the size of my virtual address space? Anyone, quick powers of two? So 48 bits is 2 to the power 48. Do we remember the biggest unit we could chop off of that? Yep. Yeah, terabytes. So if I rewrote that, a terabyte, in base 2, is 2 to the 40. So I would have 2 to the 8 terabytes. You should know what 2 to the 8 is. That's 256. So if I added another level of page tables, suddenly every process can address up to 256 tebibytes of virtual memory, which seems pretty good. It's a lot bigger than what 39 bits gives us. Why might we not want to do this? Yeah, so one answer is that my page tables take up a bit more space. We do need another level of table to translate an address, and it would take up a bit more space, but at minimum only one additional page, which isn't too bad. Oh, yep. Why would it be slower? Yeah, so all our processes use virtual memory, and as part of using virtual memory, your computer needs to do the translation. Each translation step takes time. So if I have three levels of page tables, well, I have to do three steps, and if everything scales linearly, when I add another level and now do four, well, I've slowed it down by like 25%. And we have some ways to get around that, but basically no one has above 512 gibibytes of memory anyway. So why would you pay for something you're not gonna use? Yep. Oh, yeah, so the question is why is the offset 12 bits, and is it always 12 bits? So the offset always corresponds to how big a page is. In this case, we have 4,096-byte pages, which is 2 to the 12, so we need 12 bits to address every single byte within a page. And we call that the offset because it stays the same no matter what page we're on. And this would be given to you. This is the usual size on any hardware that I know of today, but you know, for fun questions you could dream up, I could just change the page size and ask, oh yeah, well, what happens now?
And yep, yeah, so the question is why do we have nine bits for the different levels? And the answer is because of our nice insight: if our memory allocator just deals in pages, and we can make it really, really easy, our insight is to always use exactly a page for our smaller-level page tables. So it's 512 only in this case because we know our page size and we know how big our page table entry is. So essentially we have this equation, and you have one unknown, so you can always figure it out. So on any type of question or any design problem, you have this equation, and you'd always have one unknown. So I could give you the size of a page, which would be the size of our page table, and I could tell you the size of the page table entry. Then you can figure out how many entries you can fit in a page, and the number of index bits you need comes directly from that, because you have to be able to index each one of those entries. So in this case, the size of the page table is 2 to the 12, the size of the page table entry is 2 to the 3, or eight bytes, and that means the number of entries has to be 2 to the 9. So I need at least nine bits to index every entry within that page. So the number of index bits you need is gonna change depending on this. So if I suddenly say, hey, let's rewind to 20 years ago when we had 32-bit machines. Well, when you had 32-bit machines, the page table entries were four bytes instead of eight bytes. So if they're four bytes instead of eight bytes and we still have the same page size, well, how many entries can we store if our page table entry size is four bytes now? We had 2 to the 9 before, when they were eight bytes. How many when they're four? Yeah. Yeah, 2 to the 10.
So back in the day, well, we could fit more entries on a single page, and lo and behold, if we actually go back to it and we have a 32-bit machine that has a 32-bit virtual address and the same page size, well, our offset is going to be 12 bits. That's going to be our offset. And since we can fit more entries on a page table, 2 to the 10 of them, each of our index fields needs to be 10 bits. So this would be our L0 index. And if we were just to work backwards, well, that's 10 plus 12, so that's 22 bits. So I have enough room for another level, the L1 index. And it would also be 10 bits. And lo and behold, on a 32-bit machine, well, 10 plus 10 plus 12, that's 32 bits. So I can translate any 32-bit address, if my page table entry size is smaller, with just two levels of page tables. So this is like the state of the union before 64-bit machines came into the picture. This was all fine, but it obviously had some limitations. This is what it was like before. So any questions quickly on that before we wrap up? We'll do more practice and everything tomorrow. Yep, quick. Yeah. Oh yeah, so the quick question before we leave: there is the SATP here, and that's basically a pointer to the highest-level page table. So that's just a pointer to the L2 page table: what L2 page table do I need to get started on this translation? All right. Well, just remember, we're pulling for you. We're on this.