All righty, welcome back to operating systems. Lab three is now out, so hopefully you have learned something from the labs, and maybe something about time management too. Just as a guideline, the labs are meant to be about 10 hours over two weeks: assuming five courses and eight hours a week per course, minus three hours for lectures, you have about five hours a week per course, so a two-week lab means you should spend around eight to 10 hours on it. It seems most people fell within that ballpark, but if you didn't listen to me and didn't check for errors and crash immediately, you might have spent eight hours debugging a single thing. So, listen to me. The next lab should be a lot easier. If you're good at writing concise code, you should have it done in under 80 lines or so, and it's mostly thinking — you won't have to deal with threads or processes or anything like that, just page tables. So you should be able to start it after today. Page tables, always fun. Where we left off on Tuesday, if we remember, we had this gigantic page table full of page table entries, and that was how we actually translated virtual addresses to physical addresses. That page table was absolutely gigantic — a gigabyte, which is not something you want for a single process. So let's brainstorm together: what should we do about this page table size? Most programs don't use 512 gigs of memory; your programs probably use a few megabytes or something like that. So what should we do in order to not have a gigabyte page table? Any ideas? Use a hash? Use a cache? Well, a cache just makes lookups faster; it wouldn't reduce the number of entries we have, so it would still be a gigabyte. Someone read ahead in the lecture: make it fit on a page. Or, in general, let's make it smaller, right? That's the solution.
So, one nice thing to do would be to make the page table fit on a page. We'll see why we do this in a little bit, but it turns out a page is a nice size: your operating system can allocate pages nice and easy. So if we only have a page for a page table, we can do some math and work backwards. Our pages are two to the 12 bytes, so that would be the entire size of this page table. And if our entry size is eight bytes, which is two to the three, then I can fit two to the 12 divided by two to the three entries in this page table, which is two to the nine. So I would need nine bits for my index, and that's it. Now my page table is only four kilobytes. And when the MMU is translating an address, all you have to tell it is where to start — where the page table is physically located in memory. There would be a register that holds a physical address saying where your page table is. Then the MMU looks it up in that page table and does the same translation we did before, except instead of 27 bits of virtual page number, now we only have nine bits. And the physical side of the entry is still the same, because our page table entry is the same deal. Everyone on the same page for this one? Yeah. Now, this does not cover the same amount of virtual memory: I can only fit two to the nine entries if I make the page table small enough to fit on a page. But everything else stays exactly the same as we had before. All right. So the next obvious question might be: with two to the nine entries, if each of those represents a page, my program can only use at most two megabytes of memory. So what happens if I need more memory? What about a program that actually wants to use all 512 gigabytes, or even just a gigabyte?
And that leads us into our solution, which — get ready for it, because this is where everyone's brain starts to break — is multiple levels of page tables, and we will spend the rest of this lecture talking about it. The idea is that you cascade a bunch of smaller page tables that each fit nicely on a page. Up in the virtual address, before we had 39 bits, and if you add all of these fields together, you still get 39 bits. Luckily our offset doesn't change — it's tied to the page size, and we're not changing the page size at all — so it is 12 bits. And L zero, the lowest level of page table, gets nine bits, because, just like before, we want each level to fit on a single page. So here, if you count up the number of bits, we have to have three levels of page tables. Now, when we translate an address, the root page table register points to an L2 page table. The only difference is that this page table is still full of page table entries, but the physical page number in each entry is actually the physical page number of an L1 page table. So the entry tells you where to find the L1 page table, and then in the L1 page table you use the L1 index from the virtual address to find the entry, and that entry's physical page number tells you where to go in the L0 page table. Once you're in the L0 page table, it's the same as we had before: that page table entry has the physical page number that corresponds to the actual physical address of the virtual address you're interested in. So, any questions about this? Because this is how your page tables actually work. Yep. Yeah, caches are named the same way, but this has nothing to do with caches — they're just different levels of page tables that happen to use the same naming.
The only thing here is that you might consider the numbering backwards, but L zero is the page table that has the entry you actually want — the one where we just look it up and get our physical address directly. No matter how many levels of page tables there are, once we get to L zero, that has the actual physical address of whatever we're interested in. Yeah. So in the L zero page table here, you can only have two to the nine entries. But this L2 table can point to two to the nine things, and each of those can point to two to the nine things, and each of those can point to two to the nine things — that's how we map the 27 bits of virtual page number here. We're still representing that 512 gigabytes of virtual memory. And yeah, isn't this just more steps? We'll get to that. So this two to the nine — all of these levels are two to the nine because we want each page table to fit on a page. The number of entries times the size of a page table entry needs to equal our page size. Our page size is two to the 12, and our page table entry is eight bytes, or two to the three. So here I can write that out: if our page size is two to the 12 and our page table entry size is two to the three, which means eight bytes, then the number of entries we can fit on a page is two to the 12 divided by two to the three, which equals two to the nine. And yes, two to the three bits is a byte, and we use bytes because the page size is also in bytes. So, the obvious question is: what the hell, isn't this the same thing with more steps? Well, this keeps the total number of entries we can represent the same.
So, when we had that one big page table, how big did it need to be to translate even a single address? Yeah — that one big page table is a gigabyte, even if you just need to translate a single address. In this case, if I need to translate a single address, I only need one L2 page table, which is a page big, one L1 page table, which is a page big, and one L0 page table, which is a page big. That would be 12 kilobytes instead of one gigabyte. And inside each page table it's just the same page table entry as before, with the 44-bit physical page number — the page table entry is 64 bits, so it's big enough. So what are we saving? Each of these page tables is way smaller — each is only a page. The thing we're saving is space, as long as a program doesn't map most of its address space. In the worst case, this actually wastes more space on page tables, but we're going for the common case, which is that most programs do not use all the address space, right? To translate a single address, we only need one L2 page table pointing to one L1 page table pointing to one L0 page table — three pages, 12 kilobytes, instead of one gigabyte. In the worst case, we would have one L2 page table pointing to two to the nine L1 page tables, each pointing to two to the nine L0 page tables, and then you'd actually use slightly more space than a gigabyte. But the case we're optimizing for is most programs, and most programs do not use 512 gigabytes. All right, any other questions? Because we're going over this all day. Yep — oh, that is a good question: how much slower would the lookup be? So, how slow is this?
So, in order to access memory with the single page table, how many times do I have to access the page table to actually find my address? Once, right? If I were just using raw memory, I would have one memory access I actually cared about. With a page table, I need one memory access to look up the entry, and then my original one — so I'm making memory access twice as slow. And in this case, yeah, you're right, this is way worse: to translate a single address, I have a memory access in L2, then a memory access in L1, then a memory access in L0, and then my original memory access — four memory accesses. We'll address that next time, but the solution is exactly your earlier suggestion: put a cache in front of it. We'll see that later. All right, any other questions? All right, then we will just hammer this home. Another reason we do this is page allocation. If you have fixed-size blocks of memory, or pages, writing a memory allocator is dead simple — it's just a linked list. Whenever your computer boots up, the kernel can divide all of memory into pages, so everything is nicely aligned starting from address zero, and the operating system just treats them as pages and can create a linked list of all the pages. Initially, every page just points to the next page, over and over, because you're not using any memory at the beginning. Then, whenever you want to allocate memory, one page is just as good as any other page, so you just take the head of the free list — it's just a linked list, nice and easy. And if someone deallocates a page, it's the same thing: it's not used anymore, so you put it back at the head of the free list and you can reuse it again.
So, allocation becomes dead simple once we have pages — we don't need the complicated memory management schemes we'll talk about later. Dead simple once everything's the same size. And yeah, that was our insight: use a page for each smaller page table. It saves us a lot of hassle, because if everything fits on a page and everything is the same size, it's nice and uniform — you don't have to argue about different page boundaries or figure out how big a page table is if it were dynamically resized or something like that. You just grab another page, nice and easy. Some people get tripped up on the size of page tables, but it's really easy if you just think of it as an array. So, if I gave you an integer array of 512 elements, can anyone tell me how big that is in bytes? 512 times eight? Next. 512 bytes? 256 bytes? How big is an int? Yeah, an int is four bytes, so what is four times 512? Yeah, 2048 — so it's just 2048 bytes. Another question might be: what's the offset of index two? Offset, remember, is just the number of bytes from the beginning. At index zero the offset is zero — it's at the beginning of the array — and since every element is four bytes, the offsets go zero, four, eight, and so on. So index two would just be two times four: offset eight. It's the same thing for page tables — just substitute a page table entry for the int. The entry size is something you'd always be given in a question, and if you're not given it, a good number is eight bytes; by default on your system, it's eight bytes right now. So if the page table entry is eight bytes and we have 512 elements, same idea: how big is that? Well, now it's eight times 512, so 4096, which is — guess what — the size of a page. Everything makes sense, right?
So, you should also double-check these numbers. The page size is 4096, which matches the size of the page table we got — and remember, the size of a page table is just the number of entries times the size of a page table entry. All these numbers should be consistent; if something doesn't jive, you have screwed up, and there are lots of ways to double-check your answer. All right, let's translate a fun address then. Let's write it out: say we have some crazy address in hex, 3 F F F F 0 0 8, and we will assume that we only have two levels of page tables — an L one that points to an L zero, and the L zero is where our actual lookup is. So if we take our address and divide it up, we have 12 bits for the offset, and it's the same scheme as before: our page table entry size is eight bytes and our page size is four kilobytes. So here, if we write it out, we have nine bits for our L zero index — which doesn't split nicely on a hex digit boundary, so I have to write it in binary — and then nine bits for our L one index. The offset part is 12 bits, so I can just keep it in hex; it stays the same, I don't have to translate it. For the rest of it, to figure out what indexes I need to use, I unfortunately have to write it out in binary — which, luckily for this address, is just a bunch of ones: eighteen ones. Writing it all out, the low nine bits are our L zero index and the high nine bits are our L one index. And nine bits that are all ones, brought to decimal just so we can talk about it like actual humans, is 511.
So, in order to translate this address, we need our root page table register — on RISC-V that's the satp register — and it holds the physical page number of the top-level table. In this case there are only two levels of page tables, so it says where the L one page table is. Say it's at physical page number seven. That means our fancy L one page table starts at address 0x7000, because that's the beginning of that page, and it goes to address 0x7FFF. You don't need anything more to know where to go, because the table fits exactly on a page — it has 512 entries, so it uses the entire page. And to double-check, if you figure out the size of that address range, it's 0x1000 in hex, which — guess what — is 4096. So everything makes sense. So, to translate this address, there has to be an entry in this L one page table at index 511. At 511 there needs to be a page table entry — I'll put some columns here, like the physical page number and a valid bit. Say in this entry the physical page number is eight, and it has the valid bit set, which means we can actually use it. In that case, what address do I go to to access my L zero page table? Yeah — because the physical page number is eight, the L zero page table starts at address 0x8000, fills up a whole page, and goes to address 0x8FFF. So my L zero page table would be there. And this makes it flexible: it didn't have to be right beside the L one table, right? And question — oh, could I go over how I drew this page table? I just draw the page table so that every row inside it is a page table entry, and I write the index outside of it, to the left, because the index isn't stored in the table — it just shows which entry each row is.
So, to translate this address, there needs to be an entry at index 511 of the L zero page table, and it could have a physical page number of something like 0xCAFE with the valid bit set, in which case, if that is what is there, then I can translate the address. I keep the offset the same — the offset is 008 — and the physical page number just comes from the L zero page table, whatever the entry is. So my physical address would be 0xCAFE008. Any questions about that translation? We found which entry to use from the L one index of the address, and it's the same thing as the three-level case except we didn't have an L2 — we had a 30-bit address, so we only went up to L one here. But you can see how this gets extended: a 30-bit address gets two levels, add another level of page table on top and it's 39 bits, and with four levels of page tables the next size of virtual address space we can have is 48 bits. We just add another level on. Yeah, and then we have a question: what happens if V, the valid bit, is zero? It means this entry is not valid, so that's when you get a page fault — your process can't access the memory. If you guess a random address in your program and try to dereference it, you're going to get a seg fault — segmentation doesn't actually exist anymore, it's basically just a memory fault: you're accessing memory you're not allowed to have. By default, if there's no valid entry, the entry is just zero, the translation fails, and your program will probably die. All right, any more questions about that? Okay, we can translate an address. That was two levels; doing it a third time isn't too bad, and doing it a fourth time, not too bad.
So, in this case, the savings aren't that big. If we had a 30-bit virtual address and just one giant page table, we wouldn't split it at all — we would have two to the 18 entries, and with each of our entries at eight bytes, in total we would need a two-megabyte page table. You can make up numbers like this and check that they work. Okay, so that was only two page tables, nice and small, and that's it — an example that we can play with and understand, and this will be closer to lab three. So, any questions at all? Because usually at this point people have no idea what the hell is going on, but hopefully that was slightly clear. Yeah — so initially, if there's no valid mapping, your L zero page table would just be full of invalid entries, and if the MMU tried to translate through it, it just wouldn't translate, right? Yeah, so it would look something like this: our L one page table, with all its entries, has a single entry that points to an L zero page table, and that has 512 entries. Eventually, if that all fills up, well, then we have to put a new entry in the L one table that points to a new L zero page table, which can hold more entries, and so on and on. And of course I could have two to the nine of these — the L one table could point to two to the nine L zero page tables. Yeah — to translate a single address, I just need one L zero page table. If I had to translate two addresses, it could be like you said and they're beside each other, and that would be nice because I wouldn't need a second L zero page table. But you could also get insanely unlucky, because it just depends what the indexes are, right?
So, I could craft two addresses that correspond to two different entries in the L one page table, in which case, to translate them, I would need two L zero page tables, because they don't correspond to the same one, right? Yeah — the indexes just need to line up, and if that's supposed to be a valid address, your operating system has to make sure the page tables can be followed until that address is valid. So here, we can do an example too. We have a little simulator, and guess what — that's basically what lab three is anyway, so we may as well see it. In this case, let's make this bigger. Here I just check that the page size is what I expect, and this allocate page table function just gives me a page that I can put whatever the hell I want on, and otherwise there are a bunch of little macros to create page table entries for you. This is a pointer to a 64-bit entry: the page table entries are just 64-bit numbers, eight bytes each. So here I create a new L2 page table, and the only thing the MMU needs to start a translation is a root page table, which would be an L2 page table if I have three levels. So if I have three levels, I have to tell the MMU what my root page table is, so it can start the translation there and follow the levels until it translates the address. So I tell it what it is, then I create an L1 page table, and then at index zero of my L2 page table I create a page table entry that essentially points to that L1 page table. This macro just plays with the bits to extract the physical page number from that address and does that for you. Then I create an L0 page table, and in my L1 page table I just pick index five, because I felt like it, and I make it point to the L0 page table.
Then in the L0 page table, I create an entry at index 188, and this is where I put in the physical page number directly — it doesn't assume that address is a page table, it just uses the number directly. And then another question: do programs fill pages sequentially or randomly anywhere? Ideally, addresses right next to each other are sequential, so their pages end up next to each other in the L0 page table, which is good, because then we aren't making a bunch of extra page tables we don't need, right? So that would be good. So, in this case I'm going to translate two virtual addresses. I don't know why that looks weird. Okay, my shell died, cool. So if I run it, I get one valid translation and one page fault, which just means it couldn't translate that address. Yep — no, those are little helper functions; you'll have the same ones available to you in lab three. Yeah, I wrote them. It's essentially simulating that free list thing, although here it just asks the kernel for memory. All right, so, can we figure out why this address translated properly and this one did not? Yeah, well, let's see. So, if I were to translate the address A, B, C, D, E, F — that's as far as I know my alphabet; thankfully I work with computers, so I don't need to know any more than that — and we have three levels of page tables, it corresponds to the same breakdown: in my virtual address, I have nine bits for the L2 index, nine bits for L1, nine bits for L0, and then my offset is going to be 12 bits. So, again, I do not have to translate D, E, F, and then for the rest of it, I can just write it out in binary, because that's the thing we have to do now. So, what's A in binary? Do we remember our shortcut from last time? Yeah: one, zero, one, zero — ten is 1010.
All right, so A is one, zero, one, zero. Do we know what B is? Can we add one to it? One, zero, one, one. And adding one to that again for C: one, one, zero, zero. So that is A, B, C in binary, and then I have to group it in groups of nine, counting from the bottom: that's four, that's eight, and this is nine. And, of course, there's a bunch of leading zeros all over the place. So, in this case, what would my L2 index be? Zero. What would my L1 index be? Five. What would my L0 index be? 188, hopefully. Well, it worked, so it must have been that. So that's why that address translated successfully: given a virtual address, I know what entries I need to make valid in order for it to translate correctly. My L2 page table was what I gave to the MMU, and the MMU looked for entry zero in it. Entry zero points to the L1 table, and then in L1, given this address, it looked at index five to find the L0 page table. So it pointed to this page table, and then at 188, mercifully, it had an entry, so that's why it was able to translate it. All right, well, what about a brain buster? What happens if I put an entry there, at the next index? What address would I need to have to use that? Yeah — instead of A, B, C at the beginning, it should just be A, B, D. I'm just increasing the address by one, same as going to the next index, same thing. So if I do that — hey, I get two different virtual addresses that correspond to the same physical address, which is fun, isn't that fun? Yeah, isn't that a problem? That is a good question: is it a problem for you? Yeah, it's sometimes a problem, but sometimes this is useful. You can ask the kernel nicely to do this in two different processes, share memory between them, and communicate that way, which is pretty fast.
There are other reasons to do this, even though you might think it's a bad idea now — if you get into research, or you get a job where you have to be a wizard, this could be useful. So we'll figure out ways to play with this, but any questions about this? Because this is fun, right? Yeah — the five is the index at L1. If I just go backwards: this is my L0 index, this is my L1 index, and this is my L2 index. The offset is just where you are within a page, right? So you don't have to translate it, because it's going to be the same. And I don't need an offset or anything when I'm pointing to the next page table, because a page table always starts at a page boundary and fits exactly on a page. So if I need to index into it, I know where to go — my index is given to me by the virtual address, and if I know the page table entry size, I know exactly where to access memory to find it. Yeah — so that's the question: given a virtual address, how does the program figure out what virtual address it needs in the first place? If you remember back in lecture two, when I made that example, you hard-code the address. I had a starting address there, and if you remember, I told you I just picked it for no reason. That was true, because it's a virtual address, and it's the kernel's job to map it for me — the kernel figures it out for me. So the problem of, given a virtual address, figuring out which page tables you have to set up and get memory for — that's the kernel's problem. A lot of your programs, like your global variables, minus some compiler options — the compiler will just pick a virtual address for you, and then the kernel has to figure it out.
So, this is how it works, and you can do lots of fun tricks with virtual memory that we'll see later too, because, hey, you can just arbitrarily map things to whatever you want. All right, any other questions? Yeah — oh, so here, this part is the offset, so I don't need to translate it, I don't need to do anything with it, right? Then here, up to this point, this is A, B, C in hex, and above that, it's just like writing numbers in decimal or whatever: there are implied leading zeros. The whole virtual address is 39 bits, so everything above what I wrote would be a zero, right? I just skipped writing them — it's all leading zeros by default. Oh, yeah, and I shortened it: technically, if I weren't as lazy as I am, I should write out the full nine bits for each index, so this would be nine zeros. I just don't want to write nine zeros. All right, any other questions? All right, well, lab three is essentially copying what happens on a fork. If a fork happens, well, if you just copy all the memory, you have to copy the page tables, and then for every bit of memory they point to, you have to copy the page and make the new table point to the new page, and then copy all the memory over. So, to implement a fork: copy all the page tables, and for every entry, point it at a new address and make the two pages' contents the same, and then the two are exactly like after a fork but independent — same virtual addresses, same everything, different physical addresses. Lots of fun. So, good? Ish — I see concerned faces, I see ish faces, I see whatever. All right, well, we have time left over — well, not that much time — so start working on the lab and ask questions.
So, this lab is mostly figuring out what the hell is going on — the actual code you have to write isn't that bad; this one's more thinking. So don't worry about whatever the hell happened in lab two; you get to revisit that pain again in lab four. And yeah, we haven't hit the hard part of the course yet — this is, at least in my experience, the second hardest thing. Yeah — no, you can come talk to me or whatever if your thing doesn't work. I can't post solutions, because I have to reuse them, it's a thing, but I can do something with it: I can either answer people's questions or do a non-recorded session where I just go through it, I don't care. All right, cool. Just remember: pulling for you, we're all in this together.