All righty, welcome back to Operating Systems. People obviously have a midterm today, so we'll keep this short and sweet. This is pretty much the cleanup crew for scheduling, plus some fun little things you should probably know. First we'll go back into scheduling and talk about an actual implementation of a priority scheduler and see what that looks like. So let's just get into it. We'll probably be done this lecture quite quick, and then you can go study for your other midterms and all that. So let's explore dynamic priority scheduling, also called feedback scheduling in some literature. What this is, is it uses priorities and has the algorithm manage those priorities dynamically; that's why it's called dynamic priority scheduling. We use set time slices, measure CPU usage during each time slice, and at the end of each time slice we recalculate the priorities of all the processes. The idea behind this is that we increase the priority of processes that don't use their full time slice, so they're more likely to execute in the next time slice, and we decrease the priority of processes that do use their full time slice, so they get shoved further back in the queue. We have to pick whether a lower or higher number represents higher priority, so we'll do the thing Linux does, which also makes this algorithm work a bit better: the lower the number, the higher the priority. Each process, whenever it starts, gets assigned a priority P(n), and whenever we make a scheduling decision, we pick the lowest number, or highest priority, to schedule. If it yields or blocks or does anything, we pick the next lowest number. If there is a tie, we resolve it by arrival order, which you will always be told. And if a lower-numbered, higher-priority process becomes ready, we switch to it immediately, because we're doing preemption.
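To make the selection rule concrete, here's a minimal sketch in C of how a scheduler could pick the next process under these rules: lowest priority number wins, ties break by arrival order, and blocked processes are skipped. The struct fields and function name are my own invention for illustration, not from any real kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical process entry: lower `priority` means higher priority. */
struct proc {
    int priority;   /* current priority number (lower runs first) */
    int arrival;    /* arrival order, used to break ties */
    bool ready;     /* false if blocked (waiting on I/O, etc.) */
};

/* Return the index of the process to run next, or -1 if all are blocked. */
int pick_next(const struct proc *procs, int n) {
    int best = -1;
    for (int i = 0; i < n; i++) {
        if (!procs[i].ready)
            continue;  /* blocked processes keep a priority but can't be picked */
        if (best == -1
            || procs[i].priority < procs[best].priority
            || (procs[i].priority == procs[best].priority
                && procs[i].arrival < procs[best].arrival))
            best = i;
    }
    return best;
}
```

For example, with X, Y, A, B all tied at priority zero, this picks X, the earliest arrival, exactly as in the walkthrough below.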
For this, we record how much time each process executes during the priority interval; call that C(n). Timer interrupts still occur, so the kernel maintains control: it could context switch on any timer interrupt, even though sometimes you don't want it to. At the end of each priority interval, we update the priority of each process with this calculation: the new priority of process n is its old priority divided by two. The idea is that it gets smaller each time, and a smaller number means more likely to get picked and scheduled. Then we add C(n), which is just however long the process executed during the last interval. After we adjust the priorities, we reset our timekeeping variable C(n) back to zero for the next interval. So what does that look like? Let's assume all processes have an initial priority of zero. That means they're all at the same priority level, so they're all tied, and ties go by arrival order. In this case, we have processes X, Y, A, B arriving in that order. A and B are CPU-bound processes, which just means they want to execute on the CPU: if you give them any CPU time, they will take as much as you give them. They just want to execute. They gotta go fast. X and Y are called IO-bound processes, because they can only execute for a little bit of time before they block, waiting for some resource, like a file getting read from the disk. In this case, both X and Y can only execute for one time unit, and then they block for five, so they won't use the CPU while they wait on that other resource. Timer interrupts occur every second, each process can use at most 10 time units, and our priority interval is 10 time units.
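The recalculation rule itself is tiny. Here's a sketch of it in C, using doubles since the formula can produce fractional priorities (as discussed later, you could equally round down to keep whole numbers):

```c
#include <assert.h>

/* New priority = old priority / 2 + time executed in the last interval.
   Lower numbers mean higher priority, so heavy CPU use pushes a process
   toward the back of the line, and old behavior decays by half each interval. */
double update_priority(double old_priority, double executed) {
    return old_priority / 2.0 + executed;
}
```

Plugging in the first example's numbers: every process starts at zero, so each new priority is just its executed time, e.g. process A becomes 0/2 + 8 = 8.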
In this case, after X executes for one time unit, it can't execute again for another five, so you may as well schedule something else while it waits. Here the time slice is 10 and the priority interval is 10: we recalculate the priorities every 10 time units, and the time slice is just an absolute maximum. I separated them out because, as we'll see, we don't recalculate at the end of every time slice; we recalculate every 10 time units. Since all processes are of equal priority, we go by arrival order and schedule X first. It can only execute for one time unit, so it's done at time one, then blocks for five, which means we can't schedule it again until time six. But we have to make another scheduling decision now. Next in line is Y: it executes for one time unit, then blocks for five, so it's available after time seven. Then we switch to process A, and it executes for eight time units, even though X is ready at time six and Y is ready at time seven. We don't switch to them because they are of equal priority: if priorities are equal, we don't switch unless we absolutely have to. So even though they're ready, we wait until time 10, when we have to recalculate the priorities. Their initial priorities were all zero, and zero divided by two is still zero, so each new priority is essentially just how many time units that process executed in the last 10: the priority of X is one, the priority of Y is one, the priority of A is eight, and the priority of B is zero. Now at time 10, after we recalculate, we can context switch in a new process. Process B has the lowest number, or highest priority, with no tie, so at time 10 we schedule it.
It runs for the full 10 time units until we have to recalculate the priorities again, and this goes on and on until eventually the processes finish. Just as a note: if the recalculation had instead left A as the highest priority rather than B, A would run only for the rest of its time slice, two more time units, before having to context switch; but if it were still the highest priority at that point, it would just get picked again, so it would probably execute all the way to time 20. In this case, though, B is the one that's picked, yay. All right, any questions about that? Pretty simple. All right, let's make this as exciting as we can. We'll change processes A and B to have an initial priority of six. They are now lower priority, so we'd prefer to run X and Y first. The same thing happens at the start: X and Y are at priority zero, the highest priority, and they're tied, so we go by arrival order again. X gets picked, is scheduled for one time unit, and blocks. Now we go to the next lowest priority number, six, where there's a tie between A and B; we pick A by arrival order. A executes for four time units until X is ready at time six. Because X has a lower number, or higher priority, we context switch away from A directly to X. It executes for one time unit and blocks. Then Y becomes ready, so we execute Y for one time unit, and it blocks. Now at time eight, we haven't recalculated the priorities yet, and the only two processes we could run are A and B. They're of the same priority, so we do the same thing and go by arrival order, scheduling process A to run until we recalculate the priorities. So now let's recalculate: the initial priorities of X and Y were both zero.
Zero divided by two is still zero, and they both ran for two time units, so X's new priority is two, and Y is the same idea. For process A, its initial priority was six because we said so; six divided by two is three, and it ran for six time units, so its new priority is nine, which is very low. Process B executed for zero time units, and its initial priority was six, so its new priority is three. It doesn't get shoved ahead of X or Y, but it gets a higher priority than A. At time 10 we can only schedule between A and B, because X and Y are both blocked, so we pick B, since it has the lower number, or higher priority. It gets scheduled for two time units until X unblocks again: X blocked at time seven and blocks for five time units, so at time 12 it's ready again. It has a lower number, or higher priority, than process B, so we context switch directly to X. It runs for one time unit and blocks. At time 13, Y is ready; it has a lower number, or higher priority, than B, so it gets executed and then blocks. Now again we can only choose between A and B. B has the lower number, higher priority, so we schedule it. It runs for four time units until X is ready again; then X runs for one time unit and blocks, then Y runs, and then we recalculate the priorities and go on and on. I could ask you to do this for as long as I want to be mean, 50 time units or something like that. Hopefully I'm not that mean, but you can see you just do this over and over again. It gets boring. Yep. So yeah, you can context switch away from any process at any time, but if it's blocked, you may as well context switch to something else; otherwise your CPU is going to be idle, right? No, so you can context switch at each timer interrupt. Yeah, this is preemptive, so you can preempt it at any time. Yep.
Yeah, you just update the priorities at each priority interval, not whenever you context switch, because you want to do things at a set interval. If we recalculated each time we context switched, we'd probably waste a lot of time when things switch really fast, and we want it to be more predictable. Yep. We can basically switch any time we want. Usually the time slice and the priority interval are the same thing, so the time slice is the most a process will run before we recalculate the priorities, at which point it will probably get kicked out. So basically, in this case, does the time slice only serve the purpose of keeping track of the priority interval? No, the time slice only tracks the maximum amount of time we let a process run at once. Actually, yeah, you're right: in this case we don't even need to talk about the time slice, because only the priority interval matters. The time slice maximum doesn't really matter, because when it ends, the scheduler would have made the same decision anyway until it recalculates. Yeah. So in the previous example they were all tied for priority, so you wouldn't context switch at all if they're the same priority; that's only whenever you need to make a scheduling decision. If a process becomes ready and it doesn't have a lower priority number, the scheduler just doesn't bother to context switch. That's just the rule, yeah. Sorry? Yeah, so if we recalculated all the priorities here, the priority of B would be three divided by two, which is 1.5, plus the six time units it ran, so 1.5 plus six is 7.5. All right, any other questions? Cool, oh yeah. So the time slice limit doesn't matter; the only thing that really matters is the priority interval, when it recalculates the priorities. Because the rule is we'll always run a higher-priority process over another one of lower priority.
Yeah, you can context switch at any time unit in this case. That's because we have priorities here and we're recalculating them. Before, we just had round robin: we just had the time slice and we alternated processes first in, first out, trying to be fair. Yeah, sure: you can have fractional priorities, or you could just scale everything by two, or round down, whatever you want. You could use the same formula and just round down whatever you calculated if you wanted whole numbers; it doesn't really matter. Yeah, sorry? No, in this case we only recalculate the priorities every 10 time units. It doesn't matter if processes block in between. Yeah, so your kernel would know that they're blocked, waiting for something: it can't execute them because they're waiting for some file or something like that. So it would just ignore them when it compares the priorities. Remember the process state diagram from last time: there's blocked, then ready, then running. They would be in the blocked state, so they aren't eligible to run because they're waiting for something. They still have priorities; we just can't pick them, because even if we picked them, we couldn't execute them. All right, cool, that's the content, yay. All right, fun content. Since we've done virtual memory, you now know how to be a wizard. So large language models are a thing; people love them. Some are gigantic, 30 billion parameters, and the model file itself is 20 gigs. You'd think you'd have to read the entire file into memory, but someone did a single line of code change. Okay, that's an embellishment.
It was like five lines of code, and they essentially shaved off 75% of the memory usage, and everyone thought they were a wizard, even though they could have just taken this course, and that's all you need. You can see the discussion; it's lots of fun. This is clickable if you want to read it. They're like, oh my, how did you do this? You are such a good programmer, please impart your wisdom. How this was possible is that we're allowed to control our process's virtual memory. There's a system call called mmap, or memory map, and it allows us to map files into our virtual address space. And why is this useful? Well, you just get a pointer returned to you, and it allows you to access the file directly as if it were just a big array of bytes. There's no need to set up your own buffers, and no need for read and write system calls. In fact, most programs written by anyone sane use mmap for files. They don't use fopen; well, they do still use open to get a file descriptor, but they don't use read and write, because those are a giant pain and super inefficient. Well, not inherently inefficient, but making them efficient is kind of difficult. So let's just dive into the example code. In this program, all we're going to do is open a file. We're going to be meta, because I guess that's the parent company, that's funny. We open this file with the read-only flag using the open system call, so we get a file descriptor out of it; we'll assume that it's valid. And then here is a system call that doesn't quite fit anywhere else: fstat. What it does is fill out this struct stat for you, which has all sorts of information about files. If you do ls -l or something like that, well, these are the permissions of the files, this is a number whose meaning we'll figure out later, this is who owns it.
This is the group that owns it, this is how big it is, and this is when it was last modified, or something; I forget what it displays by default, but it tells you all the information about a file. So this fstat system call essentially fills in this structure with a bunch of fields and tells you all the information about whatever actual file this file descriptor represents. The only field we care about here is st_size, which is just how big the file is. In this case it is 538 bytes. Our mmap system call has six arguments, and we don't need to use all of them. The first argument is something that you will probably not ever use: an address. If you had a preference for which virtual address you wanted, you could request that the kernel map it at whatever address you want. Why would you want to do this? 99.9% of the time you will not have to, but hey, maybe you want to be very particular for some reason. I don't know; I don't always tell you what to do. If you don't care, you can just set it to NULL, in which case the kernel will pick whatever virtual address it likes. The next argument is the size: how many bytes to make valid. In this case, I say make as many bytes valid as the length, or the size, of the file. Next is protections, which are like the permissions for the page table entries. In this case, I want to be able to read this memory. There are other flags: I could also pass a write flag or an executable flag or whatever I wanted. In this case, since I don't pass a write flag, if I try to write to this memory, it will fault. The next one is real fun. This flag essentially says what should happen if a fork occurs. The default you should pick is MAP_PRIVATE, and that means those virtual addresses behave the same as any old virtual addresses if you fork: after you fork, both processes are completely independent.
If you change some memory at a virtual address, it won't affect the other process; that's what private means: it's private to each individual process. If you wanted to, you could change this to MAP_SHARED, and then that's no longer true. That means if you use a virtual address given back by mmap and then fork, changing that memory in one process changes it in the parent process, or vice versa. It makes it so that the same virtual memory is mapped to the same physical memory, and those processes are sharing memory: if you change it in one, you'll see it in the other. So this is another way of doing IPC, shared memory. Once you figure out what the heck virtual memory is, shared memory actually isn't that bad, and it is a way to do inter-process communication, because hey, two processes can now communicate with each other through some shared memory. In this case, we set it to MAP_PRIVATE because it doesn't matter. The next argument is a file descriptor, if I want to map a file. I could just request some memory and not give it a file descriptor, but in this case, I want it backed by a real file, so I give it the file descriptor. Then the offset just says how many bytes into that file the mapping should start. If you want to map the whole file, you just start at byte zero, the beginning of the file. After that, I check for errors, and then my program becomes real easy. I don't have to do any reads; well, in this case I'll use printf instead, but I don't have to do any reads or set up any buffers. That file just looks like a big array of characters if I want to treat it as such. I can just have a silly for loop that goes over every single character and prints it, treating the mapping as a big old array. So if I go ahead and run this, I should get the contents of the file, and I do.
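Putting that walkthrough together, a sketch of the whole program might look like this. The function name and error handling are mine (the lecture's actual file and filename aren't shown here), but the system calls — open, fstat, mmap, munmap — are the ones described above. It returns the file's size on success, or -1 on any failure.

```c
#include <fcntl.h>     /* open, O_RDONLY */
#include <stdio.h>     /* putchar */
#include <sys/mman.h>  /* mmap, munmap, PROT_READ, MAP_PRIVATE */
#include <sys/stat.h>  /* fstat, struct stat */
#include <unistd.h>    /* close */

/* Map `path` read-only, print every byte, and return its size (-1 on error). */
long print_file_via_mmap(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return -1;

    struct stat sb;                 /* fstat fills this with file metadata */
    if (fstat(fd, &sb) == -1) {
        close(fd);
        return -1;
    }
    if (sb.st_size == 0) {          /* nothing to map: mmap rejects length 0 */
        close(fd);
        return 0;
    }

    /* NULL: let the kernel pick the address; PROT_READ: read-only pages;
       MAP_PRIVATE: changes wouldn't be shared across fork; offset 0: map
       starting at the beginning of the file. */
    char *data = mmap(NULL, sb.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) {
        close(fd);
        return -1;
    }

    for (long i = 0; i < sb.st_size; i++)
        putchar(data[i]);           /* the file is just a big array of bytes */

    munmap(data, sb.st_size);       /* the "free" that pairs with mmap */
    close(fd);
    return sb.st_size;
}
```

Note that the loop never issues a read system call; the kernel fills pages in behind the scenes, which is exactly the laziness discussed next.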
So I get the contents of the file because it's just mapped into memory; real nice, real easy. And at the end of this, like malloc has free, mmap has a matching call, and it's called munmap. So everyone liked that, kinda cool? Yep, yep. So the file is loaded into physical memory, but you don't really have to bother with it, and that's the same thing as before: if you do a read system call into a buffer, that buffer is in memory, in physical memory, and the read system call reads information into that buffer, right? Ah, that's a great question: how does this save memory? So mmap is lazy. All it does is set up the page table: it sets aside enough page table entries to represent however much space you need, and initially it doesn't even read anything from the file. It creates invalid page table entries during the mmap call, so you would get a page fault if you tried to access those virtual addresses before anything happens. Because of that, the kernel can handle the page fault: it uses the rest of the information in the page table entry as bookkeeping that says, hey, this page actually represents this part of this file. So if it sees a page fault on it, well, instead of passing that page fault back up to the program and making it segfault, it can say: okay, I know what this address is supposed to represent. Then it can read that information from disk into a page, do the mapping, and then there's a valid mapping. So you only pay for what you use in this case. Yep. So if a file is smaller than a page, it'll still occupy a whole page, because the kernel only deals in pages, so you'll have some of that space that you just can't use. So in this case, whenever you access a new page of the file, that is when the kernel reads that data from the file.
So if I only access the first byte and the last byte, well, I'd get a page fault when I touch the beginning of the file, and the kernel would read in a page: not just a byte, a whole page of that file into memory. Then when I access the last byte, it knows how many pages into the file that is, and it loads only the last page into memory, and that's it. So I'd only do two page loads: I could map a 20 gig file and only use two actual pages. 20 gigs, only eight kilobytes. How does math work? Yeah. The question is, can you mmap a pipe? Probably not. You could just give it a size if you want, but I'm not sure you can mmap a pipe; you'd typically do that for IPC, in which case you'd just use shared memory anyway. You can try it and see what error you get out of it; you'd probably just get an error. So yeah, this is nice: it ensures that you only pay for what you use, whatever you actually read from the file, and you don't have to manage anything. You just kick the can down the road and let the kernel deal with it. You don't have to do anything. If you had to implement this using read and write system calls and buffers, you would be there all day, you would probably make a mistake, and then you would probably destroy the file, or you would read something invalid and have to debug it for hours. Just let the kernel handle it. So if we go back to that question, that's what they did. Their entire fix was to mmap the file instead of reading the whole thing into memory. That way they don't have to do anything special: you essentially just pay for what you use, and it turns out that when you do inference on this model, they didn't use the whole 20 gig file. They only used 6.8 gigs or something like that. So way better.
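You can check that "two page loads" arithmetic directly. These little helpers are my own, just for illustration, assuming the 4096-byte pages used throughout the lecture:

```c
#include <assert.h>

#define PAGE_SIZE 4096LL

/* Which page of the mapping does a byte offset fall in? */
long long page_of(long long offset) { return offset / PAGE_SIZE; }

/* Physical memory actually paid for after touching `pages` distinct pages. */
long long bytes_paid(long long pages) { return pages * PAGE_SIZE; }
```

For a 20 GiB mapping, the first byte lands in page 0 and the last byte lands in page 20·2¹⁸ − 1; touching just those two bytes faults in two pages, costing 2 × 4096 = 8192 bytes of physical memory instead of 20 gigs.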
Most models are really sparse, so you don't use all the information for each inference; you only use parts of the model, and it turns out that works really well with mmap. So you can now be a wizard too. This doesn't really fit anywhere in the course, but if you use files, just mmap them. It's way better, and you can actually explain why. If you go into that discussion, someone who took this course was asking, well, isn't this wasteful too? How much space do you need for page tables in that case? Someone posted that you need 40 megabytes of page tables. Where did they get this number? Well, we are mmapping 20 gigabytes, which is 20 × 1024 × 1024 × 1024 bytes; 1024 × 1024 × 1024 is 2^30, which is just how many bytes are in a gigabyte. Then they figured that, hey, our page size is 4096 bytes, so to calculate how much space is used for the page tables, they divided by the page size to get the number of pages, then multiplied by eight, because each of these pages needs an eight-byte page table entry. Then they divided by 1024 × 1024, which is just a megabyte, so they got a nice number. So someone said this is the same as 20 gigabytes divided by the four-kilobyte page size, times eight bytes per page table entry, divided by one megabyte (on the discussion there was a typo saying one kilobyte). So is this correct? Is this the full story, needing 40 megabytes of page tables to map 20 gigabytes? Well, this only counts entries, right? These would be the entries in the L0 page tables, so it doesn't say anything about the L1 page tables. We should be able to figure out how many L1 page tables we need in the best case. So this isn't quite correct: it's the correct number if we're only concerned about L0 entries. We would need 40 megabytes of L0 page tables in the best case.
This gets into the territory of: why are you being this pedantic? It's silly, but it's just to illustrate that, hey, we have multi-level page tables, so let's be accurate. In this case, if we map 20 gigabytes of memory, well, we can figure out how many page table entries we need. We'll keep things in numbers of page table entries so we can argue about how many page tables we need. So 20 × 2^30 bytes, that is 20 gigabytes, divided by our 2^12-byte page size means we need 20 × 2^18 page table entries. These are the entries that do the virtual-to-physical translation, spread across your L0 page tables. So, how many entries can we fit in an L0 page table? We can fit 2^9 entries, because there are only nine index bits. In that case, we can take the number of page table entries divided by however many entries fit in an L0 page table (or in any page table) to figure out that we need 20 × 2^9 = 10,240 full L0 page tables. These full L0 page tables equal 40 megabytes, because of course, if they're full, each of them is four kilobytes, and we get back to the 40-megabyte number. So the numbers all jive together. Given 10,240 L0 page tables, we can calculate, in the best case, how many L1 page tables we need. Each L1 page table can fit at most 512 entries, so we take 10,240 divided by 512, rounding up if it didn't divide evenly, and we find that we need 20 full L1 page tables. We assume that our process already has an L2 page table, because every process needs its own unique L2 page table, so we don't add that to the calculation.
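This arithmetic is easy to mess up, so here it is spelled out as code — a sanity check of the lecture's numbers, not anything the kernel actually runs — assuming 4 KiB pages, 8-byte entries, and 2^9 = 512 entries per table at every level:

```c
#include <assert.h>

/* Page-table bookkeeping for mapping `bytes` of virtual memory,
   assuming 4096-byte pages and 512 (2^9) entries per table. */
#define PAGE_SIZE         4096LL
#define ENTRIES_PER_TABLE 512LL

/* One L0 entry per mapped page. */
long long l0_entries(long long bytes) { return bytes / PAGE_SIZE; }

/* Tables needed to hold those entries, rounding up at each level. */
long long l0_tables(long long bytes) {
    return (l0_entries(bytes) + ENTRIES_PER_TABLE - 1) / ENTRIES_PER_TABLE;
}
long long l1_tables(long long bytes) {
    return (l0_tables(bytes) + ENTRIES_PER_TABLE - 1) / ENTRIES_PER_TABLE;
}

/* Total page-table memory; the single L2 table already exists per-process,
   so it is deliberately not counted. */
long long table_bytes(long long bytes) {
    return (l0_tables(bytes) + l1_tables(bytes)) * PAGE_SIZE;
}
```

For the 20 GiB mapping this gives 20 × 2^18 = 5,242,880 entries, 10,240 L0 tables, and 20 L1 tables: 10,260 tables at 4 KiB each, or 42,024,960 bytes, which is the slightly-more-than-40-megabyte figure worked out next.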
So in the best case, we actually need 10,260 full page tables, which, if you multiply by the size of a page and divide by a megabyte, gives — get your glasses on — 40.08 megabytes of page tables, not 40. Why am I being this exact? I have no idea, but this is the actual best case, because you have to consider the L1 page tables if you consider everything. But hey, it turns out the number is close enough. So why the hell did I do this? I don't know, I like doing weird stuff. All right, any other questions? For L2: you only have one L2 page table, and it's already created for you. In this case it kind of makes sense too: if I have 20 L1 page tables, it means I need to use 20 entries in my L2 page table, which already exists, so I just point each entry at an L1 table. This also makes a bit more sense from the other direction: the L2 page table has 512 entries, and each one can map up to a gigabyte of memory. So in this case, I use 20 entries, 20 gigabytes; makes sense. All right, any other questions? Or about lab three or anything? All right, otherwise we can end early. Good luck with your midterms. Yeah, yay for reading week being super late and us trying to do a midterm right after reading week. So, oh well, all right. Well, just remember, point for you.