All right, welcome back to operating systems. It's midterm season for the rest of your courses, so yay. Today we're doing another scheduling example, because about twenty minutes of material didn't quite fit into the scheduling lectures. After that, we'll cover something that's good to know but that you don't have to know for our exam, which is still about a month away anyway. So yay, reading week at weird times. Anyways, let's get into it.

We'll explore an example of dynamic priority scheduling, which some literature calls feedback scheduling. It's an algorithm that manages priorities for you: all you set is an initial priority for each process, and then you let the algorithm take over. It uses fixed time slices, measures CPU usage, and adjusts priorities dynamically based on that. We'll see the algorithm in a moment; it's not terribly complicated, as long as we know how to divide by two.

The idea is this: we increase the priority of processes that don't use their time slice, because they haven't been scheduled, so we want to make their number lower; and we decrease the priority of processes that use their full time slice, making their number higher. Hopefully, if you haven't run in a while, your priority keeps rising until you finally do run. We try to be fair, but we still let you set an initial priority so that some processes definitely get scheduled before others.

Like before, we pick the lowest number, which means the highest priority, the same convention as Linux priorities. When a process starts, it gets assigned some priority, call it P(n). Whenever we schedule, we pick the process with the lowest priority number; if it yields, we pick the next lowest. If there's a tie at the same priority, we break it by arrival order. This is also going to be preemptive.
If something with a lower priority number becomes able to execute again, we context switch directly to it. We record how much time each process executes during its priority interval. That interval might be the same length as a time slice, but typically you don't recalculate priorities on every single context switch; you do it every so often, at a fixed interval, because the recalculation is extra work. The time a process uses during its interval is called C(n), and timer interrupts still occur, so we can still preempt.

The calculation: at the end of each priority interval, we update the priority of every process with this formula. The new P(n) equals the old P(n) divided by two, plus C(n), however long the process executed during the interval. Halving the old priority makes the number lower and lower, so a process becomes more and more likely to execute; the C(n) term means that if you just executed, your number gets higher, your priority gets lower, and you're less likely to run again. At the end of each priority interval we also reset every process's counter C(n) back to zero.

So what does that look like? Assume we have four processes, X, Y, A, and B, arriving in that order, all with an initial priority of zero, so initially they're all tied and we go by arrival order. A and B are CPU-bound processes: whenever they're ready, they just want to execute on the CPU, doing some important or not-so-important calculation; they never block. X and Y are I/O-bound processes: they execute for one time unit and then block for five, meaning they're waiting on a file or something and can't execute for five time units after each run.
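The update rule just described can be written as a tiny function. This is a sketch (the function name is mine, not from the lecture), using integer division, which matches the halving in the worked examples:

```python
def new_priority(old: int, used: int) -> int:
    # P(n) = P(n)/2 + C(n): halve the old priority number,
    # then add the CPU time C(n) the process used this interval
    return old // 2 + used

print(new_priority(6, 0))   # 3: didn't run, so the number drops (priority rises)
print(new_priority(6, 6))   # 9: used lots of CPU, so the number grows (priority falls)
```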
That gives us the opportunity to schedule other processes. Timer interrupts occur every one time unit, so we could context switch that often if we wanted, whenever the kernel gets control back. Each time slice is 10 time units, the maximum amount of time any process can be scheduled for, and our priority interval is also 10. Typically these are the same thing; I separate them because the time slice is a maximum, while the priority recalculation happens every 10 time units regardless.

So what's the schedule, given that we could preempt at any time unit? Initially all four processes have the same priority, zero, so we go by arrival order. X comes first: it executes for one time unit and is done, and since it finishes at time 1 and blocks for five, we can't execute it again until time 6. The next highest-priority runnable process is Y, because it arrived after X; it also executes for one time unit and then blocks for five, so we can't execute Y again until time 7. Now we can only schedule A or B. A came first, so we schedule A, and it executes all the way to time 10, when we have to recalculate priorities. X became ready at time 6 and Y at time 7, but neither had a strictly higher priority, so we don't bother switching away from a running process of equal priority.

At time 10, we recalculate the priorities. They all started at zero, so the halved old-priority term is just zero for all of them; we add however many time units each process executed in the last 10. X executed for 1, Y for 1, A for 8, and B for 0. So those are the new priorities: X = 1, Y = 1, A = 8, B = 0.
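The recalculation at time 10 can be checked in a few lines of Python (a sketch; the dict names are mine):

```python
old = {"X": 0, "Y": 0, "A": 0, "B": 0}    # everyone starts at priority 0
used = {"X": 1, "Y": 1, "A": 8, "B": 0}   # C(n): CPU time in the first 10-unit interval

# P(n) = P(n)/2 + C(n) for every process at the end of the interval
new = {p: old[p] // 2 + used[p] for p in old}
print(new)   # {'X': 1, 'Y': 1, 'A': 8, 'B': 0}: B now has the lowest number
```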
So at time 10, we finally have a process whose priority number is lower than the rest, which means a higher priority: process B, at 0. We switch to B for the next 10 time units, and then we'd recalculate the priorities again. Hopefully this isn't terribly complicated. Any questions about that? No? Pretty straightforward, all right.

Now the same setup, except A and B start with a priority of 6, a higher number and therefore a lower priority, so we'd prefer to execute X and Y over A and B. It starts the same way: there's a tie between X and Y at priority 0, X arrived first, so we pick X. It runs for one time unit and blocks for five; then Y runs for one time unit and blocks for five. Now no process at priority 0 is runnable, so our next priority level is 6, where A and B are tied; A came before B, so we schedule A for four time units. At time 6, X is ready again, and it has a lower number, a higher priority, so we switch directly to X: it executes for one time unit and blocks for five. Y does the same at time 7. Then we're back to choosing between A and B, and we choose A again until time 10, when we have to recalculate the priorities.

Recalculating: X and Y were initially at 0, and zero divided by anything is still zero; both ran for two time units, so their new priorities are both 2. Process A's initial priority was 6, and 6 divided by 2 is 3; it ran for six time units, so 3 + 6 = 9 is its new priority. Process B's 6 also halves to 3, and it ran for zero time units, so its new priority is 3.
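The same recalculation for the second example, where A and B start at priority 6 (again a sketch with my own names):

```python
old = {"X": 0, "Y": 0, "A": 6, "B": 6}    # A and B now start at priority 6
used = {"X": 2, "Y": 2, "A": 6, "B": 0}   # C(n): CPU time in the first 10-unit interval

# P(n) = P(n)/2 + C(n); integer division matches the lecture's 6/2 = 3
new = {p: old[p] // 2 + used[p] for p in old}
print(new)   # {'X': 2, 'Y': 2, 'A': 9, 'B': 3}
```

X and Y keep the lowest numbers, so they still win whenever they unblock, and B's 3 now beats A's 9.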
X and Y still have lower priority numbers, higher priorities, so they'll execute whenever they're ready; but they're blocked right now, so we switch to process B. B executes for two time units, until time 12, when X unblocks; X runs for one and blocks again, Y runs for one and blocks again, we switch back to B for four time units, then X unblocks so we can schedule it again, then Y, and then we're done, yay. Any questions about that?

[Student question about time 8.] Right, at time 8 we haven't recalculated the priorities at all yet; we only recalculate at time 10. So A and B are still tied at 6, and A arrived first, so we go back to A. It's the same reason we picked A at time 2: nothing has changed, only A and B are runnable, they're tied, so we go by arrival order. All right, any other questions? Good. Pretty boring, but that's the new content for today.

All right, other fun stuff that you don't have to know for the exam, but that may make you appear as a wizard, especially in machine learning courses. Large language models are huge, 30 billion parameters or something like that, and just loading the contents of the model file takes around 20 gigabytes of memory. Someone wrote a program that runs a large language model, and before a certain one-line change, the thing had to read a roughly 20-gigabyte file and took up at least 20 gigabytes of RAM, so you needed 32 gigabytes of RAM to even run it. Then someone made a one-line change (well, more like five lines) and suddenly it only took 6.8 gigabytes of RAM. How is that possible? There's a whole discussion thread about it, and we'll see the magic behind it. But to get into how that worked, we need to learn about mmap, so we can actually control our process's virtual memory.

There's a memory map system call, mmap, that lets us do fun things like map files into virtual memory addresses. If we do this, we just have a pointer, and we can access the file directly as if it were memory. There's no need for read and write system calls, no need for buffers, no need for anything: you use the file as if it were one big array of bytes, and it's quite awesome. This is how most programs deal with files, because, one, it's way more efficient, and two, it's actually fewer lines of code. You just have to know how virtual memory works for it to make sense at all. So let's dive into the example.

At the beginning we just open a file; to be a bit meta, we'll open this program's own source code. We open it read-only and get a file descriptor, yay. At this point we could have created a buffer, done a read system call on it, and printed it to standard output if we really wanted to, but we can do something more fun. First, there's another system call that isn't really covered in this course but might be useful. There's a struct called the stat struct that tells you lots of information about a file: basically everything you see on a line of ls output, like the permissions, a number we'll figure out the meaning of later, who owns it, what group owns it, how big it is, when it was modified, and all that. You make a system call to populate it and ask the kernel for information about the file: we do an fstat system call with fd, the file descriptor for the file we just opened, and the address of a stat struct so the kernel can fill in the fields for us. We only really care about one field, st_size, which tells you how big the file is.

Once we know how big the file is, we do the mmap system call. It has six arguments, and we'll go through them one by one. The first argument is the address: you're allowed to specify whatever virtual address you want, if for some reason you're picky about which address you get. If you don't care, you set it to NULL and let the kernel decide, which is what you'll do 99% of the time. The next argument is the length: how much virtual memory should be set aside for you. In this case I ask for the exact size of the file, because I want this block of virtual memory to represent the file. Next is the protection: permissions that roughly correspond to the permissions in your page table entries from lab 3. You can say "I just want to read this memory," and then if you try to write to it you'll get an error; if you wanted to write to it, you could enable that flag too, and so on. The next argument is the flags, and one flag in particular is fun: it specifies what happens when a fork happens. Right now it says MAP_PRIVATE, which means the mapping behaves the way fork normally behaves: each process's virtual memory space is its own, independent, and if one process changes it, the change doesn't appear in the other. That's what we know and love. If you wanted to, you could change it to MAP_SHARED, in which case both processes map to the same physical memory, and if you change it in one process, you change it in the other. This is also a way to do IPC if you want: set up some virtual memory, have it shared, and then whenever you fork, that virtual memory maps to the same physical memory in both processes, so if one changes it, the other sees it.

The next argument is the file descriptor: if you want this memory to represent a file, which file? In this case we just give it fd. Then the offset is how many bytes into the file the mapping should start; we want to read the whole file, so our offset is zero. After the call, we check whether it failed, and then we can close the file descriptor, because we don't need it anymore: we'll access the file through that virtual memory. Now I can just write a for loop that iterates over every single byte in the file with a printf, because the array data now corresponds to the contents of the file: data[0] is the first byte of the file, and I could go to the last byte, or read any part of the file at any given time. No read or write system calls, no screwing with buffers, no worrying about anything, and the nice thing is that it's also more efficient. And just as malloc has free, when you're done with an mmap you should release it with the munmap system call, which you also have to tell the length.

All right, cool. If we build and execute that, hey, guess what: we read out the contents of the file. No fuss, no muss, no buffers, no anything. Kind of cool, at least I think it is. Most programs that want to be efficient don't use fopen or anything like that; they use mmap, because it's way better. Any questions about that?

All right, let's quickly discuss how it works. mmap is really lazy: all it does is set up page tables; it doesn't actually read anything from the file. During the mmap call, it sets aside however many pages it needs to fit the mapping and creates invalid page table entries for them, reserving some virtual address space. Because the entries are invalid, the MMU would page fault if it tried to resolve them, and the kernel can go ahead and use the rest of the bits of each entry for whatever it wants. So it uses the rest of the entry for bookkeeping information: what file does this entry represent, and where in the file? The first time you access that memory, you page fault; the kernel sees that this invalid entry should be mapped to a file, reads that part of the file into memory, and replaces the entry for that address. It's all done lazily, so only the parts of the file you actually use get loaded, which is really hard to do without mmap: you'd have to jump around the file yourself and make sure your buffers are the right size, which is a real pain. Instead, the kernel manages it for you, and you only pay for what you use. Just let the kernel do all the hard work.

If you go into that discussion thread, the author says: all I did was mmap it. When large language models do inference, they typically don't use the entire model for a single inference; they only use parts of it. So the change just mmaps the roughly 20-gigabyte model file, and that's it: it reads whichever parts of the model it needs, and the kernel figures out what to load into memory. In this case it only uses a few gigabytes instead of the entire file, and it's great, super efficient. Everyone was like, "Wow, you're a wizard, how did you suddenly cut the memory usage to about a third? You should definitely take an operating systems course."

Later in the discussion, someone asked: how much space does the kernel actually need for the page tables? Isn't it super wasteful to map a whole 20-gigabyte file? Someone answered that you only need about 40 megabytes of page table entries. Here's how they came up with that number: 20 times 1024 times 1024 times 1024, where 1024 cubed is one real gigabyte, two to the power of 30, gives the file size in bytes. They divided by the page size, our favorite number, 4096, to figure out how many virtual pages you need to map 20 gigabytes. Then they multiplied by eight, because each of those virtual page numbers needs an eight-byte page table entry, and divided by 1024 times 1024, a real megabyte, to get the answer in megabytes. (The post said "divided by one kilobyte," which must have been a typo for one megabyte.)

So is that all we need for the page tables? Not quite, because page tables have multiple levels: that number only counts the entries you'd need at L0. If we wanted to be super pedantic, we could also calculate how many L1 tables we need and how many of the L0 page tables would actually be full in the best case. You will impress no one at a party by knowing how to do this; I don't know why this is here, except that it helps us understand multi-level page tables. So: we need to map 20 gigabytes, and each page is two to the 12 bytes, so we need 20 times two to the 18 page table entries, and those, again, are only the L0 entries. We take that number of entries and divide it by the number of entries we can fit in each table, which is two to the nine, 512, because we have nine bits for each index into a page table on our system. If I do the math, it becomes 20 times two to the nine, which means we have 10,240 full L0 page tables in the best case. That's the best case, where all of them are full; the addresses are going to be contiguous anyway, so all the entries sit beside each other. In the worst case, the first table might not be completely full, which would make the last table not full either, so the worst case is 10,241, but that's also super pedantic, so we can just assume the best case: 10,240 full L0 page tables.

Each L1 page table can point to up to 512 L0 page tables, for the same reason: that's how many entries it can hold. So if we take 10,240 and divide by 512, we figure out that we need 20 full L1 page tables in the best case. In total, that's 10,240 L0 page tables plus 20 L1 page tables, so 10,260 full tables. We wouldn't add an L2 page table, because each process only has one of those and it would already exist, with entries in it already. If we take that number, multiply by the size of a page table, 4,096 again, and divide to get megabytes, it actually takes about 40.08 megabytes, not 40, so slightly more. Again, I don't know why I did this.

All right, any questions? Because that was pretty much it. All right, cool. Work on lab 3, or study for your other midterms, I guess, because you'll probably do that anyway. Just remember: pulling for you, we're all in this together.
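The mmap walkthrough earlier was described in C, but Python's mmap module wraps the same system call, so here's a runnable sketch of the same idea. The temp file stands in for the source file the lecture maps, and the names are mine:

```python
import mmap
import os
import tempfile

# Make a small file to map (a stand-in for the lecture's "own source file").
fd, path = tempfile.mkstemp()
os.write(fd, b"hello from mmap\n")
os.close(fd)

fd = os.open(path, os.O_RDONLY)   # like open(path, O_RDONLY) in C -> file descriptor
size = os.fstat(fd).st_size       # fstat() fills a stat struct; st_size is the file length

# C: mmap(NULL, size, PROT_READ, MAP_PRIVATE, fd, 0).
# Python's wrapper takes the same pieces; the kernel picks the virtual address.
data = mmap.mmap(fd, size, flags=mmap.MAP_PRIVATE, prot=mmap.PROT_READ, offset=0)
os.close(fd)                      # safe to close: the mapping outlives the descriptor

word = data[6:10].decode()        # random access, indexing the file like one big byte array
print(data[:size].decode(), end="")   # the whole file, no read() calls, no buffers

data.close()                      # the munmap() step
os.unlink(path)
```

MAP_PRIVATE here matches the fork discussion: changes through a private mapping in one process would not be visible in another.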
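The page-table arithmetic above can be double-checked in a few lines, using the constants from the lecture (4 KiB pages, 8-byte entries, 512 entries per table, best-case alignment):

```python
PAGE = 2**12          # 4096-byte pages
ENTRIES = 2**9        # 512 entries per table (9 index bits per level)
size = 20 * 2**30     # the ~20 GiB model file

l0_entries = size // PAGE            # 20 * 2**18 L0 page table entries needed
pte_bytes = l0_entries * 8           # 8 bytes per entry
l0_tables = l0_entries // ENTRIES    # full L0 tables, best case
l1_tables = l0_tables // ENTRIES     # full L1 tables, best case
total_mib = (l0_tables + l1_tables) * PAGE / 2**20

print(pte_bytes // 2**20)            # 40: MiB of raw entries, the forum's number
print(l0_tables, l1_tables)          # 10240 20
print(round(total_mib, 2))           # 40.08: whole tables take slightly more than 40 MiB
```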