All righty, welcome back to operating systems. So after all the craziness of having multiple processes and children we can kill and orphans and all that fun stuff, we get a bit of a break today. Today should be fairly straightforward, give your brains a rest. A lot of you are working on lab two, so that's good. So today's lecture will hopefully be fairly straightforward, unsurprising, and quick. There are two types of resources on your machine, broadly divided into preemptible and non-preemptible resources. A preemptible resource is something that can be taken away and used for something else or otherwise shared. For example, your CPU is a preemptible resource: I can stop executing any process at any given time and give the CPU to another process, and then they're sharing the same core. That's talking about a single core. Typically when you're sharing something that's contended at all, I'm sure you've done this in your life, you schedule it: you each get a certain time period with whatever it is, and that's how you share it. Non-preemptible resources cannot be taken away without acknowledgement. An example of that is something like disk space. If a file takes up a gigabyte, you can't be like, I need to borrow 100 megabytes and I'll give it back later. That's not how files work, because that would overwrite the data; it would get messed up and you wouldn't be able to recover it. Another example would be physical memory. As soon as you use it, you allocate it, and then whenever you're done with it, you tell the operating system, hey, I'm done with it, and deallocate it. There's an exception to this: in super high performance computing, where you have to actually reason about how your CPU works, you may be able to schedule a CPU like a non-preemptible resource.
So you can say, hey, I need the CPU exclusively, and then you have it, and you deallocate it when you're done with it. But that's only in super sensitive, high performance computing. We're just dealing with general computing, in which case we can share CPU cores easily. So there are two components: there is a dispatcher and a scheduler, and they work together. The dispatcher is the low level part, and that is what is actually responsible for doing the context switching. Saving the registers of one process and then restoring another process to run, doing the context switch, that is the job of the dispatcher. The scheduler is the high level policy. It decides what process to run and sometimes when. Scheduling is what we're talking about today. By default, the scheduler runs whenever a process changes state, for instance when a process terminates or something like that. First, we will consider non-preemptible scheduling. That means processes just run until they're done, going back to the uniprogramming approach. Once a process starts, it runs until it's finished. In this case, if you are the scheduler, you only get to make a decision when a process terminates or if there is no process currently running. Preemptive scheduling allows the operating system to run the scheduler at will. If you run uname -v, you'll get a whole bunch of information about your kernel. One of the things it may tell you is PREEMPT, and that means it uses a preemptive scheduler. So that's what one of those words means. So when we talk about scheduling, there are four main metrics. The first one is you want to minimize waiting time and response time. Those are fairly self-explanatory. Waiting time is just how long a process is waiting for the CPU, so how much time that individual process is idle, not actually executing.
And then response time is how long that process takes from when it wants to start to when it first starts executing; that is called the response time. Ideally, if it's something you interact with, you want that response time to be low. That process will not terminate, but you want it to be responsive. So you move your mouse and you instantly see it draw the updated mouse cursor on the screen, and then your computer feels responsive instead of taking an entire second to react or something like that. The second metric is that you want to maximize CPU utilization. If you have multiple cores, it should utilize all the cores on your machine as much as it can. If you have a single CPU, it just means you don't want that CPU to be idle: you have to make a decision and make it quick. The next is you want to maximize throughput. That's just completing as many processes as possible, which you might care about if you're running machine learning jobs or something that just takes a while. You don't care, you just want to get through them as fast as possible. And then the last thing, which is at odds with all the rest of them, is fairness. You try to give each process approximately the same percentage of the CPU. That way it's like sharing, I don't know, something with your siblings: you want to be fair about it, and each one is treated fairly. Someone doesn't get 99% while the other gets one. So the most basic scheduling algorithm is one you have encountered every single day of your life if you have ever gone to get food or anything else. It is called first come, first serve. It's basically a line. The rule behind this is the first process that arrives gets to execute on the CPU, and the processes are stored in FIFO order: first in, first out. It's just like a normal queue, or a line, which we've dealt with pretty much every single day of our lives. And it's based off the arrival order.
So this is how we will illustrate our scheduling, using a fun Gantt chart. In this case, we want to schedule four processes, cleverly named P1, P2, P3, and P4. Then the other columns are arrival time, so that is at what time unit the user requests this process to actually start executing. In this case, we're going to assume that they all want to start at time unit zero. Then burst time, you might think, hey, that's a weird term. Burst time just means how long each process wants to execute on the CPU for until it is completed. So process one needs to execute for seven time units, process two for four, process three for one, and process four for four. And that's all we need to know about the processes. So they all arrive at the same time instant; I'll just break the tie and tell you the order they arrived in, down to the nanosecond: process one arrives first, then two, then three, then four. So if this was first come, first serve, well, at time zero you have to pick which process to run. If we're just going off line order, process one runs first. It runs for seven time units because we're doing non-preemptible first, so it just runs until it's completed. So it will take seven time units to run. At time seven, the scheduler has to make a new decision: what process do I want to run? And it is just the next one in the line, P2, which runs for four time units. P3 runs for one time unit, and then P4 runs for the last four time units. And that's it. So any questions about that? Hopefully nice and boring, as a change. So if we want to compute the average waiting time, well, for each process we have to figure out how long it was waiting. Process one is waiting for zero time units, because it wants to run at time zero and starts running immediately, so its waiting time is zero. Process two, well, it wanted to run at time zero, and it started running at time seven.
So it was waiting around for seven time units. Process three was waiting around for 11, and process four for 12. So if you want to calculate the average, again, not terribly complicated, it is zero plus seven plus 11 plus 12 divided by four, and that is 7.5. So the average waiting time in this case would be 7.5 time units. We can do the exact same thing, except we'll assume a slightly different arrival order: P3, P2, P4, then P1. If we do the same thing again, our schedule looks like what we would probably expect. P3 was first, so it runs for its one time unit, then it terminates, it's done. Then, based off the line, P2 runs for four, then P4 runs for four, and then P1 runs for seven. So if we were to calculate the average waiting time now, P3 waits around for zero, P2 waits around for one, P4 waits around for five, and P1 waits around for nine. So nine plus five plus one plus zero divided by four, that's 3.75. So just based off the order of these processes, our average waiting time was actually halved. You might think that, hey, with first come, first serve, we're kind of at the mercy of whatever order the processes arrive in, and we can probably do a bit better. The astute among you would have noticed that, well, if I want to minimize average waiting time, I should probably just do the shortest job first. So instead of first come, first serve, we just always schedule the job with the shortest burst time first. And again, this is assuming no preemption. We'll make it a bit more interesting by changing the arrival times. Now only P1 arrives at time zero, then P2 arrives at time two, P3 arrives at time four, and P4 arrives at time five. Otherwise, everything else is the same. I always just write the arrival times on the top here, so I can see when P1 arrives, when P2 arrives, when P3 arrives, and when P4 arrives.
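As an aside, the first come, first serve averages we just computed are easy to check with a few lines of code. This is just a sketch in Python for checking the numbers by hand, not anything the OS actually runs, and the function name is made up for illustration:

```python
def fcfs_average_wait(bursts):
    """Average waiting time under first come, first serve.

    `bursts` is a list of burst times in queue (arrival) order;
    every process is assumed to arrive at time zero."""
    waits, clock = [], 0
    for burst in bursts:
        waits.append(clock)  # waits until everything ahead of it finishes
        clock += burst       # then runs non-preemptively to completion
    return sum(waits) / len(waits)

# Arrival order P1, P2, P3, P4 with bursts 7, 4, 1, 4:
print(fcfs_average_wait([7, 4, 1, 4]))  # 7.5
# Reordered P3, P2, P4, P1, the average waiting time is halved:
print(fcfs_average_wait([1, 4, 4, 7]))  # 3.75
```

The running `clock` is exactly the "everyone in front of you in line" total, which is why arrival order matters so much here.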
So if I was to do this schedule, well, at time zero, when I want to make a scheduling decision, P1 is the only process that wants to run, so the scheduler has no choice but to schedule process one. It gets scheduled, and again, we're in non-preemptible land, so it will run until it is done, taking seven time units. While it was executing, processes two, three, and four wanted to execute, but the CPU was busy. Then, when P1 finished at time seven, the scheduler gets to decide the next process to run. If we're doing shortest job first, well, between P2, P3, and P4, P3's burst time of one is the shortest, so it would schedule P3 to run. P3 runs for one time unit, and then we have to make another decision: do we run P2 or P4? Both their burst times are four, so it's a tie. To break the tie, we'll just pick whoever arrived first; we just go back to first come, first serve for any ties. So we would schedule P2 to run, it runs for four time units and finishes at 12, and then P4 runs. So if we calculate the average waiting time now, hey, it's four, yay. So this is all fairly boring so far, straightforward, doesn't look like there are any questions. A nice change from forking and all that, yep. Yeah, so you might notice this is completely unrealistic on a real machine, because the scheduler would not know how long each process takes. This is generally used after the fact to judge other algorithms. It provably minimizes the average waiting time, but it's obviously not very realistic, yep. So yeah, the question is, well, could I just guess the burst time by how big the binaries are? You could guess, but as you might have figured out by now, your computer is very, very unpredictable. You could throw AI at it and try to make better predictions, but yeah, it can be wrong.
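To make the shortest job first rule concrete, here is a rough Python sketch of the non-preemptive version, with ties on burst time broken by arrival time as we just described. Again, this is only for checking a Gantt chart by hand (it assumes the burst times are magically known), and the names are made up:

```python
def sjf_average_wait(procs):
    """Non-preemptive shortest job first.

    `procs` maps name -> (arrival, burst). Ties on burst time are broken
    by earliest arrival (back to first come, first serve)."""
    remaining = dict(procs)
    clock, total_wait = 0, 0
    while remaining:
        ready = {n: ab for n, ab in remaining.items() if ab[0] <= clock}
        if not ready:                      # CPU idle until the next arrival
            clock = min(a for a, _ in remaining.values())
            continue
        # shortest burst first; break ties by earliest arrival
        name = min(ready, key=lambda n: (ready[n][1], ready[n][0]))
        arrival, burst = remaining.pop(name)
        total_wait += clock - arrival      # waited from arrival until now
        clock += burst                     # runs to completion, no preemption
    return total_wait / len(procs)

# The lecture's processes: name -> (arrival time, burst time)
procs = {"P1": (0, 7), "P2": (2, 4), "P3": (4, 1), "P4": (5, 4)}
print(sjf_average_wait(procs))  # 4.0
```

The schedule it implies is P1 from 0 to 7, P3 from 7 to 8, P2 from 8 to 12, P4 from 12 to 16, matching the chart.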
And generally with scheduling, there's no correct answer, it's just a bunch of trade-offs, as we'll see. So yeah, here's the point: this is not practical whatsoever. Provably it will minimize average waiting time, but we won't know the burst times ahead of time. You could use the past to predict the future, although it might not be a good predictor. You might launch something at 6 a.m., go ugh, and close it immediately, while other times, at midnight, you hack on it for hours and hours, and you never know. The other problem with this is something called starvation. Starvation means a process might not actually ever get to execute. In the case of shortest job first, if you had a process that took a long time and then a constant stream of very short jobs coming in, well, that long job would never, ever run. And I don't know why, but we just use the term starvation for a process that might never run. Yep, yeah. And the other thing is, if you have a really long job running, nothing else can run either, right? Yeah, some jobs can be restarted, but the next thing we could do is something called shortest remaining time first. We can add preemption to interrupt a long running job so it doesn't just hog the CPU or something like that. So we do a little tweak just to introduce preemptions: the same idea as shortest job first, but we can interrupt processes and schedule a new one, so the algorithm becomes shortest remaining time first. Again, for all of this, we're assuming our granularity is one time unit, so we can only interrupt on time unit boundaries, just so we don't have to deal with fractions. This also optimizes the average waiting time, so let's see what that does. With the same processes, our schedule would now look like this.
So at time zero, we still have process one, which is the only process, so it gets scheduled. It runs for two time units, and then process two arrives. When process two arrives, the scheduler can preempt, stop, pause, whatever you want to say, process one. At time two, process one has executed for two time units; it wants to execute for a total of seven, so it has five remaining. So process one at this time has five remaining time units, and process two, which just arrived, has four remaining time units. If we're doing shortest remaining time first, process two is shorter than process one, so we would schedule process two. We schedule process two, and it executes for two time units until process three comes in. At this time instant, process two has two time units remaining, because out of a total of four it has run for two already. So process two has two time units remaining, and the new process three has one time unit remaining, so it is now the shortest. We would interrupt process two and execute process three. It runs for its one time unit and then it is finished. Just as it finishes, process four comes in, which has four time units remaining. So we have three processes: process four has four time units remaining, process two has two remaining, and process one has five remaining. So we'd schedule process two for two time units, and then at time seven it's between process four and process one. We schedule process four, which has four remaining; it executes for four time units, and then we finally finish off with process one. Yep. So the question was whether switching between processes is expensive. Switching between processes, remember, we called that context switching before. We will keep track of the number of context switches that happen, because if a context switch takes the same amount of time as each of these time units, then that's not good, right?
We're wasting a lot of time, but if it's a fraction of the time, that's okay. Typically context switching accounts for less than 1%. But we'll see later that, hey, if you context switch too much, that can be really bad. If I context switch every nanosecond, that's probably not good. Yeah. Yeah, so this is also, again, assuming I know how long each process is going to take. Yep. Yeah, this is mostly just to compare ourselves with other things. This is the optimal thing to do if you just care about waiting time, but it has a lot of drawbacks, including that you can't actually implement it. In practice, you could use it to evaluate your scheduling algorithm if you wanted to. So this is more theoretical, not super practical. But yeah, we can take the same schedule and do the waiting time calculation. Our average waiting time actually goes down with preemptions, which makes sense. The waiting time gets a bit hairier to calculate here because we have preemptions, but there's a fun little shortcut we can do. For instance, to get the waiting time for process one: it's the amount of time it's waiting from when it wanted to start, at time zero, to the very end. I could just count every block where it's not executing, like one, two, three, four, five, six, seven, eight, nine. That takes a long time, and I don't want to count. What you can do instead is calculate the total time that process one was just around in the system. Process one ended at time 16 here, and it started at time zero, so the total amount of time it was around was 16 time units. And if I want to calculate the waiting time, well, if it was around for a total of 16 and I know its burst time was seven, I can just subtract: 16 minus seven means it must have been waiting around for nine time units. So its waiting time would be nine, and you can do the same for the rest of them to get the average waiting time of three.
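Shortest remaining time first can be sketched the same way, making one decision per time unit since we only preempt on time unit boundaries. The finish-minus-arrival-minus-burst shortcut from above is exactly how the waiting times are computed at the end. This is a hypothetical helper for checking charts, not a real scheduler:

```python
def srtf_average_wait(procs):
    """Preemptive shortest remaining time first, one decision per time unit.

    `procs` maps name -> (arrival, burst). Ties on remaining time are
    broken by earliest arrival."""
    remaining = {n: b for n, (a, b) in procs.items()}
    finish, clock = {}, 0
    while remaining:
        ready = [n for n in remaining if procs[n][0] <= clock]
        if not ready:                      # CPU idle until something arrives
            clock += 1
            continue
        # pick the shortest remaining time; ties by earliest arrival
        name = min(ready, key=lambda n: (remaining[n], procs[n][0]))
        remaining[name] -= 1
        clock += 1
        if remaining[name] == 0:
            del remaining[name]
            finish[name] = clock
    # shortcut: waiting time = finish - arrival - burst
    waits = [finish[n] - a - b for n, (a, b) in procs.items()]
    return sum(waits) / len(procs)

procs = {"P1": (0, 7), "P2": (2, 4), "P3": (4, 1), "P4": (5, 4)}
print(srtf_average_wait(procs))  # 3.0
```

Tracing it reproduces the lecture's schedule: P1 from 0 to 2, P2 from 2 to 4, P3 from 4 to 5, P2 from 5 to 7, P4 from 7 to 11, P1 from 11 to 16.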
All right, so here's the first algorithm that can actually be used: round robin. So far we haven't talked about fairness at all, and in fact it's a trade-off with all the other metrics. If you have the thing with the lowest average waiting time, which is shortest remaining time first, then you're going to have the issue of starvation, the issue that you can't implement it, and also the issue that it is not fair. So the fair thing to do, if you do have siblings and there was some toy that you and your siblings wanted to play with, is probably something like round robin. The operating system, like your parents, would say, hey, you each get 30 minutes with it, and after your 30, it's the next sibling's 30, and you just go around in a circle like that. The operating system does the same thing: it divides execution into time slices. Sometimes the literature wants to be fancy and calls them quanta, and an individual time slice a quantum. I'm not that fancy, but that terminology gets used, I don't know why, so we can say time slices if we want. What it does is it's just a queue again, but a queue that goes around and around. You maintain a FIFO queue similar to first come, first serve, except each process only executes for a certain time slice. At the end of its time slice, or end of its quantum, it gets re-added to the end of the queue. It just goes to the back of the line, and then we pick the next process at the front of the queue, on and on again. So what are some practical considerations for determining that time slice or quantum length? Yeah, if you make it way too small, you might spend a lot of your time context switching. If your time slice is smaller than it takes to actually do a context switch, you're going to waste a lot of time.
So what would be the opposite problem, where my quantum length is gigantic? First, note that a process can end early: if it doesn't use its whole time slice, it just terminates, and then the operating system can immediately schedule something else to run. But yeah, if the time slice is too big, it just becomes first come, first serve, right? Which kind of sucks. If your quantum length is eight years or something like that, well, it's the same thing as first come, first serve, which is not that great, as we saw at the beginning. So you have to be fairly careful about picking your quantum length or time slice. So let's do the same thing with a quantum length of three time units and see what fun this brings. Thankfully for us, this is about as complicated as it gets. Round robin is a real scheduling algorithm that actually works, and these types of questions are easy things to put on exams, and they're not that hard, so they should be more or less free. They do get a little tricky, so we'll see essentially as hard as they'll get. Here I have the same processes that want to execute for the same amounts of time, and I wrote the arrival times at the top again. So I'm doing round robin, and we're saying our quantum length is three time units. If our quantum length is three time units, well, at time zero we only have process one, so we can only pick process one, and it gets up to three time units. It can end early if it wants to, but in this case it wants to run for seven, and that's more than three, so it uses its full slice. P1 executes for three time units, and when I do round robin, I find it much easier if you write the queue at the bottom. At time two, when process two came in, it would have been added to the queue, and nothing else would have been in the queue, so process two is at the front. So at time three, when process one's time slice is done, my queue would look like this.
So P2 would be at the front, and P1 gets thrown on at the end. So when the scheduler needs to make a decision what to run at time three, it should be fairly easy: it's whatever is at the front of the queue, which is process two. It gets scheduled for three time units, because it wants to run for a total of four. At time four, when process three came in, our queue would be process one at the front, with process three added to the end. And at time five, when P4 comes in, our queue would look like P1, P3, and then P4. Right? So at time six, right here, when we need to make a new decision what to execute, what are we going to execute? Process one, right? Straight off the front of the queue. Unlike processes, this is hopefully not something that will wreck your brain. So process one would execute for another three time units, because it can execute for up to seven, so it's still good. And, whoops, I screwed up: I forgot to re-queue process two. Process two would have been added to the end of the queue before we took process one off, so it would have been process one, process three, process four, and then process two. So after we take process one off and re-add it, our queue would look like P3, P4, P2, and then P1 at the back. So the next process we run is P3. P3 only runs for one time unit, because that's all it needs, so at that point P3 is done. The next processes in the queue are P4, P2, P1. So I would schedule P4, not for its four time units but for three, because that's my quantum length. And at this point my queue would be P2, P1, P4, and they all have one time unit remaining. So it would just be P2, then switch to P1, then switch to P4.
So any questions about that? Yep, yeah, so the question is what would happen if P1 still arrived at zero, but P2 arrived at, like, time four. If that was the case, P1 would run for its three time units, then we'd have to make a scheduling decision at time three, and it would just be P1 again, we have no other choice, and it would go for another three time units. It wouldn't get interrupted in the middle of that slice, even though it's hogging the CPU a bit. Yeah, it still gets its time slice. The only way you get a shorter time slice is if you terminate. Yep, and if two processes arrive at the exact same time, you would be told how to break the tie, or you just pick one, because it's a tie and it doesn't really matter. Yeah, lower process ID if nothing else. So the other thing you'd have to do is some fun calculations, like the average waiting time, which is something you would be asked to calculate. You have to go through each process and calculate how long it was waiting. You can do the boring thing of, oh, for process one, how long was it waiting: one, two, three, four, five, six, seven, eight, counting every block where it was waiting until it finished. Or you could do something much faster. Process four ended here at 16, and process one ended at time 15. So the total amount of time process one was around: it was there from time zero to 15, so 15 time units, and we know it was executing for seven of those, so it must have been waiting for eight of them. Same thing for process two: it ended here at 14, and it arrived at time two, so it was hanging around for 12 time units, right? 14 minus two. It executed for four, so it must have been waiting around for eight, and that saves me a bunch of time. I don't have to count a bunch of stupid blocks.
If I do that for the other ones, process three was waiting around for, I can count this one: one, two, three, four, five. So process three was waiting around for five time units. And then process four, well, it ended at time 16 and it arrived at time five, so it was around for 11 time units. It ran for four, so 11 minus four, please say I can do math, should be seven. So it was waiting around for seven. So our average waiting time, over four processes, is eight plus eight plus five plus seven over four, which is seven. So the average waiting time in this case was seven time units. Now if we do the average response time, which is slightly different, remember response time is the time from when a process arrived to when it first started executing. For process one, its response time was zero time units, because it started executing immediately. And why do we care about response time? Again, you might care about response time if this is an application you interact with: you want to see some response out of it as quickly as possible, and the lower the response time, the better. For process two, its response time was one time unit: it arrived at time two and started executing at time three, so it was waiting around for one time unit. Process three arrived at time four and didn't start executing until, I believe this is nine, so its response time was five time units. And for process four, it's the same thing: it arrived at time five and didn't start executing until time 10, so its response time was five. If we add those all up and divide by four, that's our average response time, which is 2.75. And then the last thing you would be asked to do is tell me how many context switches there are. For that, you don't count the very first dispatch at time zero as a context switch, and you don't count the final termination as one either.
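This whole round robin exercise can also be checked with a small simulator. The sketch below is just Python for verifying the chart, with made-up names; it admits new arrivals to the queue before re-queuing the preempted process (the tie-break we'll use later for the quantum-one case), counts a switch whenever one process follows a different one, and reports average waiting time, average response time, and context switches. The quantum lengths one and ten appear later in the lecture:

```python
from collections import deque

def round_robin(procs, quantum):
    """Round robin over a FIFO ready queue.

    `procs` maps name -> (arrival, burst). A process re-queued at the end
    of its slice goes behind anything arriving at that same instant.
    Returns (avg_wait, avg_response, context_switches)."""
    remaining = {n: b for n, (a, b) in procs.items()}
    arrivals = sorted(procs, key=lambda n: procs[n][0])
    queue, clock = deque(), 0
    first_run, finish = {}, {}
    switches, last = 0, None

    def admit(upto):
        while arrivals and procs[arrivals[0]][0] <= upto:
            queue.append(arrivals.pop(0))

    admit(clock)
    while queue or arrivals:
        if not queue:                      # CPU idle until the next arrival
            clock = procs[arrivals[0]][0]
            admit(clock)
        name = queue.popleft()
        if last is not None and last != name:
            switches += 1                  # termination-to-next also counts
        last = name
        first_run.setdefault(name, clock)  # first time on the CPU
        run = min(quantum, remaining[name])
        clock += run
        remaining[name] -= run
        admit(clock)                       # new arrivals enter before re-queue
        if remaining[name] > 0:
            queue.append(name)             # back of the line
        else:
            finish[name] = clock
    waits = [finish[n] - a - b for n, (a, b) in procs.items()]
    responses = [first_run[n] - a for n, (a, _) in procs.items()]
    return (sum(waits) / len(procs), sum(responses) / len(procs), switches)

procs = {"P1": (0, 7), "P2": (2, 4), "P3": (4, 1), "P4": (5, 4)}
print(round_robin(procs, 3))   # (7.0, 2.75, 7)
print(round_robin(procs, 1))   # (5.5, 0.75, 14)
print(round_robin(procs, 10))  # (4.75, 4.75, 3), effectively FCFS
```

With a quantum of ten the burst of seven never gets preempted, which is why it degenerates into first come, first serve.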
So you just go through and count however many times it switches between processes. There's one switch between process one and two, then another here, then another here, then here, then another here, here, and here. So if you want to count the number of context switches, you just count the number of red lines. How many do I have there? Seven, yeah. So when a process terminates and we switch to another one, that still counts as a context switch, although you can imagine it might be slightly faster, because I wouldn't have to save the state of whatever is terminating, but I still have to load whatever executes next. Yep, and it doesn't have to switch every time: in the case where P2 didn't arrive by time four, we would have run process one back to back, and that wouldn't have been a context switch. A context switch is only when we're switching between processes, not just whenever the scheduler runs. Yep. No, so when P3 goes here, it gets a three time unit slice but finishes early after one time unit. It terminates, and the quantum length resets for the next process. Yeah, you don't inherit leftover quantum. Imagine someone could easily game that: if you knew the quantum length was, I don't know, say 10, then hey, I'll just make processes that only last for nine, and then every other process only ever gets the one spare time unit, and I screw over every other process. So that's a reason you could think of for restarting the quantum every time. Okay, so fairly good so far. So the hard version is doing it again, but with a quantum length of one. Yeah, we'll see a tie right now. This is the hardest it can get: the quantum length is one, so we have to make a decision every single time unit, which gets kind of annoying. But thankfully, if you follow along writing the queue at the bottom, it's not too bad. So at time zero, process one arrives, so it goes in the queue.
So process one executes, then after its time slice of one it goes to the back of the queue. It goes to the back, comes around again, and now we have a bit of a tie: process two arrived at the same time we're throwing process one to the back. The rule for this is to always favor the newly arriving process. So we would favor process two and put P1 behind it. The idea behind this is, well, it's newly arriving: I can lower my overall response time if I favor the newly arriving process over the one being re-queued. Remember, for tie breaks there's usually a reason behind them. This one favors the newly arriving process, why? Because I want to minimize the average response time. So in this case, P2 executes, then gets thrown to the back of the queue; my queue looks like P1, P2, and they go back and forth. And again, I have the same situation where I'm re-queuing P1 just as P3 comes in, so I would favor P3, and my queue would look like this: P2 at the front, then P3 behind it, and then P1 at the very back. Then P2 executes, since it's at the front of the list. And now we have that same situation again: my queue before was P3, P1, and now P2 is being re-queued just as P4 arrives. So I would favor P4 and put P2 at the very back. So now the hard part is mostly done. At this point, I just go through round robin with all of these processes. Thankfully, when I schedule P3, P3 only wants to execute for one time unit, so it gets scheduled and it's done. Now my queue looks like P1, P4, P2, and it's just going to ping pong between these processes over and over again, in that order, until they're done. So hopefully I can go fairly fast: it goes P1, P4, P2, P1, P4, P2.
And now at this point P2 is done, because it wants to execute for four time units and we're now at four: one, two, three, four. So P2 is done, and now my queue looks like P1, P4. Then it just goes P1, P4, P1, now P1's done, then P4, now P4 is done. So any questions on that? That's as hard as it gets. Yep. So the reason we didn't immediately execute P3 is because P2 was already in line before P3 even arrived, so it can't butt ahead of it, too bad. You could tweak it if you really wanted to optimize response time and put new arrivals at the immediate front, but then you might get into fairness problems, right? If that always happened, and you had a bunch of really short processes that finish within a quantum, they'd always run immediately, and that might not be good. Yep. And typically there will not be a mix of the two: it's an either-or thing. Either I schedule non-preemptively or preemptively; you don't really go back and forth. All right, so then the fun calculations start. Average waiting time. Well, we can again figure out when things end. 16 is when P4 ends, 15 is when P1 ends, 12 is when P2 ends, and P3 is not too hard to figure out. So the waiting time for process one: it finished at time 15 and arrived at time zero, so it was there for a total of 15 time units, and executed for seven, so it must have been waiting for eight. Process two ended at time 12 and arrived at time two, so it was around for 10 time units, executed for four of those, so it must have been waiting around for six. Process three, we can see here, only waited around for one time unit. And process four ended at time 16 and arrived at five, so it was there for a total of 11, and it took four, so it must have been waiting around for seven. If we calculate that, eight plus six plus one plus seven over four is 5.5, so our average waiting time went down compared to our quantum length of three. Next thing is response time.
So the average response time: process one ran immediately, and so did process two. Process three only waited for one single time unit, and then process four was the longest; it waited here for two time units. So we can see that the response time went way, way, way down, which is great. The bad thing: how many context switches did we have? How many was that? 14. So the number of context switches was 14, which may or may not be a bad thing, depending on how long a context switch takes. So everything's kind of a trade-off. If you wanted to practice, hey, you can do it again with a quantum length of 10. Now it's first come, first serve, right? Because 10 is longer than the longest process, which is only seven. You could go ahead and calculate it if you want: only three context switches now, and the waiting time happened to go down, but the average response time got real, real bad. So this basically summarizes what we discussed: you don't want a quantum length that's too short, and you don't want one that's too long. And if you simulate round robin a bunch of times, it generally has a bad average waiting time when the jobs are of similar length. Scheduling just involves a bunch of trade-offs. We'll see other things we can do next lecture; you can combine things, do all sorts of weird things. But round robin is a good, safe scheduling algorithm that's super easy to implement and works relatively well as long as you pick a good time slice. So we saw first come, first serve; shortest job first; added preemptions to get shortest remaining time first, saw it optimized waiting time but wasn't realistic; and saw round robin, which is fair and something you could actually implement. Yay. All right, we're good.