All right, welcome back to operating systems. Today we finally get to relax a bit. It's been a bit of a whirlwind, but today we're just talking about basic scheduling, and none of it will be very surprising if you have lived life or ever gone to a restaurant. So we have two types of resources on our computer, if we want to categorize them like this: preemptible and non-preemptible. A preemptible resource can be taken away at any given time and used for something else, or otherwise shared. A CPU is a preemptible resource, and it can be shared through scheduling. I'm sure you've encountered resources like this many times in your life, where if something is highly contested, maybe you have to schedule some time to actually use it. A non-preemptible resource cannot be taken away without some type of acknowledgement. Disk space, for example: if your file is using a gig, or however much it's using, you can't just take away a few megabytes of it and use it for something else, because then you'd probably invalidate the whole file. That's just not the way a file works. Memory is another example. A non-preemptible resource is going to be shared through allocations and deallocations, so there's some explicit requesting of the resource and then some explicit "I'm not going to use it anymore." As an aside, if you ever get into high performance computing, suddenly CPUs become non-preemptible resources: you can say, hey, I need this CPU exclusively for an amount of time, and you give it back whenever you're done with it. But for the general computing in this course, we don't do that with CPUs. So the dispatcher that we talked about earlier, which does the actual context switching, and the scheduler work together.
The dispatcher is the low-level mechanism responsible for doing the context switching between processes, which is just swapping their registers; it changes which process is actually running on the CPU at any given time. The scheduler is the high-level policy: it is responsible for deciding what process to run and when. The scheduler is going to run, at minimum, whenever a process changes state. So we'll consider, for instance, non-preemptible processes: once you give them the CPU, they just execute until they eventually terminate, kind of like that uniprogramming operating system we saw earlier. In this simplified world, once a process starts, it keeps going until it finishes, so the scheduler can only make a decision when a process terminates: out of the current processes that want to run, which is the chosen one I'm going to run right now? A preemptible kernel, on the other hand, allows the operating system to run the scheduler at will. If you use that uname command again with dash v, it gives you a whole bunch of information, and one of the things it may show is PREEMPT, which says: hey, I preempt processes, and they don't get to decide. So when we are doing scheduling, we care about four major metrics. The first is hopefully somewhat self-evident: we want to minimize waiting time, the amount of time a process is not executing for, and we want to minimize response time, the length of time from when a process arrives until it first starts executing. That's when you would first perceive some interaction with the process, which is why it's called response time. We also want to maximize CPU utilization: if you have a system with multiple cores, you don't want to just let them sit there idle when they could be doing useful work.
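Since we'll be computing these metrics all lecture, here's a small sketch of the two per-process definitions in code. The `metrics` helper and its segment-list representation of a Gantt chart are my own illustration, not anything from the lecture or a standard API: a schedule is a list of (process, start, end) segments, and waiting time uses the shortcut that the time in the system minus the burst time is the time spent waiting.

```python
# Hypothetical helper (my own sketch): compute per-process waiting time
# and response time from a Gantt chart. A schedule is a list of
# (pid, start, end) segments; arrival and burst are dicts keyed by pid.

def metrics(schedule, arrival, burst):
    first_start, finish = {}, {}
    for pid, start, end in schedule:
        first_start.setdefault(pid, start)  # first time it ever ran
        finish[pid] = end                   # last segment's end wins
    # waiting = total time in the system minus time spent running
    waiting = {p: finish[p] - arrival[p] - burst[p] for p in arrival}
    # response = time from arrival to first execution
    response = {p: first_start[p] - arrival[p] for p in arrival}
    return waiting, response
```

Any of the schedules we draw today can be fed through this to check the numbers; for a process that runs in one uninterrupted block, waiting time and response time come out the same.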
We also want to maximize throughput, that is, completing as many processes as possible: the more processes you get through, the fewer you have left to run, and then you can start running other ones. And the last one is fairness, which is kind of at odds with the rest of them. Ideally you would like to give each process the same percentage of the CPU, or otherwise be fair, because if a process wants to run, you essentially want to ensure it will run at some point in the future. So the first type of scheduling is what everyone has probably encountered if you've ever gone to any fast food restaurant: first come, first serve. You are in a line. The first process that arrived gets the CPU, and the processes get stored in a first-in, first-out queue based on their arrival order. What process runs next? Whatever is at the front of the queue, because that is the one that arrived first. It's just a good old-fashioned line. Everyone loves Gantt charts, so that is how we will illustrate the schedules today. So let's consider the following processes. Throughout all of today, we'll be scheduling four processes: P1, P2, P3, and P4. In the other columns, I have their arrival time. All the times are just in time units, without a real unit, to make things easier. The arrival times in this first case are that they all arrive at pretty much the same instant, at time equals zero, except we assume they arrive in this order, like they're off by a few nanoseconds or something. Then burst time, which might look like a weird term, is just the term that represents how long each process wants to run on the CPU for. So process one wants to run for seven time units, process two for four, and so on and so forth. If we assume this arrival order, P1, P2, P3, P4, then this is our schedule.
At time zero, we just pick whatever process is at the front of the line. That is process one, and it runs for seven time units. Because we're not doing any preemption or anything like that, we just schedule it until it terminates. At time seven it terminates, and then we have to make a decision: what do we run next? The next process in line is process two, so it gets to run for four time units, taking us to time 11. The next process in line is P3; it takes one time unit, taking us to 12. Then P4 is the last one in line; it takes four time units and gets us all the way to 16. So the question you might get asked is: what is the average waiting time over all processes? P1 waited around for zero time units; it was scheduled immediately. P2 was scheduled after seven time units, because it arrived at time equals zero, as did the rest of the processes. P3 waited for 11 time units, and P4 waited for 12 time units. So if you do 12 plus 11 plus seven plus zero, divided by four, because that's how averages work, you get 7.5. So 7.5 is the average waiting time for these processes. Now let's take the same processes, but assume a slightly different order: our line looks like P3, P2, P4, and then P1. With first come first serve, this is our schedule. At time zero, the first process in line is P3; it executes for one time unit. At time one, we pick the next process, which given our order is P2, and it executes for four time units. Then P4 executes for its four time units because it is next in line, and then P1 executes for seven time units. Nothing too exciting, unlike previous lectures, right? Fairly straightforward. If we compute our average waiting time now, it is zero plus one plus five plus nine; add those up, divide by four, and we get 3.75.
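Those two averages are quick to check without drawing the chart. Here's a minimal first-come-first-serve sketch of my own (the function name and structure are not anything standard): since everything arrives at time zero, a process's waiting time is just the total burst time of everything ahead of it in line.

```python
# Minimal FCFS sketch: all processes arrive at time 0, in line order.
def fcfs_waiting(order, burst):
    time, waiting = 0, {}
    for pid in order:
        waiting[pid] = time   # waited from time 0 until its turn
        time += burst[pid]    # runs to completion, no preemption
    return waiting

burst = {"P1": 7, "P2": 4, "P3": 1, "P4": 4}
w1 = fcfs_waiting(["P1", "P2", "P3", "P4"], burst)
w2 = fcfs_waiting(["P3", "P2", "P4", "P1"], burst)
avg1 = sum(w1.values()) / 4   # 7.5 for the first arrival order
avg2 = sum(w2.values()) / 4   # 3.75 for the second
```

Same processes, same algorithm; only the arrival order changed, and the average halved.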
So with the same processes, our average waiting time went down by half just because of the order they happened to arrive in, which sounds really variable and like something we would ideally like to control. Our first actual algorithm is called shortest job first. It is a slight tweak to first come first serve, and I say algorithm, but it's not much of an algorithm: all it does is pick the job with the shortest burst time first. Again, we are still assuming non-preemptible processes. Now I will change the arrival times slightly so that each process arrives at a slightly different time, just to make things more interesting, and we'll use this example for the rest of the lecture. On the top, just to make things easier, I put their arrival times: at time zero, P1 arrives; at time two, P2 arrives; at time four, P3 arrives; at time five, P4 arrives. Those are the arrival times, and our algorithm is shortest job first. At time equals zero, we only have one process to schedule, so we do not have a choice: we have to run P1 for its seven time units. While we're running it, all the rest of the processes arrive: P2, P3, and P4. Between P2, P3, and P4, if we're doing shortest job first, we look at the burst times and pick the lowest one first. The lowest one out of all these processes is P3, so whenever the CPU is free, at time seven, we pick P3, because that is the shortest job. Then at time eight, we have to pick between P2 and P4. They have the same burst time, so to break this tie, we'll just pick the one with the earliest arrival time, why not? P2 arrived before P4, so we pick P2, and it runs from time eight to 12. At 12, we have to make another scheduling decision; there's only P4 left, so we schedule it for the last four time units.
If we compute our average waiting time, we have to take into account when processes arrive. P1 arrived at time equals zero and started executing at time equals zero, so it waited for zero time units. For P2, it arrived at time equals two and started executing at time equals eight; eight minus two equals six, so it was waiting around for six time units before it executed. P3 arrived at time four and executed at time seven, so it waited for three. And finally P4 arrived at five and executed at 12, so it waited around for seven time units. Our average waiting time is now four, and we can use that as a baseline for everything else. All right, any questions so far? This should be relatively boring, right? Fairly straightforward, easy types of questions to ask on an exam that should essentially be free marks. The first catch is that shortest job first is not super practical. Through math you can prove that it is optimal at minimizing the average waiting time if there's no preemption, but in reality you will never know the burst times of the processes. Maybe you can guess them, have some model for them or something like that, but in general you might be wrong, and some things just aren't predictable. The other disadvantage is that you may starve long jobs; starvation in this case means they never execute. If I have one really long job, and then enough short jobs to keep the CPU busy, that long job will never execute. Say every time unit a one-time-unit job comes in, and I also have a job that wants to execute for eight time units: since new short jobs are constantly coming in, that eight-time-unit job never executes, and the term we use is that it gets starved. So two reasons it's not practical.
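The shortest-job-first decisions above can be sketched in a few lines. This is my own toy version, not production scheduler code: it assumes we magically know every burst time (the impractical part just mentioned), runs each picked job to completion, and breaks burst-time ties by earliest arrival, as in the example.

```python
# Toy non-preemptive shortest job first.
# procs maps pid -> (arrival, burst); returns waiting time per process.
def sjf_waiting(procs):
    time, waiting, done = 0, {}, set()
    while len(done) < len(procs):
        ready = [p for p, (a, b) in procs.items()
                 if a <= time and p not in done]
        if not ready:
            time += 1          # CPU idles until something arrives
            continue
        # shortest burst first; break ties by earliest arrival
        pid = min(ready, key=lambda p: (procs[p][1], procs[p][0]))
        waiting[pid] = time - procs[pid][0]
        time += procs[pid][1]  # no preemption: run to completion
        done.add(pid)
    return waiting

procs = {"P1": (0, 7), "P2": (2, 4), "P3": (4, 1), "P4": (5, 4)}
w = sjf_waiting(procs)         # waits: P1=0, P2=6, P3=3, P4=7
```

Averaging those four waits gives the four time units we just computed by hand.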
We can tweak it slightly if we want to add preemption to the mix: shortest remaining time first is that tweak. We just add preemption, and then we schedule the process with the shortest remaining time left. Again, we'll assume the minimum execution time is one time unit just to make our lives easier. This also optimizes the average waiting time, but it still has that starvation issue. So here's what it looks like for the same processes, but now with preemption. It starts off the same: at time zero, P1 is the only thing we can schedule, so we have to schedule it. We start executing it, and then at time two, P2 comes in. At that point we have executed P1 for two time units, which means it has five time units remaining, because its total burst time is seven. P2 has four time units remaining, because we haven't started executing it yet. Four is shorter than five, so we pick process two. We execute process two for two time units, and then process three comes in at time equals four. At that point, process two has two time units left, because it wants to execute for four and has so far executed for two, and process three has one time unit remaining, because that's all it wants to execute for. So process three is now the shortest, and we immediately schedule it to run. It runs for its one time unit. When it finishes, P4 has come in, which also wants to execute for four time units. Well, guess what: P2 has just two time units remaining, so we schedule P2 to run. It runs for its two time units and is done at time equals seven. Then we just have to choose between P4 and P1. P1 has five time units remaining and P4 has four, so we schedule P4 to run. It runs for four time units and is done at time 11.
Then at time 11, P1 gets scheduled for its last five time units and finishes out. Now for the average waiting time: process one waited around for a total of nine time units this time, some of it in the middle. If you don't want to count that up, there is a shortcut. Figure out what time the process stopped executing: P1 stopped executing at time equals 16 and arrived at time equals zero, so that's 16 time units total in the system. Then you subtract the amount of time it was actually running for, which is its burst time, and that gets you the amount of time it was waiting. So if you do 16 minus seven, guess what, you get nine. P2 waited around for one time unit while P3 ran; P3 waited for zero time units, because it was scheduled immediately; and P4 waited for two time units, because it came in at time equals five and waited until time seven. And yeah, that's the comment: I'm running into the same problem where I don't know the burst times in advance; I'm just showing you the preemptible version of it. Yep. The shortcut? Yeah, so for the shortcut, let's pick a different one. Take the waiting time for P2, and say there was a bunch of switching in the middle that we didn't want to count. We just take the total amount of time it was around for: P2 ended at time equals seven and arrived at time equals two, so the total time it was hanging around for was five time units. Then to calculate the waiting time, you just subtract its burst time from that: the total time it was around is five time units, it was running for four of those, therefore it must have been waiting for one of them. That's just a shorter way; if there's a lot of switching, it might be annoying to actually count the gaps, and for longer ones, why would you?
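Here's a tick-by-tick sketch of shortest remaining time first that uses that same shortcut, finish minus arrival minus burst, to get the waiting times. Again this is my own illustration, not anything official: it assumes one-time-unit ticks, magically known burst times, and remaining-time ties broken by earliest arrival.

```python
# Toy preemptive SRTF, one scheduling decision per time unit.
# procs maps pid -> (arrival, burst).
def srtf_waiting(procs):
    remaining = {p: b for p, (a, b) in procs.items()}
    finish, time = {}, 0
    while remaining:
        ready = [p for p in remaining if procs[p][0] <= time]
        if not ready:
            time += 1                      # idle until next arrival
            continue
        # least remaining time wins; ties go to the earlier arrival
        pid = min(ready, key=lambda p: (remaining[p], procs[p][0]))
        remaining[pid] -= 1
        time += 1
        if remaining[pid] == 0:
            del remaining[pid]
            finish[pid] = time
    # the shortcut: waiting = finish - arrival - burst
    return {p: finish[p] - a - b for p, (a, b) in procs.items()}

procs = {"P1": (0, 7), "P2": (2, 4), "P3": (4, 1), "P4": (5, 4)}
w = srtf_waiting(procs)   # waits: P1=9, P2=1, P3=0, P4=2
```

That gives an average waiting time of three, better than the four we got without preemption, which is the whole point of the tweak.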
So that's just a shortcut you can do. All right, so far so good. But yeah, this is also super impractical. Yep. Yeah, most of the time this is presented because it provably gives the shortest average waiting time, so you use it as a benchmark to compare things against after the fact. But it has some problems: we don't know the burst times ahead of time, and it has that starvation issue, where something that wants to run for a long time might never actually execute. So here's our real one. It's called round robin, and it's probably not terribly surprising either. If you have siblings and you had a toy you both liked, well, your parents probably said: you have 30 minutes with it, then you have to pass it to your sibling, they have 30 minutes, and you pass it back and forth like that. Round robin is exactly that. So far we haven't introduced fairness. If the last algorithms were used to share a toy, you'd probably be very mad, depending on which sibling you were. We haven't talked about fairness at all or tried to evaluate it, and obviously starvation is a counterexample to fairness: starving something is not terribly fair. What the operating system does in round robin is divide execution into time slices, and in the literature you might find them called quanta. Why are they called quanta? I don't know, they're computer scientists; they want to be fancy. They're just time slices. An individual time slice is called a quantum, and again, don't ask me why, but you might encounter that. What this does is maintain a FIFO queue, similar to first come first serve, except it keeps going around and around and around based on the time slices. It will preempt a process at the end of its time slice and essentially throw it to the back of the queue.
All right, you go to the back of the line. So what are some practical considerations in determining this quantum length, the length of the time slice? Yep. Yeah, the speed of your processor. If you pick it really, really, really small, then you're probably going to waste a lot of time doing context switches, that is, switching between processes. What is the opposite problem, if I pick a super, super huge quantum length? Yep. Yeah, it's the same thing as first come first serve. If my quantum length is eight years, then it's the same as first come first serve, which we know kind of sucks. So there's always a sweet spot with whatever time slice you pick. Let's go through an example. Again, same processes, and I will work through it with you. Here's what you would probably start with on an exam: you get the processes with their arrival time at the top and the burst time, and for round robin you're typically given a quantum length. Let's start off with a quantum length of three, so our time slices are three time units long. At time zero, process one is the only thing we can schedule, so we schedule it for its quantum. It gets three time units, and it uses all three, because it wants to run for seven in total; it doesn't want to exit early or anything like that. At the bottom I will write out my queue at every time step. At time two, while P1 was executing, P2 came in, and it would be at the front of the queue; it's the first one in line now. At time three, when P1's slice is done, P1 goes to the back of the line, so our queue looks like P2, then P1. What do we choose to execute at time three? P2, right? It is the first thing in the queue. So we pick P2, it runs for three time units, and then at time four, P3 comes in.
P1 had moved to the front of the queue once P2 started executing, so it was the only thing in the queue. Then at time four, P3 comes in, so it is added to the end of the queue. At time five, P4 comes in, so it goes to the end. Our line now looks like P1, P3, P4. At time six, P2's slice is done, but it still has time left, so it gets thrown to the back of the queue, and now my queue looks like P1, P3, P4, P2. So what process is the next one to execute? P1, I heard at least one of you. P1 runs for three time units; remember, its total time is seven, so it uses all three, and when it's done executing, it gets thrown to the back of the queue. Now we have P3, P4, P2, then P1. We run whatever is at the front of the queue, and that is P3. P3 only wants to run for one time unit, so at that point it is done running; it goes back to the scheduler, and the scheduler has to pick a new process to run. Our queue then looks like P4, P2, P1. So P4 runs. It runs for three time units, because in total it wants to run for four, and it uses all three. At this point my queue looks like P2, P1, P4, and they all have only one time unit left, so we run them in that order: P2, P1, and then P4. Right, any questions about that? Really straightforward; some people are sleeping, I should probably bump the microphone and wake people up. Yep, yeah, the length of the quantum is given to you in the question. I just picked it, because, hey, we'll pick different ones and see what the effect is. So here are the questions we might want to ask ourselves; these are the three questions you will be asked for every scheduling problem.
The first one would probably be: what's the average waiting time? Well, we have to work out the waiting time for each process. P1 finished at time equals 15 and started at time zero, so we can use that shortcut again: it was around for 15 time units, its burst time was seven, so 15 minus seven means P1 was waiting around for eight time units. Now we have to figure out the waiting time for P2. P2 ended at time 14 and came in at time two, so it was around for 12 time units, 14 minus two. It ran for four of those, so it must have been waiting for eight of them. Next, the waiting time for P3. We could do the trick, but in this case it doesn't really change anything: P3 was waiting around for one, two, three, four, five time units. Yep. So, the question: P2 arrived at time two, and as soon as it arrived, it was immediately waiting, right? Because P1 was running. Yeah, so the span goes from when a process arrives to when it ends, the total time it was hanging around for. You have to include that early part; you don't start counting from when it first executes, because then you miss the time it was waiting to execute initially, which is, in fact, part of the response time that we'll want next. So, to finish it off, we need the total waiting time for P4. P4 ended at time 16 and came in at time equals five, so it was hanging around for 11 time units. Its burst time was four, so 11 minus four, as long as I can do elementary school math, should be seven. So our average waiting time, if we crunch that all together, is supposed to be something like seven, right?
So our average waiting time is seven; we can remember that when we compare other things. The next thing we want to calculate is average response time. Response time is kind of what you were talking about: it is the time from when a process first arrived to when it first started executing. It's just the initial time it was waiting around, if you want to word it like that. The response time for P1: it came in at time equals zero and started executing at time equals zero, so its response time is zero. P2 came in at time equals two and started executing at time equals three, so its response time is one. P3 waited around a while: it came in at time equals four and waited one, two, three, four, five time units until it started executing. And process four came in at time equals five and started executing at time 10, so its response time is five time units. Add those up and divide by four, and we get 2.75. So those are our two numbers. The last one we want to keep track of is the number of context switches, because remember, if we're doing real scheduling on a real CPU, that is not free. We have to keep track of how many times it happens, because if we context switch too much, it might be real bad. To count the number of context switches, you just count the number of times we switch between processes; I will mark them in red. We switch between process one and two here, and that's our first context switch. Our second is between process two and process one. Then we have another context switch here between process one and three, another between three and four, then four and two, two and one, one and four. So we just count the number of red lines we drew: one, two, three, four, five, six, seven.
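All three of those round robin numbers can be checked with a little simulator. This is my own sketch, not how any real kernel does it: it follows the tie-breaking rule we use in class, that a process arriving during a slice enters the queue before the preempted process is re-queued.

```python
from collections import deque

# Toy round robin. procs maps pid -> (arrival, burst).
# Returns per-process waiting time, response time, and switch count.
def round_robin(procs, quantum):
    order = sorted(procs, key=lambda p: procs[p][0])  # by arrival
    remaining = {p: b for p, (a, b) in procs.items()}
    queue, segments, time, i = deque(), [], 0, 0
    first_start, finish = {}, {}
    while i < len(order) or queue:
        if not queue:                      # idle until next arrival
            time = max(time, procs[order[i]][0])
        while i < len(order) and procs[order[i]][0] <= time:
            queue.append(order[i]); i += 1
        pid = queue.popleft()
        first_start.setdefault(pid, time)
        run = min(quantum, remaining[pid])
        segments.append(pid)               # one Gantt chart segment
        time += run
        remaining[pid] -= run
        # arrivals during the slice line up before the preempted process
        while i < len(order) and procs[order[i]][0] <= time:
            queue.append(order[i]); i += 1
        if remaining[pid] > 0:
            queue.append(pid)              # back of the line
        else:
            finish[pid] = time
    waiting = {p: finish[p] - a - b for p, (a, b) in procs.items()}
    response = {p: first_start[p] - a for p, (a, b) in procs.items()}
    switches = sum(1 for x, y in zip(segments, segments[1:]) if x != y)
    return waiting, response, switches

procs = {"P1": (0, 7), "P2": (2, 4), "P3": (4, 1), "P4": (5, 4)}
w, r, s = round_robin(procs, quantum=3)
# average waiting 7, average response 2.75, 7 context switches
```

Rerunning it with different quantum lengths reproduces the rest of the lecture's comparisons, so you can experiment with the sweet spot yourself.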
So we have seven context switches. All right, any questions about how I came up with any of those numbers? Yep. So, yeah, when a process actually finishes, we still have to context switch to run a different process, right? You might be able to skip saving the registers of the process that's ending, but you still have to restore the next one's, so it would be slightly quicker, but we don't go into that level of detail. Yep. Yeah, you would not count time zero or time 16 as a context switch, because nothing is switching at those points. Yep. Yeah, that's a good point. The question was: how much does a context switch cost, and wouldn't first come first serve have the least number of context switches? Yeah, if you didn't preempt processes at all, that would have the least number of context switches, but it would also have the worst response time possible, and you would probably be very angry if you could only run one process at a time on your machine. So it wouldn't really be used. Most scheduling algorithms are just a series of trade-offs; there's no one thing that can optimize everything. Fairness is at odds with waiting time, but you don't want either to be too bad. It becomes an engineering problem, which, thankfully, we're a room of engineers and not computer scientists. So, next one. For fun, if you want to write a lot, we can do it again with quantum length equal to one and see what the difference is. In this case, every single time unit we get to make a new decision, so this will be the most annoying one. Same arrivals, same burst times, same everything; we just have to make a decision at every single time unit, which will become very tedious, very quickly.
At time equals zero, well, P1 wants to run, so we run P1, and then it gets thrown to the back of the queue, but guess what? It's still the only process, so we schedule it again. Now the first, and only, interesting thing happens: P2 is coming in at the same time that P1 goes to the back of the queue. The rule for this is: you always put whatever just arrived ahead of the process being re-queued. The idea is that P1 already executed and was going to the back of the line anyway; P2 is new, so if I can reduce its waiting time at all, that's a win for response time overall. So P2 goes to the front of the line, and when we re-queue P1, it goes to the back. That's about the only interesting thing that happens; now we'll just go through it. P2 runs and goes to the back of the queue, and you can see we're going to constantly ping-pong between these. P1 goes, and then our queue has just P2 in it. Now we have that tie again: P3 is coming in while we're re-queuing P1, so the queue goes P3, then P1. Then P2 executes; our queue looks like P3, P1. Then P4 comes in, and then P2 is re-queued behind it. Thankfully, the hard part is over now; we're just going to keep going around and around in this order until things stop. P3 executes; it mercifully only executes for a single time unit, so it is done. Our queue looks like P1, P4, and P2. So can I take liberties here and go quicker? Yeah, I see nods, we can get out of here. It goes P1, P4, P2; P1, P4, P2. At this point, P2 should be done, right? P2 executed here, here, here, and here, so it has used up its four time units and it is good to go. Then we would have P1, P4, P1, P4. Yep, anyone disagree with me there?
This is where the shortcut will save our lives, because counting everything here is really annoying. At time 16, P4 ends; at time 15, P1 ends; at time six, P3 ends; and where does P2 end? P2 ends at time 12. So we answer the same questions. What's the average waiting time? For process one, it ended at 15 and arrived at zero, so it was sticking around for 15 time units. It executed for seven of those, so it must have waited for eight time units. Next is P2: it was done at 12 and came in at two, so it was hanging around for 10 time units, executed for four of those, so it waited for six. Next is P3: it came in at time four and finished at time six, and we can just see that it waited around for one time unit. Then the last one, P4, waited around for seven time units, to skip a step. Divide by four and we get 5.5. Our response time is going to be slightly easier. For P1, the response time is zero; it waited around for zero time units until it started executing. What is the response time for P2? Zero, right, it started executing immediately when it arrived. What about P3? One, I heard at least one. What about P4? Two. So without even doing the math, we can tell the response time is much, much lower: 0.75, as opposed to the 2.75 we got before. So the response time went down, and the average waiting time also went down, to 5.5, so that's good. The last thing is the number of context switches, which will be fun. Here I switch between P1 and P2; I'll just mark them all down. Guess what? We're essentially context switching every single time. Who wants to count that number of context switches? 14. So the average waiting time went down, and the average response time went down, but the number of context switches went way up. If context switches were slow, this quantum length might not be a good choice.
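To double-check those quantum-length-one numbers, the shortcut boils down to one line per metric. The finish and first-run times here are just read off the chart we drew; nothing below is new math, it's the same subtraction done four times at once.

```python
# Shortcut check for the quantum = 1 schedule (values read off the chart).
arrival   = {"P1": 0,  "P2": 2,  "P3": 4, "P4": 5}
burst     = {"P1": 7,  "P2": 4,  "P3": 1, "P4": 4}
finish    = {"P1": 15, "P2": 12, "P3": 6, "P4": 16}
first_run = {"P1": 0,  "P2": 2,  "P3": 5, "P4": 7}

# waiting = total time in the system minus time spent running
waiting = {p: finish[p] - arrival[p] - burst[p] for p in arrival}
# response = time from arrival to first execution
response = {p: first_run[p] - arrival[p] for p in arrival}
avg_wait = sum(waiting.values()) / 4    # (8 + 6 + 1 + 7) / 4 = 5.5
avg_resp = sum(response.values()) / 4   # (0 + 0 + 1 + 2) / 4 = 0.75
```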
So, the last one we can do, which you've pretty much all figured out already: right, what if we have a giant quantum length? Yep. First, though: yeah, when a process arrives at the same time something is being re-queued, you put the thing that's being re-queued at the very back. That's why here, at time two, P2 went first: it was the new process, and P1 was being re-queued, so I put P2 first and then put P1 at the very back. And I did that every single time a new one came in while I was re-queuing. Yeah, yeah. So in that case, the process that arrives while another gets re-queued becomes second last instead of last. Okay, so, next boring thing: do it again, but with quantum length 10. Wow, guess what? It becomes first come, first serve. It becomes real boring. If you compute everything, it looks like this. We have way fewer context switches, which was your point; there are only three context switches in this case. Our average waiting time actually goes slightly down compared to quantum length one: our waiting time was 5.5, and now it's slightly better. But our average response time is kind of garbage now: it's 4.75, which is way worse than 0.75. And this is just first come, first serve without preemption. So, yep. Yeah, that's a good question: why does response time matter? It depends on the application. Response time may not matter if you're just doing some machine learning thing, right? You only care when it ends. But if you are interacting with your computer, response time matters a lot. If you're doing something interactive, you want low response time: as soon as you move your mouse, you want to see it move, right? If it took a second to respond to you, you'd be like, what a piece of crap, and then you'd go sell your Microsoft one and get a Mac or something, right? Yeah. Yeah, so response time is just from arrival time to when the process first starts executing.
And that's it. Yeah, response time doesn't care about how long the process takes; it's only when you first get a response out of it, which is a crude measurement of how something would actually feel, but better than nothing. Yep. Yeah, that's another good question: can you mix and match scheduling algorithms? Like, let this run, and do something else with it after a certain time. The answer is yeah, there's no perfect scheduling; you can do whatever you want. In fact, we'll talk more about scheduling next time and little things you can add to it. You might figure out that, hey, this is a process someone's interacting with, so maybe I want to prioritize low response time for it, and here's something they don't care about interactively, so I just want to run it as fast as possible; it doesn't need low response time. You make different decisions based on the processes, so absolutely you can do something like that. All right, hopefully having a boring lecture once in a while is nice.