All right, welcome. After all the multi-threaded stuff and all the locks and all that fun stuff, this lecture will be very straightforward and easy. So yay, we've reached the other side of the hill. Today we'll talk about basic scheduling, and this should be kind of what you're already used to and be fairly straightforward, and these are, I guess, easier questions to put on something like a quiz. So let's get to it. There's two types of resources. Being preemptible or non-preemptible doesn't just apply to a CPU; it's a more general term and just means that you can take something away. Preemptible resources are resources that can be taken away, given to something else, and given back, so you can switch back and forth between them. The preemptible resource we're all used to is the CPU, and it can be shared through things like scheduling: you get some time, you get some time, you get some time, and we all get to use the same resource. Non-preemptible resources are something you can't take away without at least some type of acknowledgement. Think about your disk space: I can't give everything full access to the disk, I have to give each thing some space on the disk, and I can't take it away unless I do something really rude and delete it. There should at least be some acknowledgement so you can actually program against it. Another example of a non-preemptible resource would be something like memory: I can't just give you memory, then take it away, and then expect your program to actually work. So a resource that is non-preemptible is usually shared through allocations and deallocations. That's exactly what you do for memory, it's essentially what you do for disk, and we're kind of used to it by now.
Just as a note, if you have super high throughput systems where you actually need to control resources, you might make CPUs non-preemptible resources, so you just give a program a CPU for the whole time and let it execute as fast as it can. But you won't see that unless you're in high performance computing or something like that. So there's two things: there is a dispatcher and a scheduler. We already kind of know what the dispatcher does. It's the low level thing responsible for the context switching, which is essentially like that getcontext function; you could also call the thing that does the context switching a dispatcher, which you've kind of already written for lab two. And then the scheduler is the high level policy, which we haven't talked about yet, and it's responsible for deciding which processes to run. The scheduler runs whenever a process changes state. For now we'll consider non-preemptible processes, so a process just executes until completion, like the old batch systems of yesteryear. Once a process starts, it will run until it is completed. In this case, the scheduler only has to make a decision when a process terminates and it has a free CPU: it just has to pick the next process to run to completion. A preemptive scheduler, on the other hand, allows the operating system to run the scheduler at will. That's what you'll be doing in lab three, and that's what happens on something like Linux: if you run `uname -v`, part of the version description will say PREEMPT, which means it is a preemptive scheduler, which everything has been for like the last 30 years. So when you argue about scheduling and want to quantify how good it is, there's a few metrics we want to keep track of and try to optimize. One is that you want to minimize waiting time and response time: if a process comes in, you might want to react to it as quickly as possible, and that is response time.
So response time is the time from when a process first arrives until you first give it some CPU time. The other one is waiting time, which is the total amount of time a process is waiting and not running when it could otherwise be running. Then, if you care about your CPU, you want to maximize CPU utilization: you don't want a CPU idle; if you have a machine with eight cores you want them all running at full speed if you can. You also want to maximize throughput, so complete as many processes as possible, and some of these metrics are at odds with each other. The last metric we care about is fairness: ideally, if we're being completely fair, you want to try and give each process the same percentage of the CPU. So the first scheduling algorithm we can use is not really even an algorithm; it's what we're all used to. It's called first come first serve, or FCFS, which is the most basic form of scheduling. You go to the food truck outside, you are in this type of queue: you line up, whoever's at the front gets served first, and it goes on and on like that. In this scheme, the first process that arrives gets a CPU, and the processes are just stored in FIFO order in something like a queue. We'll use Gantt charts to illustrate schedules, so everything will look like this as we go on, and this is the type of question you'll be asked. You'll be given something like: consider the following processes. In the first column we just have a process name, so we have processes one, two, three and four, and then the arrival time, so at what time does the process arrive; in this case they all arrive at time zero, and we'll just start everything at time t equals zero. And then burst time is just the total amount of CPU time the process needs to complete. That's pretty much just for arguing about the metrics; in real world situations you wouldn't know the burst time, right?
Your kernel, whenever you start a program, has no idea how long the program will last, and it has no good way to predict that either. For this example, let's assume they all arrive at time zero and, somewhat atomically, there's some order where they arrive ordered one, two, three, then four. So if we schedule at time unit zero, the first process that arrives given that order is process one, and its burst time, the total amount of time it takes to execute, is seven time units; we'll also assume the smallest unit we can break things down to is one time unit. So that's the P1 here in the darker color: it takes seven time units to execute, and at t equals seven it's done executing because it's had its seven time units. Now at time seven, the things left in the queue are processes two, three and four. Two was at the head of the queue, so at t equals seven we schedule process two, and process two, if we look at the chart, takes four time units: one, two, three, four, which takes us to t equals 11. Then process three is next, which only takes one time unit, taking us to t equals 12, and it goes on like that, and in total it takes 16 time units to complete everything. The average waiting time is just the average of how long each process waits. Process one waits for zero time units because it comes in at t equals zero and gets scheduled at t equals zero. For process two, the waiting time is seven, because that's how long it takes to start, then P3 waits 11 and P4 waits 12. So the average waiting time, nothing special about it, should be zero plus seven plus 11 plus 12 divided by four, which is 7.5. All right, make sense for everyone?
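The walkthrough above can be sketched in a few lines of code. This is a minimal simulation for computing the metrics, not anything a real kernel does; the function and variable names are invented for illustration, and the process table is the one from the example.

```python
# First come first serve: run each process to completion in arrival order.
# Each process is a (name, arrival_time, burst_time) tuple.

def fcfs(processes):
    """Return a dict of per-process waiting times under FCFS."""
    waits = {}
    t = 0  # current time
    for name, arrival, burst in sorted(processes, key=lambda p: p[1]):
        t = max(t, arrival)        # sit idle until the process arrives, if needed
        waits[name] = t - arrival  # time spent waiting before its one and only run
        t += burst                 # non-preemptive: runs until completion
    return waits

procs = [("P1", 0, 7), ("P2", 0, 4), ("P3", 0, 1), ("P4", 0, 4)]
waits = fcfs(procs)
print(waits)                              # {'P1': 0, 'P2': 7, 'P3': 11, 'P4': 12}
print(sum(waits.values()) / len(waits))   # 7.5
```

Python's sort is stable, so processes that arrive at the same time keep their original order, matching the one, two, three, four ordering assumed in the example.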
All right, so now assume they all still arrive at the same time, t equals zero, and the only difference is the arrival order: it goes three, two, four, one instead. Now our schedule would look like this, again first come first serve: we do P3, which only takes one time unit; then P2, which takes four time units, so it completes at t equals five; then P4, which again takes four time units, so it completes at t equals nine; and finally P1, which takes seven time units, so it completes at t equals 16. Our average waiting time now would be zero plus one plus five plus nine divided by four, which is 3.75, so we cut our average waiting time in half just by changing what order the tasks arrive in. Okay, any questions so far? We got a way shorter waiting time just based off the order, so there must be something to this. If we want to actually minimize the waiting time, we make a slight tweak: given a bunch of processes, we do the shortest job first. As we saw, what happened in the previous slide is we were always doing the shortest job first, and that is a way to minimize the average waiting time. So instead of first come first serve, we always schedule the shortest job first. (Question from the audience.) Yep, shortest job first will get a tweak, and we'll see that tweak when we have different arrival times. (Another question.) Yeah, like I said before, the kernel would not know the burst times; this is just so we can argue about different scheduling algorithms and actually show what the trade-offs are, but realistically the kernel is not going to know. Okay, so we'll still assume no preemption in this case, and we'll slightly change the example so that we have different arrival times.
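Before moving on to different arrival times, the case we just saw can be sketched too: when every process is present at t equals zero, shortest job first is just first come first serve after sorting by burst time. A minimal sketch; the names are invented, and the burst times are from the example.

```python
# Shortest job first when every process arrives at t = 0:
# sort by burst time, then serve in that order.

def sjf_simultaneous(processes):
    """processes: list of (name, burst). Returns per-process waiting times."""
    waits, t = {}, 0
    for name, burst in sorted(processes, key=lambda p: p[1]):
        waits[name] = t  # everything arrived at t = 0, so wait = start time
        t += burst
    return waits

waits = sjf_simultaneous([("P1", 7), ("P2", 4), ("P3", 1), ("P4", 4)])
print(waits)                              # {'P3': 0, 'P2': 1, 'P4': 5, 'P1': 9}
print(sum(waits.values()) / len(waits))   # 3.75
```

Note this produces exactly the three, two, four, one order from the slide: the stable sort breaks the P2/P4 tie by keeping P2 first.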
So now not everything arrives at t equals zero. On the Gantt chart here, in the top row, I put in all the arrival times: P1 arrives at t equals zero, P2 arrives at t equals two, P3 arrives at t equals four, and P4 arrives at t equals five, so we can keep track of them. At t equals zero, the only process that has arrived is P1, so we have no choice but to schedule it. Again, we're assuming no preemption, so as soon as we schedule it, it runs until it completes: at t equals zero we pick P1, so we're stuck with it until it's done after seven time units. While it's executing, all the other processes arrive, so we have P2, P3 and P4 waiting at t equals seven, when P1 is done. If we do shortest job first, the shortest job is the one with the shortest burst time, so we pick P3 to run at t equals seven. It runs and finishes, and then at t equals eight we have a choice between P2 and P4. At this point it doesn't really matter, because they have the same burst time and so the same completion time; if there's a tie you can generally just pick whichever one arrived first, so if there's a tie on burst time you fall back to first come first serve. In this case, if we compute our average waiting time again, it's going to be slightly different because we have different arrival times. For P1, the wait time is zero: it came in at t equals zero and got scheduled at t equals zero. For the rest of them, because we're non-preemptive, you essentially just count the difference from when the process arrived until when it started executing. For P2, that's six time units, because it arrived at t equals two and didn't execute until t equals eight, so it waited around for six time units. P3 only waited around for three time units: it arrived at t equals four and got executed at t equals seven. And then P4 came in at t
equals five and couldn't get executed until t equals 12. So if we compute the average waiting time, it's four. Any questions about that? Nice and simple finally, right? After multi-threading. Now, this is not super practical. You can provably minimize the average waiting time with shortest job first if you have no preemption, but, as was pointed out, you don't know the burst times for each process. You could do something like using past executions to predict the future; that's not going to be perfect. You could try and stick some AI in there too; again, that's not going to be perfect. And then you also might starve long jobs, so they may never execute. Assume we had a schedule where one task takes like a thousand time units: in this scenario it might never get executed, because you'd keep getting new tasks with burst times of four or ten or something like that, so your longest running task might never execute, and that's called starvation. The first tweak we're going to do is make it difficult now and say that we can preempt processes. If we want to tweak shortest job first to work with preemption, we can just do shortest remaining time first, and again we'll assume the minimum execution unit is one time unit. Similar to shortest job first, this will minimize the average wait time. So here's the same example again, but now we can preempt whenever a new process comes in and immediately schedule it, as long as its remaining time is lower. We'll have the same arrival times as we saw before and the same burst times. Now, if we do shortest remaining time first: at t equals zero we only have P1, so we schedule that. It runs for two time units, and then process two arrives. P1 takes seven total and we've executed it for two, so it has five time units left to execute, while P2 arrives with a total execution time of four time units. So P2 will finish faster than P1, so we just kick
P1 out and schedule P2 instead. Everyone follow that? So we have five time units left for P1. We schedule P2, and it runs for two time units, and then when P3 comes in we might have to make a decision, so we re-evaluate: now P2 has two time units left to execute, and P3 just arrived and only takes one time unit to execute. So we go ahead and switch over to P3 and execute it for its one time unit until it completes. Now at t equals five, P4 also arrives, which will take four time units, but our shortest remaining time is going to be that P2 we kicked off, which only has two time units remaining. So at t equals five we schedule P2 for its two time units and let it finish. At t equals seven we have P4, which takes four time units, and we also have P1, which still has five time units remaining; shortest remaining time first, so we schedule P4 for its four time units. Then finally, at t equals 11, we only have one task left, so we schedule P1 for its last five time units. If we do our average waiting time, we have to calculate the total amount of time each process is waiting when it otherwise could have executed. P1 is waiting around for nine time units: it got started immediately at t equals zero, but between t equals two and t equals 11 it was just waiting around, because processes two, three and four all executed to completion during that time. If we look at the waiting time for process two, it only waited around for one time unit, right here, while process three executed. Process three waited for zero time units: it got scheduled immediately and ran to completion. And finally, P4 waited for two time units, because it arrived at t equals five and finally began executing at t equals seven. So if we do our average waiting time now, we got it down to three, when before it was at four, right?
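The shortest-remaining-time-first walkthrough can also be simulated one time unit at a time (the smallest schedulable unit in the example). A sketch with invented names; ties on remaining time are broken by list position, which here matches arrival order.

```python
# Shortest remaining time first: every time unit, run whichever arrived,
# unfinished process has the least remaining execution time.

def srtf(processes):
    """processes: list of (name, arrival, burst). Returns waiting times."""
    remaining = {n: b for n, a, b in processes}
    arrived = {n: a for n, a, b in processes}
    waits = {n: 0 for n, a, b in processes}
    t = 0
    while any(r > 0 for r in remaining.values()):
        # runnable = has arrived but isn't finished
        ready = [n for n, a, b in processes if arrived[n] <= t and remaining[n] > 0]
        if not ready:
            t += 1
            continue
        current = min(ready, key=lambda n: remaining[n])
        for n in ready:
            if n != current:
                waits[n] += 1  # runnable but not running this time unit
        remaining[current] -= 1
        t += 1
    return waits

procs = [("P1", 0, 7), ("P2", 2, 4), ("P3", 4, 1), ("P4", 5, 4)]
waits = srtf(procs)
print(waits)                              # {'P1': 9, 'P2': 1, 'P3': 0, 'P4': 2}
print(sum(waits.values()) / len(waits))   # 3.0
```

This reproduces the waiting times from the chart: nine for P1, one for P2, zero for P3, two for P4, averaging three.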
Yep, yes, so the comment is: does this still have that starvation problem? And it does: you could still starve long jobs, because essentially we just tweaked shortest job first so it can run by preempting processes. Okay, now we get to the difficult one. Yep, so for the waiting times, you just take the total time from when a process arrives until it completes and subtract out the total time it takes to run, and that's how long it's waiting for. Like the total waiting time for P1: it comes in here, it doesn't complete until here, and it's not executing here, so nine time units. If you don't want to count the individual squares, the easiest way to do it is: the time it finishes, minus the time it comes in, minus how long it takes to run, and that's how much time it was waiting. Okay, so now for our hard tweak, which isn't very hard, and it's the first scheduling algorithm that is actually used in operating systems and kernels and, I guess, your labs. It's called round robin. So far we haven't handled fairness: we just did the shortest thing first and tried to minimize our waiting time. In this case we're going to make it fair, but there are going to be trade-offs with other metrics: if another algorithm provably minimizes the average waiting time, then once we make things fair we're not going to have the lowest waiting time, so there's going to be some type of trade-off here. What the operating system can do, and what happens in lab three, is divide execution into time slices, and these are called quanta (don't ask me why; they want it to seem harder than it actually is). An individual time slice is just called a quantum. You do essentially the same thing as first come first serve, where you maintain a FIFO queue of processes and serve them in order, but each process only gets a set time slice, so you try to be a bit more fair: you just preempt the
process if it's still running at the end of the quantum, re-add it to the back of the queue, and it just takes its next turn. There are going to be some practical considerations for determining our quantum length. Anyone care to guess what metrics we'll be trading off here? Yeah, so one of them is the number of context switches we'll have: the shorter the quantum, the more context switches. And what could be another one, in terms of things you might care about for processes, which is kind of related to the polling question before? Yeah, responsiveness, right. If we make the quantum super, super short, we're going to have a lot of context switches, but we're going to be very responsive: whenever a process comes in, there's a bounded time you know it will take until it actually executes. And again, the drawback is if the quantum length is too long: essentially, if the quantum length is infinity, this is something we've already seen, it's first come first serve, there'll be no difference. So let's see that. Here's round robin; we'll say there's a quantum length of three time units, so the most any process can execute at a time is three time units. Again we'll put the arrival times at the top, and at the bottom here I'll show the current queue, because if you write this out it helps keep track of it. Same processes, same arrival times, same burst times, same everything; we're just changing the scheduling algorithm. Process one comes in at t equals zero. It'd be the only thing in the queue, so it would execute, and our quantum length is three time units, so it executes for three time units and then we'd preempt it and check our queue. At t equals two, P2 arrives, so that goes to the front of the queue while process one is still executing, and then at t equals three we re-add P1 to the end of the queue, which now has P2 in it. So at t equals three we would schedule P2. Everyone
on the same page? All right, so we schedule P2 for three time units. It gets its three time units and it still isn't done yet, and in the meantime our queue grows: at t equals four, while P2 is executing, P3 arrives, so it goes to the back of the queue (it would execute next after P1), and then at t equals five P4 gets added, so it goes to the back of the queue again. Then at t equals six we put P2 all the way at the back of the queue, and for the next process to run we just pick what's at the front of the queue. So we execute P1 for another three time units: one, two, three, which takes us to t equals nine, and it's not done yet, so we re-add it to the back of the queue. At t equals nine our queue looks like P3, P4, P2, P1. P3 is at the front, so we schedule it. It gets a maximum of three time units, but it only takes one time unit to execute, so it just executes for its one time unit and then it's done. Then at t equals 10 we get a new chance to make a decision, so we pick off what's now at the front of the queue, which is P4. It runs for three time units, which takes us to t equals 13, where it gets re-added to the back of the queue. Then our next process is P2, which only has one time unit remaining: we schedule that. Then P1 is at the front with only one time unit remaining: schedule that. And then we schedule P4, which again only has one time unit remaining. So does this make sense to everybody?
This is literally as hard as it gets, as a nice break for us. Oh sorry, this is the tricky part, where we have to count: we have to count the number of context switches, which is just the number of times we change processes. So: one, two, three, four, five, six, seven. No tricks, no anything; we're just counting the number of context switches so we can compare the difference between different scheduling algorithms. There are seven context switches. If we take the average waiting time, it's going to be eight time units for process one: it doesn't finish executing until t equals 15 and it arrives at t equals zero, so if you take 15 minus its burst time of seven, it's waiting around for eight time units, and we can verify that by counting: one, two, three, four, five, six, seven, eight. So it waits for eight time units before it's done executing. Similarly for P2: P2 is done at t equals 14 and it comes in at t equals two, so it's waiting for one, two, three, four, five, six, seven, eight: eight time units. Process three waits for five time units and process four waits for seven time units, so if we calculate our average waiting time, it's seven. Then, if we want to calculate response time, that's the time from when a process first comes in until it first starts executing. For P1, its response time is zero: it came in at zero and got executed at zero. For P2 it's one time unit: it came in at two and started executing at three. P3 came in at four and started executing at nine, so it waited around for five time units, and P4 came in at five and didn't start executing until 10, so it also waited five time units before it started executing. So here are all our numbers written down. Most of the time, when you compare scheduling algorithms, you count the number of context switches, the average waiting time, and the average response time. And a quick note here about ties: if a new process arrives while
one is preempted, generally you always favor the new process if there's a tie, and the rationale behind that is: if I favor the new process instead of the old one, all I'm doing is reducing the response time for the new process, right? The first one already had its response time. So, oh sorry, this is as hard as it gets: we'll just reduce the quantum length to one time unit, so we iterate over and over again, but it's not any harder. Now, with a quantum length of one time unit, we just have to track our queue over and over again. At t equals zero, P1 is the only thing there, so it executes; then we check the queue again, and P1 is the only process there. Then at t equals two, P2 comes in, and we favor it as P1 ends its time unit, so we put P1 at the back of the queue: this is where our tie rule comes into play. So we execute P2, and then those two are just going to go back and forth, round robin, since they're the only two tasks. We get P1, then our queue is only going to have P2 on it, and then P3 comes in, so there's a tie between re-queuing P1 and the new task P3. This is our tie, so we favor the new task, which is P3. At that point we would schedule P2, and then P4 comes in at t equals five, and there's now another tie, with P2 re-adding itself to the queue, so we put P4 ahead of P2, and this would be our queue. From here we just round robin one process at a time: P3 gets executed and it's done, since it only takes one time unit; then the next process is P1, then P4, then P2, then P1, then P4, then P2, and at that point P2 is done executing. Then you have P1, P4, you can see the round robin pattern, then P1, and then it's done executing, then P4, and then it's done executing. So, any questions on this?
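The round-robin walkthroughs can be simulated the same way, with the quantum as a parameter. This is a sketch with invented names, not the lab implementation; the queue is an ordinary FIFO, and it applies the lecture's tie rule: a process arriving during a quantum is queued ahead of the process being re-added when that quantum expires.

```python
from collections import deque

# Round robin: each process runs for at most `quantum` time units, then
# goes to the back of the queue if it isn't finished.

def round_robin(processes, quantum):
    """processes: list of (name, arrival, burst), sorted by arrival time.
    Returns (context_switches, waits, responses)."""
    remaining = {n: b for n, a, b in processes}
    finish, first_run = {}, {}
    queue = deque()
    t, i = 0, 0            # current time; i indexes the next arrival
    switches, last = 0, None
    while len(finish) < len(processes):
        while i < len(processes) and processes[i][1] <= t:
            queue.append(processes[i][0])  # admit everything that has arrived
            i += 1
        if not queue:
            t += 1
            continue
        current = queue.popleft()
        if last is not None and last != current:
            switches += 1  # we changed which process is running
        last = current
        first_run.setdefault(current, t)
        run = min(quantum, remaining[current])
        end = t + run
        # tie rule: arrivals during this quantum are queued BEFORE the
        # current process is re-added to the back
        while i < len(processes) and processes[i][1] <= end:
            queue.append(processes[i][0])
            i += 1
        remaining[current] -= run
        t = end
        if remaining[current] == 0:
            finish[current] = t
        else:
            queue.append(current)
    waits = {n: finish[n] - a - b for n, a, b in processes}
    responses = {n: first_run[n] - a for n, a, b in processes}
    return switches, waits, responses

procs = [("P1", 0, 7), ("P2", 2, 4), ("P3", 4, 1), ("P4", 5, 4)]
switches, waits, responses = round_robin(procs, quantum=3)
print(switches)                     # 7
print(sum(waits.values()) / 4)      # 7.0
print(sum(responses.values()) / 4)  # 2.75
```

Running the same function with `quantum=1` and `quantum=10` reproduces the other two examples: 14 context switches versus 3, with the quantum-10 schedule degenerating into first come first serve.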
So there's just a queue, and it keeps re-adding processes, yep. So the question is: at t equals four, why didn't we favor P3? And we did. Right at that time, when P1 is executing, there's only P2 in the queue, so right at t equals four, when P3 comes in, there's a tie between P3 arriving and P1 re-adding itself to the queue, and that's why we put P3 here, ahead of P1: that's our tie. So we do that and just go on and on again, and you just have to pay attention to how long each process has left and when to remove it from the queue because it's done executing. So if we go through and calculate all of our numbers again: first the number of context switches, so 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, right, you just count the number of context switches. In this case there are 14 context switches instead of the 7 we saw before. Then you calculate the average wait time and average response time, and these numbers should make sense: the average waiting time actually went down a bit, to 5.5, and because we used a shorter quantum length we can also demonstrate that our response time is better: we're at 0.75 instead of the 2.75 we had before. So, nicely enough, our average waiting time went down and we got a lower response time, and our drawback was that we context switched twice as often: before we had seven context switches and now we have 14. Cool, so here it is again, the same question but now with a quantum length of 10 time units. If we do round robin with that, again with the queue at the bottom and the schedule at the top: at t equals zero we only have P1, so we schedule it, and it gets up to 10 time units but only takes seven to execute, so it would just be scheduled for all seven. In that time P2 comes in, so P2 would be at the front of the queue,
then P3 comes in at t equals four, so it goes to the end of the queue, and then at t equals five P4 comes in and goes to the back of the queue. Then at time seven, when process one is done executing, we just pick what's at the front of the queue, which is P2: it executes for its four time units. Then we execute P3, the next thing in the queue; it only takes one time unit to execute, which is less than the quantum length of 10, so it just executes to completion. Then we only have P4 left to schedule, and we execute that to completion. If we run the numbers on this, it's exactly the same as our first come first serve schedule, right? Because our quantum length was bigger than the time the longest process takes. So it's not going to do any better, but it does further reduce the number of context switches: in this case we only have three context switches. If we crunch the numbers, our average waiting time is actually somewhat better than with the lower quantum lengths, at 4.75, but our response time is now terrible. Yeah, yes, the average wait time isn't necessarily dependent on the quantum length, and like with every scheduling algorithm it's a hotly debated topic, because it's all trade-offs; you can't have a perfect one that minimizes everything, so there are tons of fun scheduler names and people just make up their own schedulers. All right, so in this case it was the same as first come first serve. Your round robin performance is going to depend on the quantum length and also the job lengths, and most of the time you can never predict the job lengths; you can only analyze things after the fact to see how well you did. Round robin has low response time, pretty good interactivity, fair allocation of the CPU, and a fairly low average waiting time when the job lengths vary, though you can kind of game it to make the waiting time as bad as you want. The performance really depends on the quantum length: if it's too high,
it becomes first come first serve; if it's too low, there are just going to be too many context switches, which is wasted time where your CPU is doing nothing but swapping registers and context switching between programs. If you actually go through a bunch of examples, typically you'd simulate a bunch of processes and see how each individual algorithm does, and if you do this and figure out the trends, round robin has a really poor average waiting time when all the jobs are of similar length, which kind of intuitively makes sense. All right, any questions about that? So scheduling is really dependent on your load and a whole bunch of other factors that you can't predict, and there are lots of trade-offs you can make. Linux has two main scheduling algorithms: one is round robin, which we just went through, and the other, which we'll see in the next lecture, is a bit more advanced but actually not that complicated: it's called the completely fair scheduler, and it tries to be more fair and divvy up CPU resources a bit differently. There are also a few fun schedulers: one is called BFS, the "Brain F" scheduler, which people seem to like; you can create your own scheduler for the Linux kernel and see if it works any better. Yeah, so that's a good question for the next lecture: what if jobs have priorities? You might want to complete one over another, and that's the next lecture: we're basically just going to add in priorities and see what Linux actually uses. Yeah, so you'd care about average response time if it's, say, a GUI application and you're a user: a faster response time just makes it seem like the system reacts instantly, and it feels a lot better. So there's always going to be some type of trade-off. Sometimes Windows will do something different, and a lot of the feel of different programs is just how they're scheduled, and there's no
right or wrong answer; it just depends. So: scheduling involves a lot of trade-offs. We saw first come first serve, the most basic scheduling algorithm, which we're all used to in everyday life. Then we tweaked it to do shortest job first, to reduce the average waiting time. Then we allowed preemptible processes and came up with shortest remaining time first, which minimizes the average waiting time (you might also see turnaround time, the time from arrival to completion, which is a closely related metric). And then we saw round robin, which optimizes for fairness and response time. So that's it. Nice, we can take a breath. Just remember, I'm pulling for you; we're all in this together, and let me