Welcome to Friday. Everybody happy? How many of you are happy it's Friday? All right. Because what looms in front of you is a long weekend that you can devote to working on assignment two, right? Okay, so today we're going to finish up our unit on the processor. People who are sitting way in the back, you might want to move down a little bit, just because I can't promise that you'll hear me the entire time. So today we're going to finish up our unit on scheduling and multiplexing the CPU with a story about Linux scheduling. We'll look at a real scheduling algorithm that was proposed for Linux, and it gives us a chance to talk a little bit about Linux, about the Linux community, and about how development works on this particular large open-source project. Okay? We'll also do a little bit of review. On Monday we're going to start memory. So this is the end of one of our first big units on how operating systems multiplex and abstract system resources. We're going to be done with the CPU, and we're going to go out to memory, and that'll consume us for the better part of a month. Okay, so: the grading for assignments zero through two. The code reading questions should hopefully be done today; we're close. You know, we front-loaded a lot of this, so this is like three quarters of the human grading on the assignments that the TAs do all semester, and we're trying to get it done pretty early, right? So a lot of the questions are finished; we're wrapping up the code reading questions. The design grading should be done on Monday; those obviously take a little bit longer. Okay? One thing I want to point out about the design document grading is that when you receive your grades, you'll notice that the rubric is really designed to evaluate whether or not you've completed the portions of the design that we asked you to do. The rubric is not evaluating whether or not those portions were correct. So, for example, it says: did you talk about wait and exit?
You will probably get points if you talked about wait and exit, even if you said something totally crazy about how you're going to implement them. Okay? Because what we're trying to get you guys to do is think about how you're going to implement them. So please don't take the design document marks as an indication that what you're proposing is a good idea, right? If you want feedback on that, print out your design, bring it to office hours, talk about it with the TAs, post it on Piazza. We just don't really have a way of delivering that kind of feedback on the designs through the website. Okay? The implementation is due two weeks from today, so there's not a huge amount of time left. On Monday I'm going to start sort of presenting some of the people who have finished assignment two. So there may be people out there who are thinking, "Assignment two's not going to take very long; I finished assignment one in a couple hours." That's great. So prove me wrong, right? I think that most of you guys should be started right now and working hard in order to put a good assignment two together. So if you want to prove me wrong, do it this weekend, and then on Monday I'll have you down here in front of the class and we'll all give you a big round of applause for finishing the assignment two implementation. I am betting there's not going to be anybody down here, right? Zero. Okay, so prove me wrong. And then you'll have two weeks to hang out and just relax. Okay. One final note:
Please keep in mind that the academic integrity policy is not just applied to the code that you submit in this class. It's also applied to the design documents, and in particular to the code reading questions. So just please keep that in mind as you submit them. We do look for, and notice, suspicious similarities between the answers that you are able to view once you submit the questions and the answers that you submit. So we do notice; obviously the TAs are looking at the two side by side. Okay. So, it's really funny: I have these videos up on YouTube. I have a thousand subscribers; I'm pretty proud of that. Of all my videos there are 13 dislikes, 13 total. Okay, I've got a lot more likes, but six of them, for whatever reason, are on the video for this lecture. I don't know why. Maybe you guys are going to find out today. I've invited people to this and now it's going to be terrible. I am going to try to work on this verbal tic, I promise. It is now embedded in my brain. I know I do this too much; my wife knows that I do this too much. She thinks it's funny. Apparently (this is a comment from one of my YouTube videos, this one actually) I say "right" at the rate of one every 5.6 seconds. That seems pretty impressive. She actually asked me, did you check his math? I'm like, no, of course I didn't. Why would I do that? First of all, I don't want to hear myself doing it; second of all, it just looks pretty credible. Okay, so I will try to stop this. I know, I know, I have a bad habit, right? I need like a beep every time that happens. Okay. So, any questions on the scheduling algorithms we've already covered, before we talk about new ones? Okay, so let's do a little bit of review. We talked about different types of information that schedulers might use to decide what thread runs next. Can some people give me some examples of information that the scheduler might want to use to help it decide?
Which thread should be able to run? One example? Yeah. Yeah, so the last time it ran: how much time did it run before it blocked? Because that might allow me to make better use of other system resources. Also, what I would love to know, what nobody pointed out, is I would love to know what the thread is about to do. And we talked about a category of schedulers that can actually do that. We can't implement them, but we can talk about them; we can compare other schedulers to them. What I do instead of being able to predict the future is use the past to predict the future. This is something that we will come back to when we talk about virtual memory, and this is something operating systems are always doing: assume that what just happened is going to continue. So the multi-level feedback queue is doing a variant of this. And then finally, what does the user want to happen? So I have things like priorities that give me the ability to inject exogenous information into the scheduler, in order to try to improve how the system behaves. So most schedulers have ways of accepting external input. They don't understand why you gave that thread or that task a particular priority, but they are willing to do your bidding. They're willing to give it more access to the CPU because you told them it was important. So we talked about some examples of schedulers that don't use any information to choose the next thread to run. What's the simplest version of a know-nothing scheduler? Random: the random scheduler. And then we also looked at round robin, which needs a little bit more information: it needs to remember what order it ran the threads in last. What about schedulers that try to do a better job?
So we talked about things we might like to know. Kendra pointed out: I would really like to know how long a thread is going to use the CPU next. When it blocks, I might want to know how long it's going to block, and I also might want to know whether it's going to block or yield. If it yields, remember, it's actually saying, "I want to keep using the CPU," so it's not necessarily in the process of using another resource. If it blocks, it is. So blocking is great, because what I've allowed the thread to do is whatever it needed to do to get another part of the system active, and then get out of the way. Right? So we talked about multi-level feedback queues, and this was the algorithm we used. I'll just put it up there and then we'll talk about it, because I want you guys to be thinking about this as we look at the rotating staircase scheduler today, because there are some similarities here. What the multi-level feedback queue algorithm is trying to do is reward interactive threads, or at least identify and reward threads that give up the CPU quickly. So if a thread runs, does a small amount of work, and then blocks, that's a thread that I want to prioritize, because again, it's put some other part of the system to use. So I choose a scheduling quantum. Remember, most of these algorithms are based on this; we'll see it again in about 20 minutes. It's the longest period of time I'm going to let a thread run before I forcibly stop it and allow another thread to use the processor. Once I have a scheduling quantum, for MLFQ I establish some series of queues, or levels. If a thread blocks before it finishes its quantum, that thread will eventually be promoted into a higher-priority queue. If the thread hits the end of its quantum and does not block, that thread will eventually end up in a lower-priority queue. Any questions about MLFQ? If you don't understand MLFQ, it's going to be hard to understand RSDL. Okay, so, just a little bit of review: CPU-bound threads.
What direction do they go? If I have a thread that's computing digits of pi, what queue is it going to end up in? The bottom, right: these go down. I/O-bound threads go up. Okay, and keep this in mind, because one of the inadvertent features of this type of algorithm is that despite the fact that I prioritize quote-unquote high-priority, highly interactive threads by allowing them to use the CPU first, they still may end up not receiving as much CPU time as lower-priority threads, simply because they block very quickly. And this is a problem with these types of schedulers that Con Kolivas, who wrote the scheduler we're going to talk about today, noticed and tried to address in his design. So finally: why are we doing this work to try to identify interactive threads, or threads that will give up the CPU before the end of their time quantum? Why do I want to run them first? I've said this two or three times already today, but does somebody remember? Yeah. Yeah, that's part of it. Right, exactly. It allows me to keep other parts of the system busy. I have to use the CPU in order to use anything else on the system. If all I want to do is get to the CPU so that I can take that character input and redraw the screen, and then I'm going to be sitting there waiting for you again, I might as well let that happen, because then I've got some other resources that are kept busy (like you), and I can let the CPU-bound thread run at that point, right? Okay, so any questions on the scheduling algorithms we've already covered? Yeah. So can you fool the scheduler? Right, I mean, you can manipulate the scheduler, but in order to do that you have to run for very short periods of time. So for example, I guess if I was computing digits of pi, what I could do is pretend to block a couple of times, get into a high-priority queue, and then start computing some digits, right? And then I would start to fall, and then I would do that over and over, right?
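The MLFQ rules just described (block before your quantum expires: move up; burn your whole quantum: move down) can be captured in a toy sketch. This is not Pintos or Linux code; all the names, the number of levels, and the quantum value here are mine, purely for illustration.

```python
from collections import deque

NUM_LEVELS = 3      # level 0 = highest priority (illustrative choice)
QUANTUM_MS = 10     # scheduling quantum (illustrative choice)

class Thread:
    def __init__(self, name, level=0):
        self.name = name
        self.level = level

class MLFQ:
    def __init__(self):
        self.queues = [deque() for _ in range(NUM_LEVELS)]

    def add(self, thread):
        self.queues[thread.level].append(thread)

    def pick_next(self):
        # Always run from the highest-priority non-empty queue.
        for q in self.queues:
            if q:
                return q.popleft()
        return None

    def on_block(self, thread):
        # Gave up the CPU before its quantum expired: looks interactive, promote.
        thread.level = max(0, thread.level - 1)
        self.add(thread)

    def on_quantum_expired(self, thread):
        # Used its whole quantum: looks CPU-bound, demote.
        thread.level = min(NUM_LEVELS - 1, thread.level + 1)
        self.add(thread)
```

With this sketch, a pi-computing thread that keeps expiring its quantum drifts toward the bottom queue, while a thread that keeps blocking drifts toward the top: exactly the sorting behavior described above.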
You could do that. It's not clear that you wouldn't get better performance just doing what you need to do, because you're wasting time every time you block unnecessarily. But it's a good point. A lot of these algorithms are sort of based around the fact that we don't really expect threads or tasks to be malicious. We expect them to be going about their business, trying to do what they need to do. We don't expect them to be trying to game the system or take advantage of it. That's a good question. Any other questions about scheduling algorithms? Okay. So now we're going to talk about sort of the semi-recent history of the Linux scheduler. And I say semi-recent because this was recent as of, you know, four years ago, right? But I think it's neat. I remember asking you guys at the beginning of the semester how many of you think you're going to actually hack on a real operating system, and very few people raised their hands. The fact is there are still opportunities to hack on real operating systems, including Linux. Linux is still under development; people are still making changes to it; it's still evolving. And even something that you would think is as basic and simple as the thread scheduler (what is a more core part of an operating system than thread scheduling?) has been rewritten several times within the last 10 years. This part of the system continues to be developed. And you can jump on board even if you have a day job. The guy that wrote one of these new schedulers is not a full-time Linux hacker; he's a full-time anaesthetist. So I guess while he's sitting there watching people who are having operations, he's thinking about ways to improve thread scheduling. I don't know if I would want my anaesthetist to be doing that, but it's probably pretty boring to be watching someone who's under anesthesia, right?
They don't do much. And there are some other sort of fun things that come out here, too. Okay, so: Linux. How many people have a device that uses Linux? For how many people is that a phone? How many people have an Android phone? All right, so all of you guys that have an Android phone have a device that runs Linux. Linux is everywhere. Linux is a large and clearly very actively maintained project. These statistics are a couple years old: about 10 million lines of code. Half of that is dedicated to supporting device drivers, and there's an interesting story there about how device drivers are difficult to maintain and end up being the source of a lot of problems with kernels, but we'll come back to that later in the semester. Again, as of a few years ago, about 2,500 developers. So think about the challenges you have coordinating with your partner to do these assignments. Imagine if you had 2,500 partners for the class. So clearly there's an organizational structure that has to emerge, right? Linux is pushing new releases frequently. At the time it was about every three months for a new major kernel release, and the minor ones may come out even faster. So again, I don't know anything about Linux firsthand; I have never committed a line of code to Linux. But here's what I understand about how the system works.
This is pretty normal. Different subsystems have maintainers, and every file within the subsystem also has a maintainer. The maintainer is the person you contact when you think there's a problem: when you think there is a bug, when you have a problem with a particular subsystem on a particular device or in a particular context. So the file maintainers are in charge of the contents of the file, and the subsystem maintainers are probably also making larger design decisions about the direction that various things should go. For example, there's somebody in charge of maintaining the scheduling subsystem. That person is the one who says, "Hey, by the way, it's time to write a new scheduler. We don't like features of the existing one, so we're actually going to write a new one and use it to replace the code that's already there." And at the top of this process, there are a couple of people who are responsible for merging in pretty much every commit. Again, this is three or four years old, but I'm assuming it's still true: you've got Linus and Andrew, right, someone at the very top. These people may be different now, but there is some person who literally is merging patches and pull requests into the mainline Linux tree. Somebody has to make the final decisions about what gets committed. And you might think it's kind of weird for an entire kernel or operating system to have just a small number of people making the final decisions, but this is kind of how it works. I did an internship once at Microsoft, and the Microsoft OS at the time was very similar. There were three guys; I mean, there were a lot of people developing different features for Windows, but when you talked about the core OS, there were really three people who were making final decisions about things. I remember being in a meeting once where people were talking about, oh, should we do this, and should we do that?
And this guy named Landy Wang, who was the maintainer of the virtual memory subsystem in Windows, walked in, and it was like a god had descended from Mount Olympus or something. When he walked in he was like, "No, we're not going to do that," and they were like, okay, that's it, right? Because he was maintaining the whole thing, and if he didn't want it, it wasn't going to happen. So there was no point even continuing the conversation. So again, this is not particularly unusual. The top guys maintain an official mainline Linux release. Now, there are also a lot of variants and forks and subtrees of Linux that you can find out there, and some of those are maintained by the individual developers themselves. The idea is, if you're a major Linux developer and you have code that you want to test out, that code goes into your own tree first. Then, eventually, if it works and if the people who are in charge think it's a great idea, it will migrate into the mainline kernel and be released as part of an official Linux release. But there are features that sit in these subtrees for long periods of time, sometimes forever. There have been popular features in Linux that for whatever reason never left a certain subtree. Maybe they were considered too experimental; maybe somebody who is in charge doesn't like them for whatever irrational reason. But they sit in these subtrees. I think there's something called the -mm subtree, and Con Kolivas, who we're going to talk about later, maintains a -ck subtree, or did when he was working on this. And keep in mind: Git, the version control system that you're using this semester, was built to host Linux. It's written by Linus; it was built to facilitate the Linux development model and to address features of earlier systems that Linus did not like.
So, you can imagine this is a big community, and Linux runs on all sorts of different types of devices: desktops, servers, phones, embedded devices. It's pretty impressive that you're able to maintain a single, sort of stable mainline kernel release for such a large, diverse project. But there are a lot of tensions between different parts of this community. Let me give you some examples of tensions between the desktop and server people. So, the server guys work with these large companies that run Linux servers. Those large companies frequently have a lot of developers. They have well-established code bases. They have benchmarks that they use to evaluate new releases. They know exactly what they want. When a new version of Linux comes out, they run their benchmark; if it slows things down a little bit, they just say, you know, we're not going to use that one, and they wait to see what happens with the next release. Linux itself is an open-source project, but Linux as a community has generated an enormous amount of economic activity. There are plenty of companies that package Linux, that will pay you to help support Linux, et cetera. But a lot of these guys work in the server space. And a lot of the companies that work in the server area and use Linux have people on staff who are also part of the Linux development community. So if you look at those 2,500 people who are developing Linux, to the degree that they have jobs directly related to Linux, a lot of them are at these types of companies, because these companies make money. They have a person on staff who's in charge of working on the features in Linux that they need, and who also sort of contributes things back into the Linux ecosystem. Okay, so that's server Linux. Now, desktop Linux. How many people run Linux on a desktop machine? Okay, you guys are weird, right? How many people run it on a laptop? That's a terrible idea.
Don't do that. Okay, so, desktop Linux. Think about it. First of all, what are the benchmarks here? I mean, desktops run all sorts of different types of software, and so in contrast with the server guys, who are frequently running a small number of applications, the desktop benchmarks are really much less well-defined. You've got users that are cheap, right? These are the people who wouldn't pony up 100 bucks for Windows, or the people who got Windows for free with their new PC and then got rid of it anyway, right? So yeah, these people are not necessarily going to inject a lot of money into the community. Okay: server guys, maybe; desktop guys, not so much. And then, desktop users are frequently just normal people. Now again, Linux desktop users are not normal people, right? But desktop computer users as a group pretty much includes everybody now, okay? So they're not necessarily going to hack on the system. They just want it to work. Think about somebody's grandma. She just wants to be able to open her web browser. She doesn't want to write a new interactivity benchmark. She just wants to use, well, an instant messenger or whatever people use these days. The Linux kernel maintains a mailing list. How many people have ever tried to read the kernel mailing list? I did this one summer for fun. It's terrible, right? It's like 99 percent of the time you have no idea what's going on. This was 15 years ago.
I tried to do this, and with all the time and energy I've spent thinking about computer systems between then and now, I think I might have moved that number down to 98. I think if I read it now, I would not understand 98 percent of what's going on. It's really complicated, and the discussions going on in parallel are about really low-level features and subsystems that a lot of people don't even know exist or really care about. This is not where you go if you want help, like, "I can't get Firefox to open, what do I do?" And sometimes people wind up there who are confused. These are the kind of people who clearly need help, because if you had a little bit more of a clue, you wouldn't have asked the question in that forum. You would have gone there, looked at one of the messages, and said, no, no, no, this is not where to ask this question. And sometimes people can be mean. So I looked this up, and unfortunately, and this isn't entirely intended to be funny, some of the people in this community have some of the same problems that, unfortunately, primarily men in the tech community seem to have: anger management issues, trouble using appropriate language, trouble keeping some reasonable norms for how to talk about these things. So this is Sarah Sharp. She's an active Linux developer, and she finally got so fed up that she wrote this as part of a long series of angry back-and-forths between some of the Linux developers, apparently prompted by a discussion that, as she points out, got a little bit out of control. We're talking about computer systems, people. There is no need to threaten anyone with physical violence. Let's just get the new driver to work. All right. Okay, so let's talk about the scheduler. So: before Linux 2.6, and this is pretty old now.
We're probably talking the aughts, maybe the late 90s. The pre-2.6 scheduler did not scale well. One of the things we wanted our scheduler to do was to not take very much time, because the process of choosing the next thread to run is, by definition, not running the next thread, which is what I actually want to do. So the longer that process takes, the more cycles I'm wasting on the machine, cycles I could have used to do useful work that the user would notice. The pre-2.6 Linux scheduler had this terrible property where its runtime actually increased as the system had more threads. So the longer the ready queue was, the longer the scheduler would take to figure out what to do. Which is kind of terrible. I mean, this isn't very good, period, but why is it even worse than it might first appear? Yeah. Yeah, so that's not a property you want your system to have. The more people who are waiting (that's a great way to put it), the longer they have to wait. Okay, this was what was happening. The more loaded the machine got, the more time the scheduler was wasting. The scheduler was doing its best when there wasn't even much of a decision to make, and it just kept doing worse and worse as the decision got more complicated. This is definitely not what we want; it's the opposite of what we want. So the Linux 2.6 scheduler aimed to fix this. The 2.6 scheduler had O(1) performance, which is fantastic: constant running time regardless of the length of the run queue. The O(1) scheduler combines two priorities.
There's a static priority, which is set from outside the system. This is that external information: you can tell the scheduler, here's how important I think this particular task or process is. And it also computed something called a dynamic priority. The dynamic priority goes back to this goal schedulers have had of trying to improve interactive performance; it was an attempt to boost the performance of tasks that were deemed to be interactive. How this was done was a problem. The code required to detect that threads were interactive got really weird and complex. There were a lot of magic numbers and constants, and it was very hard for people to understand how it worked. Because of this, the entire scheduler was very difficult to understand. So it was hard to model the scheduler; it was hard to say, given a certain set of threads and tasks with certain behavior, how the scheduler is going to behave. And again, the scheduler is a core component of the OS. Other than the page fault handling code, the scheduler may be the part of the OS that runs most often; it's running all the time. It's something you want to be able to understand. I want to be able to understand how it works, and I might even want to be able to model how it works. But I certainly don't want a big piece of spaghetti code with a bunch of nasty constants and weird math built into it. And that's the problem the O(1) scheduler had. Okay, so: the hero of our day, Con Kolivas, an Australian anaesthetist. He started to become interested in scheduling. Now, I think this is really cool. He's, you know, a random hacker in Australia, and he decides, "I'm interested in scheduling." All right, you guys are learning about scheduling.
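As a quick aside on mechanics: how can picking the next task take constant time no matter how long the run queue is? A common trick, and roughly the approach the 2.6 O(1) scheduler is described as using, is to keep one list per priority plus a bitmap recording which lists are non-empty, so a pick is a find-first-set plus a dequeue. This is an illustrative sketch with names of my own invention, not the kernel's code:

```python
NUM_PRIOS = 140  # the O(1) scheduler is usually described as having 140 levels

class O1RunQueue:
    def __init__(self):
        self.queues = [[] for _ in range(NUM_PRIOS)]
        self.bitmap = 0  # bit p set means queues[p] is non-empty

    def enqueue(self, prio, task):
        # Kernel convention: lower number = higher priority.
        self.queues[prio].append(task)
        self.bitmap |= (1 << prio)

    def pick_next(self):
        if self.bitmap == 0:
            return None
        # Isolate the lowest set bit: the highest-priority non-empty queue.
        prio = (self.bitmap & -self.bitmap).bit_length() - 1
        task = self.queues[prio].pop(0)
        if not self.queues[prio]:
            self.bitmap &= ~(1 << prio)
        return task
```

The point is that the cost of `pick_next` depends only on the (fixed) number of priority levels, never on how many tasks are waiting, which is exactly the property the 2.6 scheduler was built around.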
Maybe some of you are now interested in scheduling. In particular, he starts becoming interested in scheduling for interactive workloads and systems, and this is a much harder problem than in the server world. We'll come back to that in a minute. Here he is. Con Kolivas looks like a nice guy, right? So, I don't know if you guys can read this; I'll read parts of it to you. One of the things he did when he got started was to try to define what he was trying to accomplish. He said there are some properties that we want of interactive, user-facing systems, and at the time, not only were there no precise definitions or precise understanding of what those properties were, but there was also a very problematic lack of benchmarks that would allow us to measure them. So he tried to address both of these problems. Here you'll see he's saying: we don't really understand what makes a nice-feeling Linux desktop; we don't understand, from a scheduler perspective, the things that go into accomplishing that. He separates them into two things. Responsiveness: the rate at which your workloads can proceed under different load conditions. Interactivity: the scheduling latency and jitter present in tasks where the user would notice a palpable deterioration under different load conditions. This is kind of like what we talked about before, with responsiveness being "click" and interactivity being "continuous." As he points out later, responsiveness allows you to continue using your machine without too much interruption to your work, where interruption would be caused by disruptions to your responsive patterns of using the machine: things taking too long to paint or return. You can imagine the characters getting all weird and laggy.
It makes it very difficult to type or move things around. Interactivity allows you to play audio or video without any dropouts, or drag things around. If some of you guys have had a system that's under load and starts to feel laggy, one of the things you can do to experience this in all its glory is grab a window and drag it around, and you'll see it get jumpy. That's a way you can see that the system is not producing good interactivity. So this is nice, and I think... okay, I'll get back to that later. He wrote some benchmarks to address this; I think that's in a later quote. So in 2004 he released something called the rotating staircase scheduler. Compared to the existing scheduler, it threw out about 500 lines of what he referred to as black magic: the interactivity detection code in the existing scheduler that was really hard to understand and made the scheduler difficult to model. He replaced that with 200 lines of code implementing a fairly simple approach that has some nice properties we'll talk about in a minute. There are some similarities to multi-level feedback queues. So here is his description of the scheduler. It's O(1), which we want: the scheduling algorithm itself runs in constant time. It's scalable. No interactivity estimator: that was one of the nasty pieces of black magic. No sleep/run measurements: nowhere am I having to measure how long a particular task spends sleeping or running and then feed that into some weird algorithm. The design is strict enough in its accounting that task behavior can be modeled and maximum scheduling latencies can be predicted, and we'll come back to this in a minute.
So I can actually guarantee that a certain task will have a chance to run within a certain interval. That's really important for continuous tasks, right? Because they want to be able to make sure that they run every so often in order to do something like paint the screen or write audio to the sound buffer. There's a link on the website to the full description, which is pretty cool to read. And here's my attempt at it. There's one parameter, the round robin interval: that's how long each task is allowed to run at a particular level. So it's the maximum length of time any task will run at a particular level before being stopped. And, like many other schedulers, this one also lets me input a priority. Now, what a priority does in the rotating staircase scheduler is define the levels, or stairs, at which the task is allowed to run. A high-priority task has more opportunities to run, because, as you'll see, it starts at a higher level, and if it runs out of time at that level, it has an opportunity to run at a lower level, and a lower level, and a lower level. So if I start at the very top of the staircase, I can potentially fall all the way down and run at every level, all the way down to the lowest one. That's what defines priority: I have more chances to run during each scheduling epoch. Sorry: one of my graduate students finally corrected me on that word; I've been saying it wrong. It's "epic," I guess. I like to say "epoch." It sounds like something from Star Wars.
It sounds like something from Star Wars. If I'm a lower-priority task, I start at the bottom, and so I have fewer chances to run. This will become clear in a minute when we go through an example.

Tasks can run at most a fixed amount of time per level. Every task has a quantum that it's allowed to use, and if I'm a high-priority task, I'm allowed to use it at the highest priority level; that quantum gets reset if I run out, and I'm allowed to use it at lower levels as well. I'll show you how this works in a minute.

Now, there's also a limit on the amount of time that the scheduler can spend in any level. What this means is that, regardless of task behavior, I can predict before I run the scheduler exactly how long it will be before any task has a chance to run. This is really nice; this is bounded latency. So I can say, and we'll do this in a minute: here's the beginning of time, and within this many milliseconds every task in the queue will be able to run. It's a really, really nice feature.

So here's how this works. To start a scheduling epoch, I put all threads in the queue determined by their priority. Remember, the highest-priority threads have more chances to run: they start in high-priority queues and fall down. Then I run threads from the highest non-empty priority queue in round-robin order. If a thread blocks or yields, it remains at that level. So if it blocks, it goes onto a waiting queue, and when it returns it will come back to the same level, if the scheduling epoch has not ended. If it yields, same thing: it goes back onto the end of the run queue for that level and has another chance to run. If a thread runs out of its quota, that is, it has exhausted the amount of time it's allowed to run at a particular level, then it falls down into the next level, and it will have a chance to run at that level.
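The bookkeeping just described, per-task quanta, staying at your level on a block or yield, and falling one level when a quantum runs out, can be sketched in a few lines. This is a minimal sketch of my own, not Kolivas's kernel code; the class and method names, and the idea of reporting events back with `ran()`, are made up for illustration.

```python
from collections import deque

RR_INTERVAL = 5  # the round-robin interval: quantum each task may use per level

class Staircase:
    def __init__(self, threads_by_level, level_quota):
        # threads_by_level: {level: [thread names]}; a higher level number
        # means a higher priority. level_quota: total time the scheduler may
        # spend at each level during one epoch.
        self.level_quota = dict(level_quota)
        self.quota = {t: RR_INTERVAL for ts in threads_by_level.values() for t in ts}
        self.queues = {lvl: deque(ts) for lvl, ts in threads_by_level.items()}

    def pick(self):
        # Run from the highest non-empty level that still has quota left.
        for lvl in sorted(self.queues, reverse=True):
            if self.queues[lvl] and self.level_quota[lvl] > 0:
                return self.queues[lvl][0], lvl
        return None  # epoch over: reset everything and start again

    def ran(self, thread, lvl, used, blocked=False):
        # Charge time used against both the task's quantum and the level's quota.
        self.quota[thread] -= used
        self.level_quota[lvl] -= used
        self.queues[lvl].popleft()
        if blocked:
            return  # a blocked task rejoins this same level when it wakes
        if self.quota[thread] > 0 and self.level_quota[lvl] > 0:
            self.queues[lvl].append(thread)   # yielded: back of the same queue
        elif lvl > 0:
            self.quota[thread] = RR_INTERVAL  # fresh quantum one step down
            self.queues[lvl - 1].append(thread)
```

When `pick()` returns `None`, the epoch is over: a fuller version would move every thread back to its starting level and hand out fresh quotas, which is the reset step described here.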
It ends up at the end of the run queue for that level. Finally, if the level's quota is exhausted, then I take any threads that are still ready to run in that level and I move all of them downwards. And I continue this until either all the quotas are exhausted for every level, or there are no threads that are runnable, at which point I reset the epoch and move every thread back to where it started.

So let's see how this works. I have a bunch of threads with different priorities; I've colored them to represent their priority level, and I order them into these queues. Priority two, in this case, is my top priority; priority zero is the bottom priority. Let's say the round-robin interval is five units: five milliseconds, whatever. What this allows me to do is assign a quota to each level. So you'll see what I've done is I've decided to assign a quota of 15 to priority two; that's because there are three threads that are ready to run at priority two, and each of them has a quota of five milliseconds. Same thing for the other levels.

Now remember, if the quota for a level is exhausted, then I'm always going to move down. So how long is this scheduling epoch going to last, at most? Who can tell me? For this particular example. How many time units? I just add them up: 45 time units. That's the longest this epoch can take. It can be shorter; in fact, in many cases it will be shorter. But the longest it can take is 45 time units. This is really nice, because even before I start running the tasks, even before the tasks do anything, block, yield, whatever, I know exactly how long it's going to be before I reset the epoch. Now keep in mind, I also know that every thread will have a chance to run in this epoch, and it will have a chance to run for at least five time units. The higher-priority threads may have more chances.
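The quota arithmetic here can be checked in a couple of lines: each level's quota is the number of threads starting there times the round-robin interval, and the worst-case epoch length is just the sum. The thread counts (3, 4, and 2) are taken from this example.

```python
# Worst-case epoch length for the rotating staircase scheduler: the sum of
# the per-level quotas, where each quota is
# (threads starting at that level) * (round-robin interval).

RR_INTERVAL = 5  # ms, the round-robin interval from the example

# threads that start at each priority level (priority 2 is highest)
threads_per_level = {2: 3, 1: 4, 0: 2}

quotas = {lvl: n * RR_INTERVAL for lvl, n in threads_per_level.items()}
epoch_bound = sum(quotas.values())

print(quotas)       # {2: 15, 1: 20, 0: 10}
print(epoch_bound)  # 45: every runnable thread gets a turn within this window
```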
They may, or they may not. But what this allows me to say is: within a 45-millisecond period, every thread that's active will have a chance to run for five milliseconds. I can guarantee that. This is a very nice property for a scheduling algorithm to have.

So now what do I do? I start going round robin through the top queue. Let's say I run thread five; it runs, and it runs out of its quota. So it's exhausted its quota at priority two. What do I do with this thread, or this task? Where does it go? It goes onto the back of the priority one level. Now notice here that I have not changed the quota for priority one; it's still only 20 units. So is it guaranteed that thread five will have a chance to run at that level? No, because if every thread before it uses its entire quota, then the level quota will be exhausted, and I'll have to move it down into priority zero. So keep in mind: it's not guaranteed that threads that fall down the staircase from the highest priority will have another chance to run. But they may. Yeah? What's that? So, how are the initial priorities determined: using whatever the same external information is that any other scheduler would use. You might say, I want to make sure that my cron jobs run at the lowest priority and my video player runs at a higher one. So yes, the priorities are not determined by the scheduler; there's external information that the scheduler is responding to. That's a good question.

So now what do I do next? I run thread one, and let's say thread one runs for two time units and then yields. So where do I put this thread? It goes back on the end of the run queue; I'm running things round robin. Now I'm going to run thread two. Let's say thread two runs for three time units and then blocks. So that thread is now not runnable; you could just put it off the screen somewhere, it doesn't really matter. But it's blocked. Now, what do I do?
I run thread one, and let's say thread one also blocks. So now what am I going to do? I just gave it away. What thread gets to run next? Three.

So I just want to point something out: this is why the priority levels tend to work. You'll see that despite the fact that priority two still has some quota left at that level, there are no threads that are ready to run there, and so now I go down. This is what gives the threads that start at the top priority, and continue to exhaust their quota, more chances to run: in many cases I will not use up all the quota at a level running the threads that started there, so I'll have some left over to run threads that started at higher levels.

Okay, so now I start thread three. Thread three runs for a bit and blocks. I start thread eight; thread eight runs for a unit, and it also blocks. I start thread six; thread six exhausts its quota. So where does thread six go? Down into priority zero; it's falling down the staircase. I take thread nine, I run that for a unit, and let's say it blocks. And now here I am. So now I've got thread five, which started in the priority two queue, exhausted its quota there, fell down, got a new quota at priority one, and now, happily, it gets to run. So its high priority has paid off; it's now had 10 time units to use. Let's say it exhausts its quota. Now it falls all the way down into priority zero.

So now I'm out of threads to run at priority one. What would happen at this point if, let's say, thread one wakes up? Whatever it did that blocked, that call returns. What happens then? It would get to run again. Remember, the scheduler says: oh wait, hold on, there's something in priority two. I go back up the staircase.
And I'd run that. So every time I choose a thread to run, I'm looking at the current state of things. If a thread blocks, it comes back to the same priority level that it left. So I don't necessarily always proceed in order down the staircase; sometimes I have to jump back up to grab a thread that became runnable during the scheduling epoch. Okay, and it's pretty obvious what happens from here on out. Any questions about this? This is nice; I like this algorithm a lot. Yeah?

Right, so, yes: if it's sleeping when the epoch restarts, it is returned to the priority it started at and given a new quota. So when it comes back from sleep, it'll be put back on whatever priority queue it started on. At the end of each epoch I just reset everything; if a thread's not ready to run, it's not considered, but its quota is still reset, and it's put back at the priority level it started at. Great question. Any other questions about RSDL? Okay. Yeah?

Now, the levels are a parameter; they're a configuration parameter for the algorithm. What's that? No, the algorithm, remember, is O(1), and I think that's because of the data structures they use. All the algorithm needs to know is: what's the queue that has the highest-priority thread in it?
Right, and then I just choose that. So choosing this should be constant time, as long as I can maintain a heap or some sort of data structure so I always know what's the highest-priority thread that's available. That's a good question.

All right, so RSDL has a bunch of nice features. Like we pointed out, at the beginning of every epoch I know exactly how long it's going to be before every thread that's ready to run when the epoch starts has a chance to run, and this makes the scheduler really easy to model. The accounting is really easy: there was no need to measure weird things. All I'm doing is decrementing a count until it gets to zero and then resetting it periodically. So this can all be done in O(1).

And I can also use this interleaving idea. Rather than moving straight down, it's possible that I want to give a low-priority thread a chance not only to run at the bottom, but to run periodically throughout the epoch. So this is sort of a new version of the scheduler: there's a zero here if I have a chance to run, and a one here if I don't. Remember, Linux priorities are backwards; negative numbers are good. So you'll see that the thread with a nice value of negative 20, which is the highest priority, can run at every level. The thread with the nice value of 19, which is the lowest, can only run at the final level. But these other threads, they don't just run at the back; they run periodically throughout the epoch. This maintains the exact same properties of the scheduler, but it means I don't have to wait until the very, very end of the epoch to run low-priority threads. For example, you'll see that the nice 15 thread can run here, here, here, here, and here. Does this make sense?
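As an aside on that constant-time pick: one standard trick, used by the mainline O(1) scheduler, is to keep a bitmap with one bit per priority queue, so finding the highest non-empty queue is a single find-first-bit operation rather than a scan over threads. Here's a sketch of that idea in Python; it's my own illustration, not RSDL's actual data structure, and the names are made up.

```python
# Constant-time "highest non-empty queue" lookup using a bitmap, in the
# spirit of the Linux O(1) scheduler's priority bitmap.

NUM_LEVELS = 3

class RunQueues:
    def __init__(self):
        self.queues = [[] for _ in range(NUM_LEVELS)]  # index 0 = lowest priority
        self.bitmap = 0  # bit i is set exactly when queue i is non-empty

    def enqueue(self, level, thread):
        self.queues[level].append(thread)
        self.bitmap |= 1 << level

    def dequeue_highest(self):
        if self.bitmap == 0:
            return None
        level = self.bitmap.bit_length() - 1  # index of the highest set bit
        thread = self.queues[level].pop(0)
        if not self.queues[level]:
            self.bitmap &= ~(1 << level)      # queue drained: clear its bit
        return thread, level
```

In a kernel, that bit search is a single machine instruction on most architectures, which is what makes the pick constant time no matter how many threads are queued.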
That interleaving is just a little minor variation that reduces latency a bit more. So this is a nice description of what's good about this. And he points out something that we noticed earlier, which is that a lot of schedulers end up inadvertently penalizing interactive threads: because they only run for short periods of time, by the time the scheduler reruns them, a bunch of other threads have had a chance to run and have used a lot more CPU time, and then you end up trying to, as he puts it, "bonus them back" through this weird interactivity detection. This design does not suffer from that problem.

Okay, so now, sort of the sad part of the story. Remember, part of the problem here was that the desktop community didn't have benchmarks and well-defined goals. So he came up with this nice idea, but it was very difficult to actually show somebody a real, convincing proof that it was good. People liked it; they reported good performance on their machines. But there wasn't something easy to measure where you could say: this is really, really good. This was popular with users, but it never left his private Linux kernel tree.

Around the same time, Ingo Molnár, who was the maintainer of the 2.6 Linux scheduler (he may still be; he probably still is the maintainer of that subsystem), said: oh, by the way, maybe it would be a good idea for me to implement my own new scheduling algorithm. Right around the same time, he came up with something called the Completely Fair Scheduler, which was somewhat based on some of the same ideas. And at some point Con Kolivas got very frustrated and said: see, I don't want to be a part of Linux anymore; I'm tired of doing all this work and then not being able to get my patches into the mainline.

One really interesting point that Con made, which I think is actually a really useful design feature, was: does it really make sense for Linux to have a single
scheduler? Linux is this big community; it runs on a bunch of different types of devices. Wouldn't it be more appropriate to have an architecture where the scheduler was something the user could configure at runtime? There are other parts of Linux that are actually done this way. For example, the code that chooses the CPU frequency to run at: there are algorithms that you can change on your machine dynamically at runtime, and different communities use different algorithms. Android, for example, has their own algorithm that runs on their phones, which they wrote and which they think does a better job than whatever Linux already had. So Kolivas proposed this sort of pluggable scheduler architecture. Maybe this has happened; I don't know, I haven't checked recently, but it seems like kind of an obvious idea.

All right, so to the degree there's a moral to the story, I think it's: work on something that you find enjoyable, get really good at it, and have good ideas. You might say, well, this sounds like a failure; these ideas never made it into Linux. But you could also argue that this conversation, and the focus on interactivity, really helped shape the design of the Linux schedulers that are there. So I think Con Kolivas has clearly made some important contributions, despite the fact that he may not have any code in the Linux mainline.

So, after a couple of years... this is the part of the talk where it's not G-rated anymore. A couple of years later, Kolivas returned to the Linux community with a scheduler that he referred to as the Brain Fuck Scheduler. This is a scheduler that, as he put it, was designed to be forward-looking only. This was something kind of designed for devices like phones, and maybe embedded devices. It's desktop oriented, has really low latencies, rigid fairness, nice priority stripping, and extreme scalability within normal load levels. Explaining the
name, he said this is a scheduler that tries to sort of reinvent things and throws out a lot of what we think we know. It's ridiculously simple, and it performs really well despite being so simple. Now here he's sort of throwing up the white flag. He said: there's no way this will ever get into mainline Linux, and I don't care. So I just decided to write something that I'm almost positive will never get into mainline Linux, and that way I don't have to worry about it. I don't lie awake at night wondering whether Ingo is going to accept my patches; I know he's not, because I wrote it in a way that he wouldn't like.

It's designed to draw attention to the problems with the current scheduling algorithms. Its design shows that we need a pluggable scheduler architecture, that you can do a lot better with a scheduler designed for a specific purpose, and it means that better CPUs mean lower latencies; that's an aspect of BFS. And, this is my favorite reason: "I must be crazy." These are quotes, by the way; I didn't write these, I just want to point that out. This is Con Kolivas in his own words; this was his description of how he felt about rejoining the Linux community.

All right, I hope you guys have a great weekend. On Monday we will talk about virtual memory.