This is important for the video. Yes, Cub Scout Pack 55 has canceled their meeting. But we're here, so I don't know if what we're doing is more important than the Cub Scouts or not. But we can talk about that on Piazza. So the autograder is working now for assignment zero. I see that people have been submitting things. And if you're confused about stuff, please ask on Piazza or email us. But I think that the assignment zero stuff should work fine. And it can take a few minutes after you submit. If you submit something that works well, it might not take very much time at all. But if you submit something that doesn't work, then it might take a little bit longer. So if you sit there waiting for things to come back for 30 seconds, especially for assignment zero, because we're not doing much to your kernel at all, then maybe you broke something.

So I don't know if Kevin's here today or not. But Kevin saw me in the hallway yesterday and asked me, what's a good timeline to think about? Because there are no deadlines, so how should we allocate our time? So what I said is, my suggestion is that you work backwards from the end of the class. And I think our last meeting is April 26th or something. It's a Monday. And I'll decide and announce a certain point in time at which all of the grades on the website, for things like the scripts and the implementations, will all suddenly become final. And you won't be able to change them. I'll figure out when that is. But at some point, I'm going to hit that big red button. It'll probably be around the time the class ends, maybe a little bit later.

So here's my suggestion. My suggestion is you allocate about a month for assignment three, maybe a little more. So that takes you through April. So I would say I would want to be starting assignment three around the end of March, beginning of April. I would also give yourself a month for assignment two. Assignment two is easier than assignment three, but when you start assignment three, you will have done assignment two. So you'll be stronger and maybe able to move a little bit faster, a little more used to what we do. So I would leave March for assignment two. And that basically gives you a sense of where you want to be. It's early February; that's how I'm justifying it to myself. So I would try to have assignment zero and assignment one, I mean, assignment zero is trivial, but assignment one, I'd try to get that wrapped up by the end of this month. And look, the sooner, the better, clearly. If you guys want to get ahead, I'll get assignment three up today, hopefully. And some of the automatic grading will start to work for the other assignments pretty soon. So at some point, maybe next week, if you had stuff to submit, you could submit all these assignments and just be done. And then you could just come to class and enjoy life and work on your assignments for other courses, things like that. But again, this is really up to you. This is my suggestion, and I think this is a reasonable amount of time. And no matter where you are, the TAs will be helping you during office hours. And I hope we'll have more people coming to office hours and helping each other. But this is kind of the timeline I would be on. The other thing to keep in mind, of course, is that if you get ahead, you get ahead. Great. You can spend the rest of the semester relaxing, or scoring karma points by helping other students that haven't gotten to where you got.
If you get too far behind, then suddenly you're going to be coming into office hours in, like, April with questions about assignment zero. And the TAs will probably be like, yeah. So yeah, there's a disincentive to getting too far behind, because all the other people will have moved on and will be thinking about different things, and you'll still be like, how do I make a patch? But anyway, there are good instructions about this.

So any questions about the timeline, about the assignments? Again, as we get more things to work... I mean, now all the code reading grading should work. The script grading works. I just need to sort of reproduce some of the grading for the other assignments. But I have scripts to do that as well.

Does it work backwards? No, no, no. Sorry. When planning your semester, I would work backwards, right? I mean, if you do assignment three, in theory, you should be able to submit that for credit for assignment two, assignment one, and assignment zero, right? So you're welcome to try that. But I would probably work forwards through the assignments, but I would work backwards when developing a plan, right? Because you want to be at the end of assignment three by the time that I push the big button that seals and finalizes all the grades.

Yeah, about that. Yep, yeah, yeah, yeah. So if you look at assignment two and assignment three, both of those assignments include a fairly significant component that's a design document. I haven't really talked about that because I don't think anybody's there yet. And those will be graded by the TAs, probably similar to the code reading questions, two-shot grading. It's a big chunk of the assignment. It's a two-page PDF outlining how you're going to complete the assignment, right? I wish they could be longer, but there are a lot of you and there's a small number of us. But the idea is we want you to do some design before you get started, right? I'm not going to impose any sort of continuity on those design documents. I mean, if you want to write them after you wrote the code, I guess that's okay, but you will need to submit them and they will be graded at least once if not twice. But they're big chunks. And those are described on the assignment two handout. The assignment three handout also has a description of our expectations for the design documents.

Any other questions before we get going? Yeah. I think it is. Try it and see. So what was happening before, I think, was that blank answers were overwriting completed answers, and that, I think, is fixed. So if you fill out a couple and your partner fills out a couple and you both save them, the last person's blank answers for the things they didn't answer won't overwrite the first person's completed answers, right? If you guys are having problems with this, please let me know, right? Like, I couldn't reproduce pieces of that, and it was very difficult to figure out exactly the order of operations that you guys had. Nick, did you have a...? Okay, yeah, no, no, if you guys can, yeah. I mean, yeah, I tried to look at that, and I think I understand what was happening. But if I'm wrong, let me know, right? And I'm sorry about answers that vanished into cyberspace. Any other questions? Yeah. Yeah, oh no, I mean, it works for me too, right? But the problem is that there are people who it didn't work for, and clearly that's a problem. But if I can't reproduce it locally, I can't figure out exactly what happened, right?
Any other questions? All right, so we're going to skip the review today because I'm running a little bit behind, but do people have any questions about material that we covered on Wednesday? We started to talk about threads, a little bit about the illusion of concurrency. This was from there, so this is our mental model of threads, right? So a single process is like a kitchen in a nice restaurant, and the threads are the cooks. And I actually like this a lot. The more I thought about this, the more I thought, this is a nice metaphor, right? There aren't too many things that are wrong about this. There are some, and as with many metaphors, if you take it too far, things start to fall down, but this is a pretty good mental model to think about the differences between threads and processes, and to think about cases when you would want multiple threads as opposed to multiple processes.

All right, so this is a little bit of review, because we talked a little bit about this at the beginning of the semester, right? So we talked about cases where certain types of applications can naturally make use of multi-threading. And I wanna be careful here. So throughout this deck of slides, when I'm talking about multiple threads, I'm always talking about multiple threads in the same process, right? Clearly your system and most modern operating systems have multiple threads across the entire system. But in this case, when we talk about multi-threaded applications, we're talking about a single application with multiple threads, right? And we'll talk about ways that the kernel supports multi-threading. And I distinguish that from multi-processing. So all modern kernels, including the one you guys are writing this semester, support multi-processing, right? Meaning multiple processes. Your kernel this semester doesn't have any support for real multi-threading at the kernel level, so multiple threads per process. You could add that if you want to, but it's certainly not required, right?

So different types of applications, and again we've talked about this, have sort of natural ways that they can use multiple threads, right? So web servers can usually separate requests among threads, or sometimes something else. So that would be an example of what I would think of as kind of a vertical partition in my software stack, right? I have a thread, a request comes in at the top, and that one thread does all the work required to essentially collect all of the files necessary to serve the request, to do any sort of dynamic generation of content that's a mainstay of modern web design, and then at the end writes the resulting HTML response out over whatever socket is appropriate, right? So that again is kind of this example of this vertical partitioning, right?

You also have applications that do more of a horizontal partitioning, where they'll have a single stage of the application that's served by multiple threads, right? And sometimes we call that a thread pool, right? So if I have a stage in my application, like reading from or writing to disk, I might allocate 10 threads to do that, and they take requests, they go do it and complete the request, and so there's a little stage of the application that I exploit some parallelism in by allocating the thread pool. And Java, if you guys have programmed Java, or Python, or even C, probably has libraries to do this; thread pools are a pretty common abstraction, right?
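As a rough sketch of the thread-pool idea, here's a minimal version in C with pthreads. This is not code from the lecture; the names (task_t, pool_worker, submit) and the fixed-size queue are invented for illustration, and it leans on a mutex and condition variable, synchronization tools the course gets to later.

```c
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

#define NUM_WORKERS 4
#define QUEUE_CAP   16

typedef struct {
    void (*fn)(int);   /* the work to perform */
    int arg;
} task_t;

static task_t queue[QUEUE_CAP];
static int head = 0, tail = 0, count = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t nonempty = PTHREAD_COND_INITIALIZER;

/* Each worker loops forever: take a task off the shared queue, run it. */
static void *pool_worker(void *unused) {
    (void)unused;
    for (;;) {
        pthread_mutex_lock(&lock);
        while (count == 0)
            pthread_cond_wait(&nonempty, &lock);
        task_t t = queue[head];
        head = (head + 1) % QUEUE_CAP;
        count--;
        pthread_mutex_unlock(&lock);
        t.fn(t.arg);               /* do the work outside the lock */
    }
    return NULL;
}

static void submit(void (*fn)(int), int arg) {
    pthread_mutex_lock(&lock);
    queue[tail] = (task_t){ fn, arg };   /* sketch: assumes the queue never fills */
    tail = (tail + 1) % QUEUE_CAP;
    count++;
    pthread_cond_signal(&nonempty);
    pthread_mutex_unlock(&lock);
}

static void handle_request(int id) {
    printf("request %d handled by thread %lu\n", id,
           (unsigned long)pthread_self());
}

int main(void) {
    pthread_t workers[NUM_WORKERS];
    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_create(&workers[i], NULL, pool_worker, NULL);
    for (int i = 0; i < 8; i++)
        submit(handle_request, i);
    sleep(1);   /* crude: give the workers time to drain the queue */
    return 0;
}
```

The point is the shape: a fixed set of worker threads drains a shared queue, so one stage of the application gets its parallelism from the pool. Compile with something like gcc -pthread.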
I have some work that needs to be done, that work has latency associated with it, and I use threads to mask that latency and allow me to exploit some inherent parallelism in the machine, right? So web browsers might have separate threads for each tab. Frequently when you load a single web page, your browser will use several different threads to fetch different parts of the web page, right? When you load a single web page, not everything's in there; there are images that need to be loaded, maybe there's some JavaScript that needs to be run or whatever, so there's actually some multithreading going on even within a single request, right? To improve performance. And then frequently, you know, scientific applications, we think of, have these divide-and-conquer approaches that rely on what are thought of as embarrassingly parallelizable data sets, right? So frequently when I'm doing large-scale data processing, and this is the foundation of things like MapReduce, right, I can actually break these things down into much smaller pieces, process them separately without sharing a lot of state, and then merge the results together, right? So these are examples of different applications that might want to use threads. Did I hear a murmur over here? Okay, just hearing things today.

And again, this is all, I guess, a little bit of review, right? So we talked a little bit on Wednesday. So why not just write my web server in such a way that when I start it, it forks off a bunch of copies of itself and uses separate processes to process every request that comes in? Why wouldn't I do this? Or why doesn't Firefox, when you start it up, launch all sorts of different processes, like for every tab or something like that, right? Yeah, Brian. Yeah, so I mean, the big problem here is that communication is hard, right? And communication is hard because the kernel is defending processes from interfering with each other, right? That's one of the jobs of the kernel, remember, to protect processes from molestation, right? And to some degree, communication and molestation look quite similar, depending on whose perspective you look at it from, right? So defending processes from interfering with each other ends up making it more difficult for them to communicate. It forces more structure into the communication, right? So I have these IPC mechanisms, but they're all set up, you know, very carefully to produce some semantics that allow them to be used safely, right? Whereas in my own process, inside my address space, I've got some memory that's allocated to me. I've got six threads going on. They could do whatever they want, right? Any type of messaging paradigm or communication I want to, and in particular, communication using shared memory, is completely fine and okay, right? The operating system's not gonna help me do it safely, but the operating system also isn't gonna stop me from doing it, right?

And then the other thing with processes is that as I start to fork multiple processes, the state associated with those processes doesn't always scale very well, right? And this is a little bit of a mini review. So as opposed to a thread, which has pieces of state that are private to it, what is the state associated with the process that I wouldn't have to duplicate if I used multiple threads to handle my parallelizable job instead of multiple processes?
What's one piece of per-process state that I would be worried about having to duplicate a lot? Tim, remember, threads have registers and they have a stack, right? And then the process has what? Yeah. Yeah, so particularly the memory, right, is the big worry here. As I start to fork multiple processes, those forked processes are just not supposed to share memory, right? And the operating system plays some tricks to allow them to share memory safely as long as they're only reading from it, but to some degree, copying a process involves reproducing a fair amount of state, right?

It turns out, yeah, the question distinguished between the stack and the heap. Let's table that question for a month and we will talk about it, right? So the stack and the heap have different semantics in terms of how the operating system allocates memory to those areas, right? To some degree, though, the stack and the heap are abstractions that are set up by programs for their own use, right? There's no reason, for example, that threads, well, could they do that? They probably could; threads could probably have stacks in the heap if they wanted to, right? It's just typically not how it's done.

All right, so again, I have this per-process state that I don't want to have to scale, right? So I used this before, this assertion, to try to convince you that abstraction didn't require privilege, right? But we're gonna come back to it now. So when I start to think about how to implement threads, and this is an interesting design exercise, right? I want you guys to think about this as software engineers and designers. Where should I implement threads, right? So as I claimed before, threads can be implemented in user space using unprivileged libraries, right? And for a period of time, Linux did not support multiple threads in user processes. So Linux only saw one thread per process. Applications that wanted to use multiple threads within the same process had to use libraries like pthreads that allowed them to do this, right? But from the kernel's perspective, there was only one thread, right? So these user space libraries work by implementing switching between threads in a user space library, but to the kernel, again, all the kernel sees is one thread. So here's a process that has multiple threads. Those threads are implemented in some user space library, and all the kernel sees here is one thread, right? And that thread seems to be very busy doing all sorts of things all over the place, but the kernel has no idea that there's a pthreads library that's actually doing this, right? And if you look in the literature, this is usually referred to as the M-to-one (M:1) threading model, right? So M user space threads, some number larger than one, but to the operating system kernel, I only see one thread, right?

So threads can also be implemented by the kernel, right? So now Linux has the clone system call, which allows me to create a thread, right? And if I have a direct mapping between my user space threads, so the multiple threads that are in a single process (where's my little pointer? I don't know where that's pointing), and the threads that are visible to the kernel, this is called the one-to-one threading model, right? So again, let's go back to this: implementing threads in user space. How is this possible? Let's think about what I need to do, right?
So essentially implementing multiple threads means finding a way to switch between one thread and another. We talked before about the process of doing this when I enter the kernel, right? So when I trap into the kernel, one of the first things that happens is I save all this context, and we refer to this as a context switch. In that particular case, I'm switching from some user context, or maybe another kernel context if I'm processing a hardware interrupt, and I'm switching into the kernel, so I'm saving all the state. So how do I do this in user space? It's not a trick question, probably. Yeah, I mean, basically the answer is the same way, right? I need to have some code that saves all the registers when I'm going to stop a thread from running, and I need to make sure that when I start it again, things look identical, right? Or at least the registers and its stack look identical. And so I have a similar block of code to the one I showed you that's executed by your kernel, but that block of code is executed in user space, right? So it saves all the registers, repoints the stack pointer, then runs some code to figure out what thread to run, and it's saving all these thread states in various places, so it knows, okay, I'm going to run this thread that hasn't run for a while, I need to know where all the information associated with that thread is, right? And because I don't have to switch between processes, there's no kernel privilege required, right? Remember, what I'm doing here is reallocating resources that have already been allocated to my process, right? So the nice thing here is, because kernel privilege is used to isolate processes from each other, there's no need to get the kernel involved, and again, in the M-to-one threading model, the kernel doesn't even know that there are multiple threads, right? All the kernel sees is one thread, right?

So the things we have to think about are: how do I save and restore context? This is pretty fundamental, right? How can I stop a thread and start it again, right? And the C library actually has, how many people have ever used setjmp or longjmp before? Oh, okay, cool. This next example might blow your minds a little bit. This is fun stuff, right? So it turns out the C library actually has an implementation of this, right? Which is called setjmp/longjmp, right? setjmp saves the state, I think this is how it works, and longjmp returns to the point where the state was saved, right? We'll come back to preempting other threads. Let me show you this piece of code, right?

So this is pretty fun. So I've got a loop here, right? And this is a jump buffer, right? What do you think is in the jump buffer? Yeah, this holds the saved context, right? This is gonna be passed back and forth between setjmp and longjmp, right? So here's my loop. This is pretty basic C code, right? I'm going to loop over this 10 times. I'm gonna print the value of i, right? Now, when I get to i equals five, what I'm gonna do is save my state. And then if I'm coming back in here, so if this equals zero, this is the first time that I've saved the state; otherwise I'm going to print out this "restored CPU state". And let me show you what happens here so I can remind myself how this works. Okay, yeah, here we go. So what happens here is the following. I start at the top of the loop, right? When i equals five, I break out of the loop, right? So at this point, I'm down here, okay?
Then setjmp has saved the state of my thread at that moment, right? So I think setjmp, if I remember correctly, setjmp is like fork. If I call it for the first time, it returns zero to indicate that the state was saved. So the first time here, I save the state and I break out of the loop, right? Now I'm done here, right? I have this variable, restored, that I'm using just to decide whether or not I'm gonna jump back into the loop. And now when I run longjmp, I'm right back here, right? So this is pretty cool, right? So I come through here, I go through the loop, I print out the first five values of i, now I save state, I should've had a printf down here, right? I save the CPU state, I break out of the loop, I'm here, and now I call longjmp and all of a sudden I'm back in the middle of the loop with i equal to five, right? So it's like I was whisked away from my loop, right? It turns out that I changed the control flow when that happened. But when I call longjmp, I'm just right back to where I was exactly at the moment that call was made, right? Does this make sense to people?

Yeah, it is similar to a continuation. Oh man, wonderful PL things that I don't wanna talk about. Yeah, so essentially, you can think about this as a continuation, but what's happening here is this is, again, saving all the state necessary to allow me to return to that exact moment in time, right? So I'm just showing you this as proof that this is possible, right? I've never seen, like, real legitimate, I shouldn't say legitimate, I mean, I don't read a lot of C code for fun, right? I would be surprised if there were super legitimate uses for setjmp and longjmp other than, like, obfuscated code competitions, right? But this is still pretty cool. It shows you what you can do.

Yeah, so you guys are C programmers and you need to understand this stuff. Where is the value of i stored? Yeah, I mean, it is an int, right? But where is the value of i? If I was looking at my program and I said, i is an int, right? But, you know, let's say it's a machine with four-byte ints. So there are 32 bits somewhere that are i. Where are they? They're on this thread's stack. Oh, and you know, it turns out that the compiler also might have shoved them into a register somewhere, right? But in general, these local variables are allocated and deallocated from my stack, right? And maybe this is something we'll do next week in recitations, but if you guys start looking at some of the disassembly of your OS/161 binaries, you can see where space is being allocated for stacks, right? And this is a good thing to walk through if you guys haven't before, to see, you know, when I enter this loop, the C compiler will output instructions necessary to preserve enough space on my stack to hold all of the variables that are there, right?

All right, and it turns out there's one very interesting trick here, right? When I call longjmp, I pass this value, which is what should be returned from setjmp. What actually happens here is I jump back here, but setjmp now returns one. So that's why, when I call longjmp, I end up down the other part of this branch, right? So there's a small change that's been made to my state: the call to setjmp returned a different value. Other than that, all of the rest of my state is preserved, right? And again, this is just, like, something, I don't know.
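For reference, here is a reconstruction of roughly what the slide code must look like, pieced together from the description above; the variable names and exact print strings are guesses. Note that restored is declared volatile, since it is modified between the setjmp and the longjmp.

```c
#include <setjmp.h>
#include <stdio.h>

static jmp_buf buf;               /* the jump buffer: holds the saved context */

int main(void) {
    volatile int restored = 0;    /* volatile: changed between setjmp and longjmp */
    int i;

    for (i = 0; i < 10; i++) {
        printf("i = %d\n", i);
        if (i == 5) {
            if (setjmp(buf) == 0) {
                break;            /* first return: state saved, leave the loop */
            } else {
                /* second return, via longjmp: back in the middle of the loop */
                printf("restored CPU state\n");
            }
        }
    }

    if (!restored) {
        restored = 1;
        longjmp(buf, 1);          /* jump back into the loop; setjmp now returns 1 */
    }
    return 0;
}
```

Running it prints i = 0 through i = 5, then "restored CPU state", then i = 6 through i = 9: the longjmp lands back inside the loop with i still equal to five.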
If you have friends that are impressed by this, then you have some good friends. All right, let's see here. So, yeah, the other issue, yeah, one more. I think you could, yeah. I mean, if you wanted to jump back into a case statement or something like that. Again, do not use this function as a normal C programmer, right? C control flow is weird enough, right? This is just terrible, right? So if you start using this on a regular basis, then I would argue that maybe you need to rethink how you're writing computer code. Fibers? Fibers. Yeah, okay. Yeah, yeah, yeah.

But here's the other question with user space threading libraries, right? So now maybe I've convinced you that I can save and restore thread state in user space. But how do I preempt other threads? So how does the kernel preempt threads? Let's go back to kernel-level threading, right? How does the kernel keep threads from running forever? How does the kernel make sure that it has a chance to stop a thread and do something else? Yeah, it uses this timer interrupt that's generated periodically by hardware. And every time the timer fires, the kernel is going to run, right? And the kernel protects the operation of the timer and the code that the timer runs, the interrupt handler, from access by user programs, right? So if my kernel is correctly written, two things are true: the timer will always fire when the kernel wants it to fire, and the kernel can always control when the timer does fire, right?

So how do I do this in user space? Well, first of all, let me ask you this question. Can a user space program ensure that some piece of code always runs? Does it have any mechanism for doing this? Remember, let's say I'm gonna preempt another thread. So something's gonna happen and then I'm going to jump, it's gonna be very much like what the kernel does. I'm gonna jump to a memory address and I'm gonna start executing some code that's gonna return control to the user space thread scheduler. So threads in an address space, threads in a process, share what? Memory. Where is this code going to live? Can you say the same thing you just said? Memory. This code's gonna be in my process memory somewhere. Any thread can overwrite it, right? So no, basically the answer is that there is no way to do preemptive multitasking in a user space library. Well, I shouldn't say no way, but I don't think this is normally done, right? What has to happen is that the threads in the user space library have to cooperate and agree, for example, not to mess with each other, right? Because if a thread wanted to say, hey, I am now the most important thread in Apache, no other thread should ever run, it could just disable the signal or overwrite the signal handler or whatever, right?

But the actual way this is done is that we use signals delivered by the operating system. We talked about signals as a form of IPC, and many of you guys have probably used signals, or didn't know you were using signals, but a process can ask that the kernel deliver a periodic signal to it every so many ticks. It's very much like a timer, right? It's a similar sort of thing. The signal is a software construct, but when the signal is delivered, the user space program will jump into a signal handler and handle that signal. And normally this is how these user space libraries do scheduling, right? Sam?
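To make that concrete, here's a skeleton of the mechanism: the process asks the kernel for a periodic signal with setitimer, and the handler stands in for the user-space scheduler. This is just a sketch of the idea, not a real threading library; a real one would save the interrupted thread's context in the handler and switch to another thread instead of printing.

```c
#include <signal.h>
#include <sys/time.h>
#include <unistd.h>

/* Stand-in for the user-space scheduler: a real threading library would
 * save the current thread's context here and swap to the next runnable
 * thread. write() is used because it is async-signal-safe. */
static void on_tick(int sig) {
    (void)sig;
    write(STDOUT_FILENO, "tick: scheduler runs\n", 21);
}

int main(void) {
    struct sigaction sa;
    sa.sa_handler = on_tick;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    sigaction(SIGALRM, &sa, NULL);

    /* Ask the kernel to deliver SIGALRM every 10 ms, like a timer interrupt. */
    struct itimerval it;
    it.it_interval.tv_sec = 0;
    it.it_interval.tv_usec = 10000;
    it.it_value = it.it_interval;
    setitimer(ITIMER_REAL, &it, NULL);

    for (;;)
        pause();   /* the "running thread"; each signal interrupts it */
}
```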
No, yeah, so the question is, could I, in the middle of, well, I actually didn't want to go forward, right? So in the middle of the setjmp call, could I be preempted by the kernel? Yeah, and what would happen when I started running again? I would just finish the call, right? Whatever series of instructions, so imagine that whole big block of code that saves all the registers in the kernel. If I was running that in user space and I was interrupted, I would just start again, right? And hopefully, if no one else has run, then the state of my memory is the same, right? So yeah, I would just be restarted, right?

So let's think as designers and compare different ways of implementing threads, right? So I've got user space libraries that allow me to implement threads in user space. I've got kernel support for multithreading in more modern systems. So what do you guys think would be some of the advantages of user space threading? So implementing threads in user space, not telling the kernel about it, the kernel thinks there's one thread in my process that's just a very busy bee and never seems to block. What's the other, actually that's a good question, right? What's the other thing I need to be careful about if I'm running a user space threading library? So again, I've got one kernel thread. The kernel doesn't know that I've got multiple threads, and I actually have a couple of things that are going on in user space, but what do my user space threads have to be really careful not to do? What could they do that would cause this whole delicate arrangement to really fall down pretty badly? Yeah. Well, okay, yeah, certainly synchronization can cause problems with correctness, right? We'll talk a lot more about that next week, but what could they do that would cause this whole thing to fail? Hmm, that's not what I'm thinking of. That would be weird, Tim, yeah. Yes, if I do blocking IO.

So if I do a read call, right? Let's say I'm running my web server. I've got 30 threads in user space, but the kernel doesn't know that, right? The kernel just sees me. If I do a blocking read while I'm running, the whole process and every other thread will block until my read call finishes, right? There may be 29 other threads that could run and do useful work, right? But I have blocked the entire process, right? So using user space threading libraries typically means one of two things. Either I have to do explicit asynchronous IO, right? What asynchronous IO means is that when I ask the kernel to do a read, I don't wait for it to complete. I go back to executing instructions, and then I have to have some mechanism for finding out later that the read is done, right? So I can either do that explicitly, or my threading library may provide calls that do it for me, right? So it may provide a version of read that blocks my thread in the threading library, causes the threading library to do the request on my behalf, and then I get restarted. So this ends up being very, very similar to what happens in the kernel when I do reads. But right, if I block one thread in my multi-threaded user process, the whole process will block. And it doesn't matter whether or not there are other threads that could run, because the kernel doesn't know, right? The kernel has no way of knowing that, hey, if I just let this thing keep running, there are other things it could do, right?
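As an illustration of the explicit style, here's a rough sketch using the POSIX AIO interface (aio_read, aio_error, aio_return). The file name is made up, the spin loop is exactly where a threading library would run other ready threads instead, and on Linux this typically links with -lrt.

```c
#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    char buf[4096];
    int fd = open("data.txt", O_RDONLY);   /* hypothetical file name */
    if (fd < 0) { perror("open"); return 1; }

    struct aiocb cb;
    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf    = buf;
    cb.aio_nbytes = sizeof(buf);
    cb.aio_offset = 0;

    aio_read(&cb);                 /* issue the read; this returns immediately */

    while (aio_error(&cb) == EINPROGRESS) {
        /* The read hasn't completed yet. A user-space threading library
         * would switch to other runnable threads here instead of spinning. */
    }

    ssize_t n = aio_return(&cb);   /* collect the result once it's done */
    printf("read %zd bytes\n", n);
    close(fd);
    return 0;
}
```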
So I have to be very careful about synchronous IO, all right? All right, so pros to doing threading in user space. There are some pros, right? What are they? So imagine, I mean, I keep saying the kernel doesn't know that there are threads, right? But because the kernel doesn't know that there are threads, what's not happening? Yeah, Jeff. Okay, I don't have to request that the kernel create a new thread, that's true. What else don't I have to request the kernel do? Simon. Well, usually the kernel's not gonna help me protect threads against each other anyway. That's my problem as a programmer, right? What else do I not need to do? What is your, is your name Tau? Yeah, you guys are on the right track here, kernel requests, what do I not have to have the kernel do? Well, all the memory's gonna be shared by these threads anyway. Can we go to Tau over here? He doesn't know, yeah, Brian. Okay, so that's a great point, right? So I have more control over scheduling, right? Potentially, because I know more about the threads that are running, and I might have a specific scheduling policy I want. That's a good one, I don't think that's even up on the slide, but that's a good point. What else though? When I switch between threads, what do I not have to do? Well, I have to save the state, right? But what else do I not have to do? I don't have to talk to the kernel, right? Kernel multithreading means that every time I switch between threads, the kernel runs, right? In user multithreading, I'm switching between threads in my library and the kernel never gets involved, right? And it turns out that when I enter and leave the kernel, there's actually a fair amount more going on than just saving context; that's part of it, right? But it turns out that user space threading libraries can usually switch between threads much, much more efficiently than the kernel can, because I don't have to get in and out of the kernel. I can just let my threading library do it, right?

Okay, what about cons? What's a problem here, yeah? Yeah, that's a great point. So how many threads does the kernel think the process has? Let's say I have a four-core machine. Will this process ever run on multiple cores at the same time? No, because the kernel doesn't know there's more than one thread, right? The kernel would have to schedule it across multiple cores, and the kernel has no clue, right? So, you know, until I had multi-core machines, maybe this wasn't a big deal, but it is kind of a big deal now, right? Because I've got 16 cores in the machine and I wanna allocate three or four of them to my process. The kernel has no clue what's going on. The kernel just thinks it's a single-threaded application, and it's just gonna run it on one core, right? What else? What's another, that's a good, well, maybe that's enough. So I think we've got most of these, right? So the threading operations are a lot faster. Thread state is smaller, right? Because there's state that the kernel keeps that the applications don't have to. I can't use multiple cores, and there may be scheduling problems with the kernel looking at this, because I'm hiding information from the kernel, right?
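One way to see what a user-space switch looks like is the <ucontext.h> family (getcontext, makecontext, swapcontext), which saves and restores the registers and stack pointer in roughly the way described above. This is a minimal sketch of two contexts ping-ponging, not how any particular pthreads implementation actually works (and, to be fair, glibc's swapcontext does make one system call, to save the signal mask).

```c
#include <stdio.h>
#include <ucontext.h>

static ucontext_t main_ctx, thr_ctx;
static char thr_stack[64 * 1024];      /* stack for the "user-space thread" */

static void thread_fn(void) {
    printf("user-space thread: running\n");
    swapcontext(&thr_ctx, &main_ctx);  /* save our registers, resume main */
    printf("user-space thread: resumed\n");
}

int main(void) {
    getcontext(&thr_ctx);
    thr_ctx.uc_stack.ss_sp   = thr_stack;
    thr_ctx.uc_stack.ss_size = sizeof(thr_stack);
    thr_ctx.uc_link = &main_ctx;       /* return here when thread_fn finishes */
    makecontext(&thr_ctx, thread_fn, 0);

    printf("main: switching to thread\n");
    swapcontext(&main_ctx, &thr_ctx);  /* save main's registers, load the thread's */
    printf("main: back, switching again\n");
    swapcontext(&main_ctx, &thr_ctx);  /* resume the thread where it left off */
    printf("main: done\n");
    return 0;
}
```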
It's possible that the kernel could do a better job of scheduling me if it knew about the fact that I contain more than one thread, and maybe if it knew a little bit more about what those threads are doing, which it does once I start to expose information about those threads to the kernel, right? And again, a single thread can block the entire process if it's... So multiple threads in user space is really a highly collaborative, or cooperative, approach. There are old systems that actually had versions of what's called cooperative multitasking or cooperative multi-processing, which meant that, you know, instead of, remember, the kernel, this dictatorial thing: when your time quantum is up, it's going to run. Not all kernels were built that way. Some kernels would wait for the process to do a system call, or for a hardware interrupt to happen, or for the process to say, I'm done now, you can run somebody else, right? So there were systems that actually required that processes yield control. And I remember a friend of mine had an early version of the Mac. This was maybe, like, Mac OS 8 or something; it was way before the more modern versions of Mac OS. But when he would burn DVDs, the whole machine would lock up for, like, 20 minutes. Nothing would happen, right? It would just sit there burning the DVD, and then it would stop, right? And everything else would start to paint again. But it was literally because that app, for whatever reason, didn't ever yield control; it just sat there, right? So this is cooperative multitasking going wrong, and this is why we don't do it. But user space threading libraries have to rely on the threads to cooperate and not to mess with each other, because there's no way to do it any other way.

Yeah. Yeah, yeah, so if I use a pthreads library and then I make, like, a call to read, and I don't use, probably, a pthread version of read or something, right? If I just call bare read, then it'll enter the kernel and the kernel will put me to sleep until that read completes. But we're gonna talk about this more, you know, today and Monday. Mukta, the other question? Okay.

All right, okay, so now what about the one-to-one kernel threading model, right? I tell the kernel about every thread. I don't hide any information from it. So what about pros and cons here? Really it's, to some degree, the inverse of what we just did, right? Give me a pro. What would be better about this model? Somebody from the back of the room. Someone who hasn't raised their hand. Pro of this approach, all the way in the back. Yeah. Yeah, so okay, I can use multiple cores, that's one. What else? What's another pro of this approach? You. No, behind you. What's your name? Do you know? Okay. Any other, what about a con? A con, yeah, Brian. Yeah, so there's this context switch overhead, and you're right, actually, that's true. And because the state is bigger in the kernel, it's more difficult to support many, many, many threads. So in general, the pros here involve the fact that more visibility means the kernel can do better scheduling. I can schedule across multiple cores in a multi-core machine. I can put you to sleep properly when you block and wake you up, and you don't have to do the asynchronous IO on your own in user space. But in general, the context switch overhead is one of the big cons here, right? There are actually now what are called M:N threading library implementations, right?
They use, like, some fearsome mixture of these two approaches, right? So they use multiple kernel threads, but they also do some multi-threading in user space, and that to me seems very interesting, and probably something that would be difficult to get right. But you can do that if you want to, right? So there are threading libraries that will allow you to create lots of threads, and will tell the kernel about some number of them, right? Maybe enough to exploit the natural parallelism in my machine, right? So if I'm on a 16-core machine, I might tell the kernel I have 16 threads, but I might allow you to create, like, 64 threads, right? And I might do some switching in user space, some switching in the kernel, right? So I can merge these two approaches, right?

So at this point, yeah, yeah, so in general, in order to do any of this user space threading, what's really important is that I have support for asynchronous IO at the kernel level, right? So the kernel has to provide me... and this is particularly important with file IO, right? Why would it be so important with file IO? What's the problem with files and block devices and other things like that? So again, I would argue, you know, things like fork and exec are okay. I mean, first of all, exec, if I call exec, then who cares what happens to the other threads in the process, right? Because the process is going away, right? And fork, I guess fork is also a little bit weird, right? But why is it important with file IO? So Guru was saying, if I just had read, and read always blocked, then it would be a problem for an application, or a library, that was trying to support multiple threads in user space, right? The problem is that the whole process will sleep until the read completes. How long is that going to be? I'm reading from a file, right? And I could, again, just block the entire multi-threaded user process until that read completes. Why wouldn't I want to do that? What's true about the CPU with respect to other components on the system? Way faster. That IO is going to take forever to complete, right? And the probability is that if I have a multi-threaded application, there are likely other threads that could be running and doing useful work while that IO is completing.

So what I need is for the kernel to provide an asynchronous disk IO interface, right? Or some asynchronous system calls, right? And we talked about that before. So what this means is that when I call read, for example, I tell the kernel, hey, I want to do a read, but I don't block, right? I enter the kernel, I pass in the arguments, the kernel copies the arguments out of my address space, but as soon as it's done with that, I get to run again, right? Now of course, the problem is that with a synchronous read, what's nice is that I tell the kernel, I want data from a file to go here, and the way I know that the data is there is that the call completes, right? If the read completed successfully, and I have to check the return value, but assuming it completed successfully, the next instruction I run, I can be sure that the data I've requested is in that buffer that I gave the kernel, right? With an asynchronous call, I don't know when the read's gonna complete, right?
I get to run again, but now I have this communication mechanism where the kernel needs a way of telling me, or I need a way of asking, is the read done, right? And I think what these threading libraries would do is they would make an asynchronous call, right? So my thread would make a read through the library. The library would make an asynchronous read call, and it would put my thread to sleep in user space, right? It would periodically be checking, or the kernel would be telling me, is that read done? When that read is done, it would allow me to run again.

And in fact, it's a nice segue, because the next thing we're gonna talk about before we finish today, I hope, oh, maybe not. Let me just introduce these states, because Guru got us to a good segue, right? So we talk about threads, right? And this actually can be true both in user space and in the kernel, but we're gonna be talking mainly about the kernel's view of threads, right? And if I have one thread per process, sometimes I'll slip up, or you'll slip up, and you'll talk about the process as being in one of these states, right? But in reality, threads are in one of these states, not processes, right? If a process has multiple threads, its threads can each be in different states, right?

So, running. What do you think the running state is? Let's do this fairly quickly. Tom, if a thread is running, what does it mean? Yeah, these are not trick questions. It's executing instructions, right? It's scheduled on a CPU core, it's running, it's executing instructions, right? What about ready? What would a thread being ready mean? Matthew? It's, okay, so what is it not? It's not running. Can it run? Yes, there is somebody else. Yes, so this thread is not executing instructions, but whenever I want to, I could start it up on a core, right? This is a thread that can run. It's ready to run, right? It just happens not to be running. What about this? Waiting, blocked, or sleeping? These are synonyms, pretty much the same thing. Yeah, so this thread is not running. These are mutually exclusive states. It is not ready. It is waiting for something to happen; we say it's sleeping until that thing happens, right? And blocked is a similar way of saying it can't go any farther forward until something else happens, right? So this means it's not executing instructions and it's not able to be restarted, right? It's not running, it's not ready, right? And we'll finish the thread state transition diagram on Monday. Have a great weekend. Enjoy the snow, and I'll see you guys on Monday.