Unfortunately, there's no music for class today. Two reasons for that. One is that there's this terrible bus situation here at 8 a.m. You guys should all be happy that you're not required to attend. But also, it looks like someone broke off the, we ought to be okay with the problem. As they say in Clark's, I'm just sad that it's in this town. Anyway, we'll see if we get that fixed by Friday, otherwise I'll be sad. Okay, so today we're gonna finish talking about locks. We had just about gotten to spin locks last time. We'll talk about what to do if you don't spin. So we're gonna do these two styles of locks: spin locks, and sleep locks, or what are frequently just called locks. And then we're also going to talk about some synchronization primitives: condition variables, which are a signaling mechanism, and semaphores. And if you've started assignment one, these probably sound familiar, because these are the synchronization primitives that are used by your kernel, a subset of which you will have to implement. So it's good to understand what they do and why they do it. So, this is very similar to the slide on Monday. I won't make too many comments. I put up a poll on Piazza. It turns out we're maybe a day or two behind where we were last year at this time, and I'm wondering if that's my fault. So, if you think lectures are too slow, go online and vote. If you think they're too fast, go online and vote. And if you think they're just right, go online and vote. It's like Goldilocks. So anyway, I just want to, you know, again, I'm fine if we're going at a good speed and people are happy with it. I just want to make sure that we are, right? I think we'll still get through the material I want to cover. There might be a few lectures from last year that get dropped, but that's fine. But I just want to make sure I'm not holding us back, all right? Okay, so questions on Monday's material. We talked about concurrency.
We talked about atomicity. We talked about critical sections. We started introducing some hardware support for synchronization that we're going to finish today. Any questions about this before we do our little review session? Yeah, Brian? Sure. You could do that. So, anyone have a clue? So, let's say you're writing bank management software. You have some code that modifies account balances. And you have decided to lock around that piece of code, which we'll talk about today, to ensure that that's a critical section. That seems like it would work, right? I have a lock that, or, and again, we haven't really talked about locks, so Brian's getting ahead of us. But I have some way of preventing multiple threads from running in that section of code. And that works well, right? And then maybe you do some performance profiling and you find out that your bank management application is running quite slowly and threads are spending a long time getting in and out of that code block. So what would be a solution to this problem? No, you're writing this, right? Yeah, no, no, but I mean, what's, so what's the problem here, right? I have one big lock that I grab whenever I'm modifying any account. What is an alternative solution to this approach? Jeremy? Yeah, why don't I just create a single lock for each account that protects that account balance, and you guys will find this. So when you start programming your kernel later in the semester, there'll be cases where you'll say, well, should I use a single synchronization primitive, and how much area should that cover? So to some degree, the one big lock solution will work, but it introduces more protection than you need, right? Because any thread trying to modify any balance has to wait for every thread trying to modify any balance to finish, right? And that's actually a stronger thing than we need, right? What can happen, though, if you try to make your concurrency too fine-grained, right? Well, there's two things that can happen.
One is that you can get it wrong. The second thing is that you start to have a lot of overhead for all of these extra structures necessary, right? So I could say I could have a lock for every byte of memory, right? And then you wouldn't have any memory left, because for every byte of memory you'd have like 16 bytes for your lock, right? And that would be bad. Jeremy, yeah. I know that there's a... I mean, the locks and semaphores that we're gonna talk about, that you guys are gonna implement, are data structures that are implemented and live in memory, right? So when you start using them, you have to think about the fact that when you add a lock to something, you are introducing some amount of overhead. Depending on how you implement your locks for this class, your locks might consume, I don't know, 8, 16 bytes, probably more, because you have a string associated with them which is really just for debugging, right, but anyway, yeah. So there is some practical limit: as you get more and more fine-grained with your synchronization, you end up having a lot of memory overhead for the structures associated with it, all right? But that's a good question. And yeah, so you have to think about what's the right level at which to synchronize. We'll talk, and we're gonna go through some synchronization problems, hopefully, on Friday. So at that point, we'll give you guys some practice in choosing the right synchronization primitive. And since you asked this question, I'll try to rework one of the examples so we also can look at, you know, at what granularity to synchronize. Fair? Cool. Any other questions? Good question. All right, my computer has to wake up. And so why do we use concurrency? The illusion of concurrency is what? From Monday's class. It's both blank and blank. Bart, one. Powerful, yes. Bethany, two. Do we have two Bethanys? I don't think so. You got off the hook, yeah. Powerful and useful, right?
It helps us think about how to structure applications and it helps us hide latencies, right? Concurrency creates these problems. We identified two problems with concurrency that we're going to look at solving. What's one of them? Synchronization is a solution to these problems, not the problem itself. Anarok. Correctness, right? How do I ensure that my code is still correct when multiple things are happening at the same time? And coordination. That's the one that came up first. Coordination and correctness, right? How do we allow threads that are modifying the same data structure to communicate with each other? And how do we make sure that the answer that I get is correct when multiple things can be happening at the same time? Concurrency and atomicity, right? So remember, we talked a fair amount about the illusion of concurrency, right? And on a uniprocessor system, what is that illusion? Frank, right? On a single processor system, only one thing is happening at the same time, but concurrency gives us the illusion that multiple things are happening, right? Atomicity is also frequently an illusion, but what is atomicity the illusion of? Chaudhary. Yeah. You're the only one within like a few degrees of my finger, so. One thing is, OK, that's close, right? That one thing happened. Tam, you want to help? Atomicity melds a set of instructions or a set of operations into a single atomic, right? Atomic, indivisible operation, right? That operation is indivisible in the sense that it either happens or it does not happen. And it's frequently indivisible in the sense that we try to protect other threads from seeing any of the stuff inside while it's happening, right? OK? Assumptions you have to make about threads now that you're working on a multi-threaded system. Simon, what's one of the things we have to assume? That they can be stopped for an arbitrary amount of time. OK, that's good. What else?
Sam, compete for the same resource, but why would they be competing for that resource? OK, let's say I'm on a uniprocessor system, so they can't execute at the same time. So my thread can be stopped for an arbitrary amount of time. There's a hint of another answer in there. What? They can be executed in any order, and they can be stopped and started at any time, right? So your thread, at any point. And it's important to point this out, and hopefully we'll go through this in recitation a little bit. Even in C, a single, so C's a pretty low-level language, right? But even in C, certain statements may not actually be atomic, right? So you may think, well, I just set this variable to a value. Certainly, my program can't be interrupted in the middle of doing that. And that's frequently false, right? Sometimes what looks like a single instruction in C, even a very simple one, actually ends up being multiple instructions to the CPU. And your thread can be stopped in the middle of doing that. So don't use any C constructs to make assumptions about the atomicity of your program, because they'll be wrong, right? So in order to, we started talking about this idea of a critical section. It's an area of code in which only one thread is running at any time. We have three requirements for that critical section. Sarah, one of them. Or, well, they're up on the slide, I guess. What do they mean? She's like, I have all three. Yeah, that's the most basic property, right? What about progress? What do we mean by progress? How? Or someone who I'm calling How? Who's not actually called How? What is your name? Yeah? Shin. What is progress? Yeah, every thread gets a chance to get into the critical section, right? If I try to enter the critical section, at some point, I will leave, right? Performance, gent. Yeah, I want to get the critical section. So all of us want to implement critical sections. I don't want to have to do too much waiting, right?
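To make the "single C statement is not atomic" point concrete, here's a sketch. The MIPS-style expansion in the comment is an illustration of what a compiler typically emits, not the exact output of any particular compiler.

```c
/* Even a single C statement may be several CPU instructions. On a
   MIPS-like machine (what OS/161 targets), this line:

       balance = balance + 1;

   compiles to roughly:

       lw   $t0, balance    # load the shared value into a register
       addi $t0, $t0, 1     # bump the register
       sw   $t0, balance    # store it back

   The thread can be preempted between any two of these, so another
   thread may observe, or clobber, the half-finished update. */

long balance = 0;   /* shared between threads */

void unsafe_increment(void) {
    balance = balance + 1;   /* NOT atomic: load, add, store */
}
```

If two threads run `unsafe_increment` concurrently, both can load the same old value and one update is lost; that's exactly the race the critical section is meant to prevent.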
I don't want to slow the system down too much, right? And I want to keep the critical section small, right? Without sacrificing correctness, OK? So remember, a critical section, as we defined it, is a series of instructions that only one thread can be executing at a given period of time. And the set of instructions will look atomic with respect to other threads, right? So another thread will see all of the instructions inside the critical section as if they happened at once, right, in their entirety, without being able to observe state in the middle, right? All right, any other questions about this before we plunge forward? OK, so right at the end of class, we looked at these two atomic primitives that are provided by hardware, right? Remember we had this idea, particularly on multi-core systems, which is reality today, that hardware has to help us achieve atomicity, right? We talked about this design pattern of stopping other things from happening so that I can implement a critical section. We talked about the fact that that's completely broken on a multi-processor system, because multiple things can be happening at the same time, right? So we looked at several different flavors of atomic instructions that are supported by modern architectures that give us a starting point for building higher level primitives, right? Your OS 161 kernel does this. If you look at the spin lock implementation, it uses an instruction like this that David has implemented in the simulator, right, that gives you these properties. So there's various types of these. We looked at two, right? One is a test and set that atomically writes a memory location and returns the old value, right? So the idea is that the hardware ensures that this blob of pseudocode happens all at once, right? If multiple cores are running this simultaneously, one of these series of instructions will run first, and the other second. They will not interleave. And then compare and swap, right?
Which is a similar type of instruction that compares the contents of memory with some value that you pass in, and if they're the same, sets it to a new value. And if they're not, returns something else, right? So I should say many processors, right? So most processors provide some, you know, really the only way we can implement concurrency on multi-core systems is with some type of hardware support. So the hardware provides us with this little bit of goodness, right, this nice instruction that allows us to build these primitives on top. And again, if you look at your OS 161 code base, you'll see this, right? The spin lock implementation is directly reliant on an atomic, I think it's, I can't remember, is it a test and set, or is it a test-and-test-and-set or something? I don't remember; anyway, it's in there, you guys can find it. All right, so let's fix, remember we had this bank account example, let's fix it by using a test and set, right? Let's say we had this atomic hardware instruction, and I'm gonna pretend it's callable through C code somehow, which it could be. So what about this? Before, so remember, I had this critical section I identified in these three lines of code where I'm modifying my local copy of the balance and then writing it back to the global copy. And let's just set the test and set and clear it in and out of the critical section, right? And this works great, simple. And I'm done, right? Am I done? Does this work? Why not? Well, no, I think I've defined one critical section here that starts with this line and ends with this line, and I'm setting and clearing my test and set at the top level. I don't know if that's the answer I'm looking for, yeah, Bethany. So the question is this line, right? So I'm saying this is a test and set operation, and it atomically sets some global variable, and again, this would be like a memory address on the machine, to one. But so what? Broom, I think this looks good. Yeah, like does, what happens, so what?
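Since we're pretending the hardware instruction is callable from C, here's what the two primitives look like, sketched with C11 atomics standing in for the hardware instruction (the function names `test_and_set` and `compare_and_swap` are just the lecture's names, not a real API):

```c
#include <stdatomic.h>

/* test_and_set: atomically write 1 into *flag and return the OLD
   value. The hardware guarantees nothing interleaves between the
   read of the old value and the write of the new one. */
int test_and_set(atomic_int *flag) {
    return atomic_exchange(flag, 1);
}

/* compare_and_swap: if *loc equals expected, atomically set it to
   desired and return 1 (success); otherwise leave *loc alone and
   return 0. */
int compare_and_swap(atomic_int *loc, int expected, int desired) {
    return atomic_compare_exchange_strong(loc, &expected, desired);
}
```

A return value of 0 from `test_and_set` means the flag was clear when you grabbed it; a return of 1 means somebody else had already set it, which is the whole basis of the spin lock we build next.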
Like I set this to one and then I just keep going, right? Does this accomplish anything if there are multiple threads? So the first thread's gonna come in here, it's gonna set the value, and then it's gonna execute this instruction, and then it's gonna get context switched, and the next thread's gonna come in here, it's gonna set the value and just keep going, right? So when I use a test and set, I actually have to check the value of the test and set and do something, right? So, okay, so this is better, right? So now the idea here is that my test and set is one if there's a thread inside the critical section, and it's zero if there isn't, right? So I set it to one at the top, I clear it at the bottom, and what I'm doing here is I'm saying if the test and set is equal to one, that means what? Okay, if the test and set returns one, what does that mean? Something wrong, well, could be, right? Frank, yeah, that the critical section is locked, there's a thread inside the critical section, right? So now what do I have to do? If the value is one, my thread can't go in. So what do, Swetha, what do I need to do? Reset the value? Do I wanna reset the value up here? What would that mean? That would essentially mean that there is a thread inside, but I've said that there isn't. Yeah, Satish. I have to wait, right? I can't go in. There's someone inside the critical section, my thread can't proceed, right? So here is a solution. This will actually work, Jeremy, yeah. You're always like three slides ahead of us, right? Yes, so, okay, so what are the problems with this approach? So what does this do, right? This essentially says, I'm running this atomic test and set instruction; if it returns one, I know someone is still inside the critical section, right? And what do I do if it returns one? What do I do while I'm waiting? No, no, what about this code? What does this code do? I check it, and then what do I do next? I check it again, I check it again, check it again, check it again, check it again.
And maybe at some point, it will actually be cleared, and then I'll go, right? So essentially, this design pattern or this type of approach is called busy waiting, right? So whenever a thread is waiting for something to happen, in this case, I'm waiting for the other thread to execute the critical section, but I am not giving up control of the CPU, I'm not allowing other threads to run, I'm busy. I'm busy waiting, right? I am, you know, repeatedly just sitting there pounding on the door, you know, checking the value, checking the value. You know, it's like if you've ever been on hold, right? Like, what you really want is for them to call you back, right, when there's an operator who's there to help you, right? But that's not what you have to do, you have to sit there listening to the music over and over, right? So that's what this thread is doing. Sitting there, it's busy waiting, it's constantly checking, and so this is interesting. So on a multi-core system, this can sometimes be bad, right? So why would this be bad on a multi-core system? I've got two cores, I've got a core where a thread is just sitting there spinning. Yeah, Jeremy. Yeah, no one else gets to run, right? And potentially, if I'm doing this inside the kernel, and maybe I've disabled interrupts and I'm not allowing other threads to even be scheduled, I might sit there spinning for a while, right? What's wrong with this on a single-core system? Does this even work on a single-core system? What would happen if I ran this on a single-core system? What's the problem with this design pattern on a single-core system? Manish. Well, yeah, so that's the worst-case scenario, right? If I'm preventing anything from being scheduled, I could run forever, right? But why? Why would I run so long? Why won't, eventually, that be zero, right? Masakazu, what's gonna happen? I'm on a single-core system. There's one processor. I'm on the processor. I'm spinning, waiting for the value to change. Wembley.
Is it 30 of them? The thread that's inside the critical section can't run, right? I've got the CPU. So I'm waiting for something to change that will never change as long as I'm waiting. As long as I keep spinning and banging on the test and set, the other thread will never get a chance to clear it, right? So on a single-core system, if I was never descheduled, this could be a deadlock, right? I might sit there spinning forever. On a multi-core system, eventually, the other thread would run, probably on another core, and clear the test and set, right? But I'm still waiting, right? Right, if I'm preemptible, right? If I'm not in the kernel where I've turned off interrupts and I'm not allowing the scheduler to run, right? But even if I'm preemptible, right? What's still gonna happen? Let's say that the kernel will eventually schedule another thread. What have I done with my entire quantum of time? I've wasted it, right? Because this will never change while I'm waiting, right? On a single-core system, as soon as I check the test and set and see that it's one, I should stop trying and let somebody else run, right? Because I can sit there waiting for another 10 milliseconds or whatever the scheduling quantum is, but I know it's never gonna change, right? The only way this makes sense is on a multi-core system, because it's possible that the other thread in the critical section is running on another core, and so that value can change, right? So again, this is kind of what happens on a busy waiting system with preemption, which is what you're talking about, right? The first thread runs, it comes in, it sets the test and set, and then my blue thread just sits there in this loop, right? Until it's descheduled, exhausts its quantum, and then finally the red thread will finish the critical section and the blue thread will be able to run, right?
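Putting the pieces together, the busy-waiting solution we arrived at looks like this: a tiny spin lock built on test and set (again sketched with C11 atomics standing in for the hardware instruction; the `spinlock_t` name and functions are illustrative, not the OS/161 ones):

```c
#include <stdatomic.h>

/* A minimal spin lock built on test and set. The lock word is 1
   while some thread is inside the critical section, 0 otherwise. */
typedef struct {
    atomic_int held;
} spinlock_t;

void spinlock_init(spinlock_t *l) {
    atomic_store(&l->held, 0);
}

void spinlock_acquire(spinlock_t *l) {
    /* Busy wait: keep atomically writing 1 until the OLD value we
       get back is 0, meaning nobody held the lock when we grabbed
       it. On a uniprocessor this loop burns the whole quantum,
       since the holder can't run while we spin. */
    while (atomic_exchange(&l->held, 1) == 1)
        ;  /* spin */
}

void spinlock_release(spinlock_t *l) {
    atomic_store(&l->held, 0);  /* clear: the section is free again */
}
```

The acquire loop is exactly the "check it again, check it again" pattern from the slide, just wrapped up with a name.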
So essentially what we're implementing here with this very simple test and set operation is a very common synchronization primitive that's called a lock, right? And locks can be used to establish critical sections; they can also be used to protect data structures and in other ways, and we'll talk about some of those other things on Friday, right? But for now, we're gonna talk about locks as a synchronization primitive that we use to implement critical sections. So at the top of the critical section, I acquire a lock. At the bottom of the critical section, I release the lock. Any other thread that's trying to acquire that lock will have to wait for me to exit before it can proceed, right? And essentially what we've implemented today in this little blip of code is called a spin lock. Why is it called a spin lock, AJ? It's called a spin lock, Tim. Yeah, so it's a lock because it guards a critical section. Oops, there's another date reference that I should have scrubbed; today's Wednesday. And spin describes how this lock is acquired, right? I acquire this lock by busy waiting. I acquire this lock by repeatedly performing some operation until it returns the value I was looking for, right? So, Brian, yeah. Yeah, you, so. You can guarantee progress, right? The only way that this would break would be if you had a single core system and you had disabled interrupts so you couldn't be descheduled, right? In that type of system, you could deadlock on this, right? And in fact, if you guys look at the code for the spin locks that is in your kernel, right? Spin locks are implemented for you. You guys are gonna use them to implement some other primitives for assignment one. That code checks to see if another thread on the same CPU already holds the spin lock. And if it does, it dies. Because it essentially says, I'm not gonna make any progress, right? Someone on my core has the spin lock.
I'm trying to get the spin lock, and I'm gonna prevent him from running, right? In reality, maybe that thread would be moved to another core, but a lot of times on modern systems, there's a fair amount of affinity between threads and cores, right? So, the operating system might not move my thread to another core, right, until something happens. So, it might sit there. So, two threads on the same core trying to acquire the same spin lock can produce what's called a deadlock. We'll look at that later, right? But in most cases with spin locks, as we've implemented them, you can make progress, right? You may not make it efficiently, and we'll talk about the difference between spin locks and more traditional sleeping-based locks in a couple of slides, right? That's a good question, yeah? Uh-huh. Yeah, so what'll happen? I have one thread that has acquired the lock, right? And I have two threads that are spinning, that are waiting for it, right? So, what's the first thing that has to happen? Well, okay, so I have two threads spinning. I have one thread holding the lock. What's the first thing that has to happen? In order for that to happen? Yeah, Frank, it has, yes, in order to do that, it has to do what? In order to do that, it has to do what? Run, right? It has to be able to run, right? So that's the first thing. That thread has to be able to finish. And then at that point, another thread will grab it, right? One of the two spinning threads will grab it. It'll depend on the scheduler, in order, who gets it first, right? It'll run, and the other thread will also get it. So, okay, this is a good question. So, in general, let's say there's always contention for the lock, meaning there's always somebody holding it and there's always some number of threads that are then spinning, right? The question is, can you guarantee that all those threads will eventually get a chance to have the lock? The answer is, in the limit, yes, right?
I mean, if for some reason there's some unlucky thread that keeps never being scheduled, right, and never gets a chance to run, then it could happen. And actually, when we get into scheduling, we'll look at an example where there can be bad interaction between scheduling and lock acquisition, right? So there are certain cases, for example, where if I prioritize threads, then a low-priority thread that's trying to get the lock might never get a chance to acquire it, right? But in general, I don't know, it's difficult to guarantee that type of property, right? But that's a good question, yeah, that's... Okay, cool, yeah? You mentioned in the previous slide that a single core system will not be able to progress because the spinning thread will keep bombarding it. Yeah. Right. Yeah, the spinning thread is running, right? Spinning is a type of running on the CPU, right? A thread that is spinning for the lock is executing a series of instructions on the CPU repeatedly, right? Those instructions are frequently a very tight loop that involves testing the test and set over and over. Right? But it's just, it's a subclass of running, right? It just happens to be executing a very specific set of instructions, yeah. So the number of threads that can spin is limited by the number of cores you have, right? Yeah, okay. Oh, yeah. Yeah, that's bad. Yeah, so it could exit in the critical section, it could die in the critical section. Again, I mean, these things are, and that could crash your whole system, right? So that's actually a variant of deadlock that we'll talk about later, right? In general, you know, again, when we talk about these kernel primitives, yeah, you gotta be careful, because if you grab a lock and then die, I think modern kernels have ways of unwinding locks when the thread exits to try to make sure other threads make progress.
But in general, if I grab a lock and I'm in the middle of doing something and I exit for some reason, then who knows what the state of the world is, right? Like I was trying to perform some atomic series of instructions, I might have only gotten halfway through, right? So if I die while I'm holding a lock, then the system could take the lock away from me and allow other threads to run. It's not clear that that's safe, right? It might be, in certain cases, the only thing I can do, right? If the operating system says, okay, this thread was running in the kernel and it died while holding a lock, I could just let every other thread that's trying to get that lock deadlock, or I could try to go on and hope that things are okay, right? It's not a good set of options, but some kernels may choose to just try to keep going and hope that things are okay. Some might just say, you know what? Tough, I'm just gonna die now, right? Good question. These are, I don't know, good questions. All right, so let's modify our bank example now to use a lock, right? So what do we need to do? We need to identify the critical section. We did that already. We found these three lines of code that we need to protect with a lock, right? And then we need to lock around it. And this is the way to think about how to protect data, protect critical sections, in your own code for this class. Identify the critical section, lock around the critical section, right? In general, this gets you a long way, right? When you start acquiring multiple locks and there are complicated data structures, you have to be a little more clever, and we'll talk a little bit about that when we talk about deadlock. But in general, this is how we do this, right? So we've identified the critical section. I need to create some sort of lock structure, and I would have to also initialize it, but let's say it's initialized, and now I just acquire and release the lock around the critical section, right?
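The pattern on the slide, identify the critical section and lock around it, can be sketched like this with a pthread mutex standing in for the kernel lock (the `account` and `deposit` names are made up for illustration; this is the fine-grained per-account version from the earlier discussion, with the lock living inside the account it protects):

```c
#include <pthread.h>

/* One lock per account, embedded in the structure it protects. */
struct account {
    pthread_mutex_t lock;   /* protects balance */
    long balance;
};

void deposit(struct account *a, long amount) {
    pthread_mutex_lock(&a->lock);    /* acquire at the top of the critical section */
    long local = a->balance;         /* read the shared balance */
    local += amount;                 /* modify the local copy */
    a->balance = local;              /* write it back to the global copy */
    pthread_mutex_unlock(&a->lock);  /* release at the bottom */
}
```

The three middle lines are exactly the critical section from the slide; threads depositing into *different* accounts never wait on each other, which is the granularity win over the one-big-lock design.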
So again, this is kind of basic stuff, right? But Jeremy, yeah. Well, okay, so I've declared this as a global variable and I would have to initialize it, right? Yeah, I just left that out, right? But on your system, don't try this at home, right? Well, you can, it will just panic, right? Because this will be garbage or null or something bad, right? So in general, yeah. But in general, this doesn't work. I would have to have lock init somewhere in some boot-up code, right? So actually, if you look at your system, one of the things you guys will have to do in future assignments is make sure you write this initialization code so that at boot time, your kernel allocates any sort of global synchronization data structures that you need, right? So you'd have to allocate a lock and initialize it properly, right? Normally that just happens once at boot, and then I have the lock to protect that, and then I just use it throughout the rest of the time the kernel's running. Good question. So if I call lock acquire while another thread is in the critical section, the thread acquiring the lock has to wait, right? And essentially, we looked at one pattern for waiting, which was spinning, right? So we had this busy waiting approach, okay? There is a more frequently used, probably, and in many cases better pattern for waiting. It's different. We'll talk about why to use each one. Which is passive waiting, or sleeping, right? And sleeping, waiting for something to happen, essentially means that I need a way of telling the kernel, this is what I'm waiting for. And then I tell the kernel I would like to be descheduled, right? I mean, before I ask the kernel to deschedule me, I tell the kernel what I'm waiting for, and then the kernel puts me somewhere while I wait, and then when that thing changes, the kernel will awaken me and allow me to run again, right? This is the design pattern for sleeping; on your system, these are implemented using something called a wait channel, right?
So, and again, now we have these two approaches, right? We can actively wait, we can sit there spinning waiting for something to happen, and we can also sleep, right? There are actually, and this is an interesting sort of design decision, there are cases when either one of these things can be the right approach, right? So there are cases in real operating systems where I use spin locks, and there are cases where I use sleep locks, yeah. What do you mean? I mean, it could be like two threads which are both spinning. Yeah, I mean, either one of these locks allows an arbitrary number of threads to enter and leave the critical section. The question is, while I'm waiting, what do I do, right? Yeah, I mean, it can be preempted and sent back to the ready queue, right? Yeah, I mean, in the general case with spin locks, unless you've done something to disable this, right? Your thread can be descheduled and then run again later, right? And as soon as it starts up again, it'll start checking the test and set again, right? I mean, yeah, say we have disabled it, so will that be considered sleeping or still spinning? Are you using the CPU? No. Then you're sleeping. If while you are waiting, right, you are actively checking the test and set over and over again, right? The thread that goes to sleep will not run again until the condition changes, right? So I see what you mean. If I was running and I was checking the test and set and I was descheduled, I'm spinning, right? And I start up again, because I haven't told the kernel, like I haven't gone to sleep, right? The kernel has descheduled me, but I haven't told the kernel, this is what I'm waiting to happen, and only wake me up when it happens, right? Yeah, that's a good question, right? So even if I'm not running, but when I start running again, I'm checking the test and set over and over again, that's spinning, right? I'm using my time actively to check the variable over and over again, right?
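The passive-waiting pattern, tell the kernel what you're waiting for, sleep, get woken when it changes, can be sketched in user space like this. Here a pthread condition variable plays the role of the wait channel; this is an illustration of the design pattern, not the OS/161 implementation, and the `sleeplock` name is made up:

```c
#include <pthread.h>

/* A sleep lock: waiters give up the CPU instead of spinning. */
struct sleeplock {
    pthread_mutex_t m;      /* protects the held flag */
    pthread_cond_t  freed;  /* "what I'm waiting for": the lock being released */
    int held;               /* 1 while a thread is inside */
};

void sleeplock_acquire(struct sleeplock *l) {
    pthread_mutex_lock(&l->m);
    while (l->held) {
        /* Someone is inside. Sleep: we are descheduled and use no
           CPU until release() wakes us, unlike the spin loop. */
        pthread_cond_wait(&l->freed, &l->m);
    }
    l->held = 1;
    pthread_mutex_unlock(&l->m);
}

void sleeplock_release(struct sleeplock *l) {
    pthread_mutex_lock(&l->m);
    l->held = 0;
    pthread_cond_signal(&l->freed);  /* wake one sleeping waiter, if any */
    pthread_mutex_unlock(&l->m);
}
```

Note the waiter re-checks `held` in a loop after waking; between the wakeup and actually running, another thread may have grabbed the lock first, which is the same point made above about what happens after you're woken.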
I'm not using the feature of the kernel that tells me when something has happened, right? So these are good questions, and I think they've prepared you guys to answer this question. So when do you guys think that I should spin, right? So sleeping sounds like a really nice idea, right? I can leverage this feature of the kernel to essentially make sure that I don't waste any CPU time while I'm waiting for whatever it is to change, right? But why would there be cases where I actually want to spin? Yeah, Sean, okay, you're getting towards an answer, right? Why, so what are you getting at? If I'm spinning, what am I kind of certain of? Right, so remember, getting from running, so what do I need to do in order to get from running onto the sleep queue? What has to happen? What has to take place? Yeah, and what do I have to do to deschedule a thread? I have to perform a what? A context switch, right? And there's overhead to performing a context switch, right? So we're getting towards the right answer, or an answer. When is a case when I would not want to go to sleep? I have all the stuff I have to do to put the thread to sleep: I have to change the state of the kernel, I'm gonna pick another thread to run. I might spin if what, Jeremy? Okay, that's, yeah, that's another subcase, but I think it's something a little bit different here, right? Frank, right, so we'll come back to that one, right? If the critical section is short, right? Because if the critical section is short, if there's somebody inside of it, right? And also I would say if the critical section is short and there isn't a huge amount of contention for this critical section, right? Because if the critical section is short and there aren't that many threads also banging on it, the likelihood is I won't spin for very long, right?
Because I'm only gonna do this a few times; on a multiprocessor system, somebody else will be running, that thread that's inside the critical section is probably running on another CPU, so I'm just gonna sit there poking at it a few times, and quite quickly the critical section will become available and I'll enter, right? The reason I don't want to sleep is it might actually take me longer to context switch, be put on the sleep queue, and then be context-switched back in. It might be more efficient from the perspective of the machine to just wait by checking the test-and-set over and over. So when I have very, very small critical sections, I can use the spin lock to protect them, and I can do that efficiently, right? Is there a question back there? What is your name? John, okay. Okay, okay, great question. Who makes this decision about whether or not to use a spin lock or a sleeping lock? The programmer, right? The kernel doesn't do this automatically, right? There might be profiling tools that you could run to try to help you make the decision, but in general, this is a decision that kernel designers make, right? When they're writing various pieces of code, right? They look at a piece of code, they say, okay, there's only a few instructions here, right? So I can use the spin lock, right, yeah. Yeah, so, well, okay, so you could implement some of these in user space, actually. Some of these primitives, that's a good question. I don't think test-and-set is a privileged instruction, but it might be. Can you ask that on Piazza? Because I'll look into it. I don't see any reason that a test-and-set would have to be a privileged instruction, right? It doesn't modify any privileged state, right? It's just a nice feature that allows multiple cores to coordinate, right? So, good question, yeah. I think that you could use a test-and-set in user space. Yeah, yeah, okay, so this is another good question, right?
Let's say, you know, I mean, is there a case where a thread could be stopped in the middle of a critical section, right? Does anyone want to venture a guess? I mean, there's two answers to this question, and one is pretty categorical, right? So if I was guessing, if I was a guessing man, what would I say? Done. Is there a case when a thread could be stopped in the middle of a critical section? Oh, hmm. Jeremy. I don't believe... I'm still looking for a strong answer here. I mean, in general, the answer is, of course, right? Because you guys are defining critical sections, right? So what could happen in a critical section that would cause the thread to stop running? Yeah, it might do disk I/O, right? So actually, this is another case where you would never want to use a spin lock, right? So let's say I grab a spin lock, and the first thing I do is initiate some disk I/O, right? Which takes, remember, how long does disk I/O take on your system? In rough terms? Forever, right? Like, that's a technical definition of how long disk I/O takes from the perspective of the CPU. So I've got a spin lock now, and now I'm gonna write a bunch of stuff to disk, right? So someone else is gonna try to grab that spin lock, and what are they gonna do? Gonna sit there, exhausting every one of their quanta, forever, hammering the test-and-set, right? So yes, if you do anything that would cause you to sleep, don't grab a spin lock first, right? Never hold a spin lock across something that sleeps, right? Because someone else will waste a huge amount of time, right? What else could happen inside a critical section that would cause you to stop running? Yeah, sure. So okay, what did we say about test-and-set? Who provides the atomicity of the test-and-set instruction? Right? The hardware, right? So remember, I wish I could go all the way back, but I don't want to. That code I showed you, maybe I shouldn't have used C code to show that, but it's executed atomically by the hardware itself.
So there's no way to be interrupted between the test and the set. Those happen at, like, the same time, right? You can think of them as happening at the same time. Yeah, yeah, Sumit. Who implements the test-and-set instruction? It's implemented by what? Are you saying it's the hardware? Hardware, right? So, I just want to answer this question first. So it's a hardware instruction. So what is hardware implemented by? Like a bunch of ICs, and it's an instruction offered by the CPU, right? There's no code in it. What's that? But it's not even hard-coded, it's hardware, right? Hard-coded, what does that mean, hard-coded? Yeah, it was coded by somebody at Intel, right? Maybe in their simulator, right? But it's baked into the chip, right? When you run that instruction, some series of transistors, some series of things will fire, and those things will happen at the same time, right? So it's like an add or a subtract or a jump. It's a hardware instruction, right? So I don't like the word hard-coded. It sounds like a programmer did something, right? And they didn't. Yeah, Spencer. Are you saying the CPU did something? It is one instruction, yeah. It wouldn't even, no, no, the compiler wouldn't compile to that. The compiler would generate a bunch of instructions. I'm saying there is a single hardware instruction that implements that sequence. So on MIPS, there's, I don't know what it's called offhand, but there's a single instruction that performs that series of operations, right? The compiler wouldn't generate it. The compiler's not smart enough, right? How would you generate this instruction if you were programming a kernel? Does anyone know? You can look in your OS/161 code to see how to do this. There's a way, when you write C, to include little bits of assembly code. And that's probably the only way to do this properly, right? Because you have to tell the compiler, I want to use the test-and-set, and you do that by including a little piece of assembly code.
So you tell it exactly which instruction to use. All right, these are great questions. Were there more? I thought there were more hands up. Okay. So when we think about spinning or sleeping in general, what we're trying to do is balance the length of the critical section against the context switch overhead, right? So in this case, the critical section is very short, right? Thread one comes in, tries to get the lock, fails, decides to sleep, right? So all this context switch overhead occurs, and now it's running down here, when it could have entered right here, right? So this is a case where, because of the length of the critical section, a spin lock was the right choice. I used a sleep lock, and I paid the price of doing a context switch when I didn't really need to, right? If the critical section is long and I use a spin lock, then this is the problem I have, right? So now what's happened is this thread is inside the critical section, and maybe there's just a lot of instructions to execute, or it's doing disk I/O, or whatever. It's just taking forever. And CPU two is now occupied by this thread spinning, right? So this is the what-not-to-do slide, right? Critical section short, I used a sleep lock, I paid the context switch overhead, didn't need to. Critical section long, I used a spin lock, banged on the door for a long time, didn't want to. Yeah, sure. I have two thread ones, yeah, so it should be thread one and thread two, right? Yeah, yeah, yeah, bugs in the slide. Everyone, it's like a bug hunt, you guys out there. Problems, yeah, a lot of problems. Frank, good question. So what is this relative to? Right, I say short and long, but what am I balancing it against? What, specifically? So that would make it longer or shorter, right? This was just up on the slide, someone should know it. Nothing? It's the context switch overhead, right?
So, you know, what I'm saying is, if I busy-wait, what's the probability that I'm gonna get the lock pretty quickly, right? If I context switch, I know it's gonna take X amount of time, right? Because I'm gonna go all the way into the kernel, and I'm gonna sleep, and the kernel's gonna put me somewhere, and I have to save all my state, right? So it's really the context switch time, the context switch overhead, that creates the balance point between a spin lock and a sleep lock, right? This also depends on lock contention, which I didn't get into because I don't wanna blow people's minds, but you really have to think about how long will I have to spin, right? If the spinning wait gets really long, I should sleep, right? And I should let the kernel tell me when to wake up, right? If the spinning is pretty short, then it might be worthwhile, right? It's just a few extra instructions and I can get the lock and keep going, right? Yeah. Yeah, yeah, there is. And I mean, on modern systems, if I was a big Linux developer, I'd probably know almost exactly how many instructions it takes to perform a context switch, right? And they've probably been trying to reduce that number for a long time. So, in general, and again, your kernel provides a version of this. Remember we talked about how to sleep: we said I have to tell the kernel what I'm waiting for, right? And normally, kernels provide some type of primitive that allows me to say, I'm waiting for this particular thing, right? A key value or, again, in OS/161, this is the idea of a wait channel. It's like a line that you get in, or a queue that you get on, that's associated with some event, right? It could be associated with a lock. It could be associated with a piece of hardware that is periodically delivering interrupts, or whatever.
But the idea is that there's a semantics associated with it. So when the thing associated with this channel or this key happens, the kernel will wake up some number, or all, of the threads that are waiting, right? So the idea is I say to the kernel, I'm waiting for this thing. And the kernel knows what that thing is. And then sometime later, someone else will say, this thing happened. And the kernel will say, okay, I've got a bunch of threads over here waiting for that thing, I'm gonna let them all run again. And I'm gonna put them into the ready queue and mark them as ready, right? And this goes back to the question somebody asked earlier about doing this in user space. So a certain amount of this can be done in user space. And you guys know this, right? Because you use languages that provide synchronization primitives, right? We're focusing in this class, because it's a class on operating systems, on multi-threading within the operating system. But almost any user-space application is multi-threaded, and they have the same problems and they use the same primitives, right? They may be implemented slightly differently, but in general they really are the same, right? So if you use Java, you have locks, you have monitors, you have various types of synchronization primitives; all modern languages provide these, because they're necessary to do multi-threaded programming safely. All right, we're right at the gate to CVs. So I'll stop today, we'll do CVs on Friday, and I will see you on Friday.