Good morning. Friday. End of synchronization week. OK, now that's a good idea. Let's have a round of applause for Friday. All right. Friday, we all get a couple of days off. So today, we're going to finish up synchronization week. And we're going to finish it up by talking about synchronization problems. I'm using "problems" in two different ways at the same time, because I can do that. Problems in the sense of problems with synchronization: what are the bad things that can happen when you try to synchronize your code? So we're going to talk about deadlock, and I am guessing that everybody in this class will create a deadlock by the time they finish these assignments. If you don't, then you're not trying hard. And then priority inversion, which is, what do I say, a little bit more of an exotic synchronization problem. But we'll talk about it today. And it's kind of a neat case where there's an interesting interplay between scheduling, which we'll talk about next week, and synchronization, which we're finishing today. Do I hear music? Robert, we're having class now. Can you turn off the tunes? And then we're going to work through together a sample synchronization problem. So we're going to walk through one example of a case where I have a couple of threads that need to be synchronized. Thank you. I wouldn't mind having a background track for this class, right? Should we just start leaving the music playing throughout the entire class? We need some drum-and-bass electronica or something like that, something that I can talk over. Maybe we'll experiment with that next week, maybe not. And then we're going to walk through this problem together. I decided to do one problem today. You will see more synchronization problems during recitations next week. 
Sonali's going to be covering several example synchronization problems in her recitations. And if you've seen Assignment 1, there are several synchronization problems we ask you to solve on Assignment 1 for real, in real code. So synchronization is like anything else: practice makes perfect. The more examples you walk through, the more chances you have to see these kind of acted out, the more likely it is that when you actually have to do this on a real assignment, when it matters, you guys will be ready. OK? All right. So a couple of announcements. Assignment 1 is out. Hopefully everybody saw the email yesterday. And my typo is gone on the slide. Please get started on Assignment 1. The technique that many of you guys employed on Assignment 0, of waiting until the very last minute to start, I think will not work as well on Assignment 1, even if it worked on Assignment 0. If it didn't work on Assignment 0, then you really have a reason to get started on Assignment 1. But even if it worked on Assignment 0, I would suggest that you start this one earlier. Because this is the first time this semester where we're actually going to ask you guys to write some real kernel code. And we're going to test that code for you. We're still kind of working out exactly how we're going to do the testing, and I'll tell you more about that as the assignment progresses. But for now, all the instructions that you need, all the requirements for the assignment, are up on the website, and so you can get started with it. And at some point, we will update you about exactly how to test your system and what we're going to require for that. So again, as I announced on Wednesday, the Tuesday recitation has been canceled. The Wednesday recitation has been moved to Talbert 106, which apparently, according to the email I received, is a room that has technology. I mean, at some level, all rooms have technology, like, this is technology. 
This is older technology than we're used to talking about. But anyway, this room maybe has electronic technology, something like a projector, for example. I moved my office hours partly, I shouldn't say partly, entirely because they conflicted with another meeting I had scheduled a long time ago. And also, that's it, I think. So we're done fiddling with this. I think this is how we're going to rock for the rest of the semester. And the website is your best source for all this information. And I'm working on getting the slides and the video up there. We're continuing to learn how to do that better and better as the semester goes on. Any questions about this? Has anybody started the assignment? Raise your hand. Were you able to get the sources that we updated? OK, awesome. Because somebody was having a problem with that, and I wasn't sure if I fat-fingered it or botched it in some way. All right. So last time, we did this kind of whirlwind tour of synchronization primitives. We started out talking about the trade-offs between spinning to wait for a resource and sleeping to wait for a resource. And then we went through locks. We talked about condition variables. And we talked about semaphores. And several of those introductions were a bit brief. So if you guys have questions, this would be a good time to ask. Of course, we're going to do our usual review. Anybody have questions about stuff that we talked about on Wednesday? Yes, up front. Yep. Isn't there like a hybrid lock? Sure, sure, sure. Yeah, I mean, there are a number of different ways to implement locks. Some of the things that I've read, when they talk about hybrid locks or hybrid mutexes, what they're actually referring to are mutexes that will spin on a multi-core system, but will sleep on a single-core system. Because remember, we talked about how on a single-core system, spinning is just almost never, ever a good idea. Probably categorically never, but I'm sure there's some corner case that I'm forgetting about. 
But yeah, I mean, if you wanted to implement that for your kernel, go ahead, actually. One of the things I want to make sure I impress on people as you guys start programming in OS 161 is that this is your system, and you are free to build whatever extra functionality you want into it. So there are actually some things that students in the past have found helpful to do. One that just came to my mind: you may have seen that we've given you implementations of some useful container data structures. So there's a queue implementation that's used in a few places. There's an array implementation that you may want to use. I don't think so. Maybe David has changed this. But for a long time, those implementations were not thread safe, meaning that they were not safe under concurrent accesses. And if you want to use those data structures with multiple threads, you might want to actually do the work to make them thread safe, to make them safe under concurrent, interleaved access. And that might be something that helps you on later assignments. Because you can imagine, I mean, one way of synchronizing access to a data structure is just to synchronize the methods that you use to access the data structure. And then people that use the data structure just get the synchronization for free. So that's just one example of something. But anyway, don't think that the assignments are where you have to stop. If you are solving a problem for assignment two and you think, hey, it would be really cool to have this different kind of lock, build a different kind of lock. I mean, that's kind of what's fun about this class. This is your code base. And the tests that we run, especially for assignment two and assignment three, are just tests that things are working. And how you guys get that to work is totally up to you. 
So there can be times when the right thing to do is to actually build a new primitive, or implement a new data structure, or add some functionality to something that's already there. And that might not be in the assignment, but it might be the best way for you to complete the assignment. Any other questions about locks or CVs? Yeah. Semaphores, yeah, yeah. So that's tricky, actually. So I have one example today of a use case for a semaphore that actually came up. And it's actually in your code tree now, because I used it in the driver code for your synchronization problem. So we'll talk a little bit about it. In general, as I said before, semaphores, with their semantics, just don't always lend themselves in an obvious way to certain problems. Locks and condition variables are frequently a better choice. But remember, semaphores are counters. So in cases where you have a counted resource, semaphores might be a good fit. Semaphores can be implemented in a variety of ways. You can certainly, yeah, part of the internals of a semaphore is a count. That count is shared by multiple threads that are accessing the semaphore. One way of establishing a critical section that allows me to modify that count is to use a lock. Yeah, it's a shared counter. All right, any other questions about the primitives: locks, condition variables, semaphores? No other questions? Robert? Yeah, so, can I answer that question later? We're going to work through an example today that uses condition variables. And in that example, there is a critical section that we will talk about. The other thing I would encourage you to do, and you will need to do this for assignment one, is go look at the implementation of semaphores that we've provided. So semaphores are properly implemented in the kernel tree that we gave you. And if you look at the semaphore implementation, you should be able to identify the critical section. 
And the way that you can identify it, well, how do you think you would identify the critical section in the implementation of semaphores that we've given you? Like I said, it's correctly implemented. So the critical section has a lock_acquire and a lock_release that bracket it. And you should look at that, and you should look at what's inside the critical section. There's actually a big fat comment by David in there, too, explaining a little bit about why locks are acquired and released in a certain order. So you should look at that, and you should convince yourself: why is this stuff in the critical section, as opposed to some of the other things that are outside? OK, other questions? So please, today, I know I have this bad habit of jamming a little bit too much stuff into the lecture so that we have to go into turbo mode at the end. I've tried to avoid that today. I probably failed again. But I don't think that I did a fantastic job of teaching the primitives on Wednesday. So if you're confused, it's my fault. But please stop me today, or tell me to slow down, and we'll come back to this stuff. You're also going to learn this by doing it. So I feel a little bit like, OK, I'm off the hook. Any other questions about this stuff before we do our usual review? So I'm going to work over here today. So spinning is almost never a good idea on a what? Single-core system. I'm picking on people. You guys can't help it. Single-processor system. But let's say I'm on a multi-core system. So I need to establish a critical section. I'm trying to decide whether I should spin or I should sleep. How do I make that decision? You spin if the critical section is short, and you sleep if it's long. That's right. So if I know that the critical section is short, I might consider spinning. I still might sleep, but I might consider spinning. And if I know the critical section is long, then it's almost always a good idea to sleep. 
All right, let's keep going here. Why? Wow, there it is. It's only three letters, but such a question. So what happens if I sleep? How do I do that? What's the mechanism? Sleeping means that the thread that's running is going to what? Stop. And that means what? Well, what happens after that? Does the processor just sit there doing nothing? It goes and runs a different thread. Right, so what do we call that? A context switch. And what did we say about context switches? They're expensive. Exactly. And the other question that's not up here is, when I have a critical section like this and I have multiple threads that are trying to bang their way into it, what am I trying to do? What's the goal? I've got a bunch of threads that need to go through this little critical section. And to make the system more efficient, what would I like to do? Only one thread can be in the critical section; that's important for correctness. But once that thread leaves, what do I want? Another thread in there, as soon as possible. If I have a lot of threads that are trying to get into a critical section, you could think of the critical section as a shared resource. And I don't want it to sit there idle because a bunch of other threads are sleeping. And so if I have a small critical section, it might make sense to spin, because spinning might actually improve the throughput through the critical section. Again, if the critical section is short, the context-switch overhead will waste cycles. If it's long, spinning is going to waste those cycles. How short is short? How far is east from west? I mean, that's a design decision. It depends on how long the context-switch overhead is. It might depend. And critical sections don't always have a fixed length. I mean, critical sections might have loops in them. They might have arbitrary control flow. So sometimes it can vary. Now, I think one of the things about designing a good critical section is to try not to do that. 
Do as little as possible in the critical section, and have the critical section be as predictable as possible, because then you can choose a primitive that works. But in general, I mean, this is a trade-off that really depends on the sizes of other things. So I think I've successfully weaseled out of that question. But I think the answer is really that it just depends. Oh, there are questions about this. Yeah. Again, I mean, it really depends on the system. With your system, you can go and, I mean, you can profile the system and you can essentially count the number of instructions. So the question is essentially, what are the boundaries? So what starts a context switch? There are two ways. What's that? No, but what initiates a context switch? What are the two ways I can initiate a context switch? Either a hardware interrupt, which might initiate a preemptive context switch, or a thread might go to sleep and essentially ask to be stopped. So essentially, I can benchmark from when that happens: how many instructions are executed until the other thread begins to run? And that's going to include the overhead, potentially, for the scheduler, which we'll talk about next week. But it definitely includes the overhead to save all the state that I need to restart the thread the next time it begins running. I'm not going to call it constant. It might depend, particularly, on how much work the scheduler has to do or other things that are happening on the system. Something's about to collapse. OK, what is the interface to the lock? To a lock. What are the two functions? Let's get one function back there. Yeah. What can I do to a lock? What's that? A simple lock. Yeah, a lock. I mean, what can I do to it? What's the interface? Do you know what the interface is? So this is important, to understand what an interface is. Let's go forward, Robert. What's one of the things I can do to a lock? I can acquire it. I can call lock_acquire. 
If the lock is available, I'll be granted it. If it's not, I'll be put to sleep, or maybe I'll spin, depending on what kind of lock it is. But when lock_acquire returns, I have the lock. Those are the semantics of lock_acquire. What's the other thing I can do with a lock? Release it, right? Pretty obvious, right? And, whoops, bug on this slide, that's not true, actually. Release the lock; lock_release will not sleep. Don't look at that. OK. What are locks for? What's the primary purpose of a lock? Yeah. Protecting critical sections. That's the most common use of a lock: to protect a critical section, or to protect a piece of state that's accessed in critical sections. OK. What's the interface to a condition variable? Anybody remember? Let's see. Wait, signal, broadcast. Wait, signal, broadcast. Oh, man, we got all of them. I was going to try to trickle these out slowly, but we got them all at once. OK. So I can wait on a condition variable, which indicates that I'm waiting for the condition to change. I can signal, which means that the condition has changed and I want to wake up threads that are waiting on it. And signal will wake up one thread, right? And then there's also a way to broadcast, to say to every thread that's sleeping on the condition variable, the condition has changed, right? OK. Now, what are condition variables for? We just talked about the interface. What does this sound like a good fit for? Yeah. So it allows me to convey more information than a lock, right? But again, condition variable, right? Think about the name. What is a condition variable useful for? Right. If there's some condition that I want to be able to wait on, and to notify other people when it's changed, right? So when we come to an example today that uses condition variables, you'll see that there's a very clear condition embedded in the problem statement, right? 
And part of using condition variables is identifying what the condition is and identifying when the condition changes, right? Because when the condition changes is when you need to use one of these signal or broadcast mechanisms, OK? Yes, to signal changes to shared state. That's what I just said, kind of. OK. Now, I actually didn't tell you this last time. But if you look at the interface in OS 161, and in general, condition variables are always associated with a lock. Why? Can anybody figure this out? So condition variables signal changes to shared state. That's true, but why would I always have a lock associated with a condition variable? Normally, the semantics of condition variables are that in order to call cv_signal, cv_broadcast, or cv_wait, I actually have to pass the lock, and I have to hold the lock when I call those functions. Why? Yeah? Proof of ownership? OK, so it means I have to be the owner. But what is the lock protecting? The thing I'm signaling about. So condition variables signal changes to shared state. How do I protect shared state? With a lock. So the lock is there to protect the shared state. Because, again, we'll see how to use cv_wait, right? cv_wait is normally used like this: check a condition; if the condition isn't the way I want it to be, wait. In order to protect the condition from changing in between the time I check it and the time I call wait, I need to hold a lock. That lock protects against some other thread changing the condition while I'm going to sleep. If that happened, I might never wake up. Because I might be waiting for a condition to change, the other guy just changed it, I didn't get the signal, I didn't see the change to the state, I'm sleeping, and now he's gone. So that would be a case where that would not be good. The lock protects the shared state that the condition variable allows me to signal changes to. Semaphore interface. Anybody remember? P and V. So, Carl? 
Carl came to my office and we were having this discussion. He said, I always struggle using semaphores. Is that OK to say? Yeah. And I would say I have the same problem. And I think the problem is this stupid terminology. So it turns out, does anyone know why P and V are used? It was Dijkstra, right? And what was the thing about Dijkstra? He was Dutch, right? So he spoke Dutch. And it's funny, because we looked this up. So I was always taught that P is for proberen, which in Dutch means to test. And V was for something else that I forgot, which means, you know, to release or whatever. Apparently, according to Wikipedia, it's even worse than that: the word that Dijkstra said P stood for is not even a real word. It's what's called a portmanteau, a combination of two words that he made up. So not only was he speaking Dutch, but he was making up new words in Dutch. But anyway, P and V, I always have to stop and think, what do those mean? And it's annoying. And we could just say, hey, Dijkstra, thanks for your legacy of incredible contributions to computer science, but we don't care about your stupid interface; we're going to call them up and down. And then we could all use them correctly. But I guess we didn't do that. But you can do that if you want to. You can change the semaphore interface. Just make sure you fix all the places where it's used. OK. So yes, P: decrement the semaphore. So a semaphore is a shared counter that I guarantee atomic access to. And the semantics of the semaphore are that it cannot go below 0. So if I try to decrement it below 0, that decrement operation will have to wait until somebody else increments it. So another way of thinking about semaphore semantics is that the semaphore is not allowed to go below 0. If I try to make it go below 0, the system is going to stop me and say, nope, you cannot complete that operation until somebody calls V and increments. OK. 
So how are semaphores different from locks? Consider a binary semaphore. Let's say I have a semaphore that can only be 0 and 1, and I only use P and V. Yep. No sense of ownership. So a lock has an owner. And when you call lock_release on a lock, one of the things that your lock_release implementation should do is ensure that the thread that is trying to release the lock is the one that holds the lock. That's a semantic requirement, and it's designed to make locks easier to use. And the fact that semaphores don't have that semantic requirement can make them more difficult. All right. Any other questions about this stuff? Locks, CVs, semaphores. You guys are also going to implement reader-writer locks, another synchronization primitive. And I will leave that to your reading of assignment one to figure out what those are. OK. So let's talk about some of the problems with locks. So up until now, we've really kind of thought about, or maybe I've been thinking about, I've been in this fantasy world where threads get one resource at a time. In the real world, threads may need access to multiple resources. And locks may be used to protect those multiple resources. And so acquiring multiple resources may mean that I need to get multiple locks. And this is where we start to run into problems. So let me walk through an example. I've got two threads, thread one and thread two. Nope, actually, that was threads A and B. They both need simultaneous access to resources one and two. So let's imagine that the following happens. Thread A runs, grabs the lock for resource one. Bam, context switch. Thread B runs, grabs the lock for resource two. Bam, context switch. Really, I have fun with that, but I'll stop now because I think you're starting to give me weird looks. Thread A runs, tries to acquire the lock for resource two. Now, what will happen? It's going to wait. Let's say it goes to sleep. Now, what happens? 
Thread B runs, tries to get the lock for resource one. What will happen to thread B? It will also go to sleep. What is thread A waiting for at this point? It's waiting for thread B to wake it up. And thread B went to sleep waiting for thread A to wake it up. It's like, imagine you're in a class that's at 9 AM and you have a midterm. And you and a buddy decide to partner up. And you say, hey, I'm going to call you at 8 o'clock to wake you up. And he says, I'm going to call you at 8 o'clock to wake you up. And both of you go to sleep, and nobody wakes up. All right. OK. So this is a condition that we call deadlock. And deadlock occurs when a thread or set of threads are waiting for each other to finish, and nobody ever does. And deadlock is always associated with what we just established here: a circular resource dependency. So thread A, which has resource one, is waiting for thread B, which has resource two, which is waiting for thread A, which has resource one. If you can't establish a cycle in your resource dependency graph, you cannot have deadlock. And that's one of the ways that we use to fix this problem. Now, people might think, well, deadlock sounds like an interesting problem, but thank God, if I only have a single thread, I can't deadlock. Wrong: a single thread can deadlock. And I actually think, from my experience with this class, that a lot of the deadlocks you guys experience in this class are single-thread, or self-, deadlocks. How could this happen? How can a single thread deadlock? Ben? Yep? Oh, yeah. You got ahead of me. So in general, yes: thread A acquires resource one, and then thread A tries to re-acquire resource one. OK? So this is even worse. Now I have this tiny little loop here where thread A is waiting for who to wake it up? Thread A. So this is not good. Now, again, if I just state it this way, this seems totally inane. Why would anybody do this? It seems like this incredibly dumb programming bug. 
It's like if you set a variable to null and then try to dereference it in the next line of your code. This just seems so stupid. Why would this happen? Ben actually already told you. Did anybody hear what he said? Right. So imagine that I have function foo, which needs access to resource one, which it's going to get by locking it. I have function bar, which also needs access to resource one. And while holding the lock to resource one, foo calls bar. And this happens on real systems. And it happens especially when you have multiple people developing the system, and you don't know what bar does. As the foo developer, you just want to call bar and expect bar to do what you want it to do. What you don't realize is that bar needs exclusive access to one of the resources you're holding, and calling it with a lock held means that you will self-deadlock. So this is not an uncommon thing to happen when you're writing complex systems. And it might not be this simple. It might be foo calls baz, baz calls cab, cab calls cat, cat calls bar. The function chains can get arbitrarily long. Can we solve this problem? Can we get out of this situation? I alluded to this before: when you guys start to write code for OS 161, there are two ways to solve problems like this. One way is to fix the code that's causing the problem. What's another way to solve a problem like this? Have the second function check whether it already holds the lock? So that's an interesting idea. If you look in the code we've given you, there is a function called lock_do_i_hold. Using that in your code generally means that you're trying to do something bad, like this. There are only a couple of cases where that function is actually used correctly, and you guys are going to write one of them. But what about this? 
So what's part of the problem here? The lock implementation that we're talking about here will self-deadlock. So if a thread tries to re-acquire the same lock, it'll wait. Can I fix this? No? Robert. Well, let's not talk about programming solutions; let's talk about semantic solutions. You have a guess? So I'm going to accept that answer, and I'm going to go with it. I'm going to allow the lock function to proceed if I already have the lock. But what am I going to do? If I just did that naively, bad things would happen. What I can do is write a recursive lock implementation. And what a recursive lock implementation says is, I'm going to relax the semantics of locks. I can write a separate implementation; this is usually the right way to do this. Don't change the semantics of something that's already in use. Write a new lock that you can use in cases where the lock needs to be locked recursively, at multiple levels of the function stack. So I can write a recursive lock implementation that says, if you already hold the lock, I'm just going to return. But what do I need to do to make the semantics of this sane? Essentially, what I want to be able to do is lock a lock that I already hold, maybe without knowing that I already hold it, and then later I want to be able to release it. So I'm going to end up with multiple calls to lock_acquire. What do I want to make sure about lock_release? I think from this murmur over here, we have the right answer, which is I need to make sure calls to lock_acquire and lock_release are paired. And the way I do that is with a counter inside the lock itself. So when I lock the lock multiple times, I essentially keep a depth on the lock, and I keep incrementing the depth. When I call lock_release, I reduce the depth, and if the depth is still non-zero, I just return. So essentially, I can lock the lock at five different levels in the function stack. 
And then when the guy at the bottom calls release, the lock is actually not released. As I pop up the stack, everybody needs to call release again before the lock is actually released. So you can build this lock implementation. It's not that complicated. And there are cases in your kernel where it actually might work well. But this is one way to solve the self-deadlock problem. All right, I already said that. OK, so let's talk about the conditions for deadlock. There are several conditions. One is that I need protected access to shared resources, which means that when I try to get a resource, I might wait. Because I have threads sleeping, I can have this deadlock where multiple threads are sleeping waiting for each other. Second thing: in this system, resources are not preemptible. So we talked about the CPU being preemptible. What does that mean? Does anyone remember? When I preempt a thread that's running on the CPU, what do I do? Essentially, I'm taking the CPU away from the thread and giving it to somebody else. In the world that we've established here, resources are not preemptible. Once a thread grabs a lock, I can't tear that lock away from the thread and give it to somebody else. And that's part of what creates this problem, because if I could do that safely somehow, I could get out of this by recognizing that there's a deadlock and basically stopping one thread and saying, OK, these other threads get to run, and then maybe I'll give the resource back to you later. Multiple independent requests is the third condition. That means that I'm allowed to hold one resource while I request another resource. If I can't do that, I can't deadlock. And finally, I need a cycle in my dependency graph. I need to have some loop where I can establish that thread A is waiting for B, which is waiting for C, which is waiting for D, and it has to loop back around. The reason that I bring these up is that if I relax any one of these conditions, then I cannot deadlock. 
So I need all four of these to deadlock. And so let's go through an example. This is a classic synchronization problem. It's probably in the textbook, and it's discussed all over: the dining philosophers problem. Imagine that I have these five people, who were labeled on the web as modern philosophers. I have no idea who these guys are. And they don't look super modern, right? But anyway, I should know who some of those people are. Well, my liberal arts education has totally failed me. And they're sitting down to eat, and each one of them needs two chopsticks to eat. The other thing I found when I was creating this graphic is that it is impossible to find a picture of just one chopstick. You look for chopsticks online, and every picture has two chopsticks in it. So I made chopsticks using thick black lines. And so let's imagine what happens here. Let's say that my algorithm is: I grab the chopstick on my left, and then I grab the chopstick on my right. And what can happen here? Well, this guy gets this chopstick, this guy gets that chopstick, that old dude gets that chopstick, this guy gets that chopstick, that guy gets that chopstick. Everyone has one chopstick, right? And no one at this point is going to make any forward progress, right? Because they're all holding one chopstick, and no one is going to surrender a chopstick so that anybody else can eat, right? So this is the classic sad problem of the hungry philosophers, right? So how do I solve this problem? Breaking deadlock, normally, I shouldn't say normally, always, requires finding a way to invalidate one of the conditions that are necessary for a deadlock to occur, right? So in this case, let's think about it. Remember, I had these resources that I was going to wait to acquire, okay? So the first idea is: if I can't grab the second chopstick, drop the first chopstick. Now, the locks that we've given you don't support this, right? 
Because there's no way to tell if the call is going to succeed; I just have to call acquire, and if the resource is held, I'm going to sleep. So this requires a different approach to figuring out how to allocate these resources. Another idea: don't make multiple independent requests. In our example, I was requesting one chopstick and then the other chopstick. So a solution is to somehow get both of them at once — maybe I lock the whole table, grab two chopsticks, and then drop the lock on the whole table, or whatever. This would also require a different approach to the problem than the one we established; all of these do. Another way — and this is probably the one that would let me keep the approach I've already tried here — is this: if I can establish an ordering that means cycles cannot occur, then I can avoid deadlock. So let's say I number the chopsticks at the table zero through four, and the semantics are that I always grab the lower-numbered chopstick first. That cannot produce a cycle. I'll let you think about why, but what does it also mean? Remember, the philosophers all grabbed the chopsticks with their left hand first. Will this lead to the same outcome? No. Because of the ordering — say I have chopstick three on my left and chopstick two on my right — everyone at the table will grab with one hand except for one guy, who will grab with the other. And it's that guy who keeps the cycle from forming: his first request will block, so the guy next to him will be able to grab the other chopstick. So that breaks the cycle condition. And then the final solution is to just detect the deadlock and preempt one of the resources.
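The chopstick-ordering fix can be sketched directly with threads and mutexes. This is an illustration in pthreads, not the course's lock API: each philosopher computes which of its two adjacent chopsticks has the lower number and always locks that one first, so the wait-for graph can never contain a cycle. The iteration count is arbitrary.

```c
#include <pthread.h>
#include <assert.h>

#define N 5
static pthread_mutex_t chopstick[N];
static int meals[N];

// Every philosopher grabs the lower-numbered adjacent chopstick first.
// Philosophers 0..3 end up grabbing "left" first, but philosopher 4
// (between chopsticks 4 and 0) grabs chopstick 0 first -- that one
// odd philosopher is what prevents the cycle from forming.
static void *philosopher(void *arg) {
    int i = *(int *)arg;
    int left = i, right = (i + 1) % N;
    int first  = left < right ? left : right;
    int second = left < right ? right : left;
    for (int round = 0; round < 1000; round++) {
        pthread_mutex_lock(&chopstick[first]);
        pthread_mutex_lock(&chopstick[second]);
        meals[i]++;                          // "eat" with both chopsticks held
        pthread_mutex_unlock(&chopstick[second]);
        pthread_mutex_unlock(&chopstick[first]);
    }
    return NULL;
}
```

If every philosopher locked left-then-right instead, this same program could stop making progress with each thread holding exactly one mutex — the situation on the slide.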
So imagine that I see this condition, and I just go in and whack one of the philosophers over the head: he drops his chopstick, and then we can all eat our fill. So that's another way to do this. The mechanism for doing it is really not clear, though — and it's certainly not doable given the tools that we've given you — but it is one thing you could potentially do. OK, let me talk about two other things in relation to deadlock. There's another problematic condition that you can run into, and you may experience it when you work on your reader-writer locks for Assignment 1. It's another problem that also prevents forward progress, called starvation. Starvation is not deadlock. In a deadlocked system, no one will ever make forward progress. In a starvation condition, the activity of some threads prevents other threads from ever acquiring the resource — meaning that as long as threads of one kind continue to show up, some other thread will never enter the critical section, never get access to the shared resource. I'll leave you to think about how this applies to reader-writer locks, because you need to be careful: badly built implementations of reader-writer locks can starve writers. OK, one last observation about deadlock. What is better, a deadlock or a race condition? Anybody want to take a position? Deadlock — why? Because the program stops. I'd choose the deadlock too, because I can tell when my system is deadlocked. And in some cases, deadlocks occur because I'm actually being too conservative about my synchronization: I'm doing too much work to protect resources, and I need to relax that a little. Whereas race conditions usually happen because I'm not doing enough. So in this case, go with the deadlock.
You will be able to tell if your system is deadlocked; one of the best things about deadlock is that the system just stops, so you might have a chance to break in with the debugger and try to figure out what the threads are waiting for. All right, so let me do this. I'm behind as usual, so let me not do priority inversion. What I am going to do instead is work through an example together in the last ten minutes: producer-consumer using condition variables. And since I do expect you to know priority inversion, I'll record that part and you can watch it online. Fair? OK. So let's solve an example synchronization problem using one of the primitives that we've discussed. Producer-consumer is a classic situation that might arise in an operating system. I have some set of threads that are producers, and they're going to put data into a buffer. And I have another set of threads that are consumers; those threads are going to withdraw data from the buffer and do something with it. The producers and consumers share a buffer — that's the shared state. And the semantics are: if the buffer is full, I can't produce into it, so whatever producer is trying to access the buffer will have to wait. If the buffer is empty, I can't consume from it, so the consumer will have to wait. Other things: whenever you solve these problems, one of the best things to do is start by figuring out what my requirements are, and then what are the things I need to avoid. So for example, if there is room in the buffer, a producer should not be sleeping. If I'm not careful, that can happen. And the equivalent condition: consumers should not be sleeping if there are items in the buffer.
They may sleep for a moment, but they should always be woken up if there are items in the buffer; they should not be sleeping for long periods while items sit there in the buffer. That's a problem. Now, that might not make your solution incorrect, but it makes it not work the way I want: as soon as there are items in the buffer, I want consumers consuming, and as soon as there's room in the buffer, I want producers producing. So producers are putting items into the buffer; consumers are withdrawing items from the buffer. Let me put up the code — maybe it'll make more sense. So here's my skeleton code; this is our starting point. I've got two calls, and assume that put and get work properly. My producer is passed an item and puts that item into the buffer. My consumer tries to pull an item out of the buffer and returns it. So what is the first thing I need here? What is the first piece of information about this system that I need to keep track of in order to establish the semantics that I want? What's that? Check whether the buffer is full. And how do I tell if the buffer is full? The number of items in the buffer, right. So I need a count of how many items are in the buffer. This count is going to let me figure out whether the buffer is full, empty, or in between, and depending on the state of the buffer, I'm going to do different things. Now that I've got the shared state, what do I need to do? Lock it. I will need to lock it, but right now I don't, because I'm not doing anything to it yet. So once I have the shared state, what's the first thing I need to do? When I produce — when I put something into the buffer — what do I need to do to the count? Increment it, right. Until I start changing the variable, I don't need to lock count, because count is never being modified.
So the first thing I need to do is update count. (This is a limitation of my slides — things come in a fixed order on the slide.) And then the point is, if I'm trying to produce into a full buffer, I can't do that; something else has to happen. So if I'm trying to produce into a full buffer, I need to do something. And the equivalent for the consumer: if there's nothing in the buffer, I can't consume, so I have to do something else. And I need to update the count on the buffer. Now, under certain conditions this example might actually work even if I just spin, but you had a great idea about something to do here. So what do I need to do to count now? No, I've got that — what was your original suggestion? Right, I need to protect the count. So I need to lock around the count. But I already gave this away: what synchronization primitive is a good fit here? A semaphore — OK, that's one vote for a semaphore. Anyone want to vote for something else? I need a lock for the count, for sure; count is shared state. A condition variable. Why? Right: I have three conditions here. I have a buffer-empty condition, in which case something has to happen or not happen. I have a buffer-full condition, in which case something has to happen or not happen. And I have the buffer-in-between condition, in which case multiple things can happen. So I'm going to solve this using condition variables. And remember the pattern: I have a variable, and I have conditions on that variable. This is a classic example of that. I have a count — that's my variable — and I have conditions: this is one condition, and this is the other.
All right, so let's talk about how to do this with the CV. First, I need a CV. But what else do I need here? A CV is always associated with a lock. And as someone pointed out, I need the lock anyway, because I have shared state: I need to lock around count. Even if I weren't using the CV, I would still need to protect count somehow from this multiple access. OK, so I definitely need to lock around my accesses to count. So let's look at what the producer is going to do. Robert, you asked before whether I could give an example of a critical section — here is an example of a critical section. I have a shared variable count that's accessed by multiple threads, either multiple threads running produce or a mixture of producers and consumers. I'm testing the count here, and I'm changing the count here, and in order to do that safely, I need to lock around this entire section. Similarly, down here, I'm testing the count here and changing the count here, so I'm going to lock around that entire section as well. Now I'm using the condition variable, and one of the conditions is what I'm testing right here. For my producer, if the buffer is full, I'm going to wait. And in my consumer, if the buffer is empty, I'm going to wait. And the semantics of CV wait are that it drops the lock — here I hold the lock when I call CV wait — and when CV wait returns, it returns with the lock held. So when I check the condition again right here, I still have the lock. If CV wait didn't return with the lock held, I'd have to acquire it again.
But because CV wait returns with the lock held, I can write this. So CV wait drops the lock, goes to sleep, and when the thread is awakened, it returns with the lock held, so I can check the count again safely. All right, so we're done, right? This looks good. Is this correct? It looks good to me. I'm updating the count, I used the CV to go to sleep — this should just work, right? How will we wake up? Oh my gosh, this is not good. So yes, that is the problem here. If you call CV wait, you had better signal or broadcast on that CV somewhere, because otherwise those poor threads waiting on the condition will never wake up. CV wait and CV signal/broadcast are similar to lock acquire and lock release in that they have to be paired. So if you start using condition variables, you need to think: where do I wait? Where do I signal or broadcast? OK, so where am I going to put this? I need to tell the sleeping threads to wake up. Where do I do this? More specifically: just before I release the lock. But why right there? Because I changed the count. Remember, the count encapsulates the condition that these threads are waiting on. The condition does not change unless the count changes, and if the count changes, I need to tell people about it. Where does the condition change? It can potentially change here or here — that's where I'm adjusting the count. So let's put some broadcasts there. Now I've got CV broadcasts: when I increment the count, I broadcast, and when I decrement the count, I broadcast. Does this work? Yes, this actually does work. And it says so on the slide — good reading comprehension. Does it work well? That's a good question. But let me ask another question first: what are the conditions that I'm checking for?
This condition is checking something very specific: whether the count is equal to a particular value. And this one is also checking something very specific: whether the count is equal to another particular value. Now, the version with unconditional broadcasts will work. But one thing I might want to do here is only broadcast when the count has changed in a way that changes the condition. So here what I've done is say: if the count is now equal to one, the buffer used to be empty and now it's not — the buffer has gone from empty to in between. And down here, what I'm checking is whether the buffer was full — that is, the buffer went from full to not full. If I don't do this, the code still works fine; but if the count up here went from one to two, there shouldn't be any threads waiting on the buffer being empty, so the broadcast is just not needed. And same thing down here: if the count goes from half full to one less than that, there shouldn't be any threads sleeping, waiting to produce, so the CV broadcast is just unnecessary. So, we had a question here: why do I CV broadcast? Why can't I call CV signal? In the two minutes I have left — it's a good question. The reason is this. Let's say I have multiple consumers, and, especially now that I've added this condition, say I have three producers that are waiting because the buffer is full. One thread comes through here and sees that the buffer went from full to not full. If it calls signal, what might happen is that only one of those waiting threads wakes up.
But there are three of them waiting, and they all need to wake up, because it's possible that there are several other calls to consume that are going to change the count but are not going to broadcast — since I've added this condition to avoid unnecessary broadcasts. Now, I haven't thought about it carefully, but I think in this particular example you could replace CV broadcast with CV signal and it would still work; I'd have to think about it harder. But in this case, CV broadcast is the right thing to do, simply because, again, I could have three producers coming in and a bunch of consumers waiting: only one producer is going to call broadcast, and I need all my consumers to wake up. The point is that there might be another thread right behind me putting more stuff in. So, multiple producers and multiple consumers? Yes, that's the example we have here: we always have multiple threads that could be calling these in any order. So if I have a bunch of producers stacked up and only one of them is going to call broadcast, I need to make sure all the consumers wake up. It's possible that by the time the first consumer runs, there are three items in the buffer. My point is that it's possible that by the time the waiters wake up, several other producers have put more stuff into the buffer. I don't know how the threads are scheduled: it's possible that I call broadcast and these threads wake up but don't run right away — they may just be added to the ready queue, and before they get a chance to run, several other producers come through and put in more items. OK, I will leave this in the examples, Robert. And right — this example doesn't treat the buffer at all; we're just assuming that put and get work.
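Putting the whole walkthrough together, here's roughly what the finished producer/consumer looks like. This is a sketch in pthreads terms, not the course's exact API: the lecture's CV wait and CV broadcast map onto pthread_cond_wait and pthread_cond_broadcast, and MAX, put, and get are assumed helpers, since the lecture treats the buffer internals as given. Broadcasts fire only on the empty-to-not-empty and full-to-not-full transitions, as discussed above.

```c
#include <pthread.h>
#include <assert.h>

#define MAX 8
static int buf[MAX], head, tail, count;      // count is the shared state
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cv   = PTHREAD_COND_INITIALIZER;

// Assumed helpers: a simple circular buffer, shown only for completeness.
static void put(int item) { buf[tail] = item; tail = (tail + 1) % MAX; }
static int  get(void)     { int it = buf[head]; head = (head + 1) % MAX; return it; }

void produce(int item) {
    pthread_mutex_lock(&lock);               // critical section: test + change count
    while (count == MAX)                     // buffer full: sleep, recheck on wakeup
        pthread_cond_wait(&cv, &lock);       // drops lock, returns with lock held
    put(item);
    count++;
    if (count == 1)                          // empty -> not empty: consumers may sleep
        pthread_cond_broadcast(&cv);
    pthread_mutex_unlock(&lock);
}

int consume(void) {
    pthread_mutex_lock(&lock);
    while (count == 0)                       // buffer empty: sleep, recheck on wakeup
        pthread_cond_wait(&cv, &lock);
    int item = get();
    count--;
    if (count == MAX - 1)                    // full -> not full: producers may sleep
        pthread_cond_broadcast(&cv);
    pthread_mutex_unlock(&lock);
    return item;
}
```

Note that the waits are in `while` loops, not `if` statements: after a broadcast, every waiter rechecks its condition with the lock held, which is exactly why broadcast (rather than signal) is safe when many threads are queued on the same CV.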
I'm not going to do the semaphore example, but you can look at it: there's an example in your code base, in the drivers, of a case where you can use a semaphore to do something useful. But let me quickly summarize what we did today, because this is important — it will help you on Assignment 1 and whenever you think about synchronization problems. When you approach these problems, what's the approach that we took? First, figure out what you expect to happen — what are the constraints? Remember, a race condition is always defined in terms of unexpected dependence on timing, so you have to figure out what you expect. That was the first thing we did. Second, identify the shared state in the problem. It can be a good idea to build an implementation that's not thread-safe, either in pseudocode or real code, and then look at it and say: here are the pieces of shared state that I need to synchronize and protect. Third, choose a primitive. This is really important: if we had done this with semaphores, we could have gotten it to work, but it would not have been as nice — I can send out an example using semaphores so you can see that. Fourth, remember to pair waking and sleeping. If you call acquire, you have to call release. If you call wait, you have to have a call to signal or broadcast somewhere, because wait will potentially sleep, and acquire will potentially sleep. One of the best ways to deadlock a system is to just not drop a lock — whoops, I forgot to release it — and then no one else will ever wake up. Finally, look out for multiple resource allocations. This can be tricky, because the system could potentially deadlock.
A good way to handle it is to establish semantics that say: this is the order in which I lock these resources, and every thread that locks them has to lock them in that order. You don't have to lock all of them — you can lock a subset — but as long as you follow the order, you cannot deadlock. And finally, it's a really good idea to convince yourself that your solution works, because synchronization can be tough. You can have race conditions that only occur in one out of a million different thread interleavings, and you may not see those. And like I said before, Murphy's Law dictates that we will see them when we test your assignment. So check especially the corner cases. In our example, what are the corner cases? When the buffer goes from empty to not empty; when the buffer goes from full to not full; when I have a bunch of threads trying to produce and a bunch trying to consume while the buffer is changing states. Go through those examples and convince yourself that your system is doing the right thing. All right, so next week we're going to keep going with our CPU unit. In the last bit of it, we're going to talk about scheduling: we've been talking about all these mechanisms, and scheduling is policy. And then, if we are really lucky, we'll get to my favorite topic, which is virtual memory. I will see you on Monday. Have a great weekend.