All right, so I assume it's midterm week, because we have like one, two, three, four, five, six, seven people. Oh joy. So that's good for you, because this is going to help with the next lab. You only have a week for lab four, and you can probably finish it — not within this time, but you could probably get pretty close. So lucky for you. Or the midterm grades are back, so maybe that's another reason everyone was like, yeah, screw that. Well, the average was like 80, so that's pretty high. Maybe you thought, the next lab is only worth four percent, so screw it, just don't do it. Whatever. Anyways, we will go into locking today, and this will help for lab four, which, as you'll see, is quite similar to work in another course, so it'll make that course a lot easier too. We're generally getting into locking and its difficulties, and we want to prevent data races. As a quick aside: of the seven people here, who's used Java before? We've got some Java users. Have you ever used the keyword synchronized? Any synchronized users? One. So synchronized caused your program to crash, so you stopped using it. Yeah, neither did I understand it when I first started with it. It was like, if some problem happens, throw synchronized on it, it might work. All right, so this whole locking thing is kind of a pain, and with object-oriented programming, especially with Java, developers wanted something easier to use. So basically the idea was: you can mark a method as part of something called a monitor and let the compiler handle the locking for you, so you don't have to call lock and unlock yourself, and the language makes it such that only one thread can be active in a monitor method at a time. It's essentially a mutex over the whole method, so only one thread can actually execute that method on an object at one time.
So basically, within the object, the compiler will create a mutex for you, and every method you mark with the keyword (in Java it's synchronized) will lock that mutex whenever it begins, and before it returns it will automatically unlock for you. So it would look something like this: if in Java I'm making a bank account, I would have an int for the balance, and deposit and withdraw methods that increment or decrement that balance. If I'm using a bunch of threads, I know I would probably have a data race involving that balance: if multiple threads are trying to change the balance on the same account, the same counting problem happens where we could get an invalid balance, and we'll see a threading example where we actually try this in C. But if you write this in Java, the compiler essentially transforms it so that it locks the monitor whenever deposit begins, does the increment for you, and then unlocks before the method returns. And withdraw uses the same lock, acquiring it at the beginning and releasing it at the end. That way, if these are all the methods that touch balance, you don't have data races anymore. And the problem you probably had was that you forgot to mark one method synchronized, so you still had data races, because one method wasn't using that lock — as soon as you forget one, you still have data races. So it's good to know what these things do under the hood so you know you're using them properly. This was a good idea to make locking easier, but now if you forget one keyword, rather than one lock or unlock call, your whole program doesn't work correctly. That's the trade-off.
So we saw semaphores before with that producer/consumer problem, and it got kind of messy. What we were protecting against was pretty straightforward — we wanted to make sure we didn't empty a slot that was already empty or fill one that was already full — but if you have more complex conditions that you want to ensure are true, it might be trickier to use semaphores, and there's another tool in your synchronization toolkit called condition variables. They're equivalent to semaphores, but they behave a little differently, and they make that blocked queue more explicit. They must all be paired with a lock. You can create a condition variable, and while it's not strictly true, it's easiest to think of a condition variable as just being a queue. So you create the queue (you can pass attributes, but we'll take the defaults), you can destroy the queue, and then there are essentially two fundamental things you can do: you can signal on that queue, which wakes up exactly one thread that is blocked on it, or you can wait. You have to have a mutex paired with the queue, and wait will essentially put you in the queue atomically and put you to sleep; the idea is you go to sleep, and another thread can wake you up whenever the condition is true, and then you continue. There's also broadcast, which wakes up everything in the queue, but for the purposes of this course we just have to talk about signal and wait. So: any call to wait must already hold the mutex — it must already be locked — while for signal or broadcast you don't actually have to hold the lock; it's not required. So why is that?
Well, wait needs to add itself to the queue safely, without any data races, so it necessarily needs to happen under mutual exclusion, and the idea is you're waiting on some value anyway that you don't want data races on, so you already need to have a lock acquired. It needs that mutex argument because, before going to sleep, it will atomically add itself to the queue and also unlock the mutex, so another thread can go ahead, execute, hopefully make the condition true, and then wake you back up. And whenever you get woken back up, before wait actually returns to you, it will reacquire that lock. So the lock is held when you call wait, you go to sleep for some amount of time, someone signals and wakes you up, and then you hold the lock again when wait returns, and you can keep going. One mutex can protect multiple condition variables if you want, and again we only need to consider wait and signal. So here's what wait does for condition variables (which is different from the wait for processes — we are not a very creative bunch and we only have a few names that we reuse). When you call wait, all of this happens atomically, so either it all happens or none of it does: it adds the calling thread to the condition variable's queue, it unlocks the mutex so another thread can go ahead and execute, and the thread gets blocked. It puts itself on that queue, can't execute anymore, and is no longer scheduled to run. That fixes the problem we saw last lecture, where a wakeup could arrive before we had actually gone to sleep. And remember: between a lock and unlock, only one thread at a time will execute anything; after the unlock, any number of threads can execute.
Okay, and the waiting thread also needs another thread to help it: since it's on the queue and blocked, it's not executing anymore, and it might be stuck there forever unless another thread calls signal or broadcast to wake it up, at which point it tries to acquire the lock again. As soon as some other thread calls signal or broadcast, it becomes unblocked, meaning the scheduler can pick it to run, and if it gets picked, it reacquires the mutex and then returns from wait, so you're sure you hold the lock when you're done. So this is how we could change our producer/consumer code instead of using semaphores. We could use two condition variables, and this is arguably a bit clearer. Instead of a semaphore counting the number of filled slots, we create one queue for threads waiting for a filled slot and another queue for threads waiting for an empty slot; call them has_filled and has_empty. In the producer, we already had a lock before because we didn't want data races — oh sorry, we create a mutex to go along with these condition variables. We acquire it, so we're the only one holding the mutex, and then we can check our condition. In this case we use nfilled, where we just keep track of the number of filled slots, instead of keeping track of both the empty and filled counts. If all the slots are filled — if the number of filled slots currently equals the number of slots — we wait: we put this thread to sleep, saying it's waiting for an empty slot, and we pass it the mutex. So if all the slots are filled, the thread adds itself to the queue and then puts itself to sleep.
Otherwise, if it makes it through the while loop, not all the slots are filled, which means it can fill one. It increments the number of filled slots — no data races, because we're still between the mutex lock and unlock — and then it signals has_filled: if any threads are waiting for a filled slot, we wake up one and say, hey, we filled a slot, you can wake up now. In the consumer it's the same deal: we lock the mutex, but our condition is different. If we have no filled slots, we put ourselves to sleep on the queue waiting for a filled slot. Otherwise, once we make it through the while loop, we empty a slot, decrementing the count by one, and then we wake up anything waiting for an empty slot, which lets a waiting producer return from its wait. Then we unlock the mutex (I could actually swap the signal and the unlock if I wanted to). Any questions about this code as opposed to the semaphore solution we saw Thursday in the cursed room? What does the signal do? In this case you have to manage the condition yourself: our condition is this nfilled counter, and since we change it, we have to make sure that whenever we change it, we signal, so that the threads whose condition is now true can wake up and proceed. If I forgot to put the signal in, I could increment the number of filled slots, but if something went to sleep because there were zero filled slots and I never wake it up, it will never be woken up and will never recheck that the count changed. So we're offloading our post/wait ordering problem by just making sure we call signal whenever we change the condition, and thankfully the wait goes in the same place.
So, any questions about that? Is this clearer than the semaphore solution or not? Ish — some heads say yes, some no. Other questions: why do we pass the mutex to wait? We pass the mutex because, again, atomically we have to make sure we add ourselves to the queue without any data races — we don't want a data race involving the condition variable itself — and since you're protecting the condition, you're likely holding a mutex anyway, so wait can take advantage of that, and it makes sure that when you come back, it reacquires the mutex. That way you also don't have any problems when you return from wait and wake back up: before we call wait we have the mutex acquired, and when we return from wait it's also acquired, so we never return from wait unlocked and have to reacquire it ourselves, or anything weird like that, and we'll see an example. Yes — the unlock happens before the thread goes to sleep, after it adds itself to the queue, and those three things all happen atomically, so you don't have to worry about races between them. The easiest way to think about condition variables is that wherever you have a pthread_cond_t, you can think of it as just a queue. Like mutexes, you have to init them and destroy them, and the only operations we care about for now are signal, which wakes up exactly one thread on the queue (if there are no threads on the queue it does nothing, which is fine), and wait, which does those three things all atomically and reacquires the lock before it returns. We'll see that this is nice and makes our life a bit easier, but we still have some issues. So let's go into fun problems — take a minute to read this code. There are two threads using the same mutex. One
tries to lock the mutex, check a condition in a loop, wait, and then unlock; the other thread just sets the condition to true and then signals, waking the first thread if it's in the queue. Let's take a minute to think about it. Remember, because we're using condition variables, the idea is that at the marked line, after you wake up and make it out of that while loop, the condition had better be true; otherwise you kind of wasted your wait, because the condition you were waiting for doesn't hold at the moment you needed it to. The short version of the problem: I don't have a mutex around the condition when I set it, so I actually have a race with my queue. What could happen is that thread two signals while nothing's on the queue, while the other thread has checked the condition, seen it was false, and then put itself to sleep after the signal already happened. That's our big problem. But first, let's see that if everything just runs to completion, we're actually okay. Assume initially our condition is false, and we have two threads executing; generally when you program you assume one thread runs and then the other. So what happens if thread one executes to completion first? Thread one calls mutex lock. Is the mutex currently locked? Hopefully not — it was just created, so it's unlocked and no thread holds it. So thread one acquires the mutex. Now it can continue: it checks the condition, which is a memory read, reads that the condition is false, and then goes into
the while loop. It calls wait, adding itself to the queue — cond here we can think of as just a queue, now holding thread one, which is waiting — and thread one puts itself to sleep; it can't run anymore, and as part of putting itself to sleep, the mutex is unlocked. So thread one is in the queue, the mutex is unlocked, and no thread holds it. Then we execute thread two. Thread two changes the condition from false to true (the condition lives in a global somewhere), and then it signals, which wakes up thread one from the condition queue. Now thread one can wake up, nothing is left on the condition queue, and if it gets scheduled, it tries to acquire the mutex; when it returns from wait, it holds the mutex. So after it returns, thread one has the mutex, and we've returned from pthread_cond_wait. We go back up to the top of the while, check the condition again — now the condition is true — so we drop out of the loop. Now we're executing the line where the condition had better be true, and it is true, so all good, and then we unlock the mutex, so no thread holds it. Any questions about that? That worked how I intended. Okay, then let's rewind: initially the condition is false and both threads still need to execute. What happens if thread two executes first? Thread two changes the condition from false to true and then calls signal. What does it wake up? Nothing — nothing's in the queue — so it does nothing, which is fine. Then thread one tries to acquire the mutex, nothing currently holds it, so it passes, and now thread one has the mutex. It checks the condition, which reads true, and then it
would break out of the while loop, reach the line where the condition had better be true, and it is. Perfect. All right, so now, knowing that something bad can happen, let's reset everything. Who wants to tell me a very bad interleaving, where something goes very, very poorly? The suggestion was: there's a problem if thread one starts first and then gets yielded or switched away right at that line, so let's make it crystal clear what happens. Thread one executes, grabs the mutex because it's the first thread there, and then checks the condition, which right now is false. Then we get context switched over and start executing thread two. In thread two we change the condition from false to true, and then we signal anything waiting on the queue. Does it do anything? No, because there's nothing in the queue yet. Then eventually we context switch back over to thread one, which keeps executing: it adds itself to the queue, puts itself to sleep, and unlocks the mutex. So the mutex is now unlocked, thread one is in the queue, and now we are completely stuck, because no one will ever signal it again; thread one will be blocked forever, and you won't know any better. Any questions about how that works? Because that is bad. How would I fix it? I could add a mutex to thread two. Where do I need the mutex? Around setting the condition to true. So where do I lock? Right before setting it to true — I want a lock and unlock around the data my condition variable depends on. And where should I unlock: right after the write, or after the signal? Let's go ahead and put it after the signal. So now, with our modified code, what happens? Our bad interleaving happened right after
we read this false. Thread one had the mutex, and there was nothing in our queue — we were rewinding a bit, so we had just read that false value, and that's it. Our problem before was that when we context switched over to thread two, it updated the condition value; right now the condition is still false. So what happens if the scheduler decides to context switch over to thread two? It tries to lock the mutex, and that doesn't work, because the mutex is currently held by thread one, so it can't do anything. Depending on the mutex implementation, it could spin, trying over and over, or it could put itself to sleep, which is again another queue — but you don't have to worry about managing that one, because the threading library or the operating system does it for you. So as much as thread two tries, it's not going to execute anything; it's stuck in that lock call. Then, whenever thread one runs again, it continues executing its wait: it adds itself to the queue and unlocks the mutex, all atomically. Now if we switch back over to thread two, no thread holds the lock anymore, so thread two can actually acquire it and update the condition to true. And you can see here why it's important that everything is protected by the mutex: there could even be a signal before the condition is updated, and that's fine, but you don't really want thread one to do anything until you've actually updated the value — and it wouldn't, because thread two still holds the mutex. Even if thread one woke up, before it returns from wait it has to reacquire the mutex, and right now it can't, so it can't return. So even if I signal before I change the value, the woken thread can't do anything; I'd be waking it up early and it would not be able to proceed anyway, so why even bother
waking it up there. So in this case, thread two acquired the mutex, changed the condition from false to true, and then signaled, removing thread one from the queue; thread one is now runnable — not scheduled to run, but it could be. Say the operating system decides to run thread one at this point: as part of returning from wait, it tries to acquire the mutex, and thread two still holds it, so it will not return from wait until thread two actually reaches its unlock call. So it isn't wrong that I signal here, before the unlock; I'm just waking a thread up when it can't do anything yet. Typically you want the signal after the unlock, but for correctness it doesn't matter — I'm just maybe wasting some time. Then thread two unlocks, so no thread holds the lock, and thread one, which is able to run, acquires the lock before it returns from wait; whenever wait returns, you are sure you hold the lock. Thread one now has the lock, checks the condition again, and it's true, so it exits the loop, and the condition is true on the line where it needs to be. All good. Any questions about that? That works, right? Okay, but now I'm checking a condition that doesn't seem to need checking: I'm noticing that every time I wake up, the condition is already true, so why the hell am I rechecking it? I'll just put an if there: if it's false I put myself to sleep, and when I wake up I know it's true. No. The reason you don't want to do this is that in this particular case it happens to be fine — no matter what, the condition will be true, because I only change it from false to true once — but typically you'll have code that doesn't look like that. So: thread one is exactly what we saw before, except we change that while to an if, and we want to know what could happen. Ideally we always
want the condition to be true at that line. And usually you don't just have a value go from false to true once and that's it; typically it changes all the time. So let's make thread two change it to true, like we said before, and since we're changing it to true, we signal to wake up any thread waiting for it to be true. Then another thread might set it to false; if it sets it to false, that's fine — it doesn't have to signal, because the condition is no longer true. With this, let's look at it for a bit and think of anything bad that could happen now that the while is an if. Something has changed: thread one went to sleep because the condition was false, and then something else woke it up because the condition was true. And to the question: yes, it would still happen even if I moved the signal up inside the critical section. What could happen is roughly what you were saying: thread one checks the condition, it's false, it puts itself to sleep; another thread sets it to true and wakes it up, so it can potentially run; but before it actually runs, something else changes the condition back to false, and when it wakes up, the condition is actually false instead of true. Let's see that real quick and illustrate the badness. Initially, assume our condition is false, the queue is empty, and no one has the mutex. The bad sequence: thread one executes and acquires the lock, so right now, if any other thread gets context switched in, it will never make it past its lock call, because thread one holds the lock. So right now we don't have to worry about any data races on the condition, because no other thread can write to it; we're all good. And speaking of data races, another, less formal way to think about them: a data race is
any time you read a stale value, where the data you read is no longer consistent with the most up-to-date copy, and bad things can happen — as you may have discovered, or might still have to fix, in lab three; remember that segfault thing. Anyway, in thread one we check the condition, it's false, so we wait and put ourselves in the queue: atomically, we enqueue ourselves and unlock the mutex, so now we're waiting. And in this case, again, instead of a while we've put an if here. And no, you can't take yourself off the queue — you can only add yourself to it; the thread calling signal is what takes you off the queue and lets you run again. You can think of this as an explicit waiting queue: if a thread calls wait, it puts itself on the waiting queue and goes to sleep. So here the condition is still false, we go to sleep waiting for it, no one has the mutex, and we're in that explicit waiting queue. Now thread two can execute. It passes its lock call because no one holds the mutex, acquires it, and changes the condition to true, which is good. The condition is now true, so we can signal — well, first it unlocks, so the mutex is free, and then it signals, waking up thread one, which is now runnable but still has to acquire the mutex before wait returns. What we'd like to happen, if things go well: at this point the condition changed to true, so when thread one wakes up, the condition is true and it can continue executing; it wouldn't even need to recheck the condition, because it "knows" it's true. But because wait must reacquire the mutex before it returns, thread three could have already been waiting: thread three is right there, the mutex is unlocked, so it acquires it. So now thread
three has the mutex. Thread three changes the condition to false and then unlocks, so now our condition is false and no one holds the mutex, and now thread one can wake up: it acquires the mutex as it returns from wait, so it holds the mutex — but because we don't have a while loop, we don't re-read the most up-to-date condition; we had an if statement. So now we're executing that line, and the condition is false, which is exactly what we don't want. Any questions about that? And the same thing would actually happen if we moved the signal earlier: it would have woken the thread up, but the thread couldn't grab the mutex. Yes — you can think of it in terms of your threading library: when you call signal, the woken thread just goes back on the ready queue, so it could run, but not necessarily right away. And yes, potentially it's queues all the way down — we might have three queues here. There's the ready queue that the kernel manages for all threads; then, remember, we looked at the mutex implementation and said it had to be fair, so the mutex has a queue of its own to make sure threads acquire the lock in order; and then the condition variable is another queue. So we potentially have up to three queues. And this is why, even with the mutex, when thread one returned from wait, ideally we'd like it to run right after thread two — but it just woke up and is merely runnable, and if thread three tried to acquire the mutex first, it would pretty much be guaranteed to get it before the thread that just woke up, and that's where all our problems arise. A signaled thread is not immediately running. All right, any questions about that? So those are fun condition variables, and believe it or not, they're supposed to make your life easier. Like with the first example, where we could get away with changing that condition check
to be an if and have it be okay — in general, if you don't want to think about it, just make sure it's a while and you recheck the condition over and over; that way you don't run into this problem. There are only a few scenarios where an if is okay; generally make it a while if the condition can go back and forth, and make sure you read the most up-to-date value. That's data races in a nutshell. So condition variables serve a similar purpose to semaphores — semaphores are actually like a special case of condition variables. You can think about it the same way: your condition is an integer, and if the value of the integer is zero, you put yourself on a waiting queue; any thread that posts would also signal, and wakeup works exactly the same. Generally, if the problem is suited exactly to a semaphore, just use a semaphore; it's simpler. If it's more complex — if your condition is not just a simple number — use a condition variable, because it will probably be a lot clearer. They're equivalent to each other, so you could write one in terms of the other, but it's a matter of degrees of messiness and readability, so use that ground rule: for integer counts, use a semaphore; for anything more complex, probably use a condition variable. Next topic: lock granularity, which is the extent of your locks. You need locks to prevent data races, but if you lock large sections of your program, it will probably be really, really slow, because you're essentially letting only one thread run at a time, and now it's serial. What you could do instead is divide the locks and protect smaller sections, so you lock as little as possible to prevent a data race — like if you want to parallelize your hash table (hey, lab four). Other things to think about: locks have overhead. Creating and destroying locks, and the lock and unlock calls, aren't free, so you don't want to overuse them — but you also don't want to underuse them, because you might have
contention: if a lot of threads are battling for the same lock, everything is essentially serialized and your performance will be really bad, because only one thread is making progress at a time. And then, if you have multiple locks, you also have to consider a fun thing called deadlock. On the locking overhead again: you have to allocate some memory for each lock, it takes time to create and destroy them, and time to acquire and release them, and the more locks you have, the worse this gets. So for your hash table, sure, it might be safe to just make 20,000 locks or a million locks or something, but that's going to waste a lot of space and a lot of time. The big problem you'll have with multiple locks, though, is deadlock, which is something you don't want. The conditions for deadlock are: mutual exclusion, which is a given if we're talking about deadlock with mutexes, since mutexes guarantee mutual exclusion — only one holder at a time; hold and wait, which is the more important one: if I hold one lock and try to acquire another, I might deadlock; no preemption: you can't take locks away from a thread — if you could take locks away and say, no, you don't get to use that, you could prevent a deadlock by just breaking the cycle; and the fourth is circular wait: you hold a lock and you're waiting for another lock held by a different thread, which is itself waiting on yours. That would look something like this: consider two locks and two threads, where thread one acquires lock one, then lock two, and then releases them both, and thread two does it in the opposite order — it locks lock two, then lock one, and then releases them. A bad thing that could happen is that thread one executes, acquires lock one, and then I get context switched over to thread
two, which acquires lock two, and now we're stuck. Thread two tries to get lock one, which is held by thread one, so it can't do anything, and thread one, holding lock one, tries to get lock two, which is held by thread two, so it can't do anything either. They're holding each other hostage at this point, and no thread can make any progress — and when no threads can make any progress, that's what we call a deadlock. You have to prevent these, and we'll go over it in more detail next lecture. And yeah, just remember, we're pulling for you; we're all in this together.