Alrighty, welcome back to Operating Systems. Where we left off last time, we kind of implemented a lock. It really sucked. It didn't work: it let multiple threads make it through the lock. So today we fix that.

If you want to implement a proper working lock with minimal hardware requirements, those requirements can be as low as having atomic loads and stores and ensuring that instructions actually execute in order on the CPU with respect to memory operations. If those things are true, you can do some computer science-y things like use Peterson's algorithm or Lamport's bakery algorithm, which basically simulates going to a bakery and picking a number, and that's how you acquire the lock. The problem is that they're computer science-y solutions. They don't really scale that well, so as soon as you have multiple CPU cores it's just really slow, and processors, well, they might execute out of order.

So the actual atomic hardware operation we use to build our locks is something called compare and swap. Compare and swap takes the address of an integer, which is the address of the value you want to change atomically. Remember what atomic means: either the operation happens all at once or it doesn't happen; there's no in between. So what it will do is, well, it takes the address of an integer and a value you expect it to hold, and if it holds that value, it changes it to the new value. It always returns the original value to you, and again this all happens atomically, and it only swaps if the current value equals old.
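To make the semantics concrete, here is a minimal sketch of compare and swap written with C11 atomics. The function name `compare_and_swap` and its parameter names are assumptions chosen to match the description above; the standard library call underneath is `atomic_compare_exchange_strong`, which a compiler would typically lower to the hardware instruction.

```c
#include <stdatomic.h>

/* Hypothetical helper mirroring the lecture's description: if *p equals
 * old_val, atomically replace it with new_val. Either way, return the value
 * that *p held when the operation ran. */
static int compare_and_swap(atomic_int *p, int old_val, int new_val) {
    int observed = old_val;
    /* On success *p becomes new_val and observed stays equal to old_val.
     * On failure *p is left untouched and observed is overwritten with the
     * value currently stored in *p. */
    atomic_compare_exchange_strong(p, &observed, new_val);
    return observed;
}
```

Calling it with `0` and `1` on an integer that holds zero changes it to one and returns zero; calling it again returns one and changes nothing.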
So if I use it here, I call compare and swap with my pointer to an int and the arguments zero and one. That means it will atomically change the value from a zero to a one only if the current value is zero. So one thread, and only one thread, will be able to change it from a zero to a one, and it will be done atomically. If another thread then calls compare and swap, the value at that address will be one, so it will just return one for as long as another thread has the lock.

So here, for the first thread that makes it to compare and swap, the initial value is zero. Atomically, again all at once, it changes the value from zero to one and returns zero, which is the old value. That makes the while condition false, so it just drops through and continues on with the lock held. If another thread tries to acquire this lock, well, the current value of the lock is one, so compare and swap returns one without changing anything, and this while loop just executes over and over and over again. And that is an implementation of a lock. Any questions about this?

Because that magical function is atomic, there is no way for two threads to make it past the lock call at the same time. Essentially all that happens in that function is that it calls compare and swap and checks the value. So any questions? That's a perfectly cromulent lock. That's a made-up word from The Simpsons, if you don't know what cromulent is. So this is a perfectly good working lock, we all agree. It just kind of sucks because, well, we have that while loop, so it keeps checking and checking and checking again. But hey, it works, which is better than nothing.
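The lock just described could be sketched in C11 roughly as follows. The names `spin_lock`, `spin_unlock`, and `compare_and_swap` are assumptions for illustration, not necessarily what is on the slide; the lock itself is a single integer where 0 means free and 1 means held.

```c
#include <stdatomic.h>

/* Hypothetical compare-and-swap helper: returns the value observed in *p,
 * and stores new_val only if that value equals old_val. */
static int compare_and_swap(atomic_int *p, int old_val, int new_val) {
    int observed = old_val;
    atomic_compare_exchange_strong(p, &observed, new_val);
    return observed;
}

/* Keep trying to change the lock from 0 to 1. A return value of 0 means
 * this thread made the change and now holds the lock. */
static void spin_lock(atomic_int *lock) {
    while (compare_and_swap(lock, 0, 1) != 0) {
        /* busy wait: another thread holds the lock */
    }
}

/* Releasing the lock is a single atomic write. */
static void spin_unlock(atomic_int *lock) {
    atomic_store(lock, 0);
}
```

The critical section is whatever code sits between a `spin_lock` call and the matching `spin_unlock` call on the same integer.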
So what we implemented there is a valid lock implementation, and it is in fact called a spin lock. I called compare and swap a magical function because in actuality it is a CPU instruction: your CPU supports it and makes sure it is done atomically. On x86 it is called cmpxchg. Why that stupid name? I don't know, they like shortening things, and we're terrible at naming stuff. It's short for compare and exchange. Compare and swap, compare and exchange: they mean pretty much the same thing, and you might see it called one or the other.

So this is a perfectly good lock, but it has that busy wait problem: if one thread has the lock, another thread trying to acquire it just sits in that while loop until the holder eventually calls unlock. If we have a system with only a single core, the smart thing to do when I can't get the lock is to just yield, which hopefully you implemented in Lab 4. If I check the value and it's one, meaning another thread has the lock, I shouldn't even bother trying to acquire it again right away, because with only one core, well, I'm using it, so there's no possible way some other thread is going to unlock it. So I should just yield. If it's a kernel thread, the kernel schedules something else that hopefully frees the lock; in your case, you would yield and hopefully another thread calls the unlock function.
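A minimal sketch of the yielding variant is below. It uses POSIX `sched_yield` as a stand-in for whatever yield call your own thread library provides (the Lab 4 function is not shown here, so that substitution is an assumption), and the names `yield_lock` and `yield_unlock` are hypothetical.

```c
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical compare-and-swap helper: returns the value observed in *p,
 * and stores new_val only if that value equals old_val. */
static int compare_and_swap(atomic_int *p, int old_val, int new_val) {
    int observed = old_val;
    atomic_compare_exchange_strong(p, &observed, new_val);
    return observed;
}

/* Same loop as the spin lock, except that a failed attempt gives up the
 * CPU so the thread holding the lock gets a chance to run and unlock. */
static void yield_lock(atomic_int *lock) {
    while (compare_and_swap(lock, 0, 1) != 0) {
        sched_yield();
    }
}

static void yield_unlock(atomic_int *lock) {
    atomic_store(lock, 0);
}
```

The only difference from the pure spin lock is the body of the loop, which is why this version is attractive on a single core.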
If you have a machine with multiple cores, well, you might actually want to use a spin lock if your critical sections are really small. It might be the case that you want to be super responsive. We'll see when we do a full implementation of a mutex that it gets kind of complicated, so you might want to just try again: because the critical section is small, you will likely get the lock the next time you try, and you won't need a super complicated implementation. This implementation is dead simple; you can't really get much simpler. It needs one integer and essentially one instruction: a compare and swap inside a while loop for lock, and a memory write for unlock. So pretty simple.

So here is our solution if we just add a yield to it. But with more solutions come more problems, and we now have a problem called a thundering herd. Say we have eight threads, one has the lock, and seven are trying to acquire it. All seven are yielding and waiting, and as soon as thread one unlocks, seven threads attempt to acquire the lock again. That seems like kind of a waste, because by virtue of it being a lock that ensures mutual exclusion, only one of those seven threads is actually going to acquire it. So why would I have seven threads all battle for it?

Also, in this solution you cannot guarantee any ordering. You're just up to the scheduler and how unlucky you get. Realistically, we want our locks to be fair: if there are seven threads trying to acquire the lock, those seven should probably all get it before a new thread that shows up later does. If we want to be able to reason about it, we could form an orderly line for the lock, first come, first served, which is generally the first thing we think of when we think of fairness. So here's our solution to that.
So on top of that spin lock, well, I can go ahead and add a queue. What my implementation could look like is this: in the while loop, if I do not acquire the lock, so compare and swap returns a one saying that another thread has the lock, I add myself to a queue for this lock and then put myself to sleep, which essentially blocks this thread from executing. In unlock, the thread that calls unlock sets the value to zero and then checks: hey, is there a thread in the wait queue? If there is, wake up that one thread, and only that one thread. So if there are seven in the queue, all I do is wake up one. I unblock it and let it continue running. It resumes from the sleep, comes back around the loop, and tries to acquire the lock again. If it's the only one trying, it should be able to switch the value back from a zero to a one, indicating it now has the lock.

Let us think about this for a bit, because there are two issues with this implementation. One is called a lost wakeup. What that means is there is a situation here, and it is your job to tell me what that situation is, where due to context switching between threads, a thread puts itself to sleep and the thread that calls unlock never actually wakes it up. So it's essentially in a coma or something like that: it goes to sleep and can't wake up. That's what a lost wakeup is. The other issue is that the wrong thread could get the lock. That means there is a thread waiting for the lock, and another thread comes in, essentially butts ahead of it in line, and acquires the lock. So let's look at that and see if we can think of situations where those two things would occur. Any ideas yet for one of those situations?

Yeah, so that's what we want, right? Say one thread has the lock and eight other threads try to acquire it. They would all get put on that
queue, so there are now eight threads waiting, and when the thread that has the lock calls unlock, it only wakes up one of those. Yeah, so lock is always paired with unlock, right? So thread one would wake up thread two, and then thread two could get the lock without having to fight with anyone else. Eventually, when thread two is done with it, it gives up the lock and wakes up thread three, then thread three gives it up and wakes up thread four, da da da da. So that's actually what we want.

But the situation we want to find is this: is there a thread that gets put to sleep and never gets woken up? Yeah, so you have a thread that is currently in the queue. Mm-hmm. Yeah, so there's a thread that has the lock. So let's take that situation. Let's say thread one has the lock, so the current value of the lock is one, and thread one was the one that set it, so it needs to call unlock. And we also have thread two, which wants to acquire the lock.

What could happen is, well, thread one unlocks: it sets the value from one to zero, checks the queue, finds nothing in the queue, and finishes. That should be okay, because now if thread two executes, it does this compare and swap, the current value is zero, so it changes it from zero to one. Compare and swap returns zero, so the while condition is false, and it just acquires the lock and passes. That's fine.

Yep, so these can run concurrently. So thread one executes and changes the value from a one to a zero, and now what? Close, we're on the edge. Okay, yeah, so what happens if thread two runs first in this case? If thread two runs first, well, it would do this compare and swap. The current
value is one, so it wouldn't change it from a zero to a one; it would just return the current value, which is one, so it goes inside the while loop. So now it's at this point. Should it continue? Well, ideally we'd want it to continue, but what happens if we context switch to thread one right now? Then thread one changes the value of the lock from a one to a zero and continues executing. It checks the queue: are there any threads currently in the queue? No threads in the queue, so it just goes through, and it's done with unlock. Now if we context switch back to thread two, well, it adds itself to the queue and puts itself to sleep. Who's going to wake it up? No one, right? So that's a possible lost wakeup. Does that make sense? There is an order of context switches between threads that gives you a result you do not want. I always want to wake up every thread I put to sleep. So that covers our lost wakeup.

What about a situation where the wrong thread gets the lock? Let's craft one. Same initial situation: thread one has the lock and is going to call unlock. Thread two, instead of calling lock, let's say it's already in the wait queue. If it is the only thread in the wait queue, it should get the lock next. Then let's say thread three calls lock. So we've got thread one here and thread three here. Is there an order of executing those threads such that thread one unlocks and then somehow thread three butts ahead in line and gets the lock before thread two? So if T3 executes first... okay, in this case, if we reset it back, you want thread one to execute first. So yeah, if thread one executes first, well, it would change the
value of the lock from a one to a zero. So it moves down, we change the value to a zero, and then which thread do you want to execute, T3 or T2? T2 is in the wait queue, so it can't execute; the only way for T2 to execute is if we wake it up. Yeah, so we could immediately switch over to T3. Oops, wrong tool. It would do the compare and swap: oh, the current value is zero, so it changes it to a one, returns zero, and passes through. So now, whoops, thread three has the lock. It just kind of swooped in and stole it before thread one could go ahead and wake up thread two. At this point it's already too late: thread three stole the lock from thread two. If we context switch back over to thread one, it checks, oh, there's a thread in the wait queue, wakes up thread two, and finishes. Now what is thread two going to do? It wakes up, starts executing here, goes back to the while loop, checks the condition again, and oh no, thread three stole the lock. So it just adds itself back to the queue and goes back to sleep. So that is a situation where the wrong thread gets the lock, which is a very creative name for that problem. Any questions about either of those scenarios? Because we're going to have to fix them.

So here is that example written out: the lost wakeup, and then the wrong thread getting the lock, although we swapped the names of thread one and thread two. Same idea.

And here is the implementation we can use to fix it. It looks like kind of a mess, but it's not too bad if we break it down. Our mutex structure holds all the data we need to implement our mutex. We're going to need a queue, which is our wait queue of threads. We're also going to need the integer that represents the lock. And we're going to use one other integer called a guard, which we use only internally. This looks like kind of a mess, but this
guard integer will essentially be a spin lock that is only used inside the lock and unlock methods. So these three lines right here, this big while loop: to read this sanely, it might be easier to read that whole while loop as a lock of the guard, because it's just a spin lock, right? Going back, that was our implementation of a spin lock. We're essentially using a spin lock internally, which is fine because the code it protects isn't doing that much. If we look elsewhere, we make sure that we unlock it. Here are our calls to unlock the guard, and we do that in both of our if branches, so that whenever we lock it we're guaranteed to unlock it. That's why we need an unlock on both sides. And for the unlock method, it starts the same way, with a lock of the guard, and then at the very end there's a call to unlock the guard.

So hopefully we fixed our problems. What was the first situation? Yeah, the lost wakeup, so let's see if we fixed that. What we had before was thread one starting to call unlock while thread two was calling lock, and there was an ordering where thread two read that the lock was taken, and then thread one unlocked it and checked the queue before thread two had added itself. Now, thread one executes and acquires the guard. With the guard held, it checks the queue. Right now the queue is empty, so it changes the lock to zero to signify that it's unlocked. That was the point where, if we immediately context switched over to thread two, we had issues. Now if we context switch over to thread two at this point, it tries to lock the guard, and it can't proceed because thread one has the guard already.
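For reference while following the walkthrough, here is one way the guarded mutex could be sketched in C11. This is not the code from the slide. The struct layout, every name in it, and especially the per-waiter `woken` flag are assumptions: the flag stands in for a real sleep and wake-up facility by having the blocked thread yield until the flag is set, which also means a wake-up that arrives before the thread starts waiting is not lost.

```c
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* One queue entry per blocked thread. The woken flag is a hypothetical
 * stand-in for a kernel sleep/wake-up mechanism. */
struct waiter {
    atomic_int woken;
    struct waiter *next;
};

struct mutex {
    atomic_int guard;     /* internal spin lock, held only inside lock/unlock */
    int locked;           /* 0 = free, 1 = held; protected by guard */
    struct waiter *head;  /* FIFO wait queue; protected by guard */
    struct waiter *tail;
};

static void guard_lock(atomic_int *g) {
    int expected = 0;
    /* Spin until this thread is the one that changes the guard from 0 to 1. */
    while (!atomic_compare_exchange_strong(g, &expected, 1)) {
        expected = 0;
    }
}

static void guard_unlock(atomic_int *g) {
    atomic_store(g, 0);
}

static void mutex_init(struct mutex *m) {
    atomic_init(&m->guard, 0);
    m->locked = 0;
    m->head = NULL;
    m->tail = NULL;
}

static void mutex_lock(struct mutex *m) {
    guard_lock(&m->guard);
    if (m->locked == 0) {
        m->locked = 1;            /* plain write is fine while holding guard */
        guard_unlock(&m->guard);
        return;
    }
    /* Lock is taken: join the back of the queue, then release the guard. */
    struct waiter self;
    atomic_init(&self.woken, 0);
    self.next = NULL;
    if (m->tail != NULL) {
        m->tail->next = &self;
    } else {
        m->head = &self;
    }
    m->tail = &self;
    guard_unlock(&m->guard);
    /* Stand-in for sleeping: yield until the unlocking thread sets the flag. */
    while (atomic_load(&self.woken) == 0) {
        sched_yield();
    }
    /* The mutex was handed over directly, so locked is still 1. */
}

static void mutex_unlock(struct mutex *m) {
    guard_lock(&m->guard);
    struct waiter *first = m->head;
    if (first == NULL) {
        m->locked = 0;            /* nobody is waiting: mark the lock free */
    } else {
        /* Remove the first waiter and hand it the mutex without ever
         * setting locked to 0, so a newcomer cannot jump the queue. */
        m->head = first->next;
        if (m->head == NULL) {
            m->tail = NULL;
        }
        atomic_store(&first->woken, 1);
    }
    guard_unlock(&m->guard);
}
```

The design choice to leave `locked` at 1 when a waiter exists is what keeps a late arrival from taking the lock ahead of the queue, and doing the queue check under the guard is what closes the lost wakeup window.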
Right, so thread one has that spin lock acquired and thread two can't make any progress, which is what we want. Before, we were in a situation where thread two made a bit of progress and then we were essentially screwed. Now that can't happen anymore. Thread one changes the value of the lock from a one to a zero, and it doesn't matter when, because no other thread can get past the guard until it has finished. It finishes that if statement, unlocks the guard, and it's done.

Now if thread two finally gets a chance to execute, it can acquire the guard and then check the value of the lock, which thread one changed from a one to a zero. And because we hold the guard, we don't have any race conditions on that lock integer. Guess what, we have mutual exclusion, so we don't even need compare and swap anymore. It checks the value of the lock: if it is zero, we change it to a one, which means we have the lock, so we unlock the guard and the function finishes. Otherwise, if we cannot acquire the lock, it puts itself in the queue and then unlocks the guard, so now another thread can go ahead and execute. And, well, that seems suspect. Is there still a slight issue here?

Yeah, so this one's a bit more subtle. Let's say thread one has the lock already, so thread two adds itself to the queue. Right now it is in the queue, and because it has finished the unlock of the guard, we could context switch over to thread one, and thread one can make it past its lock of the guard. So thread one continues and checks: hey, is there a thread in the queue? Currently there is; thread two is in the queue. So it says, okay, I should just transfer the mutex to thread two, and it calls wake up on thread two. But because we context switched before thread two called sleep, it's actually not asleep. So there is
a situation, a very subtle one, where a thread is in the queue, we call wake up on it, and it turns out it's not asleep yet. So that's an example of a data race even in this code. But fixing it is actually not that bad, because we know we don't have any data races on the queue itself. In thread one, we know that thread two is about to put itself to sleep, so we should just retry the wake up, or have some special mechanism that cancels its next sleep, or something like that. It's a race we can handle, because we know the next thing that thread is going to do is call sleep. So any questions on that? This is like 99% working. It just needs a little fix-up for the case where it tries to wake up a thread that's not asleep yet, but we know that thread will go to sleep as soon as it executes again. Any questions about that? Are we good?

So that's a full mutex. Yep. Yeah, your mutex is essentially three variables: an internal spin lock, an integer that represents the current state of the lock, and a queue to make sure we have some fairness. That's basically what a mutex is. The only thing that isn't in this implementation is the extra handling for the situation where it tries to wake up a thread that's not asleep yet. It's a situation you can handle; it just makes this even uglier than it already is. All right, so we're good, we implemented a mutex.

So here is that situation again, where there's still a bit of a data race. Like I said, a thread could get interrupted after it adds itself to the queue but right before it puts itself to sleep. But because it has added itself to the queue, we're going to try to wake it up. We can retry the wake up until it eventually succeeds, because the thread is eventually going to call thread sleep. Or you could have another mechanism to handle the situation, depending on your operating system, like canceling its
next sleep or whatever.

So this next part is more of a performance thing. Remember what causes a data race; you will need to know this definition by heart. A data race is two concurrent accesses to the same variable where at least one of them is a write. With a mutex, we would have to protect all our reads and writes. Well, you might be writing some performance-sensitive application that only writes rarely. Because a data race only happens if one of the operations is a write, you can have as many readers as you want. So ideally, if your application writes very infrequently, you want to share that variable as much as you can; you don't want to just wrap a mutex around it. We want a lock that's a bit more flexible than a mutex, which only lets one thread pass and that's it.

So there is a special type of lock called a read write lock. It has two types of lock calls. There is a special call for locking it for reading only, and the point is to allow multiple readers at once. I could have a hundred, ten thousand, eight, whatever, as many readers as I want. But only one thread can hold the write lock at any given time, and while it does, no other threads are reading. So our writes only happen with a single thread in there and we don't have any data races, while if we're just reading, we can read from as many threads as we want.

So here's an implementation of it. Here I just say these are generic locks, but they could be the spin locks or mutexes we know now, and we can implement a read write lock on top of them. For writing, we just want mutual exclusion, so we can use the mutex directly: we lock the mutex for a write lock and unlock it for a write unlock. The only difference is the read lock. So for the read lock,
they are going to lock an internal guard, an internal guard spin lock let's say, because we don't want any data races on this n reader variable. The n reader variable keeps track of the number of readers we have. A reader increments it, and if it goes from a zero to a one, that means this thread is the first reader, so it should acquire the mutex. Once it has the mutex, a write lock cannot pass. Then it unlocks the guard, because now it's done with it. Now if another thread calls read lock while the first reader holds the mutex, well, it increases the number of readers from one to two, checks, sees it's not the first reader, and so it shares that mutex. So we can have two readers, three readers, four readers, five readers, and so on and so forth.

Eventually they call read unlock. What read unlock does is, well, it takes the guard to make sure we don't have a data race on the number of readers, subtracts one from it, and if the new value says we have no readers anymore, we give up the mutex. So the last reader gives up the mutex, and then it unlocks the guard.

It's like if you're in a giant lecture hall and you all want to share it. One of you needs the key: one of you unlocks the door and then keeps letting people in. And if you want to be nice and secure, the rule is that the last person to leave locks the door, right? Same situation here. You can pile in as many as you want, as long as the first person unlocks the door and the last person locks it when they leave. Any questions about this? It should work nice and cool.

So this is a type of lock you want to use if you're trying to make things go as fast as possible. If you see that you're trying to prevent a data race on some variable that rarely gets written and gets read a lot, well, for performance
reasons you should use a read write lock, because you can do reads in parallel with no issue of a data race. You're all good, so use a read write lock.

Yep. So yeah, you could always use a read write lock and only ever call write lock and write unlock, but at that point you may as well be clear and just use a mutex, because that's what it's going to do anyway, right? Typically, if you use a read write lock, you should be using it because you have reads. Your program would still work; you might just get some grumbling from people reading your code, like, why the hell did you use a read write lock when you never do a read lock? All right, any other questions? Because, yeah, we've learned everything today. Yay.

All right, so recap for locks. We'll get into more locking in the next lecture, but at this point we should know what a data race is, two concurrent accesses where at least one of them is a write, and how to prevent one. We can use mutexes or spin locks. They're the most straightforward locks: they ensure only one thread can be in the critical section at a given time. Again, the critical section is just the block of code between the lock and the unlock call, and it's your job to make sure that if your thread locks, that thread also calls unlock. If you don't do that, well, bad things are going to happen: undefined behavior, delete your computer, all that fun stuff.

If we want to implement locks, nowadays we ideally need some hardware support. You need that compare and swap instruction, or compare and exchange, whatever it's called; any modern CPU will have it. You'll also need some kernel support for things like wake-up notifications and putting threads to sleep. But I mean, you
could, if you really want, implement putting threads to sleep in Lab 4: you just stop a thread from running, and then you have a wake-up. I took it out of Lab 4 because you should already know how to do it, and it's kind of busy work after you've done the harder stuff. But you need some support for that. And finally, if we have a lot of readers and infrequent writes, for performance reasons we probably want to use something like a read write lock. So I will be here for the remainder, but you are free to leave or ask questions about Lab 4, whatever. Just remember: pulling for you, we're all in this together.
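As a recap of the read write lock from this lecture, here is a minimal C11 sketch. All names (`struct rwlock`, `nreader`, `read_lock`, and so on) are assumptions for illustration. Both internal locks are plain spin locks here, which matters: the last reader may release a lock that a different reader acquired, and that is fine for an integer spin lock but not allowed for something like a pthread mutex.

```c
#include <stdatomic.h>

static void spin_lock(atomic_int *l) {
    int expected = 0;
    /* Spin until this thread changes the value from 0 to 1. */
    while (!atomic_compare_exchange_strong(l, &expected, 1)) {
        expected = 0;
    }
}

static void spin_unlock(atomic_int *l) {
    atomic_store(l, 0);
}

struct rwlock {
    atomic_int guard;  /* protects nreader */
    atomic_int lock;   /* held by one writer, or by the readers as a group */
    int nreader;       /* number of threads currently holding a read lock */
};

static void rwlock_init(struct rwlock *rw) {
    atomic_init(&rw->guard, 0);
    atomic_init(&rw->lock, 0);
    rw->nreader = 0;
}

/* Writers just need mutual exclusion, so they use the lock directly. */
static void write_lock(struct rwlock *rw)   { spin_lock(&rw->lock); }
static void write_unlock(struct rwlock *rw) { spin_unlock(&rw->lock); }

static void read_lock(struct rwlock *rw) {
    spin_lock(&rw->guard);
    rw->nreader++;
    if (rw->nreader == 1) {
        spin_lock(&rw->lock);   /* first reader in: shut writers out */
    }
    spin_unlock(&rw->guard);
}

static void read_unlock(struct rwlock *rw) {
    spin_lock(&rw->guard);
    rw->nreader--;
    if (rw->nreader == 0) {
        spin_unlock(&rw->lock); /* last reader out: let writers back in */
    }
    spin_unlock(&rw->guard);
}
```

One consequence of this simple design is that a steady stream of readers can keep `nreader` above zero indefinitely, so a waiting writer may have to wait a long time.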