So first off, an apology. On quiz day I had office hours and was wondering why no one showed up, and that night someone told me I never posted where my office was. So sorry about that. It's BA 5110. I updated the Piazza post and I put it on Discord. I did not mean to not tell you where my office was, so hopefully people actually show up now.

Quiz one seemed to go fairly well. The average so far is about 83. The two short-answer questions haven't been graded yet, but hopefully they'll be graded tonight and then you can get your quiz back. Sorry? No, no, I can just see it. 80 percent of it is graded (I had to go through and manually grade some of it), and it's about 66 out of 80 so far.

All right, so today let's talk about semaphores. We saw locks before, and they ensure mutual exclusion, which again means that if you have a lock and an unlock around a critical section, only one thread can be in the critical section at any given time. So if we have accesses to the same memory location, there can only be one thread in there at a time, so there cannot be any data races. For a data race we need two concurrent accesses with at least one being a write. But this does not help you ensure ordering between threads at all. So how would we ensure ordering between threads?

Let's take a quick look at our example for today. I will create two threads, thread zero and thread one, and I will have one run print_first and one run print_second. Say I want print_first to always go before print_second. The question is what I should do. To illustrate it, we can execute it a few times: this time it worked, no bugs, no bugs, and then bugs, right? So given this implementation and what we already know, how would I ensure that I always see
"This is first" and then "I'm going second"?

Yeah, I could sleep one of the threads, but that would be slow. There are ways I could do it that would be faster, but possibly not ideal. Yeah, if I had yield and controlled the scheduler, I could just always make first go first and then yield, and that would be the only order. But let's assume pthreads, where we don't have yield or any of that.

So let's go down. I create two threads and then I join them, right? What would happen if I instead created the first thread, joined it, and only then created the second? If I do that and compile, it should be fine: first, second, first, second. I create a thread and join it, so I only create the thread that runs print_second after the first thread is dead. That's one way I could do it, but then it's just single threaded at that point. It also doesn't help if I just want that one print statement to go, then the next print statement, and then everything else after that to be done in parallel. This does not help with that, because it essentially made our program single threaded.

All right, so let's see what we can do. Oh, also, there's another solution we could try involving locks. So what about if I just did this crazy thing?
You're not allowed to unlock an already unlocked lock, but you can call lock twice. So what if I created a mutex, and in print_first I have an unlock at the end with no paired lock, which looks kind of weird, and then in print_second I have lock, lock? The thinking is that if print_second gets scheduled first, it calls mutex lock, and the mutex is currently unlocked, so that call passes. Then it calls lock again. The lock is locked, so it can't get it, and it's stuck at that call. Later, hopefully, the other thread gets scheduled, prints "This is first", and unlocks the lock. Now the second lock call can go through and print "I'm going second".

Sorry? Yes. So there is an unlucky combination here. This actually works most of the time, but remember I said that unlocking an already unlocked lock is undefined behavior. What could happen is that print_first gets scheduled first. It prints "This is first" and then unlocks immediately, and that's undefined behavior. Or, say unlock on an already unlocked lock did nothing. Then the other thread would call lock and then lock again, and it wouldn't progress past that. It would just sit there forever, because nothing else is ever going to call unlock.

If I execute that, it kind of works. It worked, it worked, it worked, and then this is a run where the second thread actually got stuck because of the two lock calls. All right, any questions about that?
Because that almost kind of works, and it's kind of the idea we'll be going off of today.

So again, here's our problem. Just as a note about functions: printf is actually thread safe, and that is not always a given. Say, for instance, that instead of a single printf that happens all at once (it kind of looks like the atomics we saw last time), our print function was separated into three different calls. So I have three write calls for "This is first", and in the second thread three for "I'm going second". If I execute that (sorry, safe print), that run looks kind of normal, except thread two got scheduled first and then thread one. But if I execute it a few times I'm going to get garbled output, because a thread can get preempted between the calls.

Yeah, so thread safe means no data races. Two threads can call it and one will just happen before the other. You won't get garbled output like this. As an exercise for later, you could take this version and make it thread safe by creating one mutex for both of them and putting a lock and an unlock around the calls.

Yeah, printf in three separate calls would give similar results, except that, if you know the implementation, printf buffers its output. It waits until a newline to actually do a write call. But that's a super implementation-detail thing.
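The exercise mentioned above could look roughly like this. It is only a sketch: the names `safe_print`, `write_all`, `run_safe_print`, and `out_fd` are made up for illustration and are not from the lecture code. One mutex is shared by both threads, and the lock and unlock go around all three write calls so a thread cannot be interleaved partway through its message.

```c
#include <pthread.h>
#include <string.h>
#include <unistd.h>

/* One mutex shared by every caller, so the three writes act as one unit. */
static pthread_mutex_t print_mutex = PTHREAD_MUTEX_INITIALIZER;
static int out_fd = STDOUT_FILENO; /* hypothetical: where the demo writes */

/* write() may write fewer bytes than asked, so loop until all are written. */
static void write_all(int fd, const char *s) {
    size_t left = strlen(s);
    while (left > 0) {
        ssize_t n = write(fd, s, left);
        if (n <= 0) return; /* give up on error in this sketch */
        s += n;
        left -= (size_t)n;
    }
}

/* Print one message as three separate write calls, protected by the mutex. */
static void safe_print(const char *a, const char *b, const char *c) {
    pthread_mutex_lock(&print_mutex);
    write_all(out_fd, a);
    write_all(out_fd, b);
    write_all(out_fd, c);
    pthread_mutex_unlock(&print_mutex);
}

static void *print_first(void *arg) {
    (void)arg;
    safe_print("This ", "is ", "first\n");
    return NULL;
}

static void *print_second(void *arg) {
    (void)arg;
    safe_print("I'm ", "going ", "second\n");
    return NULL;
}

/* Runs both threads writing to fd. Returns 0 on success, -1 on failure. */
int run_safe_print(int fd) {
    pthread_t t[2];
    out_fd = fd;
    if (pthread_create(&t[0], NULL, print_first, NULL) != 0) return -1;
    if (pthread_create(&t[1], NULL, print_second, NULL) != 0) {
        pthread_join(t[0], NULL);
        return -1;
    }
    pthread_join(t[0], NULL);
    pthread_join(t[1], NULL);
    return 0;
}
```

Calling `run_safe_print(STDOUT_FILENO)` from `main` prints the two messages in either order, but each message stays in one piece. The mutex only fixes the garbling; it does nothing about which thread goes first.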
Yeah, because internally printf will have a mutex or something, so it won't be cut off in between and you won't see weird output like this. A write call, generally, if it's under a certain size, just always happens in full. But if you tried to write out a megabyte or something like that, you might only get a bit of it written at a time. Remember that the write system call returns how many bytes actually got written, so you might need to check that. But again, that's another huge implementation detail.

So let's see what we can do. We can use something called a semaphore, which is used for signaling. Unfortunately this is an overloaded term with Unix signals, but here it's the actual English word: we want to wave you down, like an air traffic controller or something like that.

There are only three things you need to know about semaphores. They have a value, and it's shared between threads. Optionally you can use one to synchronize between processes as well, but for this course we'll just use it between threads. You can think of the semaphore as a value that's just an integer that will always be zero or greater, so it will never be negative. Then it has two fundamental operations: wait, which decrements the value, and post, which increments the value. They're both atomic operations.

The only way this differs from actually incrementing and decrementing a value is that wait does not return until the value is greater than zero. So if the current value is zero and you call wait on it, it will just sit there and block until something posts that semaphore and increases it to one. Then the wait would decrement it from one to zero and return. So the fun thing with semaphores is that they just have those three operations. It's just the value you set, and then increment and
decrement. You can initially set the value to whatever you want. You can also think of the initial value this way: if you don't post or increment it at all, it's the number of wait calls that may go through, and so the number of threads that could possibly be doing something in parallel.

The API is kind of similar to pthread locks. There's an init and a destroy, then wait, and then there's a try wait, which won't block. With try wait you can just see whether you decremented the semaphore or not, and maybe do something different depending on whether you actually got it. Post just increments it. All functions return zero on success, and generally no one ever checks the return value except for try wait. Init takes the address of the semaphore and then this pshared argument, which is zero if you don't care about sharing between processes and one if you do. For the purposes of this course it'll always be zero. So it's the semaphore, then pshared, and then your initial value goes here.

All right, so knowing that, let's try to make the threads always print "This is first" and then "I'm going second". Let's go to the ordered print example. The first thing I want to do is create my semaphore, so I could do something like this. Oops, I didn't include the header. Okay.

So if I have these two functions and I want first to go first, and I have a wait call that will wait on something, where should I put sem_wait? In print_first? Okay, so sem_wait right here. And when I create it with sem_init, what should my initial value be? I hear one. It can't be negative; it has to be zero or greater. So far we just have one wait call, right here. Okay, so one. All right, should we have a post anywhere?
Maybe after this one. Okay.

So if we do that, let's just try it a few times. And it already didn't do what we want. So let's argue about what just happened. The initial value is one, right? Say print_first gets scheduled first. It makes it to the wait, and the current value is one, so it can decrement it without actually having to wait for anything. It decreases it from one to zero, prints "This is first", and then posts it back from zero to one. But the other thread could go at any time, and we're not doing anything to prevent the second thread from going. If it gets scheduled first it just hits its printf and prints first, and then we don't have our order anymore, right?

There's one comment that said to change it to this: post in print_first and wait in print_second. Okay, so let's also change the initial value to zero and see what it does now.

Now the initial value is zero. If print_first gets scheduled first, it prints "This is first", which is good, and then posts the value from zero to one. Then print_second gets scheduled, hits the wait, and goes from one to zero, which is fine. It doesn't have to wait for anything, and then it prints "I'm going second". That's one ordering.

Now let's see what happens if we happen to schedule print_second first. The initial value is zero, so when we hit sem_wait the current value is zero and I can't decrement it, so I'm just going to wait. At some point print_first gets scheduled, which finally prints "This is first" and posts the value from zero to one. It returns and that thread is dead. Then we come back to the second thread, which can now decrement the value from one to zero and print "I'm going second". So if we compile that, we'll get the right order every time, because we just argued about both orderings of the threads.
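A minimal sketch of the version argued about above, assuming Linux with unnamed POSIX semaphores. The `order_log` array and the `run_ordered_print` wrapper are hypothetical additions so the order can be checked, and the second thread is deliberately created first to make the bad schedule more likely.

```c
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>

static sem_t sem;        /* initial value 0, so print_second has to wait */
static int order_log[2]; /* hypothetical: records who printed, in order */
static int order_len;

static void *print_first(void *arg) {
    (void)arg;
    printf("This is first\n");
    order_log[order_len++] = 1;
    sem_post(&sem); /* 0 -> 1: lets print_second past its wait */
    return NULL;
}

static void *print_second(void *arg) {
    (void)arg;
    sem_wait(&sem); /* blocks while the value is 0, then 1 -> 0 */
    printf("I'm going second\n");
    order_log[order_len++] = 2;
    return NULL;
}

/* Runs both threads once. Returns 0 on success, -1 on failure. */
int run_ordered_print(void) {
    pthread_t t[2];
    order_len = 0;
    if (sem_init(&sem, 0, 0) != 0) return -1;
    /* Create the second printer first, on purpose. */
    if (pthread_create(&t[0], NULL, print_second, NULL) != 0) return -1;
    if (pthread_create(&t[1], NULL, print_first, NULL) != 0) {
        sem_post(&sem); /* unblock print_second so the join can finish */
        pthread_join(t[0], NULL);
        return -1;
    }
    pthread_join(t[0], NULL);
    pthread_join(t[1], NULL);
    sem_destroy(&sem);
    return 0;
}
```

The writes to `order_log` do not race: the second thread only touches it after its wait returns, which can only happen after the first thread's post.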
Yep? Yeah, so try wait will just try to decrement it and then tell you whether or not it did. Wait blocks: it blocks until it can decrement. If you don't want to block there and you just want to try, you can use try wait, and the return value tells you whether or not it was able to decrement. So in the normal case, wait, you'd get blocked if the value was zero. In the try wait case, if the value is zero and you call try wait, it's just going to say you didn't decrement it, sorry.

Yeah, it's exactly like when we argued about data races. We have to argue about all the interleavings.

Yeah, all the semaphore operations are atomic. No, semaphores are their own primitive, so there's going to be some hardware support or some way to implement them. No, it's guaranteed to be atomic. If two threads are both waiting on a zero and then something posts, only one is going to get by the wait. We don't know which one, and that's the crux of everything. It will increment and decrement the value atomically, so it won't lose any posts or waits, but we don't know the order.

Okay, cool, so we fixed it. This is just in the notes so you can see it and we can argue that it will always do the printf in print_first first and the printf in the second thread second. This is us arguing that no matter which thread gets executed first, we get the same order.
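The try wait behavior from the question above can be sketched in a few lines. The helper name `try_take` is made up for illustration. `sem_trywait` returns 0 when it decremented the semaphore, and returns -1 with `errno` set to `EAGAIN` when the value was already zero.

```c
#include <errno.h>
#include <semaphore.h>

/* Returns 1 if the semaphore was decremented, 0 if it could not be. */
int try_take(sem_t *s) {
    if (sem_trywait(s) == 0) {
        return 1; /* value was above zero and is now one lower */
    }
    /* errno is EAGAIN if the value was zero; we never block here. */
    return 0;
}
```

A caller can branch on the result and do other work instead of blocking when the semaphore is at zero.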
So what happens if we initialize the value to one instead of zero, though? Yeah, it doesn't work. If we initialize it to one, then if print_second happens to get scheduled first, it hits the wait and can get past it by changing the value from one to zero. It calls printf, and then print_first goes and just changes the value from zero to one. The other way it could go is that print_first gets scheduled first, so it posts the value from one to two, and then print_second decrements it from two to one. Right.

So here's a question. Can we use a semaphore as a mutex? Because they kind of look similar right now. No? Kind of. Wait kind of looks like lock, right? Wait will hold back other threads as long as the value is zero. Okay, so can we use this as a mutex if we use it properly? Yeah. And then what would the initial value be? One. Right.

So this is our counter from last time that we used a mutex for, but we can use the semaphore as a mutex. It's actually kind of a special case. If we initialize it to one, it behaves exactly like a mutex as long as we replace lock with wait and unlock with post. If it's one, the first thread that makes it to the wait decrements it from one to zero, and if any other thread tries to get by the wait, the value is zero, so it's just going to sit there and block. Whatever thread decremented it gets to increment the counter, and at the end it posts, going from zero to one, and then another thread can get by the wait.

But this is actually kind of subtle, because even once we've replaced lock with wait and unlock with post, the initial value is key to what actually happens.
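As a rough sketch of that replacement (not the actual counter program from last lecture; `run_counter` and the 64-thread cap are assumptions made here), a semaphore initialized to one protects the shared counter exactly where the mutex used to:

```c
#include <pthread.h>
#include <semaphore.h>

static sem_t sem;      /* initial value 1: at most one thread gets past wait */
static long counter;   /* shared counter protected by the semaphore */
static int iterations; /* increments per thread */

static void *run(void *arg) {
    (void)arg;
    for (int k = 0; k < iterations; k++) {
        sem_wait(&sem); /* takes the place of pthread_mutex_lock */
        counter++;
        sem_post(&sem); /* takes the place of pthread_mutex_unlock */
    }
    return NULL;
}

/* Returns the final counter value, or -1 on bad arguments or failure. */
long run_counter(int nthreads, int iters) {
    pthread_t t[64];
    if (nthreads < 1 || nthreads > 64 || iters < 0) return -1;
    counter = 0;
    iterations = iters;
    if (sem_init(&sem, 0, 1) != 0) return -1;
    int started = 0;
    for (; started < nthreads; started++) {
        if (pthread_create(&t[started], NULL, run, NULL) != 0) break;
    }
    for (int k = 0; k < started; k++) {
        pthread_join(t[k], NULL);
    }
    sem_destroy(&sem);
    return started == nthreads ? counter : -1;
}
```

With the initial value at one, `run_counter(8, 10000)` should always come back as 80000. Changing the `1` in `sem_init` to `0` makes every thread block at its first wait, and changing it to `2` lets two threads into the critical section at once.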
So what if we made this initial value zero? Nothing, right. If we made the initial value zero, then the first time a thread calls wait it's just going to sit there and get blocked. Everything's going to get blocked, because nothing is ever going to be able to call post. They're all blocked and they can't make any progress.

Yeah, until one of the other threads can decrement it, if the value is one initially. It's just like an atomic value. If the initial value is one and, say, eight threads make it here, one of the threads (you don't know which) is going to decrement it from one to zero. So there are going to be seven other threads waiting here, and one thread's going to make it by and execute this line. As soon as that one thread is done, it posts the semaphore back up to one, and then one of those seven threads gets it next.

Yep? Yeah, if you want, you could use it as something atomic, just an atomic increment and decrement, if you can make it fit. But it can't be negative. You could use it as an atomic counter if you wanted.

So if this value is two instead of one, then two threads could make it by the wait, right? We could have two threads executing this line. We'd have limited the parallelism to two, but now we have data races too, because all you need is two concurrent accesses. Yeah, you're not supposed to.

So if I added a print_third here and wanted to order it too, how would I do that? Anyone want to hazard a guess? Close. So what if I create another semaphore?
Yeah, so I could have one semaphore per ordering. This one is my first semaphore, with its post and wait. Then I can have another, different one. Say print_third looks the same: print_second will post that new semaphore here, and print_third has a wait on it here. If print_second always goes second and then posts a semaphore whose value is initially zero, then print_third has to wait until that's done before it can execute, right? Yeah, the easiest way to do it is that if I have some number of threads, I use that number minus one semaphores and just cascade them like that.

Well, that's just ensuring order, which kind of makes it not parallelizable. So it's only if you're doing some operation where the dependencies matter that you might want to do something like this. Okay.

All right, so let's come up with the solution to this next problem. Here is our producer and consumer. We'll assume we have a circular buffer where each slot is either empty or filled, and initially it's all empty. We have a producer that writes to the buffer if the buffer is not full; ideally we don't want to overwrite information. The consumer should read from the buffer and then do something with the information it extracts, as long as the buffer is not empty. It shouldn't be reading empty data that hasn't been initialized yet.

One of the ways you can implement this is a circular buffer where all consumers share an index and all producers share an index, and both indexes are zero initially. So if they're both pointing at the first slot and we don't want the consumer to read empty data, then the producer should come first. There's already some kind of order that we want.

So let's see what the code looks like. I'll skip some code and just show the producer and the consumer. I essentially have a semaphore that acts as a counter.
This is just meant to cap the number of items that we have, so you can ignore it. In a real program it'd probably be an infinite loop, or it would wait until I know I've cleared all the data from a device or something.

In the producer, we sleep to simulate that we're producing some data, doing some work, and then we fill a slot. Initially that would be slot zero. Whenever we have data we put it in that slot, then go on to slot one, two, three, four, and circle back once we hit the end.

The consumer is the same but the other way around. It essentially loops over and over again to empty a slot. Once it has done that it should have some data, and then it can do all of its work, hopefully in parallel. So we're just getting data from a buffer and then simulating doing some work, which we should be able to do in parallel while the producer is also doing some work.
Yep? So try wait just returns whether or not you decremented it: zero if you did. We don't have to worry about this loop that much. Say I want to produce 15 elements. I set the semaphore's initial value to 15 and loop over and over, calling try wait to decrement it, until it doesn't decrement anymore. So the loop runs while I successfully decremented it.

Okay, so if I run it, I have a few arguments: the number of producer threads, the number of consumer threads, how many elements I want to produce, the size of the buffer, and how long producing takes. Even if I just use one of each thread, I get red text that flags a conflict. If I look at it, the first thing it does is try to empty slot zero, which is already empty, so that's an error. It takes a while to produce some data, so then it fills slot zero, but the consumer has already tried to empty it. The consumer goes on to empty slot one, which is also still empty, and then the producer comes in and fills it, and on and on until it finally kind of catches up. At some point the filling goes faster and the producer overwrites a slot that is already filled, so the consumer wasn't able to read it yet.

So let's try to fix that. The easiest thing to think about first is what we want to wait on. Should the producer or the consumer go first? Producer, sure. Okay, so we can go the other way: we know the consumer probably should go first. Oh, no, sorry, it should go second. So in the consumer, a post? No, we want to wait. And then we have to give the semaphore a name, a descriptive name.
We want to wait for a filled slot, so let's call our semaphore filled_slots. Then we can create one: a static sem_t filled_slots. And what's our initial value? Zero, right, because we want to prevent the consumer from running immediately. If the consumer gets scheduled first, we want the wait call to sit there and block it. So we all agree that should be zero.

Then, if I want to keep track of the number of filled slots I have, I should have a post after I fill a slot. I filled a slot, so I increase it by one. All right, everyone on the same page?

So hopefully if we compile that and run it with just one thread of each: hey, no problems. We waited, we always filled a slot before we emptied it, and we kept track of them. There are still subtle bugs. If I run it with more threads I will probably see some more bugs. Or not. Let's produce a lot more, so 25 things. Okay, let's be lucky. All right, usually it has a data race. 15 items, buffer size 10. Let's make our producer take longer. Okay, we're too good at coding. There's a case where it's not going to work, and apparently we're not hitting it.

But we should protect the other direction in case it were to happen. The other thing we want is to make sure we don't fill an already filled slot. We only want to put data into an empty slot. So we can use another semaphore for empty slots, and we need a wait in the producer. Say the buffer size is only 10 and the producer produces 15 elements in a row. We want to make sure it produces at most 10 until the consumer actually catches up with it, right?
So here we're going to wait for an empty slot. We need to wait here, and we need to make the initial value of empty_slots equal to the size of the buffer, so buffer size. Now we're waiting for an empty slot, and initially there are buffer-size of them. So if the producer keeps getting scheduled, it can fill the buffer completely, which is good. The producer waits for an empty slot, fills a slot, and then posts that it filled a slot.

In the consumer I wait for a filled slot, and then after I empty a slot I'm missing something. Yeah, a sem_post on empty_slots. That makes sure I increase it, so if the producer fills up the whole buffer, it waits for the consumer to empty a slot, which increments empty_slots from zero to one, and then the producer can come in and fill the buffer back up.

So let's compile that. Of course, no matter how many threads we use, we're going to see all the good output, because we already saw good output before. Even if I do 15 threads of each, it's just going to immediately run in parallel as much as it can, right? Any questions on that?

Yeah, up to the buffer size. Because I had 15 threads of each and by default the buffer size is 10, I'm essentially wasting time creating threads that I know I don't really need. I have five extra threads that aren't allowed to go in parallel, because I can only go in parallel up to the buffer size. So this would be quicker as long as you can run in parallel as much as you can, and past that you're just going to have some overhead from creating threads you don't really need. Unless of course they block or something like that, in which case there is a use for them.

All right, cool. So here is the solution for your slides.
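A minimal sketch of the two-semaphore idea, not the lecture program itself. To keep it short it assumes exactly one producer and one consumer, so each index is private to its thread and needs no extra protection; with several producers or consumers the shared indexes would need a lock as well. `BUFFER_SIZE`, `run_pipeline`, and the in-order check are assumptions made for this example.

```c
#include <pthread.h>
#include <semaphore.h>

#define BUFFER_SIZE 4

static int buffer[BUFFER_SIZE];
static sem_t empty_slots;  /* starts at BUFFER_SIZE: slots free to fill */
static sem_t filled_slots; /* starts at 0: slots ready to be emptied */
static int total_items;
static int in_order;       /* how many items arrived intact and in order */

static void *producer(void *arg) {
    (void)arg;
    int index = 0;
    for (int item = 0; item < total_items; item++) {
        sem_wait(&empty_slots); /* never overwrite a filled slot */
        buffer[index] = item;
        index = (index + 1) % BUFFER_SIZE;
        sem_post(&filled_slots); /* one more slot is filled */
    }
    return NULL;
}

static void *consumer(void *arg) {
    (void)arg;
    int index = 0;
    for (int n = 0; n < total_items; n++) {
        sem_wait(&filled_slots); /* never read an empty slot */
        if (buffer[index] == n) in_order++;
        index = (index + 1) % BUFFER_SIZE;
        sem_post(&empty_slots); /* the slot can be reused */
    }
    return NULL;
}

/* Pushes `items` integers through the buffer; returns how many arrived
 * in order, or -1 on failure. */
int run_pipeline(int items) {
    pthread_t p, c;
    if (items < 0) return -1;
    total_items = items;
    in_order = 0;
    if (sem_init(&empty_slots, 0, BUFFER_SIZE) != 0) return -1;
    if (sem_init(&filled_slots, 0, 0) != 0) return -1;
    if (pthread_create(&p, NULL, producer, NULL) != 0) return -1;
    if (pthread_create(&c, NULL, consumer, NULL) != 0) {
        /* Without a consumer the producer could block forever, so stop here. */
        pthread_cancel(p);
        pthread_join(p, NULL);
        return -1;
    }
    pthread_join(p, NULL);
    pthread_join(c, NULL);
    sem_destroy(&empty_slots);
    sem_destroy(&filled_slots);
    return in_order;
}
```

If both semaphores were initialized to zero, both threads would block at their first wait and `run_pipeline` would never return.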
The first part was ensuring producers never overwrite filled slots, and for that we initialized the value to the buffer size. The other part was ensuring consumers never consume empty slots, and we set that value to be zero initially, because initially we have zero filled slots. So we used two semaphores and we ensured proper order between producers and consumers.

All right, so what would happen if I initialized both of the semaphore values to zero? Yeah, if they're both zero, then no matter what got scheduled first, it would hit a sem_wait on a zero, and both of them are zero, so we'd never make any progress. That's called a deadlock. We'll see that in the next lecture.

All right, so we used semaphores to ensure proper order. We saw mutual exclusion before, but now we can ensure order. Semaphores only have three things: they contain an initial value, you can increment them, and you can decrement them. The only caveat is that the decrement will block until the value is above zero. And you still need to prevent data races in this case.

So let's quickly do a midterm question example that has to do with getcontext, which might help for lab two. This is from a midterm. Ashen went over it, so I will go over it too. Hopefully it helps in lab two.

The question says: you know that implementing thread switching is a little tricky. To simplify matters, you start by implementing thread switching between just two threads, A and B, as shown below. You've already added A and B to the run queue and initialized their thread contexts so they will start executing the thread A and thread B functions when they run for the first time. Your scheduler happens to run thread A first.

This is cooperative, so I will use green to mark where each thread is going to execute if it were to be scheduled. Thread B got initialized to just start executing the thread B function, and it says that thread A is already executing.
So thread A would start executing here. Then the question says: circle the output you expect to see. Hint: think carefully about what is saved by the getcontext function. The getcontext function just saves all the registers, right? So let's step through it, knowing what getcontext does, and explain what happens.

Thread A starts executing. It creates a variable called d on the stack and initializes it to zero, so thread A has d equal to zero. It goes to this while loop and checks that i is less than three. i is up here: it's a global variable and its current value is zero. So it goes into the while loop and increments i from zero to one. Then it executes the next line, which is a printf of A and whatever the value of i is. So this will print "A 1", right?

Now this is where it gets tricky. It sets d equal to zero. It's already zero, so that doesn't matter. Then it calls getcontext for thread A. Let's put in green where it would resume if someone calls setcontext on ua: it would resume right there, right after the getcontext call. Let's keep that in our back pocket. Then it checks this if condition. d is equal to zero, so it sets d equal to one. (That is a big eraser.) So d is now one, and it's on the stack, right? Then it calls setcontext, and thread B becomes the active one. Just as a reminder of what the value of d is for thread A, because again it's on the stack so it's local, we'll write it right here and keep track of it for when A resumes.

Yeah, we're assuming cooperative threading, so it's not going to get preempted. Yes. Yeah, it depends on how the compiler uses that register.
Generally it does that if it's doing some optimizations, and it might reuse that register or do odd stuff with it. volatile just forces the variable to stay on the stack. If you really need it to be on the stack you can do that, like in lab two. The compiler gets to decide. If you take the address of that variable, then the compiler can't remove it, so it has to be on the stack. You'll see that in some compiler course, if you really like compiler courses. If you don't take the address of something, the compiler is free to just keep it off the stack.

Yeah, in this case it would be the stack space of ua. But thread A's stack still has that variable d. We called getcontext and then we changed d afterwards, here. No, getcontext just saves the context, and then if you call setcontext on it, it'll resume. That's why I have the green there: if you call setcontext on ua, this is where it would start executing from again. No, d does not get reset to zero, because it's on the stack, right? It doesn't re-initialize the stack, because we're not calling the function again. So d has whatever value you last changed it to, which is one.

All right, so then it called setcontext to thread B, which is freshly initialized. I'll make that one blue. Thread B is here. It allocates a d on its own stack, which is private to itself, and that value is one. It goes to this while loop and checks i, and i is equal to one, so it increments it from one to two. Then we hit the printf, so this time we'll see "B 2". Then it sets d from one to one and does a getcontext for thread B. So we'll put a little note to ourselves here: if you call setcontext on ub, that's where it would resume.
Yeah, but it lives on the stack. No, it only saves the registers, right? The context just saves the registers, so it saves the stack pointer, but we changed the value on the stack. This is where the question gets tricky: getcontext and setcontext just save and restore registers. If something is on the stack, it's just going to live there unless you leave that stack frame or do something to it.

All right, so thread B comes here. If d is equal to one, which it is, it changes d from one to zero, and then it's the same thing we saw before. d is now zero, and it calls setcontext on ua, which resumes back there. So let's write that down for thread B: when it resumes, d is equal to zero, because that was the last value it set on its stack.

Thread A is now the active thread again. For thread A we have d equal to one and its resume point, so we resume at this point. Yeah, you just assume that the compiler doesn't do anything weird. Now d is one, so this if statement won't go through. It comes back up and checks while i is less than three. i is two, so it's all good. It increments i from two to three and hits the printf again, so we'll see "A 3". Then it sets d equal to zero and calls getcontext for thread A, so it would resume there if it were scheduled again, or if you called setcontext on it. d is zero, so it goes in here, sets d equal to one on its stack, and calls setcontext to ub. So again, for A's d on its stack, the last thing we did is make it one.

Now thread B executes again. d is zero, so it's not going to go into this if statement. I'll just keep it green for now, but this is the current active thread that I'm moving around. It exits the if and goes up to the while loop: while i is less than three. i is currently three.
So it falls out of the while loop and exits, and then our program's done. We get "A 1", "B 2", "A 3", which corresponds to answer F. Yay.

Yeah, in this case thread A just kind of gets stuck. But yeah, they should all exit.

All right, so that's time. Just remember, I'm pulling for you. We're all in this together.
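For anyone who wants to replay the walkthrough, here is a reconstruction pieced together from the spoken description, not the actual exam code. It assumes glibc's ucontext functions on Linux. The variable `d` is declared `volatile` so it stays on the stack, matching the assumption made in the discussion, and the trace buffer, stack sizes, and `run_midterm_demo` wrapper are hypothetical additions.

```c
#include <stdio.h>
#include <ucontext.h>

static ucontext_t main_ctx, ua, ub;
static char stack_a[64 * 1024], stack_b[64 * 1024];
static int i; /* the global counter from the question */
static char trace[64];
static int trace_len;

/* Append e.g. "A1 " to the trace instead of printing directly. */
static void record(char who) {
    trace_len += snprintf(trace + trace_len,
                          sizeof trace - (size_t)trace_len, "%c%d ", who, i);
}

static void thread_a(void) {
    volatile int d = 0; /* lives on A's stack; getcontext does not save it */
    while (i < 3) {
        i++;
        record('A');
        d = 0;
        getcontext(&ua); /* A resumes here after setcontext(&ua) */
        if (d == 0) {
            d = 1;
            setcontext(&ub);
        }
    }
}

static void thread_b(void) {
    volatile int d = 1; /* lives on B's stack */
    while (i < 3) {
        i++;
        record('B');
        d = 1;
        getcontext(&ub); /* B resumes here after setcontext(&ub) */
        if (d == 1) {
            d = 0;
            setcontext(&ua);
        }
    }
}

/* Runs thread A first and returns the trace once thread B finishes. */
const char *run_midterm_demo(void) {
    i = 0;
    trace_len = 0;
    trace[0] = '\0';

    getcontext(&ua);
    ua.uc_stack.ss_sp = stack_a;
    ua.uc_stack.ss_size = sizeof stack_a;
    ua.uc_link = &main_ctx; /* where to go if thread_a ever returns */
    makecontext(&ua, thread_a, 0);

    getcontext(&ub);
    ub.uc_stack.ss_sp = stack_b;
    ub.uc_stack.ss_size = sizeof stack_b;
    ub.uc_link = &main_ctx; /* thread_b returns here when its loop ends */
    makecontext(&ub, thread_b, 0);

    swapcontext(&main_ctx, &ua); /* the "scheduler" runs thread A first */
    return trace;
}
```

Printing the returned string from `main` with `printf("%s\n", run_midterm_demo());` shows the same sequence as the walkthrough. Thread A never returns from its function here, which lines up with the remark that it just gets stuck.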