All right, good morning. So first is, I guess, what's on quiz two. Quiz two won't have anything that was on quiz one, because you've already done that and we all did fairly well on it. So this will all be processes, locks, threads, all that fun stuff, probably some stuff involving exec, wait, and then threads, maybe a little bit about signals, but that's pretty much it. Based off that, there are a lot of questions we can ask, because this stuff gets confusing very quickly, especially if it's your first time. That's why we introduced it well beforehand, and now we can play around with real examples and kind of argue about them. In the real world it's not always super obvious, and there are always going to be some trade-offs. So let's go. I like doing actual examples, so that's what we'll be doing today. We know enough about coding that we can successfully run a bank, so let's do some financial code. Let's assume we're running a bank and we're doing 100 million transfers, which isn't that many. I wrote a little simulation that can create a varying number of accounts, just to see how that affects things. Each account has a unique ID associated with it and a balance, and every account starts with 1,000 bucks. When we run our simulation, the 100 million transfers just randomly pick two accounts to transfer money between, and whoever gets picked to send money sends 10% of their total balance to the other person, and that's it. You send 10% of your current balance, which subtracts that from your balance and adds it to someone else's balance. And then, just to make things a bit more realistic so that the times actually make sense, there's a "securely connect to bank" function we have to call before starting a transfer, and essentially it just wastes some time. So that's it.
We're now going to go over the question. We'll hit data races and deadlocks, because we want to parallelize the bank transfers, and we'll see why we end up in this situation. This is essentially going to be like half the quiz covered in this one example. Our goal is to be faster than 11 seconds. If you want to find the template, it's in lecture 17 in the examples repository, and this is how you compile it and run it. There's a typo on the slide, that should be lecture 17, not 16, and then to run the program you just run the bank sim. So let's go over it real quick. Again, we're simulating, was it, 100 million transfers. Let's see how long it takes. Say we just have a thousand bank accounts. If there are a thousand bank accounts with $1,000 each, that's a total of a million dollars we're managing, small bank, and at the end of all the transfers, what should the balance of the bank be? A million. We shouldn't lose money just by transferring; the transfers are all a zero-sum game. So if I run that, I also print out how much memory is used for the accounts, because it will become an issue if I have too many accounts. But here we have a million dollars, and at the end of all the transfers we still have a million dollars, and it took about 11 seconds. So let's read the code for those coming in. We're running a bank, we have a bunch of accounts that have money, and we're simulating 100 million transfers between pairs of accounts. So let's look at the code. There's a starting balance and a number of transfers, and we're going to make this threaded, so that's why I have the number of threads defined there; I'll use it later. Here's the struct that represents the account: it has an identifier and it has a balance. Makes sense. And then here's a function we can ignore that simulates connecting to the bank. Here's our transfer function. It's already been given the account to send from and the account to send to.
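For reference, the compile-and-run step is just an ordinary pthread build; the file and binary names here are assumptions, not necessarily the exact names in the examples repository:

```shell
# Build the simulator with pthread support and run it; the single
# command-line argument is the number of accounts. Names are assumed.
gcc -O2 -pthread bank_sim.c -o bank_sim
./bank_sim 1000
```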
So we securely connect to the bank, and then we figure out the amount we have to transfer, which is just 10%. We divide the current balance by 10, and it's integer division so it rounds down, and that way we don't send negative money or anything weird like that. Then we subtract the amount from the from account and add it to the to account. Zero-sum game. Then there's our run function, which we don't have to care about yet. In main, it makes sure we have an argument, the number of accounts we should have. This parses the number, turns it into something we can actually use like an unsigned int, and then allocates the accounts. We print how much memory we're using. And then right here is where we initialize everything: each account gets its own unique ID that starts at one, just because most things start at one and don't have an ID of zero, and then we set the starting balance, so everyone starts with a thousand bucks. Since I know everyone starts with a thousand bucks, I can just calculate what the bank's initial funds are: the starting balance times the number of accounts we have. That's code we'll use later. And then this is the loop we want to parallelize; it simulates all our transfers. For the number of transfers, so a hundred million, we generate a random from index and a random to index, and then we just call that transfer function we saw before that actually does the transfer. Afterwards, this is our sanity check: we check that the bank still has the same amount of money it started with. Total balance starts at zero, we sum up every single account's balance, and hopefully we didn't lose money, otherwise that would be really, really bad and our bank would probably be doomed to failure at that point, and we print out what the final funds are. Make sense to everyone? Okay, so if I want to parallelize this loop and make my bank run faster, what should I do? Threads, yes.
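A minimal sketch of the account struct and transfer function just described; the field and function names are assumptions, not necessarily the exact lecture code:

```c
/* Sketch of the lecture's account and transfer; names are assumed. */
typedef struct {
    unsigned int id;       /* unique, starts at 1 */
    unsigned int balance;  /* everyone starts at 1000 */
} account_t;

/* Move 10% of the sender's balance. Integer division rounds down,
   so we never send a negative or fractional amount; the total money
   in the bank is unchanged (a zero-sum game). */
void transfer(account_t *from, account_t *to) {
    unsigned int amount = from->balance / 10;
    from->balance -= amount;
    to->balance   += amount;
}
```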
Okay, so I want to parallelize this loop, so I'll comment it out. We want to make threads. Would it make sense to make processes instead of threads? Why not? Yeah, there's going to be a lot of communication, because the transfer amount depends on the current account balance, so the transfers aren't completely independent. If you split off an eighth of the transfers into each process, they might have outdated data, because one process might transfer money out of an account, and if another process also transfers money out of that account, since they're all independent copies, they'd have wrong values. To make them right, you'd have to communicate at every step so every process gets updated, and all that communication has to go through the kernel, which is going to be slow. With that much communication it's just not going to be worth it. You can do it, but it's going to be slow as a dog. So we're going to use threads: they're all in the same address space, we get sharing by default, it's all nice and quick. This is the loop we're going to parallelize. We want to make threads, so I saved myself some typing and wrote this. In this, we're creating num_threads threads, one per loop iteration. I want to pass in a thread ID, just to show passing data to a thread, and it's also going to be somewhat helpful later. So I create memory for an unsigned int by mallocing it. I don't want to pass a stack address like the address of my loop variable, because since we all share one address space, every thread would see the same variable changing out from under it (and eventually going invalid). Doing something like that would be really, really bad, and we saw an example of that before. So I malloc it, set the value of that memory equal to i, and pass it as part of pthread_create. The first argument is the pthread, whatever thread I'm creating. I'm giving it default attributes, so it will be a joinable thread, not a detached thread or anything like that. Then I say it should execute this run function, and this is the pointer argument I'm giving it.
It's the pointer to my thread ID. After I create all the threads, my main thread just joins all of them, so it waits for them all to finish. At that point, all my transfers should be done. Any questions about this? Okay, so then the question is: what should be in my run function? Right now my run function just takes the argument we passed in. We know it's a pointer to an int, so we cast the argument and dereference it to get its value, and then we free it, so we're all good. We're good people; we always free all of our mallocs. Okay, so what should be in this run function? 100 million transfers? Yeah, I'm trying to parallelize everything, so instead of having one thread do all 100 million, I can divide up the work between the threads. That seems like it makes sense. So we can copy-pasta a little bit; that's everyone's favorite programming technique, I know it's mine. But as written this would be wrong, because this is every thread doing all of the work, so it wouldn't be any faster. So how many times should each thread go through the loop? Divide by eight, or if I want to give it a nice name, I can say num_transfers_per_thread equals num_transfers divided by num_threads; it's a little more readable. And then in here, I loop num_transfers_per_thread times. Yeah, in this case I'm assuming it divides evenly. If it doesn't divide evenly, you could take the mod and hand out the extras, one per thread, so it's mostly balanced. But in this case it divides evenly. If you want to be super pedantic about it, yeah, you'd take the mod; I had that in there originally, but it just made it harder to read. If it didn't divide evenly, you'd have to do something about it. Okay, cool. I have eight threads, they're each doing an eighth of the work, so it should be eight times faster, right?
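The create-and-join pattern plus the per-thread division of work might be sketched like this. It's a skeleton, not the lecture's exact code: the names are assumed, the transfer body is stubbed out, and the counter exists only so the sketch can be checked end to end.

```c
#include <pthread.h>
#include <stdlib.h>
#include <stdatomic.h>

#define NUM_THREADS   8
#define NUM_TRANSFERS 100000000u

/* Counter just to verify all the work gets claimed; the real code
   performs transfers instead. */
static atomic_ulong transfers_done;

void *run(void *arg) {
    /* The argument is a pointer to a malloc'd thread id: cast it,
       dereference it, then free it, since we malloc'd it. */
    unsigned int thread_id = *(unsigned int *)arg;
    free(arg);
    (void)thread_id;

    unsigned int per_thread = NUM_TRANSFERS / NUM_THREADS;
    for (unsigned int i = 0; i < per_thread; ++i) {
        /* pick two random accounts and call transfer() here */
    }
    atomic_fetch_add(&transfers_done, per_thread);
    return NULL;
}

void start_threads(void) {
    pthread_t threads[NUM_THREADS];
    for (unsigned int i = 0; i < NUM_THREADS; ++i) {
        /* malloc the id rather than passing &i: the loop variable
           changes (and dies) out from under the threads otherwise */
        unsigned int *id = malloc(sizeof *id);
        *id = i;
        pthread_create(&threads[i], NULL, run, id);
    }
    for (unsigned int i = 0; i < NUM_THREADS; ++i)
        pthread_join(threads[i], NULL);  /* wait for all transfers */
}
```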
No, they mess with each other. So do the threads mess with each other? Yeah, so what happens if two are trying to transfer to the same account? It's a data race. And no, your computer doesn't die; if a data race killed your computer, none of our computers would ever work. Well, let's just run it and see for fun what happens. Each thread's doing an eighth of the work, so it should be about eight times faster. Let's simulate it; it took like 11 seconds before. Yeah, it's a bit faster. But we lost some money. Taxes? Yeah, but you're the bank, you're supposed to be gaining money, not losing it. So I lost money because of data races, probably because of what you said before: we could be transferring to the same account concurrently. Now, if I have fewer accounts, will that be better or worse for data races? Worse, why? Yeah, if I have fewer accounts and my data races are on the accounts, fewer accounts just means more chances for collisions, so more data races. So let's try it with 100 accounts. Ah, I gained money this time. Well, let's try with only, like, 20 accounts. So with 20 accounts my bank, I'm a little small bank, has 20 grand. Why is this a problem? Because the bank would probably actually only have 20 grand in cash, and now it says it has 122,000. If you loan that out, you don't actually have that money, and then we're in the 2008 financial crisis all over again. But yeah, let's try it a few more times, because some results are really, really fun. Okay, that one wasn't too bad. Two accounts? Okay, two accounts. Oh, we gained a lot. So this is a really weird bank that only has two customers, but let's go for it. Yeah, oh yeah. Is that normal? So what likely happened in that case, where I created more money than the world has to offer? Yeah. It's pretty much just transferring back and forth between two accounts, and there are lots of context switches while all the threads are doing it.
Eventually one thread is probably going to subtract more money than the account actually has, because it read the balance before another thread drained it. And because it's an unsigned int, it underflows and wraps around to near the maximum, and then you get numbers that look like that. That would be very, very bad if your bank showed that, which, if you've seen, what is it, the programming horror subreddit or something like that? I've seen bank statements that look like this, or no, it was a medical bill that looked like this. After this course, you guys shouldn't be doing this. Okay, so besides the data races, it's also not quite eight times as fast. That's kind of weird, because the securely-connect call just wastes the same amount of time every time, and the rest is just a few operations that should parallelize perfectly; we should get an eighth of the time. The only other thing that might be going on is we call rand. So are there data races in rand? Now that you have multiple threads, you have to argue about everything. You don't know the implementation of rand, so you have to look it up. That first page is not helpful documentation; this one is more helpful. Okay, so this is rand. Let's see what it tells us: rand returns a value in the range zero to RAND_MAX inclusive, blah, blah, blah. And it tells us it is thread-safe, MT-Safe, like it's safe to call with multiple threads. So it doesn't have any data races, probably because it locks internally, and you'd guess the random number generator has some state in it or something like that. So it's slowing down because all the threads are using the same state and fighting over it, and because it's thread-safe, it essentially makes that part sequential. So we're not going to get our ideal speedup of an eighth of the time. It kind of looks atomic, because it's thread-safe, so it doesn't matter what order you call it in.
So you can kind of think of it as atomic; really it's a bunch of atomic things that happen together, like a critical section, but for argument's sake you can think of it as atomic. Is that true? Yeah. So, this is beyond this course, but we have free time, so we can essentially give each thread its own random number generator and make things a lot faster. If we read the description, blah, blah, blah: rand_r is a generator that takes a seed argument, and that seed stores all of its state, so each one is completely independent. So we could use rand_r instead, and that would be nice and independent. So let's do that. It takes a seed, so we can just set the seed equal to thread ID plus one or something like that, who cares. And then instead of rand we use rand_r, and all it wants is a pointer to that seed value, which it modifies. So now we've changed it so that the random number generation is all nice and independent. This is beyond this course, so you won't have to argue about it, but when you actually do this stuff, if you have threads and you call functions, you have to argue about them. You have to check: can I call this? Is this thread-safe? Et cetera, et cetera. So if I run this now, it's way faster. Now it's kind of close to that ideal speedup, because now each thread is actually independent; they're not waiting on each other whatsoever. Previously they were waiting on each other in rand because it had shared state. Now we're all good, everything's nice and independent, except we still have that problem where, oh God, we have a lot of money now. So how do I want to fix this problem? Locks. Yeah, I can fix the problem by eliminating the data races, locks are a good way to do that, and a mutex is a great way to do that. So I should create a mutex? Sure, let's all create a mutex. All right, I create a mutex.
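The per-thread RNG swap might look like the sketch below. rand_r keeps all of its state in the caller-supplied seed, so each thread owning its own seed means no shared state and no contention; the helper name and the modulo trick are assumptions, not the lecture's exact code.

```c
#include <stdlib.h>

/* Pick a random account index using only the caller's own seed.
   rand_r reads and updates *seed in place, so a thread that seeds
   with, say, thread_id + 1 never touches anyone else's state. */
unsigned int pick_account(unsigned int *seed, unsigned int num_accounts) {
    return (unsigned int)rand_r(seed) % num_accounts;
}
```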
Oh, and in here, there are no data races in run itself, because this line just gets a random number, which is independent, this one gets another number, also independent, and then we call transfer, and that's where the data races actually happen. So I don't have to argue about data races in run anymore; as long as transfer is okay, run is okay. So actually I want to move the lock into transfer, because that's where my problems lie. So now I have a mutex. Where should I put the lock? Each thread can do securely-connect independently, so you lock after that. Okay, let me just finish typing it, so you want to move it up there. Okay, and unlock it where? After this line, we agree? After line 55, yeah, so there. Okay, cool. So do we have data races now? We shouldn't? Come on, we can be more confident than that. We don't have data races, because all of our races were on some account's balance, and now we only access a balance while we hold the mutex, which means there's only one thread in there at a time, which means we can't have data races, because there aren't two concurrent accesses, there's only one at a time. So should we have any data races now? No. So let's run something like that. Wait, run it again? So we don't have data races; are we satisfied? No, it's pretty slow now too. More threads? Okay, well first, we also didn't demonstrate something else, where sometimes this might not look like a problem or you might not recognize it. This is our nice data-racing version, because I removed the mutex. Well, if I'm Wells Fargo or some crappy bank and I'm too big to fail, I have $10 billion. All good. There are so many accounts that I don't ever see the data races. So you can be too big to fail, like the real world, right? And rand_r just has its seed; it modifies that seed variable, and that's the only state it uses to generate the random number, so there's no dependency between the threads. It's completely independent. Whereas with normal rand, there's shared state that each thread is accessing.
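The single-global-mutex version of transfer just described might be sketched like this; the struct and names are assumptions, and the securely-connect call is shown only as a comment since it touches no shared state and can stay outside the lock:

```c
#include <pthread.h>

typedef struct {
    unsigned int id;
    unsigned int balance;
} account_t;

/* One global lock serializes every balance update. */
static pthread_mutex_t bank_lock = PTHREAD_MUTEX_INITIALIZER;

void transfer(account_t *from, account_t *to) {
    /* securely_connect_to_bank() would run here, before locking */
    pthread_mutex_lock(&bank_lock);
    unsigned int amount = from->balance / 10;  /* critical section:  */
    from->balance -= amount;                   /* only one thread at */
    to->balance   += amount;                   /* a time gets here   */
    pthread_mutex_unlock(&bank_lock);
}
```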
So within the call to rand, there'd be a lock or something like that. If there are eight threads calling rand, they serialize a little on that lock, and it's slower because of that. But if I use rand_r, it generates a random number based only off that seed, and that's it, so it's completely independent now. I don't have any data races there, because each thread is the only one touching its own seed. Each thread has its own seed, oops, where is it, each thread has its own seed, and that's the state it uses to generate a random number. But yeah, too big to fail, yep. Oh, so your question is: if I do this, with my lock actually in use, why is it faster than single-threaded? Yeah. It's faster because there's a little bit that can still be done in parallel: securely-connect-to-bank can be done in parallel. That's a good question, though. If I just lock the entire transfer function, well, it's probably going to be really slow. Yeah, so the comment is, wouldn't it run in about the same time, just kind of crappy? It's been more than 11 seconds. Yeah, so instead of a single thread just blazing through and going as fast as possible, that takes 25 seconds. Now there's overhead from all of the threads: we're essentially doing it all single-threaded anyway, there's a lot of context switching, and we're also decimating our caches, which is another consideration when you're actually programming for performance. So yeah, it's just bad, which is why I have this securely-connect-to-bank function: doing those few math operations is so fast that parallelizing just them isn't even worth it. That's why we have this securely-connect-to-bank thing, just to waste some time. So here too, can I put it up? I can also see that it's essentially single-threaded from my CPU usage.
If I look at htop, I can see all my threads kind of fighting, but there's a big red column, which means they're all spending a large amount of time waiting, and then it finishes executing. So just from my CPU usage I can see kind of what's going on. There's a lot of fighting there, they're trying to do a bunch of work and it's not quite working, but I am hitting every single core. So I am using all my cores, which is good; the kernel is doing its job, we're just kind of failing. All right, so can I do anything better than this? This version doesn't have data races, it works, it's correct, but it's slow, and no one likes being slow. What's a better idea? More threads? Sorry? I'm assuming it's all random; we can't change that. Sorry, go back. So lock just the two accounts we're using? Okay, is that what you were going to say? Okay, yeah. You want to move the lock here? So, is this still a data race? Yeah, you only need one write happening at the same time as a read. Here, lots of reads could happen in parallel, but they don't stop a write from happening. A write could still happen, which means it's still a data race, even if this access is only a read. And they might update the balance before then, right? It's all over. So we can't do this; this won't work. I mean, we can even run it. So, not a too-big-to-fail bank, let's just do 100 accounts. If we run this, we'll probably see that it is, hey, it worked. So this makes the problem a lot rarer. Oh geez, it is not actually the solution. Oh, come on. Okay, well, there still is a data race, which is why this stuff is hard. This is just not going to do. So let's go with your suggestion, which is one lock per account. And with this right now, yeah, that's a good question someone else answered: can I get a deadlock here?
No, there's only one lock, so I can't get a deadlock, unless I do the dumb thing where I relock my own lock and deadlock myself. But aside from doing something silly, you can't have a deadlock here, because with only one lock there's no hold-and-wait and no circular wait, unless you count the self-loop. So yeah, another question was: can you use a mutex only when two threads are accessing the same account? And that's essentially just having a mutex per account. So we change the account struct; we just add our mutex here: pthread_mutex_t mutex. All right, cool, do I have a mutex now? I have to initialize it. Yeah, we have to initialize it; we have to be good citizens. We should probably initialize it wherever we initialize the starting balance. I'd use pthread_mutex_init, and it needs a pointer, so it would be the address of accounts[i].mutex, and then its additional argument is NULL because we don't care about the attributes. So that works. We should also probably be good citizens about cleanup. Where do I free? Let's set a good example here: if we initialize the mutex, we should also destroy it, so pthread_mutex_destroy. Whoops. Oh, I think you just destroy it straight up. Whoops. What the hell? Oh, it doesn't take a NULL, I bet. It doesn't take another argument. I think that makes it happy. There, now we're good. All right, so now we have a mutex per account and we initialize them. So in our code, since I now have a mutex per account, I can get rid of this global mutex. So what should I be locking? Both of them? Both of them, okay. Do you care what order I lock them in? No, don't care. So that's okay, to and from? Okay, so I'm still missing something: I should unlock them. So I should probably just unlock them; I'll just do them in reverse order. All right, yeah, like that. Seems like it should work, right? Why would that not work? Any reason why this wouldn't work? All right, well, let's just send it and see what happens. It worked, and it's hella fast too.
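The per-account-mutex bookkeeping might be sketched like this: the mutex lives inside the account struct, gets initialized alongside the starting balance, and is destroyed before the accounts are freed. The function and field names are assumptions.

```c
#include <pthread.h>
#include <stdlib.h>

typedef struct {
    unsigned int id;
    unsigned int balance;
    pthread_mutex_t mutex;   /* one lock per account */
} account_t;

account_t *make_accounts(unsigned int n, unsigned int starting_balance) {
    account_t *accounts = malloc(n * sizeof(account_t));
    for (unsigned int i = 0; i < n; ++i) {
        accounts[i].id = i + 1;             /* ids start at one */
        accounts[i].balance = starting_balance;
        /* second argument NULL: default mutex attributes */
        pthread_mutex_init(&accounts[i].mutex, NULL);
    }
    return accounts;
}

void free_accounts(account_t *accounts, unsigned int n) {
    for (unsigned int i = 0; i < n; ++i)
        pthread_mutex_destroy(&accounts[i].mutex);  /* takes no attr arg */
    free(accounts);
}
```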
Is there a possibility of a deadlock here? No? Question mark? I mean, there's the possibility that we have transfer B-to-A and transfer A-to-B happening at the same time. But is there a deadlock in this case? Here, we're going to have a lock A and then unlock A, and then a lock B and unlock B. And in the other case, we're essentially just going to have lock B, unlock B, lock A, unlock A. So is there a deadlock between these two? No, there's no circular wait or anything. One of the deadlock conditions is hold-and-wait, and I'm never holding two locks simultaneously. And I also never have any data races. So this is actually the best solution, the one the other section did not come up with. But let's make it challenging for ourselves, because another thing we could do is over-lock things and still be fine. What if instead we do what the original suggestion was, which was this? So, is this a deadlock? I acquire them in the same order every time, right? I always lock from and then to. So I shouldn't get, yeah, but the locks are actually identified by pointers, so from and to don't always refer to the same locks. In the case of transfer A-to-B, from is A, so it gets that lock first: lock A, lock B, followed by unlock B, unlock A. And the thread doing transfer B-to-A would be doing lock B, lock A, unlock A, unlock B. So that's a possible deadlock: the thread running transfer A-to-B could acquire lock A, then we get context-switched over to the other thread, which acquires lock B, and now we're holding each other hostage. This thread wants lock B, which the other thread has, and the other thread wants lock A, which the original thread has. So now we're at a deadlock. To illustrate that, I'll compile this, and if I run it, we should just sit here essentially forever. So let's let it time out. Okay, that's long enough; we can be reasonably confident it deadlocked. We can look at our CPU usage.
They're all just sitting there waiting around; our CPU isn't doing anything. Everything has just stalled, all screwed up. So we kill it; we killed it at 29 seconds. That's quite obviously a deadlock. So how would I fix this, if I don't want to do our good solution, if I want to keep acquiring two locks and holding them at the same time? Yeah, enforce the order in which they're acquired, which I think is what you said too. One order I could use: all the account IDs are unique, so I can just compare account IDs. If from's ID is less than to's ID, then I create some variables, a pthread_mutex_t pointer called m1, very clever names, and m1 is equal to from's mutex, so I always take the lowest ID first. And if that's not the case, then I just swap the order. Whoops, holy crap, that's a lot of from. So now I'm always taking the lowest-ID lock first, and then I just lock m1 and then m2, and when I unlock, I do m2 and then m1. So now, is there a deadlock? No, because I acquire everything in the same order. If I have that earlier situation again, well, if I assume A has a lower account ID than B, then that second transfer wouldn't look like that anymore: it would actually do lock A, lock B, in the same order. So now we don't have a deadlock anymore, because there are no circular dependencies; everyone locks in the same order. So let's sanity-check with just 10 accounts. Did I screw up? Uh-oh, it hangs. What the hell did I do? Oh God, that's a bad example. So m1 to m2, well, these should be in the same order; if it's A and B, they'd be in some specific order. Did I screw up my IDs? From, to. From, to. Oh, Christ. Oh, so that's what's happening, that's what we forgot: it's because of this, because we have two locks now. Thank you, that would have taken me forever.
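Assembled, the ordered-locking fix might look like the sketch below (names assumed, not the exact lecture code). It includes the transfer-to-yourself guard, since with per-account locks a self-transfer would try to lock the same mutex twice and self-deadlock:

```c
#include <pthread.h>

typedef struct {
    unsigned int id;
    unsigned int balance;
    pthread_mutex_t mutex;
} account_t;

void transfer(account_t *from, account_t *to) {
    if (from == to)
        return;  /* net zero anyway, and avoids locking one mutex twice */

    /* Always take the lower account ID's lock first, so two threads
       can never acquire the same pair of locks in opposite orders. */
    pthread_mutex_t *m1, *m2;
    if (from->id < to->id) {
        m1 = &from->mutex;
        m2 = &to->mutex;
    } else {
        m1 = &to->mutex;
        m2 = &from->mutex;
    }
    pthread_mutex_lock(m1);
    pthread_mutex_lock(m2);

    unsigned int amount = from->balance / 10;
    from->balance -= amount;
    to->balance   += amount;

    pthread_mutex_unlock(m2);  /* release in reverse order */
    pthread_mutex_unlock(m1);
}
```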
Now I could be in that double-lock situation where I block myself, but that's kind of dumb: if I'm transferring to myself, why would I even calculate anything or do anything? It's net zero. Why would I bother transferring to myself? So I can also just speed it up and say: if from is to, so the two pointers are the same, or I could check whether the IDs are the same, it means the same thing, then I just give up and return. Thank you for mentioning that. So now I'm good. Before, I was deadlocking myself; now if I run this, it runs and it's fine. And it actually doesn't run that much slower. Okay, yeah. So this works because the situation where we deadlocked looked like this: there was a possibility where one thread acquires lock A and then wants lock B, while another thread gets lock B and then waits on lock A. We eliminate that circular dependency by always acquiring locks in the same order. In this case, because we check the ID and always take the lowest ID first, and the IDs are unique, this would never happen: we never try to get lock B before lock A. So it would look like this instead, and if it looks like that, then whoever gets lock A first will also get lock B. Yeah, okay, so what's the second way I can fix this, if I don't want to do this because it looks kind of ugly? This approach eliminates the circular wait, the circular dependencies. I could instead eliminate hold-and-wait: I could give up the lock. So instead of doing this, I just go ahead and pthread_mutex_lock the from side, and once I hold that lock, I don't want to block waiting on the other lock; I just want to try it. Give it a little try: pthread_mutex_trylock on to's mutex. The way trylock works is, if trylock returns zero, it means you got the lock. So it would fall out of this loop only if it actually acquired to's mutex, and at that point it holds both mutexes.
If it tried and got it immediately, it doesn't enter the while loop, so it falls through and I hold both mutexes. But if I go into the while loop, I didn't get the lock. So what should I do? Yeah, the first thing is to release the lock I hold, because I have to eliminate that hold-and-wait. I couldn't grab the other one, so I just go ahead and unlock the from lock. So now I've unlocked it, and then here I should probably sleep or yield. You'd think there'd be a pthread call for this, but there's not really, because the kernel doesn't care what kind of thread you are, whether you're the original thread that started the process or a thread you created. There's just a generic call named sched_yield (in sched.h, I think) that tells the scheduler, hey, I'll yield myself. And then before you try again, you want to reacquire the from mutex. So before the next attempt, I make sure I actually lock the from mutex again; that way, the only way I fall out of this loop is when I hold both locks. That's why you have to relock from, then try again to get to, and go back and forth as many times as it takes. And at the end here, I need to unlock my locks. All right, so do I have a deadlock now? No, because I've eliminated hold-and-wait. So if I execute this, it should work. Yay. Yeah, when you wake up from the yield, either from isn't taken anymore and you can grab it again, or someone else took it as part of another transaction, and you just get stuck at the lock and wait for them to finish. So you're just waiting on that lock. Okay, cool. So the question is: don't I still hold the to lock when I go to sleep, maybe? So, oh cool, my camera died. In that case: if I go into this while loop, it means I did not acquire to's lock, because trylock failed.
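The trylock back-off just described might be sketched like this (struct and names assumed): grab from's mutex, try to grab to's, and on failure drop everything, yield, and start over, so no thread ever holds one lock while waiting on another.

```c
#include <pthread.h>
#include <sched.h>

typedef struct {
    unsigned int id;
    unsigned int balance;
    pthread_mutex_t mutex;
} account_t;

void transfer(account_t *from, account_t *to) {
    if (from == to)
        return;  /* self-transfer is net zero; also avoids self-deadlock */

    pthread_mutex_lock(&from->mutex);
    while (pthread_mutex_trylock(&to->mutex) != 0) {  /* 0 means got it */
        pthread_mutex_unlock(&from->mutex);  /* give up what we hold   */
        sched_yield();                       /* let someone else run   */
        pthread_mutex_lock(&from->mutex);    /* then start over        */
    }
    /* here we hold both mutexes */
    unsigned int amount = from->balance / 10;
    from->balance -= amount;
    to->balance   += amount;

    pthread_mutex_unlock(&to->mutex);
    pthread_mutex_unlock(&from->mutex);
}
```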
And then right before I sleep, I unlock from. So at that point I hold no locks; I only sleep when I hold nothing, and I only make it out of the loop when I hold both. Yeah. So another cool thing real quick: debugging this kind of thing is hard, almost borderline impossible, but there is this fun little thing called -fsanitize=thread that you can compile your program with. In this case, I know I have a deadlock, and if I run the instrumented binary, it actually gives me some useful output and kind of yells at me. So this is a tool you can use, where it says, hey, mutex M0 is acquired while holding M4. That's telling you you have a hold-and-wait pattern that can cycle, which means you have the possibility of a deadlock, and that's really the only thing, aside from reading the code, that will help you debug it. So that's it, that's pretty cool. Yeah, you could, it's just going to be a lot less readable. All right, well, just remember, I'm pulling for you.
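The sanitizer build is just one extra flag on the compile line; a plausible invocation (file and binary names assumed) looks like:

```shell
# Build with ThreadSanitizer instrumentation; both gcc and clang
# support -fsanitize=thread. Names are assumed from the examples repo.
gcc -g -fsanitize=thread -pthread bank_sim.c -o bank_sim
./bank_sim 1000   # TSan reports things like "M0 acquired while holding M4"
```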