Welcome back to 353. Today's lecture is an important one, even though it's early in the morning. It effectively serves as a primer for your next lab, and it will probably be harder than your next lab, because we're running a bank today. Ooh, fun.

Some housekeeping first. Feedback forms for the course and syllabus are posted, so please fill them out. Also, while I'm traveling next week, it seems the popular vote is for me not to pre-record lectures and instead to do a two-hour review session closer to the exam date, since we're on the last day, in the morning. Joy. So next week's lectures, March 20th and 22nd, will very likely be cancelled. That is the housekeeping for today.

Other than that, let's run a bank. This is a fun lecture. A little setup: it's a simulation. We are a bank, and our job is just to handle 10 million transfers. We can create a varying number of accounts, and each account has a unique identifier (just an integer for us) and a starting balance. Each account starts with a thousand dollars, so if we choose to create, I don't know, five accounts, our bank will be managing five thousand dollars. The transfers are a bit silly: when we generate a transfer between two random accounts, we just move 10% of one account's current balance to the other account. The only other rule is that there's a function that simulates securely connecting to the bank, which we assume is independent across threads, and it has to be called before starting a transfer.

The rest is a coding example. We're going to try to parallelize this to make it go vroom vroom fast, so of course we'll probably encounter data races and deadlocks. Our goal is to make it go fast, and I will need your help.
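The rules above can be sketched in a few lines of C. This is only a minimal illustration of the setup as described, not the course template: the names `securely_connect_to_bank`, `transfer`, and `STARTING_BALANCE` are assumed spellings, and the connection function is an empty placeholder because its body is never shown.

```c
#include <assert.h>
#include <stddef.h>

#define STARTING_BALANCE 1000u  /* every account starts with $1,000 */

/* An account as described above: a unique integer id and a balance. */
struct account {
    unsigned int id;
    unsigned int balance;
};

/* Placeholder for the "securely connect to the bank" step. The real
   function is not shown in the lecture, so this stub does nothing. */
static void securely_connect_to_bank(void) {
}

/* Move 10% of from's current balance into the other account. */
static void transfer(struct account *from, struct account *to) {
    assert(from != NULL && to != NULL);
    securely_connect_to_bank();
    unsigned int amount = from->balance / 10;  /* 10% of the current balance */
    from->balance -= amount;
    to->balance += amount;
}
```

With two fresh accounts, one transfer leaves $900 in the first and $1,100 in the second. The total is still $2,000, which is the invariant the bank checks at the end of the run.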
So this is another one where the whole thing is the exercise. The template is in the parallelization example directory in the materials, the one that you have access to, so you can go ahead and try this again afterwards and try to recreate everything we did. We'll be able to solve this in less than 50 minutes. This is probably easier than your lab... ish. Or no, yeah, this is harder than your lab, so you should be able to do the next lab in like 50 minutes. Boom. All right, great, let's get to it, and that even includes the time spent reading the code.

Here's our code. There are a few defines: our starting balance is a thousand dollars, the number of transfers is 10 million, and then we're going to create eight threads to try to parallelize this. But let's see what we have first. We have a struct account to represent an account; each account has its own identifier and a balance, and they're both unsigned integers. Then we have a few global variables: the number of accounts, and a pointer to the account struct, because we're just going to create a giant global array and malloc it, and that's how we access the accounts from within threads, if we choose to use threads. There's a securely-connect-to-bank function; we assume it's independent and we don't read it.

The other function, the main one, is the transfer function. It takes an account from and an account to. It calls securely connect to bank, and then we take 10 percent of the balance, which is the same as taking the current balance of the from account and dividing by 10. That's the amount we're transferring, so we subtract it from the from account's balance and add it to the to account's balance. From the bank's point of view we're just moving money around; we shouldn't be losing any money here, right? Yeah, it's in the materials directory. Did I not push
again? It's there now. Okay, I thought I pushed it. Never mind, sorry about that. It should be there now.

Here we have a run function that's set up for the threads. Right now it's empty because I'm not creating any threads. And we have a check-error function that checks errors for the pthread functions: if it gets zero, that means no problem, so we return; otherwise we just print an error message. We will be perfect programmers today, so this should never report an error.

Then main: it checks for two arguments, the name of the program and the number of accounts to use. It does a setlocale thing so we get commas in our numbers, which makes them a bit easier to read. Then we convert the input to an unsigned integer and do some error checking to make sure I didn't type anything stupid. We figure out the size of the accounts array, which is the number of accounts times the size of the structure, malloc it, and print the memory used in megabytes, because this can get bad. Then for each account (we do this without threads to make sure we don't have any data races) we assign a unique ID, so they're just one-indexed, and start each account at our starting balance, which is a thousand dollars. Our initial bank funds should just be the number of accounts times a thousand, right? If I have five accounts, I'm managing five thousand dollars.

Skip this for now. Here is the loop that is the main thing we want to parallelize. It loops for as many transfers as we want to do, picks a random account index for from and a random account index for to, and initiates the transfer. It accesses the struct at that index and takes its address, because the function just wants pointers. That is the main thing we want to parallelize. After this, we do things serially again: we're going to make sure that our total
balance across all the accounts matches what we started with. We take a total balance, go through all the accounts, and sum up the current balances at the end. If we're running a good bank, we should have the same amount of money we started with.

Let's get an initial time here. We'll run bank sim; say we're doing a thousand accounts, so we're managing a million dollars. One Mississippi, two Mississippi, three Mississippi, four Mississippi, five Mississippi, six Mississippi, seven, da da da. Okay, that took a while; about 10 seconds. Our goal is to make that faster. As my first step, what should I do to make this faster? Get rid of the for loop? Not quite. At a very, very high level, how do I want to make this faster? Yeah, threads, right? This just has one thread going, and this machine can run more than one thread at a time, so I could be doing eight things at once. So why don't I just use threads?

That commented-out code is just to set that up. I want to parallelize this for loop, so I'll uncomment this. It makes eight threads and sets them up; each gets a thread ID in case we need it (turns out we probably don't). Afterwards it waits for them all to join, so we wait for all the threads to finish before we serially check the account balances, to make sure we didn't screw anything up. And this loop is what I'm doing in each thread. So should this go faster now? Did I just make things go vroom vroom eight times faster? Let's see; let's compile it. Do do do... that's a Jeopardy thing, right? I don't know why that's in my head.

Wow, this sucks. What was my big issue here? Why is this so much slower? Context switching? Not quite. Think about it: with one thread, how many transfers was I doing? 10 million. Now I have eight threads; how many transfers am I doing? Yeah, 80 million. Do I want to do that? No. Wow, it's still going, eh? Good thing I plugged in my laptop so it doesn't die. So that was over a minute. So what should I do to
make sure that I still have 10 million transfers in total? I should have the same amount of work. If I have eight of you to help and I have 10 million transfers, do I tell all eight of you to do 10 million transfers each? How many transfers do I tell each of you to do? Yeah, an eighth of the work. So per thread, to keep the same number of transfers in total, I just take the number of transfers and divide it by the number of threads. Now I correctly have eight threads, each doing an eighth of the work.

All right, am I done? Is this a perfect program that'll just go eight times faster? Anyone want to tell me why this is bad? "The threads could be doing the same transfers." In this case the two random numbers each thread gets are going to be different per thread, so that should be okay. Yeah, there's also a slight issue when the number of transfers is not divisible by the number of threads. This one turns out to be perfectly divisible; if it weren't, a thread might have to do one extra transfer, but for 10 million that's not a big deal. Yeah? Ooh, potential data races, right. If the threads are all running at the same time, there's nothing protecting these accounts; they're all just in memory.

So we might have data races. Let's run it with a thousand accounts. We start with a million dollars, and then we run it and... shit. Oh yeah, we made money. All right, great. Now we're the government, right? Hell yeah. This is how finance works, right? Okay, well, now we lost money. Finance sucks, I want to quit now. Let's pretend we're a small bank and we just have 10 accounts. Stonks. We made a hell of a lot of money. All right, this is great, I like this.
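Going back to the work split: each thread should do the total number of transfers divided by the number of threads, and if the total is not evenly divisible, a few threads pick up one extra. A small sketch of that arithmetic follows. The helper name `transfers_for_thread` is hypothetical; the lecture's code simply divides, which is exact for 10 million transfers and eight threads.

```c
#include <assert.h>
#include <stddef.h>

/* How many transfers the thread with the given index should perform so
   that all threads together perform exactly `total` transfers. */
static size_t transfers_for_thread(size_t total, size_t num_threads,
                                   size_t thread_index) {
    assert(num_threads > 0 && thread_index < num_threads);
    size_t base = total / num_threads;      /* everyone does at least this many */
    size_t leftover = total % num_threads;  /* transfers that did not divide evenly */
    /* The first `leftover` threads each take one extra transfer. */
    return base + (thread_index < leftover ? 1 : 0);
}
```

For 10,000,000 transfers and 8 threads every thread gets 1,250,000. For 10 transfers and 3 threads the split is 4, 3, 3.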
Do you think whatever the Canadian equivalent of the Federal Trade Commission is would enjoy this? You claimed you had $10,000, and now you have, I don't even know what number that is, a big one. So we have a problem here, and it's data races. And unlike most problems in computing, when the problem size was smaller, our problem was bigger. Let's say I'm a gigantic bank and I'm managing a million accounts, so I'm managing, what, a billion dollars? We know there are data races, but if I let this run a little bit, do-do-do, boom, the numbers match. So this is an instance of: we are now too big to fail. If we ever screw up, we can just get a bailout, right?

All right, how would I go about fixing this? I have a data race; how do I prevent it? Yeah, for data races we primarily want to use mutexes, because we want to make sure only one thread can write at a time. Our problem is mostly in this transfer function, so we can see what we have a data race on. We're subtracting from one account's balance and adding to another, and this plus-equals means we do a memory read, we increment or decrement something, and then we do a memory write, and we have multiple threads all accessing the same memory. In this case, that's the balance of some account.

When we had a lot of accounts, the likelihood that two threads were accessing the same memory at the same time was fairly low, which is why we got a good answer. But when we had just 10 accounts, the likelihood was very, very high, and pretty much every time we ran it we'd start with 10 grand and end with some absurd number. So with fewer accounts we are more likely to hit the data race.
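To make the read, modify, write sequence concrete, here is a single-threaded replay of one interleaving that two racing threads can hit. It is only an illustration: the function names are hypothetical, and a real race is not deterministic like this.

```c
#include <assert.h>

/* Replays, in one thread, an interleaving where threads A and B both
   execute `balance += amount` on the same account at the same time. */
static unsigned int racy_double_deposit(unsigned int balance,
                                        unsigned int amount) {
    unsigned int seen_by_a = balance;  /* thread A: memory read */
    unsigned int seen_by_b = balance;  /* thread B: memory read, before A writes */
    balance = seen_by_a + amount;      /* thread A: add, then memory write */
    balance = seen_by_b + amount;      /* thread B: add, then write; A's update is lost */
    return balance;
}

static void lost_update_demo(void) {
    /* Two deposits of 100 into 1000 should give 1200, but this
       interleaving leaves 1100: one deposit has vanished. */
    assert(racy_double_deposit(1000, 100) == 1100);
}
```

The same thing can happen to the subtraction on the from account, which is how the bank's total ends up higher or lower than what it started with.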
In this case it's an unsigned integer, so we essentially underflowed it, it wrapped around to the highest number, and we got some ridiculous total. So we want to solve this using mutexes. Okay, the simplest thing to do that is safe: I just declare a static pthread mutex. Whoops. So I create a mutex, and I should initialize it with PTHREAD_MUTEX_INITIALIZER. Let's hope I spelled that correctly. Okay.

So how do I prevent data races with just this one big-ass mutex? The easiest thing to do with the mutex is to just lock it. Let's say I lock it here: lock before, unlock after. Do I have a data race now? Are we awake this morning? Yeah, no data races. Is this a great idea? Yeah, so this is basically what all my threads are doing; it's the bulk of their work. If I have one single mutex and I lock at the beginning and unlock at the end, only one thread can do this at a time. So why the hell am I even using threads at this point?

If I compile and run that, I'll stick a time on it, but I'm guessing... One, two, three, four, five, six, seven, eight, nine, ten. Okay, as soon as it gets over 10, it's a bad idea, because 10 was what I got with a single thread. Oh, it actually finished. We got 21 seconds of real time, but the system time was like 55 seconds. So my computer worked really, really hard to get that number at the end, and it was way worse than just having a single thread.

I could make it a little bit better, because I told you the securely-connect-to-bank function is independent per thread. So I could move the lock down, and now all eight of my threads could be executing that call at the same time, in parallel. I'd expect somewhat better performance with that, but I still have to lock the whole rest of it; otherwise I'm pretty much screwed and I have a data race. So how should I make this perform better?
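The single-mutex version, with the lock moved below the connection step, might look roughly like this. It is a sketch under assumptions: `bank_mutex` and `securely_connect_to_bank` are assumed names, and the connection function is an empty stand-in.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct account {
    unsigned int id;
    unsigned int balance;
};

/* One global mutex guarding every account balance. */
static pthread_mutex_t bank_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Stand-in for the connection step, assumed independent per thread. */
static void securely_connect_to_bank(void) {
}

static void transfer(struct account *from, struct account *to) {
    assert(from != NULL && to != NULL);
    securely_connect_to_bank();       /* outside the lock: threads can overlap here */
    pthread_mutex_lock(&bank_mutex);  /* only one thread at a time past this point */
    unsigned int amount = from->balance / 10;
    from->balance -= amount;
    to->balance += amount;
    pthread_mutex_unlock(&bank_mutex);
}
```

This removes the data race, but every balance update in the whole bank is serialized behind one lock, so the threads mostly wait on each other.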
Yeah, maybe I make one mutex per account. That's a good idea. I just make sure two threads can't change the balance of the same account at the same time. So in the account struct, I'll create a mutex; let's just call it mutex. I'll get rid of the global one. Now each account has its own mutex. And in here, I should probably lock both of them: I do a mutex lock on the address of from's mutex, then lock to's, and afterwards unlock to and unlock from. All right, should that be safe? If this is safe, we are done in 20 minutes and we can go home. If it's not, we've got some work to do.

Let's go for a thousand accounts, and we'll time it. One, two, three, four, five, six, starting to get worried. Seven, eight, nine, ten. Okay, we're worried. Hmm. What's happening? This is taking a long time. This is like helping other people debug: "Help, my program does not work, it freezes." Did a deadlock happen? Is it possible for this to deadlock? It looks like I take my mutexes in the same order every time: I always get the from one and then the to one. Can anyone think of a situation where this could deadlock with two threads?

And let's make sure it is a deadlock. We can look at the process monitor and see how hard the CPU is working. Right now all my cores are pretty much idle, which tells me it's not doing any work. So I likely do have a deadlock. Yep? Two simultaneous transfers are trying to go between the same two accounts in opposite directions. Thread one is doing a transfer from, let's say, account one to account two, and thread two is trying to do two to one. Yeah, is that what you were going to say too? Okay. So in this case we could have a deadlock. Thread one would go ahead and execute first, and it would acquire from.
So this would be from, and this would be to. Remember, these are pointers, so that doesn't mean we always access things in the same order. For thread one, from is account one, so it locks account one's mutex. Then we context switch over to thread two, and it tries to lock account two's mutex, and it succeeds. Now we're screwed: both threads are waiting on each other's mutexes. If we go back to thread one, it tries to acquire account two's mutex, which thread two already holds. And if we run thread two, it tries to acquire account one's mutex, which thread one already holds. So we are screwed.

There's another comment about a read-write lock. A read-write lock won't help us here either; it would still deadlock, because we'd have to take it in write mode since we have a write here. We have to ensure mutual exclusion; nothing is just reading.

All right, so I have a deadlock. How fix? I'll also show you a cool little tool. Oh, look, see that? I killed it after three minutes. You know how you use Valgrind for memory errors, and it's useful? It turns out there's a tool to help with thread errors too, called ThreadSanitizer. In the next docs I'll have instructions for how to use it, and it should run for you on the website. I can just rebuild my project with ThreadSanitizer, and it adds some additional checks to my code to make sure I don't screw up my threads; it checks for data races, deadlocks, and all that fun stuff. I have to compile first. Now if I run my bank sim, it really, really yells at me. It says "lock-order-inversion (potential deadlock)" and tells me what happened. If I scroll up, it turns out my deadlock is pretty big. It shows you the cycle of locks.
So someone acquired mutex 400, or 750, and then that was acquired, and that, and that, and so on down this big list, and eventually some thread that held this lock was trying to get lock 750, which was already acquired up here. So the cycle that caused my deadlock was pretty big; it wasn't just two threads involved. Pretty much all of my threads got caught in it, and it got very angry at me and told me essentially where all the problems are.

Fun note: if I get rid of these locks, I don't have deadlock issues any more, but I have data races, and if I run that, it yells at me that there are potential data races. This tool only detects things that happen when you actually run the program. So a clean run doesn't mean your program is 100% correct, but if it does detect an error, there's a 100% chance there's an issue with your program. It won't catch everything, but it's a fairly good and useful tool. I'll turn it off for now because, like Valgrind, it really, really slows things down.

Okay, back on track. Woo, my laptop's getting warm. So how do I prevent a deadlock here? Yeah, so if I do something like this, oops, like this, right? Here I lock from, update it, and unlock it, and only then lock to. This one shouldn't deadlock, because a thread only ever holds one lock at a time. It seems to work, but there's a slight issue: we might not have consistent numbers between the transfers.
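The "one lock at a time" variant is not spelled out in words, so the following is a guess at its shape, with assumed names throughout (`account_init`, `securely_connect_to_bank`, and the `mutex` field). It holds at most one mutex at any moment, which is why it cannot deadlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct account {
    unsigned int id;
    unsigned int balance;
    pthread_mutex_t mutex;  /* one mutex per account */
};

/* Hypothetical helper: set up an account and its mutex. */
static void account_init(struct account *a, unsigned int id,
                         unsigned int balance) {
    a->id = id;
    a->balance = balance;
    pthread_mutex_init(&a->mutex, NULL);
}

/* Stand-in for the connection step. */
static void securely_connect_to_bank(void) {
}

static void transfer(struct account *from, struct account *to) {
    assert(from != NULL && to != NULL);
    securely_connect_to_bank();

    pthread_mutex_lock(&from->mutex);
    unsigned int amount = from->balance / 10;
    from->balance -= amount;
    pthread_mutex_unlock(&from->mutex);

    /* Between these two critical sections the money has left `from`
       but has not arrived in `to`: the transfer is half done. */

    pthread_mutex_lock(&to->mutex);
    to->balance += amount;
    pthread_mutex_unlock(&to->mutex);
}
```

The bank's total still adds up once every thread has finished, but another thread can observe or act on an account in the middle of someone else's transfer.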
It's a really subtle difference. This version just ensures there are no data races when adding and subtracting amounts, but the actual remaining balances in the accounts might not be what we'd get if we did the transfers one by one, because this can essentially do half of a transfer instead of the whole thing at once. So our customers might not get what they expect, but we as the bank are happy. Well, if the bank is out a dollar, you'll get a lot of emails, but if the bank screws you out of a hundred thousand dollars, they have no idea what happened and can't help you. So from the bank's point of view we're good, but there's a slight issue where the amounts the customers have aren't quite right.

So let's rewind again. Oh, there's another comment: "We can add a lock around the two mutexes to ensure the two lines run together." Maybe, but that's weird. How would I solve this? Each account has an ID, right? So how do I make sure I always get the locks in the same order? Yeah? Yeah, one option is that I could use trylock, so maybe I'll do that first. I lock my from mutex, and then I try to lock the other one with a trylock on to's mutex. If I successfully get it, oops, trylock, then it returns zero and we just carry on out here and everything is good. Otherwise, I don't have the lock and I should try to get it again, right?

So should it just look like this, or what else should I write? Yeah, put a sleep in there, or a yield. If this were your lab four, we'd do a yield there. Turns out there's actually a yield in Linux, but it is older than... wow, I can't type. It's sched_yield. So it's called sched_yield and not anything else, although I'm pretty sure I typed that wrong. sched_yield, okay.
So the yield for the Linux kernel is just called sched_yield, and it lets anything else run. What, how did I type that wrong? Oh, I can't spell yield. All right, great. So I just yield if I don't get the lock. Is that good? Well, we can run it. Bank sim, let's do... I have screwed up my bank. All right. It looks like things are still running, but not all of them. I should have eight cores all working, and currently I have four, all trying really, really hard, which means they're probably just stuck in this while loop, going around over and over. So that did not quite work.

Yeah? Ah, I did not release the first lock. I should unlock it before I go to sleep, right? So, unlock from. All right, should it work now? Yeah? Ah, yes, I have to lock it again after the yield, like this. I need to lock it again after the yield because I want to yield with no locks held, and I want to break out of this loop holding both locks. If I make it past this line, I have the from mutex. Then I try to get the to mutex. If I get it, and this returns zero, I go to this line and boom, I have both locks. If I don't get it, I only have the from mutex, so I unlock it. Then when I yield, I hold no locks, and after I come back, I wait until I acquire the from mutex again and then try for the to mutex again. If I don't get it, the same thing happens. If I do get it, I run this line with both locks.

All right, so now my code looks great, right? No problems, we've solved everything. One, two, three, four, five, six, seven, concerning, eight, nine. Not good. Some clues here: I seem to be retrying an awful lot. And if I count them, one, two, oh, that one switched, oh, they're switching. One, two, three, four, five, six, seven, eight. So I have eight threads, and they're all at 100%.
It's a good thing my laptop is plugged into the outlet. So what is my issue here? Why are all eight threads trying over and over and over again? This one is actually pretty subtle, so I'll give you a hint. Did I write any code to make sure the from account is not the same as the to account? No. So what happens if a thread is trying to transfer to and from the same account? It means the mutex is the same. If the mutex is the same, it can lock it, no problem, but will it ever be able to acquire the other lock? No, right? It already holds it, so the trylock fails every time.

Turns out that if I'm transferring from and to the same account, well, that's stupid; the net is zero, and nothing's actually going to happen. So I could just check whether the pointers are the same; if it's the same pointer, it's the same account. Or, if I want to be smarter, or more generic or something, I could check whether their IDs are the same. If I'm transferring to and from the same account, I just return; I don't do it, because nothing's going to happen anyway.

All right, am I good now? Do we have the greatest bank in the world? Let's see. One, two, three, four, five. Hey, it seems to work. It doesn't really go any faster, but it doesn't go any slower either, and it actually works. Yay. Yep? Yeah, I could make sure here that the to and from indexes aren't the same. Turns out I don't care.

Oh, also, turns out I missed something that makes this actually slow. Oh right, that's why I have the thread ID. Turns out that rand is a thread-safe function, and every thread is using it, so they can only call it one at a time; that's how it makes sure it doesn't have any data races. Turns out there's a version of rand that is independent per thread. If we look it up, man rand... Okay, that is not what I wanted.
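Putting the try-lock loop and the same-account check together, the transfer function might look like the sketch below. The struct layout and the names `account_init` and `securely_connect_to_bank` are assumptions for illustration; `pthread_mutex_trylock` and `sched_yield` are the real POSIX calls.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

struct account {
    unsigned int id;
    unsigned int balance;
    pthread_mutex_t mutex;
};

/* Hypothetical helper: set up an account and its mutex. */
static void account_init(struct account *a, unsigned int id,
                         unsigned int balance) {
    a->id = id;
    a->balance = balance;
    pthread_mutex_init(&a->mutex, NULL);
}

/* Stand-in for the connection step. */
static void securely_connect_to_bank(void) {
}

static void transfer(struct account *from, struct account *to) {
    assert(from != NULL && to != NULL);
    if (from == to) {
        /* Same account: the net effect is zero, and the loop below
           could never acquire a mutex this thread already holds. */
        return;
    }
    securely_connect_to_bank();

    pthread_mutex_lock(&from->mutex);
    while (pthread_mutex_trylock(&to->mutex) != 0) {
        /* Could not get the second lock: back off completely. */
        pthread_mutex_unlock(&from->mutex);  /* yield with no locks held */
        sched_yield();                       /* let another thread run */
        pthread_mutex_lock(&from->mutex);    /* then start over */
    }

    /* Both locks are held here. */
    unsigned int amount = from->balance / 10;
    from->balance -= amount;
    to->balance += amount;

    pthread_mutex_unlock(&to->mutex);
    pthread_mutex_unlock(&from->mutex);
}
```

Releasing the first lock before yielding is what breaks the cycle: a thread never waits while holding a mutex that another thread needs.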
If we look up rand, it tells us there's rand and srand, and there's also this rand_r, which is apparently deprecated, but we're going to use it anyway. If we bother to look at all of them, it says they're all thread safe, which is great, but rand uses an internal seed, and it makes sure there are no data races on it between callers. With rand_r we can supply our own seed and make it whatever we want. So we can create a seed based off the thread ID; we initialize it to the thread ID, and then we just change rand to rand_r. In this case I'm not going to care whether the to and from accounts are the same; I'll have it behave as it did before. Whoops, wrong symbol. Now if I compile and run that... oops, unsigned seed. Compile; all right, we're clean. Now if I run my bank, boom, look at that speed.

Oh, and this does not make sure that the from and to accounts are different, so I still have the check right up here for whether they're the same. Yeah, this was purely for performance. When we were using rand, it took like 10 seconds, because it turns out a lot of the time was just spent getting the random indexes. Since rand is thread safe, its implementation essentially has one mutex that it locks at the beginning and unlocks at the end, so only one thread can be in it at a time. So when we used rand instead of rand_r, things were way, way slower, because only one thread can call rand at a time, and generating random numbers was actually a large part of the work. Now when we run it, we have a very, very fast bank. That's almost eight times as fast.
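A minimal sketch of the per-thread seed idea: each thread owns an `unsigned int` seed (initialized from its thread ID in the lecture) and passes its address to `rand_r`, so no hidden state is shared between threads. The helper names `random_account_index` and `pick_pair` are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L  /* ask for the rand_r declaration */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Pick an account index using a seed owned by the calling thread.
   rand_r updates *seed in place and touches no global state. */
static unsigned int random_account_index(unsigned int *seed,
                                         unsigned int num_accounts) {
    assert(seed != NULL && num_accounts > 0);
    return (unsigned int)rand_r(seed) % num_accounts;
}

/* Pick the from and to indexes for one transfer. As in the lecture,
   nothing here prevents the two indexes from being equal. */
static void pick_pair(unsigned int *seed, unsigned int num_accounts,
                      unsigned int *from_index, unsigned int *to_index) {
    *from_index = random_account_index(seed, num_accounts);
    *to_index = random_account_index(seed, num_accounts);
}
```

Inside a thread's run function this would be used roughly as `unsigned int seed = thread_id;` followed by `pick_pair(&seed, num_accounts, &from, &to);` on each iteration. Two threads that start from different seeds produce different sequences without ever contending on a lock.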
We had to solve some problems and it was a little painful, but at the end of the day, if you can make your program go eight times faster, that's probably a very good thing that people will pay you money for; this is why they hire you. If you hadn't taken this course and you just threw threads at it, well, it's not going to work. Maybe if you're working for, I don't know, insert big bank name here, it might work for a while. And then maybe one or two times the balance is off, and, well, as long as you don't lose money, you don't care.

All right, also, let's see, a comment: "ChatGPT said we could also use thread pools; it'll make it even faster." In this case we only create eight threads and have all eight of them do all the work, so we don't need thread pools. Thread pools are generally for when you have really, really small tasks. Instead of creating a million threads that each do one very small task, the idea of a thread pool is that I create only eight threads and a queue of tasks, and the threads all work together to chew through those million small tasks instead of making a new thread for each one. But here we know ahead of time all the work each thread needs to do, so we don't need a thread pool, and using one for this would be silly.

So, great. Any other problems with our bank, or have we now created the best bank in the world? Yep? Oh, 10. Yeah, our problem was that 10 was really bad. 10 works. But, curious, it seems to get a bit slower. What if I have only five accounts? Okay, a little bit slower too. Let's see, three. Seems to get a bit slower again. Why is it getting slower the fewer accounts I have? Yeah: with fewer accounts, it's more likely that two threads are going to be using the same account, in which case there's that magic word, contention. They're all competing for the same lock.
Only one can get it at a time, so it will be slower the fewer accounts we have. Yeah? In this case, with only three accounts, we don't need nested while loops or anything silly like that. So this works.

If I wanted to, I could also make sure I acquire the mutexes in the same order. If for some reason I didn't like this trylock version, well, oops, we have IDs. So we can enforce an absolute order on the locks. If the from ID is less than the to ID, then I acquire the from mutex first. Let's create a mutex pointer, I'll call it m1, and we'll also create m2. If the from ID is less than the to ID, I assign m1 to from's mutex and m2 to to's mutex, since I'm always going to acquire the lowest one first. That looks ugly; let's get rid of this. And in the other case, well, the IDs can't be the same, because we checked at the very front, so we do the old copy-paste routine and flip these. When the to ID is smaller than the from ID, I assign m1 to to's mutex and m2 to from's. So if I do this, m1 always belongs to the account with the lowest ID. Then I can just lock m1 and then lock m2. And here, the order doesn't really matter. Oh, wait. Here I can just unlock them both. So now I have a definite order.

If I run this, let's go back to 10 accounts, it works too. For some reason it is much slower, probably due to the lock contention. Let's go up to a thousand. Yeah, and then it's pretty much as fast. Joy. So we could implement it either way. We can also sanity check that we're all good: we could use ThreadSanitizer, compile with it, and then run it. It'll be much slower. Whoops. Oh, I didn't compile it.
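The ordered-locking alternative described above can be sketched as follows. The variable names `m1` and `m2` come from the lecture; the struct layout, `account_init`, and `securely_connect_to_bank` are assumed for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

struct account {
    unsigned int id;
    unsigned int balance;
    pthread_mutex_t mutex;
};

/* Hypothetical helper: set up an account and its mutex. */
static void account_init(struct account *a, unsigned int id,
                         unsigned int balance) {
    a->id = id;
    a->balance = balance;
    pthread_mutex_init(&a->mutex, NULL);
}

/* Stand-in for the connection step. */
static void securely_connect_to_bank(void) {
}

static void transfer(struct account *from, struct account *to) {
    assert(from != NULL && to != NULL);
    if (from == to) {
        return;  /* nothing to do, and the IDs below must differ */
    }
    securely_connect_to_bank();

    /* m1 is always the mutex of the account with the lower ID. */
    pthread_mutex_t *m1;
    pthread_mutex_t *m2;
    if (from->id < to->id) {
        m1 = &from->mutex;
        m2 = &to->mutex;
    } else {
        m1 = &to->mutex;
        m2 = &from->mutex;
    }

    pthread_mutex_lock(m1);  /* every thread locks the lower ID first */
    pthread_mutex_lock(m2);

    unsigned int amount = from->balance / 10;
    from->balance -= amount;
    to->balance += amount;

    pthread_mutex_unlock(m2);  /* unlock order does not affect correctness */
    pthread_mutex_unlock(m1);
}
```

Because every thread takes locks in increasing ID order, a cycle of threads each waiting on the next cannot form: the thread holding the highest-numbered lock in any chain is never waiting for a lower one.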
It'll be much, much slower because it's checking that we don't have any data races, deadlocks, or anything like that. We can let it run, but we can be fairly confident at this point: it has run for a while now and hasn't yelled at us, so that's probably pretty good. And this new version does not deadlock because we always acquire locks in the same order: we always acquire the mutex of the account with the lowest ID first. We always lock m1 followed by m2. If the from ID is lower than the to ID, then m1 is from's mutex, so we get that one first. Otherwise, if the to ID is less than the from ID, we get to's first. So we can be fairly confident it's working because it hasn't blown up on me yet. But like I said, ThreadSanitizer does a lot of checking and is slow. How slow? Very slow, but it seems to work.

All right. Yep? Yeah, performance-wise, maybe there's so much locking that it doesn't make sense to use threads, or, I don't know, I have a crappy CPU with only a few cores. You could have a check for that, but checking is going to be very hard, because you don't know what machine anyone is going to run on, how fast the cores are, things like that. So there are always going to be trade-offs, and the answer is you just have to benchmark it yourself and see if it's worth it. Until I changed this rand function, it was slow, and then it got like eight times faster. It's just something you have to keep track of and monitor, because there's no single right solution. It comes down to: how much can I actually run in parallel? If I can't run much of it in parallel, threads probably aren't worth it.

Any other quick questions before we wrap up? I think that's it. All right, so, yay, that's lab five. You can do lab five now. Joy. Just remember, pulling for you; we're all in this together.