All right, welcome back to... wow, I forget how this works... welcome back to operating systems. So with our last lecture we're pretty much done, so we have more review today, which is again an ask-away. So are there any pressing questions from anyone sitting here already? If not, the Discord wanted maybe some condition variables, maybe some more threads, maybe some socket stuff, go do that CS111 exam. So any preferences? Yep, condition variables. What about condition variables? Explain them from the beginning. All right, I made this up with the other section, so I will do that for you. Let me erase something real quick. All right, remember this question that we took up last time? We were trying to make it such that only thread zero ran this initialize_everything function before any other thread executed, and then every other thread could execute its own individual initialize_thread function. Do we all remember that? All right, well, if we want to use condition variables, how would I do this? Remember, a condition variable is basically just one big queue. That's what the condition variable is, and you need a mutex with it, and then you need some condition that you are actually waiting on. For this, the condition I will probably have is a boolean that just says is_initialized, which starts out false, and I'll have thread zero change it to true whenever it's done. What is that high-pitched noise, is it that? Am I the only one who can hear it? Okay, maybe as soon as I hit 50 or 60 I won't be able to hear it anymore and I'll be a lot happier. All right, so our condition could be a bool is_initialized, and then I'm just going to not write pthread_whatever in front of all of these.
So I'll have my condition variable, I'll just call it cond because I'm creative, and then I'll have a mutex called mutex because, again, I'm super creative. So those are all the pieces I need for condition variables, and there are two functions that we learned, plus one that you already mentioned that would probably make more sense in this case. So I have cond_wait and cond_signal. cond_wait on the condition variable will essentially put this thread to sleep, block it, and also unlock the mutex associated with it, until some other thread wakes it up by calling signal on that condition variable. So anything else you want me to explain about condition variables? Those are the two basic operations we have. Yeah, we'll get to that. So why do you need a mutex? It's assumed that when you call wait, you have the mutex acquired, because you're waiting on a condition that other threads can change, so you don't want data races, so you'll need a mutex anyway. And as part of the wait, it needs to unlock the mutex whenever it puts itself to sleep, so another thread can get the mutex, make some progress, hopefully change the variable to true or whatever I'm waiting on, and then wake me up. And then I would have to re-acquire the mutex so that when I return, I know I have the mutex and don't have any data races. Yes, it is technically undefined behavior if you call wait without locking the mutex beforehand, because the implementation assumes it is locked when you call it. So to implement this, I can essentially leave my if statement, and then right after initialize_everything I'll have a call to lock with that mutex. After I acquire the mutex, I can go ahead and change that is_initialized variable, which I'll just shorten to is_init, setting it equal to true.
And then in this case, well, it doesn't really matter, I could unlock the mutex. Now, I told you to only worry about signal, but in this case I might have multiple threads going to sleep. Essentially, if threads one through seven have run first, they should all be waiting, so I might have up to seven threads asleep. I want to wake all of them up, because they can all continue as soon as I have initialized everything. So even though we said we did not cover it, for this particular question broadcast would be more appropriate, because I could have multiple threads waiting. Instead of broadcast, since I know the number of threads, I could also have a for loop that calls signal seven times. That would be a valid solution too, but broadcast is probably the better thing to do. All right, so that is all of the code that would go essentially right here. Now, instead of thread_yield, on the waiting side I first need to lock the mutex. After I lock the mutex, I can check the condition, which is actually the boolean. So I would do while not is_init. If is_init is currently false, that means the thread should be put to sleep, because thread zero hasn't finished yet. So here is where I would call cond_wait on that condition variable with this mutex, and that call just sits in the while loop. Now, after I break out of the while loop on this line right here, I am guaranteed that is_init is equal to true, because I still have the mutex, I was woken up, and the loop condition no longer holds. So now I can go ahead and just unlock, fall through, and run my initialize_thread. So unlock, something like that. Any questions about that? Yeah, go ahead.
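Putting that together, here is a minimal sketch of the whole pattern in C with pthreads. The names (is_init, mark_initialized, wait_for_init) are mine, not the exam's; the exam's initialize_everything and initialize_thread calls would go around these:

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
static bool is_init = false;      /* the condition we are waiting on */

/* Thread 0 runs this after initialize_everything(). */
void mark_initialized(void) {
    pthread_mutex_lock(&mutex);    /* protect the shared boolean */
    is_init = true;
    pthread_cond_broadcast(&cond); /* wake every thread in the queue */
    pthread_mutex_unlock(&mutex);
}

/* Every other thread runs this before its own initialize_thread(). */
void wait_for_init(void) {
    pthread_mutex_lock(&mutex);
    while (!is_init)               /* always a while, never an if */
        pthread_cond_wait(&cond, &mutex);
    /* here we hold the mutex and is_init is guaranteed true */
    pthread_mutex_unlock(&mutex);
}
```

Once mark_initialized has run, any call to wait_for_init falls straight through without sleeping, which is exactly the behavior the exam question asks for.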
Yeah, so cond_wait will atomically put this thread in the queue, put it to sleep, and also unlock the mutex. And then whenever it's woken back up, it will reacquire the mutex before returning. Yep, so the condition variable, the way I like to think about it is a queue. Broadcast will wake up everything that is currently in the queue. So a common question is, hey, what happens if I change this while to an if instead? In this case, maybe it's technically okay, because the threads just go to sleep and get woken up once, and I only change the variable in one direction. But if you actually read the spec of cond_wait, you can get woken up at any time even if nobody signals you, I believe, so you could get woken up and you would be responsible for rechecking the variable anyway. If I just have an if, I won't recheck it. I'll assume the condition is true and fall through. So the rule of thumb is: always put your check of the variable in a while loop, because then you're guaranteed, no matter what, that you always recheck the variable, you're always sure it is up to date, and you're not in for a surprise. In this case it would probably work, but in general it's not a good solution, because if you change the code later and have another thread wake up the threads on the queue, it's just not going to work anymore. You're going to be surprised. Yeah, no reason not to use a while loop. Yep, so why do I need to lock here? If I don't lock it, technically I have a data race, because other threads could be reading the variable while I am writing it. And even in this case, where it might seem okay not to have it, this can lead to very bad things. This is actually a good exam question, I wish I had put this on. So what bad thing happens here if I don't have this lock?
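To make the while-versus-if point concrete, here is a small demo of my own (not exam code) where the waiter gets signaled once before the condition actually holds; because the check is a while loop, it simply rechecks and goes back to sleep. The usleep calls are only there to make the interleaving likely in this sketch:

```c
#include <pthread.h>
#include <stdbool.h>
#include <unistd.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t c = PTHREAD_COND_INITIALIZER;
static bool ready = false;
static int early_wakeups = 0;   /* times the waiter woke before ready */

static void *waiter(void *arg) {
    (void)arg;
    pthread_mutex_lock(&m);
    while (!ready) {            /* the while loop rechecks every wakeup */
        pthread_cond_wait(&c, &m);
        if (!ready)
            early_wakeups++;    /* woken while the condition still fails */
    }
    pthread_mutex_unlock(&m);
    return NULL;
}

int early_wakeup_demo(void) {
    pthread_t t;
    pthread_create(&t, NULL, waiter, NULL);
    usleep(50000);              /* give the waiter time to block */
    pthread_mutex_lock(&m);
    pthread_cond_signal(&c);    /* wake it while ready is still false */
    pthread_mutex_unlock(&m);
    usleep(50000);
    pthread_mutex_lock(&m);
    ready = true;               /* now satisfy the condition for real */
    pthread_cond_signal(&c);
    pthread_mutex_unlock(&m);
    pthread_join(t, NULL);
    return early_wakeups;       /* the while loop absorbed the early wakeup */
}
```

With an if instead of the while, that first signal would have let the waiter fall through while ready was still false, which is exactly the surprise described above.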
If it changes while we're reading? So if it reads right before it gets changed... so, what's another color? Thread two reads it, it's currently false? Yeah, exactly. So it's actually a lost wakeup scenario. What might happen is, if I don't have the mutex protecting the is_init = true write, thread two could check the current value and see false. So the next line it's going to execute is the wait. But before it executes wait, we context switch over to, say, thread zero. Thread zero changes the value from false to true, does the broadcast to wake everything up, and then thread zero is on its way. Now, whenever we return back to thread two, it will call cond_wait. It will put itself to sleep, it will get blocked, and now nothing will ever wake it up again. It will just sit there blocked forever, so now we have lost thread two. Data races are bad. All right, questions about that? So did we do this question? I think we did this one. Let's start with the UCLA exam unless... yeah. Let's start with this, file systems this section. We did the file system question, right? Yeah, we did it. So page replacement, you don't really need me for that, right? You can practice it yourself, and it's probably going to be on the exam. All right, want to do this question? Threads? Looks like fun. All right. Yep, whenever you answer, even if it's the clock algorithm, you just draw these diagrams, because your clock is going to update every time. No one is going to make you draw all of this out for 10 or 20 clock ticks or however many. Just like what I did in lecture, I went through it, drew my clock, and stepped through it each time. Draw lightly, light hand, light touch, and use a pencil. All right, so John writes an interesting web server that allows multiple clients to run the ls command on it. Wow, this is a cool server. I don't know why John did it, but he did. Maybe he was tired.
All right, yeah, I was actually late to class because I had to run and get my laptop, so that was fun. For the code example, assume no errors occur and system calls are successful. As a reminder, this was a closed-book exam, so I told them some things that I will not tell you, because you can bring your fun little cheat sheet. But this isn't much text. So, as a reminder, the file descriptor returned from accept can be used in subsequent read and write system calls. That's about all I told them. So consider the following code. We have a main with a socket file descriptor. We set it up, do all that server stuff, the socket, the bind, the listen, and then we have while true, accept. Remember, accept is a blocking system call that will wait for a connection, and whenever it returns, you get a new file descriptor that represents that connection. If you write to it, you're sending bytes to whatever connected to you. If you read from it, you're receiving information from it. Here, I create a thread, detach it, and give it the file descriptor, with a little bit of bad C practice where I cast an int to a void star and read it back out on the other side. Now, right after a thread starts running, it calls sleep for five seconds. So this is what starts us off on our first question: assume four clients connect and each of them makes it to the call to sleep, and none have returned from sleep yet. How many threads are there in total, and where are they executing in the code? So accept returns four times. I could ask something like this about sockets. I wouldn't ask you to do all the setup stuff, but sockets are just IPC, and IPC is part of the course. So with this, what's the question again? How many threads are there in total, and where are they currently executing? Yeah, five. Where are they all? And what's the main thread doing?
So the answer to the first question is that there are five threads in total. The original, special main thread that starts executing main will currently be right here on the accept system call, because it is blocking, waiting for connection number five to come in. So the main thread is currently there. And we have thread one, thread two, thread three, and thread four all on this sleep line, because each time a connection comes in, we get a new file descriptor. Say, in this case, it's file descriptor three. Then we create a new thread and pass that file descriptor to that thread. So thread one would probably have file descriptor three. It would read that into a local variable and then call sleep. Then three, four, five, six for the successive connections. All right, so for that: five threads in total, the main thread is in accept, and the other four threads we created are in sleep. Yep, if you make it explicit, you know what you are talking about. Usually when a question says "created," that's not including the main thread. Usually the question will say "not including the main thread" or something like that, so I can get a consistent answer. But if the question doesn't say it, just give two answers: here's the number if you include the main thread, and here it is if you don't. Either should be okay. Yep, so accept will block and just wait for someone to connect to the server. As soon as someone connects, it returns a new file descriptor that I can read and write bytes on. So it's just waiting for a connection, like a web server or something like that. Yep, so main executes, and the first time through, the main thread calls accept, and then client one connects to it, so it returns a new file descriptor. That's how we talk to client one. It creates a new thread for that and sends it off on its way.
It calls accept again, which returns a new file descriptor for client two, and then we create a new thread for client two and send it on its way. Then it does the same thing for client three and client four. Yeah, the question says assume four clients connect. These pthreads are all kernel threads, so even if all the other threads are currently asleep, the main thread can still run. So detach: all it does is eliminate the possibility of zombie threads. As soon as a detached thread terminates, all of its resources are cleaned up, and I can't join it or anything like that. For processes, there's no equivalent of detach, because someone always has to wait on you. Yeah, for processes there's waitpid with WNOHANG, which is a non-blocking check. For pthread_join, it's always blocking; there's no non-blocking version of it. Another question somewhere? Okay, it was covered. All right, so what's the next question? Now all threads return from sleep and complete. What gets sent over the socket and why? All right, so now we have to continue reading. Each thread calls fork. That's fun. Remember what happens when a thread calls fork? Yeah, a new process gets created, and it's a clone of the thread that called fork. So the new process that gets created is not going to have four threads in it, it's only going to have one. For example, if we started off in, I don't know, process 100, then thread one would make process 101, thread two would make process 102, thread three would make process 103, and thread four would make process 104. So they all create a new process, and then I check if pid is equal to zero. What does it mean if fork returns a pid of zero? Yeah, it means this is the child process. So yeah, new processes have one thread in them by default, and usually we call it the main thread if the process started normally.
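As an aside, the create-and-detach idiom the server uses looks roughly like this; this is my own sketch of the pattern, with a plain int standing in for a real accept() result, and handle_connection, spawn_handler, and seen_fd are invented names:

```c
#include <pthread.h>
#include <stdint.h>

static int seen_fd = -1;   /* stand-in for the real per-connection work */

static void *handle_connection(void *arg) {
    int fd = (int)(intptr_t)arg;   /* undo the int-to-void* cast */
    seen_fd = fd;                  /* the exam code would sleep(5), fork, etc. */
    return NULL;                   /* returning implicitly calls pthread_exit */
}

void spawn_handler(int connection_fd) {
    pthread_t t;
    /* smuggle the int through the void* argument, as the exam code does */
    pthread_create(&t, NULL, handle_connection, (void *)(intptr_t)connection_fd);
    pthread_detach(t);             /* never joinable, never a zombie thread */
}
```

The intptr_t round trip is the slightly-less-bad version of the int-to-pointer cast mentioned above; the exam code casts the int directly.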
But in this case, yeah, the one thread in the new process would be a clone. In process 101's case, it would be a clone of thread one. Alrighty, so now, in the child, we do a dup2. Whatever that file descriptor is that we got representing the connection, we make file descriptor one point to the same thing. And then we call execlp to execute ls. What does execve or execlp or any of the exec system calls do? Yeah, we essentially reuse the current process. Process 101 becomes ls and starts executing ls things. In that case, all the output of ls would be sent to file descriptor one by default, because ls writes to standard out. Where would that information go? Well, what's file descriptor one pointing at? Specifically, in this case, what's file descriptor one pointing to? The socket. So we would be sending the output of ls across the socket. This is essentially a very scuffed version of SSH, because you're running commands on a different machine than the one you connected from. Yeah, ls with no arguments shows you the contents of the current directory, the directory this process is currently in, which is running the server. Process 101 would be running on the server, and it would have been started in some directory, so ls by default would just show you the contents of that directory. Yeah, this is the current directory of the server, because this is running on the server. This is the server code; there's no client code here. Yeah, file descriptor one. So let us take a little detour, I guess. In process 101, right before the dup2, it would have... here, I'll write it to the side. It would have file descriptors zero, one, two, and in this case three. And then there would be this local open file table. So each entry would have its position, and what else is in there? Position, someone help?
Position, yeah, permissions, and then a vnode. So there would be one of these for each of the currently open file descriptors. And then we would have the global open file table... or sorry, the file descriptor array is the local, per-process table, and those position/permissions/vnode entries are the global open file table entries. I wrote it backwards. All right, so it would look something like this. And the vnode for file descriptor three would point to the actual socket connection, or maybe I'll just call it the connection instead. So if I do a dup2, that means it's going to take whatever this file descriptor is pointing to and also make file descriptor one point to it. So the dup2 system call changes the local table such that file descriptor one also points to that same global entry. And since it points to that same entry, if the process now does a write to file descriptor one, it's actually going to go over the socket, over that connection. Yeah, and because that global entry can be shared across processes, if another process had a file descriptor pointing to the same entry and one of them seeked, it would change the position for all of them, since the position lives in the global table. That would be kind of rude, yeah. So dup2 will close the old file descriptor one if it's not closed already; that's part of repointing it. Technically, if I wanted to slow it down: it deletes entry one first, just gets rid of it, and then, well, now it's available to point to something else, so it makes file descriptor one point to the new entry. You can think of it that way, but it's probably easier to just think of it as changing what the descriptor points to. All right, so with that: what gets sent back over the socket and why? The server will send over the socket the contents of running ls on that machine.
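Here is a small self-contained sketch of what the child's dup2 accomplishes. A pipe stands in for the socket (my choice, so the effect is observable without networking); after dup2, a write to fd 1 really does land on the "socket" end:

```c
#include <unistd.h>
#include <string.h>

/* Returns the number of bytes that crossed the stand-in "socket". */
int redirect_demo(char *out, int outlen) {
    int p[2];
    pipe(p);                      /* p[1] plays the role of the socket fd */
    int saved = dup(1);           /* remember the real stdout */
    dup2(p[1], 1);                /* fd 1 now shares p[1]'s open-file entry */
    write(1, "ls output\n", 10);  /* what the exec'd ls would print */
    dup2(saved, 1);               /* put the real stdout back */
    close(saved);
    close(p[1]);
    int n = (int)read(p[0], out, outlen);
    close(p[0]);
    return n;
}
```

In the exam code the child never restores fd 1, of course; it just calls execlp right after the dup2, so everything ls prints goes to the client.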
So it's essentially SSH, because you are running commands on a server somewhere else that is not your local machine. Your machine would be the one that calls connect or whatever, the server accepts it, and then it runs commands on the server. All right, so any questions about 2B or not 2B? God, I'm terrible. All right, how many processes did we create, excluding the original running process? In this case, we wrote it out before: we have four newly created processes, because each of the threads creates a new process, so four. Any problems with that? All right, so, wow, I have a typo here, that's embarrassing. What did we forget to do with the newly created processes? Why is this especially an issue in this case, where a server runs for a very long time? Whenever you create processes, you should probably think: am I being a good parent? So is this a good parent? No. If process 100 is the original process, all of these would be children, right? All of the children just run ls and then finish. And whenever they finish, say process 101 terminates, what is the current state of process 101? Yeah, it is a zombie, right? And we don't like zombies, because they waste resources. So in this case I could create up to four zombies, because I have four connections. If 10,000 connections happen, how many zombies do I have? 10,000, which starts being bad. If I have a million, how many zombies do I have? A million. Okay, now I have actually run into an issue: I will eventually run out of memory and then crash. And this is behind the common "solution" for some poorly written software: yeah, just restart the service and it'll work again, no problem. Probably because of an issue like that. So any questions about that? Yep, yeah, so in this case, do I create an orphan?
Yeah, in this case, assuming I don't kill the server process for whatever reason, it's just in an infinite while loop waiting for connections. It'll run forever, so I won't have any orphans until, well, maybe this process has created too many zombies and we have to kill it or something like that, and then it would create a bunch of orphaned zombies, and then they'd all get cleaned up. Yeah, so I could just do a waitpid here, wait on pid, right? Now we don't create any zombies; we're probably good now. All right, so what was that? Yeah, so we didn't call wait on the children, which creates a bunch of zombies. They're not going to become orphans, they're not going to get reparented, they're just going to stay zombies for as long as the server is running, and eventually we're going to run out of resources. Yeah, if you use WNOHANG on the waitpid, well, you might call it while that child is still running, it reports that it's still running, and then your thread terminates and never calls waitpid on it again. So yeah, in this case it makes sense for the wait to be blocking. Yeah, so this return NULL is the end of thread one running, thread two running, and so on. It implicitly calls pthread_exit, and that ends the thread that got created here. It just ends the thread, so thread one will be gone, and because it was detached, it won't be a zombie thread; it cleans itself up. But all the other threads, including the main thread, are still running. If instead of implicitly calling pthread_exit I just straight up called exit, then I exit the whole process. All the threads are gone, and, well, I don't have any issues anymore, because this process doesn't exist. All right, so now: what would happen if we did not fork, and instead always executed dup2 followed by exec directly after sleep in every thread? So what that is saying is, let's see.
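A minimal sketch of that fix looks like this: the parent reaps its child with a blocking waitpid, so no zombie ever lingers. In this demo (my own, not the exam code) the child just exits with a status instead of doing the dup2-plus-execlp:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int fork_and_reap(void) {
    pid_t pid = fork();
    if (pid == 0)
        _exit(7);                /* child: stand-in for dup2 + execlp("ls") */
    int status = 0;
    waitpid(pid, &status, 0);    /* blocking wait: the zombie is reaped here */
    return WEXITSTATUS(status);  /* the child's exit status */
}
```

In the actual server, this waitpid call would go in the handler thread right after the fork, in the parent's branch of the if.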
So instead of the situation we had, I don't have this fork line, I don't have this check, I don't have anything like that. Yep, so in this case, what's going to happen? Say I again have four clients that connect, and four threads all set up to run this. Well, one of them is eventually going to execute first, say thread two. Wow, that's gigantic. Thread two gets its file descriptor, sleeps for five seconds, then calls dup2, and remember, it's running as process 100. So it would change process 100's file descriptor one to point to its socket. Then it would execlp, and process 100 would be replaced with ls. And now the server is not running anymore. Whichever connection first made it there, we'd send that information back, and every other thread would just be wiped off the face of the planet. Yep. Now, if instead of this whole terrible thread thing, I just delete this and essentially take the fork code and move it here: is that bad, anyone? Yeah, if I move it, I essentially only have one thread executing. As soon as we get a connection, it makes a child, and in this case assume I have my wait, so I don't create any zombies. Well, it will create a new process, wait for it to finish running, and essentially I can only handle one connection at a time. So I have to accept a connection and wait for it to finish completely before I start another one. I can never do anything in parallel, so it's going to be really slow. Yeah, so the problem is this blocking waitpid. If I want to make it work and I get rid of the blocking waitpid, well, then technically I could go through the while loop and poll with WNOHANG. Yeah, or you could have a signal handler or something like that, and that would work fine.
In reality, the thread probably does a bunch of other local work that might run for longer, so it would probably be better to use threads. But in this particular example, which is kind of silly, plain processes would be fine if you had a signal handler or something like that. Yeah, no, it doesn't matter, because the operating system is going to schedule processes and threads the same way, so it's going to be essentially the same thing. Yeah, so whenever we start a thread, we give it a function here, and whenever that function returns, it implicitly calls pthread_exit, just like in lab three or lab four. Yep, it doesn't matter what type of thread it is; it's always going to call pthread_exit when it's done with its function. The only difference is whether it's a detached or a joinable thread. If it's a joinable thread, then the rules are like with processes: I have to call join on it, which is essentially the equivalent of wait. Yeah, oh no, all right. Let's see, there's one last question. What would be the issue if we removed pthread_detach? So if I remove that, now I have zombie threads. They finish executing, they call pthread_exit, and then, well, I don't call join on them, so they're wasting some resources. So I could have zombie threads and zombie processes at the same time and have a whole zombie party. No, pthread_detach does not implicitly call join. If you want, you can think of it kind of like that, except it just trashes the return value, because part of join is that you get the return value out of it. It essentially does the same thing in that it cleans up all the resources. So if you want to think of it as automatically joining and trashing the return value you never see, yeah, exactly. If we were to use pthread_join here, where would we put it?
Yeah, it'd be pretty bad. So this is an example where detach is actually a good thing, because I don't care about the return value, and placing the join would be a pain. All right, so now: do we do locks, or do we move on? All right, let's try to do it in about seven minutes. So, the transfer function. This looks kind of similar to the bank thing we did in lecture for the parallelization example. We have a struct called account that has three fields: a mutex, a name, and an amount. Then we have our fun transfer function that... oh, that's smart: it checks if we're transferring to and from the same account, and if so, it just returns. Otherwise, we lock the two mutexes, and then we have a deduct function. In deduct, we lock the other account. Yeah, I will make the threads a different color. All deduct does is check if the amount is able to be deducted, and if so it subtracts that amount and returns the amount it actually deducted. And if there's not enough money, it just sets the amount to zero. Then in transfer, we add the deducted amount to the to account. So, in part 3A: does this code have any data races? Come up with an example of a data race, or justify why it does not have one. Yeah, so circular wait is not a data race, right? That's a deadlock, so that answers a different question. This code does not have a data race, because, remember, a data race is two concurrent accesses to the same location where at least one of them is a write. The to account's mutex is protecting the reads and writes to its location, the to account's amount. And then in deduct, we're protecting the concurrent accesses to the from account's amount. We're not trying to access anything of the to account's inside deduct, and vice versa.
So we only ever change an account's amount while holding its associated mutex, so only one thread can do those accesses at a time. And while there's only one thread, there are no concurrent accesses; it doesn't matter if I'm reading or writing. If only one thread can touch it at a particular time, we have no data races. Yeah, so here, with this particular call right here, I'm only changing the to account's amount, and then within the green, I only change from, because, well, from is always equal to that. So, no data races in this case. All right, next question: does this have a deadlock? It sure does. Assume we have a transfer from A to B and a transfer from B to A. In that case, we could have thread one executing this and thread two executing this. Thread one could go ahead and acquire lock A; now it has lock A, and we context switch over to thread two, which executes and acquires lock B. And now we are screwed, because neither of those two threads can make any progress anymore. Thread one wants lock B, held by thread two; thread two wants lock A, held by thread one. So we have a deadlock. All right, the next one: given the issues you found in 3A and 3B, explain how you would fix this code to prevent these issues. You may not significantly refactor the code. That means transfer must add to the correct account and call deduct. Also, deduct must safely decrement the account only if it won't end up with a negative amount; otherwise it must be set to zero. Deduct always returns the amount deducted. You may write a replacement function or explain yourself clearly. So how would I change this to get rid of the deadlock, since that is our issue? Yeah, a single mutex and call it a day. You'd prevent the deadlock, but then you'd be going fully serial, and that would probably count as significantly changing the program. No, trylock is not significantly changing it.
So yeah, is that going to be your answer, just use a trylock? So in here, that's one option: keep deduct the same. Well, actually, I could just move the code around in this case, so I don't even have to do a trylock. In fact, this question was actually trying to prevent you from doing a trylock, because adding an argument so I could give up the first lock would be refactoring it. So I could actually just call deduct outside of the locked region and only afterward adjust the to account under its own lock. So I move it so it looks like this: I call deduct without holding any locks, and then I take the lock on to, only after I'm done with the deduct. So now I'm only ever holding one lock at a time, and I don't have any deadlock conditions. All right. So we are done. Oh crap, I'm over time. I can't even play the song anymore. Sorry. And no, you can't do a trylock inside deduct, because you don't have access to the other lock there. All right. Just remember, I'm pulling for you, we're all in this together, and then, yeah, there we go. All right, you can leave.
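For reference, here is a sketch of the reordered fix described above: deduct runs under only the from account's lock, and the to account's lock is taken alone afterward, so no thread ever holds two locks at once. The struct and function shapes are my reconstruction of the exam code (I've dropped the name field), not a copy of it:

```c
#include <pthread.h>

struct account {
    pthread_mutex_t lock;
    int amount;
};

/* Decrement from's balance by amount, or drain it to zero;
   return how much was actually deducted. Only from's lock is held. */
int deduct(struct account *from, int amount) {
    pthread_mutex_lock(&from->lock);
    int deducted = amount;
    if (from->amount >= amount) {
        from->amount -= amount;
    } else {                       /* not enough money: take what's there */
        deducted = from->amount;
        from->amount = 0;
    }
    pthread_mutex_unlock(&from->lock);
    return deducted;
}

void transfer(struct account *from, struct account *to, int amount) {
    if (from == to)
        return;                    /* same-account transfer is a no-op */
    int moved = deduct(from, amount);  /* no lock on 'to' held here */
    pthread_mutex_lock(&to->lock);     /* one lock at a time: no deadlock */
    to->amount += moved;
    pthread_mutex_unlock(&to->lock);
}
```

Note the trade-off: between the deduct and the deposit, the money is briefly in neither account, which is fine for this exam's constraints but worth stating in your answer.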