All right, welcome back to Operating Systems — last lecture. Yay, although there are only a few people in this course; section one outdid you. So, any questions? What should we go over today? Anything we feel like? Yeah, if two file descriptors point at the same global open file entry, then if one of you lseeks or something like that, it changes the position for both file descriptor numbers, so weird things could happen. If you don't open before the fork? Yeah, if you open it after the fork, then it's not shared, right? Each process has its own independent position and its own global entry and all that fun stuff. So you want a question involving that and threads and all that fun stuff? Sure, any other requests? Nothing? All right, so you want to start with this? So, page replacement. Do I need to go over this? Hopefully not — hopefully you can practice this on your own, and it was fairly boring when we did it, so we can skip that. All right, threading. This will blend together a bunch of different subjects; we can also use it to talk about file descriptors and all that fun stuff. So in this question, I write an interesting server that allows remote clients to run the ls command on it. Don't know why I did it, but I did — maybe I was tired from running home. For the code example, assume no errors occur and all system calls are successful. As a reminder, the file descriptor returned from accept can be used in subsequent read and write system calls — and again, remember this was a closed-book exam, so I was a bit more wordy. So let's see what this program does. This program has a main; let's assume we have process 100 that starts executing main. It's going to create a local variable called socket_fd and set up the socket. Now, the nuts and bolts of sockets aren't going to be on the exam, but the fact that they're IPC is totally valid.
So here I just say it's set up — it calls all the system calls for you — and then it does while true, and accept is a blocking system call that will return a new file descriptor whenever there's a new connection, and then you can do read and write system calls on it. The first part of the question says: assume four clients connect to it and each one makes a call to sleep, and none of them have returned from sleep yet. How many threads are there in total, and where are they executing in the code? So I'll leave that as p100. If clients connect to it, accept is going to return for the first connection and I'll get a new file descriptor, probably file descriptor 3. Then I create a new thread and tell it to execute this run function, giving it the file descriptor — so let's say we have thread one, and it gets file descriptor 3. So I create it, set it up, and then I detach it. Then the main thread in process 100 calls accept again; we get another connection, we create a new thread, which probably gets file descriptor 4, we detach it, go back, call accept again, get a new connection, create a new thread — so we get thread three, then thread four, which probably get file descriptors 5 and 6 — and then all of those threads are currently sitting right here in this sleep call. So how many threads are there in total, and where are they? Yeah, five threads in total. Where is each of them? The main thread is somewhere in the main function — where specifically? It would be blocked in this accept system call, waiting for connection five to come in. So the main thread is stuck in accept, and threads one, two, three, and four are all in that sleep, because the question says they're currently in the sleep.
So five threads in total: the main thread is blocked in accept, and the other threads we created are all in the sleep function. Questions about that part? Okay, the next question is: assume all the threads now return from sleep and complete — what gets sent back over the socket and why? So I'll leave you with this; read it for a little bit, look at it, and then tell me what gets sent back over the socket, or what this does and why. Any thoughts, or who wants to tell me specifically what happens in thread one? All right, a brief little recap: we're doing the CS111 final, the threading question. Our program looks like this: we start off with a main thread, which makes a socket, goes into this while-true loop, and then calls accept. Remember, accept is a blocking system call that returns a new file descriptor representing the socket connection, and you can read and write bytes to it. In this question, just to get everyone up to speed, we have four connections, and we're supposed to explain how many threads there are in total and where they are. If we have four connections, the first connection returns a new file descriptor from accept, probably file descriptor 3; then we create a new thread, we want it to execute this run function, we pass it the file descriptor we got from the accept call, and then we detach it. So up here we have thread one, which probably has file descriptor 3, executing this run call. The question says every thread gets stuck in this sleep call, and we need to explain where all the threads are. If there are four connections, accept returns four times with a new file descriptor each time. So we create four threads, they each have their own file descriptor, and in total we have five threads.
So we have the main thread blocked on the accept system call waiting for the fifth connection, and each of our four threads — one, two, three, and four — is stuck in that sleep system call. In total we have five threads. The next question says: all of the threads return from sleep and complete — what gets sent back over the socket and why? Or specifically, what happens with thread one? Yeah, thread one would fork. So let's say thread one goes ahead — I'll do it in a different color — thread one calls fork. That creates a new process. How many threads are going to be running in that process? One. And what is that running thread going to be a clone of? Thread one, because that's what called fork. So if we call fork from thread one, we create a new process, which probably gets process ID 101. That will be an exact copy of thread one. Then we check if pid equals zero — what does pid equal to zero mean? Yeah, it's the child process, so it means you are process 101. The child does this dup2 system call, which we can explain. To tie this into the file system lecture: process 101's local open file table, which is essentially all the file descriptors it has open, might look like 0, 1, 2, and 3. Each of them essentially points to an entry in the global open file table, and each entry there has its own position, its own permissions, and a vnode pointer. So it may look something like this, where the last entry's vnode pointer essentially points to the socket connection. Now if we do a dup2, that makes file descriptor 1 also point to the same thing this file descriptor (3) is pointing to.
So the result of the dup2 is to change file descriptor 1 from whatever it was pointing to before to point to the same thing as 3. Now file descriptors 1 and 3 are both pointing to the same global open file entry, which represents our socket. So what does execlp do — or execve, or exec-whatever? Yeah, we replace the currently running process. So process 101 would cease to exist — well, no, sorry, process 101 would not cease to exist. It would just stop executing this code and start executing whatever ls is going to do. And ls just prints something to standard out — the contents of the directory. In this case, since we did a dup2, we changed the standard output file descriptor to point to the socket instead. So any of the output from ls is actually going to go over the socket. And guess what? This is pretty much a very scuffed version of SSH: it ran the ls command on the server from some client. So this is essentially SSH, just with no security. Because we did that dup2, for each client that connects to us, we're sending them the output of ls. And then thread one — going back to process 100 — finishes: it hits return NULL, which is the same as pthread_exit. And because this is a detached thread, we won't have the concept of a zombie thread or anything like that; it just gets cleaned up immediately. Any questions about that? So that's it: for every connection we get, we output the result of running ls on this machine. Because we do the dup2, and ls just writes to file descriptor 1, we simply change whatever file descriptor 1 points to. All right, the next question is: how many processes did we create, excluding the original process that is running main? Four, all right?
So each of these threads creates a new process — probably process 102, process 103, process 104 after the 101 from thread one. All right, I would bet that the next question is: what is the problem with this? Yeah, usually if there's a fork question, it'll probably involve me asking you about zombies or orphans and whether or not you were a good parent. So in this case, were you a good parent? Nope, I was a terrible parent, because I did not wait on my children at all, and this is especially bad. If I have four connections, how many zombie processes do I have? Four. If I handle a million connections, how many zombie processes do I have? A million, which is probably bad — I'm probably going to run out of memory at some point. And guess what, that is why a common solution from people who have not taken this course is: oh, my web service doesn't work anymore, restart it and it works. They probably have some issue where they're leaking memory or creating zombie processes or zombie threads or something like that. So let's see if we're correct. Yeah, it asks: what did we forget to do with the newly created processes? Why is this an issue in the case where the server runs for a very long time, since it's in a while loop? Process 100 will survive, so those processes will never be orphans — they'll just be zombies forever. We'd eventually run out of resources given enough time: run out of process ID numbers, run out of memory, or something like that. All right, the next question is even more fun: what would happen if we did not fork, and instead always just executed a dup2 followed by execlp directly after sleep in every thread? So what if my code actually looked like this: I did not have fork, I did not have this if — the exec just replaces the process. So each thread replaces the process entirely? Yeah, the first one, right?
Because there are four threads running in this process, whichever one reaches execlp first makes the process start executing a different program, and all the other threads cease to exist. Say thread two started executing: thread two would call dup2, replace standard out with its file descriptor, then call execlp, and this process 100 just gets completely replaced — everything disappears. So only the first thread that called execlp gives any output; for the rest of them, all the connections are lost, or they just won't exist anymore — I wasn't too specific with that. All right, the next one: what would be the issue if we removed pthread_detach? If I went back to this original code but removed it, I'd have zombie threads, right? So I create a thread — if I wanted to clean up my threads in this case, how would I clean them up? SIGKILL kills a process — sorry? Wait on them — and the word for waiting on threads is join. Yeah, so threads are either detached threads, where on termination all their resources are cleaned up immediately, or — if I don't detach, which is the default for pthread_create — joinable threads, where I have to join them to clean up their resources. So if I removed the detach, I'd have zombie threads wasting resources: the same issue as not cleaning up the processes, although maybe I could limp along a little bit longer. All right, any questions about any aspect of question two? All right, good. Do we want to do the next question? At least one nod — good enough for me. So this looks very familiar to the bank simulator we wrote before, but it's a little bit different. We have a struct account: it has a mutex, it has a name, and it has an amount of money. And in the transfer function, we're smart: we check that the accounts are not the same.
And then in here — I'll highlight it in blue — we lock to's mutex and later unlock it, and while we hold it, we change to's amount. We also call this deduct function with from as the argument, so from will be that account. And in deduct, we lock the account, which is the from account, and then unlock it. So you can probably guess the questions you'll be asked. Number one: is there a data race? We got one no — any other nos? Why do we not have a data race? Yeah: each time I'm reading or writing to's amount, I'm doing it while I hold to's mutex. A data race needs two concurrent accesses, at least one of which is a write, and I don't have any, because the mutex rules that out. Similarly, in between the green lines is where I'm looking at the from account's amount, and I'm only reading and writing it while I hold from's mutex, so I don't have a data race on from either. No data races in this case, because I never have concurrent accesses to the same location — I'm using a mutex. All right, next one: if there's no data race, is there a deadlock? There is. Yeah. If I have two threads, thread one and thread two, and thread one is trying to transfer from A to B while thread two is trying to transfer from B to A — something like that — then yeah, I can see how that would deadlock. Here's how: thread one goes ahead and acquires lock A. Then we context switch over to thread two. Thread two acquires its to mutex, which is account B — so it locks B — and now we're in a deadlock situation. Thread one can't continue, because the next thing it does is try to acquire lock B, which is held by thread two. And thread two also can't make progress, because it's going to try to acquire lock A, which is held by thread one.
So we have a good old-fashioned deadlock. All right, now the tricky part, because we can probably rule out some of our usual solutions. The next question is: without significantly refactoring the code — without moving code around, without changing the return value of deduct, without changing any of the arguments of deduct, keeping the core logic where it is — how do I fix this deadlock? So I can't move the locks around between functions; I can't lock from here, because my deduct function has to stay thread-safe. Yeah, sorry. All right, can I do trylock? How would I do trylock? In deduct, I'd change this line to trylock — so try to get the account's lock, and I'd probably have that in a while loop or something like that. But what happens if I don't get the account's lock? Then I have to unlock to — and to is not defined in this function; I don't have access to it, and you can't add function arguments. So we can't do that. You could if you had access to it, but I wrote it in a silly way precisely so you're prevented from doing this. All right, so for this one we have to think a bit outside the box. Another standard approach is to prevent circular wait by always acquiring the mutexes in the same order. Can I do that in this example? Not really. So usually your deadlock problem is hold-and-wait: I hold a mutex and try to acquire another one. Is it possible to fix this so that I only hold one mutex at a time — so I never try to acquire a second while holding the first? Yeah, just make them separate. So I could have something like int amount = deduct(...) — essentially this line — and hold the result in that variable. I can call that, that's fine; it won't have a data race within deduct.
And then I'm saving its output to a local variable, and instead of this, I just add that amount to the to account. So essentially, if I wrote it out: I lock from, do a bunch of calculations, do all this stuff, unlock from; then I lock to, add to it, and unlock to. Yeah, it would be up here, like int amount = deduct(...). In other words, written out nicely, it looks more like that. We don't have a deadlock in this case, because I only ever hold one mutex at a time. All right, next question — or do we have any questions about that? Yeah, I try to make it kind of like real life: you just have to look at it, argue about it, and see what it is. For threading, it'll likely be either a data race or a deadlock; if it's neither, that'd be a weird exam question. I don't remember — I think I tried to make something weird, but I don't think I got too weird. You're smart, you can figure it out. And if a question just doesn't have any problems, you can say it doesn't have any problems — that's fine — but you have to argue it either way, right? Yeah, for data races, the quick thing to do is: see if there are any variables or anything shared between threads. If there are, see if there's a write to it. If there's a write, see if that write can happen at the same time as a read or another write, in which case you have a data race on your hands. And if you don't — well, then, if there's more than one mutex, that's a red alert for deadlocks. Yeah, to fix them, I either break circular wait, or I break hold-and-wait, or I just only acquire one mutex at a time. Those are essentially the only three solutions we can really do, at least in this course. In reality, you can just take mutexes away if you want, but it gets more complicated. Yeah, lost wakeup. Did we do — I don't think we did this in this section.
So did we do this question, but with condition variables? Did we talk about condition variables? You were in the other section. Did this section do this with condition variables last lecture? Okay — whoever was here, did we do this with condition variables? You just want me to do it with condition variables? Sure, all right, cool. So we definitely did look at this question. The idea was that we want to enforce some type of order between threads: we wanted thread zero to execute this initialize_everything function, and only after thread zero has executed that function do we want this initialize_thread function to run, hopefully in parallel across all the other threads. So one question I could ask is: fix this using condition variables instead of semaphores. If I want to use condition variables, I have to think about when it becomes valid for all the other threads to start executing — and my condition would be: has thread zero called this initialize_everything function? So the external state I care about is a boolean, is_initialized — have I called it, yes or no — initially set to false. Then for a condition variable, I need the condition variable itself, which I like to just think of as a queue, and a mutex to protect all of that and make sure I don't have a data race on this is_initialized variable. So if I wanted to make this safe using condition variables — I can just remove this line — what would I do? Okay, with condition variables there are two functions we have to worry about. There's cond_wait, which takes the condition variable and a mutex.
cond_wait assumes you already have the mutex acquired, and it will atomically put this thread to sleep, add it to the queue, and unlock the mutex, then wait — at least for the purposes of this course, as we talked about it — for another thread to wake it up with cond_signal. cond_signal wakes up exactly one thread on the queue, and we do not need the mutex acquired to call it. There's another function we kind of glossed over that's actually more relevant here: cond_broadcast. We didn't explicitly talk about it much, but it wakes up every thread in the queue and lets them execute, which is actually more appropriate for this question, though we can work around it if we want. Yeah — no, broadcast wakes all of them up; they'll all try to grab the mutex, but only one will get it, and the others will be blocked on that mutex until it's unlocked, and then they can go. So in this case, what should I do to make all of the comments valid using condition variables? I could do a wait here — just an unconditional wait. If I wait, I have to make sure I have the mutex, so I also need to lock the mutex before calling wait; it needs to at least look like this. The wait puts this thread to sleep and blocks it until another thread wakes it up through signal, or in this case maybe broadcast. Sorry? Yeah, this is in the else. And yeah, I should probably check the condition, because if I just unconditionally wait, I go to sleep immediately — but what if the condition is actually already true? So what should I have? Oh, I lost my underscore. If what? Yeah, if not is_initialized — before locking? Only thread zero should call initialize_everything, and then they should all run after that; multiple threads can run initialize_thread at the same time. Before what? You want to move this a line up — or move it down?
Yeah — reorder it so the lock comes before the if. If I do that, I'm using that mutex to avoid a data race on is_init: I make sure I don't have a data race on is_init. But what do I need to do in thread zero? Yeah, right here, after this initialize_everything, I should at least set is_init equal to true, and then I can do the cond_broadcast. Instead of broadcast, I could also have a for loop over the number of threads and signal that many times if I wanted, but broadcast is going to be slightly more efficient, because it just wakes up every single thread that's currently waiting. All right, is this my whole solution? Am I done? There's one major issue here and one minor issue. And it has to do with — I heard someone say lost wakeup. Who said lost wakeup? Is there a lost wakeup here in cond_wait? Yeah, it's a problem in the if statement — the thread-zero if statement — no, there are two: it's in the thread-zero if statement. What could happen — say I just have thread two and thread zero — is that technically I have a data race here, on is_initialized: two concurrent accesses. All the other threads are reading it, and that's protected, so that's fine, but there's one thread writing it, and that write can happen concurrently with the reads. So I have a data race, and I'm probably in trouble. The trouble, in this case, looks like a lost wakeup, because here's what can happen: thread two executes, it locks the mutex, so it's all good, and it reads is_init, which is currently false. Then we context switch over to thread zero, and thread zero changes is_init to true and calls broadcast, which wakes up everything on the queue. But guess what?
Thread two is about to put itself on the queue, but it hasn't added itself yet. So thread zero will just continue on, initialize itself, and be done. That means thread two in this case will never get woken up by anything, because only thread zero calls broadcast — we have a lost wakeup on thread two. Could we get extremely unlucky and have a lost wakeup for every other thread? Actually no, just one thread, because we acquired that mutex. Yeah, so I should just prevent this data race completely: I should have thread zero acquire the mutex before it writes to is_init. So the solution should look like lock, then the write and the broadcast, then unlock. Now we don't have this lost wakeup, because at that context switch, thread zero wouldn't have been able to make any progress: it would have tried to acquire the lock and couldn't get it, so it can't update the value until this thread actually puts itself to sleep and unlocks the mutex — and then thread zero can go ahead, change it, and wake it up. All right, the other, minor problem, which I will just spoil in the interest of — oh, we only have one minute. All right, cool. The check should not be an if; it should always be a while, to be safe, because — if you actually read the spec for pthread_cond_wait — it may spuriously wake up your thread even though no thread has called signal or broadcast on it. So it could be the case that you get woken up while the condition is still false, and with an if you just fall through and execute this line while the condition is false, which is not what you want. Yeah. So basically, if the condition is not satisfied, you loop and wait again.
Yeah, cond_wait — I like thinking of the condition variable as a queue, and what cond_wait does is atomically add this thread to the queue, unlock the mutex, and block this thread. Then whenever it gets unblocked, the first thing it does is reacquire the mutex before returning from wait. All right, that's it. We're done — the course is done, yay. Just remember: we're in this together, and good luck on Friday.