OK, welcome back to class. I hope everybody had a pleasant weekend. So today, we're going to try to get through as much as we can of the rest of our very top-down view into the operating system from the perspective of the system calls and process support. So hopefully, between today and Wednesday, we're going to finish the process-related system calls. And at that point, we're going to take a big shift in perspective. We're going to go all the way down to the bottom and start thinking about hardware and the messy details of hardware, and start talking about the abstractions that operating systems build up on top of hardware. So this is the end of this very top-down, very externalized view of the operating system. And once we get through this introduction to processes and the process lifecycle, then we'll start talking about the stuff that I think is probably a little bit more fun. So if you're bored, just hang on for a few more lectures and things will start to become more interesting. All right. So at this point, assignment zero is up. All the questions that are up there are the ones you guys will have to do. Hopefully other things work. And today, we should have the question submission forms updated for the code reading questions and the scripting things and other stuff, right? The code reading questions are going to be graded by the TA, so there will be some turnaround associated with that, but the scripts... And for assignment zero, I'm going to ask you guys to just upload a patch with the very small change you made where you added your username to the printout when the kernel boots. That's all I want to see. The reason for this is just to get you guys used to the process of submitting patches. There'll be additional instructions on how to do that. But last year, there were some people that really struggled with that. It's not that hard. You guys will get used to it, but this will give you a chance just to practice.
But the only thing that the automatic grader will really check is that your patch applies, and that when your kernel boots, it sees your username somewhere before it exits. So that should be pretty simple. And again, all this stuff is still in the queue. Sorry. Processing the to-do list as fast as I can. All right, so last time we talked about fork. We finished talking about the process of creating new processes and the different overheads involved and the challenges associated with that. So how many people remember Friday? How many people had a good weekend? Should be the other half of the class. All right. All right, so let's talk about Friday. Does anyone who remembers Friday have any questions about Friday? Any questions about fork? We talked a little bit about fork overheads, some of the little tricks we play. Yeah. Multi-threaded fork? Will I speak a little bit about multi-threaded fork? No. Oh, so can I ask you to table that for maybe a month? Because we will come back to threading in a few weeks when we talk about threads, because we'll talk about user and kernel threading library implementations and the pros and cons of where to do it. And I think that'll answer some of your questions. The idea of having threads associated with processes is an abstraction that can be implemented within the kernel, but it can also be implemented in user space libraries. So for a long time, for example, pthreads is a popular user space threading implementation. And it turns out you can actually implement multiple threads without the kernel knowing anything about it. And there are pros and cons to doing that. But for a period of time, there were kernels that didn't support multiple threads in user processes. And so the only way to write a multi-threaded application was to use one of these user space threading libraries. That's a good question, and there is some richness there, but we'll get back to it. All right.
So today we're going to talk about exec, wait, and exit. But before we get there, let's talk about fork, what people remember about fork. So after fork, the child thread resumes executing where? The child thread returns and starts executing where? Wembley. Yeah, at the same point that the parent called fork. And you're right. It's actually not exactly the same point, because what would happen if I returned to exactly the same point where I called fork? What's that? Well, no, I would return to what? What did the parent just do? It forked. And so I would fork again. And fork, and fork, and fork. So it has to go on. And you guys will remember this when you're doing assignment 2, because there is one small change you have to make to the child's stack frame when it returns, which is that it cannot go back to the same instruction. It has to go to the next instruction, which would be the return from fork. Otherwise, we're stuck and we can never make any forward progress. And we end up with that nice little fork bomb that we talked about last time. All contents of memory in the parent and child are what? Somebody from this side of the room. What's your name? Tim. They are shared? Are they shared? Brian, they're copied. They're identical. They are not shared. This is, again, I have a child. And if the child wants, it can completely dissociate itself from me. So I can't share memory with it. Otherwise, I'd be able to meddle in its life, and it would be able to meddle in mine as the parent. I don't want either one of those things. OK, both child and parent, what about files? When the child starts executing, what's true about both the child and the parent? They have the same file handles open. They have the same files open at the same position. And again, there are many variations of fork. So you can find forks that do lots of different versions of this. But we're talking about the one that makes the most identical copy.
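A minimal sketch of the "copied, not shared" point above, assuming a POSIX system. The helper `fork_copies_memory` and the `fork_result` struct are names made up for this illustration, and `waitpid` is used only so the parent can observe what the child did. The child changes its copy of a variable and reports the new value through its exit code; the parent's copy should be unaffected.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical result type for this illustration only. */
struct fork_result {
    int parent_value;  /* what the parent sees after the child has exited */
    int child_value;   /* what the child saw, carried back in its exit code */
};

/* Hypothetical helper: forks, lets the child modify its own copy of a
 * variable, and reports what each side ended up with. */
struct fork_result fork_copies_memory(void)
{
    struct fork_result r = { -1, -1 };
    int value = 1;

    pid_t pid = fork();
    assert(pid >= 0);  /* sketch only: real code would report the error */

    if (pid == 0) {
        /* Child: fork returned 0. This write touches only the child's copy. */
        value = 42;
        _exit(value);
    }

    /* Parent: fork returned the child's PID. */
    int status = 0;
    if (waitpid(pid, &status, 0) == pid && WIFEXITED(status))
        r.child_value = WEXITSTATUS(status);

    r.parent_value = value;  /* still the parent's original value */
    return r;
}
```

If memory were shared rather than copied, the parent would see 42 in its own variable after the child exits. With fork's copy semantics it still sees 1.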
There is always, however, one exception. Even if I'm making the most identical copy possible of the child, what will be different about the child when it returns? The return code. I have to return twice. I have to return the child's PID to the parent and 0 to the child so that we can go off on our way knowing who is who. So in this example, who is Bob and who is Alice? I've got one of the ladies, Sarah. Bob is the child and Alice is the parent. Good. OK, what about pipes? So remember, I have this standard UNIX abstraction of being able to link together little shell utilities to achieve these big, fantastic things, or at least to do some simple stuff. How do pipes work? What is a pipe? First of all, what system call do I use to create a pipe? This should be pretty easy. Alyssa? Pipe. Yeah, there we go. The pipe system call, appropriately named. What does pipe do? Somebody in the back, right there. What is your name? Yep. Jude. What does pipe do? Yeah. What's your name? Luke. Right, and it does this by creating this anonymous pipe object that is like a file. It can be read and written to, but it is not a file in that the contents do not live anywhere on disk. There is no disk block that holds the contents of the pipe. The pipe contents are buffered in kernel memory. And I forgot to check whether or not you could actually use a pipe in both directions. But if somebody reminds me on Piazza, I will look that up, because I'm curious myself. All right, so let's talk. This is a good review of what fork does. So after fork, well, this is actually how pipes work. So what has happened here? This is a process that's about to use a pipe to communicate with another process. So it's created this pipe object, and how does it initialize both ends? This is a describe-this-slide sort of question. So I've got this pipe object, and where do both ends point? Into the same process, right?
So I have a file handle associated with the write end, and a file handle associated with the read end, all right? And then what happens next? Somebody who hasn't spoken up yet. Jeremy, good to call on you for a change. Yeah, so what is the slide? Yeah, how does this look after the process forks? You're getting ahead of me. OK, so after the process forks, right? My child has a copy of my file table, and so it has its own file descriptors pointing to the same file handles, right? So it also has a file descriptor pointing to the write end and to the read end. And then what do you do next? Right, and at this point, by closing the appropriate ends, I can set up the pipe the way I wanted, right? Where the parent or the child has the write end and the other process has the read end. And this allows us to pass data from one to the other. All right, so what was the problem with fork, right? The problem with fork is that there's all this state associated with the process that I have to copy, and copying that state is expensive. What is some of the more expensive state that I have to copy? Swappa. Yeah. What's the more expensive part of the fork operation? I have a lot of state to copy, but in particular, what part of the state is really expensive to copy? Memory, yeah. I've got potentially all this memory in use and now you're telling me that I have to find memory on the system and make copies of all of that memory that the parent has, right? And then why is this frequently a silly thing to do, right? What frequently happens next that causes me to regret all the work I did? Yeah, Jen. Yeah, the child calls exec, because I didn't fork a process for it to be me. I forked it to go off and do something else, right? And so right after I get done making a complete clone of myself, it calls exec and it completely blows itself away. It changes, right? We'll talk about exec today, right? So what are some solutions to this problem?
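The pipe setup reviewed above (create the pipe, fork, then have each side close the end it does not need) can be sketched as follows, assuming a POSIX system. The helper name `pipe_from_child` and its parameters are hypothetical; here the child keeps the write end and the parent keeps the read end.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child writes `msg` into a pipe and the parent
 * reads it into `buf` (NUL-terminated). Returns the number of bytes read,
 * or -1 if the pipe or the fork could not be created. */
ssize_t pipe_from_child(const char *msg, char *buf, size_t buflen)
{
    assert(msg != NULL && buf != NULL && buflen > 0);

    int fds[2];  /* fds[0] is the read end, fds[1] is the write end */
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: it inherited both ends, but only writes. */
        close(fds[0]);
        ssize_t w = write(fds[1], msg, strlen(msg));
        close(fds[1]);
        _exit(w < 0 ? 1 : 0);
    }

    /* Parent: only reads. Closing its write end matters, because read()
     * reports end-of-file only once every write end has been closed. */
    close(fds[1]);

    size_t total = 0;
    ssize_t n;
    while (total < buflen - 1 &&
           (n = read(fds[0], buf + total, buflen - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';

    close(fds[0]);
    waitpid(pid, NULL, 0);  /* collect the child so it does not linger */
    return (ssize_t)total;
}
```

Calling `pipe_from_child("hello", buf, sizeof buf)` with a large enough buffer should leave `"hello"` in `buf`. The bytes never touch the disk; they pass through the kernel's pipe buffer.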
There were two different solutions we discussed. One has to do with trying to observe this pattern and optimize for it, and the other one had to do with just changing the semantics of the fork system call to support it better. What's your name? Anurag. Anurag. Pick one, you have two choices. Right, so the first solution is, okay, so we just talked about this. The first solution is this clever memory management technique, which we'll get back to, which essentially allows us to avoid copying the memory without allowing processes to share pages, right? And again, we'll talk about how this works. And then what's the other solution? What else could I do that would also support the same common pattern? Right, so I'll create a new system call that essentially will fail if the child does anything other than immediately call exec, all right? And throughout this semester I'm gonna try to point out cases where there are design patterns that I want you guys to understand, right? Because one of the reasons we talked about for studying operating systems in the first place is that these are beautiful, very mature systems that people have designed over the years. And a lot of smart people have thought pretty hard about how to design these systems. And so the designs that they've produced are fairly mature and interesting, right? So I want you guys to pick up on some of these things. And we will come back to this. All right, so remember we used this pstree utility, and I did look this up before class and discovered some of the syntax here, right? So the square brackets indicate that there are two copies of this Apache process, all right? Here, for example, this means that there are eight copies, identical copies of a Python process.
So this is, you know, either Mailman created eight different copies by calling fork and then having each exec the same program, or Python was called once and forked itself seven times, right? This notation indicates that there are 17 threads inside each one of these two Apache processes. So for some reason, and I don't understand this, Apache has two processes that are identical that have 17 threads each for a total of 34 threads, and then two other processes that have 26 threads each for a total of 52 threads. And I have no idea why these numbers aren't powers of two or 10, but that's what Apache did, right? So this is the semantics of pstree. I know there were some questions about this. OK, so any more questions about fork before we go on? All right. So today, uh-oh, that's not good. Hold on, hopefully this will fix that. Oh, wait, there we go. That's what I wanted. OK, good. That's better. OK, so it had every slide in the whole deck all at once, which is kind of interesting, but not what I wanted. All right, so we talked about this a little bit on Friday, but what if all I had was fork, right? What would the system consist of? This is init. So all I have is fork. Fork is the only system call that allows me to manipulate processes. So what does the system look like after a while? There will only be init. What's your name? Mukta. Right, so I'd have this sort of thing, right? And I could keep going with this, but I got tired after a while. But I just have lots of different copies of init. And this wouldn't be a very interesting system, right? So we need to allow processes to change, right? There has to be a mechanism for this. And the mechanism for this on Unix-like systems is a family of system calls, because if you look, there's execv, there's execve, there's execvp, but it's essentially a family of calls that do very similar things, right? And what they do is they allow a process to request that it be transformed into something else, right?
So this family of system calls replaces the calling process with a new process built from a blueprint that is loaded from a file. And we talked a little bit about this last week. We used this analogy of the diva dressing room, right? So execv is supposed to read a file. In this case, the file is this ELF format executable, which we mentioned last week, and we'll show you a little bit more about it today. And the ELF format is designed to provide a complete description of what the new process wants to be like, right? What am I going to be like in my new life, right? And, I mean, there are some hints on the slide, right? So what does it mean for a process to change, right? What would the ELF format have to contain? This is a read-the-slide-and-regurgitate-the-slide type of question. Is it Bethany? Yeah. Okay, okay, so, okay, memory, right? Memory is a big part of my environment, right? So the ELF format definitely has to describe what the contents of memory should be when I start executing, right? And the contents of the process are going to be completely replaced according to the blueprint that's in the ELF file, okay? And then also, right, where is the entry point? Where should the first thread in my process start executing? Because this is what's going to happen, right? I'm going to read this file. I'm going to set up the contents of memory according to the instructions in the file, but then the thing I have to do to get the process going is I have to actually initialize the first thread and say, here's the first instruction that you're going to start running, right? And, as you guys have glommed on to by now, Linux and other Unix-like systems have a standard format that describes how this binary file is supposed to be set up, okay? So here's another example using one of our fun UNIX utilities. So there's a program called readelf, and readelf does pretty much what you would think it would do.
It reads an ELF format executable and it displays information about that ELF format executable by obeying the standard and showing you what's in there, right? So this is for this program, /bin/true, right? What does /bin/true do? It returns zero, right? True, good, right? What does /bin/false do? It returns 1, something that's not zero, okay? So it's unclear whether or not you could have a simpler program than /bin/true, right? I mean, how many lines of code is /bin/true? It might be like one, you know? I think it might actually have a main, because it looks like it's loading the C standard library for some reason, which it probably doesn't need to do, but... Okay, so /bin/true just returns zero. Here are the contents of the /bin/true ELF file, right? So first of all, the program prints that the ELF file type is EXEC, right? This is an executable. What other kind of ELF file might there be? Why is there even a file type? Isn't every ELF file an executable? Jeremy? It could be a library. It could be a library, right? It could be a shared library file, and we don't have enough time in this class to really get into all the gooey details of loading and linking. This is probably as close as we're going to get today, but you guys are probably aware of the fact that there are these things called shared libraries on your system, which are routines and other pieces of code that are used by many different programs on the system. One of the most canonical is the C standard library, which may or may not be loaded by this, I don't know. But anyway, the point is that not every program contains every instruction, every function, every subroutine necessary to execute it. Many of them rely on the presence of standardized shared libraries on the machines that they execute on. And this executable has a hint, which we're gonna look at, as to how those libraries actually get loaded at runtime, right?
All right, so this is an executable, so this can be executed, it's not a library file. What's the next thing that readelf points out to us? Yeah, what's the next thing on the slide? ELF type is EXEC, it's an executable. AJ, can you guys read the slide back there? 'Cause this is the read-the-slide sort of question, right? The next thing highlighted in green is what? Entry point. How do you pronounce your name? Bart, okay, it's a nice Polish name. It has some j's and other consonants, nice name. All right, it says the entry point, right? So what's the entry point? Yeah. No? You were going in a good direction and then you muddled your answer, yeah. The first address that the thread is going to start at, and that is not actually main, right? If you look, the C standard library sets up main for you, right? The main function, and in fact, you can find, I showed, this is so embarrassing. I showed you how to do this, brain freeze. And so you guys may see this in section, right? But you guys have the code in your tree and you can see where the main function is called, right? There's assembly code that is run by the C standard library before main is called, right? Main is not what is called by the kernel. The kernel is going to use this entry point, which is probably going to be somewhere in the C library, right? And in fact, it is somewhere in the C library, right? 'Cause this is the program code, right? Well, maybe I'm wrong. Let's see here, 48, we'd have to figure out where it is. 48, A90, my hex is pretty bad. Anyway, I'm not going to do this on the board, but you could use this to figure out exactly where that entry point is, okay? All right, and then what are these, right? So it's told us a little bit about the file and then there's this list of information. This looks kind of similar to something we looked at before. So this is the blueprint, right? This is my description of exactly how I want my memory to look, right?
Each one of these contains a couple of things. So first of all, this is an offset, okay? So if we look at these offsets, right? What's true about them compared with the other numbers on here? They're pretty small, right? What is this an offset into? Remember, I've read information about a file, right? That file describes how memory is supposed to be laid out, right? But so where are the contents of memory coming from? Yeah. Well, okay, no, the entry point, remember, is where the first thread is going to start executing. Now I'm here, which is essentially the blueprint for memory, right? Remember, there were two important pieces of information that had to be in here. One is where the first thread is going to start, and I know that. The second is how memory is going to look, yeah. So these are offsets within the file, right? So this says start at offset 0x34 in the file, okay? Load 0x100 bytes into this virtual address here, right? And the memory stuff isn't necessarily going to make a lot of sense to you right now, but the idea is that these two pieces of data, the offset and the file size, tell me where to get the data in the file, and the virtual address and the memory size tell me where to put it in the address space. So this says take 0x100 bytes from the file and put them at this point in the address space, right? And then this next thing says, you know, this is barely anything, so 0x13 bytes from here. So this is the next segment, right? So I start at 0x34, I load 0x100, whatever 0x100 is in decimal, 256. 256 bytes into memory, and then I load the next 0x13 bytes, which is 19 bytes, into somewhere else in memory. Yeah, Jeremy. You know, I don't know what that physical address means. I will admit it, so I was ignoring it. Maybe next time, for next year. What's that? It is identical, and I don't want to speculate. You know, virtual addressing hasn't been around forever, so maybe this was... But anyway, I actually don't know what this means.
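What one row of that blueprint asks the loader to do can be sketched in a few lines of C. This is a simplified illustration, not a real loader: `struct segment` and `load_segment` are hypothetical names, the "address space" is just a flat buffer that starts at an assumed base address, and permissions are ignored. One detail beyond what the slide walk-through covers: in real ELF files the memory size may be larger than the file size, and the extra bytes are zero-filled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for one row of the readelf listing. */
struct segment {
    uint32_t offset;  /* where the bytes start inside the file */
    uint32_t vaddr;   /* where they should land in the address space */
    uint32_t filesz;  /* how many bytes to copy out of the file */
    uint32_t memsz;   /* how many bytes the region occupies in memory */
};

/* Copies one segment from a file image into `mem`, a flat buffer standing
 * in for the address space beginning at virtual address `base`.
 * Returns 0 on success, or -1 if the segment does not fit. */
int load_segment(const struct segment *seg,
                 const uint8_t *file, size_t file_len,
                 uint8_t *mem, size_t mem_len, uint32_t base)
{
    assert(seg != NULL && file != NULL && mem != NULL);

    /* Reject segments that fall outside the file or the address space. */
    if (seg->filesz > seg->memsz)
        return -1;
    if (seg->offset > file_len || seg->filesz > file_len - seg->offset)
        return -1;
    if (seg->vaddr < base)
        return -1;
    size_t dst = seg->vaddr - base;
    if (dst > mem_len || seg->memsz > mem_len - dst)
        return -1;

    /* "Take filesz bytes starting at offset and put them at vaddr." */
    memcpy(mem + dst, file + seg->offset, seg->filesz);

    /* Any memory beyond what the file supplied starts out as zeros. */
    memset(mem + dst + seg->filesz, 0, seg->memsz - seg->filesz);
    return 0;
}
```

With the numbers from the slide, a segment with offset `0x34` and size `0x100` copies 256 bytes, and the following one with size `0x13` copies 19 bytes.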
So ignore the physical address column for now, but that's a good point. You know, these are identical in this case, right? All right, so, and then what are these flag values over here, right? What do those indicate? These look like what? Permissions, and they are standard permissions, right? So this says right here, I have 256 bytes that I want to load from here. I want to put them at this memory address, and then I'm going to allow the process to do what inside that area? Read and execute, okay? And what's probably in this memory region? Yeah, it's like return zero, right? You know, whatever /bin/true is. I think /bin/true is actually probably compiled using the C library, using main, and all it does is say return zero, but there's a little bit of code there, right? So that's where that goes. All right, so then I have this little line here, which is quite interesting, right? And we'll get to that on the next slide, because I didn't know what this was when I started looking at this. And then I have these blocks here, right? And I'll admit that I don't know what all of these tags mean, but some of these are definitely shared libraries, right? And what this indicates is that the program has asked for a particular program on the system, which is called /lib/ld-linux.so.2, to interpret this section for it and to load those shared libraries. It turns out that this is itself an executable program, and if you execute it, this is what happens, right? This is probably one of the best error messages I've ever seen on a Linux system, right? So essentially you can run this, right? And it will give you a long description of what it does and then it'll say, chances are you did not intend to run this program, right? Because this is the program that's used to load shared libraries for programs at runtime, right?
So anyway, linking and loading is a sort of gooey technical topic that deserves several lectures, but I find it kind of boring. So we're gonna stop here. Okay, so one of the things that happens during exec is that the caller gets to pass arguments to the child, right? When you run a program in Unix, you pass arguments on the command line and the shell passes those arguments to the child that it forks and execs for you, right? I mean, if you couldn't do this, you'd have a very boring system, right? The arguments that are passed to exec have to pass through the kernel, right? Because the kernel is going to replace that process, and one of the last things it will do is take the arguments you passed in and put them in the process's address space in a place where it can find them, right? So there's an agreed-on location, and when the process starts running, it will say, here are the arguments I was called with, right? And getting those arguments in the right places is a big challenge you guys will face in assignment two, right? And if you've ever wondered, this is where main gets argc and argv, right? So when you've written C programs, you've written a main function, you may have wondered where those arguments come from. They come to main via the C library, but they come at the beginning from the kernel, right? So the kernel puts them into your address space in an agreed-upon place, and then when you start to run, you can find them, right? So exec also has an interesting return, right? It's similar to fork but different in its own way. So if exec succeeds, exec never returns, because where would it return to? The process that called exec is gone, right? So if exec is successful, exec never returns a value, right? Because the new process that is running has replaced the old process and there's nowhere for that return code to go.
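A small sketch of the argument passing described above, assuming a POSIX system with a shell at `/bin/sh`. The helper `run_with_args` is a hypothetical name. The caller builds an `argv` array, `execv` hands it to the kernel, and the new program finds it as the `argv` of its `main`. The fork and `waitpid` calls are there only so the caller survives to observe the result.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs the program at `path` with the given argv in a
 * forked child and returns the child's exit code, or -1 on error. */
int run_with_args(const char *path, char *const argv[])
{
    assert(path != NULL && argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* The kernel copies argv out of this address space before it is
         * replaced, then places the copy where the new program can find it. */
        execv(path, argv);

        /* Reached only if execv failed; on success nothing after the call
         * runs, because this program image no longer exists. */
        _exit(127);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

For example, passing `{"sh", "-c", "exit 7", NULL}` with path `/bin/sh` should produce 7: the shell received the three arguments and did what the last one said.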
So let's walk through what happens if I call exec, okay? So I've got a program, right? And maybe this is my shell, and maybe it's a copy of my shell that was just forked off, right? So my shell forked a copy of itself and that copy is gonna immediately call exec. It's gonna run /bin/true because I ran /bin/true as my shell command, right? So the thread makes the exec system call. The kernel copies the arguments. The slide should have said copy instead of move, right? They'll still be inside the address space, but the kernel has to copy the arguments out of the program that called exec, right? Then, so again, here's my address space right now, here's the file table, which we'll talk about in a sec. I don't know how we didn't get to that yet. So I have to load this file, right? I have to interpret the ELF format executable. I'm gonna use that interpretation. So that interpretation tells me here's what the address space is supposed to look like when the process starts to execute, okay? And then I'm gonna replace the address space of the caller with this new address space, right? So whatever the caller had in its address space is gone, right? All those memory contents are completely replaced, all right? And then the last thing I have to do, remember, is I have to copy in whatever the arguments were. In this case it's /bin/true, so hopefully I wouldn't pass too many arguments to /bin/true, because I don't think it does anything with them. I have to put those arguments in the address space of /bin/true so it can find them when it starts running, and then I start /bin/true running. Where do I start /bin/true running? At the entry point that was in the ELF file, right? Essentially I start it running where it told me to start it running, right? I'm a happy, compliant and helpful kernel, right? All right, so we hadn't talked about files yet, and I'm glad the slide is here because we need to.
So by convention, exec does not alter the file table of the calling process, right? Why not? I mean, I blew everything else away, right? I took all the memory, it's gone, right? You know, the threads are gone. Any state about the parent process that was there is gone except for the file table, which, again, by convention, there are at least versions of exec that do not modify, right? So why wouldn't I modify the file table? Why not just have exec by default reinitialize the file table or clear the file table or something, right? Why would I leave the file table alone? What's your name? Akshay. Well, I could read, well, okay, that's interesting. It might want to work with the same files. What in particular might it want to do with the same files? We just reviewed this today. Yeah, that's it. It wants to communicate with the other process. Remember all this work I did in fork with pipes, right? To get my pipes to work. Now, if I call exec and I blow away the file table, I can't communicate with my parent anymore, right? So the semantics of fork with respect to the file handles, the work that fork does to allow file handles to be shared between the parent process and the child process, is not that helpful if every child process that is not a copy of its parent loses all that state, right? So by default, I want to allow this, because most of the time when I set up a shell pipeline my parent and child will not be the same program, right? I mean, there are probably times in which I could do that for some reason, but most of the time I'm catting the output of something into something else. I'm grepping and then I'm doing this. I'm sorting or whatever. So usually I'm using several different programs along my pipeline, right? So each forked copy is going to call exec in order to run that new program, and leaving the file table alone allows me to set up these pipelines. Yeah, Sharon? Yeah, that's the system call that's used to set up those pipelines.
Absolutely, yeah. So the shell, and I wish I knew exactly how the shell does it, interprets that pipe character and creates pipes between the processes that it sets up for you, right? So if you give the shell a three-stage pipeline, it will set up something like this with newly forked processes, and if there's a third stage in the pipeline I'll have a pipe going from the child to its child, right? Or from the child to some other process, right? But yeah, the idea is that the shell uses those pipe characters to determine how you want the pipeline initialized. All right, any questions about the exec file handle stuff, right? So I told you last week we were gonna build basically the simplest of simple shells, right? And this is almost it, right? This is a big part of it, okay? And again, I mean, I made up pieces of this, right? Just to spare us some of the gooey stuff, right? So I read a line of input in the shell. I call fork. Who is this? It's the child. If I'm the child, I call exec with whatever the input you gave me was. Essentially, if you gave me /bin/true, I try to exec /bin/true, and then I go around and do it again. So what's missing here, right? What would be wrong with this? Actually, it's interesting, what would happen if I ran this code? Assuming that I had some functions that would help me to get this to work. Yeah, so essentially what would happen here is that the parent would go back to the top and try to read another line of input. If I had a real shell, I'd probably have some kind of prompt, right? So it would probably redraw the prompt and then it would sit there waiting for input. But in the meantime, the child is running, right? Then what if the child is also trying to read input from the console, right? Now I have a problem, right? I have the parent and child competing for input, right?
And so the parent's gonna get some characters, the child is gonna get other characters, and it's gonna be a mess, right? So I need something else here, right? I need a little bit more. And let's see what happens. That's what we're gonna get to today or next time. We get there today, all right, cool. All right, so yeah, there was a question, yeah. So exec never returns a value, right? Because remember, when you call exec, you're essentially saying I want my memory erased, right? So there's no one to listen to you at that point, right? Because the one who would normally get the return code would be the calling process, yeah. Ah, yes, I can get a return code from the process that calls exec. I just can't do it in the usual way, right? But we're gonna talk about how I actually do that for the next 15 minutes, right? So there is a mechanism to allow the process that runs to return a return code. But it's a little more involved than just returning a value up the call stack like you normally would. Was there a question in the back? Yeah, Nick? Ah, okay. Yeah, okay, yeah, good. It's a good segue to my next slide. So one of the harder parts of exec is making sure that exec can fail, right? And that exec can fail safely. So if exec fails, what needs to happen? I do need to return a return code, right? So if exec fails here, right, then, and so let me be clear about my answer to Jeremy, exec does not return on success. If exec succeeds, it does not return. If exec fails, it does return. So normally what I would do is I'd probably have some code in here, in the child, to handle the failure of exec, right? So I might try executing, and I might have given it some bogus file that isn't even an ELF format executable or whatever, and it might fail, and then I might need to handle the failure code and exit myself, right, or do something else.
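The "returns only on failure" behavior can be seen directly, without forking, by handing exec a path that does not exist. This is a minimal sketch assuming a POSIX system; `try_exec` is a hypothetical helper name.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: tries to replace the current process with the
 * program at `path`. If this function returns at all, the exec failed,
 * and the return value is the errno that execv reported. */
int try_exec(const char *path)
{
    assert(path != NULL);

    /* argv[0] is conventionally the program name; the list ends with NULL. */
    char *const argv[] = { (char *)path, NULL };

    execv(path, argv);

    /* Still here, so the exec failed. Our memory is intact, which is why
     * we can still read errno and return to the caller. */
    return errno;
}
```

Calling `try_exec` on a nonexistent path returns `ENOENT`, and the calling program carries on as if nothing had happened. In a shell's child process, the usual next step would be to print an error and call `_exit` with a non-zero code.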
So on success, exec does not return, but we need to allow exec to fail, right? And the problem is that exec is making these destructive changes to the address space of the caller, right? It's wiping out all the memory and putting in new stuff, and so what I need to do is make sure that any of the destructive changes that I make are done after I'm positive that the exec will actually succeed, right? So again, I do need to be able to fail, and if I fail, I shouldn't have changed anything, right? If I fail and I come back after the exec call, there shouldn't be big parts of my memory that are missing or have been replaced with something new or whatever, right? So I have to return to the caller, right? And this is not really that difficult, right? Because what am I doing, right? I'm just preparing a new address space for this process, but that process is really just an abstraction, and so I can just set up a completely separate address space somewhere else. I can essentially take a new chunk of memory and set up all the contents and get that all ready, and at the end, the last thing I do is I kind of switch. I take the pointer to the old process, I point it to the new process, but I only do that when I'm completely sure that everything's worked, right? And you guys will get a chance to do this, so I won't describe this in too much more detail, right? All right, so okay, so what's wrong here, right? We have this process model, we've allowed processes to be born, we've allowed them to change, but we haven't talked about how they die, right? And processes, yeah, Jeremy? Yeah, I mean, there's all sorts of reasons that I can fail. I don't have permissions to execute the file, I mean, there's shared libraries that are missing, whatever, I mean, there's all sorts of things, yeah. So processes choose to end their own life by calling exit, right? Exit returns an exit code to the kernel, right? 
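Going back to the build-it-on-the-side-then-switch idea for exec, here is a toy sketch in C of that pattern. Everything in it is a hypothetical stand-in: `struct addrspace`, `struct proc`, `as_build`, `as_destroy`, and `proc_exec` are invented names, a string copy plays the role of loading an executable, and a `NULL` program simulates a load failure. It is not any real kernel's API, just the shape of "do nothing destructive until success is certain".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, heavily simplified stand-ins for kernel structures. */
struct addrspace {
    char  *image; /* pretend this is the loaded program */
    size_t size;
};

struct proc {
    struct addrspace *as; /* the address space the process runs in */
};

/* Build a brand-new address space off to the side.
 * A NULL program simulates "file missing / not executable". */
static struct addrspace *as_build(const char *program)
{
    if (program == NULL)
        return NULL;
    struct addrspace *as = malloc(sizeof *as);
    if (as == NULL)
        return NULL;
    as->size = strlen(program) + 1;
    as->image = malloc(as->size);
    if (as->image == NULL) {
        free(as);
        return NULL;
    }
    memcpy(as->image, program, as->size);
    return as;
}

static void as_destroy(struct addrspace *as)
{
    if (as != NULL) {
        free(as->image);
        free(as);
    }
}

/* Returns 0 on success, -1 on failure. On failure the caller's
 * existing address space has not been touched at all. */
int proc_exec(struct proc *p, const char *program)
{
    assert(p != NULL);

    struct addrspace *fresh = as_build(program);
    if (fresh == NULL)
        return -1;            /* fail early: nothing destroyed yet */

    struct addrspace *old = p->as;
    p->as = fresh;            /* the switch: the only commit point */
    as_destroy(old);          /* destructive step happens last */
    return 0;
}
```

The design choice to notice is the ordering: every step that can fail runs before the pointer swap, so a failed `proc_exec` leaves `p->as` exactly as it was.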
So as we talked about before, when I call exit, I give the kernel some clue as to what happened to me in my life, right? Exit zero is usually considered to be, everything was fine, exit non-zero is usually taken to mean that something interesting or wrong happened to me, right? What happens to this exit code? Who collects the exit code for a process that calls exit? So the process that exits tells the kernel, hey, negative two, or whatever, right? And then the kernel tells who this information? Lovely. The parent process, right? So typically the exiting child's parent is guaranteed to be able to recover the exit code of the process, right? Now this is a little bit interesting, right? Because it's possible that other processes might also be able to retrieve the exit code, but the parent process is the only process to whom the kernel will guarantee, it'll say, no matter what happens, I will not lose this note from your child that says what happened to it. You may not care, you may be going about your life and you may be totally ignoring your child and you may never be interested in its exit code, but I will keep this little teeny-weeny bit of state until you either retrieve it or I know that you really don't care, right? But normally those exit codes are saved until somebody retrieves them, right? Yeah, well, I guess this year I did fit it into my life cycle metaphor with this whole seance type thing, right? So yeah, it's kind of like your dying ancestor who was killed as the result of foul play, right? I mean, that message will be waiting for you at Lily Dale until you either go there or die, right? And maybe if you die it'll be waiting for one of your ancestors, right? Like someone will collect that exit code. All right, so yeah, so wait and exit are frequently sort of lumped together into a single, you know, paired process, right? Because there are two things that happen here, right? 
Wait creates some state, sorry, exit creates some state and wait removes the state, right? Wait is the system call that I use to collect the exit code of a process that has exited, right? So, let's see here, wait has two different semantics, right? Which we'll talk about, but right. But again, until the process calls exit, there are traces of the process that remain on the system, right? So sorry, until it calls exit and its exit code is collected, right? There is some trace that that process was there, right? And if you look on your system sometimes and you run ps and you look around, you'll see that there are these processes that are listed as zombies, right, on the system, right? You guys may have wondered what is a zombie process, right? A zombie process is a process that has exited but has not had its exit code collected, right? So the kernel is holding on to this note about this process that it knew about, but no one has collected that. No one has paid, no one cared, right? No one cares what happened to that stupid process. So I point out here that the PID of the process is retained. The PID of the process that called exit is retained until that exit code is collected. Why is that? Yeah, that's true, but I mean, there's something more fundamental here, right? I mean, when I call wait, I have to tell the kernel which exit code I want, right? Like, if I get rid of the PID, then the process really has no name anymore, right? There's no way for anyone to call wait on it because there's no way to call it anything, right? If I get rid of the PID, it's like, oh, I've got this pile of exit notes lying over here and I didn't put a name on any of them, so which one do you want, right? So we don't do that. Yes, yes, yes, yes, yeah. Okay, so that's, I think that's, oh, where did it go? Okay, I'm getting to that, maybe, but we will talk about that, right? 
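The "exit creates state, wait removes it" pairing can be sketched in a few lines of C. The function name `collect_exit_code` and its out-parameter are made up for illustration; `fork`, `_exit`, and `waitpid` are the standard POSIX calls. Between the child's `_exit` and the parent's first `waitpid`, the child is exactly the zombie described above: gone, except for its PID and its exit code.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that exits with `code`, collect
 * that code with wait, then try to collect it a second time.
 * Returns the collected exit code (or -1 on error) and stores the
 * errno of the second wait attempt in *second_wait_errno. */
int collect_exit_code(int code, int *second_wait_errno)
{
    assert(second_wait_errno != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code); /* child: leave a "note" with the kernel */

    /* First wait: we name the child by PID and get its note. */
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;

    /* Second wait: the state is gone, so that PID no longer
     * names a child of ours. */
    int ignored;
    errno = 0;
    if (waitpid(pid, &ignored, 0) == -1)
        *second_wait_errno = errno;
    else
        *second_wait_errno = 0;

    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Only the low 8 bits of the value passed to exit reach the parent through `WEXITSTATUS`, so a sketch like this should stick to codes between 0 and 255.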
There's a difference between exit or return that you guys are used to and the exit system call that gets made. What actually happens is that system call gets made by the C library, right? So when you return, there's a little bit of assembly code that gets run after you return that takes that return value, puts it in the right place, and makes the exit system call. Yeah, yeah. So what you're saying is like, if nobody collects, then maybe init collects it? Yeah, so, okay, you guys are good with segues today. So if a process's parent exits before it does, right, then the parent doesn't have a, sorry, the child doesn't have a parent anymore, right? So the question is, who serves as the parent for this child, right? And you could assign it to its grandparent or something in some social services type way, but what we do is we just assign it to init, right? And on well-functioning systems, init just sits around collecting exit codes, right? Because the idea is if something gets assigned to init, no one cares, right? The parent process is dead and gone and no one is there to care about this process, and so init just sits there, and anytime init realizes that it's been assigned a child, it just collects the exit code and discards it, right, so that's why you don't see zombies normally on your system, right? And there are probably some interesting cases where you do see zombies, but normally zombies are reaped by init fairly quickly, right, because that's all init does, it just sits there running and collecting exit codes, right? And yeah, so when the parent receives this, it gets, so the other thing is, you know, how does a parent know that its child has died, right? And what happens is that there's a signal that gets sent to the parent called SIGCHLD that says your child's exit code is ready to collect, right? Parents don't have to do anything about this, they can choose to ignore that signal, but this is what gives them an opportunity, right? 
And on some systems, the process can choose to have its children, wait, oh man, sorry, ah, okay. So how does this work? Yeah, okay, sorry. So there are some systems where when the process exits, it can cause its children to be automatically reaped or to have them called, what is it, what is it? Yeah, there's another signal I'm missing here, right? So some of you guys have had the experience where you've launched a GUI application from a shell, and then you've quit the shell and the GUI application stops running. How many people have had that happen before, right? So you may have wondered how that happens, right? The way that happens is that there's a signal that's sent to the child when its parent exits, and I don't think it's SIGCHLD, it's something else, and some processes choose to handle that signal by exiting, right? Some of them choose to ignore it, right? So you can launch certain GUI applications that will ignore the shell when it exits. But the shell is the parent of that GUI application, and so when the shell exits, it sends a signal to that process, and certain processes choose to handle that signal by exiting themselves, right? And you've probably found that to be frustrating, right? And shells and other tools usually give you a way to kind of remove this irritating relationship between parents and children, right? So on bash, the relevant command is called disown. So if you're running something in a bash shell and you don't want it to exit when bash does, you can call disown and you can tell bash which process, and bash will not send a signal to that process when it exits, yeah. What's this? You mentioned that when we execute an exec statement, the parent exits. When you call exec, no, there's no parent and child in exec. When you call exec, you replace your own address space, right? Frequently what happens when you run a program from the shell is that the shell creates a child and then that child immediately calls exec. 
But exec replaces, this is a good question, it's important, exec replaces the current calling process. It doesn't do anything with the process's children. It says, I want to be different, right? And here's a description of how I want to be, right? All right. So like I said before, there are two types, I think I'm low on time, right? So I'm gonna get through this and we'll be done. There are two types of wait, right? So there is a blocking wait call and there's a non-blocking wait call, right? The blocking wait call says, I want the kernel to stop me from running until my child exits. And then I want you to return the return code of my child. So the blocking wait will sit there and the process will not run until the child exits, right? There's also a non-blocking wait which says, don't block me but just tell me whether or not my child has exited. So I'm kind of checking on the status of my child, right? How many people have used a Unix shell? Then you've experienced this before, right? So what does the blocking wait correspond to? Let's say you're running, I don't know, something that, you need something that spins, right? But what's that? Yes, that just, that just spins? Okay, okay, that's fair. What's that? Yeah, well, I mean, how many people have ever backgrounded something, right? So you can run something from the shell, you can type a command and if you type a command, the shell will do a blocking wait on your child. So until that child exits, the shell will not show up again, right? It won't redraw the prompt, it's done a blocking wait and will not return until the child exits. If you give it that ampersand symbol, it'll run in the background, right? So when you start processes in the background in the shell, when do you think it checks on their status? If you guys have ever run anything in the background and it's exited, and you can tell that it's exited, but the shell hasn't told you anything about it, why not? 
Yeah, so what will happen is the process will exit, and if you go back to your shell and hit return, it'll show you that the process has exited and it will show you the return code of the process, right? And the reason is that it checks when it redraws the prompt, right? If you just hit return, it'll go through the list of children that it knows about and it'll call this non-blocking wait on all of them, right? And if one of them has exited, it'll show you that, right? I wish I had an example of this. Maybe I'll come up with one for next class, right? But again, this is behavior that, if you've used a shell, should be very natural to you, okay? All right, so this is our simple shell. We'll start with this next time and we'll clean up a little bit of wait. If you have questions, come talk to me outside and I will see you on Wednesday.
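As a rough illustration of the two flavors of wait described in this lecture, here is a C sketch that polls a still-running child without blocking and then falls back to a blocking wait. The function name `poll_then_wait`, the exit code 9, and the pipe used to keep the child alive until the parent is ready are all assumptions made for this example. With POSIX `waitpid`, the non-blocking flavor is selected with the `WNOHANG` flag, and a return value of 0 means "your child has not exited yet".

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start a child that stays alive until we let it
 * go, poll it once without blocking, then do a blocking wait.
 * Stores the result of the non-blocking poll in *first_poll and
 * returns the child's exit code (or -1 on error). */
int poll_then_wait(int *first_poll)
{
    assert(first_poll != NULL);

    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: block until the parent closes its end of the pipe. */
        char c;
        close(fds[1]);
        ssize_t n = read(fds[0], &c, 1); /* returns 0 at end-of-file */
        _exit(n == 0 ? 9 : 1);
    }

    close(fds[0]);

    /* Non-blocking wait: like a shell checking background jobs
     * before it redraws the prompt. The child is still running. */
    int status;
    *first_poll = (int)waitpid(pid, &status, WNOHANG);

    /* Release the child, then do a blocking wait: like a shell
     * running a foreground command. */
    close(fds[1]);
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A shell built this way would use the blocking call for foreground commands and loop over its background children with the `WNOHANG` call each time it is about to print a prompt.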