So apologies for being a bit discombobulated today. OK, so you're good? OK. So what we're going to do today, welcome back. Thank you guys. Oh, this is the group that's taking the class. Today's the drop deadline. So today what we're going to try to accomplish is we're going to finish some of the examples we started last time, using simple Linux utilities to sort of poke around and find out some details about running processes. Do you have a question? Yeah, can I get to that in a sec? Sweet. And then we're going to start talking about the process life cycle. So where do processes come from? How do processes change? And where do processes go? Question, yeah. OK, great question for Piazza. Question about script. Can you post it on Piazza? Sweet, thanks.

All right, so yes, today is the drop deadline. Recitations will start this week. I think the first recitation is when, Wednesday? Is that true? Tomorrow, Tuesday, all right. And then we have the glorious Thursday 8 AM recitation. Friday, even better, Friday 8 AM, yeah, fantastic. So if you're still looking for a partner, how many people don't have a partner for the class? All right, so yeah, I know. How many people are not a ninja and don't have a partner for the class? OK, a couple of you guys, please come by. If you want to look for a partner in person, come by Davis Hall today for office hours after class, which start at 3:30. And we will try to help you find someone, right? Or talk to you about other options, OK? It's crunch time. If you guys don't have a partner, haven't gotten started, then we need to get you hooked up as quickly as possible. So come to office hours today, and we will help.

All right, so just let me make a, you know, now that we've maybe potentially scared away a few people, let me talk about grading, right? I didn't want to talk about this earlier because I was afraid I would scare away more people. Just kidding. Actually, so my goal in the class is for everyone to do well.
I don't have a quota as far as how many A grades I give out. There's not a limit. It's not a limited resource. I can actually give out as many A grades as I want, right? And I want to give you guys all an A. That would be fantastic, right? If that ever happens, I will email the faculty list and brag about it, right? And maybe I will get in trouble, but that's OK. That's really what I want to do. I want you guys to all get to the point where I feel like I can give you an A grade. In this class, that grade means something to me. It means something to people who know me. It means something to people who hire our graduates, graduates of this class. So it's not a meaningless grade. It's not something I can do just by waving my hands. But I don't have any rules about how we grade the course. If you guys do well, if everybody does well, you'll all get a good grade. It's that simple. And we're going to do as much as we can to try to help you with that. If you have any questions, particularly about grading, use the forums. Or email us if you have technical or administrative questions about how grades add up. Most of that stuff is pretty well covered on the syllabus. But again, this is not a cutthroat course. I want you guys to do well. We're going to give you as many resources as we can to make sure that happens. And the rest is up to you. The way I feel good about asking you guys to do this much work is by putting resources at your disposal to help you. So please step forward and do the rest.

The what? We've made it through three classes before someone asked about the lecture slides. So yes, the lecture slides will be posted on the official course website. If you go back and look through the videos, I've said this every year. Before, they were up on the old website, and that old website is down. Finally, I got tired of paying GoDaddy for it. So that is down. And the new slides will be up in the next week or two.
We'll certainly have them up in plenty of time for the midterm. Does that sound reasonable? They're so close. I just have a little more hacking to do to get them up there in a nice way that I feel good about. OK.

So last time, what did we do? We started talking about the process abstraction. And if you guys are feeling a little bit confused, I understand that. There's a certain degree of circularity to some of these concepts, especially when we're getting going. So just bear with us. Because processes in particular depend on these other abstractions that we haven't talked about. But I think processes are the most natural starting point when we start looking at what the operating system looks like from the perspective of user programs that rely on the services and interfaces that it provides. All right. Any questions about what we covered on Friday? I know it was a long time ago. It was a weekend. There was a Super Bowl. I know some of you guys were stunned by Tom Brady's continued excellence and amazingness. How many people were rooting for the Pats? Oh, yeah. OK. That includes me. Yeah, all right. How many people were rooting for the Seahawks? Ha ha ha. Two years ago, when the Pats played the Giants, I was very upset about that. But I did come to class the next day and I wore a Giants jersey and some sort of terrible blinking Giants hat. You can find the videos on YouTube if you want to see that. So anyway, it's been a long time coming, but I feel good, finally. They won.

All right. So let's go back to using pmap. So pmap was this nice utility that was designed to allow us to inspect the memory mappings of a process. This is sort of where we left off. It's a little bit of a rewind of where we left off on Friday. So we were using pmap to look at Bash. Fairly simple process. We found out that Bash had some executable code here. It had a little bit of space that was reserved for read-only variables.
And then there's a little bit of space in this section that was reserved for variables that we can write. We can change them at runtime, but they have values that are stored in the executable. So these were variables that were loaded and initialized from the executable when Bash started running, but we're allowed to change them. That's why this is marked read-write. Now what are these sections here? Anybody want to guess? Yeah, I think there is some heap in here. So I suspect that this part right here might be the process heap. And we will talk, how many people know what the heap is? Yeah, so the heap is an area for dynamically allocated memory. That's memory a process needs at runtime that it didn't know it needed when it started. If it knew it needed it when it started, and it was going to use that memory the entire time, it would have put it up here. It would have placed it up. It's going to be one of those days. OK, it would have put it in one of the other sections. So this is probably the heap.

And then what's going on down here? This is a little bit potentially easier to pick out. Yeah. Yeah, so these are shared libraries that are used by Bash. And at least at this point in time, what's the shared library that it's using? The standard C library. Libc is probably used and linked by, I don't want to guess a number, but I would say upwards of 95% of executables on most modern Unix-like systems. They link against the C library. Just incredibly ubiquitous. It might be closer to 99.

Finally, what's going on here? Another little section, 132K. Reading comprehension test? Stack, right? What's on the stack? How do you get a variable onto a stack? So this is actually a great question that's very pertinent for people that are new to C programming. There are three ways to allocate memory in C. Three main ways maybe. There are probably some other funky ways that no one cares about. Three main ways. What are they? malloc? Don't do all of them, just one at a time.
Let them answer some questions too. malloc, right? If you allocate memory through malloc, it ends up in here. This is dynamically allocated memory that's managed by malloc. malloc is a library. malloc is probably, in this case, loaded as part of the C library. It's a set of routines for managing that memory. Now, when we get back to memory management in a month or so, we'll find out where the program gets the memory in the first place. But where do you think the program gets the memory from? Who's in charge of multiplexing system resources? The operating system. So while malloc is a user library, it relies on the ability of the operating system to manage memory and provide memory to user programs as needed, OK? So that's one way.

What's another way you can allocate memory in C? What's that? OK, there's two types of variables, right? There are global and local variables, right? What's different about global and local variables? Scope, what else? So scope is the technically right answer. And it's also where they reside in memory, right? But from a programmer's perspective, if I change the value of a global variable, where is that value visible? Everywhere. And how long is that value visible? Till I exit, right? So these global variables are the type of stuff that ends up in here, right? Now, not every global variable is initialized by the executable. Some of them are just initialized to have a default value of 0 or essentially nothing, right? Those might end up in this section, right? I'm not exactly sure specifically where those values would end up, right? But when the program is loaded, it can request that the OS give it some memory that is just initialized to hold 0s. And that memory might be where some of those global variables live if they don't have a starting value.
If I initialize the value to something, then the program has to tell the operating system when it loads up what that value is, and those values end up probably in here, if they're modified, if they're not just constants, OK?

The third way, local variables, OK? So if I set the value of a local variable, where is it visible? Only within the context of the code that's executing, right? How long does it hold that variable? How long does it hold that value? Sorry. How long is that value safe to access? Let's think about it that way. Until the function returns, right? The function in which it's allocated, and the scope in which it's allocated, right? Until the function returns, I can access that variable. Once the function returns, that variable is gone, right? Technically, it's still there somewhere. But as a C programmer, you shouldn't try to use it, OK? Those variables end up down here on the stack, OK? So it's a little bit more programming discussion than I like to get into in a lecture, but it fits in with pmap, OK?

So now we're starting to build our mental picture of what Bash looks like, right? So Bash has a single thread. Remember, we decided there was really no reason for Bash to be multi-threaded. Maybe you guys can write like a super awesome shell, Bash multi or something, that, I don't know, like draws cats while you're trying to figure out what to type next. Draws cats in ASCII, which would be awesome. All right, so there's some code that Bash has loaded, right? We saw the data. It's got a heap and a stack, right? This essentially lines up with, this is the code, this is the data area, this is the stack. Oh, sorry, this is the heap. And then we're ignoring shared libraries because I just got tired of drawing boxes on this line. OK, so now let's look at open files, right? So how many people have ever heard of lsof?
lsof is, yeah, it's kind of, of all these commands, I feel like pmap is probably the one that most people don't know about, and lsof is sometimes useful, right? Why do you use lsof? Like, what's the use case for lsof? People that have used it, why did you use this command? Are you just poking around, trying to find stuff out, yeah? Yeah, yeah. Why can't I eject that stupid USB thumb drive, right? And your computer's like, well, someone's got a file open. So this is where lsof could be used, OK? So I've asked lsof to tell me what files Bash has open, OK? I gave it a process ID. And what are these? /dev/pts/0. The same file. There's the same file. Will someone make a guess about what that file actually is based on its name? Yeah? Some sort of device, right? It starts with dev. There's three of them. That's your next big hint. Yeah. stdin, stderr, and stdout. What are those? Standard input, standard output, standard error, right? These are three file handles that the C library opens when you start up a program. And in the case of Bash, what is this device that they're pointing at? It's a console, right? It's a console device that accepts and prints off characters, right? So that's what's waiting for you to type something, right? I have no idea what PTS stands for. It probably stands for something really cool, though. So someone wants to look it up, do you know? What's that? Why are there three of them? I have one open for each file handle, right? So what this means is the shell is printing its output to standard output. It's receiving input, sorry, it's printing its output to the terminal. It's receiving input from the terminal. And it's also writing error messages to the terminal, right? So all of the input and output is going to the same device, in this case, the terminal, right? If I set up a process in a different way, for example, if I use pipes to move data from one process to another, then these file handles don't point to the same things.
And we'll go through an example of how to do that later. What's this last file right here? .bashrc, yeah. So this is the bash configuration file, right? This is what bash reads at startup in order to determine how you want to print the shell prompt and other stuff like that, right? OK, now I just want to be truthful, because I want you guys, you and me, to be frank with each other, right? So this is not technically true, right? I doctored this output, and the main reason I did it is because I was embarrassed that there were no other files open than standard in, standard out, and standard error. However, you can imagine that bash would have had this file open if I had caught it right at the right moment during initialization when it needed to have this file. So anyway, that's my mea culpa for the day. I have made this example slightly more interesting through the powers of making things look like they were actually printed by the terminal. Yeah? No, just the last one. No, no, no, I didn't make it all up. Come on, I wouldn't have known how to format the output if I didn't have an example, right? No, I just added the last line there, right? OK. All right, so again, let's just imagine that that's when we caught it, OK?

So now here's our mental model of bash, right? And this is pretty much done, OK? It has the three resources that we've talked about in the class. It has an abstraction for using the CPU, the thread. It has a set of memory mappings that allows it to use memory. And it has a file that allows it to store data on the disk, OK? Any questions at this point? I'm not going to do this aside. You guys can look at it online. OK. All right, so briefly, let's go. So we just talked about these abstractions. Threads save processor state. Address spaces are used to allow processes to use memory. Files abstract away the disk. Any questions about this before we go on? Yeah, you've got to speak up a little bit.
Good, because I didn't tell you what processor state is, but we will get back to that, right? So yes, that will be the next unit we will do. We will talk about what state I need to save in order to abstract the process, right? You guys can start imagining it if you want. And as you start working on assignment one, you're actually going to find out a little bit about what that is, the specific example being inside OS/161, all right?

And Unix also has this idea of file-like objects, right? So a file-like object is an object that obeys file system semantics, but it's not actually technically a file, right? There are no blocks on disk corresponding to this, right? So can anyone give me an example of a whole file system that Linux and Unix systems create that's completely fake? There are file systems that live among you that are not actually file systems, right? You think they're file systems. When you look at them, they look like a file system, but inside, they're not actual file systems. They have no data on disk. What's the example of one? What's that? Say that one more time as loud as possible. Peripheral I/O. Yeah, so OK, so that's a good, devices are an example of this, right? But I'm thinking of an entire file system that's not. Yeah. Not a file system? proc. How many people have ever poked around in the proc file system before? How many people have ever noticed the proc file system before? It's there. /proc. Go poke around in it on your virtual machine. So if you run a mount command, you'll see /proc there. /proc is not an actual file system. /proc is used by Unix-like systems to export information about the processes that are running on the machine, but there are no contents on disk corresponding to proc. The operating system just makes up proc out of thin air. So all the entries in proc, totally illusory. They're there for a reason. They're there to support tools like top.
So for example, top. If you look at the source code for top, top reads things from the proc file system, formats them nicely, and then shows you the output. That's how top gets information about processes that are running on the system. And a lot of standard Unix utilities do the same thing. So proc is sort of the interface between the kernel and user space tools for understanding things that are going on on the machine. But proc looks like a file system. It smells like a file system, but there are no data blocks corresponding to proc. All right, a little aside.

OK, so let's go back to Monday. What's the point of abstractions? What are abstractions designed to do? There were a couple of things. I'm going to start using my little magic thing again. So how's deep? Yeah, so that's a good broad goal. One of the ways they do this is hiding undesirable properties. Things that programmers don't want to know about. And once we start talking about the CPU, memory, and disk, we'll get into a lot of these. What else? Another example. Let's go with Josh. Joshua. Josh here. The problem, he could be sitting here completely silent. Pass? All right. Oh, he's looking in his notes, though. Ananda. Be here. You don't know? Something else. Hiding undesirable properties? Alexander. Yeah. OK. So OK, I would argue that these are good mergers of some of our answers. Adding new capabilities is a way of simplifying what you want to do. Because if it didn't have those abilities, you'd have to implement them yourself. And it can be better for the OS to do that. Last but not least, organizing information. So getting all the information about a particular part of the system and making it available in one place. All right. So processes were this very high level of abstraction. They were the one we were going to talk about that doesn't map down to a part of the system. So what's the point of a process? Who can solve the puzzle? Simon. Yeah, you don't know.
Oh man, Vanna White is so disappointed right now. She's like, all right, who wants to try? Daniel? Yeah, don't know. OK. Anyone want to take a stab at the first part? Yeah. Organize information. Close enough. Close enough. We're not too strict on this version of Wheel of Fortune. And they represent one single thing that what? Process. How do you guys experience a process? Yeah. Yeah, one thing the computer is doing, in your definition of doing. Processes contain threads, address spaces, files, and other things that we won't talk about. Network sockets, capabilities, permissions, all sorts of other weird stuff. But these are the main things we're going to focus on. All right, got it. Thank you. I'm just going to skip through this. This is what we talked about on Friday.

What is the contract that the operating system sets up with processes? Again, we'll talk about this in one of the next couple of lectures. The operating system is this special program. It's just a program, but it's been given these special powers. And to some degree, it rules the rest of the system. It makes decisions that affect other processes. Part of doing that is trying to make the system run as fast as possible. But what else does the operating system guarantee to processes as part of allowing it to rule, to be the benevolent dictator of the system? You want to answer this question, I think. It protects them from each other. What I do inside of my own process is my own business, but I should not be able to affect the operation of other processes. All right, any other questions before we plunge on? Stunned people into submission.

OK, so there's one thing we need to do first, before we talk about the process-related system calls, which is what we're going to do for the next couple of lectures. So fork, exec, wait, and exit.
Together, these allow me to create processes, allow me to alter what they're doing, set up some very simple ways that different types of processes can communicate with each other, and allow me to get rid of my process when I'm done with it. But as we go forward, we need to be a little bit more precise about how processes relate to files. So here's our old process model. We had threads, address spaces, and files. And here's the small change I'm going to make. Did you see that? Just blur your eyes a little bit. So it turns out that there's actually a layer of indirection between a process and the underlying files. There's an extra level of indirection, which is what we call it when we talk about building computer systems. So file handles. So it turns out there's actually three parts of this. There's the file itself. There's a file handle. And the only piece of process-private data is something called the file table. The file table contains references to file handles.

How many people have ever looked? So how many people have used open before? Like a raw open? Just called open in C or C++? What does open return? What's that? I know that's what we call it. It returns a file descriptor. What is that? Have you ever just tried printing it? What do you think? I think it's a tuple, unless you might be using it in some sort of wrapper or something. It's an int. It's an int. It's a number. Remember, computers are good with numbers, bad with names. You gave the computer a name, you called open, and it gave you back a number. So it turns out that number is a number for a reason. It's actually not just a number, it's an index. It's an index into this data structure called the file table. So when you call read or write, you pass in what you think of as a file descriptor. But all you're passing in to those calls is an int. And the operating system is using that int to figure out what file you are trying to do something to.
Has anyone ever got to the point where open fails because they've run out of file descriptors? It's never happened to me either, but it would be kind of cool. Actually, it would probably be a signal that you're doing something really dumb. So the reason that happens is because this array fills up. This array has some number of entries in it. Most processes are nowhere close to using up all of these entries, but there is a limit. And so at some point, if you call open again and you've got 6,000, 65,000, whatever, two to the 16 or something file handles already open, the kernel will be like, no, you're done. You need to close some of those millions of files you have opened first before I'm going to allow you to open another one. But the reason is that this array fills up, literally.

So the file descriptor is an int. And it's an index into the process file table. So that points to a file handle object that's maintained by the kernel. And the file handle object contains a reference to a file object, which is also maintained by the kernel. And that's actually used by the file system to figure out what disk blocks, or whatever else if you're talking about something that's not quite a file, I need to work with, OK? Yeah, there we go. Why? Why am I doing this? Yeah, it's like this cat is so angry. This is just to make this difficult, just so I can burn up 20 minutes of lecture today. So why do we have these three levels of indirection? It's a great question. We're going to talk about it today. The reason is because this allows us to share certain file state between processes and their children. And specifically, it allows us to establish certain patterns of communication between certain processes and their children. And so this is a really powerful design principle.
If you have a piece of state that you want to have different properties, or you want to share differently than other pieces of state, in order to allow that to happen, you need to break that state out of whatever larger object it's part of. So in this case, we have some state that we want to allow certain processes to share. In order to do that, we needed to take these two objects that we had before and take all the information in there and carve it up now into three. And I'll show you exactly how these work. In particular, I wish I had my, where's my thing? So this diagram is as correct as I can make it. So the file table and file descriptors, these ints, are private to each process. A process cannot modify another process's file table. And processes do not share file descriptors. Again, the descriptors just index into the table. File handles can be shared between processes under certain conditions, which we'll talk about. Specifically, after a process creates a child process, there is potential to share file descriptors. File handles. OK, I need to be more precise. The underlying file objects could potentially be shared among a bunch of different processes, but that's done in a way that the processes aren't aware of. So you're not aware that there might be other file objects that are opened by other processes, but with the file handles, you will be aware, because there's pieces of state that will change. We'll talk about that in a minute. Bug on slide. So I want to add, and you guys will want to keep this list handy. Can a process share a file descriptor with a child process? So in theory, a process can tell the child process file descriptor 4. So I can share it. It's just an int. So if I have some other way of communicating with the child process, like I'm using some other IPC, I could tell it 4. If it tries to use that file descriptor the same way I would, it's not clear that the right thing is going to happen.
So for example, if I open foo and my child opens foo, it's possible that the file descriptors we get back are totally different. So if I told the child, hey, write to the file with file descriptor 4, unless we've done something else to make sure that that's the same file, who knows. So yes, in theory you can, but the file descriptor loses meaning once you take it out of a process. Does that make sense? Cool.

All right, so what was our first big operating system design principle? You guys can all say it together. I know you've been thinking about it as you fall asleep. Separate from mechanism. Separate policy from mechanism. A second one is to facilitate control or sharing by adding a level of indirection. And this is something that we're going to see for the first time today, but we'll see again and again as we talk about other operating system abstractions and in other places. So these ideas are referred to as design principles because they come up over and over again, even in different subsystems.

OK, so I'm going to get back to talking about file handles, but the first thing we have to talk about is where do processes come from? Where do processes come from? It's like the birds and the bees question. The birds and the bees question for operating systems. Where do processes come from? Yeah. What's that? Other processes. I like that. Two processes get together when they decide that they like each other very much. No, well, it's not two processes actually. In this case, Greek men used to fantasize about being able to spontaneously reproduce without involving women. Maybe they designed modern operating systems, because processes can create new processes without having to find a compatible process and decide that they want to go into this adventure together. So processes come from other processes, but just one.
There's a system call called fork that allows a process to request, and this is now very Greek mythology oriented, that the operating system create not just a new process, but a complete copy of me. I want a complete copy of myself. Right at this exact moment, I just want another me to appear. And actually, I don't know, at least at my age, a lot of you might fantasize about this sort of thing a lot. We usually call it a body double rather than other process, but this is someone who could return things to the store for me, so I don't have to do that, stand in lines. Anyway, so fork. Fork allows a process to say, please make a copy of me, and the new process is supposed to be an identical copy of the caller. After fork, this terminology is important, we refer to the process that called fork as the parent, and the process that was created as the child. But keep in mind that, except for one tiny little difference, they start off completely identical, at least with traditional fork semantics. We talked a little bit about how that's changed recently. So in recent versions of Unix, there's lots of flavors of fork. But what I'm going to talk to you guys about is sort of the canonical version of fork, the original version of fork. So now you have these tools that allow you a lot more control over what actually gets copied. But the original version of Unix had one version of this that created a complete copy of the parent.

Now the only place where this becomes a problem is threads. And again, this goes back to early systems that may not have had any support for user-level multithreading. So it's possible that from the perspective of the operating system, there weren't any programs that had more than one thread, and so this wasn't a problem. Now the problem with fork and threads. So here's the problem. If I have a single thread in the process, then there's only one thing that the process is doing at any given point in time.
So when the process calls fork, what's the one thing that the process is doing at that point in time? It's calling fork. So if I have a single-threaded process, and it calls fork, I know that that one thread is currently calling fork, right? And so I know that nothing else is happening, right? Nothing else is changing the process. There aren't other threads that are running around in the background doing random stuff, right? The whole process has stopped because it's called fork, yeah. Yeah, we'll get to that. Yeah, you're getting to my end of lecture slide, right? The problem with multi-threaded fork is that there are all these other threads, right? And those threads may be running, they may be doing stuff, and so the process may be changing as I call fork, right? And that makes the semantics of fork kind of complex, because normally when I call fork, what I ask the operating system to do is I say I want a complete copy of myself. But if there's other stuff going on, then the operating system is like, well, when exactly do you want me to do it? Like when do you want me to copy you, right? If there's only one thread, I want you to copy me at the exact moment that I called fork, right? I've got everything ready, I've created this perfect clone of myself, and that's when I want that person to start existing, right? If there's other things going on, then this becomes more complicated, right? So the way that modern versions of Unix handle this is they normally only copy state for the thread that called fork, right? Okay, so little, I'm not gonna go through this either. We just talked about this.

Okay, so we're only gonna copy one thread, but now let's look at how fork works, right? I copy one thread, the caller, the thread that called fork, okay? I copy the address space, and I need to copy the file table, okay? Now we're gonna see the application this is gonna have, right? So here's my caller process, the process that just called fork, okay?
So I'm gonna copy one thread. Just assume that thread one is the thread that called fork. Thread two and thread three were doing other stuff, they don't get copied, right? I copy the address space, including the stack for that thread, right? Those other threads don't exist in the child, so they don't need a stack. And I copy the file table, right? Now, what does it mean in this case to copy the file table? Well, you can think of the file table as containing pointers to these underlying file handle objects, right? Technically, those pointers are stored in the file table array, and the process uses small integer indexes, the file descriptors, to refer to them, right? So when I say I want to do a read on file four, the kernel looks up four in my table, it finds a pointer to a file handle object, and then it keeps following pointers to find out other information about that object. So what does this look like after I copy the file table? Remember, there are pointers in the file table, and I'm just gonna copy the pointers. So where are the pointers in the child's file table going to point to right after fork, okay? Say it louder. They're gonna point at the parent's file handles, okay? This is why I took that little piece of state and split it off and put it in its own object, because here's when it's shared. It's shared right after fork, okay? So again, fork copies the contents of the file table, and those contents are pointers. What it means is that after fork, the parent and the child can share open files. Now it's possible that the child wants nothing to do with the parent's files. The child wants to live its own life, it wants the parent to leave it alone, and in that case the child is free to simply close all the files in its file table and open any new files it wants to, right? If it opens new files, it'll create new file handle objects that are not shared with the parent, okay? This is important to understand. So it's only right after fork that the sharing takes place, yeah.
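To make the pointer copying concrete, here is a minimal user-space sketch of a file table, not real kernel code. Everything in it is an assumption made up for illustration: the names `file_handle`, `file_table`, `file_table_copy`, `file_table_open`, and `file_table_close`, the table size, and the reference count. It only shows the idea from above: copying the table copies pointers, so both tables end up pointing at the same handle until one side closes it and opens something new.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MAX_FILES 8 /* hypothetical table size, for illustration only */

/* Hypothetical file handle: the piece of state that is shared after fork. */
struct file_handle {
    long offset;  /* current position in the file */
    int refcount; /* how many file table slots point at this handle */
};

/* Hypothetical per-process file table: an array of pointers,
 * indexed by file descriptor. */
struct file_table {
    struct file_handle *slots[MAX_FILES];
};

/* Copy the table by copying the pointers only, so parent and child
 * end up pointing at the same handles. */
void file_table_copy(const struct file_table *parent, struct file_table *child)
{
    assert(parent != NULL && child != NULL);
    for (int fd = 0; fd < MAX_FILES; fd++) {
        child->slots[fd] = parent->slots[fd];
        if (child->slots[fd] != NULL)
            child->slots[fd]->refcount++;
    }
}

/* "Open" a file in the lowest free slot. This allocates a fresh handle
 * that nobody else points at. Returns the descriptor, or -1 on failure. */
int file_table_open(struct file_table *t)
{
    for (int fd = 0; fd < MAX_FILES; fd++) {
        if (t->slots[fd] == NULL) {
            struct file_handle *h = calloc(1, sizeof(*h));
            if (h == NULL)
                return -1;
            h->refcount = 1;
            t->slots[fd] = h;
            return fd;
        }
    }
    return -1;
}

/* Close a descriptor: drop this table's pointer and free the handle
 * once no table points at it any more. */
void file_table_close(struct file_table *t, int fd)
{
    assert(fd >= 0 && fd < MAX_FILES);
    struct file_handle *h = t->slots[fd];
    if (h == NULL)
        return;
    t->slots[fd] = NULL;
    if (--h->refcount == 0)
        free(h);
}
```

In this model, after `file_table_copy` an offset change made through the parent's slot is visible through the child's slot, because there is only one handle. Once the child calls `file_table_close` and then `file_table_open`, it gets a handle of its own and the sharing for that descriptor is over.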
Yeah, yeah, that's not usually how the sharing is done. I'll show you how it's done, right? But yes, right after fork, assuming that the child hasn't modified its file table and neither has the parent, right? At that moment I know that the file descriptors have the same meaning in both processes, right? Okay, so I said there was only one way that the parent and child differed after fork, and that one way is the return code, right? Because otherwise here would be the problem. How would I know whether I was the parent or the child, right? There are times when after fork, I want the parent and child to do different things, right? I want the parent to go off and fork a couple more children, maybe I'm starting up, maybe my web server is booting and it's forking a bunch of new processes to handle incoming connections, and I want the child to run off and actually like do stuff, right? It's like real parents, right? You have children, so you can put them to work, right? Like I'm gonna fork off a bunch of children to handle the incoming requests, and then I'm gonna sit around, you know? In like an idle loop, and just hang out, drink my beers, watch football, and let the children handle the web requests, right? That's what we do in my family. We run a web server by hand. So the traditional semantics are if the return code is zero, then you are the child. So here's the thing that has to blow your mind a little bit about fork. How many people have ever used fork? All right, cool. Fork returns two times. That's so weird, right? Like who does that, right? When you call fork, it returns twice, all right? Once in the parent, once in the child, because assuming that things went according to plan, now there is this other copy of me, right? And again, it's a copy. So it's got all the same code and it's executing at exactly the same place, or it better be. You guys will get a chance to actually implement this for assignment two.
And I'm sure you will have times when your child is not an exact copy of the parent or is not executing at the exact right spot, right? But anyway, so the point is that I hit this fork, and now there are two of me, okay? So you have to think of two separate copies of this code executing, okay? And again, the only thing that's different between the parent and the child is the value of this return code, okay? If the return code is zero, I'm the child. If the return code is non-zero, and I think positive, because I think negative signals an error or something, but anyway, if the return code is non-zero, then I'm the parent. What does fork actually return to the parent? What do we think of fork? What would be a useful thing for fork to return to the parent, yeah? What's that? Yeah, so it turns out fork only reports a status if there's an error. And again, I think it signals that by returning like a negative value, right? But let's say there's not an error, the child gets zero, what would be something useful for the parent to know about the child? Yeah, so that's what it returns to the parent, right? It returns, here's the ID of the child process that I just created. Because the parent may want to make some notes about that, particularly if it needs to communicate with the child later, right? Okay, we just talked about this. And again, everything is identical: the contents of memory are identical, the open files are identical, the point where I'm executing is identical, so you can literally think of it as suddenly there are two copies of the code running together. Oh yeah, absolutely, yeah, yeah. If children couldn't fork, then you'd have a very flat process tree, right? But absolutely, right? I mean, so hold on, I think I... Yeah, but remember, not only do the parent and child have the same files open, but there's some state that's stored in the file handle that is now shared. And in particular, that state is the offset into the file, right?
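The three return cases just described can be sketched like this. This is a minimal example, not the code from the slide: the helper name `run_child_and_wait` is made up for illustration, and it uses `waitpid`, which hasn't been covered yet, only so that the parent has something concrete to do with the child's process ID.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, have the child exit immediately with `code`,
 * and return the exit status the parent collects, or -1 on any failure. */
int run_child_and_wait(int code)
{
    assert(code >= 0 && code <= 255); /* exit statuses are 8 bits wide */

    pid_t pid = fork();
    if (pid < 0) {
        /* fork failed: it returned only once and there is no child */
        return -1;
    }
    if (pid == 0) {
        /* return value 0: this copy is the child */
        _exit(code);
    }

    /* positive return value: this copy is the parent, and pid is the
     * ID of the child that was just created */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    if (!WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

Calling `run_child_and_wait(7)` should give back 7 in the parent. The child never returns from the helper at all, because it exits inside its own branch.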
So if I move the offset pointer, the child will see that, at least until the child decides to ditch me, go off on its own, you know, live its own life, close those file handles, yeah. So if fork fails, then it only returns once, right? And in the case when fork fails, I think it returns a negative value to indicate failure, right? To do this properly, this code should handle failure. It doesn't, you know, so don't write this code, yeah. Yes, absolutely, right? Because remember, right at this moment, there are now two separate identical processes that are executing the same code, right? But because of how I've set up this if-else block, one of them is gonna go down one branch and the other is gonna go down the other. And this is what you see all the time, because again, frequently I want the parent to do something, right? And I want the child to go off and do something else, right? If fork failed, I would only hit one of these, right? In this case, I would hit the bottom block, but I wouldn't be a parent, I wouldn't have a child. It's very sad, something would have gone wrong, yeah. So if they both write to the same file? Yeah, chaos, right, without more coordination, right? Yeah. Sorry, no, the child and the parent are identical, right? They're executing at the exact same point. Sorry, the exception is that the only piece of state that's not identical between the two of them is the return value of fork, right? That's the only way you can tell the two apart, right? Yeah, sorry, it's a little unclear. These are great questions, yeah. Yeah, the offsets will be the same, and we're gonna come back to this, I guess it's gonna be Wednesday when we continue talking about fork and pipes, right? We'll talk about how that can be useful, right? In certain cases. Yeah, any more questions? Good questions, okay.
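The shared offset is easy to observe from user space. The sketch below is an illustration under a few assumptions: the helper name `offset_seen_by_parent_after_child_seeks` is made up, the scratch file lives under `/tmp`, and the parent waits for the child so that the ordering is predictable. The child moves the offset on the inherited descriptor, and the parent then asks where its own descriptor is.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: open a scratch file, fork, let the child seek to
 * `nbytes`, and return the offset the parent sees afterwards (-1 on error). */
long offset_seen_by_parent_after_child_seeks(long nbytes)
{
    assert(nbytes >= 0);

    char path[] = "/tmp/fork_offset_XXXXXX"; /* assumed scratch location */
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* the file goes away once every descriptor is closed */

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* child: same descriptor number, same underlying file handle */
        off_t moved = lseek(fd, (off_t)nbytes, SEEK_SET);
        _exit(moved == (off_t)-1 ? 1 : 0);
    }

    /* parent: wait for the child, then ask where our own offset is */
    long result = -1;
    int status = 0;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0) {
        result = (long)lseek(fd, 0, SEEK_CUR);
    }
    close(fd);
    return result;
}
```

The parent never seeks anywhere itself, so a result equal to `nbytes` is the child's move showing up through the shared file handle.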
Yeah, so we'll come back to this, but someone asked, and so I'll try to get to this just for fun, just to finish up class today. So someone asked, what if I wrote a loop where I just executed fork? What would happen? And I can tell you, so what does this code do? Yeah, this creates an exponentially increasing number of processes, right? This is a program that you can get in trouble for running. I mean, don't try it, right? You might be able to, right? But it may be one of the shortest programs you can run on a system, like a production system like Timberlake, and someone would be mad, like very mad, right? Because, so if you want like the Matrix equivalent of this, this is this, right? This is like, more. And I've actually done this by accident. Not on purpose, I did not write this piece of code, but I've effectively done that by fat-fingering a portion of a script I was writing, and it's not pretty, right? Especially if you run it as sudo. It's just, you know, like you have a minute or two to say goodbye to your machine, and then you gotta pull the plug, right? Because the whole thing melts down, right? But yeah, this creates bad stuff. So please don't, please don't compile this and run this on Timberlake. If you do, I absolve myself of all responsibility, all right? So on Wednesday we'll talk about pipes and how I can use fork and pipes to set up chains of communicating processes, all right? If you need a partner, come talk to us at 3.30. Good luck getting started with assignment zero and assignment one. You have less than two weeks.
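For anyone who wants to see the doubling without melting a machine, here is a deliberately bounded sketch. It is not the code from the slide: the helper name `count_processes_after_forks` and the cap `MAX_FORKS` are made up for illustration, and it uses a pipe, which comes up on Wednesday, only as a way to count. Each process that exists after the loop writes one byte into the pipe, and the original process counts the bytes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_FORKS 4 /* hard cap: at most 2^4 = 16 processes in total */

/* Hypothetical helper: call fork in a loop n times and return how many
 * processes existed after the loop, counting the original (-1 on error). */
int count_processes_after_forks(int n)
{
    if (n < 0 || n > MAX_FORKS)
        return -1;

    int fds[2];
    if (pipe(fds) < 0)
        return -1;
    pid_t original = getpid();

    /* Every process that reaches an iteration forks, so the number of
     * processes doubles on each pass through the loop. */
    for (int i = 0; i < n; i++) {
        if (fork() < 0)
            break; /* fork failed: stop forking in this process */
    }

    /* Each process announces itself with exactly one byte. */
    char mark = 'x';
    ssize_t written = write(fds[1], &mark, 1);
    close(fds[1]);

    if (getpid() != original) {
        /* descendants: reap our own children, then exit */
        close(fds[0]);
        while (wait(NULL) > 0) {
        }
        _exit(written == 1 ? 0 : 1);
    }

    /* original: end-of-file arrives once every write end is closed */
    int count = 0;
    char buf;
    while (read(fds[0], &buf, 1) == 1)
        count++;
    close(fds[0]);
    while (wait(NULL) > 0) {
    }
    assert(count <= (1 << n)); /* sanity check: never more than 2^n */
    return count;
}
```

With `n` set to 3 the count comes out as 8 when every fork succeeds: 1, then 2, then 4, then 8. The unbounded version on the slide has no cap and no exit, which is why it keeps doubling until the machine gives up.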