 I said, everybody, stand up. Stand up. The actual standing kind of stand, like when you're cold outside. Maybe we should have class. So I actually stand when I work. Maybe we should have class like this. This would be kind of funny. This is weird. OK, sit down. It would be weird if you guys were standing in the whole class. It'd be kind of fun. All right, so it's cold outside. I hope everybody had a nice, long weekend. Hope the weather didn't ruin it for you. We were here working on class stuff. That was fun. Unfortunately, we're still a little bit behind the eight ball with the class set up and everything. So if you think you're behind, I think Sean talked to me before class and said he was missing lecture notes and stuff like that. We're missing lecture notes, too. That stuff will be online. I promise. That's all I've been working on for a couple of days, so it's coming. It's really, you wouldn't think it would be that hard. And you would think I would have done everything last year, but I decided to fix some things. Some of those things took longer than I wanted to. So today, we're going to finish up the examples from last time, and then we're going to focus on getting through process creation, process life cycle, where do processes come from, how are new processes created, a little bit more on IPC along with some of this, and then how do processes die, how do processes decide what they want to be in life. So various stages in the life cycle of a process. And hopefully by the end of Friday, we're going to essentially show you how to write using Unix system calls just a really simple little shell, very, very basic, very basic, like three line Unix command line shell. This is the link, again, to sign up for the class email list. I think everybody is on this list at this point. How many people have not been getting emails about this class? Yes, OK, great. Uh-oh. There's always one thing I forget to do. Well, everybody has. OK, great. 
And then are the issues with Piazza resolved? There have been people who have been having a hard time getting on Piazza. I added everybody manually, so I think that works. The access code was broken for some reason. I've changed it several times, but Piazza doesn't seem to like it when I change the access code. So we're going to try using Piazza. I'm going to have to remind the TAs and myself to check it. I don't spend my life on the internet reading bulletin boards like Piazza, so I'm going to have to get used to it. But hopefully it's something that will be useful and allow us to aggregate information. So the bar for sending email to the staff list just went up a little bit. So if you send something to the staff list that I think is interesting and suitable for broader distribution, then we're going to ask you to put it on Piazza so other people can see the answer. We don't end up answering the same question multiple times. All right, this is basically what I talked about. So we posted all the recitations in office hours. We have 27 hours of office hours a week scheduled. Those were based on your times. They're scheduled so everybody can make at least one. Some of you guys only indicated two times in the entire week where you could come, so we did our best. And somebody indicated two times, one of which was like Friday night from 8 to 9. Not a time, not a popular time, and not a time that the staff decided they wanted to hold office hours. I don't even know why I put that up as an option. That was probably stupid on my part. But yeah, so we have office hours. We're going to start today I think right after class. We have a shift and then recitation will start today. My goal is still that Aditya can walk you through some of the steps of getting your environment set up in today's recitation, but that requires me finishing a few things after class. So we'll see where we are. 
We talked a little bit about this before, but I just want to reiterate it as you guys get started on the programming assignment. So I don't have any interest in curving this class so that it looks like some of you guys learned and some of you didn't. If everybody finishes the assignments, then you guys are all going to do well. When I was at Harvard, everyone talked about getting a grade. So how many people say, I got a B in the class? Is that how you would describe it? How many people would say that? I got a B in that class. So when my advisor came to Harvard, he talked about making an A. It sounds very weird, something almost strange about, oh, I made an A in the class. But that's how it is. Getting a grade sounds like I have a limited number of As, that I have a quota here. I have to give out a certain number or I can only give out a certain number. I can give out as many As as I want to. I mean, if I gave the whole class an A, then somebody would probably come and ask me if I was actually teaching a class or not. But if I had some good justification, like you guys all completed the assignments and did really well, then that would be fine. So there's no, I like this. I came to like this terminology. I still don't like to say it because I think it sounds weird, but that's really what's happening here. I'm giving you guys a certain amount of choice as far as the grade you get. We're going to give you a lot of feedback, a lot of help, and hope that you guys can make it to that level. But there is no, grades are non-competitive. This is not a thing. You guys should keep that in mind when you're using the resources on the website and stuff like that. This is not a curved, weeder, chem-major, pre-med style class. Our goal is not to limit the number of people who get through the door. Our goal is to try to make everybody do well. And again, please use the Piazza forums for these sorts of questions. And help each other on Piazza. This is part of the point. 
When I took this class, there was a really great community of people in the class who were helping each other and posting questions and answering them. And if you spend some time doing that and spend some time helping somebody with something with a coding question or whatever, then I'm sure that that will sort of come back to you later in the class. And if it doesn't, then you'll just be seen as that real incredible student who is so helpful and stuff like that. And that's not a terrible reputation to have. I think that's something that most of us wouldn't mind. All right, so last time we had talked, we started talking about processes. And now I just want to, at some point, this all starts to seem a little bit abrupt, at least to me. But the problem is that there's some circularity going on here. And so this is our starting point. So we started talking about processes as sort of the basic operating system abstraction. And today, we're going to continue that discussion. And then we're going to talk a little bit about how processes are born to die. But before I start, is there any questions about Friday? I know Friday seems like a long time ago. Any questions about Friday? Anybody remember what we talked about Friday? That's discouraging. All right, so we'll do some review in a minute. But first, I want to finish the example that we are working on when class finished that I think sent you guys all off into a very dull, boring weekend. And then we'll do some review. And then we're going to keep going with some stuff about file handles and the process creation. So if you remember, when we finished on Friday, we were using standard Linux tools to explore our system. So we were looking at information that we can find out from user space about the state that's maintained by the kernel about what other processes are doing. So we used PS to show some information about Bash. And we discovered that Bash is single threaded. So we were building up this abstraction. 
We were starting to fill in some information about Bash based on this nice model that we had. Who wants to bring us back to Friday and tell us what application we used as an example of something that had multiple threads? Yeah? Web servers. Web servers. What's your name? Unchill. Unchill. The web server, right? Yep, so this is Apache. Apache had a bunch of threads. And we found out some information about them when we talked about why Apache might have multiple threads. Why does Apache have multiple threads? I'm going to pick on somebody over here. Someone without their hand up. Why does Apache have multiple threads? To handle multiple users. In this case, users is probably not the right word. But what does the web server handle? Requests. Yeah, there it is. All right, handle multiple requests, all right? So let's finish up. Let's use a couple of other tools to look at other information about the processes on the system, right? So we talked about one of the things that the process had was memory, right? Processes have access to memory, and the operating system is in charge of providing certain guarantees about that memory: what will change that memory and what won't change that memory. We'll come back to that as we go throughout the class. But there's this new tool called pmap that I actually, when I started writing the slides for this class, didn't really know about. But pmap will show you, for a process, what parts of memory it has mapped, right? So this is again for bash, which is small and sort of simple, OK? So I ran pmap. Let's see, I didn't do this cleverly. And here's what I found for bash, right? So who can help start explaining this output? Let's start here, what's your name? What do you think? So what's this? I have my little laser thing. I hate this thing. What are these? That's the PID. Who thinks that's a PID? It's a little big to be a PID. What else might that be? What's your name? Arish. Memory address, right? These are memory addresses. 
This is a 32-bit machine, right? So this is on the low end, right? These are hex addresses. They're eight nibbles, I think they're actually called, wide. And, yeah, I think nibble is four bits, right? And then what are these, right? Are these also look like memory addresses? What's the largest memory address on a 32-bit machine? Yeah, but what would it look like here? It's easier than that, right? What would it look like here in this format? All Fs, what's your name? Shumma. Shumma. All Fs, right? So this is how bash, these are memory addresses, right? And then what's here, right? What are these? What's your name? Do you go by memory? Wembley, it's a great name. What are those? How much difficult is loading to memory? Yeah, that's actually, I think, the size of the region, right? So these regions have different semantics according to the kernel. The kernel's going to kind of handle them or protect them differently, right? So for example, this is 876k of something, right? That's been loaded into bash's address space. What do, how many, if you've used Unix before, what do these look like? They look like permission bits. What's your name? Amit. So these look like, yeah, standard sort of Linux permission bits, right? And what do you think they indicate about that memory region? So this is a good one, for example, right? This is 876k of something that's marked r and x. What do you think that means? So r, w, and x do stand for those things. What's marked here? Yeah, so what does that mean? Yeah, so this is a portion of memory that bash is allowed to read from, right? Which is kind of required to execute, right? It'd be difficult to execute something if you couldn't read it. And it's allowed to execute it. So what do you think is potentially loaded here? This is the code section, right? And then this last thing over here, right? What is this? What does that look like? What's that? Yeah, it's a path, right? It's a path to what? To an executable file, right? 
And what do you, so how do we interpret this all together? So I have a path, I have some permissions, I have size, and then I have this memory address. Who can sort of put together all the pieces for me? That's almost perfect, right? So what this means is that there is 876K of code that was loaded from this file, right? You'll see that this file is indicated several times, right? There's a few other regions that were also loaded from this file. There's 876K of code that was loaded from this file at this memory address and marked readable and executable, right? So again, this is probably the bulk of the code that's being used by bash, right? What about these other two areas that bash has loaded, right? What do we think those are? Yeah, yeah, so there's a segment that's marked just R, right? So that's a static segment. Okay, so what do we mean by that? What do you think of it? Static, but what's in there? It was loaded from bin bash, right? We know that, that's where it came from. What do programs have that they would load? What's that? Okay, global variables, that's good, but global variables that what? Yeah, constants, constants, right? This is important, not configuration things, because configuration things I might have to load from another file, those could change, right? Constants related to its operation, right? And then what about this area? I should not even sure what this is. So this is, well actually, no, I am sure what that is. Sorry. I don't know what those other things are though. Okay, so what do you think this is? This RW, so it's an RW section, also loaded from bin bash, right? Yeah? The data variables, data variables, the data section of the code. This is data section in the code, but it's data that's been initialized from the file, right? So when bash starts up, there's a certain part of its data that it's going to change, but it loads some initial values into, right? 
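The reading of the pmap output above (address, size, permission bits, source file) can be sketched in a few lines of Python. This is only an illustration: the 876K read/execute region of /bin/bash and the 4K read-only region come from the lecture, while the addresses, the 20K size, and the helper names `parse_pmap_line` and `describe` are made up for the example.

```python
from typing import NamedTuple


class Mapping(NamedTuple):
    start: int    # starting virtual address of the region
    size_kb: int  # size of the region in kilobytes
    perms: str    # first three permission bits, e.g. "r-x"
    source: str   # file the region was loaded from, or "[ anon ]" / "[ stack ]"


def parse_pmap_line(line: str) -> Mapping:
    """Split one pmap-style row into address, size, permissions and source."""
    addr, size, perms, source = line.split(None, 3)
    return Mapping(int(addr, 16), int(size.rstrip("K")), perms[:3], source.strip())


def describe(m: Mapping) -> str:
    """Guess a region's role from its permission bits, as done in lecture."""
    if m.source.startswith("["):
        return m.source.strip("[] ")   # "[ anon ]" -> "anon", "[ stack ]" -> "stack"
    if "x" in m.perms:
        return "code"                  # readable and executable
    if "w" in m.perms:
        return "initialized data"      # readable and writable
    return "constants"                 # read-only


# Hypothetical rows: the addresses and the 20K size are invented for illustration.
SAMPLE = """\
08048000    876K r-x--  /bin/bash
08123000      4K r----  /bin/bash
08124000     20K rw---  /bin/bash
"""

if __name__ == "__main__":
    for row in SAMPLE.splitlines():
        m = parse_pmap_line(row)
        print(f"{m.start:08x} {m.size_kb:>5}K {m.perms} {describe(m):<16} {m.source}")
```

On a 32-bit machine every address fits in eight hex digits, which is why the largest one is written as all Fs.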
So if you've programmed in C, and everybody will be programming in C, when you initialize a variable inside your code, right, that's not dynamically allocated and it's not on your stack, it's just like a global variable that's going to be used throughout the code and you initialize it, this is how it works, right? Whatever values you initialize it with get stored in the executable, and this is how it gets set up at runtime, all right? Any questions about the memory mapping stuff here? Okay, what's this stuff down here? Yeah, these are libraries, right? So bash uses libc a lot, right? This is the standard C library. If you've programmed in C, you've used libc, you don't even know it, right? You don't have to do much to use libc and it's difficult to do anything in C without using libc, right? Libc is like string length and all sorts of like really standard utility stuff. It also provides the standard wrappers to the Unix system calls, right? So almost everything in C uses libc, yeah. All of these are from just one file. Yeah? How does the segregation take place? Yeah, yeah, yeah. So, oh man, this goes down a long messy road that I don't wanna go down too far, but the format of, and actually you guys will deal with this in assignment two, right? The format of, so there's a standard, there's something called ELF, right? Does anyone know what ELF stands for? Other than a small person who helps Santa? No, you can't, hits are yours. Yeah, close. I think it's Executable and Linkable Format, right? So when you, like, when, so the idea is that /bin/bash is a file that's produced by the compiler, but is going to be interpreted by the operating system. And so they have to agree on a format and there's something called ELF, which is complicated and well documented and everything. But what ELF does is ELF tells the compiler how to produce a file that the operating system can understand. 
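One small, checkable piece of that agreement between compiler and operating system is that every ELF file begins with the same four bytes, 0x7F followed by the letters E, L, F. The sketch below only checks for that signature and does not parse the rest of the format; the function name `is_elf` is made up for the example, and whether /bin/bash exists depends on the machine.

```python
import sys

ELF_MAGIC = b"\x7fELF"  # the first four bytes of every ELF file


def is_elf(path: str) -> bool:
    """Return True if the file at `path` starts with the ELF magic number."""
    with open(path, "rb") as f:
        return f.read(4) == ELF_MAGIC


if __name__ == "__main__":
    # /bin/bash is the lecture's example; the running interpreter is a fallback
    # candidate in case /bin/bash does not exist on this system.
    for candidate in ("/bin/bash", sys.executable):
        try:
            print(candidate, "is ELF" if is_elf(candidate) else "is not ELF")
        except OSError as err:
            print(candidate, "could not be read:", err)
```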
So when the operating system loads bash, it reads the file according to the ELF format and what it finds out is there's 876K of code that bash wants loaded here, right? And then there's 4K of initialized global variables, read-only variables, static variables that it wants put here, et cetera, et cetera, right? We'll get back to this when we talk about exec, right? But essentially bin bash contains, it's almost like how many people know about celebrities? Nobody? Oh, come on, admit it. Even I know about celebrities, okay? This is ridiculous. Everyone knows about celebrities, right? So I don't know, I just find this funny. Some of these rock stars and stuff like that, they have these things in their contract where I can't remember who it was. I shouldn't impugn anybody unfairly, but maybe some famous pop singer Diva who I don't want to get their name wrong, only wants green and red M&Ms in the candy jar and her dressing room, right? There has to be a bowl of M&Ms there, but there can only be green and red M&Ms. So somebody at the venue has to go through the M&Ms and pick out all the blue ones and the yellow ones and all the colors that would make this person very unhappy, right? That's kind of like what ELF is like, right? ELF is a description of exactly how the process wants to find things when it starts running, right? It's very specific, exactly like, I don't want any blue M&Ms, right? So, again, we'll come back to that in a great example. All right, and then what's down here? What does this indicate? And this is pretty obvious. The answer is on the slide. Yeah, it's the stack, right? So that's where a bash is saving variables that are local to each thread, right? And thread state associated with run, okay? All right, so now, again, we can fill in bit more pieces of our address space. Hold on, was there a heap here? Oh, yeah, aha, okay. Is that actually true? Yeah, I think it is. So what about this? So these are interesting, right? This is a fair amount, right? 
It's like for bash, right? Like two megs, you know? Bash is a little teeny-weeny, a little nice program, of stuff that's marked anon. What is that? What are we missing here, right? We've got our stack. We have libraries, we have the code for the executable. You need a heap, right? Need somewhere to put dynamically allocated stuff. So that's what this section is, right? So we can fill this in. So now we have our code, we have our data, we have a section for the heap, right? And we have this area for our stack, okay? Finally, I know, and this is long, this is again, long-winded and dull. Yeah, question. Because it's, I mean, it didn't come from a file, right? And it's just, that's a good question, actually. I don't know why that term is used here, right? But yeah, good question. I actually don't know why pmap outputs anon for that, right? Instead of heap or something else, right? There might be cases where I can, well, actually there might be cases where I might have areas of my address space that it doesn't know what they are for, right? So they might be like memory-mapped files or other things, right? So it's possible that it wouldn't put heap because that's too specific, right? But some of that is definitely in use for the heap. That I'm sure of. All right. Okay, so finally, there's this nice command called lsof that will show you open files for your process, right? So this is bash, right? And you'll see that it shows a couple, what are these? Is anybody, any of the Linux hackers out there know what these are? Yeah. Yeah, the terminals, right? So bash is a terminal client, and so it has a couple of terminal sockets open, right? And then what's that last file down there? Yeah, I'm gonna say, I need to get to the coin. .bashrc, what do you think that is? Is that the executable? So it's in my home directory, right? And it's not /bin/bash. Yeah, what's your name? Don, yeah, thanks. It's the profile, right? So I've got my settings in there and bash has that open, right? 
Yeah, I forgot. Okay, so that's actually not true, right? This file wasn't actually open. When I ran this command for real, all that was open was the terminals, but I wanted to have like a real file so I just store that in there. But, you know, but let's just pretend that I caught bash, right, when it started up and it was actually reading my file. Okay, any questions about this, right? So we've shown that you can use these standard Linux tools to basically build up this really nice and complete picture of what a process is like, and this is essentially all there is as far as work is concerned, right? What are we missing here, right? What's the one big thing that would, if this was all process was today, what would you be very disappointed by? What's missing here, right? I've got files, I've got some memory, I have a thread running. What can I not do that you might do with your computer normally? Anyone take your guess? No guesses? Well, okay, well, that would help, right? But what, like, no, I don't, right? Yeah, like, how do you want to hear do any networking on the computer? Come on, guys. Exercise and hand raising, right? I mean, I don't even open up my computer if I can't get on the internet, right? It's not interesting, right? So if there was no networking, this would be, this, you know, you'd be sad, and this doesn't really include networking, right? But again, we're gonna pretend that it's 1970, right? Maybe I'll ask you guys to dress like it's 1971 day. Maybe I'll start dressing like it's 1970. Maybe I already dress like it's 1970, who knows? I don't know, I wasn't alive in 1970. I barely made it into the 70, 79, so I'm proud of that, right? Dude, no, it's not born in the 80s. That's, just barely. All right, last thing I want to point out is just a little bit of sort of interesting sort of information about operating systems, about how operating systems expose information. So the question is how do all these utilities get this information, right? 
These utilities are helpful tools, right? For system administrators and other people who are interested in finding out things about their system, right? And so the operating system kernel is, you know, Linux and other operating systems try to expose some of this information for these tools to use in a kind of flexible and nice way, right? And what Linux does is kind of clever. It essentially reuses the file abstraction, right? So if you go on a Linux or, you know, Unix system, usually there's a file system called proc. It's usually mounted at /proc. And so this is the mount for the proc file system. When I go into /proc and I run ls, so I went into /proc and I ran ls in a particular process's directory and it shows me all these files, right? These are not actual files, okay? What is happening is the operating system is maintaining this illusion of a file system here, right? For the benefit of these utilities, right? But when I do an ls in this directory, what happens is the operating system says, oh, you know, what's all the information I should know about /proc/7615, and it displays it as files. So reading and writing, usually you really can't write to /proc, right? But sometimes you can, but reading from /proc essentially routes you into the kernel and retrieves information about the processes, but there is no actual file system, right? Anyway, that was just a, all right. So let me, let's go back to Friday and today and do a little bit of review, all right? So this is kind of what we've talked about so far, right? So we have, we gave you sort of an introduction into some of the operating system abstractions we'll be covering for the rest of the term, right? Threads, abstracting the CPU, address spaces that abstract memory, files that abstract disk blocks, essentially. And we started to talk about processes which kind of organize and bring together some of these other abstractions, right? 
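To see how a tool like ps or lsof might get its information, here is a rough Python sketch that reads the same /proc entries with ordinary file operations. It assumes a Linux system with /proc mounted and returns empty results elsewhere; the helper names `read_status_field` and `open_descriptors` are invented for this example and are not how those utilities are actually written.

```python
import os
from typing import List, Optional


def read_status_field(pid: str, field: str) -> Optional[str]:
    """Return one field from /proc/<pid>/status, or None if it is unavailable."""
    try:
        with open(f"/proc/{pid}/status") as f:
            for line in f:
                name, _, value = line.partition(":")
                if name == field:
                    return value.strip()
    except OSError:
        return None  # no /proc here, or no such process
    return None      # the file exists but has no such field


def open_descriptors(pid: str = "self") -> List[int]:
    """List the descriptor numbers a process has open, via /proc/<pid>/fd."""
    try:
        return sorted(int(n) for n in os.listdir(f"/proc/{pid}/fd"))
    except OSError:
        return []


if __name__ == "__main__":
    # "self" is a special /proc entry that refers to the process doing the read.
    print("name:   ", read_status_field("self", "Name"))
    print("threads:", read_status_field("self", "Threads"))
    print("fds:    ", open_descriptors())
```

Nothing here is stored on disk: each read is answered by the kernel at the moment the file is opened and read.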
All right, so, but why do we, why do we even have these abstractions in the first place? Right, who can, who can remind me why operating systems go to all this effort to create and maintain these abstractions, right? What do abstractions do? Three things, one of them. Yeah, they, well, hide complexity, right? Well, what is complexity an example of? What is, the complexity is, is sometimes a, a member of a broader class which I might call, what? Yeah, not the implementation. Yeah, make it simple by what, right? What do I do to make something simple? I hide, okay, we're getting closer, hide, well it's up, it's up on the slides, right? Hiding what, right? It's gonna hide the implementation, but what? Well, do you see any, okay, no, no, no, no. What's that? Undesirable properties, oh crap. Everything came up at once, right? All right, but then you didn't see that. My slides surprise me sometimes, I wish I'd done this better. Okay, hiding undesirable properties, right? Complexity is an example of an undesirable property. The implementation is an example of an undesirable property because when the implementation changes, I don't wanna have to recompile every one of my programs, right? So, or, you know, when I get a new hard drive and stick it in my machine, I don't want to have to recompile every one of my programs, right? So, implementation, how things actually work is considered by the operating system to be an undesirable feature, right? And it's something that processes and applications should not have to know about, right? Adding what, so what do abstractions also do? They add what and they organize what? Adding capabilities, right? Things that, and again, this is kind of, this is kind of hiding undesirable features, but it goes past that, right? It says, hey, as long as I'm doing all this work, why don't I make things better, right? 
You know, so I create this file abstraction so I don't have to worry about disk blocks, but let's make the file abstraction do some cool things that disk blocks don't do, right? Like maybe be reliable and perform better and, you know, be able to grow and shrink dynamically, whatever, right? I mean, you know, it's adding new features, right? By using these underlying capabilities and then finally organizing what? Information, right? All right, so, and the purpose of all this is to simplify life for applications, right? And also to allow the underlying hardware to change without having to change your applications, right? I mean, you guys don't really appreciate this, but back when computers, before some of these abstractions existed, you didn't just get new computer hardware, right? Like your program might have been heavily, you know, so, you know, and this is prehistoric times, but it used to be, you wrote a program for a particular computer, right? And if something about that computer changed, you had to rewrite your program, right? Like you made assumptions about, you know, the capabilities of the machine and if those capabilities changed, then your programs broke, right? My favorite example of this, how many people have ever seen a computer with a turbo button on it, all right? And the Linux geeks here who know why this is can't tell people. So, these used to exist when I was growing up, right? You'd see these computers, kind of old, like, early x86 machines, and they had a turbo button, right? It was usually on the front of the case, you know, with a little LED on it, you hit it and it was like turbo mode, right? So, why would your computer have a turbo button, right? Don't I want it to be in turbo mode all the time? Why did these machines have a turbo button? You know, so the turbo button increased the speed of the processor, right? But again, why do, like, why not run in turbo mode all the time? 
No, it had nothing to do with resource consumption. These machines were slow, slow. So it's like, I'm gonna make it a little bit faster. No, no, no, none of the, let's, well, actually Sean, do you want to go? It was not, it was not heat. As far as I know, these machines would run in turbo mode happily for as long as you wanted them to. Yeah, Gina, not power consumption. Oh, this is back, energy is cheap, 1960, you know, like nuclear power is gonna save everything, you know, like there's so much oil underground, like no one cared about it. Well, maybe they did that. Any other guesses, yeah? What's your name? Honorar, no, well, it had nothing to do with memory. What, okay, I don't know, we should just, we should just give it away, but so one of the things that people used to use their PCs for was to play games, right? Early computer games, you know, different types of things where, you know, it'd be running around in some little virtual world, right? So why, so I just gave you a hint, right? And we'll speculate about this for a few more minutes, for like 30 more seconds, right? What to do with games might mean that I would want to have a turbo button. Well, I could, but again, why wouldn't I want it to run faster all the time? Nick, no, it wasn't an extra input source. No, no, no, yeah, okay, so you guys are, it's funny, you guys are making the assumption that I would use turbo mode when I was playing a game. Try making the other assumption, yeah. It would go too fast. It would go too fast. You had these games that made assumptions about the speed of the processor and they used those assumptions to dictate how the game worked and how fast the game worked, right? So imagine like you've got a new computer, right? You're, I don't know, what's a popular computer game? World of Warcraft? How many people play World of Warcraft? Good, because you're gonna fail, but it's good. You won't have any time left to do the class. 
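The assumption those games made can be sketched roughly as follows. This is a toy illustration, not code from any real game: `loops_per_second` stands in for processor speed, and both function names are invented here. A game that moves a fixed amount per loop iteration gets faster when the processor does, while one that scales movement by elapsed time does not.

```python
def frame_locked_distance(loops_per_second: int, seconds: float = 1.0) -> float:
    """Move 1 unit per loop iteration: distance covered depends on CPU speed."""
    return 1.0 * loops_per_second * seconds


def time_based_distance(loops_per_second: int, seconds: float = 1.0,
                        units_per_second: float = 100.0) -> float:
    """Move by speed * elapsed time each iteration: distance ignores CPU speed."""
    dt = 1.0 / loops_per_second          # time one loop iteration takes
    position = 0.0
    for _ in range(int(loops_per_second * seconds)):
        position += units_per_second * dt
    return position


if __name__ == "__main__":
    for speed in (100, 200):             # the second "machine" is twice as fast
        print(speed, frame_locked_distance(speed), round(time_based_distance(speed), 6))
```

Doubling `loops_per_second` doubles the first result and leaves the second essentially unchanged, which is the problem the turbo button worked around by slowing the processor back down.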
No, anyway, so it's, okay, World of Warcraft, but apparently it's not a popular game, only Sean plays it, all right? All right, Angry Birds, right? So imagine you've got a new phone and suddenly Angry Birds was like, it was just like super fast, right? Can't be that way, okay? So, but this is an example of, you know, when application developers made these assumptions about the machine and then those assumptions changed, it's like we'll never have a faster processor. Of course people, like just wait 30 seconds. Yeah, the other question. From like 2000, that made those kind of assumptions? Uh-huh, and I mean, maybe they thought that people's reflexes were going to scale with Moore's law, right? That would be awesome, right? It'd be like the Matrix, you know, it's like download my helicopter, boom. That's pretty cool. All right, anyway, so that's a little fun aside. Okay, so what do processes, this is like Jeopardy or something, or not Jeopardy. Wheel of Fortune, okay? Processes organize information about what and represent a single thing that the computer is what. What does the process do? Processes organize information about what? Not a trick question. Yeah, Jen. All right, this information, oh, okay. So they do collect multiple abstractions together in those, oh, yeah, there it is. Yeah, she got it. All right, and represent a single thing the computer is doing, right? And we talked about where processes contain threads, address spaces, files, all right. IPC mechanisms. I'm gonna start to speed up a little bit here because I don't feel like I have the room. Remember, IPC mechanisms. What does IPC stand for? Is it up on the, it is, it's up on the slide, okay? That's it, inter-process communication. Yeah, it's fun. Return codes, that's a, yeah, that's a great one. Nice and easy, just return, return a number. What happened to me? What else? Pipes, yeah. Signals, yeah, errors. Interesting, yeah, errors are probably returned through return codes, primarily. 
Pipes, signals, return codes, we're pretty close. Files are another poor man's version of IPC. I can certainly share a file between processes. Shared memory, which we haven't really talked about. All right. One major goal of the operating system. At minimum, what would I like my operating system to do? Other than all these other great things, right? For processes, right? What's that? Yeah, yeah, I mean, but what's that a broader class of? Terminating other processes is an example of what? So, yeah, okay, so, well, it's difficult to protect processes against race conditions. It's usually programmer problems, but I'm a process I'm going along and somebody terminates me, right? That would be me, right? But what else would I not like to happen? Molested, right? That's the word I've been using. Maybe it's not a great word to use in class, but. Yeah, from molestation, right? From having someone picking on me, right? I, you know, don't bash to leave me alone, you know? Like, he keeps writing into my address space. It's like kids, right? And put you over there in your corner and in his corner. Okay. All right, we just talked about this. Okay, all right, questions now before we start talking about fork, which of course we will not finish today. All right. So, one thing we need to do before we start talking about process creation, this'll be good, because I think we'll get through this and then that'll be it for today, is talk about our process model, right? So this is the model of processes that we had developed last time. We have threads, we have an address space with different parts of it that are mapped. And this will all, this is like, this is probably for some of you kind of like looking at hieroglyphics right now, right? But I will, I am teaching you hieroglyphics. So by the end of the semester, hopefully all of this will be clear, right? But there's still, I don't know, there's still kind of pretty to look at even if you know what you're looking at. 
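Two of the IPC mechanisms from the review above, pipes and return codes, are easy to try out. The sketch below uses Python's `os` and `subprocess` modules; the function names are invented for the example. Both ends of the pipe sit in one process here only to keep the sketch short, whereas in real use the two ends belong to different processes.

```python
import os
import subprocess
import sys


def pipe_roundtrip(message: bytes) -> bytes:
    """Send a short message through a pipe: write one end, read the other."""
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, message)      # small writes fit in the pipe's buffer
        os.close(write_fd)               # closing the write end signals end of data
        write_fd = -1
        return os.read(read_fd, len(message))
    finally:
        os.close(read_fd)
        if write_fd != -1:
            os.close(write_fd)


def child_return_code(code: int) -> int:
    """Run a child process that exits with `code`; report what the parent sees."""
    child = subprocess.run([sys.executable, "-c", f"import sys; sys.exit({code})"])
    return child.returncode


if __name__ == "__main__":
    print(pipe_roundtrip(b"hello from the other end"))
    print("child exited with", child_return_code(3))
```

A return code carries just one small number back to the parent, which is why it is the simplest mechanism on the list.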
Okay, so here's, but this was our process model. We had files and we had this idea of that, you know, basically processes just having some pointer to a file, right? And what we wanna do before we talk about fork, because fork has some important semantics related to file tables that you guys will have to understand for the second assignment, is introduce an additional level of abstraction, right? So we had this idea of file handles. How many people have used open, read, write in C and done file I/O that way, right? Okay, what is a file, let me just review here. What is a file handle for a C process? What's that? It's okay, it's a pointer to a file. That's what it is conceptually, but what is it if you're a C program and you're just using it? Like you call open and open hands back what? An int, just an int. That's all it is, right? It's just a number, right? It's like, that's how you talk about files, right? You don't give the full path to the operating system every time you need to perform an operation, you just give it a number, right? It's an optimization, right? So essentially when I open my slides file, the operating system will say, you can call this file three, right? And that's what we'll agree on for the lifetime of the process, right? And the next time when I want to write something, I say, hey OS, I would like to write to file three, right? And the operating system will do the path mapping for me, right? Okay, and again, this is kind of what this is designed to reflect, right? So I have, processes have a file table, right? The files are actually stored in an array. Within the process, there's usually a limit to the number of files that a process can have open. This is one of those cases where limits are very nice, right? So, well, anyway, there's been too many digressions today already, so I won't go there. So I'm introducing a new level of indirection. So I have a file table, right?
And the int that I talked about in the file table now points to this file handle, right? Which is an operating system object that has some other information about the file. And then that actually points to the file itself, right? So now I have two levels of indirection, right? The operating system first translates my, I need to, like the actual int that I'm given, right? To this file handle, and then that is translated to a file, right? Right, so again, we just talked about this, right? So the int refers to a file handle object that's maintained by the kernel, right? And that file handle object also references a separate file object that's maintained by the kernel's file system, right? So now we have three levels of indirection. We have file descriptor to file handle, file handle to file object, and file object, finally, to some disk blocks, right? So why am I, why did I do this, right? Why am I, why did I introduce this additional level of indirection? Do people understand what happened here, right? What happened here is that I took one thing and I split it into two pieces, right? And I have the first piece pointing to the second piece, right? So I have, you know, this, and I've split part of that information and I have a file table and now the file handle is maintained by the kernel, right? But why would I do this, right? Why, why, why would I, where's, what's, what's the, what's the normal case where I would take one piece of information and divide it into two parts? What's that? What if I manage me? Better, I mean, that might be the overall goal. What about the layer of abstraction to hide what? Yeah, well I already had the level of, I already had the abstraction in place to hide that undesirable feature being that there are actual disk blocks, right? Too much of information to save memory. When I thought about teaching in this room I always thought I could walk down these aisles but it turns out to be more difficult than I anticipated. It's a lot of feet.
All right, what's another reason I might divide something into two pieces? It does, it does create a higher degree of mapping and that's true, but why do I want that? I'll just give away the answer, right? So what I wanna do is I want to be able to share different parts of this information differently and protect it differently, right? So one reason for taking a single abstraction and breaking it into two pieces is I now can protect each piece differently and have different semantics associated with each piece, right? So here's how this works. File descriptors are private, right? So we go back to our model here. These file descriptors are private to the process, right? No other process can change my file descriptors, right? And when I open a file, the file descriptor only exists in my process, right? File handles start out being private to each process but are shared after I create a new process, right? So when I create a child process, which we're gonna talk about probably on Friday, those file handles are shared, okay? And the file objects that hold other pieces of file state, those are kernel objects and those are shared among all the processes, right? So I have stuff that's private to my process, things that are private to my family of processes, right? And then stuff that's shared by everybody on the system, right? Before I only had stuff that's private to my process and stuff that was shared by everybody, right? These kernel objects, and the reason for this is something that we're gonna talk about when we get to file creation, right? So there are essentially, again, three pieces of data now that are shared differently. The file objects are shared across the entire file system by everybody, all the processes that are using that file. Any process that has a certain file open, that will map down into some unique file object maintained by the kernel, right?
The file handles can be shared by multiple processes and there's specific semantics for sharing them, right? They're only shared after fork and the file table is private to each process, right? All right. So before we can talk about exactly why that is, okay, good, I can get through just into this. Okay, we need to talk about where processes come from. It's like the birds and the bees conversation that we're gonna have now, 9.45 on Wednesday morning. So where do new processes come from? I mean, they must come from somewhere, right? Certainly on your computer, you know, like, oh man, you know, I have a brother and sister and yet, anyway, and me and clearly, anyway, I won't finish the joke but you guys know how it goes. Yeah, okay, so it's Brian? Yeah, Brian thinks that processes come from other processes. That's true, right? But let's be a little bit more specific. Well, yeah, I mean, it's gonna be something I control, right? Like, there's just too many bad analogies here, aren't there? Yeah, so I wanna be in control of the process creation process. But what else? Like, you know, does anyone, well, okay, so let's talk about this, this like the creation myth, right? Where did the first process come from? Yeah, so yeah, we can talk about bootstrapping, right? But the first real user process, right? Who knows what this is called on Linux systems? Spencer? Init. Init is the first process, right? First, the kernel, you can think about the kernel as a program, but it's not really a process because, you know, it's like, yeah, anyhow, you know, the kernel can't be a process because processes are an abstraction created by the kernel, right, so then it would be an abstraction created by itself, right, which would be difficult to sustain, right? Where does Init come from, right? So I have this process called Init, it's the first process on the system, it gives rise to the whole big, you know, tree of life of processes that I'm going to have, where does Init come from? 
What do you think? Does anybody know? Where does Init come from? Yeah, so the kernel creates the first process, right? So the first process is Init and it's created by the kernel, right? Maybe not in its own image, but close enough, right? And then Init gives birth to all these other processes through this Unix system call that I wish I would have made you guys do, Fork, right? How many people have called Fork? All right, good stuff. Yeah, so Fork is the Unix system call, this is sort of the beginning of our discussion of process life cycle, right? So again, there's circularity here, but we're going to start by talking about where processes come from, right? Other than Init, every process on the system is created by a call to Fork. And Fork is a Unix system call that creates a new process, right? The process that Fork creates is a copy of the process that called it. And after Fork, we refer to the process that called Fork as the parent and the process that was forked as the child. And that's fairly obvious why we use the terminology, right? And they have a special relationship, right? And some special responsibilities with respect to each other, right, that we'll talk about, okay? So I start off with my process here, a couple of threads. And there's one thing to keep in mind about Fork, which I've tried to update this class a little bit more to make it sort of multi-thread, multi-core compatible, but Fork tries to make an exact copy of the process that called it, right? And there are lots of different semantics associated with Fork, so if you look in Unix and you do man fork or man vfork, you can find out lots of information about different variants of Fork that do different things, right? Copy this and not that or whatever. But for the purposes of this class, let's consider Fork to make an identical copy of the process that called it, right?
With one notable exception, which is that threads are, you know, for a variety of reasons, very difficult to make Fork safe, right? Because imagine, so in a single threaded process, when the process calls Fork, right? So process tells the operating system, hey, I would like to create a new copy of myself, right? And the process actually enters the operating system, starts doing that transformation, right? In a single threaded process, why is this easy to do? What's not happening in that process at the time it calls Fork? Well, it's not that, I mean, you're right, there's more state associated with extra threads that I would have to copy over, but that's not what's difficult, right? When I say to the operating system, hey, I want you to create a copy of me, and there's only one of us that's doing anything inside the operating system, what makes it simpler? If there's only one thread, then it will block. Right, so the idea is that there's some thread, right? So imagine, let me get back to my thread thing, okay? So imagine thread one was the only thread here, right? And thread one says to the operating system, hey, I want you to Fork me, right? Thread one is doing that, thread one has gone off to the operating system and said, hey, I want a copy of myself, right? And so nothing else is happening, right? No other threads are running, the process is essentially completely blocked, right? It's frozen in time. It's like, if you went out to, you know, if you went out and decided you wanted to make a copy of your house, so you went out to the place that does that, you said, hey, I like to do that, I said, okay, you know, wait here while we do that, and then you went home and there were two copies of your house and they were identical to the way you left it because there's no one else home, right? 
But let's say you have a family and there's a bunch of other people running around, like these other threads, so now while you're out, they're at home and when the people come, they're like, well, unless we can stop all of them somehow, which is typically a little bit tricky, right? The stuff's still going on, right? And then the semantics of it are weird, right? Because what Fork should do is make a copy of the process at the time that the thread calls Fork, right? But if there are other threads in the process, they could still be running and other things could be happening and so it's difficult to say what it should look like afterwards, right? So anyway, we're not gonna worry about threads, right? So let's pretend, on Linux, I guess, the semantics are that only the state for the thread that called Fork is copied, right? So it's not an exact copy, it's a copy that omits the other threads in the process, and the thread that called Fork is responsible for creating those threads again, all right? Yeah, we just went through this, right? Yeah, so yeah, this ends up being gross, right? All right, so I think this is a good place to stop for today and on Friday, it's Wednesday, on Friday we will continue talking about Fork and file handles and other things. Recitations today, office hours today as scheduled. Go to the website, go to the link I sent you which will open up the Google Calendar, all the office hours are on the Google Calendar.