All right, welcome back to operating systems. You might notice today's lecture title looks like Lab 2. This is the last lecture that will help you with Lab 2, so after this you have no excuse not to start it. Start early, because you will run into problems that are very hard to debug, the kind where you need to go to sleep for a night and wake up with the solution. Beating your head against it won't help. We'll go over some of those today.
So here is our task for today. All we want to do is send data to and receive data from a process, which means we want some inter-process communication. We want to create a new process that runs the program named by a command-line argument, which looks very similar to what your shell does. Then we want to send the string "testing" followed by a newline to that process, and then receive any data it writes to its standard output, so we can read that data and maybe process it. As you script things and build useful tools, guess what: the Python subprocess library is essentially this, and it's very useful to know. In fact, it's how all your grading is done. We run your tests in a subprocess, parse the output, all that fun stuff.
So we talked about execve. execve is kind of a pain because we have to give it a whole path to a file, and we also have to build an array of C strings, which is annoying. So there is a more convenient API, execlp, which is just a C wrapper around the execve system call but does some nice things for us. It behaves like execve: it returns negative one on failure and sets errno, all the stuff we know the C standard library does.
And if it's successful, it does not return, because this process starts executing a different program. The nice thing is that the first argument doesn't have to be a path; it can just be the name of a file, and it will search the default directories for you. Those directories are given by the PATH environment variable, which is basically always set. And instead of building a null-terminated array of C strings, we can pass as many C strings as we like as arguments and end the list with NULL. That way we don't have to create an array. It saves a bit of programming; it's more of a convenience.
The final APIs we need are dup and dup2. They return a new file descriptor on success and, again, negative one on failure with errno set. What they do is create a new file descriptor that refers to the same thing old fd refers to. So if you call dup on 0, you duplicate whatever file descriptor 0 currently refers to and get a new file descriptor. If only the default 0, 1, and 2 are open, you'd get file descriptor 3, and file descriptors 3 and 0 would refer to the same thing.
There is a more convenient one called dup2. It takes whatever old fd refers to and makes the number you pass as new fd refer to it as well, so you can target which file descriptor you want to replace. If new fd is already open, it closes it for you and then duplicates old fd, so that number now refers to the same thing in this process. Remember, file descriptors are local to each process. This might look a bit confusing now, but we'll see.
And the reason we use dup2 instead of separating this into two steps is that dup2 atomically closes new fd if it's already open and then replaces it. You could do a close followed by a dup, but you're not guaranteed that nothing else opens a file after your close, and then that number is in use again and you're kind of stuck.
All right, so that's it. Yep? So atomic means it closes and duplicates essentially in one step; it can't be interrupted in the middle. Right now that's not important to us, but later on it will be. Yep? Yeah, new fd can be something that's already open, so it could be 0, 1, or 2, and dup2 will just replace it. So I could open a new file, get file descriptor 3 for that file, then do a dup2 from 3 onto 0, and after that file descriptor 0 also refers to that file. Yep? So dup2 returns new fd on success, not zero, and dup returns whatever new file descriptor it picked. Either way, if it's successful, you already know what new fd refers to.
All right, that's it. Now on to our goals for today. I wrote a bit of boilerplate: here's a check_error function that checks whether its argument is negative one. If it is, it does all the error handling, so I'll wrap everything in it to save myself some typing. Remember, I want to run whatever the command-line argument is, so I use the arguments to main: argv is an array of C strings and argc tells you how many elements are in it; that's just C's way of doing things. I check that there are exactly two arguments, and if there are not, I return from main with this error code, which just means invalid.
Otherwise, you launched the program the way you should: first the name of the executable, and then whichever program you want to exec. So here I declare some arrays for pipe file descriptors because, well, if we want to communicate with another process, we should probably use pipes. After that I fork, and I wrote some little helper functions. If pid is greater than zero, I am the parent, so I call this parent function and give it the two pipe fd arrays and the process ID of the child. In the child, I give it the two arrays and the argument that was passed in, which is the executable I should run. After either of these returns, the program exits with zero. Right now, parent does nothing and the child does an execlp.
So let's say I want to run uname. We used uname in Lab 0: uname -r shows the kernel version, but if you just type uname, it tells you what operating system you're running. So if I type uname, it tells me Linux. What should happen if I run my program with uname? Yeah, it should still print Linux, right? It prints Linux; that wasn't a trick. But that's coming from the child process. The parent just returns immediately and exits without doing anything, so that output came from the child, not the process I ran. Sorry? Yeah, and currently it may leave an orphan, or a zombie that becomes an orphan, because was I a responsible parent yet? No. We should fix that later, but that usually comes last and people don't notice it, so good on you.
All right, the first part is that I want to capture output from that process, so I should probably do it with a pipe. I'll go ahead and create a pipe using out_pipe_fd.
So if I do that, I create a pipe and then I fork, so both the parent and the child will have some file descriptors open. Likely file descriptor three is out_pipe_fd[0], which is the read end of the pipe, and file descriptor four is out_pipe_fd[1], which is the write end of the pipe, right? Cool. Since I fork after creating it, both processes have them. So if I want to capture the output of the child, what should I do here? Yeah, I make standard output refer to the write end of the pipe with dup2. The write end of the pipe is out_pipe_fd[1], and what is standard output? One. I see some ones, yeah, standard output is just one. So with this dup2, I'm taking the write end of the pipe and making file descriptor one in the child process refer to it as well. Remember, this won't affect the parent process.
Now if I compile this and run it, what should I see? The same, I should see Linux? Yeah, I'll see nothing. I run it and see nothing. So is the child process that's running uname still printing Linux? Yeah, but it's going to the write end of the pipe. Where does that go? The kernel is managing it, and no one ever reads it. The kernel is smart enough to know that the only references to that pipe are from those two processes, so as soon as they are gone it can free that memory. This is like the tree falling in the forest: the process outputs something and no one hears it, so it doesn't really output anything. I guess, kind of.
All right, am I done in the child process? Is this good, or should I heed my own advice? Am I done with file descriptors three and four? Yeah, what's the golden rule?
Close file descriptors once you are done with them. The program I'm running is only expected to use file descriptors zero, one, and two. Leaving anything else open is considered very rude, and the program is not going to be aware of it. So I should close both of them because I'm not going to use them anymore, right? All right, that looks pretty good for the child.
Yep? Close the write end before... where do I read? Before I write? Where do I write? Here? You mean I should move this one up, put this here? Yeah? Well, if I put the close here, I'm closing out_pipe_fd[1], which is probably file descriptor four. Once I close it, file descriptor four is invalid, and then the dup2 is trying to make file descriptor one refer to something invalid. I mean, we can run it, and this is a case where it pays to program defensively. It just kind of silently fails. So that's great. The check_error didn't report an error, but it would cause us issues later. So the original order is what we want. That's good.
Also, a quick question: what would happen if I did this instead? Yeah, if I create the pipe after the fork. Remember, the processes are complete copies of each other at the time of the fork. In this case they would just have some uninitialized variables, and after the fork they would each create a pipe, and those pipes would be independent of each other. Because this pipe doesn't have a name, the only way to share its read and write ends is to create the pipe before you fork, because the copying happens as part of the fork. Otherwise you will not be able to communicate between the processes. So it has to be here.
Okay, so that was step one: we're capturing the output from that process. Where is that output going?
Yeah, into some memory managed by the kernel, through the write end of the pipe, right? So how would I access that information in the parent process? Read. Read what? Yeah, read from the read end of the same pipe. That way I should be able to get some information out of it. Remember, in the parent we didn't close any file descriptors or do anything, so the out_pipe_fd entries are all still valid. So I can create a character buffer of that magic number, which we haven't explained yet. Then I declare bytes_read and do a read system call. What file descriptor do I want to read from? The read end of the pipe, out_pipe_fd[0]. I read into that buffer, passing the size of the buffer. Then of course we check for errors, because that's what we do. See, error checking is not that hard. That was just one line. Checking for errors, at least in this course, isn't too bad.
So there's my read system call. Right now, if I run uname, Linux should be in the buffer. To print what I got, I use this format specifier, %.*s, because the output is not guaranteed to be a C string: it takes a length as the first argument and then a pointer to some chars. The length I want to print is the number of bytes I read, and the pointer is the buffer. If I compile and run this now, hopefully it prints "got Linux", because that is what uname output. So we are capturing its output. If you wanted to, you could process this further and do something with it, but for now we're just capturing it because it's fun.
Any questions about that? Yep, which one? On line 28? No. So the question is, doesn't this close standard in or standard out, because of the dup2? Yeah. So here, let me visualize the dup2 a bit better. Before this even runs, here are all our file descriptors.
So file descriptor zero is standard in, whatever it was when this process was created, one is standard out, and two is standard error. All the dup2 did was take whatever file descriptor four was referring to, in this case, and make file descriptor one refer to the same thing. So after the dup2, it looks like this. And then these calls here would be like close four, and then this would be like close three. So right before the execlp happens, these are the file descriptors that are open. Does that make sense to everyone? Okay, cool.
So I should be able to run this with other things. Let's see, cat. If I run this with cat, nothing happens because it's still waiting for input. I can type something and then it captures that. Lots of fun.
So, am I done with my parent process? Is that good, or am I doing some bad things that I've constantly been telling you not to do? Yeah, I should close my file descriptors as soon as I'm done with them. My advice to you is to close them as soon as you know you don't need them. This parent process will never use the write end of the pipe, so it should close it immediately. Then it reads from the read end of the pipe, and once it's done reading, it should close that too. So that looks a bit nicer. That should work. Great. And I'll also get rid of a compiler warning.
Okay, the next part is that I want to send some input to that process. What should I do for that? Yeah, another pipe. And I already created one for you; it's called in_pipe_fd. For in_pipe_fd, I want to write data to it and I want my child process to read from it. Let's do the reading first. In the child process, I should probably do more or less the same thing, right?
So let's do the old programming 101 thing: some copy-paste with some replacement. So is that good? What's the point? Yeah. So there's a question: can we not just use one pipe for both directions? Essentially, if you do that you can create a loop, because you don't know which side should be reading and which should be writing at any given time. Very bad things can happen. We might make a bad thing happen if we have time.
Yeah? Yeah, I'm duping both of them onto file descriptor one. What's the purpose of this in pipe? I want this process to read from it, right? So if I want it to read from it, is this right? This should be zero. And is this right? Is this the read end or the write end of the pipe? That's the write end. So this one's wrong too; this should also be zero, right? Okay, so everyone agrees that when the child execs, its file descriptors should look like this: zero is the read end of the in pipe, one is the write end of the out pipe, and two is whatever standard error was, right? So our child process is good, all done. Wow, no one's really confident at all today. That looks all right; let's keep it as is.
So if I want to write some information to the child process, how would I do that? At the beginning we said we want to write the string "testing" with a newline. If I want to send that to the child, how do I do it? Write. What file descriptor should I be writing to? One? One, okay: message, strlen(message). All right, is that good? Okay, let's go ahead and, whoops, what did that do? So if I do that, I see testing and then it kind of dies. So I screwed something up. Well, the first part of this comment was correct. This write went to one, which in the parent is its standard output; I didn't touch that file descriptor at all. If I want to send it to the child, I use in_pipe_fd[1], because that is the write end of the in pipe.
So the child process would go and read from that. Whoops, I didn't actually save. So now if I run that... well, the other comment was that if I use uname, that process never reads from standard input because it just outputs something, so I'm not sure that actually works. But if I run something like cat, it should read. cat is really simple, right? It just reads from file descriptor zero and writes to file descriptor one. So it should read the testing that I sent it and send it back out on file descriptor one, which is the write end of the out pipe, which I should be able to read. So if I run that, I see "got testing". Good, we're done.
Yeah, we need to close some file descriptors. If I want to close them as early as possible, I should move that one up. I never use the read end of the in pipe, so I should close that immediately, and I should also close the write end after I am done writing to it, like that. So that looks a bit better. Cool, what else did I forget to do? Yeah, wait for the child; we were not good parents. So we should be good parents. Where should I wait for the child process? Anyone want to give me a line number? 26? After we read from it here, at the end. All right, so we'll wait for it: wstatus. Actually, I'll use waitpid, just to get rid of a compiler warning, since we have the child's process ID passed in. So child, wstatus, no options, check_error. That looks good to me; there are no compiler warnings. If I run that now, it looks pretty good, right?
All right, so does it still work now? Yes, we got a few yeses. How many yeses? Like four. How many nos? Fewer nos? Okay, let's run it. Still works fine. Why does it work fine? Because you had a 50-50 shot. Yeah, remember this wait just waits for that child process to terminate.
So I send the child process some information, it reads it, and I close my file descriptors. So hopefully that's all good. Then it can just write its output to the pipe. I don't have to read it immediately; the kernel manages that for me. The kernel lives longer than any process because, hey, guess what, if the kernel dies, your system is toast. So the kernel holds the information the child wrote in its own buffer, and after the child terminates we can still read from it. We read that information and then print it out. So we did this properly and it even worked and we didn't close file descriptors. Oh wait, before that, let's see, does this work? Yeah, so this doesn't work. Remember, what cat does is just read, and read is a blocking system call. So cat reads, and at least one process still has the write end of the pipe open, which is the parent process. So cat can't do anything; it's still waiting for input, and I'm waiting for it while it is waiting for input, so we're essentially at a stalemate. Nothing's going to work. I can press Ctrl-C, and by default it will die.
Another fun error: what happens if I do something silly like this? Remember, this is one line, one pesky little file descriptor. Nothing bad could happen, right? Well, same thing. This is Lab 2. This is what will probably happen to you if you do not listen to my advice. So start Lab 2 after this, and again, if you run into things like this, go to sleep, because you probably messed it up. You will be creating child processes. They should only have three file descriptors open, zero, one, and two, and you should close the others in the parent as soon as you know they are not going to be needed anymore. This is actually a harder case because we're communicating directly with the children. In Lab 2, you don't really have to care what they do, but you do have to mess with the file descriptors.
So in that case, if I comment this out, the parent still has the write end of the pipe open. The child process wouldn't get zero back from read; it would just sit there waiting for more input. The kernel can't know that no more input is possible, because my parent process still has the write end of the pipe the child is listening to open. So they'll just be at a stalemate. Yep? If I comment this out and also don't wait for the child? Yeah. So this is where it pays to be a neglectful parent. That works. So two wrongs do make a right. Although technically we probably created an orphan here. Well, actually, we would have had to create an orphan here, right? Because the child process only gets zero back from read once every write end of the pipe is closed, and here I forgot to close it. So the child just spits back the one output. Then the parent process gets here and returns, and since it returns, that process ends and the kernel closes everything associated with it. Only at that point does the write end of the pipe close. The child gets reparented to something, then it gets zero back from read, so it's not an orphan that keeps on going; it gets cleaned up right away. But in this case you would be wrong because you were a neglectful parent, and it's also because you were a neglectful parent that your process ended and freed up that deadlock. So this is also a cautionary tale: just because your code works does not mean it is correct.
Any other questions here? So unexpected pass and unexpected fail are essentially just checking the exit status of your process. Expected fail just means it's supposed to return something other than zero. So that's what it means for the test cases.
They do something kind of like this, but as you have probably figured out, writing test cases is harder than writing the labs, since, yeah, you can do some creative stuff. Any other questions? Comments, concerns? No?
So there was one question, about using the same pipe for both directions, and I'll try to illustrate it quickly. If I'm being artistic here, let's say this is a pipe. This is the write end of the pipe, where you fill it up; that's the file descriptor at index one. And this is the read end of the pipe that you get data from. Now say I'm trying to do two-way communication with a single process using this one pipe. Say this is my child process, and say it's cat. I'll draw its file descriptors here. If I connect one to this end of the pipe, and it only uses the read end of the pipe for input, well, that would have to go here, which already starts to look wonky. And if I want to send it information, the parent process would also be using this end of the pipe. So now we have two things writing to the write end of the same pipe, and that cat process is just reading from the read end.
I hesitate to program this because it might end up being a disaster, but say I write even one character, X, into the pipe from the parent. Well, guess what? cat would read that X and then output that X here, back into the pipe. Then it would read that X and write it, and read it and write it, and read it and write it. You would never get off this wild ride. So that is why you cannot use a single pipe for both directions of communication. Yeah, and the parent wouldn't really be able to interfere. Yeah.
So this is why we have two pipes: we know for a fact that the information we write is only read by the child, and the information the child writes is only read by the parent. There's a clear separation of who is the receiver and who is the sender. That's why pipes are a one-way communication channel, which means one receiver and one sender. In the case where we have one pipe, we would have two senders and also two receivers, because the parent would also want to read, and it's just complete chaos. All right, any other questions? Yeah.
Yeah, so how we have it set up now is, let's see how artistic I can get. We have two processes and two pipes. Here's my in pipe with one and zero, and here is my out pipe with zero and one, same thing. My child process has file descriptors zero, one, and two. What it writes goes to file descriptor one, which goes into the out pipe, and it reads from file descriptor zero, which is this end of the in pipe, right? In fact, I should label that better: child. I'll abbreviate it a little bit. So that's what was going on, if you want to draw the pipes with the child: it reads from the in pipe and outputs to the out pipe. And the parent process over here feeds data into the in pipe, right? Whatever I sent, the child read it from its file descriptor zero and output it to file descriptor one, which went to the out pipe, and then my parent process read that information out of the read end of the out pipe. So the child just reads from the in pipe and writes to the out pipe, and you do the reverse in the parent. In this case, when I sent the string testing, whoops, actually testing started over here.
When I wrote to it, you can think of it as traveling through the pipe, and then when the child process read from file descriptor zero, it read this testing. In this case it was cat, so it wrote testing out to file descriptor one, and it went into the out pipe. So now it's in the kernel's memory, and when the parent read from the read end of the pipe, it went into the parent's memory and the parent got the string back again. So it just had a roundabout journey through some processes. The "got testing" came after I read from the read end of the out pipe and printed whatever I received from that pipe. Because it was cat, I sent it testing, it wrote back testing, and then I read that. If I ran something else, like uname, well, it doesn't read anything from standard input. I would still have fed it data, but it just ignores it, carries on, prints to standard output, and terminates, and then I read its output. All right, any other questions? All right, cool. Then I will be here for other questions for the next four minutes or so. So just remember, I'm pulling for you. We're all in this together.