OK, good afternoon everyone. Welcome to recitation three. We're going to start today with assignment two. So I'm going to start from the very beginning. I'm going to go back over what Carl covered last week. We did cover part of assignment two last week because we thought some people were done with assignment one and wanted an idea of how to go about assignment two. So we're going to talk about the operating system and computer organization. Then we're going to start with system calls in OS/161: the syscall lifecycle and how to add a system call. We're going to go through the process model. Then we're going to talk about file system support and initializing the console, which is assignment 2.1. And then we're going to talk about the first file syscall, write. Previously, we just started with open, close, read, write. But this year, because you need the console running by this Friday, you should have at least write done for the console to work. So we're going to go over write if time allows, and I'm also going to talk about the design document. You need one starting with assignment two. So assignment two is about implementing system calls and exception handling. You have two kinds of system calls that you need to implement: file syscalls and process syscalls. For the file syscalls, you have open, read, write, lseek, close, dup2, chdir (change directory), and __getcwd (get current working directory). For the process syscalls, you need to implement getpid, fork, execv, waitpid, and _exit. You have two deadlines. The first deadline is going to be this Friday. The next deadline is going to be around four weeks from now, March 17. So let's start with understanding the computer organization and where the operating system fits in. In general, we need to separate users from the operating system, or the kernel. Why do we need to do that? One of the main reasons is security. Sometimes the user doesn't know what he's doing.
Sometimes the user intentionally tries to access something he's not allowed to, like some part of memory that he doesn't have permission to access. So security is one of the main reasons that we keep user space and kernel space separated. And that means you need some kind of interface that allows the user to interact with the kernel. Whenever a user program needs help, it will, through that interface, ask the kernel for some operation, some service. And this is what goes through the system call. Sometimes something goes wrong with the program that needs the attention of the kernel, for example an exception such as dividing by 0. The user program tries to divide by 0. At that point, the kernel should come into play and handle the error. So we need to separate user space from kernel space. And this is basically how it goes: we have users, and we have user programs that use library functions to reach the system calls. The system calls are where the kernel receives the user program's requests for services. And this separation means that you still need to figure out how the two sides coordinate: how we pass arguments and how we return results. As we go up the hierarchy, things get more abstract, and as we go down, things get more low level. That means, for example, that the kernel has direct access to the hardware, and this is one of the reasons you want to prevent the user from having that direct access. So system calls are basically a way for the user program to request services from the kernel, whether hardware-related or software-related, like accessing the hard disk drive or creating new processes.
So before we dive into the system calls, there are some concepts that you need to understand. And you might have these questions in mind: how do we switch from user mode to kernel mode? Or how does the system know when a user program issues a system call? That comes from the interrupt handling process, but a system call is only one of the interrupts that could happen. We could have an exception, which, as I said, is something like a divide by 0. We could have a hardware interrupt: you press a key on the keyboard. You have the timer interrupt. So how does the kernel know what kind of interrupt it is? And how does it know that it is a syscall? Then another question would be, how should we pass arguments to the kernel? And how should we receive the results from the kernel back in user space? So I'm going to go through a syscall example that is already given to you. It's implemented. I'm going to show you the path from the point where a user program makes a call to a library function, all the way into kernel space, satisfying that service, and then going back again to user space. You don't need to know all these details, but I'm going through this to give you an idea of how things go, so you have a better understanding of what a system call is. What you really need to do is only write the functions that handle the system calls. So I'm now going to go through the syscall assembly files. You really don't need to change anything there or understand all of it. I'm just going to go step by step showing you how things go, and meanwhile I'm going to cover some topics. So let's go step by step. The first thing is here. As you can see from the path, we are in user space: we are in userland, in libc, time. So this is the time function call. And it's very basic.
It receives a pointer, and here the library function makes the call to the syscall, which is __time, passing the pointer and NULL. If you want to really understand what the arguments are, you have man pages given to you. You can find them under the source tree, in the man folder, under syscall. You have all these syscall pages given to you, and they are up on the website. So for example, this is the time syscall. Basically it takes two pointers, one for the seconds and one for the nanoseconds. And that's what's happening here. So if we go next, once we make that __time syscall, we go into syscalls.S, which is an assembly file. And that's basically a macro, SYSCALL, which receives a label and a number. For all the syscalls that are defined for the kernel, you can see that each has a name and a number. So for the time syscall, for example, you have __time, the call that we just made, and it has the number 113. What happens here is just a little assembly code. It puts the syscall number into the v0 register and jumps to the syscall label, which is here. And at this point, it issues that instruction, which is syscall. What happens here is the interrupt handling process. So at this point, an interrupt is issued. So what happens whenever an interrupt is triggered? First, you enter privileged mode: you change mode from user space to kernel space. Then the state is recorded, so all the registers are going to be saved. And then the PC, the program counter, jumps to a predetermined memory location, which is 0x80000000, and starts executing from that memory location. What is at 0x80000000? We're going to see that now.
So once we execute that instruction, this is where the interrupt is issued. Then we go into the kernel. As you can see, syscalls.S was still in userland. But now we jump into exception-mips1.S, which is assembly code again, and it is under the kern folder. So now we have switched from user mode to kernel mode. And you really don't need to care about that file. What's happening here is basically, as I said, saving the state: saving all the register data into the trap frame. That's exactly what's happening here. What you need to know about is the jump into mips_trap. This is a C function that you can find in trap.c. All the paths I mention, you have them here. So we are currently in the general trap (exception) handling function for MIPS. You have the path here, which is trap.c, and this is the function mips_trap. What does it receive? A trap frame. What is a trap frame? The trap frame has all the registers. If we go into trapframe.h, you're going to see that what the trap frame holds is all the registers: v0, v1, a0 up to a3, the stack pointer. Everything is in the trap frame. Why do we pass it? Because it has all the state saved in it, including the arguments. Now what we do is get the code. If you go through these files and read the comments, you're really going to understand. I mean, you don't really need the recitation for this. Once you go through the files and read the comments, it's quite obvious. So the code basically extracts the exception code from the register fields. So now we know where we switch from user mode to kernel mode. The next question was, how does the kernel know that it is a system call and not, for example, an exception, a hardware interrupt, or a timer interrupt? It's here.
So from the code that we just extracted, we're going to know. For example, if it is an interrupt, like a hardware interrupt, then this is what should happen. Next, if it's a syscall, so if code == EX_SYS, which means a syscall, then this is what should happen. And if you continue, it tells you, for example, that maybe it is a memory fault: if it wasn't any of the really easy cases, call vm_fault. So as you continue, these are the other interrupts that could happen besides the syscall. So let's go to the syscall part. The syscall part calls syscall, a C function, and passes the trap frame, which again has all the registers and all the arguments saved in it. Where is syscall? It is in syscall.c. It is the system call dispatcher, and this is the path for it, syscall.c. So we talked about interrupt handling. Now we have the argument passing that we need to talk about. So we go to the syscall function, and we have the trap frame, as you can see here. First of all, what I'm going to do is get the value of the v0 register. As I said, now we know that it's a syscall. Now I need to know which syscall it is. Is it open, read, close? And that's what I'm going to get from v0, which is where the assembly file saved the number of the syscall. That was here. So based on that number, there is a switch statement here. And here things get important, because this is where you really need to add stuff. There is a switch statement based on the system call number that you received. For example, what is implemented now is only the reboot and time syscalls. As you implement your syscalls, you're going to add cases. You're going to add more branches to the switch statement. So as you can see, since we made a time syscall, we're going to go into that case. And that will call sys___time, which is the function that handles the syscall.
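To make the dispatch step concrete, here is a minimal, self-contained sketch of a switch on the syscall number taken from v0. It is not the OS/161 dispatcher: the `trapframe_sketch` struct, the `SKETCH_SYS_*` numbers, and the stand-in handlers are all hypothetical names invented for illustration, and the real trap frame has many more fields.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical, trimmed-down trapframe: only the registers this sketch uses. */
struct trapframe_sketch {
    uint32_t tf_v0;   /* syscall number on entry */
    uint32_t tf_a0;   /* first argument */
    uint32_t tf_a1;   /* second argument */
    uint32_t tf_a2;   /* third argument */
};

/* Made-up syscall numbers; the real ones live in the kernel headers. */
enum { SKETCH_SYS_REBOOT = 1, SKETCH_SYS_TIME = 2, SKETCH_SYS_WRITE = 3 };

/* Stand-in handlers: each returns 0 or an errno value. */
int sketch_sys_reboot(uint32_t code)
{
    (void)code;
    return 0;
}

int sketch_sys_time(uint32_t user_secs, uint32_t user_nsecs, int32_t *retval)
{
    (void)user_secs;
    (void)user_nsecs;
    *retval = 1234;              /* fake clock value */
    return 0;
}

int sketch_sys_write(uint32_t fd, uint32_t user_buf, uint32_t nbytes,
                     int32_t *retval)
{
    (void)fd;
    (void)user_buf;
    *retval = (int32_t)nbytes;   /* pretend everything was written */
    return 0;
}

/* Pick the handler from the number in v0; unknown numbers give ENOSYS. */
int sketch_dispatch(struct trapframe_sketch *tf, int32_t *retval)
{
    *retval = 0;
    switch (tf->tf_v0) {
    case SKETCH_SYS_REBOOT:
        return sketch_sys_reboot(tf->tf_a0);
    case SKETCH_SYS_TIME:
        return sketch_sys_time(tf->tf_a0, tf->tf_a1, retval);
    case SKETCH_SYS_WRITE:       /* a newly added case looks like this */
        return sketch_sys_write(tf->tf_a0, tf->tf_a1, tf->tf_a2, retval);
    default:
        return ENOSYS;
    }
}
```

Adding a syscall in this shape means writing one handler and one new `case`; nothing else in the switch changes.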
And that's defined in time_syscalls.c. This is the function. Now here is the argument passing that we really need to take care of. As you can see, since I received the trap frame here, I'm passing the register values based on the number of arguments that I have. And that's what we discussed last time with Carl: the argument passing convention. Please read the system call dispatcher comments. Whatever is on this slide is from those comments. So we have four registers, A0 to A3. The first four 32-bit arguments go into these registers, A0, A1, A2, and A3. If we have a 64-bit argument, it again goes into these registers, but it has to be aligned. That means either A0 and A1, or A2 and A3. So if the first argument is 32-bit and the second is 64-bit, A0 holds the 32-bit argument, A1 is unused, and A2 and A3 hold the 64-bit argument. The remaining arguments go on the user stack, starting from the stack pointer plus 16. That is for passing the arguments. Now, when we want to return values, where should we place them? The return value goes into the V0 register if it's 32-bit, or V0 and V1 if it's 64-bit. And A3 should be set to 0 if the syscall succeeds. If it fails, A3 holds a nonzero value; the dispatcher code sets it to 1. It's all in the comments. You should know what it should be if the syscall doesn't succeed. And we have a few examples here. You can go through them, but for our syscall, since we receive two pointers and pointers are 32-bit arguments, we put them in A0 and A1. And on return, if it succeeds, A3 is going to be 0 and V0 is going to hold the time, which is the return value, since it is again 32-bit. But for lseek, as you can see, you pass the file descriptor in A0. It's 32-bit.
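The aligned 64-bit case described above can be sketched in a few lines of standalone C. This assumes the big-endian pairing, where the first register of the pair (A2) carries the high 32 bits and the second (A3) the low 32 bits; the function names are hypothetical, and the kernel tree may already provide helpers for this, so check the headers before writing your own.

```c
#include <stdint.h>

/* Assumption: first register of the pair is the high word, second is the low word. */
uint64_t sketch_join32to64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | (uint64_t)lo;
}

/* Inverse direction, e.g. for placing a 64-bit return value in v0 (high) and v1 (low). */
void sketch_split64to32(uint64_t value, uint32_t *hi, uint32_t *lo)
{
    *hi = (uint32_t)(value >> 32);
    *lo = (uint32_t)(value & 0xffffffffu);
}
```

In a dispatcher, the same idea would be applied to the A2/A3 fields of the trap frame on the way in and to V0/V1 on the way out.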
Now the position, or the offset, is 64-bit. So that means A1 is unused, and the offset goes into A2 and A3. And whence goes on the stack, at the stack pointer plus 16. This is the most difficult one that you will need to handle. Other than this, all of the syscalls just take 32-bit arguments. For the return value, again, A3 indicates success or failure. And since we're returning the offset, and it is 64-bit, it goes into V0 and V1. OK. So now you're going to call the function that handles the time syscall. And here what you need to figure out is which registers to pass. As we said, the seconds go into A0 and the nanoseconds go into A1, since they are 32-bit. And let's go to the time syscall. So this is the second step of what you need to do. Actually, the first step is that you need to write the function that handles the syscall. Once you're done with that, you add a case, a branch, to your switch statement. So here the function receives the seconds and nanoseconds. But again, that's coming from user space. That means you need to check the arguments. Why do we need to check them? So now I'm going to go through copyin and copyout. copyin and copyout are the functions that you need to use to check the arguments that you receive. And you also need to pass whatever you want to return to user space through copyout. So copyin checks the arguments that are passed in. copyout checks the destination for what you're going to return to user space. You might ask now, why? OK, I understand that an argument passed from user space could be dangerous. That's why I need to use copyin to check that argument. But why do I need copyout? Because let's say there is a pointer pointing to some memory location in kernel space. You need to use copyout so you can copy that data into user space, into a memory location that is designated for the user.
This is one of the uses of copyout. So why should we use copyin and copyout? For example, I make an open syscall and I pass a null pointer. If you just copy from it directly, things are going to crash. But copyin and copyout handle these cases by returning error codes. Or, for example, you could end up writing data over the interrupt handler. For read, you have a buffer: you read the file content and put the data that you read into that buffer for the user to read. And the user might pass 0x80000000, which is the interrupt handler location, as the buffer. Then you're going to overwrite it: you read whatever data and save it into location 0x80000000, which you're not supposed to do. So these are the reasons why you really need copyin and copyout. And that's what's happening here. As you can see, we get the time, and we use copyout here. We really don't need to use copyin. Why? Because we're not receiving data from the user. Please keep in mind that the pointer values themselves are in kernel space, and they are safe. But what they point to is not safe. So here, we have a pointer that we're going to store the seconds through. We're not going to read anything through that pointer. So what we really need is to get the time from the system and then copy out the time to the seconds pointer and the nanoseconds pointer. Now, other than checking the arguments and satisfying the service, you also need to handle exceptions. For every syscall, if you go through the man page, you have some errors that you need to handle. If you remember, as part of this assignment, we're going to implement system calls, and we're also going to handle the exceptions. So you need to keep that in mind, because your syscall will not work if you do not handle the errors. And now we start returning the results.
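One part of what a copyin/copyout style check has to do can be sketched outside the kernel: reject a user-supplied range that is null, wraps around, or reaches into kernel space. This is a simplified illustration under stated assumptions, not the kernel's implementation: `sketch_check_user_range` and `SKETCH_USERSPACETOP` are hypothetical names, the boundary is taken from the 0x80000000 address mentioned in the talk, and rejecting address 0 up front is a simplification (the real functions also recover from faults on unmapped user addresses, which this sketch does not model).

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed boundary: addresses at or above this belong to the kernel. */
#define SKETCH_USERSPACETOP 0x80000000u

/* Return 0 if [uaddr, uaddr + len) looks like a plausible user range, else EFAULT. */
int sketch_check_user_range(uint32_t uaddr, size_t len)
{
    uint32_t last;

    if (uaddr == 0) {
        return EFAULT;          /* simplification: treat NULL as invalid */
    }
    if (len == 0) {
        return 0;               /* nothing to copy */
    }
    if (len > SKETCH_USERSPACETOP) {
        return EFAULT;          /* cannot possibly fit in user space */
    }
    last = uaddr + (uint32_t)(len - 1);
    if (last < uaddr) {
        return EFAULT;          /* address arithmetic wrapped around */
    }
    if (last >= SKETCH_USERSPACETOP) {
        return EFAULT;          /* range reaches into kernel space */
    }
    return 0;
}
```

A read handler that skipped this kind of check and wrote straight to a user-supplied 0x80000000 is exactly the overwrite scenario described above.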
As we receive them, we return the results. The syscall handler returns a value and an error code. And here we check: if there is an error, we put the error code into the v0 register and set a3 to 1, which means failure. Otherwise, we put the return value into the v0 register and set a3 to 0, and that means success. And then we keep returning until we get back to the assembly code. So this is where we issued the interrupt. And here there is a conditional branch. If you read the comments, it tells you that if the call succeeded, it just continues. If not, it handles the error: it assigns minus 1 to v1 and v0, and it returns the error code. So that's how a system call goes. This is the time syscall that is already implemented for you. Don't freak out. The only things you need to do are implement the system call handler and add a case statement to the system call dispatcher so that it handles that syscall when it is called. So, the system call lifecycle. We started by putting arguments in registers a0 to a3, and then we issued the syscall instruction, which traps and changes mode from user mode to kernel mode. Then we entered kernel mode. And here, in privileged mode, we saved the context, and then we identified the type of the interrupt, which is a system call. Based on the system call number, we dispatched to the system call handler, that is, we identified which system call we need to handle. Once we handled that system call, we stored the results in the a3 and v0 registers. And we went back to user mode. So this is the system call lifecycle. Now how should you add a system call? There is a convention that you need to follow, and we put the steps for you here. But there is an easier way. The easy way for now is that you can just write your function in syscall.c.
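The return half of the lifecycle can be modelled in standalone C: the kernel side stores either the error code or the return value in v0 and sets a3 accordingly, and the user-side stub turns a nonzero a3 into a -1 return. This is a sketch of the convention as described in the talk, not the actual assembly; `sketch_regs`, `sketch_store_result`, `sketch_user_return`, and `sketch_errno` are hypothetical names.

```c
#include <stdint.h>

/* Only the two registers involved in the 32-bit return convention. */
struct sketch_regs {
    uint32_t v0;   /* return value, or error code on failure */
    uint32_t a3;   /* 0 on success, 1 on failure */
};

/* Kernel side: what the dispatcher does after the handler returns. */
void sketch_store_result(struct sketch_regs *r, int err, int32_t retval)
{
    if (err) {
        r->v0 = (uint32_t)err;
        r->a3 = 1;
    } else {
        r->v0 = (uint32_t)retval;
        r->a3 = 0;
    }
}

/* Stand-in for the per-process errno that the user-level stub fills in. */
static int sketch_errno;

/* User side: what the library stub hands back to the calling program. */
int32_t sketch_user_return(const struct sketch_regs *r)
{
    if (r->a3 != 0) {
        sketch_errno = (int)r->v0;   /* remember why the call failed */
        return -1;
    }
    return (int32_t)r->v0;
}

int sketch_last_errno(void)
{
    return sketch_errno;
}
```

Keeping the error code out of the normal return value is why a handler in this style returns an errno value and reports its result through a separate pointer.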
For example, the open syscall. Then you add another case in the switch statement, and it should work. But once you get your system call working, you do need to follow the convention. Jeff is expecting you to follow the convention. So the easy way is just to get you started. Once you get your system call running, you can go back and follow the convention for how you really need to define a system call. Or you can do it that way from the beginning. It's a one-time process, and it basically puts all the syscalls under the kern/syscall folder. So this is the first part of our recitation today. Any questions? Yeah, you can put the function that handles the system call in syscall.c for now, and then add another branch to the switch statement. And it should go fine. But once you get it working, you need to follow the convention here. Yes. OK. So let's continue. Now, you still don't have the functionality. You don't have the tools ready to implement a syscall. There is a lot of missing stuff that you need to implement. What is it? So this is the process model. What you really need to care about for now is the parent process. You don't need to care about the child process for now. This is basically a user process. In the picture it has multiple threads. But please remember that in OS/161, every process has only one thread. They are single threaded. We don't have multithreaded processes. So I'm just showing this to give you the general idea. A user process has a thread, an address space, and a file table. In the file table, we should assign the first three indexes to the console, which is stdin, stdout, and stderr. Then whatever files you open, you put them into your file table starting from index three. And every file should have a file handle and a file object. The file object is actually the physical file on disk. We're going to go through this in more detail now.
So the process structure is defined for you in proc.h. You're going to add a lot more stuff here when you reach the process syscalls. But for now, what you have is the address space, the current working directory, and a spinlock. And you also have the proc.c file, which has some functions that are implemented for you, like proc_create and proc_destroy. You have them all here: proc_bootstrap, proc_create_runprogram, proc_destroy, proc_addthread, proc_remthread, and proc_getas. You really don't need to care about these for now. I'm just giving you a heads up for the next part. Now let's go through the file system support. We have three levels of indirection: the file table, the file handle, and the file object. A file table is a data structure whose entries point to file handles. Every index points to a file handle. A file handle in turn maps to a file object, and a file object maps to blocks on disk. What is a file descriptor? A file descriptor is an integer that is an index into the file table. Why do we need three levels of indirection? Because they have different sharing policies. The file table is private to each process. The file handle is again private to each process, but once you fork, it is shared between the process and its child process. The file object is shared system-wide. So if a file is opened by more than one process, there is only one file object for that file. We have some functionality that is implemented for you that you can use. It's similar to, for example, locks, where you had the spinlock that you used. Here, for read and write, we have a layer that is implemented for you, which is the vnode. The vnode is basically the physical file on disk. And there are some macros already given to you that you can use, for example VOP_READ, VOP_WRITE, and VOP_ISSEEKABLE. Let's go through the header file vnode.h, where you have these macros.
So what you really need to care about is VOP_READ, VOP_WRITE, and VOP_ISSEEKABLE. You're going to use them with read, write, and lseek. But read more about VOP_ISSEEKABLE, because previously it was VOP_TRYSEEK. I think the functionality did change, but I'm not sure about that. It basically tells you whether the user process can access some offset in the file: you provide it with an offset, and it tells you whether it's valid or not. If you look here at VOP_READ, it receives a vnode and a uio. You really need to go through uio to understand how you should read and write. It's a structure. So you have VOP_READ, VOP_WRITE, and VOP_ISSEEKABLE. You pass these arguments to the macros, and they do the rest. There is another layer that is built on top of the vnode, which is the virtual file system, the VFS. It provides a friendlier interface. And you can use these interfaces, vfs_open, vfs_close, vfs_chdir, and vfs_getcwd, to implement the corresponding file syscalls. So these are all implemented for you. Now you might have the question: all of this is given to us, so why do we need to re-implement it? Because there is a lot of stuff that is not done, like argument checking and exception handling. Those are still not done. And you still do not have the syscalls defined. So let's go through the VFS. Please go through the comments. They tell you exactly what you should expect from each of these functions. So you have vfs_open. It receives the path and the flags for open. I don't remember all of them, but the mode tells you whether we should open the file for reading, writing, or read-write. And then it gives you back the vnode for that file. So now what are your tasks for 2.1 and for the file system call part? That's not only 2.1.
So for 2.1, you need to design a file table, you need to design a file handle, and you need to initialize the console. Initializing the console means that you need a working write syscall. And whenever you implement a syscall, you need to check the arguments and you need to handle exceptions. Yes? Is the vnode the file object? Yes, that's the file object. Yeah, the vnode is the file object. And then for the rest of the assignment, you need to continue implementing the remaining file system calls. So for 2.1, again, you need the file table and you need the file handle. There are some ways around that, but you do need at least a little of the file handle. And then you need to initialize the console and implement the write syscall, for which you still need to do the argument checking and the exception handling. Yes? That's what we're going to go through now, yes. I'm going to explain it. So let's go through the file table design. What is a file table? A file table is a data structure that maps file descriptors to file handles. A file descriptor, as we said, is an integer; it's an index into the file table. So every file table index that is in use should have a file handle object. What are the requirements for your file table? It should be able to identify the next available file descriptor. So if you call open, it should first figure out which file descriptors are available, assign one of them to the open, and return it as the return value. Given a file descriptor, the file table should be able to retrieve the file handle at that index. So for read and write, you pass a file descriptor, and through the file descriptor, you retrieve the file handle. It should also be able to recycle file descriptors. So let's say we close a file: that file descriptor should be reusable to open another file. Then you have the file handle design.
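The three file table requirements just listed (find the next free descriptor, look up a handle by descriptor, recycle a descriptor on close) can be sketched with a fixed-size array. This is one possible design under stated assumptions, not the required one: every name here is hypothetical, `SKETCH_OPEN_MAX` is an arbitrary small limit, and the file handle is reduced to a placeholder struct.

```c
#include <stddef.h>

#define SKETCH_OPEN_MAX 16            /* arbitrary limit for the sketch */

/* Placeholder: a real file handle would carry a vnode, offset, flags, etc. */
struct sketch_filehandle {
    int placeholder;
};

/* One table per process: index = file descriptor, entry = file handle or NULL. */
struct sketch_filetable {
    struct sketch_filehandle *slots[SKETCH_OPEN_MAX];
};

void sketch_ft_init(struct sketch_filetable *ft)
{
    for (int fd = 0; fd < SKETCH_OPEN_MAX; fd++) {
        ft->slots[fd] = NULL;
    }
}

/* Install fh at the lowest free descriptor; return it, or -1 if the table is full. */
int sketch_ft_alloc(struct sketch_filetable *ft, struct sketch_filehandle *fh)
{
    for (int fd = 0; fd < SKETCH_OPEN_MAX; fd++) {
        if (ft->slots[fd] == NULL) {
            ft->slots[fd] = fh;
            return fd;
        }
    }
    return -1;
}

/* Look up the handle for fd; NULL means "bad file descriptor". */
struct sketch_filehandle *sketch_ft_get(const struct sketch_filetable *ft, int fd)
{
    if (fd < 0 || fd >= SKETCH_OPEN_MAX) {
        return NULL;
    }
    return ft->slots[fd];
}

/* Free the slot so the descriptor can be reused; returns the old handle. */
struct sketch_filehandle *sketch_ft_release(struct sketch_filetable *ft, int fd)
{
    struct sketch_filehandle *old = sketch_ft_get(ft, fd);

    if (old != NULL) {
        ft->slots[fd] = NULL;
    }
    return old;
}
```

With lowest-free allocation, installing the three console entries first places them at indexes 0, 1, and 2, and ordinary files start at 3, which matches the layout described in the talk.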
A file handle is basically a structure that contains a reference to the file object, the vnode. What are the requirements for a file handle? It should be able to tell where to read and where to write. From its name, it's a file handle: this is what's going to handle the file. So if a process has a file open, it points to a file handle. And through the file handle, you should be able to tell, if the process needs to read, where it should read from, or if it needs to write, where it should write to. And that's basically the offset. It should also be able to prevent invalid reads and writes. So through the file handle, you should be able to determine whether the user process has permission to read or write. It should also be able to determine whether it's safe to be destroyed. What does that mean? If we only have one process, then, as we said, the file handle is private to that process. But once that process forks, the file handle is shared between the parent process and the child process. So there should be a mechanism for you to keep track of that, because now the file handle is shared between the parent and the child. If, for example, the parent closes the file, you shouldn't destroy the file handle, because the child is still pointing to it; the file is still open for the child process. So there should be some kind of mechanism for you to figure out when it's safe to destroy the file handle. And basically, it is safe to destroy the file handle when the process that's trying to close it is the last process pointing to it. And it should also be synchronized. You should use your synchronization primitives: since the file handle could be shared between more than one process, you need to protect it. You shouldn't allow more than one thread, or process, to access it at the same time.
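The file handle requirements above (an offset, a way to reject invalid reads and writes, and a way to know when destruction is safe) map naturally onto a small struct with a reference count. The sketch below is one hypothetical layout, not the required design; all names are made up, the open flags come from the host's `<fcntl.h>`, and the lock that the talk asks for is only marked in comments because this standalone version is single threaded.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_filehandle {
    void   *object;     /* stands in for the vnode pointer */
    int64_t offset;     /* where the next read or write happens */
    int     flags;      /* how the file was opened (O_RDONLY, O_WRONLY, O_RDWR) */
    int     refcount;   /* how many file table entries point here */
    /* In a kernel, a lock protecting offset and refcount would live here. */
};

void sketch_fh_init(struct sketch_filehandle *fh, void *object, int flags)
{
    fh->object = object;
    fh->offset = 0;
    fh->flags = flags;
    fh->refcount = 1;   /* the opening process holds the first reference */
}

bool sketch_fh_can_read(const struct sketch_filehandle *fh)
{
    return (fh->flags & O_ACCMODE) != O_WRONLY;
}

bool sketch_fh_can_write(const struct sketch_filehandle *fh)
{
    return (fh->flags & O_ACCMODE) != O_RDONLY;
}

/* Called when another process starts sharing the handle, e.g. on fork. */
void sketch_fh_incref(struct sketch_filehandle *fh)
{
    assert(fh->refcount > 0);
    fh->refcount++;     /* a kernel would hold the handle's lock here */
}

/* Called on close; returns true when the caller was the last user. */
bool sketch_fh_decref(struct sketch_filehandle *fh)
{
    assert(fh->refcount > 0);
    fh->refcount--;     /* a kernel would hold the handle's lock here */
    return fh->refcount == 0;
}
```

In the fork scenario from the talk, the parent's close makes `sketch_fh_decref` return false, and only the child's later close returns true, at which point destroying the handle is safe.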
So we're going to use process and thread interchangeably here in OS/161 because, as we said, our processes are single threaded. Now let's go to initializing the console. The console is basically a special file that doesn't have the concept of an offset. What does that mean? Think of it as a file that you keep appending to: whenever you write, you append to the end of the file, and whenever you read, you just get the user input. And it has fixed file descriptors. Index 0 in the file table should go to standard input, index 1 to standard output, and index 2 to standard error. These three indexes in every file table should be reserved for the console. How should we initialize it? We use vfs_open. This is the virtual file system layer that is built on top of the vnode. And as we said, for open, close, and the other file syscalls that I mentioned, you need to use the VFS. For read and write, you use the vnode. So to open the console, you need to pass that string as the file name. And remember that you need to set the flags correctly; for example, stdout should be write only. So when should we initialize the console? The console should be initialized only once. You do it only one time. There's no need to open it for each process. You only open it for the first user process that is created, which is init. And once it's created, you keep forking from there. That's one of the functionalities of fork that you will need to implement: if you fork a process, you should copy the file table along with the pointers to the console. So you only initialize it once, and then it is inherited as you fork. And as general information, the console file abstraction is only used by user processes. The kernel doesn't need the console file to print to the screen. It just uses kprintf. So this is how you should initialize the console. Any questions up to now?
So now, as we said, we have a process. You have one thread. You have a thread and an address space. What you need to care about is the file table. The file table is private to every process. The file handle could be shared, as you can see here. This is the child process, after the parent process forked. At this point, the file handle for any open file is shared between the parent and the child process. And the file handle points to the file object. And as we said, if the file is opened by more than one process, there is only one file object handling that file. So the file object is shared system-wide, while the file handle is private to the process as long as it hasn't forked; once it forks, the handle is shared between the parent and the child process. And as you can see, there is no file handle for the console. Why? Because, as we said, there is no concept of offset that you need. There is nothing we need to keep track of for the console file. So you have the process structure here. Then you need to use the vnode mechanism that is already implemented for you for read, write, and lseek. And you can use the VFS for the open, close, change directory, and get current directory syscalls. What you really need to do is design a file table, design a file handle, and then initialize the console. Then you need to implement the write syscall. I would advise you to also consider starting from open and close, and then going on to write. That would give you a better understanding of how to implement write. That's what we used to do last year. But the problem was that students didn't start assignment two until the last week, and they didn't realize how tough the assignment is. So that's why we added 2.1, to get you started. So it's a good idea to go through the file syscalls in this order: open, close, then read and write. But what you need for 2.1 is only the write syscall.
So you design the file table and the file handle, initialize the console, and implement the write syscall. This is for 2.1. Once you're done with 2.1, you start implementing the rest of the file system calls. You have the vnode and VFS already implemented for you that you can use. What you need to do is argument checking and exception handling, along with some other stuff. But these are the main things that you really need to consider while you implement the file syscalls. As we said, there's the file table design, the data structure that maps to file handles. A file handle maps to a file object. And then you have initializing the console. Now for the file syscall write, what you need to do is basically check all the arguments that are coming in. You need to check if the file descriptor is valid. Then you need to check if the buffer pointer is valid. How do you do this? Through copyin and copyout. You pass the pointers to copyin and copyout, and copyin and copyout are going to handle that, for example if a null pointer was passed in. Can the user write to the file? This is one of the things that you need to figure out in the write syscall. And how should you write to a file? Read and understand this header file. It's very important because the uio is one of the arguments that you need to pass to the vnode. There are a lot of comments there that you really need to understand. Maybe not every detail, but you should have an idea, because you cannot implement the syscall without understanding what's going on with the uio. The last thing I need to say here: we used to grade the design documents, but we don't do it anymore. It's really important that you do have a design document. We don't grade it because we believe the people who really passed assignment 2 did have a design document. Without a design document, it's really difficult for you to proceed with assignments 2 and 3.
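The argument checks for write listed above can be sketched as a user-space simulation. Everything here is an assumption made for illustration: `sim_write`, `sim_copyin`, `struct sim_file`, `fd_table`, and the `ACC_*` constants are hypothetical stand-ins, `sim_copyin` only catches a NULL pointer (the kernel's copyin does real user-address checking), and the uio and vnode write path is replaced by a plain `memcpy` into an in-memory buffer. The sketch returns 0 on success or an errno value on failure, and reports the byte count through `retval`.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define MAX_FILES  16    /* hypothetical table size */
#define ACC_RDONLY 0     /* stand-ins for the real open flags */
#define ACC_WRONLY 1
#define ACC_RDWR   2
#define SINK_SIZE  256   /* capacity of the simulated file */

/* Simulated open file: access mode, offset, and in-memory contents. */
struct sim_file {
    int flags;
    size_t offset;
    char data[SINK_SIZE];
};

/* Simulated file table of the current process. */
static struct sim_file *fd_table[MAX_FILES];

/* Stand-in for copyin: here it only rejects a NULL user pointer. */
static int sim_copyin(const void *usersrc, void *dest, size_t len)
{
    if (usersrc == NULL) {
        return EFAULT;
    }
    memcpy(dest, usersrc, len);
    return 0;
}

int sim_write(int fd, const void *ubuf, size_t nbytes, size_t *retval)
{
    char kbuf[SINK_SIZE];
    struct sim_file *f;
    int err;

    assert(retval != NULL);

    /* 1. Is the file descriptor valid and open? */
    if (fd < 0 || fd >= MAX_FILES || fd_table[fd] == NULL) {
        return EBADF;
    }
    f = fd_table[fd];

    /* 2. Can the user write to this file? */
    if (f->flags == ACC_RDONLY) {
        return EBADF;
    }

    /* Clamp to the remaining space, which gives a short write. */
    if (nbytes > SINK_SIZE - f->offset) {
        nbytes = SINK_SIZE - f->offset;
    }

    /* 3. Is the buffer pointer valid? Copy it into kernel memory. */
    err = sim_copyin(ubuf, kbuf, nbytes);
    if (err != 0) {
        return err;
    }

    /* 4. Do the write and advance the offset (uio/vnode in the kernel). */
    memcpy(f->data + f->offset, kbuf, nbytes);
    f->offset += nbytes;
    *retval = nbytes;
    return 0;
}
```

The order of the checks matters in this sketch: the descriptor and access mode are validated before the user buffer is touched, so a bad descriptor is reported as `EBADF` even when the buffer is also bad.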
What you should have is basically a two-page document that gives you a visual idea of what the file table design should look like and what the file handle design should look like. Process table design and process structure design, that's for later on, not for now. Synchronization: how you're going to synchronize the shared stuff. And you should describe each syscall and how you're going to go about implementing it. And then exception handling. And you can break down the tasks between you and your partner. So it's really important for you to have a design document. Please start with the design document. Very few people passed assignment 2 without one. And that's why we stopped grading it, because grading it doesn't make sense. I mean, you really do need it. So people who listened to this advice did pass assignment 2 and assignment 3. But if you don't do it, you're going to get confused. Yeah, that's all I have for today. Thanks for coming. Please let us know if you have any questions now or through the office hours. Thank you.