Welcome to the final recitation. We're going to do the final exam review. It's going to be mostly similar to the midterm exam review, but customized for the final exam. So again, we're going to go over the location, day, and time, and then the final exam material, format, and tips. We will also review last year's exam, I believe, after we're done with the slides. The final exam will be next Monday, May 15, from 3:30 to 6:30, so a three-hour exam. The location will be the same as the lecture room, but please double-check it in your student portal, and also keep an eye on the Discourse forum for exam announcements. The procedure will be similar to the midterm: there will be seating charts, and it will be almost the same as the midterm procedure. For the final exam material, I've divided it into midterm topics, post-midterm topics, and then papers and some additional topics that were discussed in class. Going quickly over the midterm topics again, starting with the process abstraction: you learned that a process organizes information about other abstractions and is not tied to a hardware component. We use abstraction to separate policy, the what, from mechanism, the how. And why do we use abstraction? To hide undesirable properties, to provide or add new features, and to organize information. You should also now have a clear view that threads abstract the CPU, address spaces abstract memory, and files abstract disks. We've also presented the process model: a process can have one or more threads, and it has an address space and a file table that is private to it. We've also said that the file handle system is divided into three levels. You have the file table, which is private to the process and has references to the file handles. File handles are also private to a process, but once you fork, they will be shared between the parent and child.
And then you have the file objects, which are shared system-wide. File handles have references to the file objects, and file objects map to blocks on disk. We've also presented the process system calls; after completing the programming assignment, you should have a good idea of what each of these calls does. We then discussed synchronization. By now you should know what a critical section is, why we need to protect a critical section, and how we protect it. You've been presented with, and have also implemented, various synchronization primitives: sleep locks, condition variables, reader-writer locks. So you should have a good understanding of these and how to implement them. You should also have an idea of deadlocks, starvation, and race conditions: the definition of each, the differences between them, and how to avoid them. Also, you should have a good idea of the classic synchronization problems in case you're asked about them, so you won't waste time just figuring out what the problem is. Then we moved into interrupt and exception handling. You should know the types of interrupts, and you should know how to differentiate between an interrupt and an exception. How should we handle an interrupt? You've also discussed context switching; you should have a good idea of the process of a context switch. For scheduling, you should know why we need scheduling, which is to multiplex the CPU. And you should have a good idea of the thread states and the transitions from one state to another, and what each transition means. Also, you should have a good idea of the scheduling algorithms: round robin, MLFQ, RSDL. You should be able to compare them, improve them, and talk about the advantages and disadvantages of each. Then we moved into memory management, and that's where we discussed virtual and physical addresses.
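To make the synchronization-primitive review concrete, here is a minimal sleep-lock sketch built on top of a condition variable. This is an illustration only: the class name and structure are my own, not the OS/161 API, and Python's threading module stands in for the kernel's primitives. The key idea is that a waiter sleeps instead of spinning until the lock is free.

```python
import threading

# A sleep lock sketch built on a condition variable (illustrative, not OS/161's API).
class SleepLock:
    def __init__(self):
        self._cv = threading.Condition()
        self._held = False

    def acquire(self):
        with self._cv:
            # Sleep (rather than spin) until the lock becomes free.
            while self._held:
                self._cv.wait()
            self._held = True

    def release(self):
        with self._cv:
            self._held = False
            self._cv.notify()  # wake one sleeping waiter

# Using it to protect a critical section: several threads bump a shared counter.
lock = SleepLock()
counter = [0]

def worker():
    for _ in range(1000):
        lock.acquire()
        counter[0] += 1  # critical section
        lock.release()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Note the `while` (not `if`) around the wait: a woken thread must re-check the condition, which is exactly the kind of detail an exam question about implementing these primitives tends to probe.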
So you should know the process of translating virtual to physical addresses: how we divide up the virtual address, and how we retrieve the physical address for a given virtual address. What is a page table? What is a core map? What does the core map keep track of? What does the page table keep track of? Why do we need them? Also, the different implementations we've discussed for page tables, flat arrays, linked lists, multi-level arrays: you should know how these are implemented and the advantages and disadvantages of each. Then we moved into the TLB. We've said that the TLB is used by the MMU, which is part of the CPU, to speed up virtual-to-physical address translation. Also, memory faults: page faults, TLB faults, the difference between them, and which one triggers the other. A TLB fault doesn't necessarily trigger a page fault, but a page fault must be preceded by a TLB fault. Then we discussed swapping, and we also discussed some algorithms for choosing the page that we need to evict or swap out. You should have a good idea of these algorithms and the advantages and disadvantages of each. Those are the midterm topics that we covered. Then we moved into the post-midterm topics. We discussed disks: you should know what an HDD and an SSD are, the pros and cons of each, and the parts of the HDD, with the advantages and disadvantages of each. Then we moved into files. What is a file? What do we expect from a file and a file structure? You should have an idea of all of this. For example, where and how should we store the metadata about a file? There are different ways to do that; you should know all of these ways and their advantages and disadvantages, and be able to compare and improve them. Then we moved into file systems. You should know the expectations of a file system: why do we need a file system, and what are the design goals for a file system?
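Since page table translation is called out below as a question you shouldn't lose points on, here is the arithmetic in miniature. The numbers (4 KB pages, a flat dictionary as the page table, and the particular mappings) are all made up for illustration: split the virtual address into a virtual page number and an offset, look up the frame, and recombine.

```python
# Toy virtual-to-physical translation with 4 KB pages (12 offset bits).
PAGE_SIZE = 4096
OFFSET_BITS = 12

# A flat page table: virtual page number (VPN) -> physical frame number.
# These mappings are invented for the example.
page_table = {0x2: 0xA1, 0x3: 0xB7}

def translate(vaddr):
    vpn = vaddr >> OFFSET_BITS          # high bits index the page table
    offset = vaddr & (PAGE_SIZE - 1)    # low bits pass through unchanged
    if vpn not in page_table:
        raise KeyError("page fault: VPN 0x%x not mapped" % vpn)
    return (page_table[vpn] << OFFSET_BITS) | offset
```

So virtual address 0x2ABC has VPN 0x2 and offset 0xABC; with frame 0xA1 that gives physical address 0xA1ABC. An unmapped VPN is the page-fault case.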
And what are the data structures that were presented? We presented several data structures for organizing data blocks; you should know these and be able to compare them. Caching and consistency: we've discussed the buffer cache. You should know what a buffer cache is, and where we should put the buffer cache and why. There are several scenarios for that, so you should be able to compare them. Then we moved into journaling file systems. You should know what journaling means, how a journaling file system works, and the different crash scenarios and how it recovers from each. We've also discussed the Berkeley Fast File System and the log-structured file system. For each of these, you should have an idea of how it works and its advantages and disadvantages. There are several other subtopics; I couldn't cover everything here, but you should go over the slides, since the final exam is cumulative: everything discussed from day one up until the last day of classes is included in the final. Then we also presented some papers and additional topics. For the RAID paper, you should have a good understanding of what RAID is and the different RAID levels, and you should be able to compare them. Then we moved into virtualization: full virtualization, paravirtualization, and container virtualization. Again, you should have a good idea of all of these and be able to compare them. To what extent should you know them? To the extent discussed in class. Then we moved into performance and benchmarking. We also discussed Amdahl's Law and the paper "Hints for Computer System Design". So those are the topics that we covered throughout the semester. There is some material you really shouldn't miss. For example, page table translation: questions about page tables and how to translate addresses.
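The crash-and-recovery behavior of journaling mentioned above can be sketched in a few lines. This is a toy model of the idea, not any particular file system's on-disk format: an update is first recorded and committed in the journal, and only then applied in place; recovery replays committed records and ignores uncommitted ones.

```python
# Write-ahead journaling in miniature (a toy model, not a real on-disk format).
journal = []  # list of {"blockno", "data", "committed"} records
disk = {}     # the "real" data blocks

def journaled_write(blockno, data):
    rec = {"blockno": blockno, "data": data, "committed": False}
    journal.append(rec)       # 1. intent reaches the journal first
    rec["committed"] = True   # 2. commit record reaches the journal
    disk[blockno] = data      # 3. only then update the block in place

def recover():
    # After a crash: replay committed records, discard uncommitted ones.
    for rec in journal:
        if rec["committed"]:
            disk[rec["blockno"]] = rec["data"]
```

A crash between steps 2 and 3 is the interesting scenario: the in-place write never happened, but recovery can redo it from the journal, so the file system never exposes a half-applied update.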
If this comes up on the exam, you should know how to do it; you really shouldn't lose points on such questions. RAID types: make sure you don't miss those. Also, the hints for computer system design: make sure you understand and know a couple of them, and that you can compare and discuss them. And also the HDD parts, the disk parts: don't miss out on those. We did have a question in previous exams that came with a figure of the disk, asking you about the parts of the disk. You shouldn't really lose points on such questions. So, any questions on the material covered throughout the semester? OK. Comments, anything? Well, I guess we covered a lot of the material. What I'm trying to do is also bounce through some of those bullet points we were talking about. So possibly we'll just go back to the other slide. Yeah, like one or three. Oh, OK. Obviously, this is an incomplete list, as Ali said. Oh, the mic. Sorry. A few people are hearing me, but the people out in video land are not. So as I was just mentioning, Ali's list up there, as he mentioned, doesn't cover every last detail. But what I would say is, in terms of exam preparation, obviously you want to make sure that you are familiar with the specifics. Those are going to show up in some of the short answer questions. I would expect that the short answer questions, as with the essays, will disproportionately weight the last part of the course. But again, everything from January on is fair game. In terms of what is on the short answer questions, I don't want to say it's rote memorization, but it's the sort of thing like what Ali was saying: keep track of the details.
You want to make sure you know that a virtual address is not the same thing as a physical address, yada, yada. Besides that, a couple of comments with respect to the essay questions. I assume that Jeff is going to be addressing this in much more detail in tomorrow's wrap-up class. But he is fond of designing at least one or two of the essays, especially the 25-point questions, to try to cover lots of topics all at once. If you look at last year's exam, which we'll be paging through as an example in a couple of minutes, there was one question, the 20-point question, which was on one specific area, virtualization. But the other two questions really did cut a wide swath across the entire course. And that's something that he's actually going to be expecting. So it's not like, well, what should I study? Maybe scheduling, because is there going to be an essay question on scheduling? There conceivably could be. I'm just saying, if you look at the format of these essay questions in the past, there will probably be an essay question that involves scheduling, but it will also involve other subsystems, too. So you need to know how the subsystems interplay. It's not going to be one particular question on this topic or that; it's going to be a combination of them all. So I was going to dive into last year's exam again, but before we go on, are there any questions about, in particular, some of the lessons that I covered? Processes, synchronization, interrupts? Remember, in the kernel, everything begins with an interrupt. We talked about how you designed a process subsystem that had an interplay, if you will, a tie-in with the threading subsystem. Again, you should have a good idea of the difference between a thread and a process.
In terms of scheduling: again, Ali mentioned the various types of schedulers. Take a clue from the midterm in terms of how Jeff asked this: yes, you need to be familiar with a particular type of scheduler, but then go a little bit beyond that. What's a weakness in the scheduler, and how can we possibly address that weakness? That's more what the grading staff is going to be looking for. In terms of paging, swapping, the whole memory management system: questions about that, anything else? Going once, going twice, sold to the person, what have you, and then maybe the next one. Disks and files: again, this is post-midterm. Know the metadata and where everything starts. Remember, there's going to be the anchor block, and then you've got these inodes, and then there are obviously different ways of skinning the cat. Remember, the genesis of file systems is to provide persistent storage, but we're also dealing with legacy mechanical hardware. That's where a lot of these design decisions came in: remember, we try to scatter the inodes throughout the disk in strategic locations. If at all possible, we're trying to minimize our movement, in terms of, let's say, where do we store data? Well, we've got the actual data, we've got the metadata, and how much metadata do we actually store? And there are other things; remember, Jeff talks about how there are typically levels of inodes. In other words, it depends on how big the file is: if we have a really small file that's only 50 bytes, maybe just have one reference to it and be done, as opposed to a really big file, which is probably going to have several layers to the tree, and don't create them unless you actually have to have them.
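Those "levels of inodes" can be made concrete by asking which pointer level covers a given byte offset in a file. The numbers below (512-byte blocks, 12 direct pointers, 128 pointers per indirect block) are classic-UNIX-flavored values chosen for illustration, not any particular file system's constants.

```python
# Which level of the inode tree covers a given byte offset? (Illustrative sizes.)
BLOCK = 512          # bytes per data block
NDIRECT = 12         # direct pointers in the inode
PER_INDIRECT = 128   # pointers per indirect block

def pointer_level(offset):
    blk = offset // BLOCK
    if blk < NDIRECT:
        return "direct"
    blk -= NDIRECT
    if blk < PER_INDIRECT:
        return "single indirect"
    blk -= PER_INDIRECT
    if blk < PER_INDIRECT ** 2:
        return "double indirect"
    return "triple indirect"
```

So the 50-byte file from the discussion above sits entirely in a direct pointer, while only big files ever pay for the extra layers of the tree, which is exactly the "don't create them unless you have to" point.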
And then, I would say a good essay-type question might be to compare and contrast something like the log-structured file system with a standard UNIX-type file system. In other words, why have inodes in the first place? Remember Jeff's lecture: with a log-structured file system, we just write everything sequentially, and then we've got the trade-off with the cleaning process, and there are pros and cons. So it's not simply a discussion of one file system; you might be asked under what type of use scenarios something like a log-structured file system would be better, as opposed to, let's say, a more traditional file system involving inodes. And when I say logging here, let me be clear: we're not talking about journaling in the sense of safety. Is everyone clear on that distinction? Logging, in the sense of the log-structured file system, as opposed to journaling, which is a type of logging. When we're talking about journaling, what we're talking about is essentially the safety of the file system itself in the case of a power cut or other types of corruption. And then I think that should be about it. And then, as Ali mentioned, the research papers too. That's another thing: if you look at previous exams, I would say, dollars to donuts, at least one of the essay questions is going to involve one of the published papers. And even something like the virtualization question last year, where you're thinking, well, wait a minute, we just talked about that in class. In order to really understand the third component of virtualization, which is paravirtualization, you needed the paper: it was actually buried in the discussion about how you can virtualize on a smartphone. That's actually where the discussion of paravirtualization took place. So I would say, certainly at some level, you want to make sure that you have a handle on the research papers that were discussed.
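The log-structured idea above, write everything sequentially and track where the latest copy lives, fits in a few lines. This is a deliberately tiny sketch of the concept under my own invented names, ignoring segments, checkpoints, and the real cleaner:

```python
# Log-structured file system in miniature (a concept sketch, not LFS's real format).
log = []      # the "disk", written strictly sequentially (append-only)
latest = {}   # (file, blockno) -> index of the newest copy in the log

def lfs_write(file, blockno, data):
    log.append(data)                      # every write is an append
    latest[(file, blockno)] = len(log) - 1

def lfs_read(file, blockno):
    return log[latest[(file, blockno)]]   # indirection map finds the newest copy
```

Overwriting a block leaves the old copy behind as garbage in the log; that stale space is what the cleaning process has to reclaim, which is the trade-off against the sequential-write advantage.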
I would say there's a better than even chance that something's going to show up from those, just knowing history. So, questions, comments? Absolutely nothing. Well, what we were planning to do is wrap things up by taking a quick peek through last year's exam: what you would have taken one year ago, essentially at this time, May 9th, but 2016 instead of 2017, and how you could actually approach it. Again, like we mentioned in the lecture that I covered yesterday, it's like the midterm: in other words, 10 multiple choice questions, plus a bunch of short answer questions, plus a 20-point essay, but Jeff's also going to throw on there two 25-point essays. Oh, did I jump ahead on that? Yeah, sorry about that. Exactly, so that's, yes, okay. Like I mentioned, I don't think that time is going to be the issue like it was for the midterm. I would just encourage everyone here, and everyone again in video land, to please take advantage of that extra time. I know you want to get out, just cut and run for the summer, but again, this is where the majority of the points can be garnered, and it is typically, certainly for this year's midterm and previous years' classes, what separates the sheep from the goats, so to speak: the essays. They require a little bit of thought and planning, and spilling of ink too. So don't rush into writing them, and don't rush getting through them, because again, that's where you'll typically see the greatest disparity in scores. In terms of where you can actually save time and effort, it's going to be the short answer questions. I don't know if you want to chime in on this, but the short answer questions, from my experience: people actually tend to write too much, to the extent that people miss points on the short answer questions.
It really doesn't matter how much they write; I can almost tell from reading the first sentence, sentence and a half or so, whether this person is going to get pretty much all credit, or is halfway off the boat, or is completely out to sea. So the short answer questions really need only a few sentences at most. If you find yourself filling up half a page, that really is too much. That's somewhere you can save time and get through very quickly. In terms of the essays, though, it's not even so much the length, but the fact that you really do need to think through and plan what's going on. So let's take a look at, okay, I'm pulling up the final exam for last year. Oh, sorry, okay. Yeah, I'm coming, yeah. You want to? Yeah, yeah. All right, you want me to jump in again? That's okay. So let's move into the final exam format. This is based on last year's exam. You're going to have, again, something similar to the midterm, in addition to two long answer questions. The total will be out of 100 points. You're going to have 10 multiple choice questions, worth one point each. These are drawn directly from the second-half lecture slides, and they should be easy. All of this information you can find on the first page of last year's final exam. Then you're going to have six short answer questions. We will give you credit for the best four, and each is worth five points, for a total of 20 points. Please answer them in four or five sentences; it shouldn't take longer than that. These are mostly drawn from second-half material, but not entirely, so some of them could cover the midterm topics. Then you're going to have one medium answer question. This is equivalent to the long answer question on the midterm. It's worth 20 points, it's drawn from the second-half material, and you should answer it in a page or two.
And then you're gonna have two long answer questions that are really long. Each is worth 25 points, for a total of 50 points. Both of them are required; it's not like the midterm, where you needed to choose one. So please answer both of them. They integrate material from the entire semester, and your answer should extend to several pages, two or more. So: 10 multiple choice questions, six short answer questions, one medium answer question, and two long answer questions. All are required. You're going to have three hours, so you shouldn't have problems with time; it's not like the midterm's 50 minutes. Use your time wisely. As for tips on how to prepare: again, study the lecture slides, then link them to what you learned in your programming assignments and your recitation material, then start solving previous exams. Once you do that, you should be fine. How to answer? Please write your answers clearly. If you want to maximize the number of points you get, make sure we can read your handwriting and understand what you're answering. The better we understand your answer, the more points you can get. Draw figures and diagrams if they help; be concise and organized; use bullets if needed. How to allocate your time? The points assigned to each question should tell you how much time you need. A long answer question is worth 25 points, so that means about 25 minutes, around half an hour. For the medium and long answer questions, know what is being asked. Please read the question completely. When we graded the midterm, we saw that many students didn't really read the question. For example, on the question that asked about the kernel privileges for multiplexing memory, many students didn't notice that what we wanted was the special privileges that the kernel needs for memory multiplexing, and not the general kernel privileges.
Also, for the long answer question that asked about predicting wait times, many students discussed wait times other than the one the question was clearly asking about, which is when a thread is put on the waiting queue. So please read the question at least twice. Read it completely; you do have time. Then start answering, and link it to what you've learned throughout the semester. That should be all from me. If you have any tips, you can take it from here. It's last year's exam, so 2016. Yep. I'm just going to jump right into the short answer questions. Again, take just a few minutes to bang through the multiple choice; just make sure that you haven't made a dumb mistake on those. The other thing I can suggest is that you're probably going to have ample time to do all of the short answer questions. I know the majority of people probably did more than what was required on the midterm, but for the final, there really is no good reason for not taking a whack at all of them. Then you'll have enough time to go back and, with a clear head, decide which of them you actually want to submit for grading. Actually, you know what? We pick the highest. That is correct. So you don't even have to select; just write down something for all of them. It does not hurt you, because I'm going to assume that time should not be an issue. So we begin with the first short answer question here, which is question number two, and we're talking about RAID. Recall RAID level one: we've got an array here, and remember what RAID one is? That's the mirroring one: we've got two disks that essentially duplicate each other. So obviously you want to make sure you know the difference between RAID one, RAID zero, and these exotic things like RAID five and RAID six. So we're going to start off with something like this.
And it's not so much that Jeff wants you to regurgitate what it is; he dives a little bit deeper. Why is there asymmetric performance to be expected from RAID one? You're thinking, hey, wait a minute, RAID one is not the one we typically pick for performance. That is true; that's RAID zero. But what he's looking for is: do you know that with RAID one we've got two disks that contain duplicate information? So why can we expect to see a jump in read performance? Think about it like this: since we have two completely duplicate copies, we can kind of mimic RAID zero in terms of reading. In other words, read a block from one disk and a block from the other disk almost at the same time. We can also take advantage of the fact that we have two different disks with two different disk arms, so depending on where a file is, we can look ahead and maybe optimize. So when it comes to reading, we have the best of both worlds: we have two complete copies, and we can interleave reads to speed up the reading process. Obviously, that's not going to be the case for writing, and that's why we see this performance disparity between reads and writes. That's the sort of thing we would expect you to put down for this. So, questions on RAID? Does it make sense? Okay, moving right along. Question number three, and look at that: the hints paper that we talked about yesterday. In this case, it's showing up in a short answer question. Jeff is essentially asking you to list off a bunch of the hints: the ones we talked about yesterday, or, if you took a look through the paper, you can pick beyond that. So what are some of the hints that we can actually talk about? Well, Jeff has some listed there, but again, you can go beyond that.
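That read/write asymmetry can be sketched in a few lines. This is a toy model under my own invented names, not a real RAID implementation: every write must hit both mirrors, while reads can be spread across them, which is where the near-RAID-zero read behavior comes from.

```python
# RAID 1 (mirroring) sketch: two duplicate "disks" modeled as block lists.
class Raid1:
    def __init__(self, nblocks):
        self.disks = [[None] * nblocks, [None] * nblocks]

    def write(self, blockno, data):
        # Every write must update BOTH mirrors -- no write speedup.
        for d in self.disks:
            d[blockno] = data

    def read(self, blockno):
        # Reads can alternate between mirrors (both copies are identical),
        # roughly doubling read bandwidth on a real array.
        return self.disks[blockno % 2][blockno]

array = Raid1(4)
for i in range(4):
    array.write(i, "data%d" % i)
```

A real array would also use the two independent disk arms to pick whichever mirror is closer to the requested block, which is the seek-time optimization mentioned above.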
Well, we're talking about, remember, separating the normal and the worst case; we spent a little bit of time discussing that yesterday. But beyond that, watch it: read the question here. It's one point per hint, and then he's also looking for one point per explanation. So we could get up to six points, capped at five, but you can keep going. So we might take an example of, let's say, separating out the normal and worst case. I gave one specific example yesterday vis-à-vis assignment three: the TLB shootdown. Remember that in most cases you really don't have to do a TLB shootdown, so optimize your code path so that in most cases you don't deal with it, and then separate out the worst case, which is just going to take some time. Plan to throw one away: you could even say an example of that is, well, my original design for assignment three, but not to be cynical. What he's trying to say is, don't worry about starting completely over, because very often you need to let ideas percolate in your brain before you can come up with a workable design. Use brute force; do things well; compute in the background. Background: that's not something you had to do yourself in terms of programming, but in a real operating system you could spin off a worker thread. Or, actually, if you were toying around with, let's say, kicking off a paging daemon, that's certainly an example of doing work in the background to try to optimize what's going on. Any of these: the grading staff is going to be flexible as long as you stick to the design of the question. So if you're thinking, well, wait a minute, how much flexibility do we give?
Read the verbiage of the question; as long as you're fulfilling that, you're good to go. Questions about this? I don't want to say it's rote memorization, but this is about as close as it gets: just knock off a bunch of the hints and give an example of each. And again, that's the sort of thing that takes probably five or six sentences tops: hint, example, hint, example, done. Okay, question number four, Amdahl's Law. Remember our discussion of Amdahl's Law: essentially, if you work on improving one part of the system, the overall performance is typically going to be limited by the rest of the system. Remember back to 341 on this. My strong suspicion, because you can see most people did fairly well on this question, is that the people who did not do well missed it not because they didn't know Amdahl's Law, but because of what the question is actually asking. It's not asking you to state Amdahl's Law, but a particular corollary to it. So this is one of those cases where it's not that we're being picky; just do read the question and make sure you answer what it's asking for. In this case, the corollary says that the longer you work on improving a particular subsystem, the more you hit diminishing returns: further work has less and less effect on overall system performance. That's what we'd be looking for. And then, to go beyond that, take a look at how this would guide you in the future. You can say, well, if I'm really trying to tune the memory subsystem of my kernel, after a while I need to step back and see: is that still the biggest problem? Maybe it is, in which case I need to keep working on it, but maybe not.
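The diminishing-returns corollary just described can be checked with quick arithmetic. If a fraction f of total time is spent in the part you speed up by a factor s, Amdahl's Law gives the overall speedup; the fractions below are made-up numbers for illustration.

```python
# Amdahl's Law: overall speedup when a fraction f of the time is
# accelerated by a factor s. The (1 - f) term is the untouched rest
# of the system, which caps the total benefit at 1 / (1 - f).
def overall_speedup(f, s):
    return 1.0 / ((1.0 - f) + f / s)
```

With f = 0.4, doubling that subsystem's speed (s = 2) gives a 1.25x overall speedup, but even an enormous s = 1000 cannot reach the 1/(1 - 0.4) = 1.67x ceiling: each extra factor of improvement buys less, which is exactly when you should step back and ask whether this subsystem is still the biggest problem.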
Maybe I now need to work on something else, because I've exposed something else as being my biggest problem. So again, this question could probably be answered in two or three sentences, if that. You don't need to list all the equations, and you don't need to go into lots of deep examples. Those would be nice, but the grading staff is simply going to ask: do you know what the corollary is, and how does it apply to your tuning and development of a system? Done. Question number five. Okay, looking at the grades for this, it looks like this was a tougher question. But again, you might as well take a crack at it, because we're going to pick the highest; actually, the grading program is going to scan over the grades that were assigned and automatically pick whatever the highest scores are. The question: what's the difference between placing the buffer cache above the file system or below it? If you recall, this was the subject of a couple of specific lecture slides, so it's kind of specific. But even if, let's say, you played hooky for that particular lesson, or you just don't remember that slide, you can probably reason this question out by asking: what were the pros and cons? So, where can we actually put the buffer cache, above the file system or below it, and what impact does that have? Because, number one, what interface must the cache support at each level? What's cached? And how does it affect the overall system? So let's take a look at these two placements. If we put the buffer cache above the file system, what are we intercepting? In terms of OS/161, remember, the file system was given to you.
That was that whole VFS layer that everyone was haranguing you about: don't touch it, it works, it's a black box. You're intercepting calls down into it from, typically, the process side, like a syscall or something like that. So you're caching, in effect, the opens, reads, writes, and closes. That's the interface we're dealing with: a quasi-syscall interface, and that's where we're putting the cache, the buffer cache. As opposed to putting the buffer cache below the file system: well, what does the VFS layer call down to? I talked about this in one recitation, probably about two months ago. Remember, the VFS layer calls down to the inode layer and eventually down to the device drivers. We're talking about accessing the hardware, typically at the block level. So that's the interface, if you will, that the buffer cache would be dealing with if we put it below the file system: a block to be read or a block to be written. There are pros and cons of each. And what do we want from this? We also need to know what's cached in each case: is it, let's say, data in terms of files, or data in terms of blocks? And how does that impact how we do things? Again, there are pros and cons. If we cache things at the file level, we can get an idea of the file access patterns. In other words, if file X is read, we might notice, say via machine learning, that file Y tends to be read pretty soon after, so we might be able to do some prefetching or what have you. So we can study these file access patterns.
That's something we wouldn't be able to understand with blocks, because with blocks we really don't know what they represent; remember, file X could be stored in one location on disk, but then maybe somewhere else later on. So that definitely would be an advantage of having the buffer cache above the file system. But remember, Jeff also talked about this: if you put the buffer cache below the file system, what are we caching? We're essentially caching all I/O. And that's important, because the data that gets returned at the file system level is essentially just that: it's data. We're not really intercepting or dealing with metadata. If we put the buffer cache below the file system, we're also gonna be able to cache things like inodes, and directories, and the free-space bitmap, or whatever it happens to be. We can cache that and speed that up too, which we would not be able to do if we had it above the file system. There's also another thing, kind of bonus points, if you noticed that by putting the cache below the file system, you also don't have to worry as much about consistency. Because at the file level, the user may think, you know what, I've, let's say, done a call to fsync and I think all the stuff is written out. Well, you know what, that may or may not be the case. Whereas if we, let's say, put it below, we're actually gonna be able to eliminate some of these potential consistency issues. So again, this is a question that, my guess is, requires a little bit more thought, and that's probably why the scores on this particular question were a little bit lower. It is a somewhat detailed question; it picks from a particular part of a lecture. But on the other hand, even if you didn't remember those slides, you could probably sit down, puzzle it out, and get a lot of points. Because, well, what is it?
What is it that the file system expects, and what is it that the file system needs from the lower levels? From that, you can probably take a pretty good guess at what the graders are gonna be looking for. Moving along, question six. And actually, whoops, you know what, in the interest of time, since unfortunately I don't have three hours here, and I think I would bore you to tears if we did that today, let's skip ahead to, let's say, question number eight, virtualization. And here, essentially, what you're gonna be asked is: let's take a look at the three types of virtualization that are out there. Obviously, you need to know what they are. But what we also wanna know is, okay, what are the three types? How do they work, and what are the mechanics of making them work? What's virtualized; in other words, what's the interface that we're dealing with in terms of virtualization? And what are, essentially, the pros and cons? Again, it's stated like: what are the challenges to this virtualization approach? I can tell you, when you take your exam next week, most of the essay questions are going to involve: what are the pros of this approach, and the cons of this approach, as opposed to the pros and cons of this other approach? So if you haven't discussed the pros and cons of multiple approaches, you're probably missing something in your answer. It's just the way these essay questions tend to be structured. So let's take a look at these. First one: remember, full virtualization. And what are we, essentially, virtualizing? An entire operating system. At the hardware level, we're kind of providing, well, we're not kind of, we are providing virtualized hardware. We're faking out a guest operating system. So that's what's being virtualized; now for the mechanics.
What we'd be looking for as a grading staff is: obviously, you've got a guest operating system, you have to have a host operating system, and you also need, remember, this widget, the VMM, the virtual machine manager; that's the virtualization tool. So in other words, you've got a host operating system, then an application, the VMM, and then above that, the guest operating system. Those are the parts. Now, we would also expect you to discuss a little bit about how it actually works. Well, we're running the guest operating system as a program. That's great, except what happens when we run programs that think they're operating systems? You've got this problem with, remember, traps, because user programs are gonna be doing things like executing syscalls or referring to locations in memory. That's great if you're a real operating system, but remember, I'm just a lowly program, the virtual machine manager, and I can't risk the guest operating system trapping, because I'm not gonna get the trap; it's gonna go to the real operating system. So this is the big challenge with full virtualization, and this is what you need to do to solve it. Remember, this was the whole thing that was a big discussion about 15 years ago. Essentially, you have to keep reading ahead and sometimes, let's say, precompile or recompile certain instructions on the fly so that the guest operating system doesn't trap. That's what makes this all work. So logically we get the same result, but what's going on is that we're gonna try to run as much of the guest's code natively on the CPU as we can; some of it, though, we're just not gonna be able to, and we're gonna have to intercept those instructions on the fly. And again, this is one of the things we talked about yesterday, just-in-time compiling. Well, this is a very real use for that, in terms of keeping full virtualization stable.
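That read-ahead-and-rewrite idea can be illustrated with a toy in C. This is purely a sketch with a made-up instruction set, not how a real binary translator works: before letting a block of guest code run natively, the VMM scans it and patches any privileged instruction into a call-out to itself, so the guest never actually traps to the host:

```c
#include <stddef.h>

/* Hypothetical opcodes for the sketch; real ISAs are far messier. */
enum {
    OP_ADD      = 1,   /* safe: runs natively on the CPU           */
    OP_LOAD     = 2,   /* safe: runs natively on the CPU           */
    OP_CLI      = 3,   /* privileged: would trap if run by a guest */
    OP_VMM_CALL = 99   /* rewritten form: call out to the VMM      */
};

static int is_privileged(int op)
{
    return op == OP_CLI;
}

/* Rewrite one basic block in place before executing it; returns how many
   instructions had to be patched. Unpatched instructions run natively. */
int translate_block(int *code, size_t n)
{
    int patched = 0;
    for (size_t i = 0; i < n; i++) {
        if (is_privileged(code[i])) {
            code[i] = OP_VMM_CALL;
            patched++;
        }
    }
    return patched;
}
```

On a block like `{OP_ADD, OP_LOAD, OP_CLI}`, only the `OP_CLI` gets rewritten; the rest of the block runs at native speed, which is where full virtualization gets back most, but not all, of its performance.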
Now, it's gonna make it slow, but on the other hand, the nice thing is that it works for all operating systems, and it's a great drop-in solution for legacy systems. So that, in a nutshell, is full virtualization, and frankly, we would want you to mention all of that: the fact that it's full virtualization, the three components, the big challenge that we have to intercept traps, and that we do it by recompiling on the fly. The benefit is we can essentially virtualize everything. And as a grader, as a matter of fact I did grade this, that is what I would be looking for in order to ladle out full credit for full virtualization. As opposed to, let's look at the next one: paravirtualization. Well, this also involves a guest operating system and a VMM, although it's typically called a hypervisor; kind of the same function. There's no host operating system, though. So again, that's one of those things you just need to remember: don't mention a host operating system here, because points get lost that way. And the genius behind paravirtualization, and the subject of research too, is this: remember, the problem with full virtualization is that it's slow, because we're constantly having to read ahead and compile and interpret. Well, remember the old software paradigm of dynamic versus static. Instead of doing dynamic translation on the fly, let's do it statically. Let's take a look at the guest operating system, find where the problem children are, and, in effect, change things ahead of time. That's why it runs a lot faster. Now, the fly in the ointment is that we have to make these changes ahead of time, which means we essentially need cooperation from the guest operating system. In the case of something like Unix, or I should say Linux, that's great; it's open source.
In the case of Windows, that means you're gonna need the support of Microsoft, because they've got their source code in a lockbox. But again, it can be made to work, and it is made to work. So I would expect you to mention the parts: a guest operating system and a virtual machine manager; again, there is no host operating system. We would expect you to talk a little bit about the pros and cons. It's gonna run a lot faster; as a result, it's great for running server farms. The con is that it does require cooperation from the operating system designers. And kind of brownie points if you talked about the Xen people, who actually got this working: they took advantage of the fact that, let's say, the Intel chips have multiple levels of privilege, so you can run the hypervisor at the highest level of privilege, the guest operating systems at a middle level of privilege, and the user programs at the lowest level of privilege. So that was a nice genius insight that they took advantage of. Then the last part, container virtualization. Again, the source for that was the research paper that talked about phones and how you can run multiple guest operating systems on a phone. And here, what are the parts? We've got a host operating system, but we don't have a hypervisor or a VMM per se. What we have is provisions within the host operating system to virtualize the namespace, hence containers. For each pseudo operating system, all we're really doing is peeling off various layers, so that, let's say, I can have a program running in this container and its process ID is X, but I'm running the same program in another guest container and its process ID is Y. So I have to virtualize things like process IDs, file handles, and namespaces.
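A tiny C sketch of that PID virtualization idea (the names are hypothetical, and this is nothing like the real Linux namespace implementation): each container keeps its own mapping from container-local PIDs to host PIDs, so every container can have its own "PID 1" referring to a different host process:

```c
#include <stddef.h>

#define MAX_PIDS 64

/* One per container: a private table mapping container-local ("virtual")
   PIDs to host PIDs. 0 in host_pid means the slot is unused. */
struct container {
    int next_vpid;             /* next container-local PID to hand out */
    int host_pid[MAX_PIDS];    /* vpid -> host pid                     */
};

/* Register a host process in a container; returns its container-local
   PID, or -1 if the toy table is full. */
int container_register(struct container *c, int host_pid)
{
    if (c->next_vpid >= MAX_PIDS) {
        return -1;
    }
    int vpid = c->next_vpid++;
    c->host_pid[vpid] = host_pid;
    return vpid;
}

/* Translate a container-local PID back to the host PID (0 if unknown). */
int container_resolve(const struct container *c, int vpid)
{
    return (vpid >= 0 && vpid < MAX_PIDS) ? c->host_pid[vpid] : 0;
}
```

Two containers initialized with `next_vpid = 1` will each hand out PID 1 to their first process, and the host keeps them straight by resolving through the per-container table; the same trick generalizes to file handles, hostnames, and the rest of the namespace.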
The upside is that what gets virtualized is at a much higher level. I'm not running an entire guest operating system; I'm only running one host operating system and a whole bunch of guest containers on top of it. The downside is that I need support for this from the operating system, and I'm also limited: I can only run one type of operating system. For example, with full virtualization, I can virtualize Linux next to Windows next to perhaps even a gray-market copy of OS X, all on the same machine. Whereas with container virtualization, if I'm running Linux, I can only have a whole bunch of Linux containers of the exact same type. The plus side is that since it's lightweight, it's going to be great for resource-constrained devices like a phone. So again, that's what we'd be looking for on these three types of virtualization, and what some of the pros and cons are. So, questions, comments about this? And again, that was a discussion of the 20-point questions. For the others, take a look at them. The only big thing, and I'll just mention this very briefly, is something like the heterogeneous cores question. That's a case where Jeff is asking: if you've got two cores, a big core and a little core, how does that impact the various systems? You can see right away you're going to have to discuss how it impacts virtual memory, how it impacts CPU architecture, how it impacts threading and processes. So it's cutting a wide swath across different topics, and that's what he'd expect you to discuss. And the same thing is true for the last question: let's say that in the future, rather than a simple hard drive and dynamic memory, we had a whole bunch of memory; we're just going to give you a terabyte of flash memory, and how does that change your system design?
Again, it's going to impact files, it's going to impact processes, it's going to impact virtual memory, and we would want to see a discussion of that. Make sense? Okay, we'll stick around as long as there are any questions, but good luck, people. Thanks for taking OS 421-521. Yep, our pleasure.