Okay, welcome back to CS162 everybody. We're going to pick up where we left off on Tuesday, tell you a little bit about queuing theory, and then pile into some interesting things about file systems. First, however, I want to remind you what we talked about last time. We were talking about spinning storage and the fact that inside a disk drive there are a series of platters, as I show in this diagram. There is a head which can move in and out to talk to different tracks. The platters have two sides, and there are heads on each side. A track is a circular path on one surface, and all of the tracks that sit on top of each other are called a cylinder. There's a three-stage process to reading or writing data. First is the seek time, which is the time it takes to move the head to the right cylinder. Then there's rotational latency, where you wait until the particular sector you're interested in rotates under the head. And then there's the transfer time, which is the time to actually get the bits. Notice that the rotational latency is typically half a rotation on average; that's a probabilistic statement. When we then take into account queuing and controller time (and we haven't talked about queuing at all yet), the disk latency for an access is the queuing time plus the controller time plus the seek time plus the rotation time plus the transfer time. All right. There was a question here: is there an average time for the seek as well? Yes, and typically the seek time is stated on the specs of the disk; it might be, for instance, four milliseconds. And as I briefly mentioned last time, if you have a good file system that can keep locality, then the head is going to be moving a lot less. The seek time that's specced is actually the average time to go from any track to any other track, which is an N squared computation. But in practice you're going to get something more like 25% or 33% of that spec seek time. So that's a good question. Do disks only rotate in one direction? Yes. And it takes a lot of time to start and stop them because they're mechanical things, so typically you'd only stop them when you think you're going to be idle for a long time. And then one more question was: why don't we have one head for each platter? I mentioned that last time. The answer is that it's just expensive, because the piezoelectric movement of the head and the heads themselves are all very complicated and expensive. And since disks are a commodity part, you basically have one conglomerate head assembly that moves together. By the way, maybe I misunderstood that question: there is one head on either side of each platter, but they're all tied together and they move as a whole. So maybe that was the question that was actually being asked. All right. Now, last time there was some interest in what exactly flash memory was, so I thought I would say a tiny bit more about that since it was brought up. Any of you who have taken a class that has CMOS transistors in it, like 141 or something like that, will recognize this diagram, where we have silicon with a doped source and drain. Typically on a transistor there's one gate, driven by a word line: if we raise the voltage on that gate, current will flow, and if we lower the voltage, current won't. And so this becomes a switch. What we do that's different for flash memory is we put two gates with an insulator in between.
Under normal circumstances you can't cause current to flow through this. When we want to program it, we raise the word line to a really high voltage, which forces electrons across that oxide layer so they get trapped on the floating gate. Those trapped electrons shift the transistor's threshold just as if a voltage were applied on the word line, so even after we remove all the voltage from the word line, current will either flow or not depending on whether there are electrons trapped there. All right? And that's basically how flash works. The erase process is essentially the reverse: raising the voltage far enough the other way that the trapped electrons get pulled off of the floating gate. So that's called a floating gate. There are two varieties of flash memory: NAND, which is much denser, and NOR, which is faster for random reads but not really used very much these days. And there's even 3D stacking of these, so a lot of modern flash chips have many layers of transistors on top of each other. All right. So the basic idea here is trapping electrons on the floating gate, and that distinguishes a 1 from a 0. The other thing I wanted to point out here is this is the NAND structure, which those of you who know something about transistors might recognize. One page, which is 4 kilobytes, is a whole group of these bits with the floating gates. And the pages are built up into blocks. These blocks are a little different from what we talk about on a disk drive: a block is multiple pages, maybe 256 kilobytes, and the key thing I said last time is that you have to erase a whole block at a time. Then you can selectively write bits on a page, and at that point you've used up that page. The problem is that you can't write over a page that you've already written. You have to erase a whole block first, and then you can start writing on its pages again. And so that makes the management of flash memory more complicated, because the controller for the flash has to keep track of which blocks have been erased so it can hand out empty pages. And every now and then it has to group pages together that still have valid data on them and then erase big blocks so that there can be a free list. So there is some communication between the file system and the flash as to which pages that still have ones and zeros on them are actually not in use anymore, so they can get collected together to be erased. And so you could actually say that in flash memory, copy on write is the normal behavior: if you write a page and then you want to write it again, you end up having to copy it somewhere else and write the bits you wanted to change.
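To tie together the erase-before-write and copy-on-write points above, here is a toy sketch of the bookkeeping a flash translation layer might do. Everything here (the ToyFTL name, the 64-pages-per-block constant, the data structures) is hypothetical and wildly simplified; real controllers add garbage collection and wear leveling on top. The point is only to show why an overwrite turns into a remap onto a fresh erased page:

```python
# Toy flash-translation-layer sketch: overwriting a logical page means
# writing the new data to a fresh erased page and remapping, because a
# programmed page can't be rewritten until its whole block is erased.
class ToyFTL:
    PAGES_PER_BLOCK = 64                     # e.g. 64 x 4 KB pages = 256 KB block

    def __init__(self, num_blocks):
        self.free_pages = [(b, p) for b in range(num_blocks)
                           for p in range(self.PAGES_PER_BLOCK)]
        self.mapping = {}                    # logical page -> (block, page)
        self.stale = set()                   # programmed pages holding dead data

    def write(self, logical_page, data):
        if logical_page in self.mapping:     # old copy becomes garbage,
            self.stale.add(self.mapping[logical_page])  # reclaimed by a later erase
        target = self.free_pages.pop()       # hand out an already-erased page
        self.mapping[logical_page] = target
        # ... program `data` into `target` here ...

ftl = ToyFTL(num_blocks=4)
ftl.write(0, b"v1")
ftl.write(0, b"v2")   # copy-on-write: v1's page is now stale, awaiting erase
```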
Okay. So the summary we had for SSDs was really the pros relative to disk drives. They have low latency and high throughput because there are no seek or rotational delays. There are no moving parts, so they're more reliable. They're lightweight, low power, and very shock insensitive, which is also good. And you can read at close to memory speeds. Writing is more complicated, as I mentioned, both because of the process of raising the voltage high enough to force electrons across that insulator barrier, and because you have to erase every now and then and keep a free list of erased blocks. So writing is a little more time consuming. The cons are that SSD storage is typically smaller than a disk and a little more expensive, though that keeps improving, and wear out happens: if you write a transistor too many times, it stops working. And I can actually show you why that is. The process of writing is forcing electrons onto this floating gate. The problem is that when you do that forcing, some of them get lodged in the insulator, and over time those stuck electrons that can't be removed reduce the effectiveness of the gate, and so after a while the flash memory wears out. Okay. And then a question: is there any consideration toward optimizing for SSD access properties, like minimizing writes in software? Yeah, people do all sorts of things to optimize for writes. The simplest thing is that the SSD controller itself does what's called wear leveling: it makes sure that all of the blocks and all of the individual transistors are written in a regular fashion, to spread the writes out and not wear out any particular bits. But there's also software that tries to minimize writes; we can talk about that a little offline if you like, and I may mention a bit about it when we talk about flash file systems. The final thing I wanted to mention here is that the cost profile of SSDs versus HDDs, hard disk drives, has been changing pretty drastically and getting better. In the 500 gigabyte range it's now pretty easy to find low-cost flash equivalents that are perfectly good replacements for hard disk drives; it's only the really big ones that are still pretty expensive, but this profile keeps changing and they get cheaper over time. And as I mentioned last time, I haven't bought a laptop with spinning storage in years, because the advantages of the SSD far outweigh the slight increase in cost and the slightly smaller capacity. Okay, so now to move on to our new topic. We were talking about IO performance, where the response time is something to do with the queue, which we haven't talked about yet, plus the IO device service time, which covers the controller and the device itself and varies with the type of device. For the performance of an IO subsystem there are lots of metrics you might use: the response time for an access, the throughput in number of transactions per second, or the effective bandwidth per operation, which is transfer size over response time. For a transfer of n bytes that's n divided by (S + n/B), where S is the fixed overhead per operation and B is the raw bandwidth once the transfer starts. The trick is that if this overhead S is too high, then your effective bandwidth is very low unless n is large. And this S becomes very expensive with disk drives because, as you might remember from a moment ago, we have to seek and then rotate before we can finally read or write at full bandwidth. So it's very important for us to optimize file systems to seek as little as possible.
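To see numerically how the fixed overhead S dominates small transfers, here's a quick calculation in Python. The 10 ms overhead and 100 MB/s raw rate are assumed round numbers, not specs from the lecture:

```python
# Effective bandwidth per operation: n bytes take S seconds of fixed
# overhead (seek + rotation, etc.) plus n/B at the raw rate B.
def effective_bw(n, S, B):
    return n / (S + n / B)

S = 0.010      # 10 ms of seek + rotation overhead (assumed)
B = 100e6      # 100 MB/s raw transfer rate (assumed)
for n in (4_096, 1_048_576, 100 * 1_048_576):
    print(f"{n:>10} bytes -> {effective_bw(n, S, B)/1e6:6.1f} MB/s")
# Small transfers are dominated by S; only large n approaches B.
# Output: ~0.4 MB/s for 4 KB, ~51 MB/s for 1 MB, ~99 MB/s for 100 MB.
```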
The other thing to notice is that the typical queuing response time has this curve to it, where as we get closer to 100% utilization our response time goes through the roof. This, surprisingly, is an artifact of the queue itself and the probabilistic arrival of data into the queue, and this queuing behavior can actually be far more costly in terms of performance than any of the controller and IO device service times. So the contributing factors to latency, as we mentioned, are all the software paths that come through the operating system, which you can loosely model as a queue; the hardware controller itself; and then the IO device service time. One of the advantages of switching to an SSD over a disk is of course that you don't have the seek or rotational delay, so you get much faster overall performance regardless of how spread out the data is, and that can be very important. But what about this queuing behavior? Well, it can lead to really big increases in latency as your utilization gets close to 100%, and I've mentioned this before in the class and I'm going to say it again. My utilization hobbyhorse that I climb onto here is to say: never run anything at 100%. You as engineers out there should keep that in mind; you're welcome to quote me on that. Anytime you get close to 100% of the weight-bearing capacity of a bridge, or the utilization of a disk drive, or whatever, there are always bad effects, and queuing is just one of them. You can get cascading failures in a bridge as you get close to 100%. I haven't explained this behavior yet, but if you knew you had this behavior, you'd clearly want to be in the lower part of the utilization curve, the roughly linear region, rather than in the exponential growth region, where things get really bad in terms of latency: just a little bit of an increase in utilization and suddenly your response time increases quite drastically. So remember, Kubi says never run anything at 100%; you can quote me on that. So let's try to figure out where this comes from. If we were in a deterministic world, which of course we're not, life would be much simpler. Arrivals would come into a queue at a regular interval we'll call T_a: every T_a some item arrives, like a request to read something off the disk. It's deterministic, so these T_a's are all equal. An item sits in the queue for a little while, the time in the queue, and then it comes out and gets serviced. The server we're also going to assume is deterministic, so the service time is always T_s. In this deterministic world we see something pretty simple: the item goes into the queue, waits just a little bit, is immediately taken out, and goes into the server for T_s time. We can exactly predict what's going to happen here, because from the moment the item arrives to when it's done is just T_q plus T_s, and it's always the same. Okay, so this particular world is easy to analyze. We can start talking about mu, a variable you'll see more of in a moment, which is the service rate: how fast can we serve things at this server? It's really one over T_s.
So that's the number of operations per second we can serve: one over the time to serve. The maximum this server can do is an item every T_s, which means that if we shrink T_a down to equal T_s, at that point we're 100% utilized. The arrival rate, how fast things are coming in, is one over T_a: that's how fast things arrive per unit time, one over the period. That's usually called lambda, so lambda and mu are very common variables that you should be quite familiar with. Again, lambda is the arrival rate, mu is the service rate, and the utilization, which is lambda over mu, is an important quantity. If lambda is less than mu, so things arrive more slowly than we can service them, then we're good to go. If we're in a situation where things are arriving faster than we can service them, then we're in trouble. Now, there was a question about why we're not talking about one over (T_s plus T_q). The answer is that the queue is an independent item from the server, so we can pipeline through the queue into the server. The bottleneck in this scenario is T_s, not T_q plus T_s. T_q plus T_s is what a person watching an arrival go through the system and depart would see: it takes that amount of time for them to get serviced. But in terms of the throughput of the system, we really only care about T_s. Hopefully that answers the question. In this deterministic world, the average rate is the complete story, and it's easy to analyze. The problem, of course, is that the world is very rarely deterministic, but let's continue with it for a moment. The offered load here is T_s over T_a. And as we go from a utilization of zero to a utilization of one, we can certainly handle the load, because we're just loading up the server, and the time things are stuck in the queue is constant and small. As long as the service time is smaller than the arrival interval, I'm okay: things are always just passing through the queue, and the queue time stays constant. I'm going to assume for a moment that the reason T_q is not zero is that there's some pointer manipulation or something for getting in and out of the queue, but it's really small. Once, however, we start getting more things arriving than we can serve, in other words the arrival rate is too fast, then we're over in the region where utilization is greater than one. What happens is the throughput we get out of the system keeps growing until utilization equals one, and at that point we can't get any more throughput: we can't get the server more than 100% busy, because if we try, well, nothing's going to happen. If the server is 100% busy, we can't shove more items through it. So at that point our throughput saturates. Many more things can arrive, but then the queue grows, so the time we wait in the queue just keeps growing, and in fact we could end up waiting an arbitrarily long time if too many things are arriving.
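As a sanity check on these definitions, the arithmetic fits in a few lines of Python; T_s and T_a here are made-up example values:

```python
# Deterministic example: mu = 1/T_s (service rate), lambda = 1/T_a
# (arrival rate), utilization u = lambda/mu.  Stable only if lambda < mu.
T_s = 0.020                 # 20 ms to serve one request (assumed)
T_a = 0.050                 # one arrival every 50 ms (assumed)

mu  = 1 / T_s               # 50 ops/sec max
lam = 1 / T_a               # 20 ops/sec offered
u   = lam / mu              # 0.4 -> fine; u >= 1 would mean an unbounded queue
print(f"mu={mu:.0f}/s  lambda={lam:.0f}/s  utilization={u:.0%}")
```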
Okay, in this ideal linear world we've been talking about, it's still very easy to analyze what's going on. You can see the saturation, and you can see that the time it takes for you to get serviced keeps growing when you're in this region, because the queue is growing without bound. We're still serving at a given rate, which is why we saturate, but the queue just keeps growing. So this queue grows without bound, but it's pretty easy to evaluate. What you can see is that utilization equal to one is the key point: if we're above utilization of one, we're unbounded; if we're below it, we're serving properly. Now, let's change this up a little bit to something more realistic. Now the arrivals are going to be bursty. They don't arrive at exactly the same rate; they arrive at some rate having to do with the software that's running and the number of processes that are running, and it's going to be bursty: they kind of arrive really quickly, and then there might be some idle time, and then another quick burst. What happens here is the following. The first one arrives, and we start serving it. And look what happens when the next one arrives: the next one shows up before the first one is done being served. By the way, I took the service time and added it to the queue time, and this total you see is the time experienced from the standpoint of somebody who puts a request in and gets it out. So we're busy serving, but the next one arrives before the service is done, and now it's got to sit in the queue until the server's done. And if another one also arrives in that time, now we've got two items queued and one being served. So now we're in a situation where the number of items in the queue is growing, rather than being essentially zero, which is what it was before. Oh wait, excuse me, let me backtrack: this white one is actually another one that arrived here. So at this point we've actually got three items in here: white, orange, powder blue. Thank you, and sorry for screwing that up. And notice what happens now: by the time the blue one is done, the white one gets to run, and the one at the head of the queue is orange. Then when the white one finishes, the orange one gets to run, and now the only thing in the queue is the powder blue one. And then finally it gets to run and we're done, and maybe we pick up from that point. So if you notice what happened: things bunched up. We had a blue one arrive, then a white one, an orange one, a powder blue one, and three of them were sitting in the queue, so the queue built up; then it decreased and drained as no more items came. What you can see here is that burstiness leads to the queue growing even if the average is such that our average utilization is not over one. We still have these bursty periods where the queue grows and then drains out. Okay, so the situation I've shown here has the same average arrival rate, but almost all the requests experience very large queue delays.
Look at the white one: it arrived here, but it's not finished until there, so that's a much larger queuing delay than the blue one's. And the orange one arrived here, and notice when it's done; its queuing delay is huge. And then this powder blue one arrived here, and look at how long it waited. So even though the average arrival rate is the same, and we've got a utilization that's less than one, we're still building up queuing delay because of the burstiness. Okay, I'm going to pause on that. All right, so this is a situation where the average arrival rate is still low enough that our utilization is less than one, but because the input is bursty, our queue builds up and our average response time gets long. Are there any questions on that? Now, just to back up for a moment, I want to show you this curve. This is a queuing delay curve, and what you see is that as the utilization gets closer to one, and we've got randomness, which is what this curve is about, our response time keeps getting bigger. And you can understand that in this graph: if things are arriving at a rate very close to 100% utilization, then the server is going to be almost totally busy by the time we get the next burst of arrivals, and so it's going to take a very long time, for instance, from the arrival of this blue one to its service. That's where we start getting the part of the curve that grows rapidly. All right. Are there any other questions on this? What's fun about this simple animation is I think it shows you how just a little burstiness can lead to very long service times, and it's a little counterintuitive the first time you see it. I just wanted to make sure everybody caught that. All right. Now, how do we model that burstiness? Well, there is one option: if we don't have any information about what the arrival pattern looks like, one thing we can do is use what's called a memoryless distribution, which is exponential. It looks kind of like this, and it says that the likelihood of a given interval between the arrival of one event and the next falls off exponentially: the curve is f(x) = lambda e^(-lambda x). The mean arrival interval is one over lambda. This is called the memoryless or exponential distribution, and arrivals like this form a Poisson process. There are a lot of short interarrival times, that's the burstiness, but then there's this long tail where occasionally there's a very long interval between arrivals. One over the mean arrival interval is the average arrival rate, the lambda that we multiply by the service time, hoping we get a utilization less than one. Okay, now let's think about what memoryless distributions really are. First and foremost, they are what you model when you don't have quite enough information. The good thing about memoryless arrivals is essentially that if you have a bunch of independent things that are all generating events and you feed them together into a common queue, then what you get somewhat converges toward a memoryless distribution, even though the individual sources weren't memoryless.
And so this is kind of the default approximation people use when they don't have any other information. They say, well, I know on average things are arriving at some rate, so I'll come up with something whose average interval is one over lambda, and call it memoryless. Now, why do we call this memoryless? Well, this is exactly like waiting for a bus in Berkeley. With a regular probability distribution of arrival times, the fact that you've been waiting for an hour might tell you something about how long until the next bus arrives; you'd say, well, it's going to arrive soon because I've been waiting so long. The key thing about memoryless is that if you say I've been waiting for two units of time, like I've got here, and I rescale the remaining part of this curve given the knowledge that we've already waited two units of time, what you'll find is that the curve is exactly the same shape. So how long you've waited tells you absolutely nothing about how long you're going to wait. Memoryless: just like buses in Berkeley. Now, before I give you a queuing theory model that you can use (in the old days I used to derive this in class, but I think this is good enough for you to use), I want to make sure we have some common vocabulary about time. If we have a distribution of service times, we have some graph with an average in the middle: the average time for the next thing to arrive or for the next thing to be serviced. The mean time is that average, which is the sum over all times of the probability of a given time, times that time. Then there's the standard deviation, which is the square root of the variance, and that tells you something about how spread out the bell-shaped curve is; you're used to that from thinking about test scores, for instance. But the last thing, which is interesting and probably not something you're as familiar with, is called the squared coefficient of variation, C, which is the variance (that's the standard deviation squared) divided by the mean squared. And this is a unitless measure. What's surprising is that this unitless measure is often enough to tell you what you need about a distribution without knowing anything else about it: you don't have to know its shape, you don't even have to know the mean or the variance separately. All you may need to know is this unitless C. It turns out a memoryless distribution has C equal to one. A deterministic distribution with no variance, where everything is always exactly the mean, has C equal to zero, and you can see why: if everything always comes at exactly the same time, then sigma is zero, and sigma equal to zero means C is zero. Exponential is one; that's where the past tells you nothing about the future, the Poisson process, and many complex systems converge to that, as I mentioned. And then disk response times, surprisingly, tend to have C a little bigger than one. That's where the majority of the seeks are somewhat less than the average, and it often has to do with the way file systems are put together.
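If you'd like to convince yourself of the memoryless property and these C values numerically, here's a small experiment with numpy (the sample size and seed are arbitrary):

```python
# Empirically checking two claims from the lecture:
# (1) exponential interarrivals are memoryless, and
# (2) C = variance/mean^2 is 1 for exponential, 0 for deterministic.
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(scale=1.0, size=1_000_000)   # mean interarrival = 1/lambda = 1

# Memoryless: having already waited 2 units doesn't change the expected wait.
print(np.mean(x), np.mean(x[x > 2] - 2))          # both ~1.0

def C(samples):                                   # squared coefficient of variation
    return np.var(samples) / np.mean(samples)**2

print(C(x))                        # ~1.0 (exponential / memoryless)
print(C(np.full(1000, 5.0)))       # 0.0 (deterministic)
```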
So why am I going through all this trouble with you guys? I want to tell you why. First of all, you need to have an idea of the input and output distributions when you're trying to analyze something like a file system. You need to know, for instance, the average time between arrivals of requests, the standard deviation of those arrivals, and C. If you have enough of these variables, then you can come up with a queuing model, and that queuing model can help you predict where you are on that curve in terms of queuing delay. What about the queuing time itself? First, a very quick question about the previous slide: the C of 1.5 for the actual server response times, is that from data collected in the real world? Yep, that's a number that has shown up in a lot of real-world data. And is there any kind of mathematical distribution that matches it? Well, yes: anything for which sigma squared over mean squared equals 1.5, so there's actually a whole family of shapes that match. Great. And actually, let's try to type in questions going forward, although that one made sense to ask verbally. All right. Good. Following up on that question, notice that for C equal to one there's also a whole family of distributions; memoryless is only one of them, there are others like it. But memoryless distributions do have C equal to one, so if it's memoryless, you know C. All right. Good. So, what about the queuing time? We've been talking about a queue with a controller and a disk. In order to really talk about this, we need to look at the system as a whole, so we draw a box around the system and we look at arrivals coming into the queuing system and departures leaving it. The theory of the kind we're going to talk about in this class is really about long-term, steady-state behavior, in which the arrival rate and the departure rate are equal to each other on average. Why is that? Well, if the arrival rate and departure rate weren't equal on average, then the queuing system wouldn't be in a stable state: the queue would either be growing without bound or shrinking down to zero. So we assume a stable situation. Now, it's true that there are lots of interesting transient problems, where you ask what happens before we get into steady state; that's a whole other class you can take on queuing theory, which we're not going to do here. So we're going to be looking at steady-state behavior. Arrivals are characterized by some probability distribution; departures are characterized by some probability distribution. And if we put them together, we can ask ourselves how much time is spent in the queue between when an arrival hits the beginning of the queue and when it finally exits. That's our goal, because we've already figured out how to deal with a disk and a controller in terms of latency, but this queue is kind of a weird black box for us, and it'd be nice if we had some way to at least estimate what the queuing delay would be.
And to do that, we're going to talk about Little's Law. I'm going to do a lot more with this after I talk about administrative trivia, but for now, Little's Law, for a system with arrivals and departures in steady state, says the following. Assume arrivals come in at a rate of lambda, and there's some average latency L for getting through the system. Little's Law is an incredibly general law: no matter what the probability distributions coming in or leaving are, the fact that the average arrival rate equals the average departure rate implies that the average number of jobs in the system as a whole is equal to lambda times L. Now, let me give you my McDonald's intuition for this. Imagine you've got a long line going up to the counter of McDonald's, and it's a stable system, so on average the line is always the same length. If you enter the door, you can look at the number of jobs in front of you, that's the number of people in line, and people are arriving at a certain rate. When you finally get to the counter and you turn around and look at the back, you're going to see as many people in line as there were when you entered. What does that mean? It means that the amount of time you spent in line, L, times the rate at which people were arriving while you were in line, ought to equal the number of people in line. So the average number of people in line at McDonald's is equal to the rate at which they arrive at the door times how long it takes to get to the counter. It's an amazingly useful law, and you run into it anytime there's a probabilistic system. Regardless of the structure, burstiness, variations in service, whatever: as long as this is a stable system and arrivals overall match departures, Little's Law applies. So this is a very useful law to remember. Any questions? I'll give you a little proof in a moment; it's easy. Now, as a very simple example: suppose it always takes exactly five units of time to get through the line, and people are arriving at one per unit of time. You walked in the door at time zero, and five units later you got to the counter. The next person came in at time one and gets out at six; time two, out at seven. Each one of these is a person in line. And if you slice through at any point in time and look at how many people are in line, what you see is that it's always five. So that's, in essence, the simple way of thinking about Little's Law when these numbers are deterministic rather than probabilistic.
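Here's a small simulation of a single-server queue with memoryless arrivals and service, illustrating the law with made-up rates. The two printed numbers agree because they are two ways of counting the same occupancy area, which is exactly the proof sketch coming up after the administrative break:

```python
import random

random.seed(162)
lam, mu, T = 5.0, 8.0, 50_000.0    # arrival rate, service rate, horizon (assumed)

# Generate arrival/departure times for a FIFO single-server queue.
t, busy_until = 0.0, 0.0
events, latencies = [], []
while True:
    t += random.expovariate(lam)            # memoryless interarrival
    if t > T:
        break
    start = max(t, busy_until)              # wait if the server is busy
    busy_until = start + random.expovariate(mu)
    events += [(t, +1), (busy_until, -1)]   # +1 enter system, -1 leave
    latencies.append(busy_until - t)

# Time-average number in system: sweep the events, integrating N(t).
events.sort()
n = area = prev = 0.0
for time, delta in events:
    area += n * (time - prev)
    n, prev = n + delta, time

print(area / prev)                          # average jobs in system N ...
print(len(latencies) / prev                 # ... equals lambda ...
      * sum(latencies) / len(latencies))    # ... times average latency L
```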
Now, a good question: isn't the response time the time between arrival and completion of service? The response time can be thought of in a lot of different ways, and it depends on what you're measuring. You could talk about the time from entering the queue to departing, which I think is what you were asking about; that's the full system time. You could talk about the time from entering the controller to exiting the disk; that's the disk service time. So when you're talking about service time, you have to be careful. For instance, if we compute the time you wait in the queue, and we add to that the average time it takes to service one request, those together ought to equal the total system time, the time from when you enter the queue to when you depart. The important part about any queuing system is keeping track of what you're measuring and which part of the system you're measuring. All right, did I answer that question? Hopefully. Good. So that's Little's Law: lambda times L is the number in line. Okay, and let's see, I thought I had administrivia before we did the proof sketch; hold on a sec. Okay, let me just do the administrative stuff first. So, a rough cut of what's going to happen with the midterm next week: we moved it to next Thursday, and lectures 10 to 17 are what's important here. Since it's next Thursday from five to seven Pacific Daylight Time, make sure you've filled out the conflict form that Alex posted. We're going to try to do this live. The basic mechanism, and we'll announce more information as we go forward, is that we're going to release an answer book so you can print it out on a printer somewhere, and we'll do that early so you can get to your printer or whatever. Then we'll start the exam on time and send the actual questions out to you. You'll write your answers into the answer book, and when you're done you'll scan it with the camera on your phone. Now, there's a question: what if you don't own a printer or have access to one off campus? Why don't you post on Piazza and we'll try to figure this out; we're hoping to release the answer book early enough that you could potentially print it somewhere else as well. I also wanted to say that we anticipate people are going to do well on this exam. We're clearly in a different mode with this class now than we were when everybody was co-located at Berkeley. So, among other things, we're not going to be curving the results. We are going to be invoking the honor code: you're not going to ask others for help. We're of course also going to have scrambled exams, so not every exam is going to be the same. But we're not going to curve, and we're going to assume people will do well. Think of this almost more like a quiz; we're going to lower its weight a bit, though we haven't figured out exactly how much. We're hoping people just do the work themselves, and we're trying to lower the advantages of cheating; by not curving, you're also less affected by others cheating. That's our goal, and hopefully we'll have the final details figured out by next Thursday. And about people not owning printers: make sure to start a question about that on Piazza and we'll figure out what to do. All right. Okay.
Now, I will also say that one of the reasons we're going this way is that Berkeley has actually told us we're not allowed to do any remote proctoring at all until they've figured out what we're supposed to do about remote proctoring. We may get something in time for the final, or for midterm three, we're not sure yet, but this is our attempt at experimenting to see how it goes. This will be basically open book, but it's not open people: you're not allowed to talk to other people during the exam. We'll give more explicit details as we go forward. Now, is it open Internet? We haven't settled that yet; I'm going to say no for starters, but we may change that, we'll see. The other alternative was to not do midterm two at all. Think of this as us trying something out to see how it works, and it's a way to get some additional evaluation for people. Keep in mind, by the way, that the default grade in this class is now pass/no pass, and you have to request a letter grade. The Berkeley administration and a number of people in the department would love people to just keep pass/no pass as a way of reducing overall stress, but I know that may or may not happen. So I think we'll leave it there for now. We're trying to make this as equitable as we can for everybody, we're hoping people will be honorable about it, and we'll see how it goes. And we're not curving, just so people know. All right, good. Now, I'll briefly show you this little proof sketch; it's cute. This is a way to look at the more general Little's theorem. Suppose the lengths of time vary, so notice that not all these blue stripes are the same, and the number of jobs in the system varies too, and we want a relationship between the average number of jobs, the average response time, and the arrival rate. How would we prove anything about that with arbitrary distributions? The way we do it is we take a time T and lay out people as they arrive: this is the length of time person one spends in the system, this is the length of time person two spends in the system, person three, and so on, laid out like this. If we take any vertical slice, we see the number of people in the system at that given time, and the trick is: what's the average number? I'm not going to spend a long time on this; I just want to give you a rough sketch. Instead of L, we're going to talk about area. Look at this blue area: if I imagine these stripes all have height one, then the area of a stripe is just its length L_1 times one. So the area of the whole blue region, which is the sum of all the individual areas, is just L_1 plus L_2 plus L_3 and so on, because these rectangles have height one. Don't think too hard on this; if you're thinking hard, you're thinking too hard about it. It's very simple. Now, if I take all that area and I'm interested in a cross section, I just ask myself: what is the average area at any given time? That's pretty simple too: the total area divided by the time T gives me the average amount of area per unit time.
Okay, and so S over T, the total area over time, is the same as the sum of the L's over T, as I said before. And I can multiply and divide by N, the total number of jobs: S/T = (N/T) × ((L_1 + L_2 + ... + L_N)/N). When I'm done with that, N over T is the average arrival rate lambda, and the sum of the L's over N is the average time in the system. And if you notice, that means the average number of jobs is equal to lambda times L. That's where it comes from, and it's very general: it doesn't matter what the distributions are. It's a very simple argument. Okay, now, here are some results for you, and this is where this becomes useful. We assume the system is in equilibrium, and for the moment there's no limit to the queue, so imagine an infinite-size queue; we can deal with finite queues later. The time between successive arrivals is random and memoryless, meaning the arrival process can be defined entirely by lambda. So the arrivals are memoryless, but the server isn't necessarily, because disks are not necessarily memoryless. The server, which is some combination of the controller and the disk service time, has service rate one over the average service time, but it's not memoryless; it could be an arbitrarily complicated probability distribution, whatever you want. Memoryless coming in, arbitrary going out. How do we describe that? Well, lambda is the average arrival rate. T_ser is the average time to service a customer: in the McDonald's example, if you took the time from when you got to the counter to when you had your hamburger, and averaged them all up, that would be the mean, the first moment M_1; that's T_ser. And C is the squared coefficient of variation, sigma squared over M_1 squared, for the server. If this were a disk, we might have said earlier that C is 1.5; if this were a memoryless process, it would be one; and so on. These three parameters, lambda for the arrivals, and T_ser and C for the server, really are enough for us to find out what we want. For one thing, we can compute mu by just taking one over T_ser. And now that we have lambda and mu, we can talk about the utilization u, which is lambda over mu, which if you multiply it out is also lambda times T_ser. And we know u had better be between zero and one: if u is bigger than one, things are arriving faster than they can be serviced, the queue will grow without bound, and that's bad. Okay, the parameters we wish to compute: the time spent in the queue, T_q, because how long we wait in there is interesting; and the length of the queue, L_q, how many items are in there on average. Well, it turns out that if we know the time in the queue, Little's Law comes to the rescue: lambda times T_q gives us L_q, so the length is easy once we have the time. So really all we need is the time. And here are a couple of results for you. They're fairly easy to derive, and I'll give you some references in a moment where you can see the derivations. If the server is also memoryless, so we have memoryless input and memoryless service with one server, an M/M/1 queue, then what's interesting is that the time spent in the queue is T_q = T_ser × u/(1−u): the average service time, times u over one minus u.
Look at that u over one minus u: if u is zero or close to zero, the denominator doesn't matter, but as u gets closer and closer to one, the denominator gets close to zero and T_q goes toward infinity. So if you get utilization close to one, that's bad. Now, in the general version, where we have an arbitrary service distribution, an M/G/1 queue (memoryless arrivals, general service, one server), it's a little more complicated, but not too much: T_ser is the same, u over one minus u is the same, but a factor of (1 + C)/2 accounts for the arbitrarily complicated service distribution. So T_q = T_ser × (1 + C)/2 × u/(1−u). All right, now, recover for a moment from your shock at seeing a bunch of equations on a 162 slide, and look at the difference between M/M/1 and M/G/1. The only difference is this little factor of (1 + C)/2. That C is a single number, and it's all that's needed no matter how complicated the service distribution is. That's rather remarkable, I think: just C and the average service time are enough to describe this, regardless of how complicated the service process is. And by the way, if you plug in C equal to one, for memoryless, then one plus one is two, divided by two is one, and it degenerates into the M/M/1 formula. The other thing to notice is the factor u over one minus u, which is exactly this curve. These equations aren't talking about the overhead piece, which is independent of the queue; they're talking about this piece, the growth that goes toward infinity as you get to 100%. As utilization gets higher and higher, and we get close to using the disk at its full speed, we get closer and closer to an infinite response time, and the reason it goes toward infinity is literally that the queue is getting infinitely long. Again, that's why you don't want to be close to 100%. Now, can anybody tell me why, in real life, there obviously won't be an infinite response time? Yep, finite queue, good answer. There's a queue that's finite, which means that at some point the queue stops allowing things to come in. You end up saturating your ability to put things into the queue, which means whoever is submitting requests gets backed up with flow control, in some sense, and can't put any more items in. So at some point, when the queue fills up, that limits your arrival rate. But assuming you put in a big enough queue that it isn't normally your limit, it's still the case that if things arrive too rapidly, you'll get a very high response time. Now I'm going to stop for a second and ask if anybody has questions. And while you're thinking of your question, notice what I've got in red: lambda, T_ser, and C. These are the only independent things you need to figure out in order to handle an M/G/1 queuing situation. Everything else here is derived: u is derived, T_q is derived, L_q is derived, all once you know those three items. So if we were applying this to a real scenario, we would first try to come up with those values.
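Here are those two results transcribed directly into Python, so you can watch T_q blow up as u approaches one; T_ser = 20 ms and C = 1.5 are just example numbers:

```python
# The two queuing-time results from the slide.  For memoryless service
# (M/M/1), C = 1 and the (1 + C)/2 factor collapses to 1, so M/G/1
# degenerates to M/M/1 exactly as the lecture notes.
def tq_mm1(t_ser, u):
    return t_ser * u / (1 - u)

def tq_mg1(t_ser, u, C):
    return t_ser * 0.5 * (1 + C) * u / (1 - u)

t_ser = 0.020                       # 20 ms average service time (assumed)
for u in (0.2, 0.5, 0.9, 0.99):
    print(f"u={u:4.2f}  M/M/1: {tq_mm1(t_ser, u)*1000:7.1f} ms"
          f"   M/G/1 (C=1.5): {tq_mg1(t_ser, u, 1.5)*1000:7.1f} ms")
# T_q explodes as u -> 1: the 'never run anything at 100%' curve.
```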
Right, should I move on? All right, let's see how we might do this. First I want to give you another very simple little simulation, in some sense, of why you get an unbounded response time. If items are arriving deterministically, it's possible to sustain utilization exactly equal to one: in that unrealistic situation where things arrive deterministically, one after another, with no burstiness, and each takes exactly the same amount of time to be serviced, we can in principle line them up end to end and get 100% utilization of the disk. But the moment we have any randomness or burstiness at all, it's no longer possible, and that's when we start getting this interesting behavior. Why is that? Well, here's a burst: notice we had arrive, arrive, arrive, and then they get serviced. And because of the burstiness there are long tails, so there are periods of time with no arrivals, and then the next set gets serviced, and so on. That burstiness means we have these gaps, and you never make up for the gaps. That's why you can't get utilization equal to one when you have any burstiness on the input: these gaps prevent it. The wasted time is never reclaimable, because there's nothing to service during the gap; the disk is idle, doing nothing. And by the way, when I set up these equations for you, we assumed the arrival process is memoryless, which is by definition bursty, since the interarrival times are exponential. And real life is probably at least as bursty. The last thing I want to do before we leave queuing theory is give an example. Suppose a user requests ten 8 KB blocks per second. Suppose the interarrival and service times are exponentially distributed, so for the moment C equals one for both input and service. The average service time is 20 milliseconds, which we'll assume covers the controller plus the seek plus the rotation plus the transfer time to pull things off the disk. The questions: how utilized is the disk? Well, from the previous slide, utilization equals lambda times the service time, so we can compute that. What's the average time spent in the queue, T_q? What's the average number of requests in the queue, L_q? And what's the average response time for a disk request? There was a question earlier about the total time a user sees, and it's basically T_q plus T_ser. So we can just compute these. Lambda, the average number of arriving customers per second, is 10 per second. The average time to service an 8 KB disk block, I've said by my spec here, is 20 milliseconds. Now, if you were given this problem on an exam, what you'd really have to do is compute that 20 milliseconds yourself, from the size of the block you're reading off the disk, the transfer rate, the average seek plus rotation times, the controller time, and so on; you'd compute the 20 milliseconds based on 8 kilobytes and the other stats we gave you. But for now I've told you it's 20. And since T_ser is 20 milliseconds, which is 0.02 seconds, my utilization is 10 × 0.02 = 0.2. So this disk is 20% utilized. Okay. And what's the time in the queue?
Well, the average time customers spend in the queue: it's an M/M/1 queue, because I've got C equal to one for the service. So it's T_ser times u over one minus u; I do that calculation, 0.02 × 0.2/0.8, and I find I'm spending five milliseconds in the queue. The length of the queue I get from Little's Law: it's lambda times T_q, that's 10 per second times 0.005 seconds, so the length of the queue is 0.05 items on average. This queue is not heavily loaded. And if you notice, my utilization here is low, 20%, which is not too bad. If I were to increase the number of user requests per second and push this up closer to 50% or higher, then this would start growing, and I'd start seeing, on average, multiple items in the queue. And by the way, the final number, T_sys, which you should always be careful to remember to calculate, is the time in the queue plus the time to be serviced by the disk, because once I get out of the queue, I still get serviced: that's five milliseconds in the queue plus 20 milliseconds of service, so on average a 25 millisecond response, a little longer than the 20 millisecond average service time alone. So the queue is starting to have an effect, but not a lot. And again, if I upped the user requests enough, that time would get much longer than 20. Good. All right, are we good?
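For reference, the entire worked example fits in a few lines of Python:

```python
# The worked example from the slide: ten 8 KB requests/sec, exponential
# (C = 1) arrivals and service, 20 ms average service time.
lam   = 10          # arrivals per second
t_ser = 0.020       # seconds per request

u     = lam * t_ser              # utilization          = 0.2  (20%)
t_q   = t_ser * u / (1 - u)      # M/M/1 time in queue  = 0.005 s = 5 ms
l_q   = lam * t_q                # Little's Law         = 0.05 requests
t_sys = t_q + t_ser              # total response time  = 0.025 s = 25 ms
print(u, t_q, l_q, t_sys)
```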
All right, oops, I missed my little time to take a break; are you guys good with just continuing until the end of the class? Let's do that, because there are some other things I'd like to get to. By the way, there are some good queuing theory resources on the homepage. If you go to the resources page, there's some material from one of the Patterson books that explains where some of those numbers and formulas come from, there's a website with all sorts of resources on queuing, and there are a bunch of previous midterms with queuing theory questions you can look at. And you should assume that queuing theory is fair game for midterm three, not for midterm two, because it's beyond that material. I think they might actually talk about queuing theory in tomorrow's section too. Now, let's talk briefly about optimizing IO performance. How do you improve performance? Make everything faster! All right, you can't always do that, so maybe you make things more decoupled: by putting in multiple disks and queues, you can make things faster. Somebody mentioned RAID 0 striping last time; part of the way RAID 0 makes things faster is that it ups the bandwidth to the disks, thereby reducing the overall transfer time. You could optimize the bottleneck somehow. You could fix the software paths to make the queuing faster. You can do other useful work while waiting; we talked about that with scheduling and page faults: when you get a page fault, go do something else. And queues are incredibly useful for absorbing bursts and preventing, for instance, user threads from being blocked by being unable to queue things. So queues, even though they can cause this behavior, are actually really good things in general; it's just that if you try to overload the devices, you get into the bad part of the curve. And if you have finite queues and you can tolerate a little blocking, then pushing some of the request generation back onto the threads, so they don't generate quite as many, is a way to improve performance for everyone else by slowing down the generators of requests. We can talk about scheduling IO: what happens when two processes are accessing storage in different regions of the disk? What can the driver do? How can buffering help? What about non-blocking IO, or threads with blocking IO, and what limits how much reordering the OS can do? We can start talking about scheduling the actual disk operations, and a lot of this is going to tie into the file system. So when is disk performance highest? Well, when there are big sequential reads, or when there's so much work to do that you can rearrange all of it and arrange to seek as little as possible; I'm going to show you reordering of the queues in a moment. So: either big sequential reads, or a lot of independent reads that can be reordered. It's okay to be inefficient when things are mostly idle: we don't really care about inefficiency under normal circumstances, until so many things are trying to get at a resource that it's overloaded, and then we have to be efficient. So bursts are both a threat and an opportunity. They're a threat because, as you've seen, they can put you into that regime of queuing where things get really slow. They're an opportunity because when a bunch of items all arrive at the queue at once, you have a bunch of items in the queue that can be rearranged to make better use of the disk. There are lots of ways to do optimization, and my hope is that by the end of this class you'll have a lot of good thoughts about how to optimize things, because you will be master systems programmers and designers by then. There are lots of other techniques, like user-level drivers and doing other useful work at user level, which we're not going to go into today; that's for a different day. But let's look specifically at disk scheduling. Disks can only do one request at a time, so what order do you do your requests in? User requests come in, each naming, say, a track and a sector: it might be track 2 sector 2, track 5 sector 2, track 7 sector 2, and so on (or these could be cylinders instead of tracks). As you can imagine, if requests come in on a bunch of random tracks or cylinders, the head is going to be moving a lot if you're forced to do things in FIFO order. So FIFO order is bad; we learned that a while back when we were dealing with paging, and it's bad here as well, because it forces the disk to move in whatever pattern the requests came in, rather than taking a group of requests and reordering them so the head moves only a small amount, and continuously, in or out. That would be much better scheduling, but it's not going to be FIFO. So, for instance, we could imagine shortest seek time first (SSTF) scheduling: you gather a bunch of items in a queue, you pick the request that's closest to where the head currently is, you service it, then the next closest, and so on. You're always going toward the request that's physically closest on the disk.
Okay, so you could imagine SSTF is probably the best way to reduce seek time, because we move as little as we can by reordering the queue. The downside is that it can lead to starvation: if requests keep coming in to a particular part of the disk, like the middle, then you might never handle things on the outside, so starvation could be a problem with this. We could also do something called SCAN, which is basically an elevator algorithm that takes the closest request in the direction of travel. So we travel in, and then out, and then in again, just like an elevator that picks you up on the way. As we're scanning in and out, we service the requests on the track or cylinder we're passing over, rather than the next one in the queue. That's called SCAN. It doesn't have starvation, but it keeps some of the flavor of SSTF, because we're only picking up requests while moving as minimally as we can back and forth. Now, the downside is that this can actually disadvantage requests at the ends a bit, because we pass over the middle multiple times per sweep, and the ones toward the ends don't get serviced as frequently. So there's a version called circular SCAN (C-SCAN), where you go all the way out picking requests up, then quickly seek back to the start and go all the way out again. It would be like an elevator that ran local on the way up but then ran an express down to the bottom and came back up again. It's a little more fair, not biased toward tracks in the middle, but on the other hand you may not care that much about service-time fairness, and so the plain elevator algorithm is a pretty common option. All right, questions? Now, I'll point out that at the beginning of time, so to speak, algorithms such as SCAN, the elevator algorithm, were actually implemented in the operating system, because the operating system knew that when it moved the head, it was moving it to a particular cylinder, track, and sector. Now a lot of that is hidden: the controller does the elevator algorithm, but operating systems in some cases still try to do it too, and so you often end up with two competing doers of the elevator algorithm, which is not always a good thing. The controllers are very fast. That was a question about optimizing disk scheduling; here's a question about how we hide I/O latency. If you recall, blocking interfaces are where a task that does a read gets put to sleep until the data is ready, or when you write, you get put to sleep until the data can be sent out to the interface. That's a blocking interface, and it's usually the default on most file systems. But with calls like ioctl you can often put things into a non-blocking mode, which is "give me what you've got and tell me how much you pulled in or sent out," or an asynchronous mode, which is "go ahead and send me a signal later when the operation completes." Both of those are options you can often apply to a particular socket or file descriptor once it has been opened. So now let's move forward a little.
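As a concrete illustration of the non-blocking option, here's a small C sketch using fcntl with O_NONBLOCK, which is the standard POSIX way to do this (the lecture mentioned ioctl; on many systems the older ioctl(fd, FIONBIO, ...) form has the same effect):

```c
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

/* Switch an already-open descriptor to non-blocking mode. */
int make_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* In non-blocking mode, read() returns whatever is available right
 * now; if nothing is available, it fails with EAGAIN/EWOULDBLOCK
 * instead of putting the caller to sleep. */
ssize_t try_read(int fd, char *buf, size_t len) {
    ssize_t n = read(fd, buf, len);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;   /* no data yet -- caller can go do other useful work
                     * (a real program would distinguish this from EOF) */
    return n;       /* bytes read, 0 at EOF, or -1 on a real error */
}
```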
Okay, so now going from the top down; we've been going bottom-up for most of the last couple of lectures, but from the top down now: if you remember, the I/O APIs and system calls work at user level with byte-oriented buffers of arbitrary size, which live in memory and are addressed with memory addresses. Once we get into the file system, we move to the block level. So even though at the POSIX syscall level we get to think about reading and writing arbitrary numbers of bytes and variable-size structures, once we get into the file system we don't have that option: typical file systems are divided up into four-kilobyte blocks. Those blocks are then stored on the disk itself, where we have physical indexing onto the disk. At the file system level we have logical indexing by block offset: the first 4K bytes of a file are block zero, the next 4K bytes are block one, and so on (the arithmetic for that mapping is sketched below). But once we get to the disk, we've got sectors, which might be 512 bytes or four kilobytes, and somehow the file system has to say which sectors are in the file and in which order on a hard disk. In the case of an SSD, there's a similar idea, but now the block numbers go through a translation layer that says which physical blocks are of interest. One way of understanding why there's a translation layer there is the wear leveling I mentioned earlier: the controller inside the SSD is busy changing which blocks are physically where, in order to make sure things don't wear out. So there's a translation layer which points to the physical blocks, and then typically there's a bunch of erased pages on a free list handled by the controller. So we've got to figure out how to go from this arbitrary, no-boundary, byte-oriented view to blocks that are in a file system and maybe randomly arranged on the device itself, and that's the goal of the file system. And so really we're working our way past the syscall layer down into the file system itself. If you remember, we have calls like open, create, and close, which open files based on a file name, take some flags and permission bits, and return a file descriptor. And so now the trick is: how do we even build something like this? We've got a file name, which only really means something relative to the file system, and what comes back is something we can then apply read, write, lseek, fsync, and so on to. So that's where we're at now; we've got to figure out how to actually do this. And note: when write returns, the data is on its way to disk and can be read back, but may not actually be permanent yet. That's an interesting problem which we're going to talk about as well, because when we write arbitrary bytes, we may not be able to push them immediately out to disk if we're buffering things. And therefore it's possible that we return to the user saying the write completed, but the data is still in buffer storage; we'll talk about that too. So how do we build a file system? Well, first of all, what is a file system? It's the layer of the operating system that transforms the block interface of the disks, which is "here are some blocks at some track and sector and so on," into files and directories, which are really what you as a programmer want to use.
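A minimal sketch of that byte-to-logical-block mapping, assuming 4 KB blocks (the constant and struct names here are made up for illustration):

```c
#include <stdint.h>

#define BLOCK_SIZE 4096   /* assumed file system block size */

/* Which logical block of a file holds a given byte offset,
 * and where within that block the byte lives. */
typedef struct {
    uint64_t block;    /* logical block number within the file */
    uint32_t offset;   /* byte offset within that block */
} block_pos_t;

block_pos_t byte_to_block(uint64_t file_offset) {
    block_pos_t pos;
    pos.block  = file_offset / BLOCK_SIZE;
    pos.offset = file_offset % BLOCK_SIZE;
    return pos;
}
/* Example: byte 5000 of a file lives in logical block 1, at offset 904. */
```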
And the components of this are manifold. First of all there's naming, which is turning those file names into something understandable at the lower level for the physical blocks stored on disk. We have to figure out how to manage those blocks: collect disk blocks into files, maintain free lists, and know which blocks we can reuse. We need protection, to protect data belonging to different users from each other. And we have reliability and durability questions: how do we keep files durable despite crashes, media failures, attacks, and so on? The user's view is durable data structures of arbitrary size; I said that earlier. The system's view, at the system call level, is a collection of bytes, and it doesn't matter to the system what you're storing there: the operating system doesn't care whether you're storing a database with fixed records or a bunch of individual text lines in some document. So the system's view at the syscall interface is, again, a collection of bytes. Inside the operating system, it's a collection of blocks, and those blocks are the basic unit for talking to the disk. And as I mentioned in a couple of previous lectures, the block size, say 4K, is often bigger than the sector size on disk, which is the minimum unit you can read and write from that disk. So we could look at it this way: the user has a file, which goes to the file system, which goes to the disk. If the user says "give me bytes 2 through 12 of the file," what actually has to happen? Well, we have to fetch the block from the file system that contains those bytes, pull it in, and then return bytes 2 through 12 to the user. So there's some interesting buffering going on here, because we're probably pulling a block off the disk, putting it in the file system cache, and then extracting a few bytes to return from the system call to the user. Even more so: what if the user wants to write bytes 2 through 12? That's trickier, because we certainly can't write just a few bytes onto the disk. What happens there is we fetch the block, merge in the bytes the user wanted to write, and sometime later, maybe not immediately, write that block back out to disk; there's a sketch of this read-modify-write below. So everything inside the file system is going to be in terms of whole blocks. The actual disk I/O happens in terms of blocks, and any reads or writes smaller than the block size at the user level have to be translated between the user and the file system into block operations; that matching-up is part of what we have to handle. Okay. So what are some basic entities on disk? Obviously files, which are user-visible groups of blocks arranged sequentially in logical space, and directories, which are indexes mapping names to files, and we're going to need to figure out how to build both. What's interesting, and fortunate, is that directories are basically just files containing name-to-file mappings, so assuming we can figure out how to make a file work, making a directory work is typically much easier after that. We're going to access the disk itself as sort of a linear array of sectors. In the old days, we used to identify sectors by their cylinder, surface, and sector.
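Here's a minimal sketch in C of that read-modify-write for a sub-block write. The "device" here is just an in-memory array so the sketch is self-contained; block_read and block_write are stand-ins for whatever the file system uses to move whole blocks, not a real API:

```c
#include <string.h>
#include <stdint.h>

#define BLOCK_SIZE 4096
#define NUM_BLOCKS 16

/* Toy "device": an array standing in for the disk. */
static char device[NUM_BLOCKS][BLOCK_SIZE];

static int block_read(uint64_t b, void *buf) {
    if (b >= NUM_BLOCKS) return -1;
    memcpy(buf, device[b], BLOCK_SIZE);   /* move one whole block in */
    return 0;
}

static int block_write(uint64_t b, const void *buf) {
    if (b >= NUM_BLOCKS) return -1;
    memcpy(device[b], buf, BLOCK_SIZE);   /* move one whole block out */
    return 0;
}

/* Write `len` user bytes at offset `off` within block `b`. Since the
 * device only moves whole blocks, we must read the existing block,
 * merge the new bytes in, and write the whole block back. */
int write_partial_block(uint64_t b, uint32_t off,
                        const char *data, uint32_t len) {
    char buf[BLOCK_SIZE];
    if (off + len > BLOCK_SIZE)
        return -1;                        /* must fit within one block */
    if (block_read(b, buf) < 0)           /* fetch existing contents */
        return -1;
    memcpy(buf + off, data, len);         /* merge in the user's bytes */
    return block_write(b, buf);           /* push the whole block back */
}
```

In a real file system, the block would sit in the buffer cache between the read and the deferred write-back, which is exactly why a completed write() may not yet be permanent.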
Okay, and the operating system actually had to track that. I'll hold off on that question; I'll answer it in a moment. Now, however, the operating system doesn't know exactly what cylinder, surface, and sector a given block is on. Instead it's got a logical block addressing scheme that starts from zero and goes up to the maximum block, and the controller of the disk translates that into cylinder, surface, and sector. What the operating system really knows going forward is that if it has two blocks whose logical addresses are roughly close to each other, they're going to be roughly close to each other on the disk. And that's pretty much all the information the operating system has, other than maybe some information about how many blocks might fit on a cylinder, etc. The controller does the translation from address to physical position, and these days the hardware is pretty much shielding the OS from the structures on the disk. Now, there was an interesting question in the chat a second ago, which was: does lseek actually cause the head to seek? No. All lseek does is move a pointer in the process's open-file state. When you open a file, there's a current offset marking where you are in the file, and all lseek does is change that offset to where you want to be. Then when you do a read or a write, that translates into "give me bytes at a different place." So lseek is really only seeking within the file structure, nothing to do with seeking on disk; it's a seek within the file, not within the physical media (there's a short code example of this below). And I wouldn't even call it a vestigial part of the API: it's part of the byte side of the API, not the disk side. So what does the file system need? It needs to track free disk blocks, so it knows where to put newly written data. It has to track which blocks contain data for which files, so it knows where to read a file from. It needs to track the files in a directory, so it can find the list of a file's blocks given its name. And where do we maintain all of this information: which disk blocks are free, which blocks contain data for which files, which are directories? Well, we might cache it in memory while the system is running, but by and large we have to push all of that out to disk; basically, all that information lives somewhere on disk. All right, and next time we're going to pick this up, but data structures on disk are different from data structures in memory, for a number of reasons, not the least of which is that they can't be arbitrary sizes: you're forced to deal with whole blocks at a time, pulling in whole blocks and pushing out whole blocks even when you're writing something smaller. And you also have to worry about durability: the file system has to keep meaningful state across shutdown, and even if you crash in the middle. So that's also something we're going to talk about as we go forward. Okay. So what are some critical factors for the file system? I'm going to finish up in a second. It's a durable data store; it's all on disk, we figured that out. We have to figure out how to get performance, maximizing sequential access and minimizing seeks; that's going to be challenging. And we have to figure out how to deal with protection checks.
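Picking up the lseek question: here's a minimal sketch showing that lseek only moves the per-open-file offset. These are standard POSIX calls; the file name is made up:

```c
#include <fcntl.h>
#include <unistd.h>

int main(void) {
    char buf[12];
    int fd = open("example.txt", O_RDONLY);   /* hypothetical file */
    if (fd < 0)
        return 1;

    /* No disk-head seek happens here: this just sets the current
     * offset in the kernel's open-file state to byte 100. */
    lseek(fd, 100, SEEK_SET);

    /* The read now asks the file system for bytes 100..111, which
     * it translates into whichever block(s) hold those bytes. */
    ssize_t n = read(fd, buf, sizeof buf);
    (void)n;

    close(fd);
    return 0;
}
```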
All right, well, fortunately, the POSIX interface says we open first and do reads or writes second, so we can do all of our protection checks on the open side of the file system, not on the reading and writing side. Next: the size of a file is completely unknown until you finish writing to it. This is a funny problem that I bet you won't have appreciated until we look at file systems: you open a file, you start writing to it, and then you close it. The file system doesn't know whether you're about to write five bytes and close the file, or a terabyte and close the file. So we have to determine the size as we go, and that means we're going to have to figure out how to optimize our placement on the disk on the fly. That's actually a side effect of the POSIX interface. We're also going to have to figure out how to organize things into directories, so what data structure goes on disk for that? And we need to allocate and free disk blocks. So next time we're going to look at a number of different file systems; I thought I might get to the FAT file system today, but that's okay, we'll do that on Tuesday. In conclusion: we talked a lot about queuing theory today. We noticed that bursts and high utilization give you queuing delays, and these queuing delays are fundamentally a consequence of the probability distributions and the burstiness on the input side. We gave you a couple of useful equations for M/M/1 and M/G/1 queues, which are the simplest to analyze: memoryless on the input, but arbitrary on the service side for M/G/1. And it's just an equation, T_q = T_ser × ½(1 + C) × u/(1 − u), so if you can come up with T_ser, C, and the utilization u, which can be derived in multiple ways, you can come up with the queuing delay. We also started talking about how the file system transforms blocks into files and directories, and how to optimize for access and usage patterns. Clearly we're going to need to figure out how to maximize sequential access and allow efficient random access at the same time. That's going to be part of our file system design, and it's going to get pretty interesting as we move forward. And as you can imagine, the optimizations are a little different for an SSD, and we'll talk a bit about that, because we don't have to worry about seeking, but we may have to worry about writing too many times, because we don't want to wear it out. So next time we'll start talking about how we structure things, and about a header structure called an inode, which is going to be our entry point into building a file system. All right, everybody, have a great rest of your day, and we will see you on Tuesday; watch Piazza for more information about the upcoming midterm. All right. Talk to you later. Bye now.