All right, good morning to the smaller and smaller group here. Let's try that again, everybody: good morning. Everybody had a nice weekend? All right, back to winter, I guess. If you look at the calendar, we only have one week of class left, so we're going to finish up performance today, we'll talk about virtualization on Wednesday and Friday, and next Monday we'll do review for the final exam. If you look outside, it looks like we have months of class left. All right, so, the final exam. People have asked about this, and I sent an email to remind everyone: it'll be Monday, two weeks from today, at 8 a.m. Apparently whoever schedules these things thinks you're gluttons for punishment. But the people who are here, the ten of you, should be good at getting up in the morning to come to class. This will be similar to that, except an hour earlier; sometimes that hour makes a little more difference than we wish it would. So maybe what I'll do to help people with this is the following: next Monday we'll have class from 8 to 10, and we can do exam review for an hour and 45 minutes or so. I'll consider that, to see if an extra week of building the habit will help people get up and be bright and wide awake at 8 a.m. The format of the exam will be similar to the midterm: short-answer questions, a couple of longer design problems, maybe a few multiple choice here and there, but not too many. Any questions about the exam? How many people are able to take the exam at the scheduled time? Actually, maybe I should ask the converse: how many people are not able to take the exam at the scheduled time? Okay, good.
I know there's at least one person, so there probably will be an alternate exam. But if you don't have a good reason to take it, please don't. Okay, so we spent last Wednesday and Friday with me, apparently, very slowly going through fairly obvious stuff about measurement and benchmarking. Today I want to speed up a little bit, get through this, and get us out of this little performance unit that we've blundered into. Any questions about the measurement and benchmarking material from last week? We talked about some of the challenges in measuring things on real systems, in terms of reproducibility and in terms of getting access to the kind of fine-grained timing information you might want, and then we talked a little bit about benchmarks: different types of benchmarks, and choosing the right benchmark for the problem itself. Any questions on measurement and benchmarking? All right, so we're actually not going to do too much review. Instead I'm just going to give you a big-picture overview of where we are. What have we managed to accomplish so far? We set out to improve the performance of our system, and the first thing we discussed was how to measure things. How do you actually measure things on the system so that we can collect any sort of data? To collect data there's this process of measurement that we have to go through, so we discussed measurement, and we also talked about some alternatives to measuring real systems: modeling and simulation. The next thing we did was talk about how to choose a benchmark. So the first question is how to instrument and measure either a real system, a simulator, or an analytical model, and the second question is what you're going to measure that system doing. So we talked about benchmarks. We talked about microbenchmarks.
We talked about macrobenchmarks, and a little bit about application benchmarks, and we'll come back to these a little bit today as we go through some of the extra material. So this is where we are: we've figured out how to measure the system, and we've decided what the system is going to be doing while we measure it. So I'll pose the question now: what do we have to do next? You have a system that you can measure, and you have something for it to do. What are the next steps? Anybody? What's that? Start to test. Okay, right. So I'm going to start gathering data, and then what am I going to do? This side of the room; I hear things from the other side, but I'm not listening, I'm just listening to this side. You start to collect data, then what do you do with the data? Okay, so I might use some statistical techniques, but to do what? I'm trying to take the data and do what? Analyze: analyze the data, right. So I'm going to run some experiments, collect some data, and analyze the results. And then what am I going to do? Now I've got some results; what was the whole point of this thing in the first place? Left side of the room. Again, I can hear muttering from over there, they're so eager. Whisper it in my ear. I know you know the answer. You don't know? Okay, so I analyze the results, then what? What was the whole point of this exercise in the first place? What's that? Well, I ran the benchmark, right? So I chose and ran the benchmark, collected some results, analyzed the results; now what am I going to do? Use it to improve the system, right. I'm going to decide what to improve and make some changes to the system. Okay, and now what's the last step?
I don't know if there is a last step on the slide, but there is a last step. I've made some improvements, some changes to the system; now what do I have to do to know that the changes really had any effect? I need to test again. I need to go back and restart the cycle: go up to the top, pick up my benchmarks again, rerun them, analyze the results, and try to draw a conclusion. So essentially what we're doing is a form of experimentation. This is in some ways very analogous to the scientific method used by other fields. Here, though, the hypothesis we have is that we can improve the system by making some change to a certain component, and the test we're going to carry out to examine that hypothesis is running a benchmark, collecting results, and seeing if the system speeds up. There's one interesting difference, I think, worth pointing out between what I would call the scientific method and the computer-scientific method, and it's an important observation for when you start to build real systems. In nature, something is either true or it's not true; there's some underlying ground truth to the world, and that's what scientific experiments are trying to discern. When we start talking about engineering, who's making these changes? Is the underlying order of the universe making the change? No. Who's making the change? Some poor sap of a programmer, like you or me. And so there are interesting results in the history of computer science where you start to realize that two things get mixed together: one is whether something is a good idea or not, and the second is whether people did it the right way or not.
Did people succeed in doing it correctly? So it's possible that I can have a hypothesis about some form of software optimization, I can go off and build it, and it doesn't perform very well, and the fault is not that the idea was bad; the fault is that I made a bunch of mistakes when I implemented it and it turned out not to work. So just keep that in mind. Okay, any final questions about where we are before we blast on forward? Going once, going twice. All right: statistics. I don't know, maybe this is just me giving myself a kind of extended help session here, but I think if you look at the papers that come out of the computer systems community, we have somewhat of a tortured relationship with statistics, shall we say. Some of the statistical practices still in use today in the computer systems community are not necessarily the strongest, and I'm speaking for myself here too. One of the reasons I ended up doing computer systems as opposed to other things is that I was too dumb to do mathematics. I was too dumb to do math, then I tried doing physics and I was too dumb for physics too, and then I ended up in computer science, where I felt like maybe I had some degree of refuge from mathematics. Unfortunately, that only goes so far. Math is one of those things: it's everywhere, it turns up in all sorts of places, maybe even places you wouldn't like. But again, these are just observations from my time in the computer systems community. Running an experiment multiple times? Woo-hoo, that's like a badge of honor: I didn't just run my experiment once, see an improvement, and immediately declare victory.
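As a concrete version of "run it more than once and keep the spread," here's a minimal sketch; the workload and the run count are made up for illustration:

```python
import statistics
import time

def benchmark(fn, runs=30):
    """Time fn over many runs and keep every sample, not just one number."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return samples

# Hypothetical workload standing in for the thing being measured.
samples = benchmark(lambda: sum(range(10_000)))
mean = statistics.mean(samples)
stdev = statistics.stdev(samples)
print(f"n={len(samples)}  mean={mean:.6f}s  stdev={stdev:.6f}s")
```

Keeping the raw samples around is the point: the mean and standard deviation (the error bars) come out of them, but so do the histograms and the outliers we'll want to look at later.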
So that's kind of the beginning, and then maybe, if I'm feeling super generous statistically, I'll put error bars on my graph. That'll be the super bonus points, a little bit of extra special sauce. And of course I'll only put the error bars on the graph if they're small; if they're big, then I'll "forget" to put them on the graph. So I think that all of us, as responsible human beings and responsible systems engineers, or future systems engineers and future systems programmers, can aspire to do better. On some level this is just an aspiration, a wish for all of you in your futures as programmers: do better with statistics. Try to get to the real bottom of things. The first thing I find helpful to do when I'm analyzing a system, or thinking about how a system is going to behave under a certain set of experiments, is to make some set of predictions about what's going to happen. One thing you can do is say, okay, I'm going to collect some results and produce a certain type of data visualization or graph, and then just draw a picture of what I think that graph is going to look like, based on my intuition about the system and what I think is true. You draw it on a whiteboard or a piece of paper and say, here's what I think it's going to look like. Then you collect data and you see what happens. So again, one way of doing this is just to draw a picture.
Draw a sketch of the graph that you think is going to come out of a particular set of experiments. The reason to do this is that it helps develop your own intuition. On the other hand, you have to be careful with this, because if the results come out the way you thought they would, that's not necessarily an indicator that you understand the system. You may have made an erroneous set of assumptions about the system, and it just happened that the data looked like what you expected. But at least this helps you when you're completely wrong: if your prediction of what the graph is going to look like and the actual graph are totally different, then something is wrong. Usually the problem is your own intuition about the system; I'd say that's the problem maybe nineteen times out of twenty. The other five percent of the time something's wrong with the experiment: somebody goofed, the simulator is broken, the data wasn't collected properly, whatever. This is a good way to do things. The other thing you can do, especially when you're working with simulators and analytical models, but also when you're working with real systems, is make predictions about simple cases, and make sure that those simple cases, the ones you think you fully understand, look right when you actually do the data visualization. If the system doesn't behave the way you think it will in really, really simple cases, then on the corner cases, the weird things, the stuff you're actually trying to figure out, you're going to be completely lost. All right. The second big thing I think people are guilty of is early and premature use of summary statistics.
A summary statistic is any statistic that uses a single number, or a small set of numbers, to summarize a large number of data points: things like computing the average. I ran the experiment a hundred times, I took an average, and that was it. Averages, I think, are particularly cruel, because an average can be an almost meaningless statistic. Or the median: okay, I computed a median, and at least a median has some sort of well-defined statistical meaning. But these can frequently hide a lot of important information about the data that you've collected, and if you don't look at that data directly you can miss a lot of things. So for example, I'm going to show you two different sets of data. It doesn't matter what these data are from; they're just random things I collected off the internet, I don't know what the data is from. And I'm going to show you two graphs that have, or could have, very similar summary statistics: same mean, same median. So here's the first one: a very narrow, well-peaked distribution. It's got some error, but basically this is some variable I'm measuring that looks fairly consistent. Who can guess what the next graph is going to look like? Anybody want to guess? What about that? Here's a bimodal distribution; in this graph there are two things going on. Imagine again that you're studying something like page-fault performance, and when you measure page-fault handling time you get the graph on the left. What is this graph saying to you?
What does it say to you? Anybody? It looks like a pretty small standard deviation, so there's some consistent performance that the system is experiencing every time. This is probably not what you would see if you collected page-fault statistics, because there are different cases in there, but this looks like something consistent happening every time: there's some error in the measurements that causes a little bit of spread, which is normal, but there's some consistent phenomenon here. What about the graph on the right? What does that graph say to you about the underlying mechanism, about what's happening on this system? Is there some single thing going on? No. In fact, what it looks like is that there are probably at least two different cases mixed together in your data set. Something, some code path or some particular set of circumstances or inputs, is causing the system to behave like this, and a whole other set is causing it to look like that. If you took the average of these data sets you could get the same number, and depending on the spread, if you took the standard deviation you could also get the same number. So without looking at these data sets, you can't tell what's going on. And the set of techniques you'd apply to improve the performance of the system on the left and the system on the right are different. For the system on the right, maybe if I make this second peak go away; maybe there's some bug causing that part of the distribution over there. So anyway, the point is: look at the data.
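To make this concrete, here's a small sketch with made-up numbers: two data sets with the identical mean, where only looking at the raw values reveals that one of them is really two cases mixed together.

```python
import statistics

# Synthetic data: a tight unimodal set and a bimodal set (think fast path
# vs. slow path mixed together). Both have a mean of exactly 100.
unimodal = [98, 99, 99, 100, 100, 100, 100, 101, 101, 102]
bimodal = [50] * 5 + [150] * 5

print(statistics.mean(unimodal), statistics.mean(bimodal))  # 100 100

# A crude text histogram exposes the difference immediately.
def text_histogram(data, lo=0, hi=200, bins=10):
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        counts[min(int((x - lo) / width), bins - 1)] += 1
    for i, c in enumerate(counts):
        print(f"{lo + i * width:6.0f} | {'#' * c}")

text_histogram(unimodal)
text_histogram(bimodal)
```

Here the standard deviations happen to differ, but with the right spread even those can be made to match; the histogram is what gives the game away.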
Look at the raw data: graph it, do histograms, look at probability distributions, look at the data itself before you start computing any sort of summaries of it. If you don't, this will happen to you. The woman who taught this class at Harvard before me has a famous story about a paper they were writing. She was supervising a research assistant, and she kept asking him about this one number in the paper. She kept saying, why is the standard deviation so high? And he kept saying, oh, it's cache effects, you know, something, whatever. And then they finally looked at the data, and it looked like this: two peaks with a massive valley in between them. It turned out there was a really good explanation for why this was happening: there were two separate paths in the code that were causing extremely different performance. Because he had already come to a convenient conclusion, the research assistant was very grumpy about having to actually look into things more carefully, but when he did, he finally found out there was something really interesting happening. So don't miss the chance to look at the data directly before you start to compute more aggregate statistics. All right: outliers. Outliers are themselves really important. Who can tell me what an outlier is? You guys didn't take statistics, or weren't forced to take statistics? So, Sean: an extreme data point, right. If we go back here, an outlier would be a little bump, some new thing, way over here. And outliers are annoying.
Outliers cause the deviation of your data to go way up, which is irritating when you're trying to convince yourself that it's a nice, tightly packed distribution. Outliers tend not to fit into the theories that you have about the data: you come up with a theory that explains this, and explains this, but what about this little blip over here? What happened? It might have only happened once, one time when you ran the experiment, and you got this bizarre, anomalous result. It's really, really tempting to just ignore them. Cut them off; take the graph and just trim off that part of it; don't include them when you run your statistics. Something weird happened, somebody probably inputted the data wrong, or there was some bug in the timing code, or something. And maybe that's true. But I think outliers usually deserve better than that. You should be kinder to them. They are different, they're weird, they don't like to run with the pack, so they can be annoying to deal with. They're like the proverbial lost sheep that keeps wandering off. But go get that sheep, man. You've got to go find out what's going on. Why does he keep wandering over there? Maybe there's something really cool. I've taken this metaphor too far, clearly. But the point is that you should take care of your outliers. Be good to them. You may ultimately need to cut them off; you may need to say, you know what, as far as I can tell you're just not that interesting, you're just confused, this was some sort of weird measurement bug or whatever. But before you do that, you need to spend more time with them and try to understand them. Crazy guys, crazy data.
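One mechanical way to find the sheep that wandered off, before deciding what to do about them, is Tukey's rule: flag anything outside 1.5 interquartile ranges of the middle of the data. A sketch with made-up run-time numbers:

```python
import statistics

def flag_outliers(data, k=1.5):
    """Return points outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule).
    Flagged points are candidates for investigation, not deletion."""
    q1, _, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if x < lo or x > hi]

# Hypothetical run times: one run is wildly different from the rest.
runtimes = [10, 11, 10, 12, 11, 10, 11, 95]
suspects = flag_outliers(runtimes)
print(suspects)  # [95]
```

The lecture's point stands either way: a rule like this tells you which points are weird, not why; that part still means going after the sheep.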
All right. Okay, so now we're getting to the more fun parts. This has been so boring so far, right? We spent all this time worrying about how to measure things, and then we just got through the statistics part, which is also terrible: preachy and boring. So now, great, on to the fun stuff. You've run all your experiments, you've run some benchmarks, you've collected data, you've spent time carefully looking at the data, and now, hey: just improve the slowest part of your system, right? This is easy. I ran all the measurements, I can see what the slowest part is, so this is what I do. No? Who said no? Why not? Okay, so I just want to put something else out there. Even if it were this simple, getting programmers to do anything is impossible. They're kind of like mechanics: if you tell them, hey, go work on that piece of code, they say, why? I like this other piece of code.
This other piece is going to be more fun to hack. So anyway, maybe it's more sociological. But the fact is, and this is one of the deeper lessons here, this is going to happen to you: you're going to write a piece of code, and as you're writing it you're going to think, man, I could do better over there; I remember I took this class in college where they always said keep it simple, so I used a really dumb data structure over there; and I think this part can be improved, and that part could be sped up. Then someone is actually going to come and ask you to improve the performance, and normally when you talk to programmers about improving performance, they've already got the list of things to do in their heads. They've got the ten things queued up that they know are going to improve the performance of the system, because as they were building the code they were leaving little breadcrumbs in there saying, I don't think this works, I don't think this works. It turns out they're wrong all the time, and this is the big pitfall here: someone can come to you, and you can go off and do the stuff that you think is going to improve performance. Why run experiments? Why actually find a benchmark and use it? I know what's slow about the system; I'll just go fix those parts. It turns out there's a massive gap between what programmers think is slow about a system and what is actually slow, even in their own code. All right, but suppose we've already done the right thing: we ran the experiments, we collected data. So now, why wouldn't we want to improve the slowest part?
Let's do a little thought experiment. You took my advice not to write all of your code for a big software project in one function; instead you wrote it in two functions, one called foo and the other called bar. And you've measured this: you ran benchmarks and collected data. At first you lumped the two together, but then you realized, oh man, they're really different. What you found is that foo takes five minutes to execute, and bar takes five seconds to execute. So again, I'm going to vote: let's work on foo. Five minutes? What is going on in there? Are we running this on a two-bit machine or something? This looks terrible, so clearly we're going to immediately start working on foo, right? No? She says no, and Michael said no too. So there are actually two elements that we've missed here. One is what Michael pointed out: the slowest part of the code, if it's never actually being executed, will not slow down the system, and even if it is executed from time to time, it may not be the bottleneck. What's the other? Interesting; so your programmer may have a limited amount of time, you want to be done by 5 p.m. Friday so you can go home. What's the other piece of the puzzle we've missed? What's the other variable that you don't know when you start something like this? Okay, what about your time? What's another element here, what do you really want to maximize in this situation? Okay: analysis time, improvement time. So there's a trade-off here. What is the thing that we want? We want the performance to improve.
And what's the thing we're spending? Our time, right; in this case it's programmer effort. Maybe it takes 30 seconds to improve foo from five minutes to one, or it would take 30 days to get one second off of bar; I can't remember which one is which, but anyway. So the first factor is significance, as Michael pointed out: how much does foo actually matter? We need to determine that before we can make a decision, because that's going to guide our efforts in terms of what to improve. The second factor is difficulty: how long is it going to take to improve foo or bar? Now, there are two things here, but which one are you likely to know more about right now, at this stage in the process? You've run the experiments, you've got some data. What can we evaluate mathematically, analytically, at this point, and what are we going to have to guess at? What's that? Right. We're going to focus on significance, because significance is something we can measure and actually know. Difficulty, again, you can ask the programmer, but who knows? You don't really know how hard it's going to be. So the best thing to do is to start on the part that's causing the problem. All right, let's start with significance. How many people have been exposed to Amdahl's law before, in any setting? Raise your hand. Okay, a couple. Good. So you should know this; this should be the one big punch line I've been building up to that now feels sort of boring. There are many different ways to frame Amdahl's law. There are mathematical formulations that will tell you exactly what the speedup is from improving a certain component of the system, and that's great; but I think, as programmers,
it's more important to really understand some of the more colloquial expressions of Amdahl's law. So here's one I found that I like. Essentially what it's saying is that if you try to improve the performance of a system, the thing that will constrain your efforts is the performance of the parts of the system that you're not working on. This kind of makes sense. Let's say you pick a certain portion of the system to work on. What's the best you could do to that function, that piece of the system? If you could wave a magic wand, if you could build a quantum computer or whatever, what's the limit? What's the best you could possibly do for performance? This should be an easy question: I take a function that takes x seconds to execute; what's the limit on the improvement I can make to that function? No, it doesn't depend; I'm asking for a hard limit. Zero seconds, right: I could reduce it to zero seconds. And actually, sometimes that's achievable. Why would that be achievable? Don't call it at all: maybe the function doesn't do anything useful, so I can just remove it entirely. That's the best case, and honestly the easiest thing to change about a system: just stop executing the worthless piece of code. Awesome. Okay, it's unlikely that's going to happen, but let's go back to our foo and bar example to work through it in more detail.
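The formal version is worth having in your pocket too. If a fraction p of total runtime is affected and you speed that part up by a factor s, the overall speedup is 1 / ((1 - p) + p/s). A sketch (the fractions below are made up for illustration):

```python
def amdahl_speedup(p, s):
    """Overall speedup when fraction p of runtime is sped up by factor s
    (Amdahl's law)."""
    return 1.0 / ((1.0 - p) + p / s)

# Speed up 95% of the runtime by 5x: a big win, but not 5x overall.
print(amdahl_speedup(0.95, 5))      # ~4.17

# Speed up 5% of the runtime "to zero seconds": capped near 1.05x.
print(amdahl_speedup(0.05, 1e12))   # ~1.0526
```

The second line is the magic-wand case from above: even reducing a rarely-significant part to essentially nothing barely moves the total, because the parts you didn't touch still dominate.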
Let's say we can guess at difficulty right again I'm not claiming this is possible, but let's say that we could try it And we have the choice of root reducing the execution time of food from five minutes to one minute or Reducing the execution time of bar from five seconds to one second right and This is you know, this is what on dolls are really starts to Mess with your mind right this is why You know despite the fact that we teach us over and over again People have a really really hard time putting in it to practice right because everything seems better about food right You know for minutes of improvement like wow, that's awesome Like that's a that's a huge performance boost right and also if you look at it proportionately You know 80% I can reduce the execution of food by 80% You know maybe the problem with on does law is teaching this to like upper management Right because if you went to upper management, you were like, hey I improved the fun piece of the you know the run-up of a piece of our code by 80% Right and your buddy who's competing with you for the promotion said oh, yeah, I've reduced my by 20% you know like You know with without some sort of deeper understanding of this right? I mean hey percentages win right? Whoo-hoo I Saved four minutes off the runtime of a quick you're always going to describe it as a critical system function critical absolutely critical so it looks like we have a clear winner here, right and But again, but let's but let's walk through the example right Let's say that this particular program and I should have had these out up to a hundred But whatever assume there's some other function in there, you know idle loop or You know stupid class or something that consumes the other 4.9% all right and let's just say that again let's say that bar bar is where the work gets done Right that that sounds wrong So foo is is not where the work is done foo is foo only happens once in a while All right, we're bar. 
With bar, we just go back to it over and over and over. Okay, so: the speedup for foo. What's the effective speedup? This is the amount of time that I save every time foo executes: foo used to take five minutes, now it takes one minute, so every time foo executes I save four minutes. But the problem is that foo only executes one out of a thousand times, so averaged over all the calls, I'm only saving 0.24 seconds per call. Whereas bar runs all the time: I'm saving one second every time bar runs, and bar is running constantly. You can calculate these out to see what the relative improvement would be, but over a long enough stretch of time the improvement to bar is actually going to beat out the improvement to foo. So there were three things we needed to know to answer this problem. What were the three, including one we haven't talked about yet, the whole difficulty aspect? What were the three components that went into computing this? The first one is: how am I measuring significance here? I'm using benchmarks to determine how often, what percentage of the runtime of a piece of code, is spent executing a certain function. And this is where tracing and other tools can be really helpful.
The trace will tell you: hey, your code spends 95 percent of its time executing bar. Then there will be a bunch of other functions, and way, way down the list is the one you were going to target for improvement. Okay, what was the other numeric thing I needed to know here, the other thing that goes into this calculation? Significance is where that first number comes from: the percentage of time my code spends running a particular function. But where did I get the numbers I'm multiplying by? Yeah: the amount of time by which I could speed up foo. The point is that I could have sped up foo by four minutes in, we're assuming, the same amount of time it would have taken me to speed up bar by four seconds. And then what's the last thing I'm assuming here, the difficulty aspect? I've made this calculation assuming I took the same amount of time and got four minutes of improvement on foo and four seconds on bar. What's the one thing that could go wrong here that could make this turn out in favor of foo? (This example is contrived, and the two options are actually pretty close despite foo being really rarely run, just because foo takes so long.) What's that? No, no, let's say I'm using the right benchmarks. What's that? Programmer effort, right. What if it took me one day to speed up foo by four minutes and a year to speed up bar by four seconds? Okay, sorry.
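To make the arithmetic concrete, here is a small Python sketch. The call frequency (foo runs one call in a thousand) and the runtime fractions (bar 95%, foo 0.1%) are each taken from the lecture's illustration; treat them as rough, independent assumptions rather than one perfectly consistent workload.

```python
# Toy model of the lecture's example (frequencies are assumed numbers):
# foo runs 1 call in 1000, bar runs the other 999.
CALLS = 1000
foo_runs, bar_runs = 1, 999

foo_saving = 5 * 60 - 1 * 60   # seconds saved per foo call (5 min -> 1 min)
bar_saving = 5 - 1             # seconds saved per bar call (5 s -> 1 s)

# Average seconds saved per call by each optimization:
avg_foo = foo_runs * foo_saving / CALLS   # 0.24 s per call
avg_bar = bar_runs * bar_saving / CALLS   # ~3.996 s per call


def amdahl(fraction, local_speedup):
    """Amdahl's law: overall speedup when `fraction` of the runtime
    is sped up by a factor of `local_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


print(avg_foo, avg_bar)
print(amdahl(0.001, 5))   # foo is ~0.1% of runtime: overall ~1.0008x
print(amdahl(0.95, 5))    # bar is 95% of runtime:   overall ~4.17x
```

Either way you slice it, per-call savings or overall speedup, the "boring" four-second improvement to bar dominates the flashy four-minute improvement to foo.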
I wanted to put this in here: when I worked at Microsoft, I worked with people on the server performance team, and they would go out and have a massive party for weeks on end if they could find one cycle, one unnecessary cycle on a critical path, a hot path that gets executed millions of times a second, or a way to trim off one instruction. Did they then take a month off, a big vacation, with all their clients happy? No, because there's always more fat to cut. All right, so here's another colloquial expression of the issue with Amdahl's law, and I think this is the way to think about it as programmers: you have to train yourself to ignore what looks bad. Because as programmers, you're going to know that some part of the code is really kind of gross and ugly, and you didn't really like the way you did that stuff. But this is where the computer science part comes in: figure out where you're bleeding, what is actually hurting you, and work on that. I'm not going to try to metaphorize this, I'd just do a bad job, but be scientists. Get into your system, understand how the system works, and forget the code. Just forget about the code entirely for a while, until your experiments and your profiles and your benchmarks start to lead you in certain directions. Then you can look at the code, because looking at the code might help you ascertain how difficult it's going to be to make certain improvements.
But just forget about what you think is wrong with the code, because the computer doesn't agree; the computer has a very different experience of the code you wrote. Starting from your own intuition is always the wrong thing to do when it comes to performance improvement. It will never lead you in the right direction, and if it does, it's only by accident, which is unlikely to ever happen again. There's also an interesting, unfortunate corollary to Amdahl's law. Amdahl's law is like a whole bag of pain; it's just one unfortunate conclusion after another. So what's the other part of Amdahl's law? Let's say you've done the right thing: you ran your benchmarks, analyzed the data, identified the part of the system you want to improve, and started working on it. As you continue to work on this part of the system, what becomes more and more likely? You break another part. Okay, let's say you're a good programmer and you're not going to break anything; you have great interfaces, good isolation, et cetera. As you work on one part and improve its performance, what's happening at the same time? What's becoming more likely? Right: it's more and more likely that you're now working on the wrong part of the system. You did the right thing, you ran the experiments, you targeted this one piece of code, you spent a couple of days working on it, and you did a great job: that part of the code is no longer the problem. So now it's: okay, sorry, go back to square one, rerun the benchmarks. This is a loop. Once you make a change to a piece of code, you have to... I mean, look, if that piece of code is killing you, if it's 99% of the cycles on the system, then by all means work on it for a week.
But most of the time that's not true. Most of the time there are a couple of different things going on in different parts of the system, and you've got to keep your balance. You can't get too wedded to one piece of code. You've got to get in there, make some improvements, speed it up, step back, rerun the benchmarks, and go on to the next problem. That's the way to get a holistic improvement. All right, I want to finish up with this. There's a great paper by the computer science luminary Butler Lampson, "Hints for Computer System Design," published in 1983. I read it again the other night. It was published at the top systems conference for research papers, back in an earlier era in the paleo-history of computer science when you could actually publish papers that included the word "hints" and were just long, extremely useful sets of advice offered to programmers and system designers. It's actually a really, really good paper. You probably have stereotypes about what research papers are like, but this one doesn't really meet them: it uses quotes from Hamlet, and it's got lots of great example systems that you'll probably have to look up, because again, this is 1983. But there are a lot of really nice systems covered in there, and it's a good view into the history of computer science. The best part, of course, is the advice, because the advice is really good, and as far as I know there's nothing in it that isn't still true in terms of how to look at improving systems. And you can apply it to programming in general: how to be a good programmer, how to complete assignments, how to write good pieces of code.
So what I want to do is walk through a couple of examples. There are a bunch of hints in the paper, and I didn't choose all of them; not all the hints in the paper are about performance. Some are about other things, because performance is only one aspect of building systems. Interface design is another that he talks about in great detail, and another is completeness: how do you meet the requirements for a particular piece of code? But let's look at a couple of these suggestions in the context of virtual memory, because I think maybe you've been thinking about that a little bit recently. Okay. One of his suggestions: cache answers. And in operating systems, caches are everywhere. From the perspective of the operating system, the computer is essentially a series of caches that it is trying to manage efficiently. We already know that one of the things the memory management system is doing is helping the computer load and cache translations from virtual to physical addresses. If I didn't cache those translations, the system would never work; it would be way too slow. So I know about the TLB as a cache. What about a way to apply this hint, cache answers, to managing the TLB itself? Again, the TLB is a cache, but what's one way the operating system might cache information about the TLB? Let me lead you in the right direction here: the information in the TLB is lost on a context switch, right? Why? Maybe this is obvious, but why do I lose state in the TLB when I switch between processes?
Again, maybe an obvious question. Anyone? Carl, you're muttering. Yes, there we go, that's where I was going: the virtual-to-physical mappings. Remember, it's not actually virtual address to physical address; it's (virtual address, process) to physical address. Those are the mappings that are unique. So when I switch processes or switch threads, I have to unload and reload the TLB. Okay, so what's one way... I mean, this seems kind of inefficient, right? I've got this cache, but every time I do a context switch I'm losing state in the cache. So what's one way to cache some information about the cache, something that might improve the performance of the system slightly? One thing I could do with the TLB (and this is not required, or necessarily even going to improve the performance of your system that much, but it's an idea): when I stop a process from running, I've got some important information there. I know the TLB entries it was caching. So rather than just clearing the TLB, why don't I unload the TLB and save those entries? When the process starts running again, rather than starting it with a cold TLB, I'll reload those entries into the TLB and let it keep running. The answer here is the answer to the question: what does this process want in the TLB? Rather than finding that out slowly, after having had to forget what's in the TLB for that particular process, I can keep a copy of the answer around so the question doesn't have to be answered all over again. Okay. Hints. Not just the title of the paper; it's also the title of one of his suggestions. Hints are pieces of information that... I was trying to come up with a really strong technical definition.
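The TLB save-and-restore idea above can be sketched in a few lines. This is a hypothetical software-managed TLB modeled as a dict keyed by (pid, virtual page number); real hardware TLBs and PCB layouts differ, so treat all names here as illustrative.

```python
class TLB:
    """Hypothetical software-managed TLB: (pid, vpn) -> pfn."""

    def __init__(self):
        self.entries = {}

    def flush_and_save(self, pid):
        """On context switch out: instead of just clearing this process's
        entries, save them (e.g., into its PCB) and return them."""
        saved = {k: v for k, v in self.entries.items() if k[0] == pid}
        for k in saved:
            del self.entries[k]
        return saved

    def restore(self, saved):
        """On context switch in: preload the saved entries so the process
        resumes with a warm TLB instead of a cold one."""
        self.entries.update(saved)


tlb = TLB()
tlb.entries[(1, 0x10)] = 0x80      # process 1's cached translation
tlb.entries[(2, 0x10)] = 0x90      # process 2: same vpn, different frame
pcb_saved = tlb.flush_and_save(1)  # switching away from process 1
tlb.restore(pcb_saved)             # later, switching back: warm restart
```

The "answer" being cached here is the process's working set of translations, so the question "what does this process want in the TLB?" doesn't have to be re-answered one miss at a time.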
It's hard to do, but I think the easiest way to explain it is this: hints are pieces of information that you can use to improve performance, but that you cannot use in any way that requires them to be correct. You can use a hint to do some optimization, but you can't rely on a hint for correctness. Does that distinction make sense? Maybe an example would be better. What is one way that the virtual memory management system already uses hints? By "use hints" I mean: take some information about the system and use it to try to improve things, again not in a way that is necessary for correctness. What is this similar to, this mantra we've been coming back to over and over again in this class? Use the past to predict the future. Another way of framing that is: use the past as a hint to what will happen in the future. Just because a process touched a page a few times doesn't mean it has to keep using that page. It can go off and use another page, and I'm not going to stop it and say, hey, you fooled me, you were using that page and I had all this special stuff set up for it, and then you went over there, so now I'll kill you. No, that's not going to happen, because hints aren't used for correctness. Use the past as a hint to what might happen in the future. Okay, so what's one area in the virtual memory hierarchy where this already happens? Page replacement, right. Page replacement algorithms use some form of hints; that's all they have to go on.
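The hint discipline above (a hint may speed things up, but correctness must never depend on it) might be sketched like this. All the names and callbacks here are hypothetical, just to show the shape of the pattern: try the hint, validate it, and always have a slow path that is correct on its own.

```python
def lookup(key, hints, slow_lookup, still_valid):
    """Return the correct value for `key`. The hint table only saves
    time; a wrong or stale hint can never change the result."""
    hinted = hints.get(key)
    if hinted is not None and still_valid(key, hinted):
        return hinted              # fast path: the hint checked out
    value = slow_lookup(key)       # correctness comes from the slow path
    hints[key] = value             # refresh the hint for next time
    return value
```

Notice the validation step: because a hint is allowed to be wrong, anything you do with it has to be checkable (or harmless) before you commit to it.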
They're not going to require that a process tell them what pages it's going to use, because the process doesn't know. And they're not going to get mad if the process goes off and uses pages it wasn't using a second ago. They're just going to use the pattern of page accesses as a hint to what might happen in the future, and try to keep those pages around in case they're used again. All right, I've got two more. Next: compute in the background. Don't leave the user waiting for something to happen. Get back to the user immediately, and then go off and finish the computation, or do any other sort of cleanup, in the background, in a way that doesn't impact interactive performance. Your system could be running flat out at a hundred percent utilization, but as long as Firefox loads up snappily, you don't care; you don't notice. (If you're me: as long as your terminal continues to display the prompt, you're fine.) So what's something that virtual memory systems already do in the background to improve performance? We've talked about this. Swapping, specifically swapping in the background, which is usually referred to as something else. Yeah: page cleaning. I walk through the core map and write out changes to disk, and if I do that cleverly, I can use bandwidth that's available when the user is not using the disk heavily, so that when I actually have to fault a page out or in, I've got a bunch of clean pages that are easy to get rid of.
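The page-cleaning idea can be sketched as a background sweep over the core map. The core map is modeled here as a list of frame dicts, and the disk-idle check and write callback are hypothetical stand-ins for whatever the real kernel provides.

```python
def clean_pages(core_map, write_to_disk, disk_is_idle):
    """Background cleaner: while the disk is otherwise idle, walk the
    core map and write dirty frames out, so a later eviction finds
    clean pages that can be dropped without any I/O."""
    cleaned = 0
    for frame in core_map:
        if not disk_is_idle():
            break                  # yield the disk to foreground work
        if frame["dirty"]:
            write_to_disk(frame)
            frame["dirty"] = False  # now cheap to evict later
            cleaned += 1
    return cleaned
```

The point is that the expensive work (the writes) happens off the critical path, so the user-visible page fault only has to pick a victim, not wait for a write-back.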
All right, one more hint: shed load. This is an interesting one; it has to do with your expectations as a programmer and the expectations of your system. My PhD advisor did work, when he was a PhD student himself, on web servers that do a better job of managing overload. You can imagine that web servers are built for a certain degree of load. In a presentation he gives (I don't know where he got them), he has these great graphs of the hit rates that CNN.com was seeing on 9/11; that was not a normal morning in September. Unfortunately, a lot of web servers, when they get into that sort of situation... You can measure the throughput of a web server in page requests per second that it can serve. The problem with a lot of web servers is that once they hit a certain threshold, they actually degrade. Rather than getting up to serving 10,000 requests per second and being able to continue serving 10,000 requests per second even when they're hit with a hundred thousand requests, once they get to a hundred thousand they can barely do anything. They're falling all over themselves, and nothing happens. So you sit there hitting refresh, refresh, along with everybody else, and nobody is getting any service. That's an example of how not to shed load. But in what case could the virtual memory system get into trouble and have to start shedding load, in some elegant or inelegant sort of way? What would I do if I start to thrash? That's a great question. Thrashing: I don't want to get into a thrashing situation. What can I do?
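The overloaded-web-server story can be captured in a deliberately contrived toy model. The collapse curve for the non-shedding server below is made up for illustration; the only point is the qualitative shape: a server that sheds load holds its peak throughput under overload, while one that doesn't falls off a cliff.

```python
def served(offered, capacity, sheds_load):
    """Toy throughput model (the overload curve is contrived):
    requests/s actually served for a given offered load."""
    if offered <= capacity:
        return offered                     # underloaded: serve everything
    if sheds_load:
        return capacity                    # serve capacity, reject the rest
    return capacity * capacity / offered   # thrashing: degrades with load

print(served(100_000, 10_000, sheds_load=True))    # still serves 10000/s
print(served(100_000, 10_000, sheds_load=False))   # limps along at 1000/s
```

Rejecting a request cheaply (a fast 503, say) is what keeps the shedding server on its flat line: the work of saying "no" has to cost far less than the work of saying "yes."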
Fundamentally, you just start killing processes. Well, first of all, if I start running low on memory, I can stop launching new processes. How many people use Mac OS X? I just thought of this because... have you ever filled up the disk almost to the very, very tippity-top? The Mac has this great feature where the terminal tells you it has stopped storing all of your scrollback history in order to conserve space. It's like, oh, you've got less than a gigabyte left, so I'm going to stop storing your scrollback history. I was like, thanks, that seems like really the first thing I would stop doing. But anyway, that's some form of load shedding. So you could certainly stop launching new processes. And when you actually run out of memory, what do a lot of systems do? They just start killing processes. What else can you do? There's really no way out. You set up a firing squad and try to find the victims whose deaths are going to help you get out of this mess, and then you send an email to the sysadmin saying, hey, something bad happened, and I had to kill a couple of processes. All right, that's it for today. Wednesday and Friday we are going to talk about virtualization, and that is going to blow your minds. Because I've convinced you, I think, that the operating system has this privileged relationship with the hardware, and virtualization is like the moment in the Matrix series where that old dude with the beard and the white clothes tells Neo that they've destroyed Zion a hundred times before: this is not unique. We'll talk about virtualization and how it works. See you on Wednesday.