Thanks, Tim. So this is joint work with Reza and with Martin Ehmsen from Denmark. What's the motivation here? In the analysis of online algorithms, there is often a large theory-practice disconnect. The kinds of things that we theoretical people predict about algorithms are not borne out in reality, and the behavior that systems people observe in practice for many of our algorithms is not the one we predict from our theory. So it's very frustrating to have a conversation with a practitioner about one of the algorithms we design, because we cannot speak the same language: they don't like what we do, and we don't understand what they do. This is a problem that pervades online algorithms, and in this talk we're going to study the particular case of paging and caching.

I assume most of you are familiar with paging, but for those of you who saw it too long ago in undergrad, here's a ten-second review. That's about ten seconds, right? There's a cache of size k, which is a fast memory, and there's a slow memory, and on each request we're trying to decide which page gets evicted from the cache, or not. The better-known algorithms are LRU, least recently used; FIFO, first in first out; and flush-when-full (FWF), which is really bad but is very easy to implement in hardware, so it exists out there. There are also some classes of algorithms called lazy algorithms and marking algorithms. There are others, but these are the more famous ones.

And as you know, we always study these things under the competitive ratio. We, I mean theoreticians; people in the field do not use the competitive ratio, but we do. The competitive ratio takes the cost, the number of faults, of the online algorithm, compares it to the number of faults on the same sequence of an offline algorithm that knows the future, and then takes the worst case of that ratio over all sequences and calls that the competitive ratio. And as we know, in practice for paging it doesn't work. It predicts, for example, that LRU is as bad as FIFO, which is as bad as FWF, whereas in reality LRU is better, FIFO is so-so, and FWF is really bad. So this result is not useful in practice.

If you think about it, what offline optimum is trying to do, in some sense, is measure how easy or hard the input was. The problem is that it's not a very good measure of that, because for most problems offline optimum actually measures two things: it measures how hard the input was, but it also measures how much it would help you to know the future. The input could be really hard, but if you know the future it becomes easy; it was still a hard input. So we're really combining two different quantities, with different units, into one number, the performance of offline optimum. We are adding your weight plus the time of day. This is a bad measure; it's adding two different things. We would like to know which inputs are easy, and on those inputs we want our online algorithm to do well; we would like to know which inputs are hard, and on those inputs our online algorithm is allowed to do worse. Offline optimum, which is somewhat attempting to measure that, is not exactly capturing it.

A second problem of the competitive ratio is that it's adversarial in nature. It looks at the worst case, it creates the worst possible sequence, and we claim that paging is not a good place to use worst-case analysis. There is room for adversarial analysis; I guess the best example of a place where adversarial analysis is very good is in crypto, right?
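For concreteness, here is a minimal, self-contained sketch of these policies and of the quantities the competitive ratio compares. This is my own illustration, not code from the talk; the function names and the toy sequence are made up, and Belady's furthest-in-future rule stands in for the offline optimum (for paging it is in fact optimal). Note that the printed ratio is per-sequence; the competitive ratio is the worst case of this ratio over all sequences.

```python
# Minimal paging simulators: count faults for LRU, FIFO, FWF, and
# Belady's offline "furthest in future" rule standing in for OPT.
from collections import OrderedDict

def lru_faults(requests, k):
    cache, faults = OrderedDict(), 0   # keys kept in recency order
    for p in requests:
        if p in cache:
            cache.move_to_end(p)       # hit: refresh recency
        else:
            faults += 1
            if len(cache) == k:
                cache.popitem(last=False)   # evict least recently used
            cache[p] = True
    return faults

def fifo_faults(requests, k):
    cache, queue, faults = set(), [], 0
    for p in requests:
        if p not in cache:
            faults += 1
            if len(cache) == k:
                cache.remove(queue.pop(0))  # evict oldest arrival
            cache.add(p)
            queue.append(p)
    return faults

def fwf_faults(requests, k):
    cache, faults = set(), 0
    for p in requests:
        if p not in cache:
            faults += 1
            if len(cache) == k:
                cache.clear()          # flush the whole cache when full
            cache.add(p)
    return faults

def opt_faults(requests, k):
    # Belady's rule: on a fault, evict the cached page whose next
    # request lies furthest in the future (or never comes).
    cache, faults = set(), 0
    for i, p in enumerate(requests):
        if p in cache:
            continue
        faults += 1
        if len(cache) == k:
            def next_use(q):
                for j in range(i + 1, len(requests)):
                    if requests[j] == q:
                        return j
                return float('inf')   # never requested again
            cache.remove(max(cache, key=next_use))
        cache.add(p)
    return faults

seq = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
for name, f in [("LRU", lru_faults), ("FIFO", fifo_faults),
                ("FWF", fwf_faults), ("OPT", opt_faults)]:
    print(name, f(seq, 3), "faults; ratio vs OPT on this sequence:",
          f(seq, 3) / opt_faults(seq, 3))
```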
In crypto, there is an adversary who is really trying to get you. So there's definitely room for adversarial analysis in other places, but not for paging. If you think about it, programmers, at least the good ones, like to write code that has high locality of reference. Compilers, when they compile said code, try to increase locality of reference. So the sequences you're getting, rather than being adversarial, are being purposely built in the nicest way possible for your algorithm. And if somebody happens to give you code that has no locality of reference, it's perfectly okay to tell them: look, the computer is going to thrash on your code; you have no locality of reference; go write better code and don't come back and bother me later. And systems people are perfectly happy to tell that to people. So why are we studying these algorithms on the worst possible sequence when, in reality, we should be studying them on sequences that are nicely built by a good algorithm designer and nicely compiled by a good compiler?

We're not the first people to observe this disconnect; we're not the first people to notice that the competitive ratio fails. Here's a sample list of papers where people have criticized the competitive ratio, and if you look at the names, it's a who's who of people who work in online algorithms. At some point, my student Reza and I wanted to find the first paper in which somebody said the competitive ratio doesn't work. So we started looking farther and farther back into the past for the first paper that had observed that the competitive ratio was not producing meaningful results. And we ended up finding the original paper by Danny Sleator and Bob Tarjan, which in section two introduces the competitive ratio, and in section three says: but it doesn't work, therefore we need resource-augmented competitiveness. So this is really not a new idea. And it's surprising: given that even from the beginning people were claiming it doesn't work, why are we still using it, right? I think there are good answers to that. I'm not going to go into them right now, but it's an interesting question why the competitive ratio has survived so long.

So people have gone back and looked at this problem of separating LRU, FIFO, and FWF, and tried to provide models that separate between them for the case of paging. Again, this is just a sample of the work in the field; there's more work than this. If you're interested in all of it, you can look at our survey, which is now six years old but contains everything before that. And as you can see from the venues, STOC, FOCS, SODA, SODA, STOC, STOC: answering this separation question is considered a fundamental question. Even partial progress on this question makes it to the top venues in the field.

And finally, in 2007, using ideas of Susanne and her student, we were able to provide a deterministic model that fully shows that LRU is best, FIFO is so-so, and FWF is bad. So I'm going to talk a little bit about how that goes and the ideas behind it. The first thing we showed in that paper was that if you do not somehow give different weights to different input sequences, you cannot possibly separate them: if you treat all input sequences the same, all algorithms are equally good, or equally bad.
So somehow you're going to have to classify your inputs and say: on the ones that are nice, I have to do really well, and on the ones that are not nice, I'm allowed to do worse. In some sense, offline OPT sort of does that, right? The hard inputs have a higher offline cost. The problem with offline OPT, as I argued, is what it actually measures. So we need something that is a measure of the easiness or difficulty of the input, of its locality of reference, something that classifies the inputs, but is not offline OPT. People had observed that before; we were the first ones to give a theorem, but the observation was there much before us. People have said, well, we need some measure of locality of reference or difficulty, and people have proposed various alternatives: Torng's measure in FOCS, Susanne's, which is a nice idea that appeared in STOC, and so on.

So in our paper in SODA '07, we used the ideas from Susanne to introduce locality of reference to weigh the inputs in a very natural way, and showed that under that natural weighing of the inputs, LRU is best. And this work has been extended: bijective analysis has been used in other settings, the k-server problem, bin packing. So the idea of bijective analysis has been successful in separating algorithms in places where no one was able to separate before, some instances of k-server, some instances of bin packing.

However, one of the problems of bijective analysis is that it is a difficult analysis, and for a model to be good, it has to be effective: if the proofs are very complicated, people won't use it. Susanne, for example, mentioned on Monday that one of the reasons we don't do average-case analysis is that it's really hard. If it were easy, we would do it more often, but since it's so hard, we just walk away from it; it's just too difficult. So we asked: is it possible to introduce another model that produces results as good as, or close to, bijective analysis, but is a lot more effective? We started thinking about how to weaken this very fine measuring tool to obtain a somewhat less fine tool that is much easier to use. And we stumbled on this idea of parameterized analysis, which Susanne also discovered independently. If you look at the papers, you see the same general ideas but different particular details. I've taken to calling parameterized analysis a joint result, D-L-L-A, because it's essentially the same result discovered by different people independently at the same time. We have further extended this to the working set measure, in a paper that appeared this year at WAOA. And next year at SODA, Gabriel Moruz and his student Andrei Negoescu also extend parameterized analysis to the competitive ratio. If I have time, I'll talk a little bit about that at the end.

So what is parameterized analysis of online algorithms? We take the performance of the algorithm on an input sequence and we try to bound its cost not only by the size of the input, but by a measure of difficulty, a parameter d. So n is the input size and d is the input difficulty. This is very similar to parameterized analysis of FPT algorithms, where we bound the running time, as opposed to the cost, in terms of a polynomial in the input size and a parameter k that somehow represents the difficulty of the input. A very interesting observation is that parameterized analysis subsumes offline optimum: one possible measure of difficulty is offline OPT itself. So we're not giving anything away.
You can still do offline OPT in parameterized analysis, but you can choose a better measure if you so wish. If you do the simple math, you can rewrite the competitive ratio in the parameterized form A(σ) ≤ f(n, d(σ)), where the difficulty d(σ) is now the performance of offline OPT on σ.

So the measure of locality of reference that we introduced, which is slightly different from Susanne's, is what we call the non-locality of a sequence. You request a page and you ask: how long ago was the previous request to this page? If the sequence has low locality of reference, the previous request was a long time ago; if the sequence has high locality of reference, that request happened nearby. Then I take the next request and again ask how long ago the previous request to that page was, and I add up these distances into the past. So if the sequence has high locality of reference, each of these distances is small, and when I add them across the entire sequence, the total is small. If the locality is low, this number is high. And you can see the numbers on real traces come out really, really low: around five percent of the maximum value, one percent of the maximum possible value. So this shows that real sequences are really not adversarial. This number could be as bad as 100 percent if there were no locality, and it's in the low range, below 10 percent.

Once you introduce this parameter λ, you can measure the performance of any paging algorithm, and it falls in the range between λ/(k+1), k being the size of the cache, and λ/2. Lower numbers mean fewer faults; larger numbers mean more faults. So the best algorithm would be at one end of this range, the worst algorithm at the other. The proof is rather simple. I'm not going to go through it, but it fits on a page. Why do I show a proof that fits on a page? It shows the model is effective: it's easy to prove these things.

So we can quickly prove that LRU is optimal, because it's not hard to show that its cost is at most λ/(k+1), which is the left end of the interval I showed on the previous slide; therefore it is optimal. And the proof is a one-liner, yet again, effectiveness of the model. It was very hard to prove LRU optimal in the other models; here it's a one-liner. We can also analyze general classes of algorithms, such as conservative or marking algorithms, and show that they lie somewhere in the middle of the range, whereas an arbitrary algorithm can be as bad as λ/2; so this shows that they're much better than other algorithms out there. And that proof also fits on a page. More interestingly, it is the same proof that shows that all marking algorithms are k-competitive under the competitive ratio. We grab the same proof, replace OPT by λ, and obtain a better result. Is this an accident? No. What happened is that when people proved that all marking algorithms are k-competitive, they built the adversarial sequence very carefully. They really thought about the key properties of marking algorithms and constructed a really nice, very carefully built adversarial sequence. And then they grabbed this horrible, coarse ruler called offline optimum, and they were not able to get much out of it. But it wasn't because the sample wasn't good; it was because the ruler they used to measure the sample was bad.
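To pin the definition down, here is a small self-contained sketch, my own illustration rather than code from the paper: measuring the distance as the number of distinct pages requested strictly in between, and charging first requests nothing, are assumptions that may differ in detail from the paper's exact definition of λ.

```python
# Sketch of the non-locality parameter lambda(sigma): for each request,
# look back to the previous request of the same page and charge the
# number of DISTINCT pages seen strictly in between.  (This convention,
# and charging first requests zero, are assumptions for illustration.)
from collections import OrderedDict

def non_locality(requests):
    last, lam = {}, 0            # last: page -> index of previous request
    for i, p in enumerate(requests):
        if p in last:
            lam += len(set(requests[last[p] + 1 : i]))
        last[p] = i
    return lam

def lru_faults(requests, k):
    cache, faults = OrderedDict(), 0
    for p in requests:
        if p in cache:
            cache.move_to_end(p)
        else:
            faults += 1
            if len(cache) == k:
                cache.popitem(last=False)
            cache[p] = True
    return faults

# Under this convention, LRU faults on a repeat request only when at
# least k distinct pages intervened, so each such fault costs at least
# k units of non-locality, which is the shape of the lambda-based
# bounds described in the talk.
seq = [1, 2, 1, 3, 1, 2, 4, 1, 2, 3] * 3
print("lambda =", non_locality(seq), " LRU faults =", lru_faults(seq, 3))
```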
You throw away this bad offline-optimum ruler, replace it with this finer locality-of-reference ruler, and the same sample gives you a much better answer, because you measure it better. And this happens not only for this proof; it happens for almost all of the proofs we have here. We can reuse previous adversarial sequences introduced for the competitive ratio, because they were built very carefully by our fellow researchers, and just by having a finer ruler, things that before all measured k on the offline-optimum ruler here measure λ/2, λ/3, and different values. And the fact that the proofs are easy means the model is effective: it's easy to prove things in this model. So this speaks really well for the model, if I may be immodest about it, but it's a really bad property for publishing papers, right? It's hard to publish papers where the proofs are easy. But in terms of obtaining results, you get a lot out of it. So here's a sample of them, with matching bounds. And if you want to see the entire table: we've been able to analyze a ton of algorithms, obtaining different values, including some randomized algorithms, and even the offline algorithm can be analyzed in terms of the locality of reference. The offline optimum is about three times better than LRU.

Now, I think if you introduce a model and try to argue that it's better, it's only intellectually honest to also take the opposite side and convey its deficiencies, or its possible caveats. One natural question to ask about this model is: well, you introduced a measure of difficulty, non-locality of reference; it seems rather arbitrary; it's not the same as Susanne's, it's close to Susanne's; so why yours, why not Susanne's, why not something else? There are three answers to that.

First, for some problems there's only one natural measure. For example, for many things having to do with compression or randomness, entropy, or k-th order entropy, is the one natural measure.

Second, for some other problems, and this is the case for paging, the results are robust under any reasonable choice of measure. We have Susanne's measure, our measure, and the one we proposed at WAOA, and they all give the same results. What this is saying is that you don't have to worry about getting the measure exactly right. Here's an analogy: think about the RAM model. In the RAM model we claim that an AND and an addition have the same cost, or an OR and an addition have the same cost. As long as they're within a constant factor of each other, your analysis still holds; you don't need to work out exactly how many CPU cycles your addition or your OR takes. So long as it's a constant, your result will hold. And that's what we're saying here: the results hold under small perturbations of the measure; propose any natural measure and all the results hold.

And third, the measures we're using right now, offline optimum or input size, are also somewhat arbitrary. Offline optimum brings in this knowledge of the future, which is not very relevant. And for general classical complexity results, we have chosen to measure the difficulty of the input merely by input size: we say if the input is bigger, it's harder; if the input is smaller, it's easier. Which is generally true, but not 100% true, right? You can have very large inputs that are trivial.
You can have very small inputs that are hard. So for every measure there's a bit of arbitrary choice. Another drawback of this model is that the measure of difficulty is problem-specific. We don't have something like offline optimum, which applies to all online problems; here, for every problem, we need to define the parameter we care about. Again, some measures carry across various problems: anything for paging carries over to list update, at least. But yes, sometimes the measure can feel arbitrary. We have the same situation, for example, for FPT algorithms, where there are many different parameters, treewidth, the degree of the graph, whatever; we just have a plethora of choices, and you select the one you need.

And the third drawback: well, yes, it would be great if we could have a single measure that applied across all problems, so we wouldn't have to define problem-specific measures. That would be great, but do we have any evidence such a thing exists? After 25 years of playing with the competitive ratio and offline optimum, maybe this is an ideal that doesn't exist. It's like world peace: it would be great if world peace existed, but it's just not there, right? And therefore you have to work on the assumption that it's not there. Also, maybe it's even foolish to ask for a unique measure. Think about a physical object, such as this pointer. If a physicist wants to measure this object, they can give you many different measures: the weight, the density, the hardness, the volume. And it would be silly to demand that the physicists reduce it down to one number. How come you have so many different measures? Well, because there are so many different things to measure, right? And what do you measure? It depends on what you want to use the number for, right? So maybe it's even silly to expect a single measure of difficulty; maybe there isn't one to be had. Okay, I'll take one quick question.