Our next speaker is Susanne Albers from Humboldt University, Berlin.

Thank you very much. It's a great pleasure to be here. I would like to talk about competitive analysis and approaches to overcome negative worst-case results in the design and analysis of online algorithms. These approaches have been around for more than 25 years, so in the first part of this talk I will briefly survey some of the existing techniques, and in the second part I will study the list update problem, a cornerstone problem in the theory of online algorithms, again focusing on refined analysis techniques.

Online algorithms have been investigated extensively over the past decades. In an online problem, decisions must be made with incomplete information about the future. In the general scenario, the input arrives incrementally over time: we receive the input as a sequence of input portions I1, I2, I3, and so on, and whenever a new input portion arrives, an online algorithm has to compute output; it has to react without knowing any future input. Despite the handicap of not knowing the future, we seek algorithms that achieve a provably good performance. In their seminal paper, Sleator and Tarjan introduced competitive analysis, where an online strategy is compared to an optimal offline strategy. The optimal offline strategy knows the entire input in advance and can always construct an optimal solution. An online algorithm A is called c-competitive if, for all input sequences, the cost of the solution computed by A is at most c times that of an optimal solution for that sequence. This is a strong worst-case performance guarantee, in the sense that a competitive algorithm has to perform well on all possible inputs, which might even be generated by an adversary.

For many online problems, competitive analysis works very well: it leads to small constant-factor performance guarantees. But there also exist problems where the competitive framework leads to overly pessimistic results, and this became apparent already in the very first paper by Sleator and Tarjan, where they also analyzed the classical paging problem. There we have to maintain a two-level memory system consisting of a small fast memory and a large slow memory, and the goal is to serve a sequence of requests to memory pages so as to minimize the total number of page faults. Sleator and Tarjan showed that popular online paging algorithms such as Least Recently Used (LRU) and First In First Out (FIFO) are k-competitive, where the parameter k is the number of pages that can simultaneously reside in fast memory, that is, the capacity of the fast memory. This factor k is best possible as far as deterministic algorithms are concerned; no deterministic strategy can do better. While these results are interesting from a mathematical point of view, they are not very meaningful from a practical point of view, because a real fast memory, a real cache, can usually store several hundreds or even thousands of pages, so a competitiveness of k is huge. On the other hand, in practice both LRU and FIFO perform well relative to the optimum; one observes very small approximation ratios. The second drawback is that in practice LRU typically outperforms FIFO, and this does not show in competitive analysis either. These phenomena of very pessimistic competitive ratios occur in other online problems as well.
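Written out, the guarantee in Sleator and Tarjan's framework reads as follows (the additive constant alpha is the usual convention in the literature, not something stated explicitly in the talk):

```latex
% c-competitiveness: the online cost is within a factor c of the optimal
% offline cost on every input sequence, up to an additive constant.
\[
  A(\sigma) \;\le\; c \cdot \mathrm{OPT}(\sigma) + \alpha
  \qquad \text{for all input sequences } \sigma ,
\]
% where OPT(sigma) is the cost of an optimal offline solution and alpha is
% a constant independent of sigma; alpha = 0 gives strict competitiveness.
```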
There exist basically three approaches to overcome such negative results. Let's say I listed three approaches; Anupam will talk tomorrow about yet another approach, but this slide shows three approaches that are very prominent.

One is resource augmentation. Here we give an online algorithm more resources than an optimal offline algorithm. The online algorithm is handicapped by not knowing the future, so why not give the online strategy a bit more resources? In a memory management problem we could provide the online algorithm with some extra fast memory; in a scheduling problem the online strategy might be given faster machines; and more generally we might give an online strategy more flexibility to serve the input. The second approach is to define refined performance guarantees, where we relax the constraints of strict competitiveness. A third, in my view very interesting and fruitful, approach is to characterize real inputs: in practice the inputs are not generated by an adversary but often have a special structure; for instance, in memory access problems they exhibit what is called locality of reference. In the following I would like to briefly review each of these techniques.

Let me start with resource augmentation. We can consider again the paging problem and assume that an online strategy is given k pages of fast memory, so the online algorithm has a cache capacity of k, whereas the optimal offline algorithm has only h, smaller than k, pages of fast memory. Sleator and Tarjan, in the original paper, showed that in this case the competitiveness of deterministic strategies drops to k/(k-h+1). As you can see, when k grows relative to h, this performance guarantee approaches 1, which is very nice; we obtain small approximation guarantees.

In scheduling, as I said, we can assume that an online algorithm is given faster machines: while the optimal strategy has speed-1 machines, the online algorithm might be given machines of speed 1+epsilon, for any positive epsilon. This framework of resource augmentation was proposed in a very nice JACM paper by Kalyanasundaram and Pruhs. They considered the classical problem of minimizing the total flow time of jobs on a single machine, a prominent classical problem. In standard competitiveness it is not possible to achieve a constant competitive ratio; the competitive ratios depend on the number n of jobs. With resource augmentation one can achieve a competitive ratio of 1 + 1/(1+epsilon), so for instance, very nicely, using a speed-2 machine we get down to a guarantee of 1.5, which is nice. This result is also achieved by a very natural algorithm, shortest elapsed time first, in the non-clairvoyant setting where we have no information about the job sizes. There exist many more papers in this framework which I do not mention; I just mention one recent result, published at STOC 2009, where the authors basically extended this result to parallel machines: we have unrelated parallel machines and want to minimize the weighted sum of flow times.
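To see how quickly the paging bound above improves with extra memory, here it is worked out for a concrete choice of parameters (the choice k = 2h is ours, for illustration):

```latex
% Sleator-Tarjan (h,k)-paging guarantee for deterministic algorithms:
\[
  \frac{k}{k-h+1}\;\Big|_{\,k=2h} \;=\; \frac{2h}{h+1} \;<\; 2 ,
  \qquad
  \frac{k}{k-h+1} \;\longrightarrow\; 1 \quad \text{as } k/h \to \infty .
\]
```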
I mentioned that people have also looked at the approach of giving an online algorithm more flexibility to serve the input. I removed this material, but since I have a few extra minutes I can maybe mention it now. So what could this flexibility mean? Rajiv Madhvani proposed an approach where, in a parallel processing environment, you are able to migrate jobs, so you are allowed to do a limited number of reassignments among the machines, which I think is very natural. There is another approach where you are given a job buffer. Some of you might know the classical problem where you want to minimize the makespan on parallel machines, a classical problem investigated by Ron Graham in the 1960s: you get a sequence of jobs and want to minimize the makespan, the latest completion time of any job. Graham proposed a greedy algorithm, list scheduling, where you always put a job on the least loaded machine. This is a 2-competitive algorithm, and you can hardly do better; it is almost optimal, as you cannot get below roughly 1.9 in terms of performance guarantee. Now suppose we get a small job buffer: in addition to the machines we have a small buffer, any incoming job is first placed in this buffer, and the scheduling algorithm then takes some job from the buffer. With this additional job buffer you can get down to a factor of, I think, 1.45, so there is a significant drop in the performance guarantee.

OK, then the second approach, refined performance guarantees. Various concepts have been proposed, ranging from loose competitiveness to bijective and parametrized analysis and to various worst-order guarantees. Here I would maybe just mention the diffuse adversary, introduced by Koutsoupias and Papadimitriou in a paper entitled "Beyond competitive analysis", which is very close to the topic of this workshop. They proposed the following framework: assume that the input is generated according to a probability distribution D that comes from a known class Delta of distributions, and analyze the ratio of the expected online cost to the expected optimum cost. This framework allows us to restrict the input simply by looking at smaller classes of probability distributions.

Of course this is closely related to the third framework, where we want to model real inputs. This is a promising approach, because in practical applications the inputs are not generated by an adversary; the challenge is always to find suitable models that characterize the input. For instance, in the paging problem, real-world sequences exhibit locality of reference, meaning that when a memory page is requested, it is likely to be requested again in the near future. There exists a considerable body of literature on paging with locality; again, the challenge is to find good models capturing locality, and on this slide I mention three models (a few more exist in the literature). A seminal paper addressing paging with locality introduced access graphs: we are given a graph, which may be directed or undirected, whose vertices represent the memory pages, and two pages may be referenced one after the other only if they are adjacent in the access graph. Alternatively, one can model locality using Markov chains. Finally, I would like to briefly mention a third model, proposed by myself a few years ago. It relies on Denning's working sets, a concept that you find in standard textbooks on operating systems. Denning observed that if, in real-world sequences, you determine the so-called working set size, that is, the number of distinct memory pages referenced in windows, in subsequences of length n, then for variable n you obtain a concave, very slowly increasing function, reflecting the fact that even in larger windows only very few distinct pages are referenced. These concave functions yield a very simple framework for modeling locality.
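As a small illustration of this model, here is a sketch that computes such a working-set function from a reference string, under the reading that f(n) is the maximum number of distinct pages in any window of n consecutive requests; the function name and the trace are made up for illustration, not the original experimental setup.

```python
# Sketch: Denning-style working-set function of a memory reference string.
# f(n) = maximum number of distinct pages referenced in any window of n
# consecutive requests. On traces with locality, f grows very slowly.

def working_set_function(trace, max_window):
    """Return [f(1), ..., f(max_window)] for the given reference string."""
    f = []
    for n in range(1, max_window + 1):
        best = 0
        for i in range(len(trace) - n + 1):
            best = max(best, len(set(trace[i:i + n])))
        f.append(best)
    return f

# A made-up trace over three pages, exhibiting locality of reference:
trace = [1, 2, 3, 3, 2, 3, 3, 3, 2, 2, 1, 1, 1, 2, 1]
print(working_set_function(trace, 10))
# [1, 2, 3, 3, 3, 3, 3, 3, 3, 3]: a concave, slowly increasing function;
# even in windows of length 10 only three distinct pages are referenced.
```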
For each of these models one can then do refined analyses of paging algorithms and achieve performance guarantees smaller than k. This concludes my brief survey, and let me move on to the second part, where I would like to study another problem with respect to data modeling and locality of reference: the list update problem, which besides paging is a very basic and very central problem in online algorithms; the initial paper by Sleator and Tarjan addressed both paging and list update. I would like to present results that were obtained jointly with Sonja Lauer.

The list update problem is also a very classical problem, with a large body of literature, with papers dating back even to the 1950s and 1960s. In this problem we are given an unsorted linear linked list; the important aspect is that the list is not sorted at all. As input we receive a request sequence sigma, where each request specifies an item in the list. To serve a request, an algorithm starts at the front of the list and searches linearly through the items until the desired item is found. Here, to serve the first request to item x, we start at the head of the list and traverse the preceding items until we hit item x. Serving a request incurs cost: more specifically, serving a request to the i-th item in the list incurs a cost of i, so the service cost is equal to the depth of the referenced item in the list. After a request, the referenced item may be moved, at no extra cost, to any position closer to the front of the list; this can lower the cost of subsequent requests. In this example it would be a good idea to move the referenced item to the head of the list, because then the second request to x could be served at a cost of 1. Again we are in an online setting where decisions must be made without knowledge of any future requests. The goal is to serve the entire request sequence so that the total service cost, the total access cost, is as small as possible.

Basically, these linear lists are a solution to the dictionary problem. Such lists are sensible if we want to maintain a small dictionary consisting of only a few thousand items, say in a compiler. Another interesting application is data compression: using these lists it is possible to build very effective data compression schemes.

Early work on list update assumed that request sequences are generated according to probability distributions; over the past years people have then looked at competitiveness, and the most important results are as follows. As far as deterministic algorithms are concerned, Sleator and Tarjan showed that Move-To-Front is 2-competitive. This is a very simple and elegant algorithm that, as the name suggests, simply moves a requested item to the front of the list. The factor 2 is best possible; no deterministic strategy can beat it. There exists yet another deterministic 2-competitive algorithm, called Timestamp; this algorithm is relevant basically because it can be used to build good randomized algorithms. On the other hand, other well-known popular strategies such as Transpose and Frequency Count are not constant competitive; their performance guarantee depends on the list length, the number of items in the list. Using randomization one can, not surprisingly, beat the factor of 2: Reingold, Westbrook, and Sleator proposed the BIT algorithm, which moves a requested item to the front on every second request to it, but in a randomized fashion. This strategy has a performance guarantee of 1.75.
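A minimal sketch of the cost model just described, with Move-To-Front: accessing the i-th item costs i, and the requested item is then moved to the front for free. The function name and the toy example are ours, not from the talk.

```python
# Sketch: serving a request sequence with Move-To-Front (MTF) in the
# standard list update cost model. Accessing the item at position i
# (1-based) costs i; moving the requested item forward is free.

def mtf_cost(initial_list, requests):
    """Return the total access cost MTF pays on the request sequence."""
    lst = list(initial_list)
    total = 0
    for item in requests:
        i = lst.index(item)        # 0-based position of the requested item
        total += i + 1             # access cost = depth in the list
        lst.insert(0, lst.pop(i))  # move the requested item to the front
    return total

# The example from the talk: x sits behind y, so the first request to x
# costs 2; after moving x to the front, the second request costs only 1.
print(mtf_cost(['y', 'x'], ['x', 'x']))  # prints 3
```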
It is possible to combine BIT and the Timestamp algorithm I just mentioned to form an algorithm called Combination, or COMB for short, which is 1.6-competitive, and this is very close to the best known lower bound, which is slightly above 1.5. Further algorithms have been proposed in the literature, but most of them are actually variants of the above schemes.

While these competitive results are interesting and valuable, there are some shortcomings. First of all, there exists a significant gap between the theoretically proven and the experimentally observed performance of the algorithms. For instance, the experimentally observed performance ratio of Move-To-Front is much smaller than 2; it is actually very close to 1. Such experiments were done, for instance, by Ron Rivest, by Bentley and McGeoch, and more recently by Bachrach and El-Yaniv. The second drawback is that Move-To-Front exhibits the best performance in many applications, despite the fact that the randomized strategies have slightly smaller proven performance guarantees. The reason for these shortcomings is, again, that competitive analysis considers arbitrary request sequences, whereas request sequences arising in practice have a special structure: they exhibit locality of reference. In our paper we present a study of list update with locality.

I would like to mention that there is some related work, most of which is authored or co-authored by Alex López-Ortiz, and I'm glad he is here; tomorrow, no, the day after tomorrow, he will present some of these results. Very briefly: in two, let's say, first papers they considered Denning's working set model, which we had proposed for the paging problem, as the notion of locality, and then, using the concept of so-called bijective analysis, showed that Move-To-Front is always at least as good as any other online strategy. But these papers, as far as I see, do not quantify the algorithms' performance; such a quantification was done in another paper, on parametrized analysis, which I think Alex will talk about on Wednesday. In addition to these more theoretical works there are experimental studies, together with Ian Munro, where they analyzed list update algorithms also with respect to their application in data compression.

Now, in our paper with Sonja Lauer we have the following main contributions. We present a combined theoretical and experimental study of list update with locality of reference; the goal is to close the gap between theory and practice. First, we introduce a new locality model that is specifically tailored to the list update problem; it is based on the natural concept of runs, and I will explain in a minute what this means. Then, using this locality model, we present refined theoretical analyses of various online algorithms. We concentrate on the most important ones: Move-To-Front, BIT, and the randomized Combination algorithm; in order to analyze Combination we also had to look at the Timestamp strategy again. We looked at two performance measures: the access or service cost paid on a reference string, and secondly the competitive ratio. In addition to these theoretical results, we did an extensive experimental study with real-world traces from benchmark libraries: we considered sequences as they arise in data compression routines, and moreover we looked at memory access traces. In total, more than 90 traces were considered, and for each trace we compared our new theoretically proven bounds to the experimentally observed performance of the algorithms.
The first result is that these theoretical and experimental bounds now match, or nearly match: the average relative error for Move-To-Front is below 1%; for the other algorithms it is a bit higher, but still around 5%, which I think is small. A second contribution is that Move-To-Front responds very well to locality of reference, with competitive ratios approaching 1 as the degree of locality in a reference string increases; and we can also show that this does not hold for the other three strategies we looked at. They do not respond well to locality, and we will see in a moment what this means.

In the following I would like to present some of these results, and let me start with the locality model. Again, locality means that when an item is referenced, it is likely to be referenced again soon. This naturally leads to the concept of runs, where a run in a request sequence is a maximal subsequence of requests made to the same item. For instance, in this sequence, up front we have a long run of requests to item x, followed by another, slightly shorter, long run of requests to item y. Now, if you inspect real-world sequences, unfortunately they contain very few of these pure long runs, as shown here. But if you look at the end of the sequence, you would probably agree that there is also a high degree of locality for item x: we encounter many requests to item x (my laser pointer is bad), interrupted only by single requests to other items, and these requests to x would form a long run if we considered smaller item sets, in particular item pairs. This is what we do in the model. Given an original request sequence, for any pair of distinct items x and y we consider the projected sequence sigma_xy, which consists of only those requests made to either x or y; all other requests are cancelled out, so only the requests to x and y survive. Intuitively, these projected sequences have a high degree of locality, that is, contain many long runs: if one of the items, say x, is currently more relevant than the other, then this relation is likely to hold in the near future, and we encounter further requests to x before we hit the next request to y. The locality model considers these projected sequences for all pairs of distinct items x and y.

For any projected sequence we introduce various locality parameters. First, let r be the total number of runs, a run again being a maximal subsequence of requests to the same item. A run is called long if it has length at least two; otherwise the run is called short, so a short run consists of a single request only. Let l be the number of long runs. It then turns out that to properly analyze the algorithms' performance we need a third parameter, which is maybe not so intuitive: the number of long run changes. A long run and the next long run occurring in the sequence form a long run change if they reference different items. For example, up front, the first two long runs, to x and to y, do form a long run change, since they reference different items; but the last two long runs, which are both to x, do not form such a long run change. It is not very intuitive for the moment, but the performance of the algorithms depends on these long run changes. We then sum up these locality parameters: the total number of runs summed over all item pairs, and similarly the total number of long runs and the total number of long run changes. Using these locality parameters we can then analyze the algorithms' performance.
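A sketch of these locality parameters under our reading of the model (identifier names and the example sequence are ours): project the sequence onto each pair of distinct items, decompose the projection into runs, and count runs, long runs, and long run changes.

```python
# Sketch: locality parameters of a request sequence over projected pair
# sequences. A run is a maximal block of requests to the same item; a run
# is long if its length is at least 2; two consecutive long runs form a
# long run change if they reference different items.

from itertools import combinations, groupby

def pair_parameters(requests, x, y):
    """(runs, long runs, long run changes) of the projection onto {x, y}."""
    projected = [q for q in requests if q in (x, y)]
    runs = [(item, len(list(block))) for item, block in groupby(projected)]
    long_items = [item for item, length in runs if length >= 2]
    changes = sum(1 for a, b in zip(long_items, long_items[1:]) if a != b)
    return len(runs), len(long_items), changes

def total_parameters(requests):
    """Sum the three parameters over all pairs of distinct items."""
    items = sorted(set(requests))
    totals = [0, 0, 0]
    for x, y in combinations(items, 2):
        for j, value in enumerate(pair_parameters(requests, x, y)):
            totals[j] += value
    return tuple(totals)

# Two long runs to x and y up front, then single requests interleaved:
print(total_parameters(list("xxxxyyyxyxyxxxx")))  # (7, 3, 2)
```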
First of all, the access cost. This slide shows the access cost of the various algorithms: the access cost of Move-To-Front is exactly equal to the number of runs plus the length of the request sequence, and likewise we can do refined analyses of the other algorithms in terms of our parameters; for the other algorithms the expressions are a bit more complicated, but I would say not too bad. An important aspect is that we can develop a new lower bound on the optimum cost, which is basically one half times the number of runs plus the number of long run changes.

Given these bounds on the access cost, we can move on and evaluate competitiveness. This slide shows the calculation for Move-To-Front: for the competitiveness we divide the online access cost by the offline access cost, or rather by our lower bound on the optimum access cost, and simple algebraic manipulations give that the competitiveness is upper bounded by 2/(1+lambda), where lambda is the ratio of the number of long run changes to the total number of runs. Note that this parameter lambda ranges between 0 and 1, and it can be as high as 1 on request sequences exhibiting a high degree of locality: if a request sequence consists of long runs only, this ratio can be as high as 1. As lambda goes to 1, the competitiveness obviously tends to 1, and this means that Move-To-Front responds perfectly to locality of reference. We can do a similar calculation for the other algorithms; as an example, I can show you the calculation for BIT. Again we divide the online cost by the optimum cost and obtain that for BIT the competitiveness is given by the minimum of 1.75 and (2+lambda)/(1+lambda). As lambda goes to 1 we obtain a ratio of 1.5, but we do not obtain a ratio that goes down to 1. So BIT, and this holds for the other algorithms as well, does not respond well to locality.
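A sketch of the algebra behind these two bounds, in our notation (R for the total number of runs and L for the total number of long run changes, both summed over all item pairs; the term corresponding to the sequence length, which all algorithms pay alike, is dropped here):

```latex
% Move-To-Front against the lower bound OPT >= (R + L)/2:
\[
  \frac{\mathrm{MTF}(\sigma)}{\mathrm{OPT}(\sigma)}
  \;\le\; \frac{R}{\tfrac{1}{2}\,(R+L)}
  \;=\; \frac{2}{1+L/R}
  \;=\; \frac{2}{1+\lambda},
  \qquad \lambda = \frac{L}{R} \in [0,1].
\]
% As lambda tends to 1 (sequences consisting of long runs only), the bound
% tends to 1. The analogous calculation for BIT gives
% min(1.75, (2+lambda)/(1+lambda)), which only tends to 1.5.
```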
We complemented these results by an experimental study. Again, the main goal was to compare the new theoretical bounds to the bounds we observe in practice, as far as service cost and competitiveness are concerned. As I mentioned, we took real-world traces from benchmark libraries for data compression; there are many such libraries, for example the Calgary corpus and the Canterbury corpus. Many of you know that the popular compression algorithm bzip2 relies on a so-called Burrows-Wheeler transform followed by Move-To-Front encoding. In addition to these data compression traces we also looked at memory access traces. I mentioned already that the errors we observe between theory and practice are now very small: below 1% for Move-To-Front. Actually, as far as the access cost is concerned, the error is 0%, because we have an exact bound on the service cost of Move-To-Front. For the other algorithms the errors are a bit higher, but still, I would say, low, around 5%. One can now study all the data that arises; this slide shows...

Question: About those numbers on the other slide: when you say competitive ratio on these traces, does that mean you look at the trace, compute lambda, and then plug lambda into your formula? But these are only empirical numbers, not theoretical. So do you compare this to, say, what 2/(1+lambda) is?

No, no. For the experimental competitiveness you look at the ratio of the online cost to the optimum cost. Now, computing the optimum is an NP-hard problem, so what we did is we took (oops, now it is pointing in the wrong direction) our lower bound on the optimum cost. The optimum might actually be higher, so this is a pessimistic estimate; the true errors would be even smaller.

This slide depicts the errors using box plots, which are a standard means to display numerical data; maybe you have seen those plots before. We have the various algorithms; for each algorithm we analyzed, I don't know, 45 traces, so for each algorithm we have about 45 data points. The bold line always represents the median data point; the box contains 50% of the data points, with 25% located above and 25% below the median, so the height of a box basically represents the variance in the data. The uppermost and lowermost lines represent the outermost points that can be found within a distance of 1.5 times the interquartile range, and all other points are outliers. What you see, basically, is that everything is very well behaved: the median data point is very low, the height of the box is small, and we have very few outliers. So not only is the average error very low, the variance in the data is also very low.

I think I conclude at this point. In summary, as far as the second part of the talk is concerned, we introduced a new locality model specifically designed for list update and presented new theoretical analyses that hopefully capture the phenomena that are observed in practice; they show that Move-To-Front is an excellent algorithm, the method of choice in practice. More generally, I think that such combined theoretical and experimental studies could be done for other problems as well. Maybe one could do it for scheduling, where usually resource augmentation is used to overcome negative results; but there, again, the challenge is to find suitable models characterizing the job sequences, and I do not know how to do this. Thank you for your attention.

Question: [inaudible]

That is another critical question. I don't know; years ago, for makespan minimization, we also did an experimental study, and we had a very hard time finding good job traces. I remember that my student at the time wrote to, I don't know, supercomputer centers, and we were able to get some data, but that might be outdated now. For scheduling it is very difficult to find data, because you probably do not want to make any probabilistic assumptions; you do not want to generate jobs according to a probability distribution. Depending on the problem you look at, it is even more difficult: for energy-efficient scheduling, the most classical setting is the one where jobs have deadlines, so not only do they have processing times, or processing volumes, but also deadlines; you basically need two parameters, and I would be glad if some benchmark library existed.

Question: As you surveyed, the online algorithms community has thought a lot about ideas to elude worst-case analysis. Of that list of ideas, how many seem appropriate only for online analysis, and how many are potentially more broadly useful?

Yeah, that's a good question. Resource augmentation: I very much like the idea, but of course it is a trick in the analysis; it is basically a means to remove the very bad performance guarantees. It does not improve the algorithm, and basically you do not get any further insight into the problem. Refined performance measures, let's say, could maybe be an approach that applies to more general settings.
I also like this framework, the diffuse adversary, but I am a bit skeptical about probability distributions; the question is always what realistic distributions are. At least, I have always found average-case analysis extremely hard, but maybe there are more clever people in the audience who can do such average-case analyses.

Question: Did you do worst case over a class?

Of course; here you have a class of inputs, or a class of distributions, and then the adversary picks a distribution from that class. This is, I think, very nice, so it could be well suited to other settings as well. I do remember, as far as list update is concerned, there is of course a long history of literature addressing list update under the assumption that you have a probability distribution, and the big step forward was to look at competitive analysis instead of probability distributions; now, it seems, we are moving back to the probabilistic framework.

And what else did we have? Data modeling. I like the data modeling, but of course it works well for the access problems, where locality is very natural and well accepted. I am not sure whether for computer science problems in general there is always such a neat underlying model that can be identified. When Avrim Blum talked about clustering, I said it would actually fail there, because in clustering we cluster precisely because we want to get information about the data: we have this massive data set and want to find out what it is like. If we knew what the input looks like, maybe we would not really need to cluster. So we want to get information about the data, maybe to come up with a model that characterizes the input, and there I see problems for data modeling in general in computer science. For the access problems these approaches have been looked at, and results along the lines of data modeling are applicable, but I think a lot of this is a problem-specific phenomenon. I think there are other questions.