Please, let's welcome Sourav Chatterjee for his seminar presentation. OK, so thanks to the organizers for the invitation to this very nice workshop. I'm going to talk about a method for proving lower bounds on the fluctuations of random variables. As we know, there are many ways of getting upper bounds on fluctuations; there is the broad field of concentration inequalities and variance bounds. But as I realized while thinking about a question that Yuval Peres asked me, which is part of this talk, there are very few methods for lower bounds. In fact, when I thought about it, the only ones I knew were the following. First, you can try to prove some kind of distributional convergence, a central limit theorem. That's great if you can do it: then you have matching upper and lower bounds and everything you want. If you cannot do that, you can prove a lower bound on the variance; there are techniques for getting lower bounds on the variance. But you then also have to prove a matching upper bound on some higher moment to apply the second moment method; a lower bound on the variance alone does not tell you that the fluctuations are of that order. There is also a coupling method due to Svante Janson, which I did not know about: after I wrote my preprint, I sent it to Mike Steele, who told me about it. Janson has a lemma which he used with Béla Bollobás to prove a lower bound on the fluctuations of the longest increasing subsequence, before everything was known about that problem. So there is that method. And then there are problem-specific methods; various papers have been written, and there is a survey of all of these in my preprint, which is on the arXiv now. However, as I was thinking about the question Yuval asked me, I realized there are many other open questions, not just that one, and there are many modern problems where none of these methods work. I tried various things that did not work. So I will tell you about a new method, with applications to the following things: first passage percolation, the traveling salesman and minimum matching problems, the random assignment problem, some spin glasses, and finally, for this workshop, a little application to random matrices. So first, what is meant by a lower bound on the order of fluctuations? If a lower bound on the variance doesn't give it to you, then what does? The most sensible way to talk about lower bounds on fluctuations is through the notion of the Lévy concentration function. If you have a random variable X, its Lévy concentration function is defined as follows: f(h) is the probability that X belongs to an interval of length h, and you take the supremum over all such intervals. That's the Lévy concentration function. We'll say that a sequence of random variables X_n has fluctuations of order at least delta_n, where delta_n is another sequence, if there is some positive constant c such that the limsup of the concentration function of X_n evaluated at c times delta_n is strictly less than 1. Otherwise, you could find a sequence of intervals of length of order delta_n such that X_n belongs to these intervals with probability tending to 1, which would mean the fluctuations are much smaller than delta_n. So this is the sense in which I say a sequence of random variables has fluctuations of order at least delta_n. Any questions about this?
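Just to make the definition concrete, here is a minimal sketch (not from the talk; the sample law, window widths, and function name are illustrative choices) that estimates the Lévy concentration function from samples. For a sum of n Bernoulli(1/2) variables, windows of width proportional to root n capture a fraction of the samples that stays bounded away from 1 as n grows, which is exactly the statement that the fluctuations are of order at least root n.

```python
import numpy as np

def levy_concentration(samples, h):
    """Empirical Levy concentration function: the largest fraction of the
    samples that fits inside a single interval of length h."""
    s = np.sort(samples)
    n = len(s)
    # For each left endpoint s[i], count how many samples lie in [s[i], s[i] + h].
    right = np.searchsorted(s, s + h, side="right")
    return np.max(right - np.arange(n)) / n

rng = np.random.default_rng(0)
for n in [100, 10_000, 1_000_000]:
    x = rng.binomial(n, 0.5, size=200_000)          # samples of S_n
    # Window of width 0.5*sqrt(n): the captured fraction stays bounded away
    # from 1 (around 0.4), so the fluctuations are of order at least sqrt(n).
    print(n, round(levy_concentration(x, 0.5 * np.sqrt(n)), 3))
```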
All right, so now that we have fixed a sensible notion of the fluctuations of random variables, here is the main lemma. The proof is very simple, so I'll show it on one slide. The lemma is the following. Suppose X and Y are two random variables defined on the same probability space. Then for any interval [a, b], the chance that X belongs to [a, b] is bounded by one half times the quantity 1, plus the chance that the absolute difference of X and Y is at most b minus a, plus the total variation distance between the law of X and the law of Y. So the idea is that you want to show that the probability that X belongs to an interval I is uniformly bounded away from 1 for all intervals of length at most some delta. To show that, you construct another random variable Y so that the total variation distance between the law of X and the law of Y is small, and X and Y are at least delta apart with substantial probability. If you can do these two things, you are done; and here small does not mean going to 0, small just means the sum has to be less than 1 so that the whole right-hand side is bounded away from 1. So to show that a random variable X has fluctuations of a certain order, you just have to construct another random variable Y on the same probability space so that the total variation distance between the laws of the two random variables is small, but they are far apart: apart by at least delta with some substantial chance. That's all you have to do. As I mentioned, there is a coupling technique due to Svante Janson with a similar approach, but in Janson's lemma he takes Y so that the law of X and the law of Y are exactly the same. This lemma somehow gives you much more flexibility, as we'll see. Why do I use the total variation distance instead of something else, like the Kolmogorov distance? There are specific reasons; there is a distinct advantage to using the total variation distance, and we'll talk about that. Any questions about the statement of the lemma? So as I said, the proof is very simple. Take any interval [a, b] and call it I. Then 1 is greater than or equal to the chance that X belongs to I or Y belongs to I, which by inclusion-exclusion is the probability that X is in I, plus the probability that Y is in I, minus the probability that both are in I. Now, the probability that Y belongs to I is at least the probability that X belongs to I minus the total variation distance. In case there are some students here who don't know the total variation distance: if you have two probability measures mu and nu, you take any event A, look at the absolute difference between the probability of A under mu and the probability of A under nu, and take the supremum over all events A. That's the total variation distance between two probability measures, and from that definition this step is clear. Also, the probability that both are in I is bounded above by the chance that the absolute difference of X and Y is at most the length of the interval, because if they are both in the interval, their difference is bounded by its length. Substituting these two inequalities, you get exactly what I wrote down on the previous slide. Yeah, the Kolmogorov distance would be fine too; you could have substituted total variation with the Kolmogorov distance, but there is a reason why I'm using total variation, as we'll see. OK.
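Here is a small numerical sanity check of the lemma on a toy coupling (illustrative, not from the talk): X is standard Gaussian and Y is X shifted by a constant delta, so the two laws are N(0,1) and N(delta,1), whose total variation distance is the standard expression 2*Phi(delta/2) - 1. For intervals shorter than delta the term P(|X - Y| <= b - a) vanishes, and the bound in the lemma indeed dominates the largest probability a short interval can carry.

```python
import numpy as np
from scipy.stats import norm

# Toy coupling: X ~ N(0,1) and Y = X + delta on the same probability space.
delta = 1.0
tv = 2 * norm.cdf(delta / 2) - 1          # d_TV(N(0,1), N(delta,1))

for length in [0.25, 0.5, 0.9]:           # intervals shorter than delta
    # For N(0,1), the most likely interval of a given length is centred at 0.
    lhs = norm.cdf(length / 2) - norm.cdf(-length / 2)
    # Here |X - Y| = delta > length, so P(|X - Y| <= length) = 0.
    rhs = 0.5 * (1 + 0 + tv)
    print(f"length={length}: max P(X in I) = {lhs:.3f} <= bound {rhs:.3f}")
```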
So let's see a quick example, very quickly, before going into any kind of complexity. Suppose you have X_1 to X_n i.i.d. Bernoulli(1/2) random variables, and S_n is their sum. We all know it has fluctuations of order root n; can we get a lower bound using this method? What you do is the following. You define X_i prime to be equal to X_i with probability 1 minus alpha over root n, and to be exactly 1 with probability alpha over root n, where alpha will be chosen later. So X_i prime is either X_i or it is 1: for a small fraction, roughly a 1 over root n fraction, you set it equal to 1 at random. Now let S_n prime be the sum of the X_i primes. The nice thing about the total variation distance is that it has a projection property. The laws of S_n and S_n prime are complicated objects; you need the central limit theorem, or at least the binomial distribution, to understand them. But since they are functions of the random vectors (X_1, ..., X_n) and (X_1 prime, ..., X_n prime), the total variation distance between the laws of S_n and S_n prime is bounded above by the total variation distance between the laws of these two random vectors, and that is much easier to compute. These two product measures can be written down explicitly, and there is a formula for the total variation distance in terms of differences of probabilities, summed over all atoms, so you can compute it. Now, X_i prime differs from X_i at roughly root n coordinates, but still the total variation distance stays small, and we will see why. This is not an exact computation; you write down the whole thing, do a little bit of analysis, and you only need Chebyshev's inequality, nothing more, to get the upper bound. And this upper bound does not depend on n: it is just some universal constant times alpha, the same alpha as in the construction. So if you do this computation, not now, but if you do it, you will see that the total variation distance is bounded by a constant times alpha. On the other hand, look at what happens between S_n prime and S_n. S_n is the sum of X_1 to X_n. You chose roughly root n of the X_i's and set them equal to 1. When you set them equal to 1, about half of them are already 1, but the other half are 0, so you increase the sum by something of order root n when you go from S_n to S_n prime. So S_n prime minus S_n is of order root n, because you are increasing the sum in this artificial manner. Therefore, by the lemma, for any interval of width less than a small constant times alpha root n, the chance that S_n belongs to the interval is bounded by one half times 1 plus the total variation distance plus the probability that the difference S_n prime minus S_n is smaller than the width, and that last probability goes to 0 by the way S_n prime was constructed. So if alpha is chosen small enough, the right-hand side is uniformly bounded away from 1, and this shows that S_n has fluctuations of order at least root n. The only thing I did not show you is the total variation calculation, but we will see how to do that kind of calculation in greater generality. Any questions? OK. This is the basic example, just to give you an idea of how this is done: you construct something which is substantially away from your original random variable, but in such a way that the total variation distance between the laws stays small.
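Here is a minimal sketch of this example (the parameter values are illustrative). Because the X_i prime are still i.i.d., the law of S_n prime is Binomial(n, 1/2 + alpha/(2 root n)), so the total variation distance between the laws of S_n and S_n prime, which is at most the vector total variation distance discussed above, can be computed exactly by summing differences of probabilities over the atoms; it stays near a constant times alpha for every n, while a coupled sample shows S_n prime minus S_n growing like root n.

```python
import numpy as np
from scipy.stats import binom

alpha = 0.2
rng = np.random.default_rng(1)

for n in [100, 10_000, 1_000_000]:
    p = alpha / np.sqrt(n)                        # resampling probability
    # law(S_n) = Bin(n, 1/2), law(S_n') = Bin(n, 1/2 + p/2); the exact TV distance
    # is half the sum over atoms of the absolute differences of the two pmfs.
    k = np.arange(n + 1)
    tv = 0.5 * np.abs(binom.pmf(k, n, 0.5) - binom.pmf(k, n, 0.5 + p / 2)).sum()

    # One coupled sample of (S_n, S_n'): force a random p-fraction of coordinates to 1.
    x = rng.integers(0, 2, size=n)
    forced = rng.random(n) < p
    x_prime = np.where(forced, 1, x)
    gap = (x_prime.sum() - x.sum()) / np.sqrt(n)
    print(f"n={n}: TV = {tv:.3f}, (S_n' - S_n)/sqrt(n) = {gap:.3f}")
```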
So let us now see how to get the optimal lower bound on the fluctuations of the length of the optimal tour in the traveling salesman problem. This is a much harder example, and we will go through the steps slowly. The stochastic traveling salesman problem, again in case somebody here does not know it: you have n random points in the plane, i.i.d. from some distribution, and you want a tour through these points. There is a salesman traveling to all of these cities; he has to make a tour that ends where he started, and he has to minimize the total distance traveled. That minimum is the length of the optimal tour, and you ask about its fluctuations. This can be in two dimensions, or in d dimensions. There are a few steps, and they are systematic, so whatever I outline carries over to many other problems; it is not just for this one. The first ingredient is the Hellinger affinity. Let mu and mu prime be two probability measures on the same space with densities f and g with respect to some measure nu; of course nu can be mu plus mu prime over 2, so you can always find a measure with respect to which both are absolutely continuous. The Hellinger affinity between mu and mu prime is defined as the integral of the square root of f times g with respect to nu. By Cauchy-Schwarz it lies between 0 and 1, and the closer it is to 1, the closer the two measures are. One can show that this quantity does not depend on the choice of the reference measure nu. That is the Hellinger affinity. Now, why is this useful? Suppose you have measures mu_1 to mu_n and mu_1 prime to mu_n prime, and you take the two product measures: mu is mu_1 cross up to mu_n, and mu prime is mu_1 prime cross up to mu_n prime. Then there is a bound that is very well known to statisticians and has been used very effectively in theoretical statistics: the total variation distance between these product measures, which are complicated objects, is bounded by something much simpler. You take each pair mu_i and mu_i prime, compute the Hellinger affinity between them, square it, take the product over i, subtract the product from 1, and take the square root; that is an upper bound on the total variation distance between mu and mu prime. It is very easy, just Cauchy-Schwarz, though slightly tricky to carry out. The nice thing is that the Hellinger affinity lets you bound total variation distances between product measures using simple computations for the individual measures. This is one of the reasons why I am using total variation. Now let X be a d-dimensional random vector with probability density function e to the minus V(x), either on R^d or on the positive orthant, (0, infinity) to the d, and suppose V is a smooth function satisfying some mild growth conditions. This allows X to be a Gaussian or exponential random vector, and many other things, but not uniform: for the uniform distribution on an interval, V would have to be infinite at some points, and that is not allowed. Take some epsilon and let Y be X over 1 plus epsilon, so you scale X down by a small factor. This is the only ugly computation you will see: you take the law of X and the law of Y, where X is the random vector and Y is X scaled by 1 over 1 plus epsilon, and when you compute the Hellinger affinity you can write down the density of Y and the density of X.
You write these things down and do a Taylor expansion; you have to justify all the steps, of course, but I am showing this roughly. Doing a first-order Taylor expansion, one factor gives 1 plus epsilon d over 2, and expanding the other gives 1 minus epsilon over 2 times something. Then you apply integration by parts and, kind of like magic, the first-order epsilon terms go away. So the Hellinger affinity is at least 1 minus a constant times epsilon squared, and that is very important. When you take a d-dimensional random vector with a smooth density, scale it down by 1 plus epsilon, and take the Hellinger affinity between the laws of the two, it is bounded below by 1 minus a constant times epsilon squared, not epsilon. This is not true if, for instance, X were uniformly distributed on a cube; then it fails. It is the smoothness of the density that matters here. From this it is now very easy to derive the following. Suppose X_1 to X_n are i.i.d. with a density satisfying conditions as above, and each Y_i is X_i scaled down by 1 plus epsilon_i. Then the total variation distance between the laws of the two random vectors is bounded by a constant times the square root of the sum of the epsilon_i squared. And this is what allows you to get what you want. Somehow the same thing was at play in the Bernoulli example: there epsilon was 1 over root n; it was not a smooth setup, it was somewhat different, but epsilon was 1 over root n. It is a 1 over root n perturbation of the original Bernoulli, but the 1 over root n gets squared, and that is why, when it is summed from 1 to n, you get something that does not depend on n anymore. That is what mattered. If you compute the Hellinger affinity between X_i and X_i prime in that example and apply this bound, you get that result. Any questions about this?
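As a quick numerical check of the epsilon-squared behaviour (illustrative, not from the talk), take the exponential density, which satisfies the smoothness conditions: X has density e to the minus x on (0, infinity), and Y = X over 1 plus epsilon has density (1 + epsilon) e to the minus (1 + epsilon) y. Integrating the square root of the product of the two densities numerically shows the affinity gap shrinking like epsilon squared, with the ratio gap over epsilon squared settling near one eighth.

```python
import numpy as np
from scipy.integrate import quad

def hellinger_affinity_exp(eps):
    """Hellinger affinity between Exp(1) and Exp(1) scaled down by 1/(1+eps)."""
    integrand = lambda x: np.sqrt(np.exp(-x) * (1 + eps) * np.exp(-(1 + eps) * x))
    val, _ = quad(integrand, 0, np.inf)
    return val

for eps in [0.2, 0.1, 0.05, 0.025]:
    gap = 1 - hellinger_affinity_exp(eps)
    print(f"eps={eps}: 1 - affinity = {gap:.2e}, gap/eps^2 = {gap / eps**2:.3f}")
```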
So now you have the points X_1 to X_n in R^d, a slight scaling of those points, and a total variation distance bound. OK. Now for a general class of geometric optimization problems. Suppose you have a function which takes in n d-dimensional vectors and outputs a real number, and it is homogeneous: f_n of lambda x_1 up to lambda x_n equals lambda to the r times f_n of x_1 up to x_n, for some fixed number r. A surprising number of very complicated functions can be put into this framework. For instance, the length of the optimal traveling salesman tour satisfies this with r equal to 1: if you take all the points and scale them down a little, the optimal length is the original length times that scaling. Similarly, the minimum matching, the volume of the convex hull, and all kinds of other quantities have this property. So let the X_i be these random vectors and let L_n be this function of them. What is the general lower bound on the order of fluctuations of L_n? Here is the theorem. Let T_n be a sequence of constants so that the liminf of the probability that L_n is bigger than T_n is positive; so T_n is, in a sense, a lower bound on the size of L_n, and the chance that L_n exceeds T_n does not go to zero along any subsequence. Then L_n has fluctuations of order at least n to the minus one half times T_n. Roughly speaking, in all of these problems with the homogeneity property, the fluctuations are of order at least 1 over root n times the size of the object. So the fluctuations of the traveling salesman tour length are at least of order 1 over root n times the length of the tour, and the fluctuations of the minimum matching are at least 1 over root n times the length of the minimum matching. The question is whether this gives a sensible answer; the machinery gives a lower bound, but is it of the right order? Just a word about the proof: it is very simple and uses the lemma. You take the original points X_1 to X_n and scale them down by 1 plus alpha over root n, just a little bit of scaling. Then, by the homogeneity property, L_n prime, the new optimal length, is just that scaling of the original length, and if L_n is at least T_n, you get a lower bound on the difference between L_n and L_n prime of order 1 over root n times T_n. On the other hand, by the proposition we proved, the total variation distance between the two laws is bounded by a constant times alpha; it does not depend on n, so it can be made small by choosing alpha small. Then you just apply the lemma: for any interval of length of order 1 over root n times T_n, choosing alpha small enough, the probability that L_n lands in the interval is uniformly bounded away from 1. So the proof is just a few lines once you have the lemma and the total variation bound via the Hellinger affinity. Any questions about this proof? Just to recap: we have i.i.d. points in R^d and a function with the homogeneity property, say that scaling all the points by lambda scales the output by lambda; it can be a very complicated function. The argument says that the fluctuations of the output are at least of order 1 over root n times the size of the output. So let us see what this gives. Suppose L_n is either the length of the optimal traveling salesman tour or the length of the minimum matching. In both cases, the size of L_n is of order n to the 1 minus 1 over d. The reason is simple and well known: the nearest-neighbor distance in d dimensions is of order n to the minus 1 over d, and both quantities are bounded below by a constant times the sum of the nearest-neighbor distances. That is why you get n to the 1 minus 1 over d, so you can take T_n to be n to the 1 minus 1 over d. The theorem then says that the fluctuations are of order at least n to the minus one half times n to the 1 minus 1 over d, which is n to the (d minus 2) over 2d. And it is known, at least for densities with compact support, which unfortunately does not include these densities with unbounded support, that this is the right order. Surprisingly, for densities with unbounded support I did not see any results and could not find anything; presumably, if the density falls off fast enough, you can prove a matching upper bound and get the same rate. So this technique, which you might call crude, ends up giving what seems like the right answer, n to the (d minus 2) over 2d. The matching upper bound for unbounded densities I have not seen; I think it is actually an open problem, and it is open because if you try to apply Efron-Stein type bounds directly you get infinities. So there is something to do there; otherwise I would have just done it.
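Going back to the choice of T_n of order n to the 1 minus 1 over d: here is a quick empirical check (illustrative; the Gaussian point cloud is my choice of smooth density). The sum of nearest-neighbour distances is itself a degree-1 homogeneous functional, and it lower-bounds the TSP and matching lengths up to constants, so its growth rate indicates the size of L_n.

```python
import numpy as np
from scipy.spatial import cKDTree

def nn_sum(points):
    """Sum of nearest-neighbour distances: a degree-1 homogeneous functional
    that lower-bounds the TSP and minimum-matching lengths up to constants."""
    tree = cKDTree(points)
    dists, _ = tree.query(points, k=2)   # k=2: nearest neighbour besides the point itself
    return dists[:, 1].sum()

rng = np.random.default_rng(2)
d = 2
for n in [1_000, 4_000, 16_000, 64_000]:
    pts = rng.standard_normal((n, d))    # smooth (Gaussian) density, as the theorem requires
    print(f"n={n}: NN-sum / n^(1 - 1/d) = {nn_sum(pts) / n ** (1 - 1 / d):.3f}")
```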
The only previous lower bound I know of is due to Rhee, who proved an order-1 lower bound; in dimension 2, d minus 2 is 0, so the fluctuation order is 1. Rhee proved the order-1 lower bound for the traveling salesman problem through uniformly distributed points in the unit square, and just for completeness I generalized this to d dimensions for uniformly distributed points. That requires a different coupling; the very simple coupling I showed will not work there. For minimum matching on a compact set, I think the question of the lower bound is open; I will mention it again in the list of open problems. OK. Now I will not show you any more proofs; I will just show some results from the paper. This is the problem that motivated this whole line of research, the question that Yuval asked: two-dimensional first passage percolation. If you are not familiar with the model: each nearest-neighbor edge of the two-dimensional lattice is assigned a random weight, and the weights are non-negative and i.i.d. The weight of a path is the sum of the edge weights along the path, and the first passage time from a vertex x to a vertex y is the minimum weight over all paths from x to y. So you take all paths, each path has a weight, and you minimize over all paths; that is the first passage time d(x, y). Is the model clear to everybody? One of the main questions in this area is: what is the order of fluctuations of d(x, y) in terms of the distance between x and y, in particular as the distance goes to infinity? We know very little about that, although a lot of very deep work has been done. The best known upper bound on the order of fluctuations is the square root of the distance divided by the log of the distance, and it comes from the work of several authors: Kesten proved it without the log, Benjamini, Kalai, and Schramm proved it with the log for binary weights, and this was later extended to a much more general class of weights by other authors. The lower bound, on the other hand, is in much worse shape. Newman and Piza showed that the variance of d(x, y) is bounded below by a constant times the log of the distance. However, and this is the question Yuval asked me, that does not give a lower bound on the order of fluctuations, because the upper bound does not match: a lower bound on the variance alone cannot be used to say the fluctuations are of order at least the square root of the log of the distance. Pemantle and Peres proved an actual lower bound: they showed the fluctuations are of order at least square root of log n, but only when the weights are exponentially distributed, and the memoryless property of the exponential is used crucially in their proof, so it does not seem to extend beyond the exponential distribution. Using the techniques of the paper I just described, you can show that for a large class of weight distributions the fluctuations are indeed of order at least the square root of the log of the distance. You take the original edge weights and scale them, but the scaling now depends on the distance of the edge from the origin; you have to choose the scaling cleverly, and then the result falls out of the technique. More on first passage percolation in a moment; first, any questions about this? Yeah, OK, I can tell you a little bit about the coupling.
You take a point at distance n from the origin, and you want to show that the first passage time from 0 to that point has fluctuations of order at least square root of log n. You take a ball of radius n over 2 around 0, and for each edge in it with weight w_e, you replace w_e by w_e over 1 plus a factor, where the factor is of order 1 over the distance of the edge from the origin times square root of log n. So you make an inhomogeneous perturbation. With that perturbation, it turns out that the total variation distance, as before, is bounded by something that does not depend on n, which you can make small by choosing a parameter small enough. On the other hand, the first passage time has to decrease by at least a constant times square root of log n. So this coupling gives you the result. Somehow this kind of coupling does not seem to work in dimensions higher than 2; maybe there is some other, more clever coupling that works in three and higher dimensions, I do not know. But this is just a perturbative coupling, and then you use the rest of the machinery, the Hellinger affinity and all the other things. The coupling is inhomogeneous: the perturbation varies inversely with the distance of the edge from the origin.
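To make the perturbative coupling concrete, here is a rough simulation sketch on a finite grid (illustrative only: the grid size, the Exp(1) weight law, the constant alpha, and the use of scipy's Dijkstra are my choices, not the construction in the paper). Edge weights inside a ball around the origin are shrunk by a factor 1 over 1 plus alpha over (distance times root log), and the first passage time across the box drops by an amount comparable to root log.

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.sparse.csgraph import dijkstra

def fpp_time(weights_h, weights_v, L):
    """First passage time from (0, 0) to (L-1, 0) on an L x L grid."""
    idx = lambda i, j: i * L + j
    rows, cols, data = [], [], []
    for i in range(L):
        for j in range(L):
            if i + 1 < L:   # horizontal edge (i, j) -- (i+1, j)
                rows.append(idx(i, j)); cols.append(idx(i + 1, j)); data.append(weights_h[i, j])
            if j + 1 < L:   # vertical edge (i, j) -- (i, j+1)
                rows.append(idx(i, j)); cols.append(idx(i, j + 1)); data.append(weights_v[i, j])
    g = coo_matrix((data, (rows, cols)), shape=(L * L, L * L))
    dist = dijkstra(g, directed=False, indices=idx(0, 0))
    return dist[idx(L - 1, 0)]

rng = np.random.default_rng(3)
L, alpha = 60, 1.0
wh = rng.exponential(size=(L, L))           # i.i.d. Exp(1) edge weights
wv = rng.exponential(size=(L, L))

# Inhomogeneous perturbation: shrink an edge at distance r from the origin
# by 1 / (1 + alpha / ((1 + r) * sqrt(log L))), only inside radius L/2.
ii, jj = np.meshgrid(np.arange(L), np.arange(L), indexing="ij")
r = np.sqrt(ii ** 2 + jj ** 2)
shrink = 1.0 / (1.0 + alpha / ((1.0 + r) * np.sqrt(np.log(L))))
shrink[r > L / 2] = 1.0

t = fpp_time(wh, wv, L)
t2 = fpp_time(wh * shrink, wv * shrink, L)
print(f"T = {t:.2f}, perturbed T = {t2:.2f}, drop = {t - t2:.3f}, sqrt(log L) = {np.sqrt(np.log(L)):.3f}")
```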
Here, the conjecture in two dimensions is that the fluctuations of the point-to-point passage time are of order the distance to the one third power; that is the conjecture. And as I said, the best known upper bound is the square root of the distance over the log of the distance. Even showing a little-o of that would be a big step; Bourgain told me he tried for a long time to prove a little-o of n over log n, and many other people have tried as well. The next thing is shape fluctuations in first passage percolation. Let B_t be the set of all vertices that can be reached by time t, that is, all x such that the passage time from 0 to x is at most t. There is a result, following work by Richardson and various other people, a result of Cox and Durrett, which says there is a symmetric convex set B_0 such that almost surely, for every epsilon, if you take all the points that can be reached by time t and scale that set by 1 over t (you have to fatten it a little to make it a subset of R^d), then for all large t it lies between 1 minus epsilon times B_0 and 1 plus epsilon times B_0. So there is a limit shape: the region reached by time t, scaled by t, approaches a limit, and B_0 is called the limit shape. The fluctuations of the set 1 over t times B_t are called shape fluctuations. Newman and Piza defined a shape fluctuation exponent, which is naturally suggested by this theorem: it is the smallest kappa such that B_t contains (t minus t to the kappa) times B_0 and is contained in (t plus t to the kappa) times B_0 for all large t, almost surely. This exponent measures how much B_t fluctuates around the predicted limit shape, and it has been an open problem to show that this exponent, chi prime, is positive in any dimension, even though it is the most natural quantity one can define to capture shape fluctuations. What follows from these arguments is that in 2D first passage percolation, under some mild conditions on the edge weight distribution, chi prime is at least one eighth. The main step is to show that there is a direction in which the first passage time to a point at distance n has fluctuations of order at least n to the one eighth. As I mentioned before, in their paper Newman and Piza did something similar: they obtained a variance lower bound of order n to the one fourth in some direction. However, since that does not give an actual lower bound on the order of fluctuations, it cannot be used to get a lower bound on chi prime. And it is conjectured, again to answer Sylvia's question, that chi prime is one third in two dimensions. OK. Yes. No, no, it does not need an assumption, because there is always a direction of curvature. This concerns the fluctuation of the whole shape, so if there is a direction of curvature, that is enough to guarantee that B_t cannot stay within a very thin band. OK.