Thank you very much for the introduction. I'd like to thank the organizers; it's a real pleasure and a privilege to be here — my first time at IHES. I'll give a talk on a topic that somewhat overlaps with Yuval Peres's talk from yesterday, but I'll repeat all the definitions that are going to be used today, just in case some of you missed it. The talk is an attempt to understand what happens to a random walk on a random environment, and the random environment we're going to discuss is a random regular graph. I'll use script G — 𝒢(n, d) — to denote the distribution. We take d ≥ 3 fixed, and this is nothing more than the uniform distribution over the d-regular graphs on n vertices. The question we'll be interested in is: what is the quenched mixing time of simple random walk on G? That's the main question. And in particular, we'd like to understand: does it exhibit cutoff?

Very nice board. Please feel free to stop me if you have any questions. Let's go through a few quick definitions that Yuval already gave you yesterday; these will be brief. If we have a finite, irreducible, aperiodic Markov chain, denoted X_t, with transition kernel P and stationary distribution π, we define its mixing time — and when we say mixing time without saying anything more, the default is the total variation, or L1, mixing time. This is the first time such that the total variation distance drops to ε or less. So what is d_TV? Here we take the worst starting position and look at the total variation distance between the transition kernel raised to the power t and π. And I'll remind you that the total variation distance between μ and ν is the supremum over events A of μ(A) − ν(A).
In our case — the discrete countable case, in fact the finite case — this also equals one half of the L1 distance between μ and ν as vectors: you sum over the points of your probability space. For simple random walk, for instance, this would be n points, the vertices. So it is ½ Σ_x |μ(x) − ν(x)|. Total variation distance is thus associated with L1. You could also ask: what is the mixing time if, instead of measuring convergence to equilibrium in L1, we measure it in L2? That's also interesting. But somehow any L^p you choose with p > 1 — and this theme repeats itself in various different settings — has an analytic flavor that fails to capture various physical quantities associated with the problem. This characterization of L1, if you look at the specific formula, says I'm taking the worst distinguishing statistic between the two measures; that is something unique to p = 1. And maybe because of that, there are multiple methods that work when you try to analyze every p larger than 1 but fail at p = 1, and there are theorems that unite all p larger than 1 and fail exactly at 1. Unfortunately — or fortunately, depending on how you want to look at it — p = 1 is the case we are most interested in.

[Adjusts the upper board — there's a hidden hook.] OK, so we have the question: we want to understand what T_mix is. And relating to the second bullet, let me also remind you what it means to have cutoff. The definition of T_mix makes sense even if you have a Markov chain on five points.
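The two characterizations of total variation distance just mentioned — the supremum over events, and half the L1 norm — can be checked directly on a toy example. This is a quick illustrative sketch (the four-point space and the two distributions are made up for illustration, not from the lecture):

```python
from itertools import combinations

# Two probability distributions on a 4-point space, as dictionaries.
mu = {"a": 0.5, "b": 0.3, "c": 0.1, "d": 0.1}
nu = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}
points = list(mu)

def tv_sup(mu, nu):
    """Characterization 1: sup over events A of mu(A) - nu(A)."""
    best = 0.0
    for r in range(len(points) + 1):
        for event in combinations(points, r):
            best = max(best, sum(mu[x] - nu[x] for x in event))
    return best

def tv_l1(mu, nu):
    """Characterization 2: half the L1 distance, (1/2) sum_x |mu(x)-nu(x)|."""
    return 0.5 * sum(abs(mu[x] - nu[x]) for x in points)

# The two formulas agree (the optimal event is {x : mu(x) > nu(x)}).
assert abs(tv_sup(mu, nu) - tv_l1(mu, nu)) < 1e-12
```

The optimal distinguishing event is exactly the set where μ exceeds ν, which is why the supremum formula reads as "the worst distinguishing statistic."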
The definition of cutoff, however, already discusses not one specific chain: you have to have a sequence of chains, a system whose size grows to infinity, captured by some — sometimes implicit — parameter n. In our case it will just be the size of the graphs: we consider a family of graphs whose size tends to infinity. So for a sequence of Markov chains X_t^(n) — here's the implicit parameter n, which I'll sometimes drop — we say there is cutoff if and only if the following holds. Let's look at this definition for a second. We had T_mix(ε); what does it mean? We are looking at the total variation distance. Say you forget about the maximum over x and fix one x, and we have one system on, I don't know, 100 sites. Now we slowly increase t. The distance to the stationary distribution is non-increasing — it's like that for any L^p metric you choose, and there are multiple ways to prove it. So the distance looks like this: it starts at 1, or nearly 1, because you have a point mass while the stationary distribution gives that point, let's say, probability 1/n. And then it slowly works its way down to 0 by the fundamental theorem of Markov chains: eventually the chain converges to its stationary distribution. So as time goes on, d_TV at time t goes towards 0, and the mixing time just thresholds it at some ε — that is T_mix(ε). This is for one chain. The cutoff phenomenon is about understanding the shape of this curve: whether the decay happens gradually or abruptly. But to actually quantify that — I mean, this is meaningless if you have just one fixed n.
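The single-chain picture just described — the non-increasing curve t ↦ d_TV(t) and its threshold T_mix(ε) — can be sketched in a few lines. This is a minimal illustration on a chain of my own choosing (a lazy walk on an 8-cycle, not a chain from the lecture):

```python
# d_TV(t) from the worst starting state, and its threshold T_mix(eps),
# for a lazy simple random walk on the 8-cycle (illustrative chain).
n = 8
P = [[0.0] * n for _ in range(n)]
for i in range(n):
    P[i][i] = 0.5                      # laziness: stay put w.p. 1/2
    P[i][(i + 1) % n] += 0.25
    P[i][(i - 1) % n] += 0.25

pi = [1.0 / n] * n                     # stationary distribution (uniform)

def step(dist):
    """One application of the transition kernel to a distribution."""
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def d_tv(t):
    """Worst-case total variation distance to pi at time t."""
    worst = 0.0
    for x in range(n):                 # max over starting states x
        dist = [1.0 if i == x else 0.0 for i in range(n)]
        for _ in range(t):
            dist = step(dist)
        worst = max(worst, 0.5 * sum(abs(dist[i] - pi[i]) for i in range(n)))
    return worst

def t_mix(eps):
    """First t at which d_TV(t) <= eps."""
    t = 0
    while d_tv(t) > eps:
        t += 1
    return t

# The curve is non-increasing, and the standard T_mix(1/4) is finite.
assert all(d_tv(t + 1) <= d_tv(t) + 1e-12 for t in range(20))
assert 0 < t_mix(0.25) < 100
```

For a single fixed n like this, the curve is just some decreasing function; the cutoff question, as the lecture stresses, only makes sense for the whole sequence of chains as n grows.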
So the difference between these two curves, before I write the formula, is this: when I have a sequence of such chains, I ask for the transition from being near 1 to being near 0 — this level is about 1 and this one is about 0 — to take place in a window that is microscopic. That is what it means to have cutoff. So: cutoff holds if and only if, for every ε between 0 and 1, T_mix^(n)(ε) divided by T_mix^(n)(1 − ε) tends to 1 as n goes to infinity. This means that whichever ε you fix — 0.99, 0.01 — as you increase n to infinity, the leading-order term is still the same. So if you look at the convergence-to-equilibrium curve from afar, it looks like it is essentially 1, 1, 1, …, and then 0, 0, 0, …, and all the action takes place in this little window.

So this is one definition. And equivalently — well, not exactly equivalently, more specifically — if you want to quantify exactly what happens in this window, to go beyond the first-order term, we say that the sequence w_n is a cutoff window if w_n is, let's say, little-o of T_mix at some constant — let's put a half. I have some kind of moral inhibition against putting anything bigger than or equal to a half, because usually in the mixing-times literature anything less than a half is good for some submultiplicative behavior, so you take a quarter or 1/e, these are the usual candidates, but not a half. In our case it won't matter. So w_n is a cutoff window if it is microscopic — I took it to be little-o — and for every ε there exists some C such that — let's write a supremum over n — T_mix(ε) − T_mix(1 − ε) is at most this constant times w_n.
So this tells you that here we can put w_n, and w_n has smaller order than the length of time it takes to actually get to this window. That point would be T_mix(1/2), but it would also be T_mix(0.99), and it would also be T_mix(0.001): you can place all of these points at the same location, because they differ by at most C·w_n. So this is cutoff. Any questions? [Question: don't you want to remove the sup over n and say this holds for all n?] But then I could take C depending on n... Yes, OK: so, for every ε there exists C such that for all large enough n the bound holds. Thank you very much.

OK, so this is what it means to have cutoff. And now we can revisit our question about the environment: we have a random walk on a family of graphs, drawn uniformly from all 3-regular (in general, d-regular) graphs, one graph for every n — this is our sequence of graphs. And the basic question is: do we have this sharp transition from being near 1 to being near 0? That is the main goal of this course.

Now, to understand whether we expect cutoff or not, let me first remind you of a quick spectral criterion due to Yuval — he may have mentioned it yesterday. Let's recall that a random d-regular graph G, with d ≥ 3, with high probability has eigenvalues λ_1 = d, λ_2, and so on, such that the maximum of |λ_i| over i ≥ 2 is at most 2√(d − 1) + o(1). This goes back to Friedman, in '08 — it was proved a long time before '08, but it appeared in '08. [Question: when you say o(1), with respect to what?] OK, so formally: for every ε > 0, this holds with high probability.
And "with high probability" means with probability 1 − o(1), where this o(1) term tends to 0 as n goes to infinity. Now it is fully formal.

OK — when I saw Charles's title, at first I was convinced that he would talk about his proof of... oh, you will? OK, because so far he hasn't. Charles has a very nice — actually a breakthrough — result from about a year and a half ago: he gave a very short proof of this result, and actually of a stronger result. I'll get to that a little later. And what I find nice in this topic — I still haven't gotten to the plan of the course — is that there's an intimate relation between understanding the spectrum of these random graphs and understanding random walk on them. Somehow the techniques in Friedman's paper, and also in Charles's paper, and also in a paper that came in between, by Friedman and Kohler, are related to counting non-backtracking paths. And we too, in order to understand the walk, will find ourselves counting non-backtracking paths and understanding how they are structured in the graph. So sometimes we use the spectrum to understand the simple random walk, and sometimes we use the walk to understand the spectrum.

OK. Going back to cutoff: what can we say from this result? Well — by the way, I didn't say it, but when I say that the graph has eigenvalues, of course I mean that the adjacency matrix of the graph has these eigenvalues. It's a symmetric matrix, so its eigenvalues are real, and they are between −d and d, by Perron–Frobenius and so on.
I took this for granted. So what is the spectral gap of simple random walk on G? For the simple random walk, I just take the adjacency matrix and make one of the possible d moves from every vertex, so I divide the adjacency matrix by d. λ_1 becomes the trivial eigenvalue 1. And now the gap is just 1 minus the second largest eigenvalue in absolute value, which is at least 1 − 2√(d − 1)/d — and I'll write here a minus ε — with high probability. In particular, we see that the gap, and the inverse gap, is of constant order.

Now, in 2004 Yuval had this criterion — he's in the room, so he won't mind if I tease him a little. He had the conjecture that this criterion would predict cutoff. But whenever someone would come up with a counterexample, he would say that the counterexample is not natural. So the actual conjecture is: this criterion predicts cutoff whenever your Markov chain is a natural one — and therefore it is really hard to refute. So I tease him. But in some sense, and I hope you'll see from these three talks that there is something behind it: whenever the examples we're going to discuss are nice and transitive, it seems you cannot destroy this criterion. Any attempt to refute it that we know of today involves taking different objects and hiding an object of type A within a larger object of type B, which is kind of cheating. OK, anyway. The criterion says that gap × T_mix → ∞ — which is why, at least in that paper (this is from the AIM 2004 workshop), it is the so-called product criterion — implies cutoff. The condition itself is necessary: it is necessary for cutoff if you have a reversible chain, and that was in yesterday's lecture.
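Returning for a moment to the gap computation above — λ_1 of P = A/d is the trivial eigenvalue 1, and the gap is 1 minus the second largest eigenvalue in absolute value — here is a numerical sketch. The graph is my own pick, the Petersen graph (3-regular, adjacency eigenvalues 3, 1, −2, so the gap of P is exactly 1/3); the method is plain power iteration restricted to the complement of the constant vector:

```python
from itertools import combinations

# Petersen graph: vertices are 2-subsets of {0,...,4}, edges join disjoint
# pairs; it is 3-regular. (Illustrative example, not from the lecture.)
V = [frozenset(s) for s in combinations(range(5), 2)]
n, d = len(V), 3
A = [[1 if not u & v else 0 for v in V] for u in V]

# Power iteration on P = A/d, projected off the constant vector (the
# trivial eigenvector of lambda_1 = 1), finds lambda* = max_{i>=2} |lambda_i|.
x = [(-1) ** i * (i + 1) for i in range(n)]           # generic start vector
x = [xi - sum(x) / n for xi in x]                     # remove constant part
for _ in range(200):
    y = [sum(A[i][j] * x[j] for j in range(n)) / d for i in range(n)]
    y = [yi - sum(y) / n for yi in y]                 # numerical re-projection
    norm = max(abs(yi) for yi in y)
    x = [yi / norm for yi in y]

y = [sum(A[i][j] * x[j] for j in range(n)) / d for i in range(n)]
lam_star = sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)
gap = 1 - abs(lam_star)

# Adjacency eigenvalues of Petersen are 3, 1, -2, so lambda* = 2/3, gap = 1/3.
assert abs(abs(lam_star) - 2 / 3) < 1e-6
assert abs(gap - 1 / 3) < 1e-6
```

Note 1/3 comfortably clears the Alon–Boppana-type threshold 1 − 2√2/3 ≈ 0.057 from the bound above; the Petersen graph is in fact Ramanujan.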
Although yesterday, instead of using this phrase, he said T_rel — which is the inverse gap — is little-o of T_mix; but it is the same. And the sufficiency part is the part of the conjecture which, strictly speaking, has counterexamples — but not when the system is nice and transitive and has, let's say, bounded degrees and so on, or comes from a spin system on a transitive underlying geometry like a torus. There we have no counterexamples, and we've been proving cutoff in those situations one case after another.

OK, going back to our simpler setting of a random d-regular graph: what is our gap? The relaxation time is order one; our gap is of constant order. What is T_mix? Well, it doesn't really matter exactly what T_mix is, because obviously it diverges with n: you need a realistic chance to visit all vertices, the stationary distribution is uniform, and this is a bounded-degree graph, so it is actually easy to see that it should be at least of order log n. Judging from the criterion, we would expect there to be cutoff. And indeed, this was an explicit conjecture of Durrett in '07, in his book Random Graph Dynamics. He conjectured that on a random 3-regular graph — take a uniformly chosen 3-regular graph — T_mix should be asymptotically 3 log₂ n, with high probability. Now, in the book he actually phrased it for the lazy random walk, so it was a 6 instead of a 3, but I'll write it like this if you don't mind — he wouldn't mind. And he didn't just conjecture this out of the blue, though I don't think he was aware of the advances that came out of that AIM workshop — he wasn't part of it. Lots of people were: Persi was there, David Aldous was there. But Durrett wasn't. What drove this conjecture was previous work with Berestycki — Nathanaël; this was from his thesis with Durrett. They had the following result.
Let's think about what this mixing time is. Mixing time is understanding when you cannot distinguish between the stationary distribution and the distribution of where the walk is by any statistic that gives you an advantage of more than ε, for instance — because there's no better distinguishing statistic. And what Berestycki and Durrett looked at was one specific marginal, one specific test function, a very natural one: the distance from the origin. Morally, they said that this distance from the origin — I'll write it informally first — equilibrates at t, which is about 3 log₂ n. And the formal statement: they showed that for every fixed α, with X_t simple random walk, the distance between X at time α log₂ n and X₀, divided by log₂ n, is, up to lower-order terms, min(α/3, 1). Meaning: if we look at just this marginal, after time 3 log₂ n it stays the same. So it increases and increases and then becomes constant, and the transition is at 3 log₂ n. And then the natural conjecture is: well, probably this marginal tells us everything we want to know — once our distance equilibrates, it must mean we are essentially where we wanted to get to.

Oddly enough, this intuition fails as soon as you have even the slightest variability in your degrees. So instead of a random 3-regular graph, if 90% of the sites have degree 3 and 10% have degree 4, it will already not be true. This test function of the distance will look exactly the same way — it will increase linearly and then stabilize — but the location where that happens will give you a false prediction of mixing.
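Before returning to what goes wrong with variable degrees, the distance profile min(α/3, 1) in the regular case can be sanity-checked by simulation. A rough sketch, with two caveats that are mine: I generate the graph with the configuration model (keeping the rare self-loops and multi-edges, which is not exactly the uniform model), and n here is modest, so constants are loose:

```python
import math
import random
from collections import deque

rng = random.Random(7)
n, d = 20000, 3

# Configuration model: pair the n*d half-edge "stubs" uniformly at random.
stubs = [v for v in range(n) for _ in range(d)]
rng.shuffle(stubs)
adj = [[] for _ in range(n)]
for i in range(0, n * d, 2):
    u, v = stubs[i], stubs[i + 1]
    adj[u].append(v)
    adj[v].append(u)

# BFS distances from the origin x0 = 0.
dist = [-1] * n
dist[0] = 0
q = deque([0])
while q:
    u = q.popleft()
    for v in adj[u]:
        if dist[v] == -1:
            dist[v] = dist[u] + 1
            q.append(v)

log2n = math.log2(n)
alpha = 1.5
t = round(alpha * log2n)

def walk_endpoint():
    """Run simple random walk from the origin for t steps."""
    x = 0
    for _ in range(t):
        x = rng.choice(adj[x])
    return x

avg = sum(dist[walk_endpoint()] for _ in range(400)) / 400
# Berestycki-Durrett profile: dist / log2(n) ~ min(alpha/3, 1) = 0.5 here.
assert abs(avg / log2n - min(alpha / 3, 1)) < 0.15
```

The 1/3 in the profile is the speed of the walk on the tree, which will reappear below as (d − 2)/d for general d.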
Mixing will actually occur macroscopically later. Once you see the proof it's kind of obvious, but at first it struck us as a surprise; this is something I want to get to on Friday. But now, at least, you see why T_mix should be around that time. And now I can finally get to the plan.

[Question: the conjecture that T_mix is the 3 log term — is that for a particular ε, or are you saying there's cutoff?] OK. When you say T_mix and don't say anything, usually you mean the standard one, with ε either 1/4 or 1/e. When Durrett made the conjecture, he just meant: OK, this is T_mix, and this is what it should mean. But what we'll see — and this is what I'm getting to right now — is the following. My plan is to give three different proofs of this result: if you take G ~ 𝒢(n, d) for fixed d ≥ 3, then with high probability simple random walk on G has cutoff at time (d/(d − 2)) log_{d−1} n. And it turns out that d/(d − 2) is the right generalization of the 3. So this is the result.

The plan is to give three different proofs, each of which serves a different purpose, as we later found out. Proof one will be strictly combinatorial — or essentially combinatorial: it will be counting non-backtracking paths. This is the first proof that we had of this result; it was joint work with Allan Sly, in 2010. Proof two will be spectral, via analyzing, again, the non-backtracking random walk operator. And when we say a spectral analysis of the non-backtracking operator, it's much more delicate than one may think, because this operator is not normal. This uses the method from a paper from a year ago or so with Yuval — the GAFA paper with Yuval. And proof three will be a hybrid of the two: we'll need to say something about the structure of the graph and where the random walk would be, and something about paths.
And then we'll use a spectral argument. This appeared in a paper with Nathanaël, Yuval, and Allan; it will at some point appear — remind me later, Yuval; I think I got the galleys. Yes? [Question: what is the mixing time for the Lubotzky–Phillips–Sarnak graphs?] Excellent question. Could you wait with this question for one minute? OK, I'll answer: it is the same; it looks exactly like the result for the random d-regular graph. But here's why I wanted to wait with the question. The reason for giving the three different proofs is the following. First, think of it this way: here we have the world of expanders — I'm drawing bounded-degree graphs. Expanders do not need to be regular; you can have fantastic expanders where half the degrees are 3 and half are 4. There are counterexamples to this cutoff criterion that one can construct: we have a paper with Allan where one constructs an expander that is even regular, but not transitive — in a sense it has different parts, and that is how you destroy the conjecture. But if the expander is nice and transitive, then we should expect it to have cutoff, according to this rule of thumb. And here we have random graphs. Well, a random graph is not really transitive, but it is sort of transitive; it is symmetric.

I'm not sure everybody knows what an expander is. Ah, OK. An expander is, let's say, just a bounded-degree graph that satisfies the following — I'll write it over here. Again, this definition only makes sense if you have a sequence of graphs; every graph by itself can be called an expander. So it is a sequence of bounded-degree graphs: I fix some δ in advance, an absolute constant, which is an upper bound on the degrees in all the graphs of my sequence.
So technically, an expander is parameterized by two parameters: one I'll call α, and the other is this δ. You have an (α, δ)-expander if all graphs have degree at most δ, and the spectral gap of simple random walk is at least α > 0 — these are two absolute constants. OK, this would be an expander. It seems like a natural object, but constructing one explicitly was a long-standing open problem: random constructions are fairly classical, yet it took a long time until the first explicit constructions arrived. And there's a famous theorem, due to Alon–Boppana, that tells you that 2√(d − 1) is essentially the best possible bound you can have. The question that I got before was about the Lubotzky–Phillips–Sarnak construction: that explicit construction of an expander is a very special one — it has this bound without an epsilon — and that makes the object fascinating and extremely difficult to construct.

Can I erase this for now? Let me put here another piece of the puzzle: regular graphs. OK, so what do we think happens? If the expander is nice and, let's say, transitive, then we expect there to be cutoff. So "transitive" is some part here. We do not know this conjecture of Yuval's — a conjecture from around the same time. It appeared in multiple papers, some of which he mentioned yesterday — I think in the paper with Malvina Luczak and David Levin, and also in his book with Levin and Wilmer — and it says that on any transitive expander, you should have cutoff. So here I'll write cutoff, or point to it. Transitive means that the automorphism group of the graph is rich enough to allow you to send every vertex to every other vertex.
Formally: for every pair of vertices u and v, there is an automorphism of the graph whose image of u is v. For instance, if you have a Cayley graph, it's trivial: your automorphism will just be multiplication by v u⁻¹. So this is transitivity. This conjecture is still open. And until a few years ago, when we had that result with Yuval, I would go around and present it: every transitive expander has cutoff for simple random walk. At the end, as a joke, I would give this as an open problem — prove that every family of transitive expanders has cutoff — and then there would be another bullet: find one family of transitive expanders with cutoff. Because the conjecture was that this entire family has the phenomenon, but we couldn't exhibit even one small example that would boost our belief that this indeed is a widespread phenomenon.

OK, now we know that inside this family of transitive graphs there is a family of Ramanujan graphs — R stands for Ramanujan, and this is something I'll get to. [That's true — Ramanujan means regular, not necessarily transitive.] But I meant to say that we do have a family of graphs that are transitive — those that happen to be both Ramanujan and transitive — for which we know that cutoff does occur. So now we have examples; that is this proof. Here we know there is cutoff. For random graphs with nice degree sequences, we also know there is cutoff. And essentially elsewhere, we are still struggling with the understanding, as you may have heard yesterday from Yuval's talk; it is still kind of mysterious. However, I chose to look at this specific shaded part for this summer school, because the three different proofs will somehow allow us to invade the different parts that are not in the intersection.
Namely — the shaded part is 𝒢(n, d), the random d-regular graph. The first proof, the one we're going to start with: you can see that it makes sense to use it in other settings. In a nutshell, each of these three proofs extends in somewhat different directions and allows you to understand the random walk in different generalizations of this intersection, of this shaded area. For instance, there's a paper by Ben-Hamou and Salez — it's very nice work; Justin is going to appear here twice. This is actually from 2015, but it just appeared, in AOP. The basic idea — understanding what the walk looks like from one end and what it looks like from the other, developing the trees grown from the source vertex and from a potential target vertex — is useful in other settings as well. That paper studies non-backtracking random walk on random graphs that are not regular. And Charles, Pietro, and Justin had another paper — when did you put it on the arXiv, 2016? — on random walk on a directed random graph, which brings a host of problems because you do not understand what the stationary distribution looks like. It's a beautiful paper, but I won't get to discuss any of it — maybe, but probably not.

Now the second proof, the spectral one: I'll prove it for the random d-regular case, but it is actually the same proof that we used in order to cover Ramanujan graphs, which are not random — so it invades in that direction. So as I was saying: proof number one, or at least its main technique, allows you to invade the non-regular world; proof number two allows you to invade the world that is non-random, that is deterministic; and the hybrid one allows you to go again to the non-regular world, the random one.
However, it is imperative to use the hybrid one if you want to discuss not the non-backtracking random walk but the simple random walk. In some sense, doing a non-backtracking random walk saves you from a world of problems that you encounter once you allow your walk the small luxury of going back to where you were in the previous step. It seems like a small change, but... And the directed world is somehow closer to the non-backtracking world, in the sense that you don't experience all those issues; however, the lack of control over the stationary distribution brings different issues. OK, so anyway: the third proof allows you to go here for simple random walk — this is the third one, and this is the second one. OK, so that is the plan. Any questions? [You can always speed up.] No, I can't speed up.

So let us start. Here's the theorem that I wrote on the upper right, but with a little more detail; and then I'll try to start with the first warm-up towards proof one. Today we'll do this warm-up and maybe sketch the argument of proof number one. Tomorrow the idea is to go through this argument more carefully and go through the second one, and then on Friday to finish the second one and do the last bit.

So let G ~ 𝒢(n, d), with d fixed and n going to infinity. The first statement — because the way it's written up there, "with high probability there is cutoff", is not too formal — I want to say explicitly: for every fixed s, positive or negative, simple random walk on G satisfies that d_TV, as we defined it, at time (d/(d − 2)) log_{d−1} n + s√(log n), converges in probability to the probability that a standard normal is at least an explicit constant times s. And this explicit constant, in case you care, is this.
This statement tells us that as we decrease s to −∞ the distance goes to 1, and as we increase s to +∞ it goes to 0: at −∞ it is 1, at +∞ it is 0. And this entire window has width of order √(log n), whereas the main-order term is of order log n. So we have a microscopic window where all the action takes place — that's what I mean. In particular, this means you have cutoff at that location; that was the main-order term. And statement two is another result: for every fixed ε, the non-backtracking random walk on G satisfies, with high probability, the corresponding statement for the distance to equilibrium at time log_{d−1}(dn). You can ask, why do I care about this d? It's a constant. The reason is that for the non-backtracking random walk, the window becomes constant — so even additive constants matter. So this is what you can say for the non-backtracking random walk as opposed to the simple random walk. Notice the change — I'll highlight the window: here we have an order-1 window where the action takes place, and here we have an order-√(log n) window.

One can picture the following setting. You have a d-regular tree: the root has degree d, but every other node in the graph has d − 1 children, on n vertices. And I'm going to consider random walk starting from this guy, from the root. Now, if you just looked at this graph and asked — what is the mixing time? do you have cutoff? — you'd say: well, this graph has an awful spectral gap, because I can slash off this entire left subtree, which sends only one edge, let's say, to the root, but contains a constant proportion of the graph. So this is a terrible expander; the gap is tiny. But the point is — and you can show this; this was, I think, the next slide of Yuval's; I don't know if he was going to say it — what happens if you start from the leaves.
Why the tree does not have cutoff is an exercise. I was going to give a few others today, a little later, but here's one: if you start from a leaf, there is no cutoff on the tree. And when I say no cutoff, remember the definition is worst-case. Now notice that starting from the root is far from being worst-case. If you start from a leaf, you have to climb all the way up to the root and then down, in order to actually have a good chance of seeing the constant proportion of the vertices that you are missing if you stay in your own branch. So it will take you time that is linear: you are fighting the drift — it is exponentially unlikely to go against the flow, but the height is low.

By the way, whenever you are in a situation where you are fighting a bottleneck — you started at a leaf and you ask yourself how long it will take to actually visit the other branch, so you go up and down and up and down, because your height is a biased random walk with drift downwards — whenever you are in a situation like that (this is also a rule of thumb that needs to be quantified), you won't have cutoff. This is one of those situations where this criterion fails. On the rigorous side of it, you will have an eigenfunction that detects the fact that you are stuck here, and this eigenfunction will be the dominant one. It gives you the inverse gap, and the inverse gap has the same order as the mixing time, because once you actually escape the bottleneck, right afterwards you will have mixed. So in these situations, you typically do not have cutoff. As part of the exercise, you'll also find what T_mix is.

OK — if you do start from the top, what happens? Let's do, first of all, simple random walk. So we have d − 1 children from a typical point.
Forget the root. From a typical point, we have d − 1 children going down and one parent going up. So our speed is (d − 2)/d — I will say it again a little more slowly in a second. That means that by the inverse of this speed, which is d/(d − 2), times log base d − 1 of n, which is the height — because each generation multiplies by d − 1 and I have n vertices, so this is the height of my tree — by this time I will have reached the bottom, plus fluctuations of order square root of log n, by the central limit theorem. So what you see here — and these constants all come exactly from this picture; we'll see this in a second. However, what happens if we do a non-backtracking random walk? Oh, I should have said: as I walk down, why do I even want to hit the bottom? All I was saying just now concerns the hitting time of the leaves: it is concentrated around this point with a window of root log n, and this random time really does behave like that. Why is this the mixing time? Well, whenever I hit the leaves, I am, by symmetry — because this is just a tree, so it is symmetric — uniform over which leaf it is. And then there is a cascade of weights: most of the weight is here, then here, then here. And then — this was one slide after the point where Yuval stopped — when you have a biased random walk on the line, biased towards the right, the hitting time of the right endpoint is asymptotically the mixing time, also up to the second-order term. So this is why I care about hitting the leaves. Now what happens if I do a non-backtracking random walk instead of a simple random walk? Then there is no jiggling up and down. It looks like a small change. The non-backtracking random walk, to remind you, is a Markov chain over directed edges.
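Since the height process away from the root is just a ±1 walk that goes up with probability (d − 1)/d, the speed (d − 2)/d can be sanity-checked numerically. A minimal sketch, not from the lecture — the function name and the always-step-up convention at the root are mine:

```python
import random

def tree_height_walk(d, steps, seed=0):
    """Simulate the height |X_t| of simple random walk on the infinite
    d-regular tree, started at the root.  Away from the root the height
    increases by 1 with probability (d-1)/d and decreases by 1 with
    probability 1/d; at the root every move goes down the tree, so the
    height always increases."""
    rng = random.Random(seed)
    h = 0
    for _ in range(steps):
        if h == 0 or rng.random() < (d - 1) / d:
            h += 1
        else:
            h -= 1
    return h

d, steps = 3, 100_000
h = tree_height_walk(d, steps)
print(h / steps)  # empirical speed, concentrating around (d-2)/d = 1/3
```

The empirical speed h/steps should be within a few multiples of 1/sqrt(steps) of (d − 2)/d, matching the CLT window discussed above.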
If I went from u to v in the previous step, then in my next step I will go from v to z_1, or z_2, all the way to z_{d−1}: I can go to any of the directed edges leaving this vertex except the one that goes back to u; I'm not allowed to do that. You could think of it as just visiting vertices v_1, v_2, v_3, v_4, where in my current step I can go to any neighbor except the one I actually came from; but formally, in order to remember that, you actually keep track of the directed edge. So this is the non-backtracking random walk. How long will it take it to reach a leaf? Its height is simply deterministic — always go down. So it will take time log base d − 1 of n, plus nothing; it is deterministic. So, the move from simple random walk to non-backtracking random walk — of course, once it hits the leaves this example dies, because the non-backtracking walk gets stuck at the leaf, and one needs to do something about that — but at least morally speaking, we see how the window of root log n being replaced by an order-1 window makes sense, if you think of the random walk on a random d-regular graph as walking on a tree. Now, a random d-regular graph is locally tree-like, and at some point you start closing cycles and the structure becomes complicated. This picture tells you that you can somehow ignore the fact that the structure becomes complicated: it is as if you really do have just a tree on n vertices, and once you reach the end, you are fine. [Question, partly inaudible: so the root log n in the window comes from the central limit theorem of the height?] Yes. Disappointing, right? But simple, either way. And actually, we will see right now how it comes from the central limit theorem of the height. Since I'm proving things in a simpler way — because this is a summer school — I will not try to prove the best possible result.
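The definition just given can be written out directly as a chain on directed edges. A minimal sketch (the adjacency-dict representation and the names are mine), run on K_4, the complete graph on four vertices, which is 3-regular:

```python
import random

def non_backtracking_walk(adj, start, steps, seed=0):
    """Non-backtracking random walk as a Markov chain on directed edges:
    from the directed edge (u, v) we move to (v, z), where z is uniform
    over the neighbours of v other than u.  Only the very first step is
    special: there, all d neighbours of the start vertex are allowed."""
    rng = random.Random(seed)
    u = start
    v = rng.choice(adj[u])          # first step: a choice over d vertices
    path = [u, v]
    for _ in range(steps - 1):
        z = rng.choice([w for w in adj[v] if w != u])  # d-1 choices
        u, v = v, z
        path.append(v)
    return path

# K4 is 3-regular: every vertex is adjacent to the other three.
adj = {i: [j for j in range(4) if j != i] for i in range(4)}
path = non_backtracking_walk(adj, 0, 20)
# Check the defining property: the walk never immediately backtracks.
print(all(path[i + 1] != path[i - 1] for i in range(1, len(path) - 1)))
```

Note how the code mirrors the "only kink" mentioned later in the lecture: the first step chooses among d vertices, every later step among d − 1.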
In the paper with Allan, we had one part where we did this for the simple random walk, and then one part where we really tried to get this window to be the right one, the constant one. And we cared because we wanted to show that if the degree goes to infinity, then this entire window gets eliminated and you get just two points — just this one and the next one. That's if d goes to infinity arbitrarily slowly: no more window, it's just one point. And then you could say something like this for the simple random walk: its window would look like log n divided by d log d, and this also comes from exactly the same central limit theorem, provided d is n^{o(1)}. OK, so that paper does things a little more accurately than what I aim for in this summer school. Here, instead of going through the proof as it was in that paper, I'm going to prove it a little differently: instead of proving things directly for the simple random walk, we're going to prove them for the non-backtracking random walk with a worse window, which makes things much simpler, and then use a reduction from simple random walk to non-backtracking random walk, which I'd like to present right now. [Question: In the picture with the tree, where you started from the root, it seems to me that once you hit the diameter, you basically have cutoff, or you have mixed, right? Or the typical distance, yes. Is that something you will discuss in general, or is it only in, for example, the tree that once you hit the farthest point you have mixed?] OK, the diameter here would be twice this, and this is also the typical distance. But something along those lines will be discussed. The tree is a good thing to keep in mind when you are thinking about a random d-regular graph.
But in some strange way, you see: once you go distance log base d − 1 of n — that's your distance, forget about the time it takes to get there — you see all the vertices. However, these points and those points are at distance twice log base d − 1 of n in the tree; you need to go all the way up through the root. But that's when you start from the root area, right? The point I'm trying to make is that a random d-regular graph looks like this picture from every vertex, which is counter-intuitive. These leaf points — when you start from most of them, it also looks like this: when you go distance log base d − 1 of n (just once, instead of twice), you again see all the vertices. So a random d-regular graph is a funny object. Here is a nice little exercise — a good exercise, and I will discuss it on Friday. Let u and v be fixed vertices; by fixed vertices I mean you have a random graph on labeled vertices v_1, v_2, v_3, all the way to v_n, and I am naming these two vertices, say v_1 and v_2. Then with high probability — that is, with probability 1 − o(1) — so let's put it like this: for every epsilon, with high probability, |dist_G(u, v) − log_{d−1} n| ≤ ε log n. So the typical distance in your graph: I fixed these two points, now I randomize the graph, and I look at the distance between them. Or, equivalently — this says the same on average — I take two random points and ask what the distance between them is; it looks like log base d − 1 of n. Actually, you can say something much better than this, but that is the exercise. And it turns out that not only is the error less than ε log n; even for the worst case, the maximal distance over all pairs of vertices, this can be replaced by a much smaller term.
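One way to see why log base d − 1 of n is the right scale: a ball of radius r in the d-regular tree contains 1 + d + d(d−1) + … + d(d−1)^{r−1} vertices, so the radius needed before the ball can contain n vertices is roughly log_{d−1} n. A back-of-envelope check (helper names are mine):

```python
import math

def ball_size(d, r):
    """Number of vertices within distance r of a vertex of the d-regular
    tree: 1 + d + d(d-1) + ... + d(d-1)^(r-1), summed geometrically."""
    if r == 0:
        return 1
    return 1 + d * ((d - 1) ** r - 1) // (d - 2)

def covering_radius(d, n):
    """Smallest r such that the ball of radius r has at least n vertices."""
    r = 0
    while ball_size(d, r) < n:
        r += 1
    return r

d, n = 3, 10**6
r = covering_radius(d, n)
print(r, math.log(n, d - 1))  # r is within O(1) of log_{d-1} n
```

This only says the ball *can* cover n vertices at that radius; that the distances in a random d-regular graph actually achieve this scale is the content of the exercise.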
The worst distance will be log base d − 1 of n plus a constant times log log n, which is actually there — that's the worst one. But take the weaker statement as a simple exercise. And it goes back to what we said here: log base 2 of n — when you reached log base 2 of n, that was the typical distance, and also asymptotically the maximal distance, in a random 3-regular graph. I was going to erase this; let me show you the reduction to the non-backtracking random walk, which we will use to give the first proof, and then we will also see where the constant in the theorem comes from. The theorem is still here on the board. So let G now be just a d-regular graph on n vertices — it does not need to be random — and let T be the infinite d-regular tree, rooted at o. We'll also fix some x in G, and for z in T we write |z| for the distance of z from o. One last thing: let φ be the cover map from T to the vertices of G which maps the root of the tree to x, the origin of our walk. To remind you, this is just a map that is locally bijective; equivalently, you can think of it like this: you start from x and look at all possible non-backtracking paths in G, and these correspond to T_d. I hope you will just think of it this way. Observation one is that simple random walk (script X) on T_d, started at o, gives rise to simple random walk (non-script X) on G, started at x, simply by following the cover map. This is a trivial observation, by the definition of the cover map, but it was used in the proof of the Alon–Boppana theorem that I mentioned — that in a family of d-regular graphs the spectral gap cannot be better than this, or equivalently, that λ has to be at least 2√(d − 1) minus little o of 1. The proof that appears in the Lubotzky–Phillips–Sarnak paper used this observation. So this was observation one, on the right.
Observation two — and with that I have met my exercise quota for today; that was the little (b) — is the following: if you run simple random walk on T starting at the root, and you condition on this walk being at distance l at time t, then the conditional distribution — this is a probability distribution — is nothing more than the uniform distribution over the set of vertices whose distance from the root is l. This is trivial. But it means that if we run a random walk on G using the first observation, then its distribution at time t, conditioned on the height of the walk on the cover tree being l, breaks down into a sum over tree vertices whose distance is — sorry — whose distance is l. So what I mean to say is something very trivial. Just like for the walk on the tree, where knowing that you are at distance l means you are at a uniform point of that sphere: if we now look at the graph, and I tell you what your distance on the tree from its origin is — I don't want to say distance from the origin in the graph; I have to phrase it in terms of the tree, because the graph may have cycles — then I can write down the distribution as a marginal of these non-backtracking distributions. The only kink comes at the very first step, where the walk on the graph has a choice over d vertices instead of d − 1; that's the only delicate notational bit, so write it on a separate line so that you see it. The random walk is an object on vertices; the non-backtracking random walk is an object on directed edges. So in the first step it chooses a vertex, and that dictates which directed edge I'm at: it chooses one of the d neighbors of the origin. So we have o, and then we have z_1 all the way to z_d, and each of these has just d − 1 choices from there on. So these are the z's, and we have the probability of going from x to φ(z).
This is our directed edge, starting from x and going to φ(z). And we have the probability that Y_{l−1} is there, where Y_t is the non-backtracking random walk, with two coordinates — these are two vertices. So, in other words: to understand simple random walk on the graph, it suffices to run the walk on the tree; if I tell you that you're at distance l, you just look at where the walk on the tree is — uniform over that sphere — and then I pull it down to the graph. This looks like a very simple observation, but it is very useful, because now we can do the following. Just a few more minutes, but we can finish this part. [Question, partly inaudible: the walk on the right-hand side here corresponds to the non-backtracking walk on G?] Yes, this is the non-backtracking random walk on G, starting from (x, φ(z)) — oh, I didn't finish the sentence, my apologies: the non-backtracking walk on G. So the script letters live on T and the non-script ones on G; the notation is ambiguous, but this is my non-script Y. So now, the CLT. Simple random walk on T is transient, so in particular X_t visits o only finitely many times, almost surely. That means that away from o, the height difference |X_{t+1}| − |X_t| is just the increment we already wrote (and erased): it increases by 1 with probability (d − 1)/d and decreases by 1 with probability 1/d. This is just on T_d. So by the CLT, (|X_t| − ((d − 2)/d)·t) / ((2√(d − 1)/d)·√t) converges to a standard normal: the (d − 2)/d is the mean of the increment, and the factor 2 comes from writing the ±1 increment as twice a Bernoulli minus one, so the variance is just four times the variance of a Bernoulli — the variance of a binomial increment. So what can we conclude? We can conclude that if l is t_mix(ε) for the non-backtracking random walk on G — so in particular l goes to infinity —
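The normalisation in this CLT can be verified by direct computation: the increment is +1 with probability (d − 1)/d and −1 with probability 1/d, so its mean is (d − 2)/d and its variance is 4(d − 1)/d², giving the standard deviation 2√(d − 1)/d used above. A quick exact check (the helper name is mine):

```python
from fractions import Fraction

def increment_stats(d):
    """Exact mean and variance of a single height increment away from the
    root: +1 with probability (d-1)/d, -1 with probability 1/d."""
    p = Fraction(d - 1, d)
    mean = p * 1 + (1 - p) * (-1)   # = (d-2)/d
    var = 1 - mean ** 2             # E[inc^2] = 1, so Var = 1 - mean^2
    return mean, var

for d in (3, 4, 5):
    mean, var = increment_stats(d)
    # Var = 1 - ((d-2)/d)^2 = 4(d-1)/d^2, i.e. std = 2*sqrt(d-1)/d.
    print(d, mean, var)
```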
So we'll have our CLT over there. And π is the uniform distribution — I'm not writing that. And t we choose to be (d/(d − 2))·l — that is, l over the speed of the walk — plus s·√l. Now I want to bound the distance between the distribution of simple random walk on G started from x, at time t, and π. Well, I'm looking at this expression here, and I know that — being a marginal of the non-backtracking random walk, which has two coordinates, this being just the marginal of its endpoint — taking a projection can only decrease total variation. So this is a projection of the right-hand side, and it can only decrease. And I chose l to be t_mix(ε), so I write ε here, plus the probability that a standard normal random variable is at least some constant times s. In other words: either the height on the cover tree did not reach l by time t — an event whose probability is controlled by the central limit theorem — or it did reach l, in which case my non-backtracking walk has mixed, by the property above, and I pay an ε for the mixing time. OK. So we are done for today, but I'd like to say what we just proved. We essentially showed that the upper bound — almost always the harder part of getting the right asymptotics of mixing times — reduces to the following problem. The lower bound actually applies here for any d-regular graph; it doesn't have to be random. You can't do better than what this theorem tells you: mixing at this time is optimal for any family of d-regular graphs, simply because it boils down to seeing the vertices — before that, you haven't gone far enough to see almost all the vertices, so you're certainly not mixed. The upper bound tells you the following: suppose you can show that at time log base d − 1 of n — no constants — plus little o of square root of log n, you are mixed.
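The step "taking a projection can only decrease total variation" is easy to check on a toy example. Here is a minimal sketch — the particular joint distributions are mine, chosen only for illustration:

```python
from itertools import product

def tv(mu, nu, space):
    """Total variation distance: half the L1 distance between the vectors."""
    return sum(abs(mu.get(x, 0.0) - nu.get(x, 0.0)) for x in space) / 2

def first_marginal(joint):
    """Project a distribution on pairs (a, b) onto its first coordinate."""
    m = {}
    for (a, _), p in joint.items():
        m[a] = m.get(a, 0.0) + p
    return m

# Two distributions on pairs (a, b), a, b in {0, 1}.
mu = {(0, 0): 0.5, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.2}
nu = {(0, 0): 0.25, (0, 1): 0.25, (1, 0): 0.25, (1, 1): 0.25}

pairs = list(product((0, 1), repeat=2))
tv_joint = tv(mu, nu, pairs)                                   # ≈ 0.25
tv_marg = tv(first_marginal(mu), first_marginal(nu), (0, 1))   # ≈ 0.10
print(tv_joint, tv_marg)
```

This is exactly how the bound is used above: the law of the simple random walk is a marginal (a projection) of the law of the pair, so its total variation distance from π is at most that of the full non-backtracking chain.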
Anything that is little o — anything at all — and then the main-order term you're left with is this root log n, which overshadows the error we had for the non-backtracking random walk, and you get an upper bound that is the right one. I still haven't convinced you that it is the right one, but it is. Tomorrow we will show a log log n window, which is not the true one — the true one is a constant — but the log log gets washed away by the root log n we are adding anyway. So we have a reduction from simple random walk to non-backtracking random walk, which makes our life much easier. And it is valid for any graph: we did not use the fact that this is a random graph. In particular, this is what we will also use for fixed graphs, for the Ramanujan graphs and so on. OK, that's it. Thank you.