So the purpose of this last lecture is twofold. The first part will be to finish a rigorous proof of theorem 2 here, at least point 2 of theorem 2, which I will comment on in a minute. After that, we will go into eigenvector dynamics and see what we can do for eigenvectors, and by the end I will very briefly give motivations for the study of eigenvectors of random matrices.

So what you can see here on the board is a couple of theorems. The first one tells you that there is some invariance of statistics up to some time, and the second one tells you that after a shorter time, you already have the GOE statistics. Okay, so this is the GOE of size n. In particular in the bulk — and we gave the heuristics for this on Tuesday — for times up to $n^{-1/2-\varepsilon}$, so almost up to $n^{-1/2}$, the statistics after running your Dyson Brownian motion have not changed: at time t and at time 0 they are basically the same, up to a small error. My capital F here is just a nice test function. And at the edge you can actually go much further: here I put $n^{-1/100}$ just for the sake of concreteness, but you can go even a bit beyond. And turning to theorem 2, which is much more challenging to prove, you already have relaxation of your dynamics after a shorter time: just time $1/n$ in the bulk and $n^{-1/3}$ at the edge.

I forgot to mention the dynamics I consider: they are, of course, the matrix Dyson Brownian motion, which induces some spectral dynamics. Theorem 1 only relies on the matrix point of view, the fact that the matrix satisfies this equation. But theorem 2 requires the spectral dynamics to be proved, at least the first line, which is the Dyson Brownian motion.

So let me comment a little on the spectral dynamics. I wrote the eigenvector evolution here; the eigenvector associated to $\lambda_k$ is denoted $u_k$. This is something we proved as an exercise yesterday, so everyone here knows how to prove it, right? In particular, what you see is that eigenvectors are very unstable in directions related to eigenvalues that are very close. This is reminiscent of the fact that if two eigenvalues coincide, you cannot even distinguish the eigenvectors. So take, for example, a basis in SO(3), with eigenvectors related to $\lambda_1$, $\lambda_2$, $\lambda_3$: if $\lambda_1$ and $\lambda_2$ are very close, then instantaneously I am going to rotate very, very fast in the corresponding plane, and I am going to rotate at a very slow pace in a plane related to distant eigenvalues. So it will be important for us, and that will be the core of the class today, to understand the mixing time for these dynamics. And they are very high dimensional: the eigenvector dynamics have dimension $n^2$, while the eigenvalue dynamics have dimension n, so in some sense the eigenvectors will be harder to understand.

Now, what I promised you on Tuesday is a proof of theorem 2, and here are some heuristics. Remember that it's a coupling argument; I will just make it one step more rigorous today than it was on Tuesday. The coupling argument is: on the one hand you have your dynamics for $\lambda_k$, but you also choose $y$ at time 0 distributed as the GOE, and you run the same dynamics for y, just with an initial condition different from your $\lambda$. So the dynamics for y are the same Dyson Brownian motion (see the reconstruction just below). And then, by looking at the difference between y and $\lambda$, you get a parabolic equation.
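Before moving on, here are the two coupled evolutions just discussed, as I would reconstruct them from the board — the Dyson Brownian motion driving the eigenvalues (the same equation for $\lambda$ and for $y$, with different initial conditions), and the eigenvector flow mentioned earlier; the precise constants are my assumption:

$$ \mathrm{d}\lambda_k = \frac{\mathrm{d}B_{kk}}{\sqrt{n}} + \Big(\frac{1}{n}\sum_{l\neq k}\frac{1}{\lambda_k-\lambda_l} - \frac{\lambda_k}{2}\Big)\mathrm{d}t, \qquad \mathrm{d}u_k = \frac{1}{\sqrt{n}}\sum_{l\neq k}\frac{\mathrm{d}B_{kl}}{\lambda_k-\lambda_l}\,u_l \;-\; \frac{1}{2n}\sum_{l\neq k}\frac{\mathrm{d}t}{(\lambda_k-\lambda_l)^2}\,u_k. $$

In the eigenvector equation, the first term is the fast rotation in the planes spanned by eigenvectors with close eigenvalues, and the second is the drift keeping $u_k$ normalized.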
So $\delta_k(t) = e^{t/2}\big(y_k(t) - \lambda_k(t)\big)$ satisfies a parabolic equation, which behaves, qualitatively speaking, like $\partial_t \delta_k(t) \approx \frac{1}{n}\,\frac{\delta_{k-1}(t) + \delta_{k+1}(t) - 2\delta_k(t)}{\gamma_k^2}$. Here I only keep the nearest-neighbor interaction in the parabolic equation obtained after the subtraction, and it's not too far from a discrete Laplacian, sped up by $1/(n\gamma_k^2)$, where $\gamma_k$ is the typical gap. In particular, if you inject the typical gaps at the edge ($\gamma_k \sim n^{-2/3}$) or in the bulk ($\gamma_k \sim 1/n$), you find exactly the relaxation times you expect: $n^{-1/3}$ at the edge and $1/n$ in the bulk. Okay, so that was the very first step of heuristics.

I will now do the coupling in a slightly different manner, but morally speaking it's the same thing. Define $x_k^\nu$ at time 0 to be $\nu\, y_k(0) + (1-\nu)\,\lambda_k(0)$. So I just do a linear interpolation in my initial condition, nothing more, okay? Remember, all my eigenvalues are ordered. And you run the Dyson Brownian motion dynamics with this initial condition, and then we will differentiate in $\nu$ to interpolate between both cases. So if I can prove that the derivative in $\nu$ at time t of this object is very small, then by integrating over $\nu$ it will imply that my values of $\lambda_k$ and $y_k$ are very close. Because of course, remark that $x_k^{\nu=1}(t)$ is just equal to $y_k(t)$, and $x_k^{\nu=0}(t)$ is my $\lambda_k(t)$, just because my initial condition matches y or $\lambda$ at the extremities of my parameter $\nu$, and then it follows the same dynamics as we wrote for y and $\lambda$.

Now, the interpolation is at time 0 only, not along the equation itself. It is not true that at time t, $x_k^\nu$ is a linear interpolation between y and $\lambda$ — it's just not true, okay? But still, I will prove that the derivative is a small number. Now you can just differentiate this equation in $\nu$. You need to be careful when differentiating a stochastic differential equation in a parameter, but you can just do it; take a small increment to convince yourself, okay? So in particular, if I define — this is just notation — $a_k(t)$ to be the derivative in $\nu$ of my $x_k^\nu(t)$, multiplied by a factor $e^{t/2}$ to get rid of the Ornstein–Uhlenbeck drift, then by differentiating on both sides here you get the usual parabolic equation for this guy.

The reason I prefer this interpolation to the one I was doing before, by taking directly the difference, is that for $\delta_k$ the denominator was $(\lambda_k - \lambda_l)(y_k - y_l)$, if you remember from Tuesday; here I get the square $(x_k^\nu - x_l^\nu)^2$ instead, and for algebraic reasons this behaves better. So you have this equation, okay? So far there is nothing new compared to what I was saying on Tuesday. If you prove that it smooths a lot, it means that $a_k$ and $a_{k+1}$ will be very close for any $\nu$; and as a consequence, by integrating over $\nu$, the gaps between consecutive $\lambda_k$'s and the gaps between consecutive $y_k$'s are very close, okay? So for edge universality, namely to prove theorem 2 part 2, what we need is the following fact: uniformly in $\nu$ between 0 and 1, for t of order much greater than $n^{-1/3}$, your $a_k^\nu(t)$ is actually smaller than its typical size.
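In symbols, and as far as I can reconstruct the board, the interpolation and its derivative are

$$ x_k^\nu(0) = \nu\, y_k(0) + (1-\nu)\,\lambda_k(0), \qquad a_k(t) = e^{t/2}\,\partial_\nu x_k^\nu(t), $$

and differentiating the Dyson Brownian motion in $\nu$ gives the parabolic equation (note that no martingale term survives):

$$ \partial_t a_k(t) = \frac{1}{n}\sum_{l\neq k}\frac{a_l(t)-a_k(t)}{\big(x_k^\nu(t)-x_l^\nu(t)\big)^2}. $$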
If it's a small $o(n^{-2/3})$, then we already have universality. So in the bulk I was talking about the gap, so there I need to compare $a_k$ and $a_{k+1}$; but at the edge it's not about the gap: your Tracy–Widom statistics is just about the distribution of the relative position. So if you can prove this — and this is what we aim at doing — just integrate in $\nu$ now. You integrate this estimate between 0 and 1 in $\nu$, and what you obtain is that the difference between $y_k$ and $\lambda_k$ is a small $o(n^{-2/3})$: $\lambda_k(t) = y_k(t) + o(n^{-2/3})$. And this is enough for Tracy–Widom. So we want to prove this estimate: the individual coupled eigenvalues, after some time, are closer than the scale of their natural fluctuations — they just stick together. The gap between them is smaller than the typical scale of the distribution, so it's enough for universality.

[Question: why $y_n(t)$, $\lambda_n(t)$?] Oh, that's n, yeah — what I meant by k here is n, thank you. Actually, it's true for any edge eigenvalue, any of the first 10 eigenvalues, for example.

So we want to understand this. And the reason I focus on the edge here is that there is a particularly simple proof there. It was particularly difficult to make these types of heuristics rigorous, because your $\lambda_k$'s may almost collide. There are some shocks: even though they will never exactly collide, they may get very close together, and it happens. And to deal with these shocks, the key is to introduce the following observable. You define $f_t(z)$ — from now on I just omit $\nu$ in my notation, because everything I'm going to say is uniform in $\nu$ anyway, and I want to prove that this is small for one fixed $\nu$. So you define this object; it's a strange object. It is the sum over k of the derivative in $\nu$ of your $x_k(t)$, divided by $x_k(t) - z$. This can be seen as a derivative of the characteristic polynomial in the parameter $\nu$, but we will not really use this fact.

So what's the idea? It's very hard to control $a_k$, but maybe some average of the $a_k$'s is going to be easier to control. When you take an average of the $a_k$'s, you take a sum of weights times the $a_k$'s, and the point is that the weights I'm going to use here are the imaginary parts of $1/(x_k - z)$, because we know that, properly normalized, these give a probability measure: it's the convolution of the empirical spectral measure with a kernel at some scale. So we just introduce this guy, summed over all k. And now it's an exercise to prove that this f satisfies a stochastic advection equation, and then it's just a one-line argument to conclude about universality. On this one you have to trust me, but it's really just a calculation.

So let me explain the idea a bit. If I take any average of the $a_k$'s and I try to understand the evolution of this average with time, in my evolution of the $a_k$'s there are shocks; I have this singularity. So it's very hard to control any type of equation that emerges from a generic average. On the other hand, if in my average I put the evolution of $x_k$ itself, then a miraculous cancellation occurs, which gives an equation with no shocks, no singularity.
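In symbols, the observable is, up to a normalization which is my assumption (for example the placement of the $e^{t/2}$ factor):

$$ f_t(z) = \sum_{k=1}^{n}\frac{a_k(t)}{x_k(t)-z} = e^{t/2}\sum_{k=1}^{n}\frac{\partial_\nu x_k^\nu(t)}{x_k^\nu(t)-z}, \qquad z = E + i\eta, \ \eta > 0, $$

so that $\operatorname{Im} f_t(z) = \sum_k a_k(t)\,\operatorname{Im}\frac{1}{x_k(t)-z}$ is exactly the weighted average of the $a_k$'s just described.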
Now, how should we understand this equation? We should understand that the second line is just an error term and the first one is the dominant term. So $S_t$ is a Stieltjes transform — s always stands for the Stieltjes transform, thanks. My $S_t$ here is $\frac{1}{n}\sum_k \frac{1}{x_k(t)-z}$; that's my Stieltjes transform. And what we will see is that all the rest is just an error term. Now, $S_t$ is the Stieltjes transform of, somehow, a Wigner matrix, so by the local law we expect it to be very close to the limiting Stieltjes transform: $S_t(z)$ is supposed to be about $m(z)$, and remember your $m(z)$ is $\frac{-z + \sqrt{z^2-4}}{2}$, the Stieltjes transform of the semicircle law. So that's an approximation: you add $z/2$ and you are left with $\frac{\sqrt{z^2-4}}{2}$. And the remaining part is an error term.

So let's understand why it is an error term. This holds for z on mesoscopic scales; everything here is on the mesoscopic scale, so my $\eta$ is of scale greater than the typical gap between the eigenvalues below. So think about it like this: here I have an object of order 1, because it's just a Stieltjes transform, a bounded quantity. Here I take a first-order derivative, and there a second-order derivative, so it's typically of bigger size; however, I have an extra $1/n$. So as long as I am on a scale $\eta \geq n^{-1+\varepsilon}$, it is of smaller order. And this last term you can also control and prove it's of smaller order.

So this equation, let's call it (1). (1) is well approximated — and the approximation is easy to justify — by $\mathrm{d}f_t = \frac{1}{2}\sqrt{z^2-4}\;\partial_z f_t\,\mathrm{d}t$. Everyone follows at least the chain of approximations here?

So what type of equation is this? It's an advection equation. Think about what happens: you start at some z, and the derivative in time of the value there is governed by $\frac{1}{2}\sqrt{z^2-4}$. If you are in the bulk, for example between $-2$ and $2$, this is at first order basically purely imaginary, so it's something like $i$ times a constant times $\partial_z$. And an equation with $i$ times a constant times $\partial_z$ is exactly the most standard type of advection equation: it means that the value at time t here is equal to the value at time 0 a bit higher up. Okay, I will go into more details after. So it really tells you — because you have the good sign here, it's a plus — that it's regularizing: this $f_t$ at time t on very small scales is going to be given by $f_0$ on larger scales.

Now, it happens that this equation can be solved exactly: this type of advection equation is solved by the method of characteristics, and for this particular coefficient there is an explicit formula for the characteristics. So this is an advection equation with solutions of the following form: $f_t(z)$ is equal to $f_0$ evaluated at a deterministic point $z_t$. Where's my $z_t$? Okay, it's a bit of a formula, but not so bad. And this you can just check. Of course, it's not for any coefficient that you would be able to find the characteristics explicitly, but here you have a formula.

So what do the characteristics look like? As I told you, in the bulk, if you start at $z_0$ somewhere here, between $-2$ and $2$, the characteristics take you almost straight up.
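Here is a hedged reconstruction of the equation and of its characteristics; the error terms are only indicated schematically, and you can check by differentiation that this $z_t$ solves $\frac{\mathrm{d}}{\mathrm{d}t} z_t = \frac{1}{2}\sqrt{z_t^2-4}$ with the branch $\sqrt{z^2-4}\sim z$ at infinity:

$$ \mathrm{d}f_t(z) = \Big(S_t(z)+\frac{z}{2}\Big)\,\partial_z f_t(z)\,\mathrm{d}t + \text{error terms}, \qquad S_t(z) = \frac{1}{n}\sum_k\frac{1}{x_k(t)-z}, $$

and since $S_t(z)\approx m(z) = \frac{-z+\sqrt{z^2-4}}{2}$ by the local law,

$$ \mathrm{d}f_t(z) \approx \frac{\sqrt{z^2-4}}{2}\,\partial_z f_t(z)\,\mathrm{d}t, \qquad f_t(z) = f_0(z_t), \qquad z_t = \frac{z+\sqrt{z^2-4}}{2}\,e^{t/2} + \frac{z-\sqrt{z^2-4}}{2}\,e^{-t/2}. $$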
And so this — sorry, this is $z_0$. But at the edge it's a bit different: the characteristic first travels transversally. At first order, it will be like a parabola, like this. So $z_0$ is just my initial point, on whatever mesoscopic scale you prefer, and the curve gives $z_t$. In particular, what it means is that the average of the $a_k$'s that I introduced in this observable is, after a long time, just like taking the average on a bigger scale before. Okay? So it's an automatic smoothing effect; it's just that this is a good observable that makes it completely transparent.

So now let's understand what happens at the edge in terms of scales. Let me take $z_0$ to be one point on the mesoscopic scale at the edge, which is $2 + i\,n^{-2/3}$. Remember that the gap between two eigenvalues at the edge is $n^{-2/3}$, so I just consider this scale. Actually, I need to be on a mesoscopic scale for my theorem to apply, so let's add some epsilon: $2 + i\,n^{-2/3+\varepsilon}$, that's fine. There I can apply all types of local laws, and my approximation by the deterministic advection equation is correct. And now, if you just Taylor-expand the formula, you find that at first order $z_t$ is 2 plus a real term of order $t^2$ — that's my $z_t - z_0$ — plus $i$ times a term of order $t\,n^{-1/3}$. That's an expansion from the formula here. So it means that you really have this parabolic shape at the edge. [Audience correction.] Yes, because I dropped the $z_0$ term, thank you. Fortunately, you're around.

So what it means is that after time t, what I observe here is going to be like what I observed initially quite far away from the spectrum. And when I am far away there, the average of what I see just below is basically zero; it's very small. So it means that my $a_k$, the quantity I was interested in from the start, becomes very small, naturally, by this equation. This implies that for t much greater than $n^{-1/3}$, the real term of order $t^2$ is going to win over the imaginary one, which means that you are really far away from the spectrum: you don't even see it anymore in the average I consider. And as a consequence, what you end up with is the fact that $f_t(z_0) = f_0(z_t)$ — this we already knew at first order — and you can now just check that this is a small $o(n^{-2/3})$.

So many calculations have to be checked here, I'm sorry about it; I don't have time to give full details. But I just want you to remember that if you consider this type of observable, then all the shocks you may be annoyed by just disappear, and you have an automatic smoothing effect which gives you the type of regularization you are interested in. Any questions about this?

[Question: where does the cancellation of the shocks come from?] The cancellation of the shocks is just because you compare the equation for $a_k$ and for $x_k$. So apply Itô to your $f_t(z)$: of course, you have to look at the usual derivative in time of $a_k$ — $a_k$ is smooth in time, $a_k$ has no martingale term. Its evolution makes some shocks appear; but on the other hand, because of the $x_k$ in the denominator, which you also differentiate, you get other shocks, and they happen to just cancel each other.

Is the proof clear? So I presented it at the edge because it's simpler at the edge than in the bulk.
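For the record, here is the edge expansion with my indicative constants: with $z_0 = 2 + i\eta$, $\eta = n^{-2/3+\varepsilon}$, one has $\sqrt{z_0^2-4}\approx\sqrt{2\eta}\,(1+i)$, so the characteristic above expands as

$$ z_t \approx 2 + \frac{t^2}{4} + t\sqrt{\eta/2} + i\big(\eta + t\sqrt{\eta/2}\big), \qquad \sqrt{\eta} = n^{-1/3+\varepsilon/2}, $$

and for $t \gg n^{-1/3}$ the real displacement of order $t^2$ dominates all other terms, so $z_t$ moves far away from the spectrum.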
You just have this characteristic taking you away, so it becomes small, and that's it. And also, I believe that for the purpose of proving Tracy–Widom in whatever model you like, introducing this observable just makes it very, very transparent.

All right, so if you don't have more questions, we will go to eigenvectors. Okay, so what do we want to prove for eigenvectors? You have, let's say, H a Wigner matrix. And remember my notation for eigenvectors: they are here, these are the $u_k$'s. And for any deterministic sequence $q_n$ of unit vectors in $\mathbb{R}^n$, you take $u_k$ projected on $q_n$ — this $u_k$ depends on n implicitly — and you multiply by $\sqrt{n}$: this converges to a standard Gaussian. So this is the first part of the statement. So what do you expect for your eigenvectors? That they are somehow uniform on the sphere, right? Because this is the case for the GOE: for the GOE it's completely transparent, because the model itself was designed so that you have invariance by orthogonal conjugation. Here it's something you naturally expect. And you can choose your favorite direction. Of course, you have a representation of the uniform measure on the sphere as a sequence of independent Gaussians, normalized, so for the uniform measure on the sphere this theorem would be obvious. So that's the first thing we want to understand.

And the second one is the fact that the eigenvectors are flat, and this is something we may call a probabilistic version of quantum unique ergodicity. I mean that for any deterministic index set I in $\{1, \dots, n\}$ — you pick your favorite one — you look at the $\ell^2$ mass given by your eigenvector on I: the sum of $u_k(\alpha)^2$ over all coordinates $\alpha$ in I. And you wonder how much it deviates from what you expect: if the eigenvector were completely flat, each $u_k(\alpha)^2$ would be of order $1/n$, because each coordinate has size $1/\sqrt{n}$. So you compare with that — you recenter — and you consider the probability that this recentered object is large, and you want to prove that this probability goes to 0 as n goes to infinity, for any positive epsilon. If I is a macroscopic proportion of the coordinates, then this is a good normalization, because the mass is an object of order 1 typically, and we want to prove that it's unlikely for the recentered version to be greater than any constant. But you may also choose an I which is just a small fraction of the coordinates; it also works, you just need to multiply by the factor n over the cardinality of I. Now, if I is just one coordinate, this recentered object times n is just an order-one random variable, so there is no chance the probability is small; so I also need the cardinality of I to go to infinity. Okay, and now the statement is correct.

I will not come back to the origins of this name. It was introduced by Rudnick and Sarnak in the context of manifolds, a context very close to the Bohigas–Giannoni–Schmit conjecture I mentioned: for generic manifolds, you expect that in the semiclassical limit the eigenstates occupy space very well in terms of their $L^2$ norm. This is a modest probabilistic analogue here.
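Just to make the first statement concrete, here is a minimal numerical sketch of my own (not from the lecture; the GOE normalization and the choice $q = e_1$ are assumptions for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 200, 400
q = np.zeros(n); q[0] = 1.0            # deterministic unit vector, here e_1
samples = []
for _ in range(trials):
    # GOE normalization: symmetric, off-diagonal variance 1/n, spectrum ~ [-2, 2]
    a = rng.standard_normal((n, n))
    h = (a + a.T) / np.sqrt(2 * n)
    _, u = np.linalg.eigh(h)
    k = n // 2                          # a bulk eigenvector
    samples.append(np.sqrt(n) * (u[:, k] @ q))
samples = np.array(samples)
print(np.mean(samples), np.var(samples))  # should be close to 0 and 1
```

The empirical mean and variance of $\sqrt{n}\,\langle u_k, q\rangle$ should come out near 0 and 1, consistent with the Gaussian limit.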
So, how do you prove these types of things based on the dynamics? Theorem 1 is going to work exactly the same way: you prove that you have invariance of your eigenvector statistics up to some time, thanks to the local law as an input, which is the fundamental object here. You can just prove theorem 1 in the same way: remember that for theorem 1 we were only using the fact that the entries of the Green function were bounded; we were not using the dynamics at all. But we want to understand the analogue of theorem 2, exactly, for eigenvectors now.

[Question about the quantifiers.] This is true for all k. So, things are a bit imprecise here, thank you: your I is a deterministic subset of $\{1, \dots, n\}$, which may vary with n, but a deterministic one. And k, same thing: it's a sequence of indices, all of them between 1 and n, but a deterministic one. And then it's correct.

[Question: is this probability statement for one k, or for all of them together?] This one? Yes — this one is for one k. It's a challenge to prove it for all of them together, and it's a challenge to get the absolutely optimal error term here; as I will mention by the end of the lecture, this is actually important. If you don't prove it for all k's, then I think it shouldn't be called quantum unique ergodicity: unique ergodicity means that all eigenfunctions are flat. So call it quantum ergodicity if you want; but quantum ergodicity, on the other hand, is not quite that either, because it involves a bigger average. But that's perfectly fine.

[Question: in the first part, k is also a deterministic sequence?] Yes, yes. And theorem 1 is still true for observables which are, for example, a function of this projection, a function of $\sqrt{n}\,\langle q_n, u_k\rangle$, on the same time scales. It's going to be correct. So we don't really care about theorem 1 here.

For theorem 2, we want to analyze these dynamics and understand the mixing. You see, as I mentioned at the beginning of the class, what is complicated is that some directions are going to move very, very fast and some others will not; how does it all combine to give something uniform on the orthogonal group? That's the question. I would love to have a proof based on general arguments, just identifying this as a diffusion with a very good convexity-type constant, but we don't have such a proof. What you can do is to introduce, again, a set of observables which behave well under these dynamics. And theorem 2 for eigenvectors is just going to be true on the same scales, for the same F.

So here is a sketch of proof. For eigenvalues we were proceeding by coupling, but here I have no idea how to couple — this is dimension $n^2$, and I could not find any way to make it parabolic. On the other hand, what we want to prove convergence to is much simpler: it's just a Gaussian. So maybe, just by taking moments, we can prove convergence to the moments of a Gaussian. For this, let's look at the very first moment, let's try to understand why it goes to zero, and then the second one; and let's try to understand the heuristics.

Okay, first moment. First, I just make my life easier: I will not take a general $q_n$, I will just take the first coordinate of the canonical basis. So take $q_n = e_1$; let's prove it in that case, it will be convincing enough. Okay? So what do we get here? I just project the eigenvector evolution on $e_1$: it's an equation involving all the projections of my eigenvectors on $e_1$.
And if I take expectations, the martingale term goes away, so I just end up with an exponential decay to zero. So for the $\frac{d}{dt}$ of the expectation of my $\langle u_k(t), e_1\rangle$ — and I condition on my eigenvalue path $\lambda$ from zero to infinity; I will comment a little more on this — what you obtain is $-\frac{1}{2n}\sum_{l \neq k} \frac{1}{(\lambda_k - \lambda_l)^2}$, multiplied by the same expectation.

The reason I can do this conditioning and take my $\lambda_k$'s out is that these Brownian motions here — and that's a very important fact — are completely independent: the noise driving the evolution of the eigenvalues is independent of the noise driving the eigenvectors. So you can view these coupled dynamics in two steps, and that's a key point here, actually a very nice fact about these dynamics: imagine you are first given the eigenvalue path, and on top of it, conditionally on this eigenvalue path, you run the eigenvector dynamics. These dynamics, which we proved yesterday, were first obtained in the context of covariance matrices by Marie-France Bru in the 80s; it's even simpler to prove them in our case, the covariance matrices require even more attention. Those are really interesting dynamics.

So now, what is the size of this rate? Well, if you are in the bulk, you know that $\lambda_k - \lambda_l$ is like $(k-l)/n$, so the rate will be of order n; and it will be $n^{1/3}$ at the edge. So this is like n in the bulk and $n^{1/3}$ at the edge. Okay. So obviously, here it's extremely simple to prove that the expectation goes to what you want on the good time scales, and at an exponential rate; it's a very fast, very strong convergence.

But this is a bit too easy. Let's look at the second moment, and let's give it a name: call $g_t(k)$ the expectation of my $u_k(1)^2$, conditionally on my path. Now it's an Itô's formula computation on the square, and you will find that you get the exact same parabolic equation as the one we are used to. Okay, so again, exercise: you take the equation for the $u_k(1)$'s after projecting on $e_1$ — the equation of dimension $n^2$ becomes an equation of dimension n once you project — you apply Itô to the square, take expectations to make the martingale terms go away, and this is what you get. But for this equation, we just know that it regularizes on these time scales: this is basically everything I told you about eigenvalues, it is the very same equation. So: regularization on the same scales.

So we know the first and second moments converge to the Gaussian ones, but this is not very surprising: if you believe in some symmetry between the $u_k$'s, the limit of the expectation of $u_k(1)^2$ is forced anyway, because the eigenvector is normalized in $\ell^2$. What's more difficult to understand is the higher moments. If you take $u_k(1)^4$, for example, then it is no longer true that the equation is autonomous in these observables: the evolution of $u_k(1)^4$ will also involve mixed terms like $u_k(1)^2\, u_l(1)^2$. So you need to enlarge your space here to get the good point of view — to enlarge your state space.
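In formulas, conditionally on the whole eigenvalue trajectory $\lambda$, and with the constants as I reconstruct them from the eigenvector flow above:

$$ \frac{\mathrm{d}}{\mathrm{d}t}\,\mathbb{E}\big[\langle u_k(t),e_1\rangle \,\big|\, \lambda\big] = -\frac{1}{2n}\sum_{l\neq k}\frac{1}{(\lambda_k(t)-\lambda_l(t))^2}\;\mathbb{E}\big[\langle u_k(t),e_1\rangle \,\big|\, \lambda\big], $$

$$ \partial_t\, g_t(k) = \sum_{l\neq k}\frac{g_t(l)-g_t(k)}{n\,(\lambda_k(t)-\lambda_l(t))^2}, \qquad g_t(k) := \mathbb{E}\big[\langle u_k(t),e_1\rangle^2 \,\big|\, \lambda\big]. $$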
The good way to do this is as follows — it's really a random walk in a random environment perspective: the random environment is given by your eigenvalue trajectory, which is random, and on top of it you run a random walk. So, for general moments, here is how it goes; I need to first talk about something completely different. You look at configurations of particles on the set of sites $\{1, 2, 3, \dots, n\}$ — let's call a configuration $\eta$; in the particle systems point of view, that's the traditional notation. So let's say, for example, that we have three particles: 2 at site i and 1 at site j. This is one configuration $\eta$ of my three particles. These particles are indistinguishable, and I just decided, as an example, how to place the three of them somewhere between 1 and n. So this is my $\eta$, and here I have three particles; this total number of particles, let's call it p.

And to this $\eta$ I associate an observable $g_t(\eta)$, which will be the expectation of a mixed moment depending on the particle positions, normalized by the corresponding Gaussian moment. So you take $g_t(\eta)$ to be the expectation of the product over all k of $u_k(1)^{2\eta_k}$ — an expectation conditionally on the eigenvalue trajectory, again — where my $\eta_k$ denotes the number of particles of $\eta$ at site k. And I normalize by what I expect in the limit: I expect these coordinates, rescaled by $\sqrt{n}$, to be asymptotically independent standard Gaussians, so I just normalize by the corresponding Gaussian moments.

And now the good algebraic fact is that $g_t(\eta)$ satisfies a parabolic equation, just as before, except that the state space is not just one site index k anymore: it is the set of configurations, and these particles will move. The fact is — and please try to prove it as an exercise, making sure you keep your $\eta$'s straight — that the derivative in time of $g_t(\eta)$ is a sum over pairs $i \neq j$ of $g_t(\eta^{ij}) - g_t(\eta)$, where $\eta^{ij}$ means I take one particle at site i and bring it to site j, divided by $(\lambda_i - \lambda_j)^2$, with a combinatorial factor which is twice the number of particles at i times one plus twice the number of particles at j. Is this readable? So remember: $\eta_k$, with a subscript, is the number of particles at k; $\eta^{ij}$, with superscripts, is the configuration obtained after taking a particle at i and bringing it to j. Of course, you cannot bring a particle from i to j if there is no particle at i, but this is fine, because of the factor $2\eta_i$ in front.

So anyway, it's an equation, and again these coefficients are positive, so it's of parabolic type, and all these observables will become equal: in the large n and large time limit, all the $g_t(\eta)$ become equal, which means that the moments converge to the Gaussian ones. Once you know they are all equal, it's easy to justify that they are actually equal to 1 — it's just a normalization — and if they are equal to 1, because these are ratios with the Gaussian moments, you get exactly what you expected.

Okay, I think the message here, with the two examples I gave you — one good observable for edge universality, and one good observable for eigenvector universality — is that, for proving that you have GOE statistics, or Tracy–Widom, or Gaussian projections and quantum unique ergodicity and so on, you cannot rely on any explicit integrability, obviously; but you can rely on some integrability of the dynamics: you exhibit observables that satisfy nice equations. Okay, so are there questions about this? I just gave the main ideas; of course, from there, proving that the observables actually become equal is technical work. And this equation is true for any fixed $\eta$.
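Written out, this is the eigenvector moment flow; I am reconstructing the normalization and the combinatorial factor from memory, so the constants should be checked against the board:

$$ g_t(\eta) = \frac{\mathbb{E}\Big[\prod_k \big(\sqrt{n}\,\langle u_k(t),e_1\rangle\big)^{2\eta_k}\,\Big|\,\lambda\Big]}{\prod_k \mathbb{E}\big[\xi^{2\eta_k}\big]}, \qquad \xi\sim\mathcal{N}(0,1),\quad \mathbb{E}\big[\xi^{2m}\big]=(2m-1)!!, $$

$$ \partial_t\, g_t(\eta) = \sum_{i\neq j}\frac{2\eta_i\,(1+2\eta_j)}{n\,(\lambda_i(t)-\lambda_j(t))^2}\,\big(g_t(\eta^{ij})-g_t(\eta)\big). $$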
So you really need to think of it this way: your three particles are going to move, they will jump, and they will converge to the equilibrium measure through these dynamics; the jump rates depend on the inverse of the squared distance between the eigenvalues. So a particle is very likely to jump to its nearest neighbor, and not very likely to go far in just one jump.

[Question: when you say they jump, do you mean you interpret $g_t$ as the density of the particles?] No — what I mean by "they jump" is that this is the generator of a jump process on the space of configurations $\eta$. [So $g_t$ is a sort of density function, and you look at the probability of being at the configuration $\eta$?] Yes, okay, you can consider it as what you said; note that if you start with a Dirac mass, for example, it will just completely spread out. So the $g_t$'s will all become flat.

The analysis of this equation can really be done only if you know the equilibrium measure well, because you need tools like Dirichlet forms and so on; you need the equilibrium measure to talk about Dirichlet forms, to use reversibility, things like this. So you need an explicit equilibrium measure. It happens that if you consider the GUE dynamics, the equilibrium measure on the space of configurations $\eta$ is just the uniform measure: any $\eta$ is as likely as any other. It's not obvious from the equation, but it's true. If you consider the GOE, it's slightly different — I mean, the GUE is a bit simpler and gives this uniform equilibrium, while for the GOE the equilibrium measure is more complicated, but still explicit. So once again, it's a strange fact that the GUE is simpler than the GOE from a calculation perspective. I don't have time to write down what the equilibrium measure is, but it's not a complicated one, even for the GOE.

[Question: what is the equilibrium measure — is it given by $g_t(\eta)$ at $t = \infty$?] No, no. Let me explain: imagine you start with a configuration $\eta$ at time 0, and you put a clock on each possible jump — for example, on this link between a particle and an empty site — which rings at the corresponding rate; when the clock rings, the particle jumps. Okay, so that's what happens. So $\eta$ itself will not converge to anything, but the probability distribution of $\eta$ will converge to the equilibrium measure.

If this is clear, maybe let's go to the motivations for the study of eigenvectors. Okay, so is this clear?

[Question: can't you conclude with a spectral gap estimate, uniformly in $\lambda$?] So, you cannot really do that. The problem, again, is that some directions are good and some are bad: when the $\lambda_k$'s are close, those directions mix very fast, but not when they are far apart. What we use here, technically speaking, is the maximum principle: you take the maximum over all configurations — you pick the configuration $\eta$ for which $g_t(\eta)$ is maximal — and you prove that its derivative in time decreases at least at some rate, by the maximum principle. For this, as an input, we need the local law from the very first class; then it's a Gronwall argument, and it proves that $g_t(\eta)$ goes to 1 very fast. So it's a more classical PDE argument here. Any other questions?
So let me tell you a little bit about the problem of going beyond mean-field, okay? It will be related to the question of how strong our quantum unique ergodicity is — is it really unique ergodicity or not, things like this. So let's consider band matrices.

But before going into that, I just want to mention that this is far from a closed subject. In particular, what we proved here is that some observables — these projections of the eigenvectors — converge to equilibrium very fast. We do not know that the measure itself is close to Haar: the law of the full eigenvector matrix on the orthogonal group being close to Haar in total variation — we just don't know it; it's way too strong for now. If you find an argument for this, it would be very interesting. It would be a way to make a link with last week's general mixing-type arguments; but here the dimension diverges.

So, band matrices — I still have the statements on the board. Here is a classical problem, which is also related to the so-called Anderson transition that was probably mentioned by Simone last week. You now consider not a Wigner matrix, but a matrix with entries nonzero only close to the diagonal: in other words, you take some bandwidth, which for convenience I will call $4W - 1$, while my matrix is still of size $n \times n$, and you wonder about the spectral properties of this matrix. My $h_{ij}$'s are independent, with finite moments of all orders; but now, if W is very small compared to n, it's basically a sparse matrix — not only a sparse matrix, but a sparse matrix with an imposed geometry. We take the variance $\mathbb{E}\,h_{ij}^2$ to be $\frac{1}{4W-1}$ inside the band; this way, the sum of the variances is 1 on each row, and the semicircle distribution is the limit on the same scale — this we can prove. So you still have a semicircle distribution between $-2$ and $2$, the entries have finite moments of all orders, and you wonder what happens, for example, for the eigenvectors, or for the eigenvalue statistics of this matrix.

The extreme cases we know. When W is of order n, so that the matrix is essentially full, you end up with the GOE statistics and delocalization of eigenvectors — this is what we just proved for Wigner matrices. But if the matrix is completely diagonal, then of course the eigenvectors are completely localized, and the statistics — typically, if the distribution of the entries is smooth — would be Poisson. So is there a phase transition? It's really a question about a transition here.

The conjecture, by Fyodorov and Mirlin — and there were numerics about it in the 80s and 90s — is that the transition occurs exactly for W of order $\sqrt{n}$: for W much smaller than $\sqrt{n}$ you have Poisson statistics plus localization, and for W much greater you have GOE statistics in the bulk, say, plus delocalization. I must say that at the edge things are pretty well understood, thanks to the work of Sodin. The reason the edge can be handled is that the moment method applies whether or not the matrix is full: you can take moments, and with very smart combinatorial tricks, like Sodin's, you can compute the moments even for the band matrix; and the large moments give you access to the largest eigenvalue — they just single it out. In the bulk, there is no such argument. So this is a one-dimensional band matrix, but it's good to keep in mind that the model is not restricted to dimension one.
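As an illustration of the model, here is a small numerical sketch of my own (the normalization follows the lecture; the inverse participation ratio at the end is an assumed, standard way to probe localization):

```python
import numpy as np

rng = np.random.default_rng(1)
n, w = 1000, 20                        # matrix size and band parameter W
# 1D periodic band matrix: H_ij nonzero iff dist(i,j) < 2W, i.e. bandwidth 4W-1,
# with variance 1/(4W-1), so each row of variances sums to 1.
i = np.arange(n)
d = np.abs(i[:, None] - i[None, :])
dist = np.minimum(d, n - d)            # periodic distance on {1,...,n}
mask = dist < 2 * w
a = rng.standard_normal((n, n))
h = np.where(mask, (a + a.T) / np.sqrt(2 * (4 * w - 1)), 0.0)
eigvals, eigvecs = np.linalg.eigh(h)
print(eigvals.min(), eigvals.max())    # spectrum approximately in [-2, 2]
# inverse participation ratio: of order 1/n if delocalized, much bigger if localized
print(np.sum(eigvecs[:, n // 2] ** 4))
```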
This band matrix is the analogue of a random Schrödinger operator in dimension one, but you could also consider a band matrix in dimension two: namely, you take a box in $\mathbb{Z}^2$, say with periodic boundary conditions, and you decide that each vertex interacts not only with its nearest neighbors but with all neighbors up to distance W. This works in any dimension. And it's a difficult problem: for this model in dimension two, for any W of size $n^\varepsilon$, you are supposed to be in the GOE, delocalized regime — so there is no transition at any polynomial scale; the transition is supposed to occur at a logarithmic scale in dimension two. But let's stick to dimension one. Of course, this model was introduced as a supposedly easier model for the Anderson transition.

So what is known? On the Poisson and localization side, there is the work of Schenker, who proved that for W up to $n^{1/8}$ you indeed have localization. On the delocalization side, there are basically two approaches. The first approach is supersymmetry — this is a term you already heard about in Christoph's talk, supersymmetry and Berezin integrals. The type of result that is known is that if your matrix has a specific variance profile and the entries are Gaussian, then you have a representation of the Green function in terms of very high dimensional integrals — which I don't claim to fully understand — and you can perform a rigorous asymptotic analysis. It was proved by Tatiana Shcherbina that if W is greater than a small constant times n, and the entries are Gaussian, then indeed you have GUE local statistics and delocalization. Moreover, Tatiana and Maria Shcherbina have proved that you really do have the transition at $\sqrt{n}$ — but not for the local statistics: for the characteristic polynomials, in some sense, which for technical reasons are an easier object to handle. For the local statistics, the transition is just not known for any model. With the supersymmetry method there is also the work of Bao and Erdős, who prove delocalization in some regime — I think W down to $n^{6/7}$, something like this. But this is always either for Gaussian entries, or for entries that match Gaussian random variables up to the fourth moment, because then you can do some moment matching.

What I want to mention in the next 15 minutes is how the quantum unique ergodicity picture and the dynamics can also help to understand this problem, on the delocalization side. This allows us to do something quite modest: W greater than a small constant times n — the constant c could actually be like $1/\log\log n$ if you want, but we are not at polynomial scales — and the entries don't have to be Gaussian; any entries are fine. Okay, so it's a famous problem; the conjecture was stated by Fyodorov and Mirlin.

And what fails here? Remember our global approach: the second point tells you that after some time you already have relaxation. Actually, theorem 2 may still work: you start with your band matrix spectrum — not a Wigner spectrum — you run the dynamics a little bit, and proving that after a little bit of time you have GOE statistics is fine; not easy, but fine. But theorem 1 has absolutely no chance to be true. Remember that theorem 1 relied on the fact that the variance profile was constant along the dynamics: if you start with a matrix where
the geometry is imposed, this is completely wrong. So theorem 1 fails for this proof, and you need to do something different.

What I want to explain in the remaining time is the technique of mean-field reduction. I think it's worth remembering as a general technique: if you have a model where some geometry is imposed, you can maybe come back to a mean-field problem. Here is one way to think about it. You have your matrix — let's call it H — written in blocks as $\begin{pmatrix} A & B^* \\ B & D \end{pmatrix}$, where I take the block A of size $2W$ and D of size $n - 2W$; so my A is this corner of the band. Now you write down what the eigenvalue equation means after a Schur complement with respect to A. You write your eigenvector in blocks $w_k$ and $p_k$ — this is just the block decomposition of $u_k$, a typical eigenvector of H — and you assume that H applied to it gives $\lambda_k$ times itself. This implies the following fact: a matrix that I'm going to define, $Q_e$, evaluated at $e = \lambda_k$, satisfies $Q_{\lambda_k} w_k = \lambda_k w_k$. And this $Q_e$ is just the Schur complement: $Q_e = A - B^*(D - e)^{-1}B$. So you define this for a fixed parameter e; this is your matrix.

Now, when you do the Schur complement, e becomes random itself: it's $\lambda_k$. So it's a complicated equation, because of the dependence on $\lambda_k$: you cannot say "I have a fixed matrix and I look for its eigenvalues", because the matrix itself depends on the eigenvalue. But let's draw a graph of the eigenvalues of $Q_e$ as a function of e; what we are interested in will then be the intersections with the diagonal, the x = y axis. So here is the big picture I want to draw: here is the parameter e, and for each fixed e I put the $2W$ eigenvalues of $Q_e$. They vary continuously in e, except when I cross an eigenvalue of D itself, because there I have a singularity. So let's single out the eigenvalues of D, which will be the singularities: call them $\mu_1, \mu_2, \dots, \mu_{n-2W}$ — the $n - 2W$ eigenvalues of D. My eigenvalue curves of $Q_e$ then have a vertical asymptote at each $\mu_i$. So typically, for one given e, I have my $2W$ eigenvalues somewhere here; they evolve, and this one dives at the first $\mu_1$, this one at the second $\mu_2$, and so on; but when one curve dives, another one appears. So you get a graph of this type, and they may dive again — anyway, it's a complete mess, if you want, but that's fine; it looks something like this. And the eigenvalues you are interested in are the intersections with the diagonal, because $Q_\lambda$ needs to have eigenvalue $\lambda$: so you are interested in these crossing points.

Now, how can you really understand this? Let's zoom in. The first time we considered these graphs, it was not clear at all whether there was some structure; but it's very simple. If you zoom in — you take n extremely large and you zoom around one given abscissa — what you observe is that the blue curves are basically parallel lines in the limit. They never intersect — the fact that they never intersect is qualitatively easy to prove — but more than that, it really looks like they are parallel: not necessarily at 45 degrees, but parallel.
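To summarize the reduction in formulas (a hedged reconstruction; the expression for $p_k$ assumes $\lambda_k$ is not an eigenvalue of D, which is almost surely the case):

$$ H = \begin{pmatrix} A & B^* \\ B & D \end{pmatrix}, \qquad H\begin{pmatrix} w_k \\ p_k \end{pmatrix} = \lambda_k \begin{pmatrix} w_k \\ p_k \end{pmatrix} \;\Longrightarrow\; Q_{\lambda_k} w_k = \lambda_k w_k, \qquad p_k = (\lambda_k - D)^{-1} B\, w_k, $$

with $Q_e := A - B^*(D-e)^{-1}B$ for a fixed parameter e.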
So if this is really the case, here is what it means. For one given e, say here, you consider the eigenvalues of $Q_e$. Now $Q_e$ is a matrix of the following type: some complicated object, plus the mean-field block A — and A is independent of the other part. So for fixed e, conditionally, it's one deterministic matrix plus a mean-field matrix: it's a mean-field problem, and the techniques I mentioned in this class apply. There is a lot of work to do, but they apply. So we actually know that, for fixed e, the gap statistics of these curves are GOE. But now, if the lines really are parallel, you just make a projection onto the diagonal, and it means that the crossing points will also have GOE statistics. Okay? So that's how you want to proceed.

Of course, I'm just pushing the problem further here: now I want to prove that the lines are parallel. And how do you prove that? It's the fact that these slopes are exactly related to quantum-unique-ergodicity types of problems. Let me write just one formula. Call these curves $c_1(e), c_2(e), \dots$ — for fixed e you can order the eigenvalues, and these are just curves depending on e. The formula is that the derivative in e of your $c_k(e)$ is basically $1 - \big(\sum_{\alpha=1}^{2W} u_k(\alpha)^2\big)^{-1}$. It's not completely exact — you need to allow a small perturbation, whose effect is negligible — but the slope is really related to this object. So look at your original matrix: H has an eigenvector $u_k$, the big eigenvector, which we decomposed into blocks, and you may wonder about the mass carried by the first $2W$ coordinates. If this mass happens to be very close to a deterministic constant, then the right-hand side is independent of k, and the slopes are all the same. So imagine you have the strongest form of QUE statement you want: then this mass is nothing but $2W/n$ — it's a constant, and it does not depend on k.

[Question: these sums are just over the components of $w_k$?] Yes, yes — so the question is about the relative mass of $w_k$ compared to $p_k$.

So now you understand why knowing QUE for the band matrix somehow implies universality. But again, I'm just pushing the problem one step further: now I want to prove QUE for my band matrix. This, however, is an easier problem. Imagine that for these mean-field types of matrices you know that the eigenvectors are flat — that QUE is true for each such matrix, that the eigenvectors have well-spread-out mass. Then you can do some patching. Namely: you have your band, and what we just did is to look at the Schur complement related to this box, and we know that the mass of the eigenvector there is what you expect, say. But now you can do it for another box, say this one, and you know that the mass of the eigenvector in this box is also what you expect — because after a Schur complement these are mean-field problems, so you can actually prove QUE for them. And you do it for another box, and so on. So if, in each of these boxes, the corresponding coordinates of your eigenvector — the $\ell^2$ mass here, the $\ell^2$ mass there, and so on — are well spread out, then you can just patch, and it means that the whole vector has the mass you expect in any smaller box. So I think the moral of the story is that the $\ell^2$ mass of eigenvectors is an extensive enough quantity that you can patch.
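Coming back to the slope formula: it follows from first-order perturbation theory applied to $e \mapsto Q_e$. Here is a hedged one-line derivation, using $\partial_e Q_e = -B^*(D-e)^{-2}B$, the identity $p_k = (\lambda_k - D)^{-1}Bw_k$ from above, and the normalization $\|w_k\|^2 + \|p_k\|^2 = 1$:

$$ \partial_e c_k(e)\Big|_{e=\lambda_k} = \frac{\big\langle w_k,\, (\partial_e Q_e)\, w_k\big\rangle}{\|w_k\|^2} = -\frac{\|p_k\|^2}{\|w_k\|^2} = 1 - \frac{1}{\|w_k\|^2}, \qquad \|w_k\|^2 = \sum_{\alpha=1}^{2W}u_k(\alpha)^2. $$

So under QUE, $\|w_k\|^2 \approx 2W/n$ for every k, and every slope is $\approx 1 - n/(2W)$: the lines are parallel.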
And after learning something about eigenvectors, you can deduce universality, thanks to the fact that the lines are parallel. Okay, so that's the idea. How far can you go? So far we cannot go very far — we can only go to this regime, $W \geq cn$ — because there are technical issues everywhere. But let's come back to your question: if we knew the very strongest form of QUE for these eigenvectors — for example, if for any such matrix $Q_e$ the eigenbasis was really very, very close to Haar, say in total variation, with optimal rates and so on — then we could push very far here. And it actually gives heuristics for the $\sqrt{n}$ transition: with optimal QUE, putting all indices together, down to W of order $\sqrt{n}$ you can still show that enough parallelism survives; and in some sense it also gives heuristics for the Poisson regime, because these lines oscillate too much once you go below $\sqrt{n}$. But this is a very, very hard problem. Still, to make progress here, it would be wonderful if someone could find a way to use mixing-type arguments for the dynamics I considered for eigenvectors, so that we get a stronger form of QUE.

All right, so that's the end of this lecture. I just want you to remember that this dynamical approach, which looks a bit artificial in some sense for the types of problems we consider, happens to be not so difficult — you had these steps 0, 1 and 2 which summarize the general idea, though oh, they're not on the board anymore — and there is still a lot of room for improvement in the technical ideas. Okay, so that's it.

[Question: for the dynamics, do you also run the dynamics on the band matrix itself, or not?] So, we never run the dynamics on the band matrix directly. Here is the starting point: we first perform a Schur complement, we obtain matrices of this mean-field type, and on those we run the dynamics, because on those you have a chance to keep the variance constant. That's the idea of mean-field reduction: you first reduce your problem, and then you run the usual types of arguments. If you ran the dynamics on the band matrix directly, there would be two natural choices. If you run it on all entries, then the eigenvalue and eigenvector equations are integrable — we have very nice equations — but theorem 1 will fail, because the variance profile is not constant. If, on the other hand, you only run the dynamics on the band, where there is initially randomness, the dynamics are not integrable anymore: it becomes a very complicated dynamics involving eigenvalues and eigenvectors together, which I have no idea how to analyze.

[Question: in the Schur complement formula, is the fact that A, B and D are independent very important?] Yes, it's completely fundamental, because I view this matrix as a deterministic part plus an independent mean-field part that I'm adding. I mean, maybe you can relax this hypothesis a little, but for the heart of the argument it is important, yes. And remember we are interested in local statistics, right? We are looking at what happens in a small box like this, centered around an intersection, because that is where we want to prove universality; and once the box is centered around an intersection, there is an argument which says there is no singularity in that very small box.

[Question: here you use QUE for these partial eigenstates, but don't you need all of them together, all the
equations?] Oh no — you need QUE for one $u_k$ at a time, but for different boxes: one $u_k$, different boxes. And you don't need all the k's if you want universality: you just need to know that two lines are parallel, so two $u_k$'s are enough. But the point you raise — that the strongest forms of QUE are what's really important here — is absolutely clear.