All right, thanks a lot again for the organization. Here is a quick reminder of what we did yesterday. We considered the so-called local law, which I will write in a minute. It is stated for the Green function, evaluated in a domain D, and D can go almost down to the microscopic scale: we allow the imaginary part of z to be as small as N^{-1+ε}. So G(z) = (H - z)^{-1} is my Green function, and the Stieltjes transform is always defined as s(z) = (1/N) Tr G(z). One nice fact about the Stieltjes transform is that its imaginary part, which depends on η, is a way to smooth the empirical spectral measure: you smooth it on a scale η, with a Cauchy kernel, around the energy level E that you consider. For z, my notation will always be z = E + iη. This is why it is a nice object to consider. G satisfies the Schur complement formula that we explained yesterday, and as a corollary of this Schur complement we saw that s is not far from the fixed point of a quadratic self-consistent equation. We saw that m, the solution of this equation, is nothing but the Stieltjes transform of the semicircle law. So if s satisfies the same equation up to a very small error, then by stability-type estimates we get a very good estimate on s. Another thing I am going to use today is that G differentiates very conveniently with respect to the entries of the matrix: if you differentiate with respect to the entry h_ij and look at the entry G_kl, you get ∂G_kl/∂h_ij = -(G_ki G_jl + G_kj G_il). This is something you can check; it is for i ≠ j, and on the diagonal you have similar formulas. So you have a calculus that you can really use conveniently. What the local law tells you (and this is a result of Erdős, Yau, and Yin) is that s is indeed very close to m: uniformly in this domain D, |s(z) - m(z)| ≺ 1/(Nη). Let me remind you that the notation ≺ means dominated by N^ε times this factor, for any positive ε, with overwhelming probability. This is an excellent estimate for the trace. If you look at the individual entries of the resolvent, the estimate deteriorates slightly, but it is still good enough to identify the leading order of the entries: |G_ij(z) - δ_ij m(z)| ≺ sqrt(Im m(z)/(Nη)) + 1/(Nη). In the bulk, you should think of this estimate as dominated by the first term: there, Im m is basically the density of states, which is of order 1, so the bound is 1/sqrt(Nη), which dominates 1/(Nη) in D. So this is the first reminder. We had two corollaries. Corollary 1: the eigenvectors are delocalized. My notation for eigenvectors is always u_1, ..., u_N, and the sup-norm satisfies ||u_k||_∞ ≺ 1/sqrt(N). By the way, I was told that in yesterday's proof I made a small mistake by a factor of N; I will not tell you where it is, so find it, OK? And Corollary 2: the λ_k are very close to the quantiles.
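To make the smoothing picture concrete, here is a small numerical sketch (my own illustration, not from the lecture): for a sampled GOE matrix, Im s(E + iη)/π is exactly the empirical spectral measure convolved with a Cauchy kernel of width η, and down to nearly microscopic η it tracks the semicircle density.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1000
A = rng.standard_normal((N, N)) / np.sqrt(N)
H = (A + A.T) / np.sqrt(2)          # GOE normalized so the spectrum fills [-2, 2]
lam = np.linalg.eigvalsh(H)

def im_stieltjes(E, eta):
    # Im s(E + i*eta) = (1/N) * sum_k eta / ((lam_k - E)^2 + eta^2):
    # the empirical measure smoothed by a Cauchy kernel of width eta
    return np.mean(eta / ((lam - E) ** 2 + eta ** 2))

E = 0.5
rho_sc = np.sqrt(4 - E ** 2) / (2 * np.pi)    # semicircle density at E
for eta in [1.0, 0.1, 10 / N]:                # macroscopic down to nearly microscopic
    print(f"eta={eta:<6} Im s / pi = {im_stieltjes(E, eta) / np.pi:.4f}   rho_sc = {rho_sc:.4f}")
```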
My notation for the quantiles: γ_k is defined so that the integral of the semicircle distribution up to γ_k equals k/N, or let's say (k - 1/2)/N, for symmetry. And we have |λ_k - γ_k| ≺ N^{-2/3} k̂^{-1/3}; remember that k̂ is the distance of the index k to the edge, OK? So this is the first thing I wanted to remind you of. Now we change the topic slightly, going to dynamics: dynamics for universality. This is an idea of Erdős, Schlein, and Yau. It was then developed in a variety of forms from a technical perspective, but the idea of using dynamics for universality goes back to their work. We consider the Dyson Brownian motion. Namely, we start with a matrix H_0, which I choose to be a generalized Wigner matrix, and dH_t = dB_t/sqrt(N) - (1/2) H_t dt. So strictly speaking it is not Brownian motion, but the Ornstein-Uhlenbeck version of it. My notation here: B is a symmetric matrix, each entry above the diagonal follows a standard Brownian motion, and on the diagonal we have a slightly different normalization: B_ii/sqrt(2) is a standard Brownian motion. Remember, this factor sqrt(2) is designed so that you keep invariance under orthogonal conjugation, and this process is also invariant under orthogonal conjugation. All the Brownian motions in the upper triangle are independent. So let me say a little about the scales here, with this sqrt(N). The sqrt(N) is such that after a time of order 1, an entry is like a Gaussian of size 1/sqrt(N), which is exactly the equilibrium measure. And it means that for times much greater than 1, you have already reached equilibrium: you have a GOE, to extremely good accuracy, because the Ornstein-Uhlenbeck process converges to the Gaussian very, very fast. So in particular, for t much greater than 1, the matrix is so close to GOE that the local statistics can be identified. But the question is: can we run this process for a much shorter time so that we have already reached local equilibrium? This is certainly not true for arbitrary statistics of the matrix, but it may be true for the gaps, or for Tracy-Widom. Before going to that, we will first prove that the local statistics of the Wigner matrix at the starting point have not changed up to some time. So I am not going to Dyson Brownian motion in terms of the evolution of eigenvalues right now. I first want to mention the following fact; let's call it a lemma, and fix a small ε. The first part of the lemma, call it 0.1, is that for t up to N^{-1/2-ε}, the local statistics in the bulk have not changed: for any bulk index k, and any nice f, C^∞ with compact support, the expectation E[f(N ρ_sc(λ_k)(λ_{k+1} - λ_k))], everything evaluated at time t (my eigenvalues depend on t now), is the same as at time 0, plus o(1). It is clear that if you wait just a tiny germ of time, things have not really changed; what this tells you is that you can actually go up to this time scale in the bulk. (Recall my scaling: the relevant gap size in the bulk is of order 1/N.) At the edge, part 0.2, you can prove that you can go up to a much longer time: up to time 1 actually, not just N^{-ε}, at least up to time 1. The expectation E[f(N^{2/3}(λ_N(t) - 2))], which is my Tracy-Widom type of scaling, is the same as at time 0, plus o(1).
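Here is a minimal sketch of one Euler step of these matrix dynamics (my code, under the conventions just stated: symmetric increments, diagonal variance doubled):

```python
import numpy as np

def dbm_step(H, dt, rng):
    """One Euler step of dH = dB/sqrt(N) - (1/2) H dt for symmetric H.
    Off-diagonal increments have variance dt; diagonal ones have variance
    2*dt, matching the convention that B_ii / sqrt(2) is a standard BM."""
    N = H.shape[0]
    A = rng.standard_normal((N, N)) * np.sqrt(dt)
    dB = (A + A.T) / np.sqrt(2)     # symmetric Brownian increment
    return H + dB / np.sqrt(N) - 0.5 * H * dt

# Iterated over a total time t >> 1, H becomes close to a GOE sample.
```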
Actually, I wrote time 1 here, but I can even put N^ε; sorry about that. You can put N^ε. It will not make a great difference, but it is something I want to emphasize at some point. [Question: does this hold for every ε, or only for t up to some N^{ε_0}?] There exists a small ε_0 so that this is all correct for every ε smaller than ε_0. [Question: how is this consistent with the mixing of the entries? You just said that beyond time 1 you are at equilibrium.] Yes, that is the whole mystery. If you look at the entries, after such times they are in fact already essentially Gaussian. What I am telling you here is that the edge statistics have nevertheless not changed. So if you believe this, you already have universality at the edge, because you can go up to such times: if you choose t = N^ε, your matrix is so close to the GOE that it is Tracy-Widom at the edge; but on the other hand nothing has changed, so end of proof. This works for Tracy-Widom, not for the bulk. [Question: are you saying that for almost every realization, the edge eigenvalues do not move?] No, it is different: I need to integrate over the initial condition, that is the trick. I need a generalized Wigner matrix at the start; my expectations here are over everything, including H_0. Of course, if you start with something which has no Tracy-Widom at the edge, one given deterministic matrix say, this cannot be true. OK, so let's understand why this is true, and this will give us the proof of edge universality. [Question: you are saying that it does not move, so you have Tracy-Widom; but if you let time evolve, then things change? Even if you start from already Gaussian entries, it is a dynamics.] OK, imagine you start with the GOE: you have Tracy-Widom, and this is just constant in time, because we take expectations; it is a statement about the distribution. OK, so let's sketch the proof. Here is a fact which you can prove as an exercise; maybe it will be part of the exercises on Thursday. (I will comment on whether my assumption is generalized Wigner or Wigner a little later.) Define M to be the supremum over the following: over i ≤ j, over s between time 0 and time t, and over the parameter θ that I will define in a minute, of (1 + |sqrt(N) h_ij(s)|)^3 times the third derivative of f with respect to h_ij, evaluated at the matrix Θ^{(ij)}H_s. Here f, again, is just a nice function, C^∞ with compact support. So you look at the supremum of this quantity. Then E[f(H_t)] = E[f(H_0)] + O(t sqrt(N) M). So what is Θ? You take your big matrix H, and you are going to change only the entry ij; so this Θ actually depends on ij. [But I have no θ here, so it doesn't make sense.] Right: the derivative is evaluated at Θ^{(ij)}H; sorry, there is not much space here. Let me define this matrix Θ^{(ij)}H. Its (k,l) entry is just h_kl if {k,l} ≠ {i,j}, and the (i,j) entry is θ h_ij for some number θ between 0 and 1. So in other words, you first pick ij, and then you shrink only the entry ij, and its symmetric partner, by a factor between 0 and 1.
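Just to fix the definition, here is a two-line sketch of the Θ-deformation (my code; `theta_deform` is a hypothetical helper name):

```python
import numpy as np

def theta_deform(H, i, j, theta):
    """Return Theta^(ij) H: equal to H except that the (i, j) entry,
    and its symmetric partner (j, i), are multiplied by theta in [0, 1]."""
    Hd = H.copy()
    Hd[i, j] *= theta
    if i != j:
        Hd[j, i] *= theta    # keep the matrix symmetric
    return Hd
```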
So this is reminiscent of a Taylor expansion, obviously. Now, how do you prove such a thing? It has nothing to do with random matrix theory. You have this collection of h's, which are your random variables, and you apply the Itô formula to f(H_t). In the Itô formula, you then set the entry h_ij to 0, so that you decouple and the expectations can be split; the cost is given by a Taylor expansion. And from there you just calculate: you make a Taylor expansion up to order 3, and you are done. [Does this hold for any function f? That's right.] So nothing mysterious here. How do you apply it to our statistics? First of all, I need to mention that this fact requires, and this is important, that E[h_ij(s)^2] is constant in the time s. Because I do a Taylor expansion at order 3, and the second-order terms must cancel. And this is where what I wanted to say about Wigner versus generalized Wigner comes in. Remember, my generalized Wigner matrix has an arbitrary variance profile; only the sum of the variances along each row is 1. In particular, the entry variances are not constant. So please cross this out: it is for Wigner, because this lemma applies to Wigner, and the next one will also apply to Wigner. We will see a trick afterwards to go to generalized Wigner. So this is for Wigner matrices. Now, how do you apply this to prove a lemma of this type? Well, let's apply it to part 0.1, say, and just calculate. It is not clear how you access the gaps from the Stieltjes transform: if I tell you I know the Stieltjes transform everywhere, how do you get the gaps? It is quite technical, and I will not enter into it. So what I will prove is not that the gaps are invariant, but that the Stieltjes transform is invariant. It happens to be enough, by some kind of local integration argument: you can access a gap from the Stieltjes transform locally, especially because we have rigidity, so we know where the gaps are going to be. So: apply the lemma to part 0.1, for the Stieltjes transform only. I want to compute the derivative in time of E[(1/N) Tr (H_s - z)^{-1}]. And I am interested in local statistics, so I now want z to be microscopic: z = E + iη with η not in my domain D but slightly smaller, of order N^{-1}, and E in the bulk. For this I just need to control, for example, the derivative of G_kk: my test function f is just G_kk, so I need to bound its third derivative with respect to the entries, and that's it. If we have a good bound on the third derivative of G_kk, we are done, and this is exactly what the local law gives you: some very good stability. So now apply the formula I just erased: if you differentiate G_kk three times with respect to h_ij, you get a sum of terms of the form -G_ka G_bc G_de G_fk. I differentiate three times, so I end up with a product of four resolvent entries, starting at k and finishing at k, and with an overall minus sign (one derivative gives a minus sign, and so do three). And the sum is over the intermediate indices, with (a,b), (c,d), (e,f) each required to coincide with (i,j) or (j,i).
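You can sanity-check the first-order version of this resolvent calculus numerically; here is a small sketch (my own check, perturbing the (i, j) and (j, i) entries together):

```python
import numpy as np

rng = np.random.default_rng(1)
N = 50
A = rng.standard_normal((N, N)) / np.sqrt(N)
H = (A + A.T) / np.sqrt(2)
z = 0.3 + 0.05j
G = np.linalg.inv(H - z * np.eye(N))

i, j, k, l = 2, 7, 11, 23
eps = 1e-7
Hp = H.copy()
Hp[i, j] += eps                  # perturb h_ij ...
Hp[j, i] += eps                  # ... together with its symmetric partner h_ji
Gp = np.linalg.inv(Hp - z * np.eye(N))

fd = (Gp[k, l] - G[k, l]) / eps                      # finite difference
formula = -(G[k, i] * G[j, l] + G[k, j] * G[i, l])   # dG_kl / dh_ij for i != j
print(abs(fd - formula))         # should be tiny (~1e-7 or below)
```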
So you have such a formula for the third derivative. Now, the sum is a finite sum: for i and j given, there is only a bounded number of terms. So if I can bound every single term by a quantity of order 1, my third derivative will be of order 1. What is inside M? Right after the sup, I have sqrt(N) h, but we know this is an order-1 quantity, because h is a random variable of size 1/sqrt(N); same for the cube, order 1. So M will be of order 1, provided I can prove all these resolvent entries are of order 1, or let's say of order N^ε. And if M is of order 1, then because I have the factor t sqrt(N) in the lemma, I end up exactly with part 0.1: I can go up to t = N^{-1/2-ε}. So if every one of these G entries is of order N^ε, we are done. The problem is that the local law gives you G on mesoscopic scales, but here my z is exactly on the microscopic one. So how do you get the order of magnitude of G on a slightly smaller scale from the bigger scale? In general, this is not possible. But there are many ways to see that G satisfies some Lipschitz estimates, even as you approach the real line. These Lipschitz estimates deteriorate, of course, but it does not matter, because you have an estimate down to η = N^{-1+ε}: you may lose a factor N^{10ε} when you go to the smaller scale, but it is just 10 times ε, and by readjusting your ε at the end you are done for any ε. So maybe not order 1, but order N^{10ε}. OK? From a practical point of view, how do you actually prove a good bound on G on the smaller scale? You use the following fact: for η smaller than η̃, say a microscopic scale versus a mesoscopic one, Im s(E + iη) ≤ (η̃/η) Im s(E + iη̃); you just lose a factor η̃/η. This is a deterministic inequality, which you can prove as an exercise; it is a property of the function x²/(x²+1). So you only lose a factor N^{2ε}, say, and then you are done. OK? So is the sketch of the proof somehow clear to everyone? I hope the main ideas are clear. [Is that inequality for the matrix?] Oh, I wrote G; sorry, that is for the trace, my Stieltjes transform. It is only true for the trace, not for the matrix entries. So in the bulk, it is pretty clear what happens. At the edge, it is a bit of a mess to do the whole calculation, and I will not do it here. But what happens at the edge is that because the eigenvalue spacing is bigger, the microscopic scale is also bigger, so you actually gain many more factors, and you can go up to a much, much greater time; it turns out that time 1 is possible. OK? So the proof of edge universality is basically over; let me just write it. Proof of edge universality: you pick your t to be N^ε. You know that E[f(N^{2/3}(λ_N(t) - 2))] is essentially the same as at time 0. But on the other hand, for such a large t, this is the expectation of the same quantity for the GOE, plus some exponentially small term, because by then all the entries are exponentially close to Gaussian. From a technical perspective, to prove this last step you can just say your matrix is a Gaussian matrix plus a tiny perturbation; and for symmetric operators, we have perturbation estimates showing that if the perturbation is really tiny, the spectrum has not moved. So it is perhaps a bit hard to believe that such a thing works.
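Going back to the deterministic inequality used above: it is the monotonicity of η Im s(E + iη) in η. Here is the one-line exercise written out (my write-up):

```latex
\[
\eta\,\operatorname{Im} s(E+i\eta)
 = \frac{1}{N}\sum_k \frac{\eta^2}{(\lambda_k-E)^2+\eta^2},
\]
and each summand is increasing in $\eta$ (it is $x^2/(x^2+1)$ at $x=\eta/|\lambda_k-E|$).
Hence for $\eta\le\tilde\eta$,
\[
\operatorname{Im} s(E+i\eta)\ \le\ \frac{\tilde\eta}{\eta}\,\operatorname{Im} s(E+i\tilde\eta).
\]
```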
But I encourage you to actually do the calculation for part 0.2 based on the lemma. Now, part 0.1 looks much more complicated, but there is first a point I would like to make about part 0.2. So let's keep talking about the edge, but for generalized Wigner now. [Question: are you going to say something about how to bound this term, this product of G's?] Oh, yes. That is what I explained for the trace: if you have a good estimate on a mesoscopic scale, you get a slightly deteriorated estimate on the microscopic one, but you only lose a factor N^{10ε}, which is fine. You do not have such an inequality for all matrix entries, but you can use different types of arguments; you just have Lipschitz bounds, and the fact that it is Lipschitz, with some explicit Lipschitz constant, is essentially enough. [It seems like you are not saying anything.] So let's say something. Take an entry G_de(z). You agree that its derivative in z can be written out: G is (H - z)^{-1}, so differentiating in z gives (H - z)^{-2}, and you can write ∂_z G_de as a sum of products of two G's. So the derivative in z satisfies some bounds, because G satisfies some bounds on the mesoscopic scale. And if you start with the mesoscopic scale, you get a bound on a slightly smaller scale, and so on; you only lose a factor N^{10ε}, say, along this process, because you can start at a mesoscopic scale which is almost microscopic. We are just playing with the N^ε factors here, and because my estimates are so good, it does not matter. [So what is the actual size of these G's?] What you can prove is that for η = 1/N, my microscopic scale, each entry G_de(z) is of order at most N^{10ε}, with overwhelming probability. Remember, on mesoscopic scales I have much better: order 1. When I go to the microscopic one, I do not lose much, only a factor N^{10ε}. And why is it OK not to lose much? Because then all of these products of four entries will be at most N^{100ε}. And this ε is arbitrary, and it does not have to be related to the ε in the time scale: choose one to be a thousandth of the other, and you are done. So I hope it is a bit more convincing. So, what about generalized Wigner? I considered these particular dynamics because they preserve the variances, and when I use the lemma, I need the variances to be constant in time. Remember, my H_0 is Wigner, so the variance is 1/N from the starting point. Now, if I start with generalized Wigner, I have a problem: I cannot apply the lemma. But I can change the dynamics: instead of having my 1/sqrt(N) in front of the Brownian motions, I can choose sqrt(s_ij), where s_ij is the variance of the (i,j) entry; then the dynamics again preserve the variances, and everything goes through. And what would we get at the end of the day? We would get that the edge statistics at time t are the same as at time 0. The problem is that the large-time limit is not GOE at all anymore: it is a Gaussian matrix with entries which do not have the same variance, and we do not know Tracy-Widom for those a priori. So it is a problem.
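On the entrywise Lipschitz argument sketched above: one way to make it concrete (my sketch, an argument consistent with what is said here but not necessarily the one intended) is Cauchy-Schwarz plus the Ward identity:

```latex
\[
\partial_z G_{de}(z) = (G^2)_{de} = \sum_a G_{da}G_{ae},
\qquad
|(G^2)_{de}| \le \Big(\sum_a |G_{da}|^2\Big)^{1/2}\Big(\sum_a |G_{ae}|^2\Big)^{1/2}
= \frac{\sqrt{\operatorname{Im} G_{dd}\,\operatorname{Im} G_{ee}}}{\eta},
\]
using the Ward identity $\sum_a |G_{da}|^2 = (GG^*)_{dd} = \operatorname{Im} G_{dd}/\eta$.
```

So entrywise bounds at a scale η̃ control the z-derivative, hence the entries at a slightly smaller scale, at the cost of powers of η̃/η = N^{O(ε)}.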
So we need one more argument to conclude for generalized Wigner matrices, to get beyond this constraint of having the same variances. OK? So how do you do that? Now you really go to the dynamics, the Dyson Brownian motion, in terms of eigenvalues. So remember that I order my eigenvalues along the dynamics, and you still consider the same matrix dynamics: dH_t = dB_t/sqrt(N) - (1/2) H_t dt. And now I want to see what the induced dynamics on the eigenvalues are. Theorem. Consider the following stochastic differential equation, call it (1): dx_k(t) = dB̃_k(t)/sqrt(N) + [(1/N) Σ_{l≠k} 1/(x_k(t) - x_l(t)) - x_k(t)/2] dt. Is there a solution, and what are its properties? That is what we are going to state here: there exists a unique strong solution, and x as a process, as a vector, is equal in distribution to λ as a process. In other words, the eigenvalues satisfy these dynamics. I will be a bit sketchy. [Question: do you specify the initial condition for x?] Oh yes, if I want this equality in law, I need to match the initial conditions: x(0) = λ(0). Thank you. So here is the sketch of proof. Imagine you can apply the Itô formula in a blind way to your eigenvalues as functions of the matrix. Then you need the classical perturbation formulas, say the Hadamard perturbation formulas, for how eigenvalues and eigenvectors change when you change the entries of a symmetric matrix. If I write ḟ for the derivative of a quantity f with respect to one matrix entry (so this dot depends on i and j), then what is known is λ̇_k = u_k* Ḣ u_k, and for the eigenvectors, u̇_k = Σ_{l≠k} (u_l* Ḣ u_k)/(λ_k - λ_l) u_l. Exercise: how do you prove such things? Well, do it the physicist's way: just write what eigenvalues and eigenvectors mean and differentiate, and you will end up with this. It strongly uses the fact that the matrix is diagonalizable in an orthonormal basis. [Won't there be a term with the derivative of the eigenvectors?] They show up along the derivation: at some point you get a term u̇_k · u_k, but you know the norm of u_k is preserved, so it vanishes. And actually, the derivative of eigenvectors will show up when we consider the Itô formula for eigenvectors, which is something we will see on Friday. Once you have this, you can agree that you can blindly apply Itô, and what you end up with is equation (1), except that this B̃ is not exactly B: it is B conjugated by the eigenbasis. So apply Itô and get equation (1) with B̃_t the matrix defined as the integral from 0 to t of U_s* dB_s U_s, where U_s is the eigenbasis at time s. So you are rolling your Brownian motion along the eigenbasis. But we know from Lévy's characterization of Gaussian processes that this is again a Brownian motion, because U_s is an orthogonal basis. You have to do the calculation to convince yourself, but if you define this, it is a matrix Brownian motion, by Lévy. [Why a matrix, rather than a scalar?] This is a matrix equation here: it is an equality of matrices, and U_s* involves my whole eigenbasis at time s. So you have this equation. Now, is what we have done really legal? Not really, because you can only differentiate, or apply Itô, away from eigenvalue multiplicities.
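Here is a minimal Euler-Maruyama sketch of the eigenvalue SDE (1) (my own illustration: small N, crude step size, just to visualize the repulsion):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 8
x = np.sort(rng.standard_normal(N))     # distinct starting points, ordered
dt = 1e-5
steps = int(0.1 / dt)                   # run until t = 0.1

for _ in range(steps):
    diff = x[:, None] - x[None, :]      # matrix of x_k - x_l
    np.fill_diagonal(diff, np.inf)      # exclude l = k from the interaction sum
    drift = np.sum(1.0 / diff, axis=1) / N - x / 2
    x = x + drift * dt + rng.standard_normal(N) * np.sqrt(dt / N)

print(np.round(x, 3))   # with this small step the paths typically stay ordered
```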
You want to stay in a domain where everything is smooth, so you need something to justify that there will be no collision of eigenvalues. This can be made rigorous as follows: define τ_ε as the first time s at which there exist i ≠ j with |λ_i(s) - λ_j(s)| = ε. The claim is that τ_ε → ∞ as ε → 0. [Question: don't you need distinct eigenvalues at time 0?] Yes, I was going to that; I assume I start without multiplicities. But in fact you do not really need to, because you run your dynamics: why start at time 0? You can start after a small germ of time. Even if you have a discrete matrix at first, a tiny convolution with a Gaussian almost surely already gives distinct eigenvalues. So that is really not a problem; but strictly speaking, Nicolae is completely right, one needs to start with distinct eigenvalues. [Question about dB: is that matrix diagonal or not?] No, it is symmetric but not diagonal. This dB really is a matrix here; that is my evolution of H_t, and it is not diagonal. And my notation B̃_k means the (k,k) entry of B̃. So how do you prove non-collision? Does anyone have a one-sentence argument, not rigorous but convincing, for why these dynamics never collide? [A Bessel process?] OK, yes, you can find a Bessel process behind it. But if you want to check the index of your Bessel process to make sure it will not touch 0, it requires a bit more work. Here is an even shorter argument, if you believe that at least up to the first collision time the SDE describes the eigenvalue evolution of your matrix. Remember the codimension argument from the first class? The set of matrices with a repeated eigenvalue has codimension 2. So if you have a Brownian motion in your space, which has dimension N(N+1)/2, is it going to touch a subspace of codimension 2? No, typically not. So this can be made rigorous, and it is intuitively clear. But on the other hand, we want to go a bit beyond that: I would like to add a parameter β, to go to β-ensembles, and I would like to argue that precisely for β ≥ 1 it is true that there is no collision. What I just said was that the codimension argument can be made rigorous, and that is fine; but let's have an alternative proof. As far as I know, the very first proof of non-collision was given by McKean, as an exercise in one of his books; I forgot which one. So let's rewrite this equation in a slightly different manner, in Langevin form, by identifying the drift as the gradient of a potential, a Lyapunov function. The good way to do this is to write equation (1) as dx_k = dB̃_k/sqrt(N) - ∂_{x_k} φ(x) dt, where φ(x) = -(1/N) Σ_{k<l} log(x_l - x_k) + (1/4) Σ_k x_k². (That is where my -x_k/2 goes: it is the gradient of x_k²/4.) So everyone agrees with this way of writing it. Now let's look at the evolution of φ along the dynamics, and we will find out that it is very close to being a supermartingale; and the mere fact that it is a supermartingale implies that there will be no collision. So, just a calculation: dφ is equal to some martingale term (a local martingale a priori; actually, it really is a martingale),
plus a drift of the following type: it is a constant of order N, something like (N-1)/2 plus a constant, minus (1/4) Σ_k x_k². The key is that the drift has no singularities. Why is this φ nice? Because if you compute the drift, you will have a lot of cancellations, the cancellations being due to the log factor: the log makes terms 1/(x_k - x_l) appear, but when you put two of them together, the singular parts vanish, and you end up with something with no singularities. That is an exercise; I sketch the computation below. So the only worrying term is this Σ x_k². Come on, you do not expect the eigenvalues to get extremely big; it would be weird to have an eigenvalue running off to infinity. So let's just discard this issue for now and say the drift is bounded by a constant. And if it is bounded by a constant, then φ minus a constant times t is a supermartingale. (I do not care about the N-dependence here; I just want to prove a qualitative statement about my process.) To handle the Σ x_k² term, define τ_K as the first time at which there exists a λ_j bigger than K (this K is a big constant); before τ_K, I can bound my Σ λ_k². Then if you look at φ stopped at τ_K and at τ_ε, and you subtract a large positive constant times t, a constant depending on K and N, this is a supermartingale. So what does that mean? The expectation at a later time is smaller than the expectation at time zero: E[φ(t ∧ τ_ε ∧ τ_K)] minus this constant times t is at most E[φ(0)]. Now let's take one step back: what does this mean? You have τ_ε and τ_K. If two eigenvalues get extremely close together, what happens to φ? Well, where is my φ? Here it is: it contains minus the log of the gap, so it gets very big; the smaller ε, the bigger φ at τ_ε. So this would contradict the supermartingale bound. More precisely, it tells you something about the probability that two eigenvalues get ε-close before time t, and letting ε go to 0, you conclude what you want. So from there: conclude, exercise. [Question: do you get a quantitative bound in ε per eigenvalue?] That's true, but it would be a bad one. It would be a bad one, but it is true. So strictly speaking, you conclude that τ_ε ∧ τ_K → ∞ as ε → 0; you still need to deal with this τ_K, that is, to prove that τ_K → ∞ as K → ∞. But this one is completely obvious, because what controls τ_K? The sum of the λ_k², which is the trace of H², which is a sum of order N² squared Brownian-motion-type entries; whatever standard estimate you have on Bessel processes or Brownian motions is enough. So is the idea clear? The key is that you get a good cancellation, so that there is no singularity in the drift; hence you can exhibit a supermartingale, and that contradicts eigenvalues getting very close. So just one comment: what if I put a β here? What if I change my process?
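Since the cancellation is left as an exercise, here is the computation as I understand it (my write-up, under the assumption dx_k = sqrt(2/(βN)) db_k plus the same drift, with standard independent b_k; this matches the conventions above at β = 1, since the diagonal entries of B̃ have quadratic variation 2dt):

```latex
Take $\varphi(x) = -\frac{1}{N}\sum_{k<l}\log(x_l-x_k) + \frac14\sum_k x_k^2$. It\^o's
formula gives
\[
d\varphi = \mathrm{(mart.)} + \Big(-\sum_k(\partial_k\varphi)^2
 + \frac{1}{\beta N}\sum_k \partial^2_{kk}\varphi\Big)\,dt .
\]
With $\partial_k\varphi = \frac{x_k}{2}-\frac1N\sum_{l\ne k}\frac{1}{x_k-x_l}$,
$\partial^2_{kk}\varphi = \frac12+\frac1N\sum_{l\ne k}\frac{1}{(x_k-x_l)^2}$, and the identities
\[
\sum_{k\ne l}\frac{x_k}{x_k-x_l}=\binom{N}{2},\qquad
\sum_k\Big(\sum_{l\ne k}\frac{1}{x_k-x_l}\Big)^2=\sum_{k\ne l}\frac{1}{(x_k-x_l)^2}
\]
(the cross terms over distinct triples $k,l,m$ cancel), the drift of $\varphi$ equals
\[
-\frac14\sum_k x_k^2+\frac{N-1}{2}+\frac{1}{2\beta}
 +\Big(\frac1\beta-1\Big)\frac{1}{N^2}\sum_{k\ne l}\frac{1}{(x_k-x_l)^2}.
\]
At $\beta=1$ the singular sum cancels exactly; for $\beta>1$ it has a negative (good)
sign; for $\beta<1$ it comes with a positive sign and the argument breaks down.
```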
What you will find out about the drift is that it now has a singular term: a factor (1/β - 1), times 1/N², times the sum of the inverse squares of all the distances. So in particular, only for β ≥ 1 does this term have a good sign, so that you can still exhibit a supermartingale. And in fact, for β < 1 it is not just the proof that fails: there really are collisions. But this is harder to prove. So our matrix case, β = 1, where the codimension argument applies, is in some sense critical. We have about half an hour more, I guess. So now, how do you use the Dyson Brownian motion for universality purposes? It is a coupling argument: Dyson Brownian motion for universality. The very first proofs of universality via Dyson Brownian motion, by Erdős, Schlein, and Yau, used hydrodynamic and entropy types of arguments; they were related to Bakry-Émery estimates and a whole field of knowledge which is completely different from what I am going to explain here. This argument by coupling is probably just easier for probabilists, and it gives universality in a strong sense: namely, you do not need to average over the spectrum to have universality. So here is the coupling; let me state the theorem first. Now let H_0 really be a generalized Wigner matrix. Let's talk about the bulk first. For any time t ≥ N^{-1+ε}, the expectation of my gap functional, E[f(N ρ_sc(λ_k(t))(λ_{k+1}(t) - λ_k(t)))], is equal to the same quantity for the GOE, plus a lower-order term. And the analogue at the edge is that you need a longer time to get local relaxation: instead of time 1/N, you need time N^{-1/3}. [Question: we already proved universality at the edge, we have not yet proved universality in the bulk, and you proved that for Wigner matrices you can go up to time about N^{-1/2}; and here you say that 1/N is already sufficient?] Yes, but these are different statements. One of them compares what happens at time t with what happens at time 0; the other one compares what happens at time t with time infinity. What will be important for us is that there is an overlap. And the subtlety here is that I am talking both about generalized Wigner and Wigner. For Wigner matrices we already proved edge universality, and you do not need the dynamics of the eigenvalues for that; the dynamics of the matrix as a matrix are enough. But for generalized Wigner we have not proved edge universality yet, and this will be important. It is something like this: imagine you know the central limit theorem, but only when all your random variables have the same variance. It happens that you can then prove the central limit theorem, by the Lindeberg method for example, even when the variances vary. Why? Because you know the central limit theorem is true for Gaussians with different variances, thanks to nice explicit calculations for Gaussians. But we do not know that for random matrices: there is only one model for which we know how to calculate things, the GOE, and that gives us Tracy-Widom for Wigner, not for generalized Wigner. So actually, we will now prove it for generalized Wigner. And it is clear that if you accept this theorem here, you already have universality for Wigner.
Now, for generalized Wigner it is pretty easy, because for generalized Wigner all entries have variance at least a small constant times 1/N. So you can always realize such a matrix as another generalized Wigner matrix plus an independent Dyson Brownian motion evolution; I will write this a little later. Most importantly, I want to give you the idea of the proof, by coupling. Proof idea. Here is how you proceed. We know that λ(0), the eigenvalue vector at time 0, comes from a Wigner or generalized Wigner matrix. Let's introduce another initial condition, call it y(0), coming from a GOE: assume you start with the eigenvalues of an independent GOE. And now you run the Dyson Brownian motion dynamics with these two different initial conditions, but with the same Brownian motions. We are able to couple them in this way because we proved that there is a strong solution. So what do you get? Let me just write it: dλ_k = dB̃_k(t)/sqrt(N) + [(1/N) Σ_{l≠k} 1/(λ_k - λ_l) - λ_k/2] dt, and I run the same dynamics for y (all of these depend on t). So I can just subtract, my Brownian motions drop out, and I end up with a classical differential-equation type of identity for the difference. If I define δ_k(t) = e^{t/2}(y_k(t) - λ_k(t)), I need this exponential e^{t/2}, then what you get is just ∂_t δ_k(t) = (1/N) Σ_{l≠k} (δ_l(t) - δ_k(t)) / ((λ_k(t) - λ_l(t))(y_k(t) - y_l(t))). Notice that I put the exponential there precisely to get rid of the -1/2 term in the equation I obtain. So it is a nice parabolic equation, which is non-local but smoothing. In particular, what this means is that for t = infinity, all these δ_k are going to be equal. So let's just make the reasoning for t = infinity first. All δ_k equal means in particular that δ_{k+1} = δ_k; and for t large, you have δ_{k+1} = δ_k plus a very small error, thanks to the smoothing. But if δ_{k+1} = δ_k, look at the difference: if you reorganize the terms, it means that the gap y_{k+1} - y_k for one ensemble equals the gap λ_{k+1} - λ_k for the other. So now it will all be about understanding the time scales; but it is clear that this coupling argument, with this smoothing effect, is one tool for universality. [Question: in the coupling, do you use the same realization of the Brownian motions? Yes.] So you can make a drawing, you can make a simulation, and you will find out that the trajectories keep oscillating, of course, in the long run, but they will oscillate together after some time. So let's understand the time scales. There are different levels of rigor here that I can use. First, let's assume that in this equation I only keep the nearest neighbors in my sum over l ≠ k, and that the λ_k stick to their typical locations, just to understand the scales. In the bulk, the nearest-neighbor gap has size 1/N, so each denominator is of order 1/N², and with the 1/N prefactor each coefficient is of order N: I have N times a discrete Laplacian. So if we just keep l = k ± 1, and we assume that λ_k(t) is always equal to γ_k, which is obviously completely wrong, but by rigidity we know it is not too far, then my equation, let's call it (3), looks like N times a discrete Laplacian applied to δ.
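A toy version of this heuristic (my own illustration, nearest-neighbor only, periodic boundary, hypothetical parameters): for ∂_t δ = N Δδ, the neighboring differences of δ, which stand for the gap mismatch, die out on a time scale of order 1/N.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 400
delta = rng.standard_normal(N)      # initial mismatch profile delta_k(0)
dt = 0.1 / N                        # explicit Euler; stable: 4*N*dt = 0.4 < 2

def max_gap_mismatch(d):
    # the observable we care about: neighboring differences of delta
    return np.max(np.abs(np.diff(d)))

print(f"t = 0:      max |delta_(k+1) - delta_k| = {max_gap_mismatch(delta):.3f}")
for step in range(1, 201):          # integrate up to t = 20/N
    lap = np.roll(delta, -1) - 2 * delta + np.roll(delta, 1)
    delta = delta + dt * N * lap    # d(delta)/dt = N * (discrete Laplacian)
    if step % 100 == 0:
        print(f"t = {step * dt:.3f}:  max |delta_(k+1) - delta_k| = {max_gap_mismatch(delta):.3f}")
```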
This is the approximation in the bulk. And look at the theorem: for t greater than 1/N, the dynamics relax. This cannot be true for an arbitrary initial condition: if you have a big gap in your spectrum, it will just be wrong, obviously. But if you have a sufficiently spread-out spectrum initially, it will be true. So what you end up with is that the derivative in time of δ_k(t) is something like N times the discrete Laplacian applied to δ. Now, how fast does this relax? If you do not have the factor N, then no matter what initial condition you have, you need to wait a very long time for things to get flat. But if you speed up the smoothing by a factor N, you need to wait a time 1/N, which explains the scale in the theorem; nothing more. Now let's talk about the edge. Assume again λ_k = γ_k, but at the edge: the edge gap is of order N^{-2/3}, so the denominator is of order N^{-4/3}, the coefficient is N^{4/3} divided by N, that is N^{1/3}, and this really means that the relaxation time is N^{-1/3}, which is what I wrote in the theorem (I erased it; that is very unfortunate). So let me conclude first for the bulk: this suggests complete smoothing after t much greater than N^{-1}; and for the edge, after t much greater than N^{-1/3}. So now, what do we need to do to make this rigorous? Well, a lot. The problem is that there is, of course, a general theory of parabolic equations, even in the non-local setting; but having very singular coefficients like these, where the points, even though they will not cross, may get extremely close, is not really covered by most of the theory. The funny fact is that when they get very close, the smoothing should actually be faster; however, the general techniques in the literature do not really allow the coefficients to blow up. So initially Erdős and Yau derived a method to deal with these equations which was inspired by the De Giorgi-Nash-Moser theory of parabolic equations, and it was quite challenging. What we will see is that there is an easier argument to make this rigorous, with a maximum principle. So here I gave you the first step; let's try to go to another level of rigor. Let's now keep the non-locality, but still assume that we stick to the typical locations; and for simplicity, let's say that γ_k is just k/N, a picket-fence type of spectrum. Then the equation becomes ∂_t δ_k(t) = (1/N) Σ_{l≠k} (δ_l(t) - δ_k(t)) / ((k - l)/N)². So let's try to look at the continuous-space analogue: ∂_t f_t(x) = ∫ (f_t(y) - f_t(x))/(y - x)² dy, let's say on R; we will talk about the constant factors a bit later, but this is just a prototype for my continuous-space analogue. So what is the fundamental solution of this equation? It is a translation-invariant equation, so in Fourier space you should actually be able to solve it: you start with a Dirac at a point E and you propagate, but in a non-local way. In particular it propagates pretty fast: you get non-negligible mass far away pretty fast. So what is a prototype of a distribution like this? It is a Cauchy law. The fundamental solution is Cauchy; the kernel is a Cauchy-type distribution, and the smaller t, the closer it is to a Dirac.
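The Fourier computation behind this (my write-up of the constants):

```latex
For $\partial_t f_t(x) = \int_{\mathbb{R}} \frac{f_t(y)-f_t(x)}{(y-x)^2}\,dy$, test a
plane wave $f(x) = e^{i\xi x}$:
\[
\int_{\mathbb{R}} \frac{e^{i\xi u}-1}{u^2}\,du
 = -\int_{\mathbb{R}} \frac{1-\cos(\xi u)}{u^2}\,du = -\pi|\xi|,
\]
so $\widehat{f_t}(\xi) = e^{-\pi|\xi| t}\,\widehat{f_0}(\xi)$, and the semigroup is
convolution with the Cauchy (Poisson) kernel
\[
p_t(x) = \frac{1}{\pi}\,\frac{\pi t}{x^2+(\pi t)^2},
\]
which tends to a Dirac mass as $t\to 0$ and, unlike the heat kernel, carries
non-negligible mass far away (the tails decay only like $t/x^2$).
```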
So why do I mention this? Because we will study eigenvectors on Friday, and it happens that the same equation will occur there; the fact that it propagates as fast as a Cauchy law will have some impact on our understanding of the eigenvectors of random matrices. Now I think it is a good time to stop. What I will do next time is give one rigorous proof of the relaxation theorem: not in the bulk, the bulk is just too hard, but at the edge. At the edge there is an easy argument to prove that after a time greater than N^{-1/3}, you really reach equilibrium. After that, we will go to eigenvectors and look at why they are an important object to understand if one wants to go beyond mean-field models. That's it. Any questions? [Question: what do you mean, you will give a rigorous proof? It seems that you were able to do it here.] So here I gave heuristics, right? I assumed that my λ_k sticks to γ_k. What is hard to make rigorous is the assertion that I can approximate my equation by another equation where I change the coefficients. This is a homogenization type of statement: you have an equation with random coefficients, and you want to approximate it by one with deterministic coefficients. There is a whole theory for this, but most of the time the coefficients are bounded and you have some ellipticity; here they may just blow up, and that is hard to control. So we need to find something rigorous, and there is one simple observable we can construct which, at the edge, will make it very, very easy; in the bulk, it is just harder. [Question: maybe a stupid question, but what do you mean by "there is no singularity" in your argument?] I mean, in the drift; this is for the proof that the eigenvalues will not collide. In the drift, I had a function of the λ_k which is smooth in the λ_k. Remember, when I told you that if β is different from 1, you have this extra factor, (1/β - 1) times the sum of the inverse squares of the distances between the λ_k, which is singular; it just makes it harder to control.