So the main question of universality is how to get rid of the Gaussian assumption on the matrix elements. You would like a theorem which says: start with a Wigner matrix, no matter what the distribution of the matrix elements is; take the joint distribution of the eigenvalues, take its k-point marginal (the k-point correlation function), rescale it as we did before, and no matter what the original distribution was, you always get the same Dyson sine kernel. That is the famous Wigner-Dyson-Mehta universality conjecture, and it has been solved over the last several years via what we call the three-step strategy, which I will explain in more detail. This was done for both symmetry classes, the real symmetric and the complex Hermitian case. There have been related results. The first result going beyond the Gaussian calculation was due to Johansson, quite early: Johansson handled the Hermitian case — meaning the complex Hermitian case — but with a large Gaussian component. The original calculation of Dyson, Mehta and Gaudin dealt with the purely Gaussian case; Johansson could generalize it to the situation where the matrix elements are not Gaussian but still have a sizable Gaussian component: you can think of the distribution as the convolution of an arbitrary distribution with an independent Gaussian. But this method works only for the complex Hermitian case, because it uses the Harish-Chandra formula. Then there was an independent result by Tao and Vu, obtained by moment matching, the four moment theorem, which in its original version was also restricted to the Hermitian case, because it relied on Johansson's result. And there have been similar developments for many, many other models.
Here I will talk about bulk universality, but there are also results at the edge; there are results on beta ensembles, which are a different type of ensemble — not even necessarily a matrix ensemble. And there are other types of matrix ensembles: for example sample covariance matrices, which is exactly what Wishart invented, so it predates Wigner matrices; sparse matrices, sparse graphs, Erdős-Rényi graphs, regular graphs, and so on. There is a huge zoo of possible models and possible questions; I will focus on the simplest situation. OK, so let me first explain this three-step strategy on one slide, just to give you the idea; we will go into more detail later. The first step of the three-step strategy is the local density law, the local semicircle law, which I already mentioned before. (We say local density law when the density is not the semicircle; we will see examples of that.) The local semicircle law says the following. Take an interval [a, b] somewhere in the spectrum, and ask for the density of eigenvalues in it: take 1/n times the number of eigenvalues in [a, b]. This is a very natural question. You divide by n because in an interval of typical size of order 1 you expect order n eigenvalues, and the Wigner semicircle law in the simplest case says that as n goes to infinity,

(1/n) #{i : λ_i ∈ [a, b]} → ∫_a^b ρ_sc(x) dx,   ρ_sc(x) = (1/2π) √((4 − x²)_+).

So the number of eigenvalues in the interval divided by n converges to the integral of the semicircle density over the corresponding slice — the area of the region above the interval [a, b]. That is a very natural statement.
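As a quick numerical illustration of this statement (a Python sketch of my own, not from the lecture; the matrix size and the interval are illustrative choices), one can sample a Gaussian Wigner matrix and compare the normalized eigenvalue count in an interval with the semicircle integral:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000
A = rng.standard_normal((n, n))
H = (A + A.T) / np.sqrt(2 * n)   # symmetric Wigner matrix, E h_ij^2 = 1/n
ev = np.linalg.eigvalsh(H)       # spectrum concentrates on [-2, 2]

a, b = -0.5, 1.0
frac = np.mean((ev >= a) & (ev <= b))   # (1/n) #{i : lambda_i in [a, b]}

# antiderivative of the semicircle density rho_sc(x) = sqrt(4 - x^2) / (2 pi)
def F(x):
    return (x * np.sqrt(4.0 - x * x) / 2.0 + 2.0 * np.arcsin(x / 2.0)) / (2.0 * np.pi)

target = F(b) - F(a)
print(frac, target)   # the two numbers should agree closely
```

For a fixed interval the agreement is already very good at n = 2000, reflecting the law-of-large-numbers character of the statement discussed next.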
Notice that this is a law-of-large-numbers type result: the left-hand side is still random — the number of eigenvalues in an interval is a random variable and fluctuates a little — but it does not fluctuate too much, as this theorem says, because once you divide by n you get a deterministic number in the limit. So it is a law-of-large-numbers statement. Now, the original Wigner semicircle law holds when a and b are fixed — fixed meaning independent of n — and that is sufficient to conclude that the empirical density converges weakly to the semicircle law. The local law, the local density law down to scales just above 1/n, asks the same question when a and b are not fixed but shrink with n: say b − a is much smaller than 1, for example a power, b − a = n^{−c} for some c > 0. When you shrink the interval, of course, you may see more fluctuations. And the right quantity to look at is not exactly this normalization, because the number of eigenvalues shrinks with the interval; so let me also divide by b − a. As long as a and b are fixed it makes little difference whether I divide or not, but when b − a goes to 0 as n goes to infinity it is better to take the normalization

(1/(n(b − a))) #{i : λ_i ∈ [a, b]} ≈ (1/(b − a)) ∫_a^b ρ_sc(x) dx,

because then the right-hand side still converges to an order-1 quantity — it just converges to the semicircle density at that point — and this is also the right normalization on the left to make that quantity order 1. OK?
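To see this normalization at work numerically, one can shrink the window with n (a sketch with parameters of my own choosing: a window of width n^{−1/2} around E = 0, safely above the critical scale 1/n discussed below):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
A = rng.standard_normal((n, n))
ev = np.linalg.eigvalsh((A + A.T) / np.sqrt(2 * n))   # Wigner spectrum

E = 0.0
width = n ** (-0.5)                    # b - a = n^{-c} with c = 1/2 < 1
count = np.sum(np.abs(ev - E) <= width / 2)

density_est = count / (n * width)      # (1/(n(b-a))) #{lambda_i in [a, b]}
rho_sc = np.sqrt(4.0 - E**2) / (2.0 * np.pi)   # semicircle density at E
print(density_est, rho_sc)             # both of order 1, and close
```

Only a handful of eigenvalues fall into such a window, yet the normalized count is already close to the pointwise semicircle density — a numerical hint of the rigidity behind the local law.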
So we ask whether this limit still holds when b − a shrinks, i.e. is smaller than order 1, and the question is: what is the smallest scale on which you can still hope for such a result — on which the suitably normalized eigenvalue count still has a deterministic limit? You can easily convince yourself that you cannot go too far down: if the interval has size 1/n or smaller, you are at the scale of the individual eigenvalues, and for a tiny interval comparable to the eigenvalue spacing, sometimes an eigenvalue falls into it and sometimes not — the count is sometimes 0 and sometimes 1 — so there is no deterministic limit. So with b − a = n^{−c} you cannot take c equal to 1 or above. The hope is that the statement holds for any c < 1, meaning you can go down to scales just slightly bigger than 1/n, and that is what we call the local law. What exactly "slightly bigger than 1/n" means depends on how carefully you do the work; throughout these talks I will just use an n-power, a scale n^{−1+ε}. There are results which improve this to scales like (log n) to some power over n, and so on, but I am not going to talk about that.

OK, so that is the local law. Let me mention immediately that, to be useful, the local law comes in a stronger form. I presented it as a statement about the eigenvalue distribution, but it actually comes together with the resolvent. For the resolvent I will always use the notation

G(z) = (H − z)^{−1},

where z is a spectral parameter in the upper half plane, which is always denoted by H (the complex upper half plane), and we write z = E + iη with η > 0; we always work in the upper half plane. So this is the resolvent, and the first quantity you may want to know about it is its normalized trace,

(1/n) Tr G(z) = (1/n) Σ_i 1/(λ_i − z),

and especially we would like to understand this for small η. The resolvent is a quantity which gives you a lot of information about the matrix; it is a very commonly used object, not just in random matrix theory but also in operator theory. The critical question when you work with the resolvent is always the imaginary part of the spectral parameter: the smaller the η for which you can control the resolvent, the better resolution you get on the spectrum of the corresponding operator. In particular, if you are after a local law, the scale η on which you can control the resolvent will be essentially the same as the scale on which you can conclude the local law. We will give precise formulas later; here I just want to remark that typically, when we talk about local laws in this setup, you might think we are interested only in the trace of the resolvent, but actually we also need to understand the matrix elements G_ij(z), and the local law is meant in this stronger sense. Of course, if you understand the matrix elements then you understand the trace, but the converse is not at all clear: the trace has an extra averaging over the diagonal elements, and in principle it contains much less information.

OK, so this is step number one, the local density law. It still lives on scales above 1/n, and it is still a law-of-large-numbers type result, so it tells you nothing about local fluctuations; eventually, to understand the sine kernel, you have to go to the exact scale 1/n. Nevertheless, within this three-step strategy we found that even though the local law itself does not immediately answer the question, it serves as a very useful and important a priori bound for the other two steps. Step two is the Dyson Brownian motion step: you do not study the Wigner matrix directly, but add a tiny Gaussian component to it, and study this new matrix — the original Wigner matrix plus a small Gaussian component, somewhat in the spirit of Johansson. Of course this is not the original model, because you added a Gaussian component, and the original philosophy was to avoid any Gaussian assumption; but you do it anyway, with a small Gaussian component, and you use the Dyson Brownian motion — a system of stochastic differential equations — to show that the new matrix, the original matrix plus the tiny Gaussian component, does have the right local statistics. That is step number two, and it is a separate theory. Then there is step number three: you have to go back, because you wanted to prove something about the Wigner matrix with arbitrary distribution, and
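The normalized trace of the resolvent can be compared directly with the Stieltjes transform of the semicircle law, m_sc(z), which solves m² + zm + 1 = 0 with Im m > 0 in the upper half plane. A sketch (my own; the choice z = 0.5 + 0.05i is illustrative, and the expected error is of order 1/(nη)):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1000
A = rng.standard_normal((n, n))
H = (A + A.T) / np.sqrt(2 * n)          # Wigner matrix

z = 0.5 + 0.05j                         # z = E + i eta in the upper half plane
G = np.linalg.inv(H - z * np.eye(n))    # resolvent G(z) = (H - z)^{-1}
m_n = np.trace(G) / n                   # normalized trace (1/n) sum_i 1/(lambda_i - z)

# Stieltjes transform of the semicircle: root of m^2 + z m + 1 = 0 with Im m > 0
roots = np.roots([1.0, z, 1.0])
m_sc = roots[np.argmax(roots.imag)]
print(abs(m_n - m_sc))                  # small, roughly of order 1/(n * eta)
```

Selecting the root with positive imaginary part implements the branch choice mentioned above; here nη = 50, so the trace already tracks m_sc to a couple of digits.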
you could prove something about the Wigner matrix plus a tiny Gaussian component. Good, but that is not the original matrix, so now you have to go back. It turns out you can do this by perturbation theory — it is not completely trivial, but in spirit it is a perturbative argument — by which you remove the tiny Gaussian component. It is a little funny: it sounds like you add the tiny Gaussian component to get universality and then remove it again; nevertheless, this is the way to go. By now, steps two and three are considered fairly standard, because there are very general theorems available which express the power of these two steps. Basically there are theorems you can take off the shelf which say: if certain conditions hold, then the machinery immediately gives you both step two and step three. The input of this machinery is always some version of the local law — some control on the trace of the resolvent, or some control on the matrix elements; usually you need both. For step two you need only the trace of the resolvent, but for step three you also need some information on the individual matrix elements. Steps two and three are nowadays very abstract, very general results which do not depend on the model at all; they depend only on the input, and the input is always some kind of local law. Only the local law is model dependent. So the theory now looks like this: given a random matrix model, under a certain setup, you have to work for step one, the local law — that depends on the model and may require different methods — but once you have the local law, you apply the machinery and you get universality. So I will mostly focus on the local law.

Here are the models of increasing complexity which I am going to discuss in these lectures. I am trying to come up with more and more general models, in line with Wigner's philosophy. Wigner used his matrices to model something much more complicated — we will talk about the physics next time — and he certainly knew that Wigner matrices are just caricatures. To believe more and more in Wigner's thesis, it is better to extend the scope of the models for which you can actually prove universality: it still will not cover Wigner's original motivation, but proving universality for a bigger and bigger class of models justifies more and more that Wigner's vision was correct. This is the spirit in which we look for more general models.

The first model is the Wigner matrix, which I already introduced; it is characterized by i.i.d. entries, and in particular by the variance matrix, which I will denote S_ij = E |h_ij|². In the Wigner case the variance matrix is very simple: every entry is the same constant, 1/n. In this case we are interested in the density — that I already explained — but also in the matrix elements of the resolvent, or more generally in the structure of the resolvent. In the Wigner case the density is the semicircle, and the resolvent G is essentially diagonal: the off-diagonal elements are not exactly zero but very small, the diagonal elements are non-zero, and moreover the diagonal elements are comparable, almost the same. That is the structure of the Wigner matrix and its resolvent.

The next category is the generalized Wigner matrix, which we introduced a few years ago. We keep the independent entries and drop the condition that all variances are equal, but we keep the condition that the sum of the variances in each row is the same for every row, which for convenience you can normalize to 1: Σ_j S_ij = 1 for every i. These deviate from the completely mean-field model where all matrix elements are identically distributed, but the row sums of the variances are still constant. In that case the picture is basically the same and you get no new feature: the density is still the semicircle, the resolvent is essentially diagonal, and the diagonal matrix elements are approximately the same.

The next class is the Wigner-type matrix, which still has independent entries, but now we drop any condition on the variances: the variance matrix S is an arbitrary n × n matrix. In that case the density is not the semicircle anymore; it heavily depends on, and is determined by, the matrix S. — (Question from the audience:) Do you need a lower bound or anything?
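Returning for a moment to the universality claim behind the three-step strategy: a cheap numerical proxy for bulk statistics is the consecutive gap ratio r_i = min(g_i, g_{i+1}) / max(g_i, g_{i+1}), whose bulk average is a distribution-independent number for Wigner matrices (≈ 0.53 in the real symmetric class). The following sketch (my own illustration, not part of the lecture) compares a Gaussian and a ±1 Bernoulli Wigner matrix:

```python
import numpy as np

def bulk_gap_ratio(H):
    """Average consecutive gap ratio over the middle half of the spectrum."""
    ev = np.linalg.eigvalsh(H)
    n = len(ev)
    gaps = np.diff(ev[n // 4 : 3 * n // 4])   # bulk gaps only
    r = np.minimum(gaps[:-1], gaps[1:]) / np.maximum(gaps[:-1], gaps[1:])
    return r.mean()

rng = np.random.default_rng(3)
n = 1000
A = rng.standard_normal((n, n))
H_gauss = (A + A.T) / np.sqrt(2 * n)     # Gaussian (GOE-like) entries
B = rng.choice([-1.0, 1.0], size=(n, n))
H_bern = (B + B.T) / np.sqrt(2 * n)      # non-Gaussian entries, same variances

print(bulk_gap_ratio(H_gauss), bulk_gap_ratio(H_bern))   # both close to ~0.53
```

The gap ratio is insensitive to the slowly varying global density, which is why no unfolding is needed; seeing the same value for both entry distributions is exactly the universality phenomenon discussed above.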
Yeah — of course, in order to have precise results you sometimes need lower bounds, that's right; but for the moment I am just giving you an overview, there are no precise theorems yet. OK, so for the Wigner-type matrix the density is determined by the matrix S of second moments; it is not a semicircle. G is still diagonal, but the diagonal matrix elements are not the same in general, so the matrix starts having a non-trivial structure. The last model is what we call correlated random matrices: we drop the condition of independent entries and allow a non-trivial correlation structure. In that case S_ij is of course not sufficient to describe the second moments; you have to replace it by all possible covariances, so it becomes a four-index tensor describing all possible correlations between matrix elements. In that situation even G is not diagonal anymore: the correlations between matrix elements induce a non-trivial decay of the off-diagonal entries of the resolvent. So these are the four models of increasing generality which we will discuss; there are many other extensions of the original Wigner model which I am not going to discuss — let me just flash them up for a moment. The idea is again to verify Wigner's vision by extending its scope, coming up with more and more general models in various directions.

Now let me give you an overview of the results. There will be three types of results. The first is not about the random matrix itself but about the density of states — about what replaces the semicircle in the general situation, and especially its singularity structure: how does the replacement of the semicircle look when it is not the semicircle anymore? The second is what I already explained, the local law: we want to understand the resolvent G, and especially we look for a deterministic quantity which describes the random resolvent with high precision. In the semicircle case this is just the Stieltjes transform of the semicircle law, which I will define in a minute; but in the more general situation, when the resolvent has a non-trivial structure — its diagonal elements differ, or there are non-trivial off-diagonal elements — we will need an entire deterministic matrix to describe this fluctuating random object, the resolvent. These are the local laws. The third type of result is universality, which I already explained before.

Let me show you a few pictures; those used to the semicircle will now see some non-semicircle laws. The first picture is still the semicircle: that is what you get in the generalized Wigner situation, where the row sums of the variance matrix are constant. The next picture is what we call a Wigner-type matrix, where the row sums of the variances are no longer constant. Here is a variance profile: you see the variance matrix S with color codes indicating the different sizes of the variances — we just came up with some ad hoc profile — and next to it the density of states. It is definitely not the semicircle; it is still symmetric, but otherwise an arbitrary-looking curve. Here are some further pictures of densities of states for Wigner-type matrices, together with their variance matrices. We typically chose S with a 2 × 2 block structure, with different numbers in the blocks; if you change these numbers, the density profile changes. Here we model the situation where the support of the density splits from one interval into three intervals as one of the parameters is varied, and exactly at the moment when the support splits you see a special cusp singularity emerging. In particular, you would like to understand what kind of curves can emerge, and the answer, very roughly, is that these curves are real analytic apart from the singularities — apart from the points where the density goes to zero. This point here, for example, is also a real analytic point; it does not look like it, but it is a very small yet real analytic dip. When the density comes down to the real axis, however, it develops true singularities, and the types are: a square root singularity at the edges — at this external edge and also at this internal edge; they do not look like it, but these are all square root singularities, just like for the semicircle — and exactly at the cusp, a cubic root singularity. There is another feature we investigated: you may ask what triggers the splitting of the support into several intervals, and after some investigation we found that what triggers the splitting is a discontinuity in the S matrix. Here I chose an S matrix with values, say, 0.1, 0.2 and 1, and because there are discontinuities within this S matrix — when you go from this point to that point, the value of the variance jumps — the support can consist of several intervals. If you smooth it out, taking a profile of the S matrix with a little transition layer between 0.2 and 1 — going continuously, even through a tiny layer like this red one — then suddenly you restore the single-interval support, and then the
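The non-semicircle densities in these pictures can be computed from the vector Dyson equation m_i(z) = −1/(z + Σ_j S_ij m_j(z)), with the density recovered as ρ(E) ≈ (1/πn) Σ_i Im m_i(E + iη). A minimal fixed-point solver (my own sketch; the 2 × 2 block profile, the damping and the broadening η are illustrative choices):

```python
import numpy as np

def density(S, E, eta=0.05, iters=2000, damp=0.3):
    """Solve m = -1/(z + S m) by damped fixed-point iteration; return density at E."""
    n = S.shape[0]
    z = E + 1j * eta
    m = 1j * np.ones(n)                     # start in the upper half plane
    for _ in range(iters):
        m = (1 - damp) * m + damp * (-1.0 / (z + S @ m))
    return np.mean(m.imag) / np.pi

n = 200
# flat profile S_ij = 1/n: the equation reduces to the semicircle case
S_flat = np.full((n, n), 1.0 / n)

# 2x2 block variance profile with a jump, ad hoc values as in the pictures
S_block = np.full((n, n), 0.1 / n)
S_block[: n // 2, : n // 2] = 1.0 / n
S_block[n // 2 :, n // 2 :] = 2.0 / n

print(density(S_flat, 0.0))    # ~ 1/pi = 0.318 up to the eta-broadening
print(density(S_block, 0.0))   # a different, non-semicircle value
```

Scanning E over a grid with this solver reproduces curves like those shown: real analytic away from the zeros of the density, with square root behavior at the edges.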
picture is like that. OK, so these are certain qualitative questions we ask. The third type of question I already mentioned, but there are explicit formulas for it. The singularities occur only where the density goes down to 0, and there are two fundamental types: the square root edge singularity and the cusp singularity. We can also describe very precisely the transitional forms — what happens right before a cusp forms and right after. In one case you have a smoothed-out cusp: you do not see it here, but there is no singularity at that point, it is smoothed out. In the other case two square root singularities meet across a very tiny gap. We have explicit formulas for both; they look horrible, but they are universal formulas — essentially universal singularity patterns. And only these two types of singularities, square root and cubic root, can occur, plus their smoothed versions.

OK, so here are the main results about the local laws and universality, still a little informal. The first result concerns Wigner-type matrices: we look at a Hermitian Wigner-type matrix with a general variance profile, and here I already put in the proper assumption mentioned before — we assume that the variances are comparable to 1/n, so there is an upper bound and a matching lower bound. Then the optimal local law holds — optimal in the sense that the local law holds on any scale n^{−c} with c < 1 — and bulk universality holds as well. The more recent result is about the correlated case, so let me formulate that more precisely. By a correlated matrix we mean the following: take a deterministic matrix A, which you can think of as the expectation of H — earlier we did not include any expectation, we always assumed that the Wigner matrix has centered matrix elements, but you do not have to do that — plus a random matrix which now has zero expectation, because the expectation has been written out separately, and I put in the 1/√n scaling explicitly so that the matrix elements of W are of order 1:

H = A + W/√n,   E W = 0.

Now we assume a polynomial decay of correlations for the W's. You should think of W as a random field: the matrix elements w_ij, indexed by their positions in the matrix, form a random field. If you take a domain A within the matrix, you can look at W_A, the collection of matrix elements indexed by the set A, and take any function of it; do the same for any function of another set B; and then ask how strongly the random variables over A and over B are correlated — what is the covariance? The way to describe this is to take functions φ and ψ of these random variables, with some nice properties — smooth functions, for example — and ask how the correlation decays in terms of the distance between A and B. That is how you describe correlation decay, and the condition we need is a polynomial decay in this distance; currently we can do power 12, which is not yet optimal, but it is a polynomial decay. — Yes, this set A and the matrix A are not the same — that's right, very good, thanks. — Yes, the global density: what do typical pictures of the global density look like? Sure, of course, everything is in order; the scaling by 1/√n guarantees that everything is order one, but the picture can be very different. And in that case the local law — I have not formulated it yet, it will come next time — concerns the whole resolvent, a non-trivial object: not just the trace but an entire non-trivial matrix. Then there is another condition, the analog of the lower bound on the S_ij, which we need; and then the optimal local law and bulk universality hold. Optimal local law means — I have not specified this yet — that we can describe how the resolvent G looks: there is a deterministic matrix which the resolvent approximates. Here is a picture, which I basically drew here.
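A toy version of such a correlated entry field (my own construction, not the lecture's model): smooth an i.i.d. Gaussian field with a short moving average, which produces finite-range — hence in particular polynomially decaying — correlations between entries:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 500
G = rng.standard_normal((n, n + 1))
# 2-point moving average along each row: entries at distance 1 are correlated,
# entries at distance >= 2 are independent (finite-range correlations)
W = (G[:, :-1] + G[:, 1:]) / np.sqrt(2.0)   # each entry still has variance 1

c1 = np.mean(W[:, :-1] * W[:, 1:])   # empirical Cov(w_ij, w_i,j+1), expect 1/2
c5 = np.mean(W[:, :-5] * W[:, 5:])   # distance 5: expect ~ 0
print(c1, c5)

H = W / np.sqrt(n)   # a matrix model in the spirit of H = A + W / sqrt(n), here A = 0
```

For a Hermitian model one would additionally symmetrize W; the point here is only that nearby entries carry a genuine covariance while distant ones decorrelate, which is the kind of decay the theorem's assumption quantifies.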