And by summing, you get that the whole difference is o(1), and that gives you your universality. So you just need to understand how the Stieltjes transform changes if you make this one tiny perturbation: you change one entry of your matrix. How does this affect the Stieltjes transform? And this is, again, something that you can do with Taylor expansion. The reason why I focus on the Stieltjes transform is that the Stieltjes transform has a particularly simple Taylor expansion.

The Stieltjes transform of a matrix, by construction, is 1/n times the trace of the resolvent of the normalized matrix (1/√n) M_n. And the resolvent of a matrix M is defined as (M − z)^{-1}; M is always going to be Hermitian, and z will always have positive imaginary part, so this inverse is always well defined. So the Stieltjes transform is the trace of a resolvent. Trace is linear, and the resolvent has a very good Taylor expansion because of the resolvent identity. It's a little algebra identity; let me just describe what it is. If you perturb a matrix M by a perturbation A, you can always write the perturbed resolvent as the original resolvent, minus the original resolvent times A times the resolvent of M + A. This is basically just a basic identity which is very easy to check, and it doesn't even need anything to be commutative — it works even for non-commuting elements. It's just an algebraic identity that you can check; it's true whenever M and A are Hermitian and z has positive imaginary part. So this expresses the perturbed resolvent in terms of the original resolvent, times the perturbation, times the perturbed resolvent.

But what you can do is iterate this identity: for the perturbed resolvent you already have a formula, so you can iterate and end up with a Neumann series, a perturbative expansion. The resolvent of the perturbed matrix is the original resolvent, minus resolvent times A times resolvent, plus resolvent times A times resolvent times A times resolvent, and so on and so forth. So there's an expansion, which generalizes the geometric series. And then of course you can take traces, and now you can express the Stieltjes transform of the perturbed matrix as the Stieltjes transform of the original matrix plus a bunch of other expressions that involve resolvents and perturbations.

So what do we have? We have the Stieltjes transform of a base matrix plus a small perturbation, and the smallness is coming from the 1/√n: you're taking this normalized matrix, and you're making a very small, rank-two perturbation. So for this term you can just use the Neumann expansion, and you can write it, first of all, as the Stieltjes transform of the base matrix, minus the next term, which is something like 1/n times the trace of resolvent times perturbation times resolvent, and then there's another term — normalized trace of resolvent, perturbation, resolvent, perturbation, resolvent — and so on and so forth. And crucially, there's also a factor of ξ_ij in the first correction, then ξ_ij squared in the next one, and so on and so forth.

All right, so you have this expansion. Basically the main term is supposed to be of size one; the next term will be of size 1/√n because of that 1/√n normalization.
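For reference, here is the resolvent identity and its iteration written out explicitly (the notation R, s and the normalization are chosen here for illustration; the lecture's own notes may use slightly different conventions). For Hermitian M, A and Im z > 0,
\[
(M + A - z)^{-1} \;=\; (M - z)^{-1} \;-\; (M - z)^{-1}\, A \,(M + A - z)^{-1},
\]
and iterating gives the Neumann-type series
\[
(M + A - z)^{-1} \;=\; (M - z)^{-1} - (M - z)^{-1} A (M - z)^{-1} + (M - z)^{-1} A (M - z)^{-1} A (M - z)^{-1} - \cdots.
\]
Taking normalized traces, with s_M(z) = (1/n) tr (M − z)^{-1},
\[
s_{M+A}(z) \;=\; s_M(z) \;-\; \tfrac{1}{n}\operatorname{tr}\!\big((M - z)^{-1} A (M - z)^{-1}\big) \;+\; \tfrac{1}{n}\operatorname{tr}\!\big((M - z)^{-1} A (M - z)^{-1} A (M - z)^{-1}\big) \;-\; \cdots.
\]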
And, okay, so let me just assert that this first expression is going to typically be bounded, or at least of size n^ε. The next expression will turn out to be of size n^{-1/2+ε}, because of this 1/√n factor; the term after that will be of size n^{-1+ε}, and so on and so forth. Now, to actually prove those bounds, when you actually multiply resolvents times elementary matrices and so forth and take traces, you will quickly find yourself wanting to compute individual entries of the resolvent matrix. And the reason why these bounds are true is eigenvector delocalization. This is something that was talked about a lot in earlier lectures, so let me not say too much about it, but the resolvent can be split up, using the spectral decomposition, in terms of eigenvectors, and the eigenvectors turn out to be very spread out in space — all of their components are pretty small. And this is where the technical assumptions, such as the entries being sub-Gaussian, actually come in: to verify this delocalization. But you can compute — and it's done in the notes — the entries of this resolvent using eigenvector delocalization, and you end up with these bounds.

Okay, so every term in the series is a factor of 1/√n better than the previous term, so this is actually a very rapidly converging series and you can truncate at any point. For example, you can take the first three terms and truncate, much as we do in the central limit theorem case.

Okay, and now the same miracle that works in the central limit theorem works here too. This funny matrix — the base matrix with the (i,j) entry zeroed out — is complicated, but it is independent of ξ_ij. It only depends on all the other entries, not on the entry that we're perturbing. So once again, when you take the expected Stieltjes transform, you can factor the ξ_ij out of the expectation, because of independence; in this term it comes out as a first moment, in the next term as a second moment, and so forth. And so once again, if you swap the ξ_ij with the ξ'_ij, because the first two moments match, these expressions — they're messy, but we don't care what they are — don't change. So when you look at the difference, they all cancel each other out, and all that's left is the error term.

Oh, I should have said that this resolvent bound is only true... okay, this is true, but there's a better bound involving factors of η. So the bound is a little bit more complicated than this: it involves η, and the larger η is, the better the bound gets. It's written up in the notes; I'm not going to recall exactly what it is right now. And this error term also has some extra factor involving η.

Okay, so when you swap each entry, if you do all the analysis, the difference ends up being something like n^{-3/2}, up to epsilon losses, with an extra factor involving 1/η coming in when you carefully try to bound these things properly. And as long as η is big enough — in this case bigger than n^{-1/2} — this will be less than 1/n². So it's slightly more complicated, but it's really the same method as was used in the central limit theorem.
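As a rough numerical sanity check (not from the lecture — the distributions, matrix size and spectral parameter below are illustrative choices), one can see that replacing a single entry by a variable with matching first two moments barely moves the Stieltjes transform:

import numpy as np

rng = np.random.default_rng(0)
n = 500
z = 2.0 + 0.1j  # spectral parameter with positive imaginary part

def stieltjes(M, z):
    """Normalized trace of the resolvent (M - z)^{-1}."""
    return np.trace(np.linalg.inv(M - z * np.eye(len(M)))) / len(M)

# Base symmetric matrix with Gaussian entries, normalized by sqrt(n).
A = rng.standard_normal((n, n))
M = (A + A.T) / np.sqrt(2)      # off-diagonal entries have variance 1
W = M / np.sqrt(n)

# Swap the (i, j) and (j, i) entries for a Bernoulli +-1 variable,
# which has the same first two moments as the Gaussian entry.
i, j = 3, 7
xi = rng.choice([-1.0, 1.0])
W_swapped = W.copy()
W_swapped[i, j] = xi / np.sqrt(n)
W_swapped[j, i] = xi / np.sqrt(n)

s0 = stieltjes(W, z)
s1 = stieltjes(W_swapped, z)
print("s_W(z)       =", s0)
print("s_W'(z)      =", s1)
print("|difference| =", abs(s0 - s1))

The printed difference is tiny compared to the transforms themselves, consistent with the per-swap bounds just discussed.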
And as long as you have two matching moments, you can get η down to about n^{-1/2}. If you have more matching moments, you can take the expansion further and you can take η closer to the real axis, and so you get closer and closer to the microscopic scale. In particular, as I said, if you take four matching moments, you get universality even a little bit below the microscopic scale.

Okay, so basically the summary is that the expected Stieltjes transform at any given point obeys a four-moment theorem, even on microscopic scales, down to the scale 1/n, the scale of the eigenvalue spacing. Actually, at the edge of the spectrum I think it's a little bit better, because at the edge the eigenvalues are spaced further apart; so things are slightly better at the edge, but let me just talk about the bulk. So what we've just proven is that this statistic has a four-moment theorem, which means that if you change the matrix, as long as you keep the first four moments the same, the limiting statistic doesn't change.

Okay, so it turns out that this argument is quite general. Here I just took a simple expectation of the Stieltjes transform. You can also take correlations: you can take the Stieltjes transform at two different spectral parameters, z_1 and z_2, and take a joint moment of that, and you can take higher moments, and you have similar theorems. And because of that, you can use some abstract nonsense and also show that the k-point correlation functions obey a four-moment theorem. I won't quite define what these things are, but every random matrix has a spectrum, the spectrum is a random point process, and the point process comes with certain correlation functions, which you have to rescale in a certain way; I discussed this a little bit in the previous lectures. But if you rescale things properly, the correlation functions also obey a four-moment theorem, up to the microscopic scale, in the weak topology — you have to integrate the correlation functions against some test function. And also the individual eigenvalues: if you take individual eigenvalues of your random matrix and you take some nice test function of a finite number of these eigenvalues, there's a similar argument that shows that any nice function of, say, a bounded number of eigenvalues of your random matrix also obeys a four-moment theorem, up to a microscopic scale.

Okay, so because of this, you can already get a lot of universality results. So let's draw a picture. This set here is supposed to be the set of real Wigner ensembles — let's say real Wigner ensembles just for the sake of discussion. So, for example, GOE would be one point; there's also the Bernoulli ensemble, and many other ensembles like this. Every time you pick some random variable for your entries, you get a Wigner ensemble. So it's a reasonably big class of ensembles, and we want to prove universality results: given any two ensembles in this space, they have the same statistics asymptotically. Which is great, because with GOE you can actually compute everything — we know pretty much every statistic of GOE, at least in principle. And so once you have universality, and you can compute things for GOE, then you can compute things for everybody else as well. So what these four-moment theorems do is that they sort of foliate this space.
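Schematically (this is a paraphrase of the summary above, not a precise statement — the exact hypotheses and exponents are in the notes): say the entries of the two ensembles match to fourth order,
\[
\mathbf{E}\,\xi_{ij}^m \;=\; \mathbf{E}\,(\xi'_{ij})^m \qquad \text{for } m = 1,2,3,4 \text{ and all } i,j.
\]
Then the expected Stieltjes transforms of the two normalized matrices agree up to a small polynomial error,
\[
\mathbf{E}\, s_{M_n/\sqrt{n}}(E + i\eta) \;-\; \mathbf{E}\, s_{M'_n/\sqrt{n}}(E + i\eta) \;=\; O(n^{-c}),
\]
for some small c > 0, with η allowed to go down to the microscopic scale 1/n (and, as mentioned, even slightly below it).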
So, I'm drawing these sort of contour lines, these level sets. These are lines of ensembles with four matching moments: any two ensembles on the same curve — it's really more like a hypersurface — have the same four matching moments. The first two moments are already fixed, so, assuming IID entries, there are really only two numbers remaining. So this is a two-parameter family of, sort of, codimension-two surfaces: for every choice of third moment and fourth moment you get some surface, and these two-parameter families of surfaces fill out this big, sort of infinite-dimensional space.

So what these four-moment theorems do is that they give you universality, but only along this direction, only along this foliation. So, for example, any matrix ensemble which has the same four moments as GOE will have the same spectral statistics, asymptotically, as GOE. Okay, so that's what you can get purely from the four-moment theorem, from this Lindeberg strategy. Unfortunately, for example, it doesn't give you Bernoulli, because Bernoulli and GOE only match up to third order — the fourth moment of a ±1 entry is 1, whereas a standard Gaussian entry has fourth moment 3 — so they have different fourth moments. So the four-moment theorem does not directly tell you that Bernoulli matrices and GOE have the same statistics.

But okay, so in order to get more complete universality results, you need to combine the four-moment theorem with other universality methods. So what people have been doing, independently of this, is that they have found other classes of random matrices which are known to be universal. The first classes of this type were the Gauss-divisible matrices. These are matrices where the entries are not Gaussian, but they're the sum of a Gaussian and something independent of the Gaussian — so they have a Gaussian component. So these are matrices which split up as some multiple of a GOE matrix plus a multiple of some other Wigner matrix, with a mixing parameter t between zero and one.

So there was initial work — okay, if you replace GOE with GUE (let me blur the distinction between GOE and GUE; GUE is a little bit simpler) — there was this early work of Johansson, who was able to analyze the spectrum of Gauss-divisible matrices in the GUE setting, and he was able to show, at least at the level of correlation functions, that there was universality in this class: everybody in this class had the same asymptotic statistics as GUE. Now, the thing about Gauss-divisible matrices is that these matrices do not have to have the same fourth moment or third moment as your GUE matrix. So this class is somewhat transverse to the foliation that the four-moment theorem gives you. But you can combine this result with the four-moment theorem, and so you can combine the two universality results to get a better one. Any matrix which is not Gauss-divisible, but matches four moments with a matrix which is Gauss-divisible, will also have the same asymptotic statistics as GUE, because you use the four-moment theorem to get from your original matrix to a Gauss-divisible matrix, and then you use Johansson's result to get from the Gauss-divisible matrix to GUE. So, because Johansson's result is in some sense transverse to the four-moment theorem, you can combine the two to get a better result. And this works reasonably well.
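As a concrete illustration of Gauss-divisibility (a minimal sketch; the mixing convention M_t = √(1−t)·W₀ + √t·G and the parameter values below are assumptions made for illustration, not the lecture's exact normalization):

import numpy as np

rng = np.random.default_rng(1)

def goe_like(n, rng):
    """Symmetric Gaussian matrix with unit-variance off-diagonal entries."""
    A = rng.standard_normal((n, n))
    return (A + A.T) / np.sqrt(2)

def bernoulli_wigner(n, rng):
    """Symmetric matrix with independent +-1 entries (the Bernoulli ensemble)."""
    A = rng.choice([-1.0, 1.0], size=(n, n))
    return np.triu(A) + np.triu(A, 1).T

def gauss_divisible(n, t, rng):
    """Mix a non-Gaussian Wigner matrix with an independent Gaussian component of weight t."""
    W0 = bernoulli_wigner(n, rng)
    G = goe_like(n, rng)
    return np.sqrt(1.0 - t) * W0 + np.sqrt(t) * G

# Example: an ensemble with only a small Gaussian component, normalized by sqrt(n).
n = 400
M = gauss_divisible(n, 0.05, rng) / np.sqrt(n)

Johansson-type results apply to ensembles of this shape with t fixed; the later heat-flow results discussed next allow t to be taken much smaller.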
Unfortunately, it still doesn't cover everything: it turns out that the Bernoulli ensemble does not match four moments with any Gauss-divisible ensemble. So this actually covers almost everything except the Bernoulli ensemble, actually. But over time, people have expanded this, and they have gotten better results than Johansson's. So, for example, using the local relaxation flow methods of Erdős, Schlein, and Yau, you can extend the universality class to matrices that are not just Gauss-divisible with a fixed t. So, thanks also to later work of Erdős and Yau — and there are actually many other authors involved here, and it depends on exactly which statistic you are measuring: there's the Stieltjes transform, there are correlation functions, there are also energy-averaged correlation functions, and there's a technical distinction there — but, to oversimplify, we now have universality for Gauss-divisible ensembles where you only need a very tiny fraction, a very small amount, of the Gaussian part. So t can be as small as something like n^{-1+ε}. Just a little bit of Gaussian-ness in your ensemble is actually good enough to get universality, because of the rapid mixing properties of Dyson Brownian motion, basically. Okay, you can see that covered in other lectures. So that expands the class of Gauss-divisible matrices.

Still, it doesn't quite cover Bernoulli matrices — Bernoulli matrices don't have any Gaussian component whatsoever — but now it gets so close. It gets so close to Bernoulli that you can now find matrices in this class that almost match all four moments with Bernoulli. They don't completely match all four moments, but they get so close that you can use a slight perturbation of the four-moment theorem to get from Bernoulli to something in this much larger class. And so, by combining — this is what's called the comparison, or perturbative, step in the Erdős–Yau three-step strategy — nowadays you only need a little bit of the four-moment theorem, because all these heat flow methods do most of the heavy lifting, to get universality for at least certain statistics, like, say, energy-averaged correlation functions.

But there are other settings in which the best universality results still rely mostly on the four-moment theorem. For example, if you don't work with Hermitian matrices, but you work with non-Hermitian matrices — IID matrices like we did in the earlier lectures — and you're interested in local statistics of eigenvalues of IID matrices, so not just the circular law, which gives you the bulk distribution, but microscopic-scale laws: we don't currently have an analogue of this sort of Gauss-divisible theory for the non-Hermitian case, basically because the analogue of Dyson Brownian motion is much worse behaved in the non-Hermitian case — it couples the eigenvalues with the eigenvectors, and we don't understand it very well. So we don't have this transverse class. And so, currently, for IID matrices, the best local universality results still come purely from the four-moment theorem: you need to match four moments with an existing class, like Gaussian matrices, in order to understand the asymptotics. But maybe in the future that will be combinable with some other transverse universality result to get a larger class of universality.
Also, if you want to understand eigenvalue gaps — gaps between adjacent eigenvalues — then this theory of Gauss-divisible matrices and so forth works very well, and we have universality for the entire class. But if you're interested just in the distribution of a single eigenvalue: a single eigenvalue is believed to obey a central limit theorem. There's eigenvalue rigidity — it hovers around its classical value — but it should do so in a Gaussian way. And that's only known to be true for Gaussian matrices, and, because of the four-moment theorem, it's also known to be true for things that match four moments with the Gaussian ensemble. But unfortunately, universality beyond that is actually believed to be false: once you move away, once you change the fourth moment, what should happen is that the distribution should still be Gaussian, but the mean should shift. There should be some noticeable influence of the fourth moment on the distribution of a single eigenvalue. You don't see it with gaps, because the shift cancels itself out, and you also don't see it with the correlation functions, because there's also an averaging effect. But for individual eigenvalues, the fourth moment actually matters. So sometimes, in some cases, matching four moments is actually the correct universality class; but in other cases, there are some transverse universalities that you can use to go further. Okay, I'm over time, so thank you very much.

So is it possible to use the exchange strategy by somehow weakening the second-moment matching, as long as the sum of the second moments matches? Possibly, okay. So there was a variant — I didn't mention this for lack of time — of the exchange strategy where you don't exchange the entries one by one, sequentially, but you do so randomly. What you do is you put sort of a Poisson clock on each entry: there's a continuous time running from zero to one, and at a random time each entry just flips — but not in any particular order. This is a variant introduced by Knowles and Yin, and it does have some slightly better properties than the classical matching, in that it does introduce a certain averaging across all the entries: rather than doing a single entry at a time, you do them all at once. And so, in this context of random matrices, it means that you don't necessarily need to understand each individual entry of a Green's function, but just some average or trace of the Green's function, and that often has better estimates. So yeah, it could be — maybe this is already done in some of the papers. Perhaps in some of the papers involving generalized Wigner matrices you could take advantage of that, and you could maybe, in particular, allow the variances to be traded among the entries. Yeah, that's certainly one possible variant.

[Follow-up question, partly inaudible, about where the extra exponent — the one-twelfth-type saving — comes from.] No, it just comes from having enough moments. So yeah, if you have five matching moments, you get even smaller, and so forth. That's at least for some statistics; for the simplest statistic I mentioned, the Stieltjes transform, it just comes out of the calculations. Because when you do four moments, the error term is of size n^{-5/2}, and there's like a 1/√n amount of room, because n^{-5/2} is sort of √n better than what you need, which is 1/n².
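The bookkeeping behind that "room to spare" can be written out roughly as follows (a heuristic count, ignoring the η-dependent factors and ε losses discussed earlier): with k matching moments, the first uncancelled term in the expansion has size about n^{-(k+1)/2} per swap, and there are about n² entries to swap, so
\[
\text{total error} \;\approx\; n^2 \cdot n^{-(k+1)/2} \;=\; n^{(3-k)/2},
\]
which for k = 4 gives n^{-1/2}: each swap error n^{-5/2} is a factor of √n smaller than the 1/n² budget per swap, and that spare √n is what lets you push slightly below the natural scale.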
And so yeah, because of that, you can go a little bit below the natural scale; if you work it out in this case, it's something like n^{-1/12} below the natural scale. Although, for the more and more complicated expressions that you play with, I think the gain you get gets less and less. So unfortunately, at the end of the day, even if you had, like, a hundred matching moments, if you're interested in, say, eigenvalue gaps, there is some power saving — I think you can get to n^{-1} minus a little bit — but that's not very much. The gain doesn't go to infinity as you increase the number of matching moments; I think it's always bounded by something like a twelfth.