The closer z gets to the real axis, the harder it is to understand this transform; but for z far away from the real axis, it's fairly easy. And the Stieltjes transform is also fairly localized. If z is over here, then w minus z is large for eigenvalues over here and small for eigenvalues over here, so what the Stieltjes transform is doing is looking at a local portion of the spectrum. Evaluating the Stieltjes transform at a point tells you a fair bit about what the spectrum is doing in the region near that point. This is why the Stieltjes transform is so good for proving local laws.

The key point in the Wigner case is that even if you want to understand some energy in the bulk of the spectrum, you can approach that energy from within the complex plane. Even though the energy is in the interior of the spectrum, nowhere near the edge in R, in the complex plane it lies on the boundary, and you can approximate it by points that stay away from the spectrum. That is how we understand the behavior at a given energy: by taking complex numbers outside the spectrum that approach the energy you want.

So that works great in the Hermitian case. But in the non-Hermitian case, our spectrum fills out a two-dimensional region, a disk. The Stieltjes transform method has some chance of working if you want to understand what is going on near an energy level on the boundary. But if you're trying to understand an energy level in the interior, there is no part of the complex plane outside the spectrum that you can use to analyze it with the Stieltjes transform. You would have to enter the spectrum for the Stieltjes transform to become effective, and once z enters the spectrum we don't have good ways to control this quantity: the spectrum is now two-dimensional, the complex parameter is also two-dimensional, and you can't approximate from outside. You can get around this by making z quaternionic rather than complex, and then you can start approximating again, but that is a whole other story which I won't talk about. (In some sense that is what we actually do; it's not obvious that it's the same thing, but quaternions are going to be floating around very implicitly in the background, not in the foreground.) If you want to keep yourself in the complex setting, you have to abandon the Stieltjes transform.

Okay, so that leaves the log potential method (not the moment method) as our one remaining method. At this point I can at least give a heuristic proof of the circular law. You want to prove that these spectral measures converge to the circular measure, and this should be equivalent to saying that for each spectral parameter z, the log potential converges to the log potential of the circular measure. There are indeed convergence theorems that make this precise, and it's actually not that hard, because the log function, or more precisely (1/2π) times the log function, is the Green's function, the fundamental solution of the Laplacian in the complex plane: if you take the Laplacian of (1/2π) log|z| in the distributional sense, you get the Dirac delta. What that means in practice is that there is a formula to recover a measure from its log potential.
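As a quick numerical sanity check of that last point, here is a minimal numpy sketch (my own illustration, not part of the lecture): the discrete Laplacian of (1/2π) log|z| behaves like a Dirac delta, with total mass about 1 concentrated at the origin and essentially zero everywhere else.

```python
# Sketch: (1/(2*pi)) log|z| as the fundamental solution of the planar
# Laplacian. The grid is offset so the origin is not a node (avoids log(0)).
import numpy as np

h = 0.05
x = np.arange(-3, 3, h) + h / 2        # offset grid, 0 is not a grid point
X, Y = np.meshgrid(x, x)
G = np.log(np.hypot(X, Y)) / (2 * np.pi)

# 5-point discrete Laplacian.
lap = (np.roll(G, 1, 0) + np.roll(G, -1, 0)
       + np.roll(G, 1, 1) + np.roll(G, -1, 1) - 4 * G) / h**2

interior = lap[5:-5, 5:-5]             # discard wrap-around edge rows
near = (np.hypot(X, Y) < 0.2)[5:-5, 5:-5]
print((interior * near).sum() * h**2)  # ~1: all the mass sits near 0
print((interior * ~near).sum() * h**2) # ~0: log|z| is harmonic away from 0
```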
So, at least formally, the measure at any given point is (1/2π) times the Laplacian of the log potential f(z) = ∫ log|z − w| dμ(w). (With the opposite sign convention for the potential, ∫ log(1/|z − w|) dμ(w), you pick up a minus sign.) Now, of course, if your measure is not continuous, this is not a pointwise formula, so you have to interpret it in the distributional sense. But let me ignore these analytical details; roughly speaking, once you know the potential, you take the Laplacian and you basically recover your measure. In fact this all works distributionally: you take a test function, test both sides, and do some abstract-nonsense analysis. Using something like this, you can actually make this step, so this step is fairly well understood. So you just need to understand the log potentials.

Now, this log potential is an explicit integral: you're integrating the log of something over a disk, and this is something you can compute explicitly. You can use, for example, Jensen's formula, or you can just go ahead and integrate; it's basically an elementary integral. What you find, after some computation, is that if z is far away from the spectrum, the circular measure behaves like a point mass and the log potential is just log|z| outside the disk; inside the disk you get a quadratic function, namely (|z|² − 1)/2. So this is something you can compute explicitly, and the two expressions match to second order at the boundary |z| = 1; this you can check by direct calculation. It's also compatible with the identity above: if you take the Laplacian of this function and divide by 2π, you get the circular law. It's a nice little computation: the quadratic piece gives a constant, 1/π in fact, and log|z| is harmonic away from zero, so outside the disk you get zero. So this is a function whose Laplacian, times the right constant, is the circular law, and it has the right growth at infinity, so it essentially has to be the right answer. Anyway, this is a computation. This heuristic proof is not entirely computation-free, but it has less computation than most other proofs of the circular law.

All right. So here is what we have reduced to. We want to show that the normalized log determinant of the shifted, normalized matrix, (1/n) log|det(M_n/√n − z)|, converges to log|z| if |z| > 1, or to (|z|² − 1)/2 if |z| < 1. Roughly speaking, we want to show that the magnitude of the determinant of this matrix behaves like |z|^n if z is big, and like exp(n(|z|² − 1)/2) if z is small; I'm just exponentiating the previous statement. And I'm going to use "approximately" very, very loosely: any factor which is exp(o(n)) disappears when you take logs and divide by n. So this is a very loose approximation; we only care about the exponentially growing terms, and anything like polynomial factors we just ignore. So roughly speaking, the circular law is really a statement about determinants.
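As an illustration of this determinant statement, here is a small Monte Carlo sketch (my own, not from the lecture), comparing the normalized log determinant of a Bernoulli matrix at a few values of z with the predicted limit.

```python
# Sketch: (1/n) log|det(M/sqrt(n) - z)| versus the circular law prediction
# log|z| for |z| > 1 and (|z|^2 - 1)/2 for |z| < 1.
import numpy as np

rng = np.random.default_rng(0)
n = 1000
M = rng.choice([-1.0, 1.0], size=(n, n))       # iid +/-1 entries

for z in [0.0, 0.5, 2.0]:
    A = M / np.sqrt(n) - z * np.eye(n)
    logabsdet = np.linalg.slogdet(A)[1]        # log|det A|, computed stably
    predicted = np.log(abs(z)) if abs(z) > 1 else (abs(z) ** 2 - 1) / 2
    print(z, logabsdet / n, predicted)
```

At this size the agreement should already be quite close, since the fluctuations of the normalized log determinant are far smaller than the order-one limit values.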
It's a statement about what these determinants look like. Now, how do we understand these determinants? There are a bunch of ways, and we want this for every z, so let's first look at the z = 0 case. Maybe I'll just do this case and explain why the determinant with no shift at all satisfies (1/n) log|det(M_n/√n)| ≈ −1/2. This is a computation done a long time ago by Turán. Let's do it for the Bernoulli matrix, with ±1 entries everywhere.

So what is this determinant? There are many, many formulas for the determinant, and some are more useful than others. We're going to use the least useful formula for the determinant, the one we usually don't use: the Leibniz expansion. You can expand the determinant as a sum over all permutations σ, det M_n = Σ_σ sgn(σ) ξ_{1σ(1)} ⋯ ξ_{nσ(n)}. That's the Leibniz expansion: you sum over all the permutations, you get all these products, you multiply by the sign of the permutation, and this is a sum of n! terms. So this is usually fairly intractable.

But observe: in the case of the Bernoulli sign matrix, every single one of these n! terms is plus or minus one, and because the entries are iid, each term is ±1 with equal probability. So these are all Bernoulli signs, and you're summing n! of them. Now, they're not quite independent: some pairs of permutations cross, that is, two permutations can share a common entry, and because of that there is some correlation between the terms. However, even though they are sometimes dependent, the covariance between any two of these products is zero, because no matter which two distinct permutations you pick, one of them involves an entry that the other does not contain, and that entry has a random sign independent of everything else. So the covariance of any two of these terms is zero. These are pairwise independent signs: not jointly independent, but pairwise independent, and that is still good enough to understand at least the mean and variance of this quantity.

So the unnormalized determinant has mean zero and variance n!, because you're adding up n! pairwise uncorrelated signs of ±1. But remember the 1/√n normalization: each term of det(M_n/√n) picks up a factor of n^{−n/2}, so the variance becomes n!/n^n, which by Stirling's formula is about e^{−n}, up to factors I'm going to ignore. So the standard deviation of this quantity is about e^{−n/2}, and that's why you should expect the expression e^{−n/2} here.

Now, it's a nice computation to do the same thing for other z's. You get a more complicated expression for the variance, but you can still use this usually-bad idea of taking the Leibniz expansion, and you can still compute the variance of the determinant and the standard deviation.
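Here is a quick numerical check of Turán's identity (a sketch of my own, not from the lecture): for an n by n matrix of iid signs, the determinant has mean 0 and variance exactly n!, so after the 1/√n normalization the variance is n!/n^n.

```python
# Sketch: E[det M_n] = 0 and E[det(M_n)^2] = n! for iid +/-1 entries,
# since the n! Leibniz terms are pairwise uncorrelated signs.
import math
import numpy as np

rng = np.random.default_rng(1)
n, trials = 5, 200_000
dets = np.linalg.det(rng.choice([-1.0, 1.0], size=(trials, n, n)))

print(dets.mean())                             # ~0
print((dets ** 2).mean(), math.factorial(n))   # ~120 vs 120
# After normalizing by sqrt(n): variance n!/n^n, which is e^{-n} up to
# polynomial factors by Stirling's formula.
print(((dets / n ** (n / 2)) ** 2).mean(), math.factorial(n) / n ** n)
```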
Then you have to make one leap of faith: you have to assume there is enough concentration of measure that the standard deviation is actually the typical size of your determinant. If you do that, you actually get back this formula here. So it's a nice little computation; I don't know if Nick is planning to do it, or maybe we'll set it as an exercise. This is the quickest heuristic derivation I know: it still has some calculation, but less than usual for the circular law. But it's not rigorous, in particular because controlling the variance doesn't play well with the log, and there isn't enough concentration of measure to actually make this work properly. Still, it is heuristically correct. Computing the variance of the determinant is suggestive, but it doesn't tell you everything; I think it gives you an upper bound on this quantity but not a lower bound, something like that. There's a Jensen inequality that intervenes at some point, and it only goes one way.

All right, so we do something else. How do we control the normalized log of the determinant rigorously? As I said, the whole problem is that this matrix is not Hermitian, and we don't understand non-Hermitian matrices very well. The key trick at this point is what we now call the Girko Hermitization trick, and it's very simple. The determinant of a matrix A, even a non-Hermitian one, is related to the determinant of a Hermitian matrix: multiply A by its adjoint, and then det(AA*) = |det A|². It actually becomes crucial that there are absolute values here; this determinant is the square of that one, which is also where the factor of one half below comes from. So the determinant of the non-Hermitian matrix is related to the determinant of this Hermitian matrix, and in particular

(1/n) log|det(M_n/√n − z)| = (1/2n) log det((M_n/√n − z)(M_n/√n − z)*).

And we understand much better how to handle spectral measures of Hermitian matrices: for this matrix, the moment method and the Stieltjes transform method work, and so we understand its spectral law. The eigenvalues of this Hermitian matrix can be understood. So you can write this in terms of eigenvalues: it's (1/2n) times the sum of the logs of the eigenvalues, which is the same as one half times the integral of log x against the empirical spectral measure of this Hermitian matrix. When you integrate the empirical spectral measure against log x, you're just summing the logs of the eigenvalues, which is the log determinant; and then there's the 1/n normalization and the extra 1/2.

And this is the type of thing we understand. In fact, the eigenvalues of this Hermitian matrix are just the squares of the singular values of M_n/√n − z: you take the singular values of that matrix and square them. Another way of saying it: the absolute value of the determinant of a matrix is not just the product of the moduli of the eigenvalues, it's also the product of the singular values.
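Here is a short numerical confirmation of this Hermitization identity (my own sketch, not from the lecture):

```python
# Sketch: log|det A| = (1/2) log det(A^* A) = sum of log singular values.
import numpy as np

rng = np.random.default_rng(2)
n = 50
z = 0.3 + 0.4j
A = rng.choice([-1.0, 1.0], size=(n, n)) / np.sqrt(n) - z * np.eye(n)

sv = np.linalg.svd(A, compute_uv=False)            # singular values of A
lhs = np.linalg.slogdet(A)[1]                      # log|det A|
herm = 0.5 * np.linalg.slogdet(A.conj().T @ A)[1]  # Hermitized version
print(lhs, herm, np.log(sv).sum())                 # all three agree
```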
And so you can use singular values, which are much more stable than eigenvalues. So basically we just need to understand the singular values of M_n/√n − z, and this can be done. In the work of Girko, and then later Bai and several other people, this law was carefully computed, and they found that these singular value distributions converge to an explicit limiting law; I think it's the Marchenko-Pastur law, or a perturbation of it. The limiting law depends on z, but for each z there is a limiting law, and the empirical distribution converges to it in a reasonable sense, namely the vague sense. So what you would like to say is that the integral of log x against the empirical measure converges to the integral of log x against the limiting measure. If you can justify that, you just need to compute the limiting integral, which is an explicit integral involving special functions, but it can be done, and miraculously, after a somewhat serious computation, you get back exactly what you need, namely these quantities here. That is the original strategy of Girko.

But there is a problem: the convergence you have is convergence in the vague topology. What that means is that when you integrate this measure against any compactly supported continuous function, you converge to what you wanted. But you're integrating against log, and log is not a compactly supported continuous function. (The spectrum is now non-negative, because we're dealing with singular values.) Log is unbounded at plus infinity, and it's unbounded at zero.

Now, the singularity at plus infinity is not a real problem, because we understand the operator norm very well: we have really good bounds for the largest singular value of these matrices, and roughly speaking the spectral measure is essentially supported on a bounded set; normally it doesn't even exceed 2 or so. So you can cut off there; the singularity at infinity is really easy to deal with, particularly since log grows so slowly.

The real problem comes from the other singularity, at zero. If, for example, the least singular value of this matrix is zero, then everything falls apart: log of zero is divergent, so everything is divergent. Even if all the other singular values are distributed nicely, it just takes one; the least singular value can ruin everything. If that singular value is zero, it doesn't matter that the rest of your measure is completely well behaved, the expression diverges. So you need lower bounds on the least singular value of your matrix.

Fortunately, we now have them; the whole point of the previous lectures was to develop such bounds. (The bounds I gave in the previous lectures are a bit too weak for what you want here, but bounds of the right shape exist.) What you can show is that for any fixed z, the least singular value is bounded from below by, say, n^{−10}, with reasonably high probability, say probability 1 − O(n^{−5}) or so.
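As an empirical illustration of how mild this is (my own sketch, not from the lecture): the least singular value of M/√n − z is small, but only polynomially small, so its logarithm is only of size O(log n).

```python
# Sketch: the least singular value of M/sqrt(n) - z stays far above the
# n^{-10} threshold, so |log(sigma_min)| is only logarithmically large.
import numpy as np

rng = np.random.default_rng(3)
n, z, trials = 200, 0.5, 50
smallest = []
for _ in range(trials):
    M = rng.choice([-1.0, 1.0], size=(n, n))
    A = M / np.sqrt(n) - z * np.eye(n)
    smallest.append(np.linalg.svd(A, compute_uv=False)[-1])
smallest = np.array(smallest)

print(smallest.min(), n ** -10.0)          # far above the n^{-10} threshold
print(-np.log(smallest.min()), np.log(n))  # |log sigma_min| ~ a few log n
```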
So you don't get exponentially good probabilities anymore, but you do get reasonably good failure rates, and a reasonably good bound. Now, normally this would not be so great: n^{−10} looks really, really small. But there's a log function in front, and at n^{−10} the log function is still relatively mild; it's only of size log n. So you can cut away the part of the spectrum below n^{−10} and work from n^{−10} onwards. You also have to use some epsilon-net argument to deal with the fact that there are many different z's: there are about n² values of z you need to care about, so you apply a union bound, and this error is fairly manageable. So the log function is still unbounded, but not very unbounded: only by a logarithm.

In the early work, particularly the work of Bai, this logarithmic loss was still there, and the way they got around it was to go back to the convergence theory. Bai in particular spent a lot of effort proving this type of convergence with a good convergence rate: not only do these measures converge to the limiting measure as n goes to infinity, he found a rate of convergence, and his rate was polynomial; it converges like n to a negative power. This required some work, but he was able to do it, and that polynomial rate of convergence was able to counteract the logarithmic growth of the log function here. So he was able to conclude the argument, but he needed to assume that the entry distributions were not Bernoulli: they had to be somewhat continuous distributions, with some moment conditions as well. When your distributions are bad, particularly if they're very heavy-tailed, you don't have this polynomial convergence rate.

So this strategy: a few years ago, Van Vu and I pushed it as far as it could go, and it was able to handle any iid matrix as long as you had one extra assumption. In addition to the second moment being bounded, you needed some moment above the second, say the 2.1-th moment, to be bounded as well. We needed this extra moment to get a little bit of a polynomial convergence rate, so that the strategy could work. But the full circular law doesn't have this extra condition; you only assume a bounded second moment, not second moment plus epsilon. So we used a different strategy to get rid of that, and maybe I'll close, very briefly, with what we did there.

Rather than focus so much on the limiting law, the circular law itself, what we did instead was focus on proving universality. As I said, you don't really have to prove that every model converges to the circular law; you just need to prove that every model converges to the same thing. Since you know that, say, the Gaussian ensemble converges to the circular law, that's enough to make everybody converge to the circular law. So what we try to prove is some sort of universality statement. Take two iid matrix models, say this one Bernoulli and this one Gaussian. Rather than work out what each of them converges to, we just want to show that the log determinant of one is close, in some sense, to the log determinant of the other. And again, I won't be precise about what "close" means.
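Here is what that universality statement looks like numerically (my own sketch, not from the lecture): at a fixed z, the normalized log determinants of a Bernoulli model and a Gaussian model are already nearly identical at moderate n, without any reference to the common limit.

```python
# Sketch: universality of the normalized log determinant at fixed z.
import numpy as np

rng = np.random.default_rng(4)
n, z = 1000, 0.5

def norm_logdet(M):
    A = M / np.sqrt(n) - z * np.eye(n)
    return np.linalg.slogdet(A)[1] / n

bern = norm_logdet(rng.choice([-1.0, 1.0], size=(n, n)))   # Bernoulli signs
gauss = norm_logdet(rng.standard_normal((n, n)))           # Gaussian entries
print(bern, gauss)   # both close to (|z|^2 - 1)/2 = -0.375
```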
If these are close, and you know that one of them converges to the right thing, then the other one has to converge to the right thing as well.

So, as I said, you can write this as the sum of logs of singular values. Actually, for technical reasons, we didn't quite use that particular formula; there is yet another formula for the determinant which is useful, so let me mention it. If you have an n by n matrix with rows X_1, ..., X_n, then geometrically the absolute determinant is the volume of the parallelepiped whose sides are X_1, ..., X_n. So you can use the base-times-height formula: this quantity equals the base, the length of X_1, times the next height, the distance from X_2 to the span of X_1, times the distance from X_3 to the span of X_1 and X_2, and so on: an n-fold base-times-height formula. So the log determinant can also be written as a sum of logs of various distances, the distance of each row to the span of the previous rows. Each of these distances behaves roughly like a singular value; there is a relationship between distances of vectors to hyperplanes and singular values, and you may have seen some of this in the TA sessions. So we used this decomposition instead, but you should think of it as being like the singular values.

So once again we have the log function, and we're trying to sum the log over these singular values, or actually over these distances, but you can think of them as singular values. Because we have the least singular value bound, we can show that the very small singular values don't contribute very much to the sum: not just the least singular value, but, say, the smallest n^{0.99} of them turn out to give a fairly negligible contribution, only about n^{0.99} times log n, which is o(n) and therefore something you can handle. And because we still have some convergence, if you look at the bulk of the singular values, say from the 0.1n-th up to the n-th, log is bounded there, and merely qualitative convergence is good enough. So it was the intermediate, smallish singular values, around the n^{0.99}-th smallest, that were the real problem.

For those there was one additional tool, which is particularly apparent in this distance formulation: these distances obey concentration of measure. They have a certain mean and variance, and the mean and variance are easy to compute; but beyond that, they are very well concentrated: they are random, but they stay very, very close to their mean value. And you can show that the mean of this quantity doesn't depend on your model: if you change from Bernoulli to Gaussian, the mean doesn't change and the variance doesn't change. So if you have good enough concentration of measure, you can show that the distribution of this distance in one model is almost the same as in the other model. And as long as you stay away from the smallest singular values, these distances also stay away from zero, and so log becomes effectively continuous.
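(As an aside, here is a quick numerical check of the base-times-height formula above; my own sketch, not from the lecture. In the QR decomposition of the transpose, the diagonal entries |R_kk| are exactly these row-to-subspace distances.)

```python
# Sketch: |det A| = prod_k dist(X_k, span(X_1, ..., X_{k-1})) for rows X_k.
# Writing A^T = QR (Gram-Schmidt on the rows), the distances are |R_kk|.
import numpy as np

rng = np.random.default_rng(5)
n = 8
A = rng.choice([-1.0, 1.0], size=(n, n))   # rows X_1, ..., X_n

Q, R = np.linalg.qr(A.T)                   # columns of A.T are the rows of A
dists = np.abs(np.diag(R))
print(np.prod(dists), abs(np.linalg.det(A)))   # the two volumes agree

# Direct check of one height: distance from X_3 to span(X_1, X_2).
X12 = A[:2]
P = X12.T @ np.linalg.solve(X12 @ X12.T, X12)  # projector onto span(X_1,X_2)
print(np.linalg.norm(A[2] - P @ A[2]), dists[2])
```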
And so you can also show that the log of the distance for one model is very, very close to the log of the distance for the other model. So even this most difficult contribution to the sum is universal: it doesn't depend on the model. And by combining the bulk theory, the least singular value theory, and this concentration of measure theory, you can wrap everything up and prove the full circular law. Okay, I'm out of time, so that's probably a good place to stop. Any questions?

Oh, I see, okay. Maybe the free probability people have a way of thinking about that, but I don't know. It does look very suggestive: the semicircular measure is, up to some normalization, the projection, exactly the real part, of the circular law. But I don't know whether that's just a coincidence or whether there is some natural reason for it. Maybe in free probability language there is some explanation, but you would have to ask an expert.

Are there other questions? Yeah. Right, okay: to understand the eigenvalues of this quantity here, you want to understand the Stieltjes transform of it. So you take this matrix, subtract another spectral parameter, take the inverse, and take the trace. Now, there is some way to view this quantity as the resolvent of M_n minus a quaternionic parameter, in some quaternionic formalism; I forget exactly how it's done, but... yes, the two-by-two matrices. "Quaternion" here is just code for two-by-two matrices.

Yes, variants of this method can show, for example, universality of some local statistics. If you take some fixed function of the eigenvalues, with some appropriate normalization, you can show that if you replace one model with another model, still mean zero and variance one, the law of this statistic doesn't change very much. And you can start proving central limit theorems for some of these things by methods like this. Offhand, I don't remember exactly what's currently known. There is a limit as to how local you can go and still have universality: if you go really local, you need matching moments, which I'll talk about next time. There is some way to adapt this to linear statistics, but I don't remember the full statement.

All right, so we resume at 3:15 with Robert Berry from NCTM. Let's thank Mr. Bowie.