Okay, thank you. So today I'm going to cover the circular law, which you can think of as the non-Hermitian analogue of the semicircular law. Of course, we've seen the semicircular law many times already, so just to remind you: you take a Wigner matrix M, n by n and symmetric, and let's assume the upper triangular entries are IID with mean zero and variance one. Because I've made the variance one, I have to normalize: you divide by √n to make the operator norm bounded, and then you can define the empirical spectral distribution μ of this normalized matrix. What you do is take the n eigenvalues, put a Dirac mass at each one of them, and then average, so μ = (1/n) Σ_j δ_{λ_j(M/√n)}, a probability measure; in the semicircular case it lives on the real line. Wigner's semicircular law says that as n goes to infinity, these measures converge to the semicircular measure, whose density is (1/2π) √(4 − x²)₊. Now, first of all, this is random, so you have to specify the mode of convergence, either in probability or almost surely; it turns out both are true, possibly under some moment conditions on the entries. And as for the sense in which this discrete measure converges to the semicircular law: one way to say it is convergence in distribution, so if you measure μ on some interval, which is basically counting how many eigenvalues land in that interval, you should converge to the corresponding area under the semicircular density, which is just the integral of that density. Or, equivalently, the convergence is in the vague topology, which means that for every test function, some nice compactly supported continuous function, the integral of the test function against μ converges to its integral against the semicircular law. So rather than take a histogram, you take a test function and ask for convergence of all these statistics; that is vague convergence. These formulations are equivalent basically because the limit measure is continuous: you can approximate the sharp cutoff of an indicator function above and below by continuous functions, and the continuity of the limit measure lets you pass between the two. Anyway, these are analytic technicalities and I don't want to dwell on them too much. So this is the semicircular law, which you've already seen many, many times.
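(As a quick aside, not part of the lecture itself: here is a minimal numerical sketch of this statement, assuming numpy is available. It samples a Bernoulli Wigner matrix and compares the fraction of eigenvalues in an interval against the semicircle mass of the same interval.)

```python
import numpy as np

n = 2000
rng = np.random.default_rng(0)

# Symmetric matrix with iid +/-1 entries on and above the diagonal (mean 0, variance 1)
A = rng.choice([-1.0, 1.0], size=(n, n))
M = np.triu(A) + np.triu(A, 1).T

# Empirical spectral distribution of M / sqrt(n): one Dirac mass per eigenvalue
eigs = np.linalg.eigvalsh(M / np.sqrt(n))

# Fraction of eigenvalues in [-1, 1] vs the semicircle mass of that interval
empirical = np.mean((eigs >= -1) & (eigs <= 1))
xs = np.linspace(-1, 1, 10001)
semicircle = np.trapz(np.sqrt(np.maximum(4 - xs**2, 0)) / (2 * np.pi), xs)
print(empirical, semicircle)  # both roughly 0.61
```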
So the circular law is similar, but now the matrix is an IID matrix, so not Hermitian: the entries are all independent, again with mean zero and variance one. You can relax the IID condition quite substantially, but for simplicity we're just going to talk about the IID case. Once again you have eigenvalues, but in the Wigner case the eigenvalues were real, and now they're just complex. In the real case they come with an order, since the real line is ordered, so you can talk about the smallest eigenvalue, the second smallest, the third smallest; in the complex case they don't come in any order, it's just an unordered set. But you can still form the empirical spectral distribution of the normalized matrix. Remember, in the first lecture we saw that this matrix has operator norm about √n, so to normalize it to have bounded norm, and hence bounded eigenvalues, you should divide by √n; that's the right scaling. So you can again form exactly the same empirical spectral measure, which is now a discrete measure in the complex plane. And the circular law says that, again as n goes to infinity, both in probability and in the almost sure sense, these empirical measures converge, again in the vague sense, to a limit, and the limit is now the circular measure: (1/π) times the indicator function of the complex unit disc, times Lebesgue measure on the plane. So as n gets bigger and bigger, the eigenvalues will eventually fill out the unit disc uniformly, with density 1/π; π, of course, is the area of the unit disc. You get no eigenvalues outside the disc, or very few; actually, it turns out you get basically none. All the eigenvalues are inside the disc, and furthermore they're spread uniformly: there are just as many eigenvalues over here as over there. And again this convergence is in the vague sense: you can either draw a rectangle and ask how many eigenvalues fall in the rectangle, and the proportion converges to the mass the circular law gives that rectangle, or you can take a test function; and again these are equivalent because the limit is a continuous measure. Okay. So this is the circular law.
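(Again a small numerical sketch of my own, under the same assumptions: sample an IID ±1 matrix, normalize by √n, and check that the eigenvalues fill the unit disc uniformly, so a disc of radius r captures about an r² fraction of them.)

```python
import numpy as np

n = 1000
rng = np.random.default_rng(0)

# IID Bernoulli matrix: all n^2 entries independent +/-1 (mean 0, variance 1)
M = rng.choice([-1.0, 1.0], size=(n, n))

# Eigenvalues of the normalized matrix M / sqrt(n) are now complex
eigs = np.linalg.eigvals(M / np.sqrt(n))

# Circular law: essentially all eigenvalues lie in the unit disc...
print(np.mean(np.abs(eigs) <= 1.0 + 1e-6))   # close to 1

# ...and uniformly so: the disc of radius r holds about a fraction r^2 of them
for r in (0.25, 0.5, 0.75):
    print(r**2, np.mean(np.abs(eigs) <= r))
```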
So it took many decades to prove this law. The Wigner semicircular law, I don't have the dates when it was proven, I think the 50s or 60s; I didn't record exactly when. And that was also several papers, because there are various hypotheses on the distribution of the individual entries that you can start relaxing, but it was all done by the 60s at the latest. The circular law took much longer. In the Gaussian case there are two Gaussian ensembles of interest here. When I say variance 1, what I mean is that the entries may be complex numbers, and the expectation of the absolute value squared is 1; mean 0, of course, is mean 0. So there are several special ensembles that we care about. I focus mostly on the Bernoulli ensemble, where the entries are ±1; that's the most combinatorially interesting ensemble. But the most exactly solvable models, from the point of view of explicit algebraic computations, are the Gaussian ones. There are the real Gaussian matrices, whose entries are real Gaussians of mean 0 and variance 1, and there are the complex Gaussian matrices, again mean 0 and variance 1. They've both been studied. I think Mehta was the first to analyze the complex Gaussian ensemble, which has the simplest eigenvalue distribution: in that case the distribution is given by a nice log-gas, the point process is determinantal, and there are lots and lots of formulas. That's the easiest case, and there you can verify the circular law pretty much by direct computation. Then I think Edelman computed the same thing for the real Gaussian. There the law is already more complicated: the correlation functions of the real Gaussian are no longer determinantal, there are Pfaffians appearing; it's a bit messier, but still computable. And then Girko tackled the general case, although his first arguments had a few gaps in them. The general strategy was sound, and later papers of many authors, including Girko, made the arguments rigorous and covered wider and wider classes of random variables, until now we can actually handle all entry distributions with mean 0 and variance 1. The last step was about ten years ago, by Van Vu and myself, 2009 maybe. All right. So that's the law. Now, there are several questions here. First of all, how do you prove this? Secondly, why the circular law? Well, first of all, "circular law" is actually a slight misnomer; it should really be called the disc law. The eigenvalues are not concentrating on the circle, they're concentrating in the unit disc. I think the reason it's called the circular law is that it sits alongside the semicircular law, and there's also a quarter-circular law, which I talked about before, so I guess they wanted to keep the names similar. But it should really be called the disc law. Anyway, that's not the real question. The question is: why is the limit the uniform measure on the disc and not some other measure? There are so many other possible measures you could imagine here. And the third question is: why is it universal? Bernoulli, Gaussian, all the different matrix ensembles with the same mean and variance but different distributions, they all converge in the limit to the same answer. This is universality, of course; it happens all over the place in probability, and certainly in random matrix theory, so it's a familiar phenomenon. But what's the source? Where does it come from? So those are the questions. Now, I can certainly answer the question of how we prove these things, and we have a pretty good understanding of why things are universal. I still don't have a really satisfying explanation of why we get the circular law in particular. I mean, if you buy that the limit is universal, then to verify the circular law you only need to check it for one ensemble: if you have some argument that says Gaussian matrices obey the circular law, and you know the limiting law is independent of the distribution, that is, universal, then that tells you the circular law holds in general. So universality partly addresses this question, in the sense that you only need to check one ensemble. But even checking the complex Gaussian case, which is the simplest case, you still have to do about a page of computation, which is not bad in the grand scheme of things. Still, I'll try to address parts of this question as time goes by. So: how do we prove limit laws for spectral measures? The empirical spectral measure by itself is not an easy object to address directly, but there are various things related to it that you can address.
So there are various methods for understanding spectral measures based on understanding certain transforms of the measure. The most popular, and the original, method is the moment method. What you do is that instead of dealing with the measure directly, you deal with its moments: you integrate your measure against a polynomial z^k, and these are numbers. The point of dealing with these moments is that these numbers have a very simple interpretation in terms of the original matrix: by the spectral theorem, the k-th moment ∫ z^k dμ is just (1/n) tr((M/√n)^k). The thing is that the measure is difficult, but the matrix is easy: all the entries are IID, and it's hard to think of anything much simpler in probability than a bunch of IID random variables. A polynomial of degree k in those random variables is somewhat less easy, but still, at least for small k, it's something we understand pretty well. And we have moment convergence theorems: under certain conditions, if you can prove that these moments converge to what they should converge to, namely the moments of the corresponding limit law, for every k, then hopefully you can deduce that the measures actually converge. Now, in practice everything is random, so you have to quantify what kind of convergence you need here, almost sure or in probability or whatever, and that impacts what kind of convergence you get for the measures. Let me ignore these issues; they're real-analysis issues, and not the most difficult aspect of this theory. So let me be vague. Hmm, "vague" is already taken; let me be not so precise about what convergence means here. All right, so that's the moment method, roughly speaking. Then, as we've seen particularly in the last few lectures, we have the Stieltjes transform method, which is based on a slightly different statistic: you take the measure and integrate it not against a polynomial but against 1/(w − z), giving s(z) = ∫ dμ(w)/(w − z). This is a quantity that's not as easy to understand as the moments, but it also has a nice interpretation in terms of the matrix, up to normalization: it's (1/n) times the trace of (M/√n − zI)^{-1}, the normalized trace of the resolvent, or of the Green's function if you like. This is no longer polynomial in the entries, but we do have techniques to understand this quantity. And again there are convergence theorems that say that if these Stieltjes transforms converge in an appropriate sense to the Stieltjes transform of a limit measure, and maybe you need some other conditions on the measures, then hopefully you can deduce that the measures converge in an appropriate sense to the limit. There are various analysis theorems that do that.
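(Here is a small numerical check of my own for both identities just stated, the moment identity and the resolvent identity, on a Bernoulli Wigner matrix; the name A for M/√n is mine.)

```python
import numpy as np

n = 400
rng = np.random.default_rng(1)
W = rng.choice([-1.0, 1.0], size=(n, n))
M = np.triu(W) + np.triu(W, 1).T           # Wigner matrix
A = M / np.sqrt(n)                         # normalized matrix
eigs = np.linalg.eigvalsh(A)

# k-th moment of the ESD = (1/n) tr(A^k); for the semicircle law the
# k = 4 moment tends to the Catalan number C_2 = 2
k = 4
print(np.mean(eigs**k))                               # spectral side
print(np.trace(np.linalg.matrix_power(A, k)) / n)     # matrix side (equal)

# Stieltjes transform at z = (1/n) tr((A - zI)^(-1))
z = 0.3 + 0.5j
print(np.mean(1.0 / (eigs - z)))                      # spectral side
R = np.linalg.inv(A - z * np.eye(n))                  # resolvent
print(np.trace(R) / n)                                # matrix side (equal)
```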
Now, these are the two major methods that we've seen in other lectures, and they don't work very well for this non-Hermitian problem. I'll explain why in just a little bit, but let me first finish this table, because there's a third method, which you might call the logarithmic potential method. It works not with moments and not with the Stieltjes transform: what you do is integrate your measure against a logarithm centered at some point z, that is, you look at ∫ log|w − z| dμ(w). The reason this is good is that it too has a nice linear algebra interpretation. By the definition of the measure, it is (1/n) Σ_j log|λ_j(M/√n) − z|, a sum over the eigenvalues of the normalized matrix. And the thing about log is that a sum of logs is the log of a product; you get the product of these eigenvalues, and the product of the eigenvalues is the determinant, so this is the same thing as (1/n) log|det(M/√n − zI)|. Again, this is not as easy to understand as the moments or the Green's function, but from the linear algebra point of view it's still a relatively simple, natural expression to play with. And then, once again, there are convergence theorems that say that if you know that these log potentials converge in an appropriate sense, maybe in probability or almost surely or whatever; and z is a parameter here, so you might ask for convergence almost surely for almost every z, for instance; there are various flavors of convergence, and again I don't want to dwell on the details, which are somewhat important but not the most challenging portion of the theory. You also need some tightness bounds to stop the measures from running off to infinity, so you need some control on the measures in addition to this, but those are usually available; plus maybe some other technical conditions. Then you can conclude that the measures converge to the limit measure that you want. Okay, so these are the three basic methods that we have; well, there are more, but these are the three that I'll talk about.
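(One more numerical sanity check of my own for the determinant identity, using slogdet to compute log|det| stably:)

```python
import numpy as np

n = 300
rng = np.random.default_rng(2)
A = rng.choice([-1.0, 1.0], size=(n, n)) / np.sqrt(n)   # normalized IID matrix
z = 0.2 + 0.1j

# Spectral side: (1/n) * sum_j log|lambda_j - z|
eigs = np.linalg.eigvals(A)
lhs = np.mean(np.log(np.abs(eigs - z)))

# Determinant side: (1/n) * log|det(A - zI)|
sign, logabsdet = np.linalg.slogdet(A - z * np.eye(n))
rhs = logabsdet / n

print(lhs, rhs)  # equal up to floating-point error
```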
Now, the moment method and the Stieltjes transform method work quite well for Hermitian matrices, although, as was pointed out earlier, the moment method is particularly good for understanding the edge of the spectrum and not so good in the bulk, while the Stieltjes transform method is good for both the edge and the bulk. So let's see, where should I start? Okay, let me start with the moment method. Why doesn't the moment method work in the complex case? I can give you two explanations. One is that moment convergence theorems of the type I stated are generally true on the real line and false over the complex numbers. See, on the real line, once you have convergence of every moment, then by linearity you can replace the single moment by any polynomial and get the same sort of convergence. And on the real line, or at least on any compact subset of the real line, the polynomials are dense, by the Weierstrass approximation theorem. So once you have convergence for polynomials, you get convergence for all test functions, assuming, say, some uniform compact support for all these measures, and that gives you vague convergence. So moment convergence theorems are available over the reals, but over the complex numbers they are not. Complex polynomials are not dense in the space of all continuous functions; the Weierstrass approximation theorem is simply not true in the complex domain. When you integrate a polynomial, or any holomorphic function, over a closed contour, you get zero, but when you integrate a non-holomorphic function, even a smooth one, you can get something nonzero. So you cannot approximate everything by polynomials; z-bar, for example, is a complex smooth function that is not approximable by polynomials. And so the moments only give you part of the information you need to understand the measure. Just to give you one example: take the circular measure, compress all of its mass to a point, and consider the Dirac mass at the origin; these two have the same moments. If you integrate z^k against the circular measure, you can just check that it's the same as integrating against the Dirac mass: if k is zero, both sides are one, and if k is positive, both sides are zero, because of the rotational symmetry of the circular law. In fact, any measure which is rotationally symmetric will have the same moments. Yes? That would fix things, yes. If you could control the mixed moments, the integrals of z^k times z-bar^l against the measure, and these all converged to the right limits, then a Stone-Weierstrass type argument would work, assuming suitable compactness of your measures, and you'd be done. But the problem is that for these mixed statistics there is no easy formula like the one above that relates them to some nice simple expression in the matrix; it's not just (1/n) tr((M/√n)^k ((M/√n)*)^l), because the matrices are not normal. So the mixed moments are about as hard to understand as the entire circular measure, and you haven't gained any mileage: it's a natural approach, but you're stuck at the first step. Okay. So that's one way to see why the moment method is insufficient to tackle this problem.
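(For the record, here is that moment computation written out in polar coordinates, an addition of mine rather than something from the lecture:)

```latex
\int_{|w|\le 1} w^k \,\frac{dA(w)}{\pi}
  = \frac{1}{\pi}\int_0^1\!\!\int_0^{2\pi} r^k e^{ik\theta}\, r\,d\theta\,dr
  = \frac{1}{\pi}\left(\int_0^1 r^{k+1}\,dr\right)\left(\int_0^{2\pi} e^{ik\theta}\,d\theta\right)
  = \begin{cases} 1, & k = 0,\\ 0, & k \ge 1, \end{cases}
```

which agrees exactly with the moments of the Dirac mass at the origin.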
Another reason why the moment method can't work, a different obstruction; I'll just say it in words first and then explain what I mean. Moments are stable, and spectra of non-Hermitian matrices are unstable. What I mean by this is that if you take your matrix, this IID matrix, and you just change it a little bit, like maybe you change one entry, you flip a plus one to a minus one, that barely changes these moments. The moments are polynomials in all the entries, and most of the terms don't involve the one entry you're tweaking, so they're not sensitive to tiny changes in a single variable: these moments are very stable. You change the matrix a little bit, nothing much happens. And in the Hermitian case, the eigenvalues are stable too; maybe I should emphasize that the spectra of Hermitian matrices are stable. If you have a Hermitian matrix and you add a small error to it, say you change one entry (well, two entries, because it has to stay Hermitian), then the eigenvalues of the perturbed matrix are very close to the eigenvalues of the original matrix, and there are all these inequalities that relate them. For example, the Weyl inequalities: if A and E are both Hermitian, the j-th eigenvalue of A + E differs from the j-th eigenvalue of A by at most the operator norm of E, so |λ_j(A + E) − λ_j(A)| ≤ ‖E‖_op. This is an easy inequality that you can prove, and there are many, many other inequalities like it; it's a whole story in itself, by the way. If you change things a little bit, the eigenvalues don't change very much, and this is true for each j. There are also variants where you sum over j, like the Wielandt-Hoffman inequality, and they are very useful, and so on and so forth. Okay, nothing like this is true in the non-Hermitian case. In the non-Hermitian case, first of all, λ_j doesn't even make sense, because the eigenvalues are not ordered; so that's already a problem. But even if you somehow ignore that problem, the eigenvalues are still technically continuous: for every ε there is a δ such that if you change the matrix by at most δ, the eigenvalues won't change by more than ε. But the dependence of δ on ε is very, very bad, so from the point of view of estimates, which is what we care about, it's as if there were no continuity at all. So at this point we give the standard example which everybody gives. Consider the right shift matrix: ones on the superdiagonal and zeros everywhere else. This matrix is nilpotent: if you raise it to the n-th power, you get zero, since the diagonal of ones just keeps shifting over to the right, and after a while you get nothing. So all the eigenvalues are zero, and the spectral measure of this matrix is just the Dirac mass at the origin: one big cluster of eigenvalues at zero, and no eigenvalues anywhere else. That's what the spectral measure looks like. But now perturb it in just one entry, the bottom-left entry, by a very small amount, an exponentially small quantity, 2^{-n}. So I take the same matrix and change one entry by an exponentially small amount. What happens to the matrix? Well, it's no longer nilpotent. Viewed as the weighted adjacency matrix of a graph, it now has a big cycle, and if you compute the n-th power, what you get is 2^{-n} times the identity matrix. And so you can compute the eigenvalues: they are the n-th roots of 2^{-n}, that is, one-half times the n-th roots of unity. Another way of seeing it is that the characteristic polynomial of the original matrix is just z^n, while the characteristic polynomial of the perturbed matrix is z^n − 2^{-n} (plus or minus, probably minus). All right. And so these eigenvalues are spread out uniformly over the circle of radius one-half. So you have a matrix whose spectrum is all stuck at zero, you make this exponentially small change, and the eigenvalues repel each other very fast, move a constant distance away from where they were, and are now spread out somewhere else. So there's a dramatic change between these two spectral measures, even though the matrix changed very little. And because moments are stable, the moment method doesn't see this.
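(The example is easy to play with numerically; a sketch of my own, with n = 30 to keep 2^{-n} well inside floating-point range:)

```python
import numpy as np

n = 30

# Right shift matrix: ones on the superdiagonal, nilpotent, all eigenvalues zero
S = np.diag(np.ones(n - 1), k=1)
print(np.max(np.abs(np.linalg.eigvals(S))))   # ~0: the matrix is triangular

# Perturb the single bottom-left entry by the exponentially small amount 2^(-n)
P = S.copy()
P[-1, 0] = 2.0 ** (-n)

# The eigenvalues jump to the n-th roots of 2^(-n): the circle of radius 1/2
print(np.abs(np.linalg.eigvals(P)))           # all approximately 0.5
```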
Indeed, if you look at these two spectral measures, they have exactly the same moments: first, second, third, fourth, all the way up to order n − 1. That's basically Fourier analysis, the Fourier inversion formula, actually; it's only the n-th moment that differs, and only by 2^{-n}. So these two measures are almost indistinguishable from each other by the moment method, because the moment method is stable: it can't see this perturbation. But the spectrum here is unstable, so small perturbations become important. This is a real headache for many reasons. For example, in the Wigner case there's this issue of tails. Sometimes you don't have bounded entries, sometimes the entries can occasionally be large, but you can often truncate your matrix, throwing away the large entries, and because of all these stability-type theorems you can show that the outliers, the very large entries, are not very important, as long as you're not genuinely heavy-tailed, that is, as long as the second moment is finite. So you can truncate, and you can reduce to the case where all your entries are bounded. Because of the instability, you can't automatically do that in the non-Hermitian case. So the case of heavy-tailed matrices is significantly more delicate than, say, bounded entries; you can't just immediately restrict to the bounded case, which you often can in the Hermitian case. Okay, so that's the moment method. So let's go on to the Stieltjes transform method. Now, this one is a bit better. It's better because the Stieltjes transform is worse: worse in the sense that it is less stable with respect to perturbations of the matrix, which sounds bad, but you have to use a method which is unstable with respect to the matrix if you have any chance of proving the circular law. So this trace: if you change M by a little bit, the resolvent can change quite a bit, particularly if z is very close to the spectrum of M; then it's very unstable. But the thing is, you get to choose what z is; z can wander all around the complex plane. And as a general rule of thumb, if z is very far away from the spectrum of your matrix, then things are more stable and you have a better understanding of this quantity. It's when you approach the spectrum, or more generally some kind of pseudospectrum, which we'll talk about in a little bit, that this becomes unstable again. Now, in the Hermitian case this is a very good tool to use, because the spectrum is real: it lies on the real line, so typically it's concentrated on some interval, for example [−2, 2] is very typical. So if you have a big matrix, the spectrum sits over here, but z you can pick anywhere. And the further away z is from the spectrum, the more control we have on the transform. So basically the key parameter, which you saw a lot in the earlier lectures, is the imaginary part of the spectral parameter z, usually called η.
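(A final numerical illustration of my own: compare the empirical Stieltjes transform of a Wigner matrix with the semicircle Stieltjes transform m(z) = (−z + √(z² − 4))/2 as the imaginary part η of z shrinks. I use the branch-safe product √(z − 2)·√(z + 2) so that m(z) → 0 at infinity for Im z > 0.)

```python
import numpy as np

n = 2000
rng = np.random.default_rng(3)
W = rng.choice([-1.0, 1.0], size=(n, n))
M = np.triu(W) + np.triu(W, 1).T

eigs = np.linalg.eigvalsh(M / np.sqrt(n))

def m_sc(z):
    # Stieltjes transform of the semicircle law, branch vanishing at infinity
    return (-z + np.sqrt(z - 2) * np.sqrt(z + 2)) / 2

# Empirical Stieltjes transform vs the semicircle prediction at z = E + i*eta;
# the agreement degrades as eta approaches the eigenvalue spacing ~1/n
for eta in (1.0, 0.1, 0.01):
    z = 0.5 + 1j * eta
    print(eta, np.mean(1.0 / (eigs - z)), m_sc(z))
```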