Thank you, Sharav. All right, so going back to the subjects of discussion that we had two days ago: we talked about the Wigner semicircle law, and we talked about when it applies. We first showed it for Wigner matrices with moments of all orders, and then we explained how you can remove the moment assumption, so that the only moments you have to continue to assume are those of order two. Today, we will move in a deeper direction. We will examine what happens once you subtract off the first-order term, which is this semicircle law, and you look deeper: you look at the fluctuations from that.

A quick reminder of what the notation stands for. We have a Wigner matrix W = (W_ij). All the variables are centered, and the variances are one for the off-diagonal entries. And we had this condition, which I will continue assuming for the first part of the talk (I'll explain at the end how you can remove it): you have moments of all orders. I should also specify a fact which I didn't write here, so I'm going to just add it on the board: we assume that the variance of the diagonal entries is constant, Var(W_ii) = C for some constant C > 0. We are going to specifically examine the matrix W bar, which is the scaling of the matrix W by 1 over square root of n. And the object that we're interested in is the empirical spectral distribution, although in retrospect I don't remember actually writing F of W bar again in this talk; this was the object of our interest last time. So this is the notation. We showed that F of W bar converges to sigma, the semicircular distribution, and we showed that this is convergence in probability. In particular, you can think of it in the following way, and I'm sure that you've seen this already a bunch of times in other lectures or in the research talks. You can write that the average of the values f(lambda_i(W bar)), averaged over all the eigenvalues, converges to the integral of f against the semicircular density. Very simple. We know that it is true for polynomials, and then through approximation (Weierstrass approximation, the fact that we know that the probability of the eigenvalues falling outside of some compact set is small, et cetera) you can extend this to quite a large class of functions.

This has the flavor of a law of large numbers, which we abbreviate to LLN. To recall, in one dimension, if you think of x_1 through x_n as independent samples from a distribution with mean mu and variance sigma squared, the law of large numbers says the following: if you average the x_i's and let n go to infinity, the average converges to mu, the mean. That's the law of large numbers. Of course, there's a deeper result available, and that's the central limit theorem, which says that you actually have more than just that. If you sum the x_i's, subtract off n times the mean, and normalize by square root of n times sigma, then the variable that results is asymptotically distributed like a standard normal, so mean 0, variance 1. That's the central limit theorem. Since we have a result like the law of large numbers for values of functions evaluated on the eigenvalues of Wigner matrices, we can ask: can we perhaps get something like the central limit theorem? But before we do that, let's take a look to understand what these fluctuations from the semicircle might look like.
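To make this LLN-type statement concrete, here is a minimal numerical sketch, my own illustration rather than anything from the lecture, assuming Gaussian entries for convenience. It samples a Wigner matrix, rescales it, and checks that the empirical average of f(lambda_i) matches the semicircle integral; for f(x) = x^2 that integral is exactly 1.

```python
# Minimal sketch: empirical average of f over the eigenvalues of W bar
# versus the integral of f against the semicircle law on [-2, 2].
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Symmetric Wigner matrix: centered Gaussian entries, off-diagonal
# variance 1, constant diagonal variance.
A = rng.standard_normal((n, n))
W = (A + A.T) / np.sqrt(2)
W_bar = W / np.sqrt(n)              # the rescaled matrix W bar

eigs = np.linalg.eigvalsh(W_bar)    # eigenvalues of a symmetric matrix

# For f(x) = x^2, the semicircle integral is the first Catalan number, 1.
print((eigs**2).mean())             # close to 1 for large n
```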
So I've drawn some pictures here. They are supposed to show you the exact distributions, or rather the shape of the densities, to be more exact, of the eigenvalues for the GOE, GUE, GSE. I think that in particular these are the GUEs, because those are the simplest ones to draw. So if you look at a one-by-one matrix and you look at what the distribution of its eigenvalue is, of course you're going to see the Gaussian curve. The eigenvalue is Gaussian, so no surprise there. But let's see what happens when you do two. You get this picture that's perhaps reminiscent of the Little Prince's hat, which is actually a snake having swallowed an elephant. And this is what the distribution of a random eigenvalue looks like at n equal to 2. At n equal to 6, it starts looking more and more like the semicircle, except you have these six bumps. And by the time n is 100, if you're really in the back you might not be able to see this, but in the front you might be able to see all these tiny little bumps distributed around the semicircle. Those bumps are the fluctuation, and what you see is that it's distributed in some interesting way around the semicircle. So that's what we're going to be after.

So now let's look again at how things look in 1D. Actually, let me move this out of the way, because I don't want it to confuse people. Again, you take n samples from the same distribution, subtract off the expectation, scale by square root of n times the standard deviation, and what you get converges to a standard normal. Well, for Wigner matrices, we will have something similar. For some smooth enough, nice enough function f, if you sum the values of the function over the eigenvalues of the Wigner matrix, subtract off the expectation, and then divide by the standard deviation, which is going to be a number that depends on the function f, then what you get is going to converge to a standard normal. There's one big difference between the two statements, the first one and the second one. There it is: for the classical central limit theorem, you have to scale by square root of n to get convergence, because the fluctuations are big. That's not the case here. If you think a little bit about what that means, it means that the eigenvalues of the Wigner matrix have a lot less randomness in them than samples from the same distributions. They are very highly correlated. They push each other apart, and they will fluctuate very little around what is called the classical position. Sorry, what is it? Not classical position, classical location. That's what it was. And you will hear a lot more about that, I'm guessing, because there will be a discussion of universality next week. But this is what this is saying: there's no square root of n, so there's a lot less fluctuation in the eigenvalues than you would expect if they were independent.

Yes, a question: isn't it just the one over root n, since we're looking at W bar, which is 1 over root n times W? No, because what that 1 over root n does is put the eigenvalues on a compact set. But that's all; you still have n of them. So again, think about it a little bit. What this is saying is that you sum n objects, you subtract off the expectation, and what you get is of order 1, not root n. And I'm not sure what you mean by hitting the matrix with the scaling: for example, if these x_i's were plus or minus 1's, that also puts them on a compact set, but you don't rescale them to be close to 0. They're still going to be on [-2, 2]. So these are objects that are all of order 1, just like here. OK, so what's the problem here? The problem is that I want to look at f as being defined on a compact set. If I allow f to be defined on the whole real line, it's a totally different situation. But this is the best comparison I think you can make. Think of these x_i's as taking just a few values, say values in {-1, 1}. The fluctuation, because of the independence of the samples, is of order root n. But here the lambda_i's are of order 1; when you evaluate the function on them, that gives you something that's of order 1; you sum n objects that are of order 1, and yet the fluctuation from the expectation is also of order 1. This is very strong. This says that the correlation between the eigenvalues is very strong, and in particular that's what happens: the eigenvalues are going to be pushed apart as much as they can, and therefore there will be very little wiggle room around those classical locations. Anyway, we can talk more about it offline, perhaps. But let me assure you once again: the square root of n that's missing here is not hiding in the scaling of the eigenvalues.
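Here is a rough numerical sketch of this contrast, my own illustration with an arbitrarily chosen test function: the standard deviation of the sum of f over the Wigner eigenvalues stays of order 1 as n grows, while for i.i.d. samples on a compact set the standard deviation of the sum grows like root n.

```python
# Sketch: fluctuations of sum_i f(lambda_i) for Wigner eigenvalues
# versus sum_i f(x_i) for i.i.d. samples on [-2, 2].
import numpy as np

rng = np.random.default_rng(1)
f = lambda x: x**3                    # a smooth test function

def eigenvalue_statistic(n):
    A = rng.standard_normal((n, n))
    W_bar = (A + A.T) / np.sqrt(2 * n)        # W bar = W / sqrt(n)
    return f(np.linalg.eigvalsh(W_bar)).sum()

for n in (100, 200, 400):
    eig = np.array([eigenvalue_statistic(n) for _ in range(200)])
    iid = np.array([f(rng.uniform(-2, 2, n)).sum() for _ in range(200)])
    # eigenvalue statistic: std roughly constant in n;
    # i.i.d. statistic: std grows like sqrt(n).
    print(n, round(eig.std(), 3), round(iid.std(), 3))
```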
OK. So how might one prove something like this? One way to do it is, for example, to take f of a particular form. And of course, so far we've worked with monomials, so let's continue working with monomials. You look at the variable X_{n,k}, which is going to be the trace of W bar to the k minus its expectation. You want to show that if you scale by the standard deviation, which will of course have to be computed, then what you get converges in distribution to a standard normal. And the way you're going to do that is, again, the method of moments: you look at this variable, you compute all the moments, starting with the variance, and you show that their asymptotics are those of the moments of the Gaussian. So that's the thing to do. And because you have to normalize, the first thing to do is to compute the variance. So let's take a look at the variance. This is what we want to look at.

Now, I will remind you that we've actually already looked at these variances, when we proved the concentration of the moments. Remember that the way we showed the convergence of the empirical spectral distributions was to first look at the expected moments, conclude that they converge to either zero or the Catalan numbers, and then remove the expectation by saying: I'm going to now look at the variance, and because the variances are going to be small, I will conclude that these quantities themselves are concentrated. One thing to remember is that before, when we were looking at the empirical spectral distributions, there was an additional 1 over n here, which came from the fact that we had to average the trace to get a moment of the empirical spectral distribution. Here we will not have that; we'll just look at the trace itself. You might remember that we showed previously that these variances were of order 1 over n squared. But that's because there was a 1 over n squared implicit in there, because we were looking at the average, 1 over n times the trace of W bar to the k. Now we've removed that 1 over n, so we only have a 1 over n to the k from the scaling, and we showed that the sum inside was also of order n to the k. So now we're looking at something that's of order 1. And back then we said, well, for the purposes of showing concentration, we don't need to actually compute this; we don't need to know what it is. Now, however, we will find out what it is, because now we need it.
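Written out, in my rendering of the board in the lecture's notation, the claim and the normalization bookkeeping are:

\[
X_{n,k} \;=\; \operatorname{tr} \bar W^{k} \;-\; \mathbb{E}\,\operatorname{tr} \bar W^{k},
\qquad
\frac{X_{n,k}}{\sqrt{\operatorname{Var}(X_{n,k})}} \;\xrightarrow{\;d\;}\; N(0,1),
\]

and since \(\bar W = W/\sqrt{n}\),

\[
\operatorname{Var}\big(\operatorname{tr} \bar W^{k}\big)
\;=\; \frac{1}{n^{k}}\,\operatorname{Var}\big(\operatorname{tr} W^{k}\big),
\]

so the earlier bound of order \(1/n^{2}\) for the variance of \(\frac{1}{n}\operatorname{tr} \bar W^{k}\) becomes order 1 once the averaging factor \(1/n\) is removed.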
So how did we do this? Recall that we had this notation: for a k-tuple of indices i = (i_1, i_2, ..., i_k), we had the word associated to it, W_i, which was just W_{i_1 i_2} W_{i_2 i_3} ... W_{i_k i_1}. And associated to it we also had a graph, G_i, with vertices labeled by i_1, i_2, ..., i_k (of course, you're going to have as many vertices as there are distinct labels among i_1, i_2, ..., i_k) and edges given by {i_j, i_{j+1}}. So that was the graph. Now, since we're looking at the covariances of the pairs W_i, W_j, let's see what conclusions we had drawn before about such covariances. And of course, we're interested in looking at this: it's just rewriting the variance as a sum of covariances over all possible pairs of k-tuples i, j, where i and j take values in the set {1, ..., n}^k. We will need the graphs G_i and G_j and their union, and the words will represent closed walks on the join of G_i with G_j.

Recall that if it's not the case that every edge in these walks (in the join of walks, if you want) appears at least two times, then one of these variables, W_{i_l i_{l+1}}, will appear exactly once. And because it's a centered variable and we're taking expectations, that ensures that the term disappears. So we are only interested in the covariance of pairs W_i, W_j such that each edge in the join of graphs is walked on at least twice by the two walks combined. And of course, if E is the number of edges in the join G_i ∪ G_j and V is the number of vertices in the same, then because it comes from a join of walks, this graph is connected; therefore we have V − 1 ≤ E on the one hand. On the other hand, the join of walks contains precisely 2k edges counted with multiplicity, so since each edge appears at least twice, the total number of distinct edges is at most k. So far, so good. Are we remembering slowly how things were? Next: what does this mean? It means that at most we can have V − 1 ≤ E ≤ k, and in fact that means V ≤ k + 1.

Could we actually have V equal to k + 1? And the answer is no, for the following simple reason. Can we have V = k + 1? In that case, both of these inequalities would have to be equalities. On the one hand, you would have that the join of graphs is a tree, because that's the only connected graph for which E is equal to V − 1. On the other hand, you'd have that the number of edges is precisely k, which means that each edge is walked on exactly twice. But then the graphs G_i and G_j, since their join is a tree and the graphs themselves are connected, would also have to be trees. In each one of the trees, every edge of the walk w_i, respectively w_j, would have to be walked on exactly twice, because closed walks on trees work that way. And therefore what we have in W_i W_j is a product of independent things. There's no overlap: if I have a tree here, each edge walked on twice, and another tree here, each edge walked on twice, and I join them, I'll have to join them at a vertex, because I'm not allowed to overlap an edge; that would destroy this inequality here. So then W_i and W_j are independent, and therefore the covariance is 0, and I'm not interested in that term. It's of no interest; it will not give me anything in that sum. Not surviving expectations, so to speak.
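To compress the constraints from this argument into one display (my summary of the board):

\[
\operatorname{Var}\big(\operatorname{tr} \bar W^{k}\big)
= \frac{1}{n^{k}} \sum_{\mathbf{i},\mathbf{j}\in\{1,\dots,n\}^{k}}
\operatorname{Cov}\big(W_{\mathbf i}, W_{\mathbf j}\big),
\qquad
W_{\mathbf i} = W_{i_1 i_2} W_{i_2 i_3}\cdots W_{i_k i_1},
\]

and a pair \((\mathbf i, \mathbf j)\) can only contribute if the join \(G_{\mathbf i}\cup G_{\mathbf j}\), with \(V\) vertices and \(E\) edges, satisfies

\[
V - 1 \;\le\; E \;\le\; k,
\]

with \(V = k + 1\) ruled out because equality on both sides would force \(W_{\mathbf i}\) and \(W_{\mathbf j}\) to be independent.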
So that means this inequality actually becomes interesting only in this form: V ≤ k. Moreover, what we want is to take V as large as possible. We want as many vertices in this join of graphs as possible. Why is that? Well, remember that we were splitting the terms according to the pair of graphs that they define and the walks on those graphs, and there was always a finite number of such objects up to choosing the labels on the vertices. And of course, if you have V vertices and n labels to choose from, and the order matters and the labels have to be distinct, then the number of choices is n(n − 1)···(n − V + 1), which is essentially n to the V. And therefore what you get is a sum that looks like that. Since you want to find the asymptotic value, and V is always at most k, you want to look at those precise instances for which V is exactly k. That's the term of highest order, and that's the term we're going to focus on, because everything else will disappear: if V is strictly less than k, then n to the V over n to the k goes to 0, and this here is now a sum over a finite number of objects. So, like I said, only the highest-order term will survive.

OK. At this point, we have to start splitting hairs, so to speak. No, not really splitting hairs: splitting cases. We look first at the case when k is even. So we're looking at even powers of the trace; sorry, the other way around: we're looking at traces of even powers. And we need to have some edge overlap between the graph induced by i and the graph induced by j. Otherwise they are independent and the term disappears; the covariance of independent terms is 0. So, since V is equal to k, let's see how we can do this overlap. At first glance, we have several possibilities. We can have E = k = V, in which case the graph has a cycle or a loop; of course, if E = V, the graph has a cycle or a loop, because it's connected. We could have E = k − 1, which is V − 1, in which case the graph has to be a tree; and because E = k − 1 while there are 2k edge-uses, we have an excess multiplicity of 2, so one edge has multiplicity 4 and everything else has multiplicity 2. Conceivably, we could also have a third option, which is exactly the same except that instead of one edge of multiplicity 4 and everything else of multiplicity 2, we have two edges of multiplicity 3. The question is how to divide this excess multiplicity of 2, and of course 2 is either 1 plus 1 or 2.

OK, actually, not all of these are possible. In fact, you can't have a loop, and the reason is very simple. Suppose we did have a loop. So you have E = k (let me remind you, k is even), E = V, and you want to do the overlap on a loop. Let me do it this way. The loop is walked on twice in total. Here I have a tree coming from G_i; here I have some other tree, G_j; and they overlap on the loop. Why can't this happen? The edges would otherwise have to be disjoint, yes, so the only overlap is on this loop, which means that the loop has to belong both to G_i and to G_j. Can that happen? What would that imply? I have a closed walk on G_i, on a tree plus this loop. What kind of length can a closed walk on a tree have? It has to be even: when you have a closed walk on a tree, whatever edge you've walked on, you'll have to walk back, so every edge is walked on at least twice (in this case, we know, exactly twice). But I have this dangling loop. And what does that mean? It means that I have an even number of edge-uses from the tree plus one use of the loop, so an odd number. What does that contradict? The fact that k is even. So if k is even, you can't do a closed walk on a tree in which every edge is taken twice and then add a loop, because that gives you an odd number of edges in the walk (in the walk, not in the graph). So this is not actually a possibility. You can't overlap on a loop. That's why I took that option away.
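This parity fact, that a closed walk on a tree must have even length, is easy to sanity-check numerically. A tiny sketch of my own: the trace of the L-th power of a tree's adjacency matrix counts the closed walks of length L, and it vanishes for every odd L, since trees are bipartite.

```python
# Sketch: closed walks on a tree have even length.
# trace(A^L) counts the closed walks of length L in the graph.
import numpy as np

# Adjacency matrix of a small tree on 5 vertices.
A = np.zeros((5, 5), dtype=int)
for u, v in [(0, 1), (1, 2), (1, 3), (3, 4)]:
    A[u, v] = A[v, u] = 1

P = np.eye(5, dtype=int)
for L in range(1, 9):
    P = P @ A                    # P is now A^L
    print(L, np.trace(P))        # the trace is 0 for every odd L
```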
Actually, the third option here is also not achievable, for the same reason. If the graph is a tree and we have two edges with multiplicity 3, then the overlap will have to include an edge that one of the walks, or maybe both, uses an odd number of times; so you'd need a closed walk on a tree in which every edge is walked on twice but one edge is walked on once or three times, and that doesn't work. Again, it's a parity discussion. If you walk on a tree (let me try to show you how this works), say I want to walk on this tree starting up here, and I want to do a closed walk. Say I go this way, but then I have to come back. And I go this way, and this way, but then I have to come back. And then I go this way, and this way, and then I have to come back, and so on. There will be no edge that is walked on once or three times. You can't have that. So that destroys the third possibility here as well.

It follows that there are exactly two possibilities. Either the overlap is on an entire cycle, so the two graphs both look like a cycle with trees dangling off of it, and you overlap the cycles; or you have two trees which overlap on an edge. Those are the only possibilities for which V is going to be equal to k. So let's count what happens in each of these two possibilities; oops, sorry, let's start with the one with the cycle. Like I said, one of the two graphs looks like this: you have this cycle and a bunch of trees dangling off it and so on. And then you have the exact same cycle with other trees dangling from it, maybe not at the same locations. You'll see that I'm partial to binary trees; let me try to draw a non-binary tree. So this is going to be G_i, and this is going to be G_j. And the way you walk is: you start walking somewhere here, you go this way, you walk on a tree, you continue to the next tree, walk on that tree, continue to the next tree, walk on that tree, and so on and so forth until you get back. Each of the edges on a tree is walked on twice; the cycle edges in this walk are walked on once. And exactly the same thing happens in the other graph. When you look at W_i W_j, what happens is that you take the join of the two graphs. At that point, you'll have that one cycle, and all of the trees that dangle from it will continue dangling. So if this is a vertex that's identified here, you'll have two trees hanging there: one, two, three, and so on. So that's the graph.
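Before the count, here is the case split compressed into one line (my shorthand for the two surviving shapes at V = k):

\[
\underbrace{E = V = k}_{\substack{\text{common cycle; each walk traverses the}\\ \text{cycle once and its tree edges twice}}}
\qquad\text{or}\qquad
\underbrace{E = V - 1 = k - 1}_{\substack{\text{join is a tree; one shared edge}\\ \text{of multiplicity } 4}}
\]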
How do we do this count? The way we do it is to say: OK, I'm going to start by assuming that the cycle has length r. If the cycle has length r, then the r cycle edges account for r of the k steps of the walk, every other edge is a tree edge walked on twice, and so I have a remaining number (k − r)/2 of vertices to place in trees around this cycle, in either one of the two graphs. So we'll take a partition of this number, (k − r)/2, and we'll say: OK, these are the sizes of the trees that I'll dangle off of the cycle. And then you pick a tree with each prescribed size.
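One note I'll add here (not spoken above, but it is the standard ingredient this count uses, and Catalan numbers already appeared for the expected moments): since each walk has length k = r + 2m, where m = (k − r)/2 is its number of tree edges, k even forces the cycle length r to be even as well; and the number of rooted plane trees with a prescribed number of edges m is the Catalan number

\[
C_m \;=\; \frac{1}{m + 1}\binom{2m}{m},
\]

which is what makes "pick a tree with the prescribed size" an explicit, finite count.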