the first two lectures on more basic material, which I think will serve as a foundation for all the speakers of this conference. So I will go a little bit fast, because it should be stuff that you have heard of, and then you can dig deeper if there are things I mention today that you don't already know. By the way, I've written a book with Jean-Philippe Bouchaud that came out about a year and a half ago. If you want to buy a nice hard copy, you can. It's quite expensive, as academic books are. I think it's available on the web illegally for free. I don't care, go ahead and download it if you can find it; I'm not collecting any royalties anyway. A lot of what I'm saying is in this book, and there's a lot more in the book than what I will be able to cover in these six sessions.

OK, so I want to talk about random matrices. Why are random matrices more interesting, or at least different, than random numbers? If I give you a random number, say 0.7, you can't do much with it. It's just a number: you don't know the distribution, you don't know the mean, et cetera. Whereas if I give you a very large matrix, the matrix has n squared elements. If n goes to infinity, or, for physicists and applied people, is very large, say a thousand or a million, you easily get millions or even thousands of billions of entries. So you can do statistics directly on the entries of a single matrix. And then you have phenomena such as self-averaging: certain quantities that you compute on one single matrix, because the matrix is so large, actually tell you something about the ensemble the matrix is drawn from. This is why I find random matrix theory interesting. For this concept of self-averaging, mathematicians talk about concentration of measure, meaning that the probability distribution of a quantity is extremely concentrated around one point, so for all practical purposes it becomes non-random.

The kind of matrices I'm going to talk about are self-adjoint: a matrix A equal to its adjoint. I will do everything with real matrices, but since I only talk about distributions of eigenvalues and outliers, everything is the same whether the matrix is real or complex. So I will use the transpose, because I'm thinking about real symmetric matrices, but everything I say is true for Hermitian matrices, where you take the complex conjugate when you take the transpose. These matrices have real eigenvalues, and this is one reason we study them: non-self-adjoint matrices are not always diagonalizable and their eigenvalues can be complex, but with these matrices we're safe, all the eigenvalues are real.

I'm going to define moments of matrices. I'll have a moment operator, or expectation value operator, which I will call tau: tau of a matrix A is simply 1/n times the trace of the matrix. Since these matrices are diagonalizable, I can always write this in terms of the eigenvalues: tau(A) = (1/n) times the sum over k of lambda_k. So it's a kind of expectation value. Another thing I want to say is that the matrices we're going to look at will have a bounded spectrum of eigenvalues: there is a largest and a smallest eigenvalue, and even as n goes to infinity the eigenvalues stay bounded, so all the moments exist.
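As a quick numerical illustration of this self-averaging (a minimal sketch of my own, not from the lecture: the matrix used is a large symmetric Gaussian matrix, essentially the Wigner ensemble introduced later, and the quantities compared are the normalized-trace moments tau(A^k) = (1/n) Tr(A^k) defined just below), two independent draws give nearly the same moments:

```python
import numpy as np

def sample_symmetric_matrix(n, rng, sigma=1.0):
    """Draw a symmetric Gaussian random matrix, normalized so its spectrum stays bounded."""
    h = rng.normal(0.0, sigma, size=(n, n))
    return (h + h.T) / np.sqrt(2 * n)

def moment(a, k):
    """k-th moment m_k = (1/n) Tr(A^k)."""
    n = a.shape[0]
    return np.trace(np.linalg.matrix_power(a, k)) / n

rng = np.random.default_rng(0)
n = 2000
a1 = sample_symmetric_matrix(n, rng)
a2 = sample_symmetric_matrix(n, rng)
for k in (1, 2, 3, 4):
    # the moments of two independent samples nearly coincide: self-averaging
    print(k, moment(a1, k), moment(a2, k))
```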
So I can define tau of A to the k, and this I will call the k-th moment of the matrix. Here I'll be going back and forth between averages and specific matrices. This is the k-th moment of this specific matrix, but all these things will converge in the large-n limit, and this is why I'll be a bit sloppy sometimes. When I say m_k, do I mean the k-th moment of that specific matrix, which really means 1/n times the trace of that matrix to the power k, which is the same as 1/n times the sum over l of lambda_l to the power k? That is a moment of that specific matrix, but when the matrix is extremely large it actually converges to the moment of the distribution. And this is one of those quantities that I call self-averaging.

One very important tool to study random matrices, especially bounded random matrices... it sounds very restrictive to say that I'm looking at bounded objects, but it turns out that most random matrices we encounter are actually bounded. We'll talk about the Wigner ensemble, where the distribution of eigenvalues is a semicircle, and it's strictly bounded. This is unlike classical probability, where one of the most natural distributions you encounter is the Gaussian, and the Gaussian is not bounded. The tool I'm about to introduce, the Stieltjes transform, is not very good at dealing with unbounded objects; but for random matrices the spectrum very often turns out to be bounded, and therefore the Stieltjes transform is a very strong tool.

So what is the Stieltjes transform? Before I define it, I will define the resolvent. The resolvent is a matrix function, which we'll call G(z) for a certain matrix A, and it's just the matrix (z times the identity minus A), inverted. It's a function of a complex variable z, and it's a matrix function, so if A is a random matrix, this is a random matrix that depends on a complex parameter. This will be very useful, because if you look at this object in terms of the eigenvalues of A, you can write it as the sum over l of v_l v_l transpose over (z minus lambda_l), where lambda_l and v_l are the eigenvalues and eigenvectors of A. So it's a sum of rank-one objects, of projectors, each with a pole at the corresponding eigenvalue. If you zoom in on a pole, you'll be able to find the projector on the corresponding eigenvector. I'll come back a lot to this resolvent.

For now, I just want to take the trace of this object, and again my traces are always normalized. So define the Stieltjes transform g(z) as the normalized trace of the resolvent matrix, which is 1/n times the trace of (z identity minus A) inverse, which in terms of the eigenvalues of A is 1/n times the sum over l of 1 over (z minus lambda_l). It's a random function, because A is a random matrix and its eigenvalues are random. But the point is that almost everywhere this function converges: when the matrix is large, it actually becomes a deterministic function. We'll see that this holds everywhere except where the eigenvalues are.

So how do you see that this function g(z), this function of a complex variable, converges to a deterministic function? Let's look at this random function for z away from the real line. I'll take z = x minus i eta, with x a real number, so I put myself below the real axis; I take some complex z.
And I give it a negative imaginary part, and I look at the value of this function there. So I can write g(z), which is g of (x minus i eta), as 1/n times the sum over l of 1 over (x minus lambda_l minus i eta). As usual, when you have a complex number in the denominator, it's better to multiply by the conjugate and put the complex number in the numerator. So I get 1/n times the sum over l of (x minus lambda_l plus i eta) over ((x minus lambda_l) squared plus eta squared). The eigenvalues are real, so this first term is the real part, this is the imaginary part, and the denominator is a real number. I could do the real part too, but today I'll just focus on the imaginary part of this object. So the imaginary part of g of (x minus i eta) is 1/n times the sum over l of eta over ((x minus lambda_l) squared plus eta squared).

Now, this ratio here is what's called the Cauchy kernel, up to a factor of pi, which is a normalization. So let me write it this way: it's 1/n times the sum over l of K_eta of (x minus lambda_l), and I need to put a pi somewhere; I always forget where to put the pi, so check my notes because I'll always get this wrong: the pi goes here. I define this kernel K_eta, as a function of y, as (1/pi) times eta over (y squared plus eta squared). This is the Cauchy kernel. What it looks like is a function centered at 0, with its maximum there, power-law tails, and width eta, and it has integral 1; that's why I call it a kernel.

So what I'm saying is that the imaginary part of g evaluated at x, with an imaginary part eta in the argument, is essentially the convolution of the discrete distribution of the lambdas with this kernel. Think of my matrix: it has eigenvalues lambda_1, lambda_2, lambda_3, and so on, up to lambda_n. This is the empirical distribution, one point per eigenvalue, and these are random. Then I convolve this with a kernel of width eta: over each eigenvalue I place a little kernel like this, and I sum. So I get a smooth curve, which is just the convolution of this discrete set of eigenvalues with the kernel.

Now, the point is what happens when eta is large enough. I should say as well that typically I'll denote the left edge of the spectrum lambda minus and the right edge lambda plus; strictly speaking lambda minus and lambda plus are the edges of the limiting spectrum, while lambda_1 is the smallest eigenvalue of that particular sample, so the notation is slightly different, but I'm abusing it a little by saying lambda minus and lambda_1 are essentially the same here. What I'm saying is that since all my eigenvalues are bounded, the typical distance between neighboring eigenvalues is of order 1/n. That depends on the density, and there are also cases where I could have a Dirac mass, but for now I'm not considering this: I'm supposing all my eigenvalues are spread out over a finite interval, so the typical distance between them is 1/n. And so if I take eta much greater than 1/n, the picture is more like this.
So every kernel now encompasses quite a lot of eigenvalues, actually infinitely many as n goes to infinity. And so, by a sort of law of large numbers, the convolution of this random distribution of eigenvalues with a broad kernel gives a deterministic answer. This is the reason why, for eta much greater than 1/n, I can claim that g of (x minus i eta), computed for a given matrix A, converges to the g of the ensemble at (x minus i eta), its expectation value; it doesn't depend on the sample. This is the very important self-averaging property of the Stieltjes transform. And it's true only away from the real axis: it's very important that there's an imaginary part, and that the imaginary part is big enough. But as n goes to infinity, the imaginary part can be as small as I want: if I fix eta and take a sequence of matrices that are bigger and bigger, at some point the matrix will be big enough that this eta counts as large enough, and the thing converges.

Now, the other nice thing about n going to infinity is that I can always find an eta that's much, much smaller than one but still much, much bigger than 1/n. Typically, in applications, you would take eta of order 1 over the square root of n. So I pick an eta that is much bigger than 1/n, so that when I smooth with this kernel I get the law of large numbers and a deterministic answer; but on the other hand I want eta to be extremely small, so that I'm probing the function with a very narrow kernel, a kernel that tends to a Dirac.

My argument here is a bit clumsy, so let me go back a little. What I'm saying is that when eta is large enough, I don't see the individual eigenvalues, I just see the distribution, and the distribution of eigenvalues of one sample, I claim, is the same as the distribution of eigenvalues of the ensemble. So when I compute the Stieltjes transform at a certain value of eta, I get the convolution of the density of eigenvalues with this kernel. And now that I have this expression, I can let the kernel width become as small as I like, so that it becomes a Dirac. In that limit eta goes to zero, but you really have to think of eta going to zero in this way: eta must stay much greater than 1/n while becoming extremely small. For instance, a good choice is eta equal to 1 over the square root of n; then you let eta go to zero and you get the density of eigenvalues rho. There's a factor of pi here, so you get pi times the density.

This is a very important formula that I learned in physics, but it took me a while to realize where it came from. Let me write it a little more carefully. I'm saying two things; let me write g_N for the transform at a fixed size n. First, I'm saying that this converges to a fixed function, provided I exclude the real axis; actually I could compute it anywhere away from the eigenvalues, but I've only shown you this case.
So if I take a complex z away from the real axis, then the Stieltjes transform computed on one sample converges to the Stieltjes transform of the ensemble. And the other thing I've shown is that the limit, as eta goes to zero plus, of the imaginary part of this function g of (x minus i eta) is pi times rho of x. This is called the Plemelj formula, the Sokhotski-Plemelj formula; I always forget how to write that name. It says that the Stieltjes transform encodes the density. The imaginary part, sorry, exactly. And the real part converges to the Hilbert transform, but I don't have time to show that.

So those are the basic facts I want. For practical applications, what really matters is this: adding an imaginary part to the argument of the Stieltjes transform blurs the spectrum; it's like convolving with a Cauchy kernel. If the imaginary part is big, you convolve with a broad Cauchy kernel, and as the imaginary part gets small, you converge to a convolution with a Dirac, which gives you back, up to a factor of pi, the density.

And therefore, once you have this density, you can rewrite the limiting Stieltjes transform as an integral over the density: if the spectrum goes from lambda minus to lambda plus, g(z) is the integral from lambda minus to lambda plus of rho of lambda over (z minus lambda), d lambda. Maybe I should rewrite the definition next to it: g_N for a given matrix is 1/n times the sum over l of 1 over (z minus lambda_l). That is for one matrix, and it converges to this object, which only depends on the density of eigenvalues.

A few more things you can say. Again, on the real axis this is not true: on the real axis this is a random function. If you pick a z on the real axis close to an eigenvalue of that particular matrix, you can get a huge number. On the real axis it's a real function with jumps; it's extremely wild. If I wanted to plot it, it would have poles everywhere, diverging to plus and minus infinity every time there's an eigenvalue. The limiting function, by contrast, typically has a branch cut: instead of a large number of poles (it has n poles, and n goes to infinity), this function with poles, seen from away from the real axis, from far away, just looks like a branch cut. And this is quite important. So a formula like this gives you a branch cut between lambda minus and lambda plus, but elsewhere in the complex plane it's perfectly analytic. In particular, it's analytic as z goes to infinity, where it behaves as 1/z: the density is normalized, so when z is extremely large the integral just gives one and g(z) tends to 1/z. And actually you can do an expansion, and you can also show this convergence as a power series at infinity; it's another way to see the convergence of this complex function.
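Here is a minimal numerical sketch of the Plemelj statement (my own illustration, not from the lecture; the matrix, the grid of x values, and the crude histogram window are arbitrary choices): with eta of order 1 over the square root of n, the quantity (1/pi) Im g_N(x - i eta) roughly matches a histogram estimate of the eigenvalue density.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000
h = rng.normal(size=(n, n))
x_mat = (h + h.T) / np.sqrt(2 * n)          # symmetric random matrix, bounded spectrum
lam = np.linalg.eigvalsh(x_mat)              # its n real eigenvalues

eta = 1.0 / np.sqrt(n)                       # 1/n << eta << 1
for x in np.linspace(-2.5, 2.5, 11):
    g = np.mean(1.0 / (x - lam - 1j * eta))          # g_N(x - i*eta)
    rho_from_g = g.imag / np.pi                      # Plemelj: Im g / pi ~ density
    rho_hist = np.mean(np.abs(lam - x) < 0.05) / 0.1 # crude histogram density near x
    print(f"x={x:+.2f}  rho_from_g={rho_from_g:.3f}  rho_hist={rho_hist:.3f}")
```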
It's very simple to see that if you take this formula and expand it in powers of 1/z at infinity (I'll just write the result), you get g(z) equals the sum over l from zero to infinity of m_l over z to the (l plus 1). So you recover the moments. Remember, m_l is just 1/n times the sum of the eigenvalues to the power l (sorry, I keep mixing k's and l's). These are the sample moments of that particular matrix, so this is really a g_N, and it converges to g(z), which has the same expression except that the moments converge to the moments of the distribution of eigenvalues. In particular, this is 1/z plus m_1 over z squared plus m_2 over z cubed plus order 1 over z to the fourth. So g is analytic at infinity, and it's actually the generating function of the moments of this density rho.

Now I'm going to look at one particular case. A random matrix ensemble you should all be familiar with is the Wigner ensemble. I'll write the matrix as X equals (H plus H transpose) over the square root of 2n, where H is an iid matrix, not symmetric. I want X to be a symmetric matrix, so the way I build my symmetric matrix is by taking a non-symmetric matrix and adding its transpose, with a normalization factor of square root of 2n. The entries of H I could take to be normal(0, 1), but I'll take them normal(0, sigma squared), so I have a parameter sigma squared; a unit Wigner matrix would have normal(0, 1) entries. And the moments, yes, the mean is zero and the variance is sigma squared; that's why I chose this normalization.

Now I would like to compute the Stieltjes transform of this ensemble; then I can get the density just by looking at the imaginary part. I won't do the full computation, but here is the idea. Again, I'm always thinking in the large-n limit. What I want is the resolvent, (z identity minus X) inverse, and then the normalized trace of this object, so I need to invert a large matrix. If I take the (1,1) element, use the Schur complement, and then do some averaging, I quickly get an equation that looks like this: one over the expectation value of g(z) equals z minus sigma squared times the expectation value of g(z). I also use one more trick here: if I remove the first row and column, the rest of the matrix is a Wigner matrix of size n minus 1, and as n goes to infinity a Wigner matrix of size n minus 1 is really the same thing as one of size n. That's how I get such an equation; there are many, many ways to derive it, but anyway. And since I argued that when z is not on the real axis, or at least not where the eigenvalues are, the function g(z) converges to a non-random limit, I can drop the expectation values, and I get the following expression: one over g equals z minus sigma squared g.
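A minimal sketch of this self-consistent relation (my own check, not part of the lecture; the matrix size and the particular z off the real axis are arbitrary choices): for one sampled Wigner matrix, the empirical g_N(z) very nearly satisfies 1/g = z - sigma^2 g.

```python
import numpy as np

rng = np.random.default_rng(2)
n, sigma = 3000, 1.0
h = rng.normal(0.0, sigma, size=(n, n))
x_mat = (h + h.T) / np.sqrt(2 * n)       # Wigner matrix with spectral variance sigma^2
lam = np.linalg.eigvalsh(x_mat)

z = 0.3 - 0.2j                           # any z away from the real axis
g = np.mean(1.0 / (z - lam))             # empirical Stieltjes transform g_N(z)
print("1/g          :", 1.0 / g)
print("z - sigma^2 g:", z - sigma**2 * g)  # the two should nearly coincide
```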
So this is the equation satisfied by the Stieltjes transform of the Wigner ensemble, and it's a very simple quadratic equation in g, so I can just solve it. It has two nice features; later it will also let me compute the inverse. But first let me show you the solution, which you should all know as well, or at least you should know the density. For the Stieltjes transform of the Wigner ensemble you have to be slightly careful: it's a quadratic equation, so you get two roots and only one is the physical one, and which root is physical depends on z, so you have to be careful how you write it. If you do things carefully you get, for a Wigner matrix, g(z) equals z times (1 minus the square root of (1 minus 4 sigma squared over z squared)), over 2 sigma squared. This is the Stieltjes transform of the Wigner ensemble with variance sigma squared; if you put sigma squared equal to one, you get the unit Wigner ensemble.

I've written it this way because, you see, there's a square root, and the square root is singular when its argument crosses zero, and I want this to be analytic for z large. When z is big, 4 sigma squared over z squared is small and this is an analytic function; that's why I wrote it this way. If you look at it in the complex plane, you get two branch points, at minus 2 sigma and plus 2 sigma, and a branch cut between them; but outside the branch cut, for z large, this is a perfectly analytic function that you can expand at infinity. If instead you write the square root as the square root of (z squared minus 4 sigma squared), it's never clear when you should take the plus sign or the minus sign; with 1 over z squared inside the square root you get an expression that's correct everywhere and analytic everywhere except on the real axis between minus 2 sigma and plus 2 sigma.

From this Stieltjes transform you can recover the density: I said that if I take the imaginary part of g of (x minus i eta), I get pi times the density of eigenvalues at x. And you see that for this expression to have an imaginary part, I need to be on the branch cut. You quickly convince yourself that rho of lambda must be equal to the square root of (4 sigma squared minus lambda squared), over 2 pi sigma squared. If you draw this distribution, it's a semicircle that goes from minus 2 sigma to 2 sigma, and it's properly normalized and everything. So this is the Wigner semicircle law that you get for the Wigner ensemble.
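As a sanity check (my own sketch, not from the lecture; sample size and bin choices are arbitrary), one can compare the eigenvalue histogram of one sampled Wigner matrix with the semicircle density sqrt(4 sigma^2 - lambda^2) / (2 pi sigma^2):

```python
import numpy as np

rng = np.random.default_rng(3)
n, sigma = 4000, 1.0
h = rng.normal(0.0, sigma, size=(n, n))
x_mat = (h + h.T) / np.sqrt(2 * n)
lam = np.linalg.eigvalsh(x_mat)

def semicircle(x, sigma=1.0):
    """Wigner semicircle density, supported on [-2*sigma, 2*sigma]."""
    inside = np.clip(4 * sigma**2 - x**2, 0.0, None)
    return np.sqrt(inside) / (2 * np.pi * sigma**2)

hist, edges = np.histogram(lam, bins=40, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
for c, hval in zip(centers[::8], hist[::8]):   # print a few bins only
    print(f"lambda={c:+.2f}  hist={hval:.3f}  semicircle={semicircle(c, sigma):.3f}")
```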
And it has the properties we want: it's analytic everywhere in the complex plane except where the eigenvalues are, and it's real outside the spectrum. Another important thing I need: if I call this edge lambda plus and the other one lambda minus, then beyond lambda plus the function is actually real, so I can look at it on the real axis. Inside the cut the function doesn't make sense on the real axis, but for z real and greater than lambda plus it makes sense, and it's monotonically decreasing, as all Stieltjes transforms should be. Maybe it's easier to see from the integral form: if I write g(z) as the integral from lambda minus to lambda plus of rho of lambda over (z minus lambda), d lambda, then for z real and greater than lambda plus this is a monotonically decreasing function that tends to 1/z.

So here is rho of lambda, and if I plot g on the real axis beyond lambda plus, it starts at some value at the edge and then decreases, with g(x) going to 1/x. Which means that it's invertible; that's very important. If I go from x at infinity down to lambda plus, this is a monotonic function, so I can invert it, and the inverse is very, very useful. That's why I kept this formula on the board, because I can read off the inverse directly from it. The inverse of g(z) is z(g), and z(g) is just 1 over g plus sigma squared g. So this is the inverse of the Stieltjes transform for real values beyond lambda plus, and I'll call the value of g at the edge g star. If you want: g(z) is monotonic, g becomes smaller and smaller as z increases, so z(g) increases as I increase g, and this is well defined for g between zero and g star. So there's a neighborhood of zero where this function is invertible.

Why is that important? I want to finish very, very quickly with something that should take many hours to explain: the concept of freeness. One thing I forgot to mention: I'll be looking a lot at matrices that are rotation invariant. What's a random matrix that's rotation invariant? It means that if I take a matrix A and I consider a random rotation of it, O A O transpose, these two are equal in law, meaning that the probability of a matrix A is the same as the probability of any rotation of the matrix A. Then we say the ensemble of A is rotation invariant. Now, freeness is a concept used when you deal with algebras of non-commuting objects: there is a notion of independence for non-commuting objects that's very powerful, called freeness, and I'm not going to go into the algebra of it, but it turns out that large rotation-invariant matrices are free. So I will say that A and B are free with respect to one another, and I'll tell you why it's useful, if they're large matrices (in practice freeness is only exactly true at infinite size; the notion doesn't quite hold at finite n, but asymptotically as n goes to infinity what I'm going to say is true), and at least one of them is rotation invariant.

So for instance, take A plus a random matrix O B O transpose: by conjugating with a random rotation matrix I have explicitly made B rotation invariant. Here A could be a fixed matrix, or a Wigner matrix, or, as we'll discuss in the next hour, a Wishart matrix, any matrix; and B again is any matrix, random or fixed, but I conjugate it by a random rotation, where O is a rotation matrix drawn from the Haar measure on the orthogonal group. Of course all this is true as well with unitary matrices if I'm looking at complex Hermitian matrices. So if A and B are free, or if they have these two properties, they're large and at least one of them is rotation invariant (which I can make explicit), then I can compute this object C, the sum of the two matrices; at least I can say many things about the matrix C, the sum. Remember, in classical probability, if I have a random number a and a random number b and I set c equals a plus b, these being classical random numbers, I can say something about c if a and b are independent.
And then the law of c will be the convolution of the law of a and the law of b. For non-commuting objects there's a similar notion; similar but different: it gives different results, a different law of large numbers, many things are different. This is called free probability. What I want to say is that it applies to large matrices where one of them is rotation invariant, and one thing I can say is that there's a function R, which we'll define in a moment, such that R of the matrix C is equal to R of the matrix A plus R of the matrix B. So there exists a function that characterizes the spectrum of the matrix C, and this function is additive. This is extremely useful because, if I can go from the spectrum to R and from R back to the spectrum, then knowing the spectra of A and B, and taking a rotation-invariant combination of A and B, their sum, I can compute the spectrum of the sum. And the reason I insist on the fact that the Stieltjes transform is invertible is that the function R is directly related to the inverse of the Stieltjes transform.

So I want to finish, and then we can go for the coffee break, with this fact. This wonderful function that's additive for the addition of free objects is defined as follows. Take z(g), the inverse of g(z): I take the limiting Stieltjes transform on the real axis, away from the eigenvalues, where it's invertible, and I call the inverse function z(g). It turns out this function always has a pole: it always starts as 1/g. The R-transform is what you get when you remove this pole, R(g) = z(g) minus 1/g, and then it becomes an analytic function, regular at zero.

Let's do this just for the Wigner ensemble. For the Wigner ensemble I have z(g) equals 1 over g plus sigma squared g, so for the R-transform I just get rid of the 1/g: R of g for a Wigner matrix X is simply sigma squared times g. This is obviously the simplest R-transform; every other random matrix ensemble will have a more complicated one. But this is natural, because the Wigner ensemble is a bunch of Gaussian random numbers, and there is a central limit theorem: it turns out that if you sum a lot of random matrices you end up with a Wigner matrix. There's a central limit theorem that I won't go into in detail, but if you add a bunch of randomly rotated matrices and you rescale them to keep the variance constant, you may start with a very complicated R function, but by adding and rescaling, adding and rescaling, you get a central limit theorem and you converge to this.

And then, once you know the R-transform of the sum, you can just go back: add back the pole 1/g, invert the function, and you get the Stieltjes transform of the ensemble C; then you look at the imaginary part of that on the real axis and you get the density. So this is how you compute, say, the spectrum of a sum of random matrices.
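A small numerical illustration of this additivity (my own sketch, not from the lecture; I take two independent Wigner matrices, whose ensembles are rotation invariant and hence asymptotically free): since R(g) = sigma^2 g for a Wigner matrix, the R-transforms add to (sigma1^2 + sigma2^2) g, so the free sum should again be Wigner with variance sigma1^2 + sigma2^2, and indeed its spectral edge sits near 2*sqrt(sigma1^2 + sigma2^2).

```python
import numpy as np

def wigner(n, sigma, rng):
    h = rng.normal(0.0, sigma, size=(n, n))
    return (h + h.T) / np.sqrt(2 * n)

rng = np.random.default_rng(4)
n, s1, s2 = 3000, 1.0, 0.5
c = wigner(n, s1, rng) + wigner(n, s2, rng)   # independent, hence asymptotically free
lam = np.linalg.eigvalsh(c)

s_sum = np.sqrt(s1**2 + s2**2)                # R_C(g) = (s1^2 + s2^2) g, Wigner again
print("largest eigenvalue:", lam.max(), " expected edge:", 2 * s_sum)
print("second moment     :", np.mean(lam**2), " expected:", s_sum**2)
```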
There's one more thing I want to say. I won't use it for the addition, but in the next hour, when I talk about multiplication, I will. It's called a subordination relation. This equation here, combined with that one, I can rewrite: if I look at the Stieltjes transforms and write g_C of some variable z, it's really just some algebra, I take this formula and this formula and recombine them, and I get this nice formula: g_C of z equals g_A evaluated at (z minus R_B of g_C of z). Trust me, or do it as an exercise: try to combine this and this. So the Stieltjes transform of the matrix C at some point is equal to the Stieltjes transform of the matrix A, but at a different point. And of course this different point gives a nasty implicit equation, because the point at which you need to evaluate depends on the R-transform of B, but also on the Stieltjes transform you're trying to compute. So by itself this is useful, but not super useful.

Now, here is what turns out if I go back to my big resolvent. I had this matrix, big G_C(z), defined as (z identity minus the matrix C) inverse. This is a random object, an n-by-n object, so it's random and in no way converges to anything in particular; now I explicitly need to take the expectation value. But if I look at this expression for C, which is built from A and B, I can take the expectation value with respect to B (B could itself be random) or just with respect to the random rotation, and there is a beautiful relation, the same as the scalar one but now for the full matrix, in expectation value: the expectation value of the resolvent matrix G_C(z) equals the resolvent matrix of A (I assume A is fixed; if it's not fixed, then everything is conditioned on A), evaluated at the same shifted argument, the same expression.

So what does it mean? Well, there are two consequences. G_A is a matrix built from A: it's a multiple of the identity minus A, inverted, the same expression as for C. So it commutes with A; it's diagonal in the basis of A. C, on the other hand, is built from A and some random stuff, so usually C does not commute with A, but in expectation value its resolvent does. And it sort of makes sense why: again, the random matrix C doesn't commute with A, I've added something random, but think in the basis of A, the basis where A is diagonal. There, C is something diagonal plus something random, and something random that's rotation invariant cannot have off-diagonal elements on average. Of course on one particular realization it will have off-diagonal elements, but on average there's a kind of symmetry: anything you put off the diagonal, you could flip some axis and you would put the same thing with a minus sign, so the average of the off-diagonal elements is zero. The important point is that the expectation value of the resolvent of C, conditioned on A, is diagonal in the same basis as A. The statement above says more than that, but that's what's implied.

And why is all this important? Because I'm going to talk about eigenvector problems, and if I'm going to look at eigenvectors of matrices, remember that this big matrix G is a sum of projectors on the eigenvectors. So I'll need to compute expectations of projectors, and I'll need this formula; definitely on Wednesday. There's a similar formula for matrix multiplication that I'll show later, and if I want to look at eigenvectors of matrices that have been multiplied, this expression is extremely useful.
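To make the scalar subordination relation above concrete, here is a minimal sketch (my own, under the assumption that A is a fixed diagonal matrix and B is a Wigner matrix, so that R_B(g) = sigma_B^2 g; the sizes and the chosen z are arbitrary): solving g_C(z) = g_A(z - sigma_B^2 g_C(z)) by fixed-point iteration reproduces the sample Stieltjes transform of C = A + B.

```python
import numpy as np

rng = np.random.default_rng(5)
n, sigma_b = 3000, 0.7
a_diag = rng.uniform(-1.0, 1.0, size=n)            # a fixed matrix A (diagonal here)
h = rng.normal(0.0, sigma_b, size=(n, n))
b = (h + h.T) / np.sqrt(2 * n)                     # Wigner matrix B, rotation invariant
lam_c = np.linalg.eigvalsh(np.diag(a_diag) + b)    # spectrum of C = A + B

z = 0.4 - 0.2j
g_a = lambda w: np.mean(1.0 / (w - a_diag))        # Stieltjes transform of A

# fixed-point iteration of g_C(z) = g_A(z - R_B(g_C(z))), with R_B(g) = sigma_b^2 * g
g = 1.0 / z
for _ in range(200):
    g = g_a(z - sigma_b**2 * g)

print("subordination g_C(z):", g)
print("sample g_C(z)       :", np.mean(1.0 / (z - lam_c)))
```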
I think I should stop here and ask for questions, or we go for the coffee break. Thanks. Any questions? Or you can ask them here. Jean, any questions from the... There were no questions; I think you were very clear. Okay. Sorry, I was a bit fast, but this is the sort of material people should already know in random matrix theory. I think it was really a great introduction to everything we're going to do. Actually, we will learn much more about free probability with Kami, right? So... Okay. All right, so we have a coffee break upstairs on the terrace, I think, and we start again, let's be there around 10:40. Okay, see you.